Unique Key Generation

Unique key generation

There are only 3 ways to generate unique values, whether they are passwords, user IDs, etc.:

  1. Use an effective GUID generator. GUIDs are long and cannot be shortened: if you use only part of one, you lose the uniqueness guarantee.
  2. Generate at least part of the number sequentially from a single sequence. You can add fluff or encoding to make it look less sequential. The advantage is that the values start short; the disadvantage is that they require a single source. The workaround for the single-source limitation is to number the sources and include [source #] + [seq #] in each value, so that each source can generate its own sequence.
  3. Generate them via some other means and then check them against the single history of previously generated values.
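
As a rough illustration of the [source #] + [seq #] idea from method 2, here is a minimal Python sketch. The class name KeySource and the 24-bit sequence width are assumptions made for the example, not part of any standard.

```python
import itertools


class KeySource:
    """One numbered generator. IDs from different sources never clash
    because the source number is embedded in every ID."""

    SEQ_BITS = 24  # assumed width of the per-source sequence

    def __init__(self, source_id: int):
        self.source_id = source_id
        self._seq = itertools.count()  # 0, 1, 2, ...

    def next_id(self) -> int:
        seq = next(self._seq)
        if seq >= 1 << self.SEQ_BITS:
            raise OverflowError("sequence exhausted for this source")
        # [source #] goes in the high bits, [seq #] in the low bits
        return (self.source_id << self.SEQ_BITS) | seq


a, b = KeySource(1), KeySource(2)
ids = [a.next_id(), a.next_id(), b.next_id(), b.next_id()]
print(ids)
```

Each source needs its own distinct number, assigned once (for example in configuration); that assignment is the only coordination required.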

Any other method is not guaranteed. Keep in mind that you are fundamentally generating a binary number (it is a computer), which you can then encode in hexadecimal, decimal, Base64, or a word list. Pick an encoding that fits your usage. For user-entered data you usually want some variation of Base32 (which you hinted at).

Note about GUIDs: They get their strength of uniqueness from their length and from the method used to generate them. Anything less than 128 bits is not secure. Beyond random number generation, other characteristics go into a GUID to make it more unique. Keep in mind that they are only practically unique, not completely unique: a duplicate is possible in theory, although practically impossible.

Updated note about GUIDs: Since writing this I learned that many GUID generators use a cryptographically secure random number generator (one whose next output is difficult or impossible to predict, and which is not likely to repeat). RFC 4122 defines 5 different UUID versions. Version 4 (random) is what Microsoft currently uses for the Windows GUID generation API. A GUID is Microsoft's implementation of the UUID standard.
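
For reference, a random (version 4) UUID can be produced with Python's standard uuid module. This is only a small sketch of what such a value looks like, not a statement about how any particular platform implements it.

```python
import uuid

u = uuid.uuid4()      # random UUID drawn from the OS random source
print(u)              # text form: 32 hex digits plus 4 hyphens
print(u.version)      # 4
print(len(u.bytes))   # 16 bytes, i.e. 128 bits
```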

Update: If you want 7 to 16 characters then you need to use either method 2 or 3.

Bottom line: Frankly there is no such thing as completely unique. Even if you went with a sequential generator you would eventually run out of storage using all the atoms in the universe, thus looping back on yourself and repeating. Your only hope would be the heat death of the universe before reaching that point.

Even the best random number generator has a chance of repeating a given value of 1 in the size of the range it draws from. Take a quarter, for example. It is a completely random bit generator, and its odds of repeating are 1 in 2.

So it all comes down to your threshold of uniqueness. You can have 100% uniqueness in 8 characters for 1,099,511,627,776 numbers (32^8) by using a sequence and then Base32 encoding it. Any other method that does not involve checking against a list of past numbers only has odds equal to n/1,099,511,627,776 (where n = number of previous numbers generated) of not being unique.
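
The sequence-plus-Base32 approach can be sketched as follows. The alphabet is the Crockford-style one (no I, L, O, U), which is an assumption for the example; any 32 distinct symbols would work, and the function name encode_base32 is made up here.

```python
import itertools

# Crockford-style alphabet: digits plus letters without I, L, O, U
# (an assumption; pick whatever 32 symbols suit your users).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_base32(n: int, width: int = 8) -> str:
    """Encode n as exactly `width` Base32 characters, zero-padded."""
    if not 0 <= n < 32 ** width:
        raise ValueError("value does not fit in the requested width")
    chars = []
    for _ in range(width):
        n, r = divmod(n, 32)
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))


seq = itertools.count()  # the single sequence: 0, 1, 2, ...
keys = [encode_base32(next(seq)) for _ in range(3)]
print(keys)      # ['00000000', '00000001', '00000002']
print(32 ** 8)   # 1099511627776 distinct 8-character keys
```

In a real system the counter would have to live somewhere durable and shared (a database sequence, for instance); the in-memory itertools.count() only stands in for that.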

Unique Key Generation Logic

I don't understand what your problem really is, but I'll try.

It seems like you mean something like this:

A SQL table which saves the public and private ID (and maybe other things).

You can generate a key like this:

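const CHARS = '0123456789abcdefghijklmnopqrstuvwxyz';

// Build a random key from the first $charsLength characters of CHARS
// (10 = digits only, 36 = digits and lowercase letters).
function generateKey(int $length, int $charsLength = 10): string {
    $key = '';
    for ($i = 0; $i < $length; ++$i) {
        // random_int() draws from a cryptographically secure source
        $key .= CHARS[random_int(0, $charsLength - 1)];
    }
    return $key;
}

// Return true if $key is already stored in $column of the table "keys".
// Assumption: $pdo is an open PDO connection, and $column is only ever
// a hard-coded column name, never user input.
function keyExists(PDO $pdo, string $column, string $key): bool {
    $stmt = $pdo->prepare("SELECT 1 FROM `keys` WHERE `$column` = ? LIMIT 1");
    $stmt->execute([$key]);
    return $stmt->fetchColumn() !== false;
}

// Public key with length 10; regenerate while the key already exists
do {
    $keyPublic = generateKey(10);
} while (keyExists($pdo, 'publicKey', $keyPublic));

// Private key with length 10; regenerate while the key already exists
do {
    $keyPrivate = generateKey(10, 36);
} while (keyExists($pdo, 'privateKey', $keyPrivate));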

This example generates two keys and checks each one against the table "keys" to make sure it does not already exist.

Efficient unique key generation for database entries

Generating a random string in the application and checking if it's unique is not a bad solution. Don't worry about it being inefficient, it's not -- and definitely not compared to the alternatives. It will certainly be faster than running db.user.count() or keeping a separate table with precalculated IDs. You just need to do it right.

First of all, how often will new users be created? Probably not very often compared to other things, so really the whole efficiency discussion is moot. Secondly, with 7 characters A-Z, 0-9 that's a range of 36^7 or somewhere around 78 billion. It will be some time before you will start seeing collisions, to say the least.
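
To put rough numbers on this, the sketch below computes two things for a space of 36^7 IDs: the chance that one new random ID hits an existing one (n / space), and the standard birthday approximation 1 - exp(-n(n-1) / (2 * space)) for the chance that any two of n IDs coincide. The function names are made up for the example, and the formula assumes uniformly random IDs.

```python
import math

SPACE = 36 ** 7  # 78,364,164,096 possible 7-character IDs


def next_insert_collision(n: int, space: int = SPACE) -> float:
    """Chance that one new random ID equals one of n existing IDs."""
    return n / space


def any_collision(n: int, space: int = SPACE) -> float:
    """Birthday approximation: chance that n random IDs are not all distinct."""
    return -math.expm1(-n * (n - 1) / (2 * space))


for n in (1_000, 100_000, 1_000_000):
    print(n, next_insert_collision(n), any_collision(n))
```

The per-insert chance stays tiny even at a million users, while the chance of having seen at least one collision somewhere along the way grows much faster. That is why the duplicate check still matters, even though it will rarely cost anything.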

If you just do it like this, it will not incur any performance penalties unless there's a collision (which is extremely unlikely):

  • Generate a random user ID
  • Insert your user object, using the user ID as the value of _id
  • Check for duplicate key errors (how to do this depends on the language and driver; older drivers might involve running the getLastError command, while current ones typically raise a duplicate-key exception).
  • On a duplicate key error, start over by generating a new user ID

This way there will only be extra work in the event of a collision (and I really, really want to stress how incredibly unlikely that will be).
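
The insert-and-retry loop above can be sketched like this. To keep the example self-contained it uses an in-memory stand-in instead of a real driver: FakeCollection and DuplicateKeyError are hypothetical names that only mimic a collection with a unique _id index, so the error-handling line would need adapting to whatever your driver actually raises.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits  # A-Z, 0-9


class DuplicateKeyError(Exception):
    """Stand-in for the driver's duplicate-key error (hypothetical)."""


class FakeCollection:
    """In-memory stand-in for a collection with a unique _id index."""

    def __init__(self):
        self._docs = {}

    def insert_one(self, doc):
        if doc["_id"] in self._docs:
            raise DuplicateKeyError(doc["_id"])
        self._docs[doc["_id"]] = doc


def new_user_id(length: int = 7) -> str:
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def insert_user(collection, user, make_id=new_user_id, max_attempts=5):
    """Insert first, and only generate a new ID if the insert collides."""
    for _ in range(max_attempts):
        doc = dict(user, _id=make_id())
        try:
            collection.insert_one(doc)
            return doc["_id"]
        except DuplicateKeyError:
            continue  # collision: start over with a fresh ID
    raise RuntimeError("could not find a free user ID")


users = FakeCollection()
print(insert_user(users, {"name": "alice"}))
```

Because the unique index does the checking, there is no separate lookup before the insert and no window between "check" and "write" in which another process could take the same ID.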

There's another way of generating a unique user ID: take the current UNIX timestamp (down to the second), append a hash of the hostname and then the process ID, and finally the current value of a counter. This is how Mongo's ObjectId was originally generated (newer versions replace the hostname hash and process ID with a random value), and it guarantees that you can generate as many objects per second, per process, as the max value of your counter (which in Mongo is 3 bytes, so about 16 million). See the docs on ObjectId if you're interested in the details: http://www.mongodb.org/display/DOCS/Object+IDs

It has the property that your user IDs will naturally sort in order of creation, but it's 12 bytes long, so a bit longer than your 7 chars, unfortunately. You can use the same method, skip the hostname/pid, and shorten the counter (which can also be a random number if you like) to two bytes. Then you would be down to 6 bytes, which fits in 10 chars A-Z, 0-9 (36^9 is slightly less than 2^48, so 9 chars is not quite enough for all 48 bits).
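
Here is a minimal sketch of that shortened variant: a 4-byte timestamp followed by a 2-byte counter, written in base 36. The names short_id and to_base36 are made up for the example. Note that 48 bits need 10 base-36 characters, since 36^9 is a little smaller than 2^48, and that without the hostname/pid part the IDs are only unique within a single process generating fewer than 65,536 IDs per second.

```python
import itertools
import time

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
_counter = itertools.count()  # per-process counter


def to_base36(n: int, width: int) -> str:
    """Encode n as exactly `width` base-36 characters, zero-padded."""
    chars = []
    for _ in range(width):
        n, r = divmod(n, 36)
        chars.append(DIGITS[r])
    if n:
        raise ValueError("value does not fit in the requested width")
    return "".join(reversed(chars))


def short_id(now=None) -> str:
    # 4 bytes of UNIX time in the high bits, 2 bytes of counter in the low bits
    ts = int(time.time() if now is None else now) & 0xFFFFFFFF
    count = next(_counter) & 0xFFFF
    return to_base36((ts << 16) | count, 10)


print(short_id())
```

Because the width is fixed and the timestamp sits in the high bits, plain string sorting of these IDs follows creation time from one second to the next.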

simple algorithm to generate unique key

Sounds to me like you want something like request signing:

  1. generate a random secret key that you give to the authenticating app (ahead of time, shared secret)
  2. require that the authenticating app sends its current date as part of the request
  3. require that the authenticating app creates a hash of a concatenation of

    • the date sent in 2.
    • any other unique data that's part of the request
    • the secret key

    This will form your "unique key". Since you're looking at a message authentication code, you'll want an HMAC hash. E.g.:

    code = HMAC(date + data, secret key)
  4. on the receiving side, verify that the date is within a certain tolerance, e.g. ±15 minutes

  5. repeat the same hashing algorithm on the received date and data
  6. compare the received hash with your hash

This way you can authenticate each request as being sent by the entity in possession of the secret key without sending the secret key over the wire, and each request has a unique authentication code.
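
The whole exchange can be sketched with Python's standard hmac module. The function names, the newline used to separate date and data, and the choice of SHA-256 as the underlying hash are assumptions for the example; both sides only have to agree on them.

```python
import hashlib
import hmac
import time

TOLERANCE_SECONDS = 15 * 60  # accept dates within +/- 15 minutes


def sign(date: int, data: bytes, secret: bytes) -> str:
    """Client side: code = HMAC(date + data, secret key)."""
    message = str(date).encode() + b"\n" + data
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(date: int, data: bytes, code: str, secret: bytes, now=None) -> bool:
    """Server side: check the date, recompute the code, compare."""
    now = time.time() if now is None else now
    if abs(now - date) > TOLERANCE_SECONDS:
        return False
    expected = sign(date, data, secret)
    # compare_digest avoids leaking information through timing
    return hmac.compare_digest(expected, code)


secret = b"shared-secret-agreed-ahead-of-time"  # placeholder value
date = int(time.time())
code = sign(date, b"user=42&action=export", secret)
print(verify(date, b"user=42&action=export", code, secret))  # True
```

The request carries the date, the data, and the code, but never the secret itself.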

Generate a Unique Key

Don't over-complicate it:

$key = md5(microtime().rand());
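
That one-liner derives the key from the clock and a general-purpose random number, which is fine for casual identifiers. If the keys also need to be hard to guess, an equally short alternative is to draw them from the operating system's secure random source. A sketch in Python, where 16 bytes is an assumed size chosen to match the length of an MD5 hex digest:

```python
import secrets

key = secrets.token_hex(16)  # 16 random bytes as 32 hex characters
print(key)
```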

Generate a set of unique keys that can be validated without keeping a white-list

A 10 byte key is not enough to make anything secure.

You need a secure hash function such as SHA-256 (from the SHA-2 family), whose output is 32 bytes long. SHA-2 can easily be implemented on most systems.

Your key needs two parts:

[text + hash]

The first part is like a "username" and the second part is like a "password"

You also need a "secret key". This key is an array of bytes which is stored in your software. You then add the "secret key" to your "username". Find the SHA2 hash for the resulting string. Now you have an output that is the length of original text + 32 bytes for the hash.

You can use this key as a unique verifiable ID.

To test the key's authenticity, take the "username" part and add your secret key. Take the SHA-2 hash of that string; the result should match the "password" part.
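
A compact sketch of the same create-and-verify idea in Python. It swaps the plain hash of text + secret for HMAC-SHA256 from the standard library, which is the usual way to mix a secret into a hash; the function names and the placeholder secret are assumptions for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-your-own-secret"  # placeholder value
TAG_SIZE = 32  # HMAC-SHA256 output size in bytes


def create_verifiable_id(text: bytes) -> bytes:
    """Return text + tag, where the tag depends on the text and the secret."""
    tag = hmac.new(SECRET_KEY, text, hashlib.sha256).digest()
    return text + tag


def is_valid(uid: bytes) -> bool:
    """Recompute the tag from the text part and compare it with the stored tag."""
    if len(uid) <= TAG_SIZE:
        return False
    text, tag = uid[:-TAG_SIZE], uid[-TAG_SIZE:]
    expected = hmac.new(SECRET_KEY, text, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


uid = create_verifiable_id(b"user-0001")
print(uid.hex())
print(is_valid(uid))                            # True
print(is_valid(bytes([uid[0] ^ 1]) + uid[1:]))  # False: text was altered
```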

If secrecy and uniqueness are not a big issue, then you can use MD5, whose output is 16 bytes. Store the plain text as binary so it holds more information in fewer bytes, and your final key will be only 20 bytes. You could cut that down a bit more, but reducing it to 10 bytes is not recommended.

Here is an example. I used SHA2 implementation from this link:

https://github.com/B-Con/crypto-algorithms (I am not sure if it works on a big-endian machine)

Any SHA2 implementation should work.

#include <stdio.h>
#include <string.h>
#include "sha256.h"   /* BYTE, SHA256_CTX, sha256_* from B-Con/crypto-algorithms */

#define ID_SIZE   4   /* size of the "plain text" part, in bytes */
#define HASH_SIZE 32  /* SHA-256 output size, in bytes */

/* The "secret key" stored in the software. Placeholder value: replace it. */
static const BYTE SECRET_KEY[] = "replace-with-your-own-secret";

/* dst = SHA-256(src + secret key) */
void sha2(BYTE *dst, const BYTE *src, size_t len)
{
    SHA256_CTX ctx;
    sha256_init(&ctx);
    sha256_update(&ctx, src, len);
    sha256_update(&ctx, SECRET_KEY, sizeof(SECRET_KEY) - 1);
    sha256_final(&ctx, dst);
}

void create_verifiable_id(const BYTE *source, BYTE *uid)
{
    BYTE hash[HASH_SIZE];
    sha2(hash, source, ID_SIZE);

    /* combine source + hash */
    memcpy(uid, source, ID_SIZE);
    memcpy(uid + ID_SIZE, hash, HASH_SIZE);
}

int test_verifiable_id(const BYTE *uid)
{
    BYTE hash[HASH_SIZE];
    sha2(hash, uid, ID_SIZE);

    /* hash should match the second part of uid */
    return memcmp(hash, uid + ID_SIZE, HASH_SIZE) == 0;
}

int main(void)
{
    /* use a number from 0 to 0xFFFFFFFF, store in buf (4 bytes);
       this is the "plain text" portion */
    unsigned int number = 0x12345678;
    BYTE buf[ID_SIZE];
    for (size_t i = 0; i < sizeof(buf); i++)
    {
        buf[i] = number & 0xFF;
        number >>= 8;
    }

    /* add sha2 to "plain text" to make verifiable id */
    BYTE verifiable_id[ID_SIZE + HASH_SIZE];
    create_verifiable_id(buf, verifiable_id);

    printf("UID as hex string:\n");
    for (size_t i = 0; i < sizeof(verifiable_id); i++)
        printf("%02X", verifiable_id[i]);
    printf("\n");

    printf("Test (should succeed): %d\n", test_verifiable_id(verifiable_id));

    /* change verifiable_id and test it again */
    verifiable_id[0]++;
    printf("Test (should fail): %d\n", test_verifiable_id(verifiable_id));
    return 0;
}

