Generating Unique, Hard-To-Guess "Coupon" Codes

Generating unique, hard-to-guess coupon codes

The code needs to be unguessable, because the only verification you can perform before giving the user their reward is to check whether the code they entered exists in your list of "issued" codes.

  • That means the number of all possible codes in that format is much larger than the number of codes you want to issue,. Depending on how easy it is to simply try codes (think of a script trying repeatedly), then you might need all possible codes to outnumber issued codes by a factor of a million or a billion or more. This sounds high, but is possible in relatively short strings.

  • It also means that the codes that you use must be chosen as randomly as possible within all possible codes. This is necessary to avoid users figuring out that most valid codes start with "AAA" for example. More sophisticated users might spot that your "random" codes use a hackable random number generator (Ruby's default rand() is fast and statistically good for random data, but is hackable in this way, so don't use it).

The starting point for such a secure code would be the output from a cryptographic PRNG. Ruby has the securerandom library, which you can use to get a raw code like this:

require 'securerandom'
SecureRandom.hex
# => "78c231af76a14ef9952406add6da5d42"

This code is long enough to cover any realistic number of vouchers (millions each for everyone on the planet), without any meaningful chance of repetition or being easy to guess. However, it is a bit awkward to type from a physical copy.

Once you know how to generate a random, practically unguessable code, your next problem is understanding user experience and deciding how much you can realistically compromise security in the name of usability. You need to bear in mind the value to the end user, and therefore how hard someone might try to get a valid code. I cannot answer that for you, but can make some general points about usability:

  • Avoid ambiguous characters. In print, it is sometimes difficult to see the difference between 1, I and l for example. We often understand what it is supposed to be from context, but a randomised string of characters does not have this context. It would be a bad user experience to have to try several variations of a code by testing 0 vs O, 5 vs S etc.

  • Use either lower case or upper case letters but not both. Case sensitivity will not be understood or followed by some %age of your users.

  • Accept variations when matching codes. Allow spaces and dashes. Perhaps even allow 0 and O to mean the same thing. This is best done by processing the input text so it is in the right case, strip separator characters etc.

  • In print, separate the code into a few small parts, it will be easier for the user to find their place in the string and type a few characters at once.

  • Don't make the code too long. I would suggest 12 characters, in 3 groups of 4.

  • Here's an interesting one - you may want to scan the code for possible rude words, or avoid the characters that would generate them. If your code contained only the characters K, U, F, C, then there would be a high chance of offending a user. This isn't usually a concern because users do not see most computer secure codes, but these ones will be in print!

Putting that all together, this is how I might generate a usable code:

# Random, unguessable number as a base20 string
# .rjust(12, '0') covers for unlikely, but possible small numbers
# .reverse ensures we don't use first character (which may not take all values)
raw = SecureRandom.random_number( 2**80 ).to_s( 20 ).rjust(12, '0').reverse
# e.g. "3ecg4f2f3d2ei0236gi"

# Convert Ruby base 20 to better characters for user experience
long_code = raw.tr( '0123456789abcdefghij', '234679QWERTYUPADFGHX' )
# e.g. "6AUF7D4D6P4AH246QFH"

# Format the code for printing
short_code = long_code[0..3] + '-' + long_code[4..7] + '-' + long_code[8..11]
# e.g. "6AUF-7D4D-6P4A"

There are 20**12 valid codes in this format, which means you can issue a billion of your own codes, and there would be one in four million chance of a user simply guessing a correct one. In cryptography circles that would be very bad (this code is insecure against a fast local attack), but for a web form offering free burritos to registered users, and where you would notice a someone trying four million times with a script, it is ok.

Rails 3: Generate unique codes (coupons)

You can do something like this too:

chars = ('a'..'z').to_a + ('A'..'Z').to_a
def String.random_alphanumeric(size=16)
(0...size).collect { chars[Kernel.rand(chars.length)] }.join
end

But then you would have to compare against a database to make sure it is not used yet.

How to manage coupons codes?

You could use an UUID as your coupon code, to ensure uniqueness, or store the coupon code in a record "forever", marked as used.

Coupon code generation

Basically you can split your operation into to parts:

  1. Somehow "encrypt" your initial number n, so that two consecutive numbers yield (very) different results
  2. Construct your "human-readable" code from the result of step 1

For step 1 I'd suggest to use a simple block cipher (e.g. a Feistel cipher with a round function of your choice). See also this question.

Feistel ciphers work in several rounds. During each round, some round function is applied to one half of the input, the result is xored with the other half and the two halves are swapped. The nice thing about Feistel ciphers is that the round function hasn't to be two-way (the input to the round function is retained unmodified after each round, so the result of the round function can be reconstructed during decryption). Therefore you can choose whatever crazy operation(s) you like :). Also Feistel ciphers are symmetric, which fulfills your first requirement.

A short example in C#

const int BITCOUNT = 30;
const int BITMASK = (1 << BITCOUNT/2) - 1;

static uint roundFunction(uint number) {
return (((number ^ 47894) + 25) << 1) & BITMASK;
}

static uint crypt(uint number) {
uint left = number >> (BITCOUNT/2);
uint right = number & BITMASK;
for (int round = 0; round < 10; ++round) {
left = left ^ roundFunction(right);
uint temp = left; left = right; right = temp;
}
return left | (right << (BITCOUNT/2));
}

(Note that after the last round there is no swapping, in the code the swapping is simply undone in the construction of the result)

Apart from fulfilling your requirements 3 and 4 (the function is total, so for different inputs you get different outputs and the input is "totally scrambled" according to your informal definition) it is also it's own inverse (thus implicitely fulfilling requirement 1), i.e. crypt(crypt(x))==x for each x in the input domain (0..2^30-1 in this implementation). Also it's cheap in terms of performance requirements.

For step 2 just encode the result to some base of your choice. For instance, to encode a 30-bit number, you could use 6 "digits" of an alphabet of 32 characters (so you can encode 6*5=30 bits).

An example for this step in C#:

const string ALPHABET= "AG8FOLE2WVTCPY5ZH3NIUDBXSMQK7946";
static string couponCode(uint number) {
StringBuilder b = new StringBuilder();
for (int i=0; i<6; ++i) {
b.Append(ALPHABET[(int)number&((1 << 5)-1)]);
number = number >> 5;
}
return b.ToString();
}
static uint codeFromCoupon(string coupon) {
uint n = 0;
for (int i = 0; i < 6; ++i)
n = n | (((uint)ALPHABET.IndexOf(coupon[i])) << (5 * i));
return n;
}

For inputs 0 - 9 this yields the following coupon codes

0 => 5VZNKB
1 => HL766Z
2 => TMGSEY
3 => P28L4W
4 => EM5EWD
5 => WIACCZ
6 => 8DEPDA
7 => OQE33A
8 => 4SEQ5A
9 => AVAXS5

Note, that this approach has two different internal "secrets": First, the round function together with the number of rounds used and second, the alphabet you use for encoding the encyrpted result. But also note, that the shown implementation is in no way secure in a cryptographical sense!

Also note, that the shown function is a total bijective function, in the sense, that every possible 6-character code (with characters out of your alphabet) will yield a unique number. To prevent anyone from entering just some random code, you should define some kind of restictions on the input number. E.g. only issue coupons for the first 10.000 numbers. Then, the probability of some random coupon code to be valid would be 10000/2^30=0.00001 (it would require about 50000 attempts to find a correct coupon code). If you need more "security", you can just increase the bit size/coupon code length (see below).

EDIT: Change Coupon code length

Changing the length of the resulting coupon code requires some math: The first (encrypting) step only works on a bit string with even bit count (this is required for the Feistel cipher to work).

In the the second step, the number of bits that can be encoded using a given alphabet depends on the "size" of chosen alphabet and the length of the coupon code. This "entropy", given in bits, is, in general, not an integer number, far less an even integer number. For example:

A 5-digit code using a 30 character alphabet results in 30^5 possible codes which means ld(30^5)=24.53 bits/Coupon code.

For a four-digit code, there is a simple solution: Given a 32-Character alphabet you can encode *ld(32^4)=5*4=20* Bits. So you can just set the BITCOUNT to 20 and change the for loop in the second part of the code to run until 4 (instead of 6)

Generating a five-digit code is a bit trickier and somhow "weakens" the algorithm: You can set the BITCOUNT to 24 and just generate a 5-digit code from an alphabet of 30 characters (remove two characters from the ALPHABET string and let the for loop run until 5).

But this will not generate all possible 5-digit-codes: with 24 bits you can only get 16,777,216 possible values from the encryption stage, the 5 digit codes could encode 24,300,000 possible numbers, so some possible codes will never be generated. More specifically, the last position of the code will never contain some characters of the alphabet. This can be seen as a drawback, because it narrows down the set of valid codes in an obvious way.

When decoding a coupon code, you'll first have to run the codeFromCoupon function and then check, if bit 25 of the result is set. This would mark an invalid code that you can immediately reject. Note that, in practise, this might even be an advantage, since it allows a quick check (e.g. on the client side) of the validity of a code without giving away all internals of the algorithm.

If bit 25 is not set you'll call the crypt function and get the original number.

generating coupon code

If I understand you correctly, you're generating a 16-digit coup code, and then to validate it, you're using a sort of checksum?

If anyone figures out your checksum algorithm, they're going to be able to generated unlimited coupons.

I think it's better to pre-generate a few thousand or hundred thousand (however many you need) coupon codes, and perhaps make them one-time use (by deleting them or checking if they're already used).

Of course...this depends on your needs. A lot of sites just have easy-to-remember unlimited use coupon codes just to trick people into thinking they're getting a deal.

Simple coupon generator in rails

You don't mention what sort of coupon format you require and I am sure there are a bunch of gems that do similar things. I guess one approach is to use a unique code that you can generate and then tag a user_id to the end of it to ensure uniqueness across many codes.

def generate_coupon_code(user_id)
characters = %w(A B C D E F G H J K L M P Q R T W X Y Z 1 2 3 4 5 6 7 8 9)
code = ''

4.times { code << characters.sample }
code << user_id.to_s

code
end


Related Topics



Leave a reply



Submit