Generating an Instagram- or YouTube-like unguessable string ID in Ruby/ActiveRecord

You could do something like this:

random_attribute.rb

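require 'securerandom'

# Mixin that fills an attribute with a random value that is unique in its table.
# The lookup-then-assign check is not atomic, so back it with a unique index.
module RandomAttribute
  def generate_unique_random_base64(attribute, n)
    until random_is_unique?(attribute)
      send(:"#{attribute}=", random_base64(n))
    end
  end

  def generate_unique_random_hex(attribute, n)
    until random_is_unique?(attribute)
      # SecureRandom.hex(k) returns 2 * k hex characters, so n should be even.
      send(:"#{attribute}=", SecureRandom.hex(n / 2))
    end
  end

  private

  # True once the attribute is set and no existing row already has that value.
  def random_is_unique?(attribute)
    val = send(attribute)
    val && !self.class.exists?(attribute => val)
  end

  # Builds an n-character string of lowercase letters and digits.
  def random_base64(n)
    val = base64_url
    val += base64_url while val.length < n
    val[0, n]
  end

  # Lowercases the Base64 output and strips "+", "/" and "=".
  def base64_url
    SecureRandom.base64(60).downcase.gsub(/\W/, '')
  end
end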

post.rb

class Post < ActiveRecord::Base
  include RandomAttribute

  before_validation :generate_key, on: :create

  private

  def generate_key
    generate_unique_random_hex(:key, 32)
  end
end
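
The retry-until-unique idea in the module above can be tried without a database. This is only a minimal sketch: `unique_random_key` and the `taken` set are hypothetical names, the in-memory Set stands in for the unique column, and `SecureRandom.alphanumeric` needs Ruby 2.5 or later.

```ruby
require 'securerandom'
require 'set'

# In-memory stand-in for the database's unique column (assumption for this sketch).
TAKEN_KEYS = Set.new

# Returns a random alphanumeric key that is not yet in `taken`, recording it there.
# Gives up after max_attempts so a too-small key space cannot loop forever.
def unique_random_key(length, taken: TAKEN_KEYS, max_attempts: 10)
  max_attempts.times do
    candidate = SecureRandom.alphanumeric(length)
    # Set#add? returns nil when the value is already present.
    return candidate if taken.add?(candidate)
  end
  raise "no unique key found after #{max_attempts} attempts"
end
```

In a real app the Set would be replaced by the table itself, and a unique index would still be needed to catch two processes picking the same key at the same moment.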

How to generate a random string in Ruby


(0...8).map { (65 + rand(26)).chr }.join

I spend too much time golfing.

(0...50).map { ('a'..'z').to_a[rand(26)] }.join

And a last one that's even more confusing, but more flexible and wastes fewer cycles:

o = [('a'..'z'), ('A'..'Z')].map(&:to_a).flatten
string = (0...50).map { o[rand(o.length)] }.join
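
Since the goal here is an unguessable ID, it may be worth noting that `rand` is not a cryptographically secure generator. A sketch of the same pick-from-an-alphabet approach using `SecureRandom` instead (the method name `secure_random_string` and the `ALPHABET` constant are made up for this example):

```ruby
require 'securerandom'

# Same alphabet as the snippet above: lowercase plus uppercase letters.
ALPHABET = [*'a'..'z', *'A'..'Z'].freeze

# Picks each character with SecureRandom.random_number, which returns
# an integer in 0...alphabet.size when given an integer argument.
def secure_random_string(length, alphabet = ALPHABET)
  Array.new(length) { alphabet[SecureRandom.random_number(alphabet.size)] }.join
end
```

Calling `secure_random_string(50)` gives a 50-letter string, and a different alphabet can be passed as the second argument.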

If you want to generate some random text then use the following:

50.times.map { (0...(rand(10))).map { ('a'..'z').to_a[rand(26)] }.join }.join(" ")

This code generates a string of 50 random words, each fewer than 10 characters long (a word can be empty, since rand(10) can return 0), joined with spaces.

How to generate a random, unique, alphanumeric ID of length N in Postgres 9.6+?

Figured this out, here's a function that does it:

-- gen_random_bytes() comes from the pgcrypto extension:
-- CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE OR REPLACE FUNCTION generate_uid(size INT) RETURNS TEXT AS $$
DECLARE
  characters TEXT := 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  bytes BYTEA := gen_random_bytes(size);
  l INT := length(characters);
  i INT := 0;
  output TEXT := '';
BEGIN
  WHILE i < size LOOP
    -- Map each random byte onto the alphabet (substr is 1-indexed).
    output := output || substr(characters, get_byte(bytes, i) % l + 1, 1);
    i := i + 1;
  END LOOP;
  RETURN output;
END;
$$ LANGUAGE plpgsql VOLATILE;

And then to run it simply do:

SELECT generate_uid(10);
-- '3Rls4DjWxJ'
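
One detail of the function above: 256 is not a multiple of 62, so taking a random byte modulo 62 makes the first 8 characters of the alphabet slightly more likely than the rest. If that matters, one option is to throw away bytes of 248 and above. Here is a rough sketch of that idea in Ruby rather than plpgsql, with the same alphabet; the rejection step is my addition, not part of the original function.

```ruby
require 'securerandom'

# Same 62-character alphabet, in the same order, as the SQL function.
CHARACTERS = [*'A'..'Z', *'a'..'z', *'0'..'9'].join.freeze

# Largest multiple of 62 that fits in a byte (248). Bytes at or above
# this limit are skipped so every character is equally likely.
LIMIT = 256 - (256 % CHARACTERS.length)

def generate_uid(size)
  output = +''
  while output.length < size
    SecureRandom.random_bytes(size).each_byte do |byte|
      next if byte >= LIMIT # reject to avoid modulo bias
      output << CHARACTERS[byte % CHARACTERS.length]
      break if output.length == size
    end
  end
  output
end
```

Only about 3% of bytes are rejected, so the loop rarely needs more than one extra pass.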

Warning

When doing this, make sure the IDs are long enough to avoid collisions as the number of objects you create grows. The required length can be counter-intuitive because of the birthday paradox, so you will likely want a length greater (or much greater) than 10 for any commonly created object. I used 10 only as a simple example.
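
To get a feel for the numbers, the usual birthday-paradox approximation p ≈ 1 - exp(-n(n-1) / 2N) can be evaluated directly, where n is the number of IDs and N is the number of possible IDs. A small sketch, assuming IDs are drawn uniformly from a 62-character alphabet (the method name `collision_probability` is just for illustration):

```ruby
# Approximate probability of at least one collision among `count` random IDs
# drawn uniformly from alphabet_size**length possible values.
def collision_probability(count, length, alphabet_size = 62)
  space = alphabet_size.to_f**length
  1.0 - Math.exp(-count * (count - 1) / (2.0 * space))
end
```

Under these assumptions, `collision_probability(1_000_000_000, 10)` comes out at roughly 45%, while `collision_probability(1_000_000_000, 16)` is on the order of 1e-11.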


Usage

With the function defined, you can use it in a table definition, like so:

CREATE TABLE collections (
  id TEXT PRIMARY KEY DEFAULT generate_uid(10),
  name TEXT NOT NULL,
  ...
);

And then when inserting data, like so:

INSERT INTO collections (name) VALUES ('One');
INSERT INTO collections (name) VALUES ('Two');
INSERT INTO collections (name) VALUES ('Three');
SELECT * FROM collections;

It will automatically generate the id values:

     id     |  name  | ...
------------+--------+-----
 owmCAx552Q | ian    |
 ZIofD6l3X9 | victor |

Usage with a Prefix

Or maybe you want to add a prefix for convenience when looking at a single ID in the logs or in your debugger (similar to how Stripe does it), like so:

CREATE TABLE collections (
  id TEXT PRIMARY KEY DEFAULT ('col_' || generate_uid(10)),
  name TEXT NOT NULL,
  ...
);

INSERT INTO collections (name) VALUES ('One');
INSERT INTO collections (name) VALUES ('Two');
INSERT INTO collections (name) VALUES ('Three');
SELECT * FROM collections;

       id       |  name  | ...
----------------+--------+-----
 col_wABNZRD5Zk | ian    |
 col_ISzGcTVj8f | victor |
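
If the IDs are generated in the application instead of in a column default, the same prefix idea might look like this in Ruby. `prefixed_id` is a hypothetical helper, and `SecureRandom.alphanumeric` needs Ruby 2.5 or later.

```ruby
require 'securerandom'

# Builds an ID such as "col_" followed by `length` random alphanumeric characters.
def prefixed_id(prefix, length = 10)
  "#{prefix}_#{SecureRandom.alphanumeric(length)}"
end
```

For example, `prefixed_id('col')` returns a string of the same shape as the IDs in the table above.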

Getting primary key sequence number in Postgres

Postgres generates a new ID from the sequence each time nextval is called. You might get gaps in the sequence caused by rollbacks and similar events, but if that isn't important, you should be good to go.

You should be using nextval to get a new value. currval only reports the value most recently returned by nextval for that sequence in the current session.

Sequence documentation.
https://www.postgresql.org/docs/current/static/sql-createsequence.html

pseudo_encrypt() function in plpgsql that takes bigint

4294967295 (0xFFFFFFFF) must be used as the bitmask to select 32 bits, instead of 4294967296 (0x100000000), which has only bit 32 set.
That's why you currently get the same value for different inputs.
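
The effect of the two masks is easy to check outside the database. A quick illustration in Ruby (the variable names are only for this example):

```ruby
value = 0x123456789 # a 33-bit input

# 4294967295 is 0xFFFFFFFF: it keeps the low 32 bits.
low32_correct = value & 4_294_967_295

# 4294967296 is 0x100000000: it keeps only bit 32 and drops everything below it.
low32_wrong = value & 4_294_967_296

# With the wrong mask, every small input collapses to the same value.
small_inputs_masked = (1..10).map { |x| x & 4_294_967_296 }.uniq
```

Here `low32_correct` is 0x23456789, `low32_wrong` is 0x100000000, and `small_inputs_masked` is `[0]`.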

I'd also suggest using bigint for l2 and r2; their types shouldn't differ from those of l1 and r1.

And, for better randomness, use a much higher multiplier in the PRNG function, such as 32767*32767 instead of 32767, so that the intermediate blocks come much closer to filling 32 bits.

The complete modified version:

CREATE OR REPLACE FUNCTION pseudo_encrypt(VALUE bigint) returns bigint AS $$
DECLARE
  l1 bigint;
  l2 bigint;
  r1 bigint;
  r2 bigint;
  i int:=0;
BEGIN
  l1:= (VALUE >> 32) & 4294967295::bigint;
  r1:= VALUE & 4294967295;
  WHILE i < 3 LOOP
    l2 := r1;
    r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767*32767)::int;
    l1 := l2;
    r1 := r2;
    i := i + 1;
  END LOOP;
  RETURN ((l1::bigint << 32) + r1);
END;
$$ LANGUAGE plpgsql strict immutable;

First results:


select x, pseudo_encrypt(x::bigint) from generate_series(1, 10) as x;
 x  |   pseudo_encrypt
----+---------------------
  1 | 3898573529235304961
  2 | 2034171750778085465
  3 |  169769968641019729
  4 | 2925594765163772086
  5 | 1061193016228543981
  6 | 3808195743949274374
  7 | 1943793931158625313
  8 |   88214277952430814
  9 | 2835217030863818694
 10 |  970815170807835400
(10 rows)
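
The function has the shape of a Feistel network: split the input into two 32-bit halves, then repeatedly swap them while XOR-ing one half with a function of the other. That structure is what keeps distinct inputs from colliding. Below is a rough Ruby sketch of the same structure. It is an assumption-laden illustration: the round function uses integer arithmetic as a stand-in for the numeric expression in the SQL, so it will not reproduce the values in the table above, and `feistel_decrypt` is a hypothetical inverse that the SQL version does not define.

```ruby
MASK32 = 0xFFFFFFFF

# Integer-only stand-in for the SQL round expression (assumption).
# The result is always below 32767 * 32767, so it fits in 32 bits.
def round_value(r)
  ((1366 * r + 150_889) % 714_025) * 32_767 * 32_767 / 714_025
end

# Three Feistel rounds over the two 32-bit halves of a non-negative 64-bit value.
def feistel_encrypt(value, rounds = 3)
  l = (value >> 32) & MASK32
  r = value & MASK32
  rounds.times { l, r = r, l ^ round_value(r) }
  (l << 32) + r
end

# Undoes feistel_encrypt by running the rounds backwards.
def feistel_decrypt(value, rounds = 3)
  l = (value >> 32) & MASK32
  r = value & MASK32
  rounds.times { l, r = r ^ round_value(l), l }
  (l << 32) + r
end
```

Because every round can be undone, two different inputs can never map to the same output, whatever the round function returns.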

