Finding the Hash Value of a Row in Postgresql

Finding the hash value of a row in postgresql

Cast the row to text and use md5 to make a hash:

SELECT
md5(CAST((f.*)AS text))
FROM
foo f;

Finding out the hash value of a group of rows

The simplest way - just concat all the string form md5 with string_agg. But to use this aggregate correctly you must specify ORDER BY.

Or use md5(string_agg(md5(CAST((f.*)AS text)),'')) with some ORDER BY - it will change if any field of f.* changes and it is cheap to compare.

How can I get a hash of an entire table in postgresql?

just do like this to create a hash table aggregation function.

create function pg_concat( text, text ) returns text as '
begin
if $1 isnull then
return $2;
else
return $1 || $2;
end if;
end;' language 'plpgsql';

create function pg_concat_fin(text) returns text as '
begin
return $1;
end;' language 'plpgsql';

create aggregate pg_concat (
basetype = text,
sfunc = pg_concat,
stype = text,
finalfunc = pg_concat_fin);

then you could use the pg_concat function to caculate the table's hash value.

select md5(pg_concat(md5(CAST((f.*)AS text)))) from f order by id

How to hash a query result with sha256 in PostgreSQL?

PostgreSQL doesn't come with a built-in streaming hash function exposed to the user, so the easiest way is to build the string in memory and then hash it. Of course this won't work with giant result sets. You can use digest from the pg_crypto extension. You also need to order your rows, or else you might get different results on the same data from one execution to the next if you get the rows in different orders.

select digest(string_agg(output::text,' ' order by output),'sha256')
from result;

How to know which partition will be used in Postgres hash partitioning?

Reverse engineering the code, you can get the partition number with the following statement:

SELECT (hashtextextended('value', 8816678312871386365)::numeric + 5305509591434766563) % 8;

Replace 8 with the number of partitions and 'value' with the string in question.

You can test the partition number with satisfies_hash_partition. To test if 'value' would end up in partition 6 of 8 in table tab, you can run

SELECT satisfies_hash_partition('tab'::regclass, 8, 6, 'value'::text);

PostgreSQL hash a column using SHA256 and a pre-defined salt

That is trivial, just concatenate the string with the salt:

SELECT encode(digest('this_is_my_salt' || 'email@example.com', 'sha256'),'hex');

Hashing a String to a Numeric Value in PostgreSQL

Just keep the first 32 bits or 64 bits of the MD5 hash. Of course, it voids the main property of md5 (=the probability of collision being infinitesimal) but you'll still get a wide dispersion of values which presumably is good enough for your problem.

SQL functions derived from the other answers:

For bigint:

create function h_bigint(text) returns bigint as $$
select ('x'||substr(md5($1),1,16))::bit(64)::bigint;
$$ language sql;

For int:

create function h_int(text) returns int as $$
select ('x'||substr(md5($1),1,8))::bit(32)::int;
$$ language sql;

Get records by hash value

It is not easy to search within a serialized field, but you can use LIKE with some limitations.

Activity.where('parameters LIKE ?', '%status: new%')

This is working, but I do suggest adding a custom field, just like what public activity gem owner has mentioned:

"Besides the few fields that every activity has, you can also set
custom fields. This could be very beneficial, as parameters are a
serialized hash, which cannot be queried easily from the database."



Related Topics



Leave a reply



Submit