Hashing a String to a Numeric Value in PostgreSQL

Posted by ε祈祈猫儿з on 2019-11-27 00:43:28

Question


I need to convert strings stored in my database to a numeric value. The result can be an integer (preferred) or a bigint. The conversion has to be done on the database side, in a PL/pgSQL function.

Can someone please point me to an algorithm or any APIs that can be used to achieve this?

I have been searching for this on Google for hours now and could not find anything useful so far :(


Answer 1:


Just keep the first 32 or 64 bits of the MD5 hash. Of course, this gives up the main property of MD5 (a negligible probability of collision): with 32 bits, about 77,000 distinct strings already give a roughly 50% chance of at least one collision. You still get a wide dispersion of values, which is presumably good enough for your problem.

SQL functions derived from the other answers:

For bigint:

create function h_bigint(text) returns bigint as $$
 select ('x'||substr(md5($1),1,16))::bit(64)::bigint;
$$ language sql;

For int:

create function h_int(text) returns int as $$
 select ('x'||substr(md5($1),1,8))::bit(32)::int;
$$ language sql;
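
A short usage sketch for the two functions above. The input string is only an illustration, and masking off the sign bit is one possible option rather than part of the original answer. Both results are signed, so roughly half of all inputs map to negative numbers.

```sql
-- Assumes h_int and h_bigint were created as shown above.
select h_int('hello, world')    as int_hash,
       h_bigint('hello, world') as bigint_hash;

-- If only non-negative values are wanted, one option is to clear the sign bit.
-- This leaves 31 usable bits instead of 32.
select h_int('hello, world') & 2147483647 as non_negative_hash;
```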



Answer 2:


You can create an MD5 hash value without problems:

select md5('hello, world');

This returns the hash as a string of 32 hexadecimal digits.

Unfortunately, there is no built-in function to convert hex to integer, but as you are doing this in PL/pgSQL anyway, this might help:

https://stackoverflow.com/a/8316731/330315
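
As a minimal sketch of one way to do the hex-to-integer step (not necessarily what the linked answer does), the helper below casts a hex string through bit(64). The name hex_to_bigint is hypothetical, not a built-in, and the function assumes its input contains only hex digits.

```sql
-- Hypothetical helper: converts up to 16 hex digits to a signed bigint.
create function hex_to_bigint(text) returns bigint as $$
  -- lpad left-pads short inputs with zeros so they keep their value;
  -- without it, the cast to bit(64) would pad on the right instead.
  -- Inputs longer than 16 digits are cut down to their first 16.
  select ('x' || lpad($1, 16, '0'))::bit(64)::bigint;
$$ language sql immutable strict;

select hex_to_bigint('ff');                                -- 255
select hex_to_bigint(substr(md5('hello, world'), 1, 16));  -- first 64 bits of the MD5
```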




Answer 3:


Must it be an integer? The pgcrypto module provides a number of standard hash algorithms (md5, sha1, etc.) through its digest() function, which returns bytea. I suppose you could throw away some bits and convert the bytea to an integer.

A bigint is too small to hold a full cryptographic hash. The largest non-bytea binary type Pg supports is uuid (128 bits). You could cast the first 16 bytes of a digest to uuid like this:

select ('{'||encode( substring(digest('foobar','sha256') from 1 for 16), 'hex')||'}')::uuid;
                 uuid                 
--------------------------------------
 c3ab8ff1-3720-e8ad-9047-dd39466b3c89
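
The "throw away some bits" idea can be sketched the same way: keep only the first 8 bytes of the digest and read them as a bigint. This is one possible approach that assumes the pgcrypto extension can be installed in the database. The result is signed and may be negative.

```sql
-- digest() comes from the pgcrypto extension, which must be installed first.
create extension if not exists pgcrypto;

-- Keep the first 8 bytes (64 bits) of the SHA-256 digest,
-- hex-encode them, and read the hex string as a bigint.
select ('x' || encode(substring(digest('foobar', 'sha256') from 1 for 8), 'hex'))::bit(64)::bigint
       as sha256_bigint;
```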



Answer 4:


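This is an implementation of Java's String.hashCode(). It wraps around at 32 bits as Java does, and it matches Java for characters in the Basic Multilingual Plane, assuming a UTF8 database:

CREATE OR REPLACE FUNCTION hashCode(_string text) RETURNS INTEGER AS $$
DECLARE
  h_ BIGINT := 0;  -- wide accumulator so 31 * h_ cannot raise "integer out of range"
  c_ TEXT;         -- text, not char: casting char(1) ' ' back to text strips the space
BEGIN
  FOREACH c_ IN ARRAY regexp_split_to_array(_string, '')
  LOOP
    -- keep only the low 32 bits, as Java's int arithmetic does
    h_ := (31 * h_ + ascii(c_)) & 4294967295;
  END LOOP;
  -- reinterpret the unsigned 32-bit value as a signed int
  IF h_ >= 2147483648 THEN
    h_ := h_ - 4294967296;
  END IF;
  RETURN h_;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;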
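
A quick way to sanity-check the function is to compare it with a value known from Java. The string below is only an example. For it the hash stays within the int range, so no wrap-around is involved.

```sql
-- Assumes the hashCode function above has been created.
select hashCode('hello');   -- 99162322, the value of "hello".hashCode() in Java
```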


Source: https://stackoverflow.com/questions/9809381/hashing-a-string-to-a-numeric-value-in-postgresql
