Consistent String#hash based only on the string's content

南方客 2020-12-15 16:06

GOAL: Map every URL handled by a server to 0, 1, 2, or 3, distributing as uniformly as possible.

While the documentation for Ruby's String#hash met

4 Answers
  • 2020-12-15 16:36

    The easiest (and consistent) way may be this (and it's fast):

    "https://www.example.com/abc/def/123?hij=345".sum % 4
    

    That always produces an integer from 0 to 3, is quite fast, and should be fairly well distributed (though I haven't actually tested the distribution).
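
    One way to check the distribution claim is to bucket a batch of URLs and count how many land in each slot. A minimal sketch, assuming a hypothetical helper `bucket_for` and made-up sample URLs (neither comes from the answer above):

    ```ruby
    # Hypothetical helper: maps a URL to one of `buckets` slots via String#sum,
    # which returns a 16-bit checksum of the string's bytes by default.
    def bucket_for(url, buckets = 4)
      url.sum % buckets
    end

    # Made-up sample URLs, for illustration only.
    urls = (1..1000).map { |i| "https://www.example.com/items/#{i}?page=#{i % 7}" }

    # Count how many URLs fall into each bucket.
    counts = Hash.new(0)
    urls.each { |u| counts[bucket_for(u)] += 1 }
    puts counts.sort.to_h
    ```

    Because String#sum only adds up byte values, strings that are permutations of each other (for example "/ab" and "/ba") always land in the same bucket, so it is worth running a check like this against real traffic.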

  • 2020-12-15 16:36

    There is a tiny library, xxHash:

    require 'xxhash' # assumes the xxhash gem is installed

    XXhash.xxh32('qwe') #=> 2396643526
    XXhash.xxh64('qwe') #=> 9343136760830690622
    

    It may produce more collisions, but it is roughly 10x faster than SHA1:

    require 'benchmark'
    require 'digest/sha1'
    require 'xxhash'

    Benchmark.bm do |x|
      n = 100_000
      str = 'qweqweqwe'
      x.report('xxhash32') { n.times { XXhash.xxh32(str) } }
      x.report('xxhash64') { n.times { XXhash.xxh64(str) } }
      x.report('hexdigest') { n.times { Digest::SHA1.hexdigest(str) } }
    end
    
    #       user     system      total        real
    # xxhash32  0.020000   0.000000   0.020000 (  0.021948)
    # xxhash64  0.040000   0.000000   0.040000 (  0.036340)
    # hexdigest  0.240000   0.030000   0.270000 (  0.276443)
    
  • 2020-12-15 16:44

    There is a lot of this functionality in Ruby's digest module: http://ruby-doc.org/stdlib/libdoc/digest/rdoc/index.html

    A simple example:

    require 'digest/sha1'
    Digest::SHA1.hexdigest("some string")
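
    To turn a digest into one of the four buckets from the question, the hex string can be read as an integer and reduced modulo 4. A sketch, assuming a hypothetical helper named `digest_bucket` (not part of the digest library):

    ```ruby
    require 'digest/sha1'

    # Hypothetical helper: reads the SHA1 hex digest as a base-16 integer
    # and reduces it to a bucket index in 0...buckets.
    def digest_bucket(url, buckets = 4)
      Digest::SHA1.hexdigest(url).to_i(16) % buckets
    end

    url = "https://www.example.com/abc/def/123?hij=345"
    puts digest_bucket(url)
    ```

    The result depends only on the bytes of the string, so it stays the same across processes and machines, unlike String#hash, which is seeded per Ruby process.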
    
  • 2020-12-15 16:49

    You can try to_i(36).

    "Hash me please :(".to_i(36) #=> 807137
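
    Keep in mind that String#to_i stops parsing at the first character that is not a valid digit in the given base, so only the leading alphanumeric run contributes to the result. A short sketch of what that means for URLs (the example URLs are made up):

    ```ruby
    # Only "Hash" is parsed; the space ends the number.
    puts "Hash me please :(".to_i(36) == "Hash".to_i(36)

    # For URLs, parsing stops at the ':' after the scheme, so these two
    # different https URLs yield the same value.
    a = "https://www.example.com/abc".to_i(36)
    b = "https://www.example.com/xyz?q=1".to_i(36)
    puts a == b
    ```

    Both lines print true, so for the URL use case in the question this approach would put every URL with the same scheme into the same bucket.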
    