hash | 易学教程

An Eye-Opening Algorithm: Consistent Hashing

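Background: As data volumes keep growing, people increasingly prefer scaling out over scaling up a single machine. A cluster of ordinary, inexpensive machines stands in for a supercomputer, which saves a great deal of money at the cost of much higher system complexity. To cope with that complexity, wave after wave of distributed-systems techniques has emerged, some of them remarkably clever. Take storage: when one machine cannot hold all the data, the usual approach is to shard the data across several machines and let the cluster meet the storage requirement. For example, suppose you need to cache a large amount of data, say 100 GB. An ordinary machine clearly does not have that much memory, so you spread the 100 GB over, say, 10 machines holding 10 GB each. There are many ways to assign data to machines, and an obvious one is hashing: take each key modulo 10 and store it on the resulting machine. This looks fine, but two factors have to be considered. First, the cluster is built from cheap commodity machines, so machine failures are perfectly normal. Second, as the data keeps growing, 10 machines stop being enough, so you add one, and some time later you add another. Both cases have one thing in common: the number of machines changes. Once it changes, recomputing the modulo invalidates a large share of the cache. For example, with 10 machines, key % 10 spreads the data evenly across them. A key of 100 gives 100 % 10 = 0, so it is stored on machine 0. Then one machine goes down, and a lookup of that same key now computes 100 % 9 = 1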
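
The effect described above can be measured with a small sketch. It assumes integer keys and treats the result of `key % n` as the identity of the machine, so a key "moves" whenever that result changes; the helper name `moved_fraction` and the key range are made up for illustration.

```python
def moved_fraction(keys, old_n, new_n):
    """Fraction of keys whose `key % n` slot changes when n goes from old_n to new_n."""
    keys = list(keys)
    moved = sum(1 for k in keys if k % old_n != k % new_n)
    return moved / len(keys)


# The key from the text: slot 0 with 10 machines, slot 1 with 9.
print(100 % 10, 100 % 9)

# Share of the keys 0..8999 that land in a different slot after losing one machine.
print(moved_fraction(range(9000), 10, 9))
```

Under these assumptions only keys whose remainder happens to be the same for both divisors stay put, which is why plain modulo placement is so sensitive to a change in machine count.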

Calculate Hash or Checksum for a table in SQL Server

Question: I'm trying to compute a checksum or a hash for an entire table in SQL Server 2008. The problem I'm running into is that the table contains an XML column datatype, which cannot be used by checksum and has to be converted to nvarchar first. So I need to break it down into two problems: (1) calculate a checksum for a row, where the schema is unknown before runtime; (2) calculate the checksum for all of the rows to get the full table checksum. Answer 1: You can use CHECKSUM_AGG. It only takes a single argument, so you
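
The two-step structure of the question (hash each row, then aggregate the row hashes) can be sketched outside the database. This is only an illustration of the idea in Python, not SQL Server's CHECKSUM or CHECKSUM_AGG algorithm; the function names, the SHA-256 choice, and the column separator are all assumptions.

```python
import hashlib

MASK64 = (1 << 64) - 1


def row_checksum(row):
    """Hash one row; every column is converted to text first, whatever its type."""
    # Note: None and the empty string are treated alike in this sketch.
    text = "\x1f".join("" if v is None else str(v) for v in row)
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def table_checksum(rows):
    """Combine row checksums by addition modulo 2**64, so row order does not matter."""
    total = 0
    for row in rows:
        total = (total + row_checksum(row)) & MASK64
    return total


rows = [(1, "alice", "<doc><a/></doc>"), (2, "bob", None)]
print(table_checksum(rows) == table_checksum(list(reversed(rows))))
```

Converting every column to text before hashing mirrors the workaround mentioned in the question for the XML column; an order-independent aggregate is used because a table has no inherent row order.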

HAProxy Scheduling Algorithms Explained, Part 2

uri: hashes the URI of the user's request and forwards the request to a designated backend server. map-based or consistent can additionally be specified to choose between modulo hashing and consistent hashing.
  http://example.org/absolute/URI/with/absolute/path/to/resource.txt #URI/URL; on the network this is both a URL and a URI
  ftp://example.org/resource.txt #URI/URL
  /relative/URI/with/absolute/path/to/resource.txt #URI; to the server this is only a URI
uri consistent hash configuration: hash on the URI, used for cache servers
  listen yewu-service-80
    bind 192.168.38.37:80
    mode http
    balance uri #schedule by the URI of the user's request; the URI belongs to the application layer, so the protocol after mode must be http
    hash-type consistent
    option forwardfor
    server web1 192.168.38.27:80 weight 1 check inter 3000 fall 3 rise 5
    server web2 192.168.38.47:80 weight 1 check inter 3000 fall 3
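
To illustrate what the consistent option buys over modulo hashing, here is a minimal hash-ring sketch. It is not HAProxy's implementation: the MD5-based ring position, the `replicas` count, the `HashRing` class, and the extra server `web3` are all assumptions made for the example.

```python
import bisect
import hashlib


def _point(label):
    """Map a string to a position on the ring (a 32-bit integer)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:4], "big")


class HashRing:
    def __init__(self, servers, replicas=100):
        # Each server gets `replicas` virtual points to even out the distribution.
        self._ring = sorted(
            (_point(f"{name}#{i}"), name) for name in servers for i in range(replicas)
        )
        self._points = [p for p, _ in self._ring]

    def lookup(self, uri):
        """Return the server owning the first ring point at or after hash(uri)."""
        i = bisect.bisect_left(self._points, _point(uri)) % len(self._ring)
        return self._ring[i][1]


uris = [f"/static/img/{i}.png" for i in range(1000)]
before = HashRing(["web1", "web2", "web3"])  # web3 is hypothetical
after = HashRing(["web1", "web2"])           # web3 taken out of rotation
moved = [u for u in uris if before.lookup(u) != after.lookup(u)]

# Only URIs that used to map to web3 are reassigned; the rest keep their server.
print(all(before.lookup(u) == "web3" for u in moved))
```

The point of the design is that removing a server only deletes that server's points from the ring, so requests for every other URI keep hitting the cache that already holds them.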

What are “Resource#'s”?

Question: Hi, I am getting Resource#6 and Resource#7 when I print the following variables:
  $salty_password = sha1($row['salt'], $_POST['password']);
  if(isset($_POST['subSignIn']) && !empty($_POST['email']) && !empty($_POST['password'])) {
    $query = "SELECT `salt` FROM `cysticUsers` WHERE `Email` = '" . $_POST['email'] . "'";
    $request = mysql_query($query,$connection) or die(mysql_error());
    $result = mysql_fetch_array($request);
    $query2 = "SELECT * FROM `cysticUsers` WHERE `Email` = '". $_POST['email']."'

PHP.net says that md5() and sha1() unsuitable for password?

Question: http://www.php.net/manual/en/faq.passwords.php#faq.passwords.fasthash I'm storing user passwords in a MySQL database in hash form. Does this mean that it is unsafe to do so? If it is, what are my alternatives? Answer 1: The next question in the FAQ you linked to discusses it: How should I hash my passwords, if the common hash functions are not suitable? From the FAQ: The suggested algorithm to use when hashing passwords is Blowfish, as it is significantly more computationally expensive than MD5 or
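
The underlying idea, a salted and deliberately slow hash instead of a single fast MD5 or SHA-1 pass, can be sketched in Python. Note the swap: the FAQ recommends Blowfish, while this sketch uses PBKDF2-HMAC-SHA256 because it ships with Python's standard library. The iteration count and the function names are assumptions, not recommendations from the FAQ.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The salt and digest would both be stored per user; the high iteration count is what makes each guess expensive for an attacker who obtains the database.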

Perfect hash function generator for functions

Question: I have a set of C++ functions. I want to map these functions in a hash table, something like: unordered_map<function<ReturnType (Args...)> , SomethingElse> , where SomethingElse is not relevant for this question. This set of functions is known in advance, small (let's say fewer than 50) and static (it is not going to change). Since lookup performance is crucial (it should be performed in O(1)), I want to define a perfect hashing function. Is there a perfect hash function generator for this scenario?

How __hash__ is implemented in Python 3.2?

Question: I want to make a custom object hashable (via pickling). I could find the __hash__ algorithm for Python 2.x (see code below), but it obviously differs from hash for Python 3.2 (I wonder why?). Does anybody know how __hash__ is implemented in Python 3.2?
  #Version: Python 3.2
  def c_mul(a, b):
      #C type multiplication
      return eval(hex((int(a) * b) & 0xFFFFFFFF)[:-1])
  class hs:
      #Python 2.x algorithm for hash from http://effbot.org/zone/python-hash.htm
      def __hash__(self):
          if not self: return 0 # empty
          value =

An Eye-Opening Algorithm: Bloom Filter

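Problem: Suppose you run a website with many visitors, and each time a user arrives you want to know whether this IP is visiting your site for the first time. This is a common scenario, and an obvious solution comes to mind: store visitor IPs in a hash table, and whenever a visitor arrives, check whether the table already contains that IP; if it does, the visitor has been here before. You also know that hash-table inserts and lookups are O(1), which is efficient, so you are pleased with the design. Now suppose the site has been visited by 100 million users and each IP is stored as 15 characters. You need 15 * 100000000 = 1500000000 bytes ≈ 1.4 GB, and that is before accounting for hash collisions (the more slots the table has, the more space is wasted; the fewer slots, the lower the efficiency). After a moment's thought you realise that an IP can be stored as an unsigned int, which takes only 4 bytes, so 100 million IPs occupy 4 * 100000000 = 400000000 bytes ≈ 380 MB, a large saving. Is there a way to save even more space without hurting access efficiency? BitSet: the largest value a 32-bit unsigned int can represent is 4294967295, and every IP falls within this range. We can use a single bit to record whether a given IP has appeared: if it has, the bit representing that IP is set to 1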
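
The one-bit-per-IP idea can be sketched as follows. A flat bit array covering all 2**32 addresses would take 2**32 / 8 bytes (512 MiB), so this sketch allocates the array lazily in pages to stay small when run; the paging, the page size, and the class name `IPv4BitSet` are assumptions added for the example and are not part of the original article.

```python
import ipaddress

PAGE_BITS = 1 << 16  # bits per lazily allocated page (an arbitrary choice)


class IPv4BitSet:
    """One bit per IPv4 address, indexed by the address's 32-bit integer value."""

    def __init__(self):
        self._pages = {}

    @staticmethod
    def _index(ip):
        # "1.2.3.4" -> unsigned 32-bit integer
        return int(ipaddress.IPv4Address(ip))

    def add(self, ip):
        """Mark ip as seen; return True if this is the first time it was seen."""
        page_no, bit = divmod(self._index(ip), PAGE_BITS)
        page = self._pages.setdefault(page_no, bytearray(PAGE_BITS // 8))
        byte, mask = bit >> 3, 1 << (bit & 7)
        first_time = not (page[byte] & mask)
        page[byte] |= mask
        return first_time

    def __contains__(self, ip):
        page_no, bit = divmod(self._index(ip), PAGE_BITS)
        page = self._pages.get(page_no)
        return page is not None and bool(page[bit >> 3] & (1 << (bit & 7)))


seen = IPv4BitSet()
print(seen.add("192.168.38.37"))   # first visit
print(seen.add("192.168.38.37"))   # repeat visit
print("10.0.0.1" in seen)          # never added
```

Unlike a hash table, the bit position is the address itself, so there are no collisions to resolve and no keys to store.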

Redis Notes

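Redis notes
What is Redis? What can it do? Where do you download it? How do you use it?
Installing Redis
Configuring Redis for external network access
Redis data types and API operations (http://redisdoc.com/)
  key
  1. string
  2. list
  3. set
  4. hash
  5. zset
Redis persistence mechanisms
  1. RDB: what is it?
    1. Where is the persistence file?
    2. When does it fork a child process, that is, when is RDB persistence triggered?
  2. AOF (--fix) ls -l --block-size=M: what is it?
    1. Where is the persistence file?
    2. Trigger mechanism (depends on configuration file settings)
    3. AOF rewrite mechanism
    4. Hybrid persistence in Redis 4.0 and later; enabling hybrid persistence
  Short summary:
    1. Redis already provides RDB persistence, so why is AOF needed as well?
    2. If AOF and RDB both exist, which one wins?
    3. Strengths and weaknesses of RDB and AOF
  Performance advice (for persistence on standalone Redis only):
Redis cluster topics
  Redis master-replica replication
    1. What it is
    2. What it can do
    3. How to use it
    4. Cost of full replication
    5. Drawbacks
  Redis Sentinel mode
    1. What is it and what can it do?
    2. Main Sentinel functions (what it does)
    3. Architecture
    4. How to use it (hands-on)
      1. Deploy the master and replica nodes
      2. Deploy the Sentinel nodes. There are two ways to start a Sentinel node, and both have exactly the same effect:
    5. Failover demonstration (Sentinel monitoring and automatic failover)
    6. Accessing the Sentinel system from a client (jedis)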

Why do the md5 hashes of two tarballs of the same file differ?

Question: I can run:
  echo "asdf" > testfile
  tar czf a.tar.gz testfile
  tar czf b.tar.gz testfile
  md5sum *.tar.gz
and it turns out that a.tar.gz and b.tar.gz have different md5 hashes. It's true that they're different, which diff -u a.tar.gz b.tar.gz confirms. What additional flags do I need to pass in to tar so that its output is consistent over time with the same input?
Answer 1: tar czf outfile infiles is equivalent to
  tar cf - infiles | gzip > outfile
The reason the files are different is because gzip
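
The excerpt cuts off before the explanation. One property of the gzip format that produces exactly this symptom is the modification-time field in the gzip header, and the sketch below shows it using Python's `gzip` module rather than the tar and gzip commands from the question. Whether this is the full explanation the answer goes on to give is an assumption.

```python
import gzip

data = b"asdf\n"

# Same payload, compressed with two different header timestamps.
a = gzip.compress(data, mtime=1)
b = gzip.compress(data, mtime=2)
# Same payload, same timestamp as `a`.
same = gzip.compress(data, mtime=1)

print(a == b)            # differing timestamps give differing bytes
print(a == same)         # pinning the timestamp makes the output repeatable
print(a[4:8], b[4:8])    # header bytes 4..7 hold the little-endian timestamp
```

In this sketch the compressed payload is identical and only the timestamp bytes differ, which is enough to change an md5 hash of the whole file.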
