An interesting phenomenon of Lua's table

Submitted by 主宰稳场 on 2019-11-30 19:21:55

I haven't looked at how the # operator is implemented, but I bet what's going on is that by adding the extra 100 indexes, you've caused the range 1-300 to become dense enough that the indexes 100-300 end up in the "array" part of the table implementation instead of the "hash" part.

Update:

Ok, I looked at the source for the primitive table length. If the final entry in the array part is nil, it binary-searches the array to find a "boundary" (a non-nil index followed by a nil index), which is not necessarily the lowest one. If it's not nil, it decides the boundary must be in the hash and searches for it.
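To make the two branches concrete, here is a rough Python model of that boundary search. It is only a sketch of the procedure described above, not Lua's actual C code: `table_length`, `array`, and `hash_part` are made-up names, the array part is a Python list with `None` standing in for nil, and the hash part is a plain dict.

```python
def table_length(array, hash_part):
    """Model of the boundary search. `array` holds slots 1..len(array); None means nil."""
    n = len(array)
    if n > 0 and array[n - 1] is None:
        # Last array slot is nil: binary-search the array part.
        # Invariant: slot `lo` is non-nil (or lo == 0) and slot `hi` is nil.
        lo, hi = 0, n
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if array[mid - 1] is None:
                hi = mid
            else:
                lo = mid
        return lo
    if not hash_part:
        # Array part is full and there is nothing in the hash part.
        return n
    # Array part is full: probe the hash part, doubling the index until a nil
    # is found, then binary-search between the last hit and the first miss.
    i, j = n, n + 1
    while hash_part.get(j) is not None:
        i, j = j, j * 2
    while j - i > 1:
        mid = (i + j) // 2
        if hash_part.get(mid) is None:
            j = mid
        else:
            i = mid
    return i


print(table_length([10, 20, 30, None], {}))   # 3
print(table_length([10, 20], {3: 30}))        # 3
print(table_length([1, None, 3, None], {}))   # 1
```

The last call shows why holes are a problem: both 1 and 3 are boundaries of that table, and the binary search just happens to land on 1.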

So with the table containing numeric indexes {1, 2, 3, 100..200}, I assume it's not dense enough and the array part only contains {1, 2, 3}. But with the table containing {1, 2, 3, 100..300}, it's presumably dense enough that the array part ends somewhere within the 100..300 part (I think the array part is always a power of 2, so it can't possibly end at 300, but I'm not 100% positive).

Update 2:

When a Lua table is rehashed, it counts the number of integer keys. It then walks up all the powers of two that are less than twice the number of integer keys, and finds the largest power of two that is more than 50% dense (meaning that if the array part were this large, more than 50% of its slots would be non-nil).
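The sizing rule can be sketched in a few lines of Python. This is a simplified model, not the real implementation: `optimal_array_size` is a made-up name, the input is assumed to be a list of distinct integer keys, a candidate is assumed to qualify when more than half of its slots would be filled, and the whole key set is sized in one go, whereas a real table is rehashed several times while it grows.

```python
def optimal_array_size(int_keys):
    """Return the array-part size this simplified model would pick."""
    keys = [k for k in int_keys if k >= 1]
    n = len(keys)
    best = 0
    size = 1
    # Only powers of two below twice the key count are candidates.
    while size < 2 * n:
        in_range = sum(1 for k in keys if k <= size)
        # Keep the candidate if more than half of its slots would be used.
        if 2 * in_range > size:
            best = size
        size *= 2
    return best


dense = [1, 2, 3] + list(range(100, 301))
print(optimal_array_size(dense))          # 256
print(optimal_array_size([1, 2, 1000]))   # 2
```

In the second call the lone key 1000 never makes any larger power of two dense enough, so in this model only keys 1 and 2 would land in the array part.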

So with {1, 2, 3, 100..200}, it walks up

1: 100% dense; good
2: 100% dense; good
4: 75% dense; good
8: 37.5% dense; bad
16: 18.75% dense; bad
32: 9.375% dense; bad
64: 4.6875% dense; bad
128: 25% dense; bad
256: 40.625% dense; bad

(With only 104 integer keys, 256 is more than twice the key count, so it would not actually be tried; it is listed for comparison with the next case.)

The best good value is 4, so it ends up with an array size of 4. Since index 4 is nil, it binary-searches the array part for the boundary and finds 3.

Once you add 201..300 the last step becomes

256: 62.5% dense; good

which causes the array part to cover 1..256, and since 256 is non-nil, it searches for the boundary in the hash and gets 300.


In the end, Lua 5.2 defines a "sequence" as a table whose positive numeric keys are exactly 1..n for some n, with no holes (other kinds of keys, such as strings, don't matter). And it defines # as only being valid for sequences. This way Lua can get away with the weird behavior you noticed for tables that have holes in their integral sequences.

The length of a table t is only defined if the table is a sequence, that is, the set of its positive numeric keys is equal to {1..n} for some integer n.
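As a quick illustration of that definition, here is a small Python check. `is_sequence` is a made-up helper that takes a collection of keys (not a real Lua table) and tests whether the positive numeric ones are exactly 1..n.

```python
def is_sequence(keys):
    """True if the positive numeric keys are exactly {1..n} for some n >= 0."""
    positive = {
        k for k in keys
        if isinstance(k, (int, float)) and not isinstance(k, bool) and k > 0
    }
    return positive == set(range(1, len(positive) + 1))


print(is_sequence([1, 2, 3]))                            # True
print(is_sequence([1, 2, 3, "name"]))                    # True
print(is_sequence([1, 2, 3] + list(range(100, 201))))    # False
```

Both tables from the question fail this check, which is why the value of # on them is not defined.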
