How to get a distinct count on DynamoDB over a billion objects?

泄露秘密, posted on 2019-12-08 16:49:21

Question


What is the most efficient way to count how many distinct objects are stored in my DynamoDB table?

For example, my objects have ten properties, and I want a distinct count based on three of them.


Answer 1:


If you need counters, it is better to use atomic counters (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithDDItems.html). DynamoDB does not support keys composed of three attributes out of the box unless you concatenate them, so one option is to create a redundant table whose key is the concatenation of those three attributes. Each time you modify those objects, also update the atomic counter (on add and delete; an update does not actually need it).

Then you just read the counter and avoid scans. In other words, you trade extra storage for faster retrieval.
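One possible reading of this approach, sketched in Python against the low-level boto3 DynamoDB client interface (`put_item`, `update_item`, `get_item`). The table names, attribute names, and the three property names are all hypothetical, and the conditional put used to detect a first-seen combination is an assumption of this sketch rather than something the answer spells out. The `client` is passed in, so nothing here creates a connection by itself.

```python
import json

# Hypothetical names, not taken from the original answer.
DISTINCT_TABLE = "ObjectCombos"   # redundant table, partition key "combo_key"
COUNTER_TABLE = "Counters"        # holds the counter item, partition key "counter_id"
COUNTER_ID = "distinct_objects"
KEY_ATTRS = ("prop_a", "prop_b", "prop_c")  # the three properties that define "distinct"


def combo_key(obj):
    """Concatenate the three properties into one unambiguous string key."""
    # A JSON list avoids collisions such as ("ab", "c") vs ("a", "bc").
    return json.dumps([obj[name] for name in KEY_ATTRS])


def record_object(client, obj):
    """Register obj and bump the counter only for a combination not seen before.

    Returns True if the combination was new, False otherwise.
    """
    try:
        # The write fails if an item with this key already exists.
        client.put_item(
            TableName=DISTINCT_TABLE,
            Item={"combo_key": {"S": combo_key(obj)}},
            ConditionExpression="attribute_not_exists(combo_key)",
        )
    except client.exceptions.ConditionalCheckFailedException:
        return False
    # Atomic counter: ADD increments in place without a read-modify-write cycle.
    client.update_item(
        TableName=COUNTER_TABLE,
        Key={"counter_id": {"S": COUNTER_ID}},
        UpdateExpression="ADD distinct_count :one",
        ExpressionAttributeValues={":one": {"N": "1"}},
    )
    return True


def distinct_count(client):
    """Read the counter item; no Scan is involved."""
    resp = client.get_item(
        TableName=COUNTER_TABLE,
        Key={"counter_id": {"S": COUNTER_ID}},
    )
    item = resp.get("Item")
    return int(item["distinct_count"]["N"]) if item else 0
```

With a real client this would be called as `record_object(boto3.client("dynamodb"), obj)` on every insert. The put and the counter update are two separate requests here, so a failure between them would leave the counter one short. Handling deletes would also need a per-combination reference count, which this sketch leaves out.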




Answer 2:


Perform a Scan with the appropriate ScanFilter (in this case, that the three properties are not_null), and use withCount(true) to return only the number of matching records instead of the records themselves.

See the documentation for some example code.
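As a rough illustration of the same idea in Python, the sketch below uses the boto3 client's `scan` call with `Select="COUNT"` and a `FilterExpression` built from `attribute_exists`. These are the current-API counterparts of `withCount(true)` and a `NOT_NULL` ScanFilter, swapped in here as an assumption. The function name, table name, and property names are hypothetical. A single Scan request reads at most 1 MB of data, so the per-page counts are summed by following `LastEvaluatedKey`.

```python
def count_items_with_properties(client, table_name, attrs):
    """Count items in which every attribute in attrs is present, via a paginated Scan."""
    # Aliases (#p0, #p1, ...) avoid clashes with DynamoDB reserved words.
    names = {f"#p{i}": name for i, name in enumerate(attrs)}
    filter_expr = " AND ".join(f"attribute_exists({alias})" for alias in names)
    kwargs = {
        "TableName": table_name,
        "Select": "COUNT",  # return only counts, not the items themselves
        "FilterExpression": filter_expr,
        "ExpressionAttributeNames": names,
    }
    total = 0
    while True:
        page = client.scan(**kwargs)
        total += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return total
        # Continue from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

A call might look like `count_items_with_properties(boto3.client("dynamodb"), "Objects", ["prop_a", "prop_b", "prop_c"])`. The Scan still reads every item in the table, and the result is the number of items that have all three properties, which matches the distinct count only if each combination is stored once.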



Source: https://stackoverflow.com/questions/15892611/how-to-get-distinct-count-on-dynamodb-on-billion-objects
