Service Fabric Reliable Dictionary performance with 1 million keys

Submitted by 。_饼干妹妹 on 2019-12-03 08:22:38

Ok, for what it's worth:

  • Not everything is stored in memory. To support large Reliable Collections, some values are cached in memory while others reside on disk, so retrieving the data you request can incur extra I/O. I've heard a rumor that at some point we may be able to adjust the caching policy, but I don't think that has been implemented yet.

  • You iterate through the data, reading records one by one. IMHO, issuing half a million separate sequential queries against any data source is unlikely to be fast. I'm not saying that every single MoveNextAsync() call results in a separate I/O operation, but overall it doesn't behave like a single fetch.

  • It depends on the resources you have. For instance, reproducing your case on my local machine with a single partition and three replicas, I get the records in about 5 seconds on average.
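
For reference, the record-by-record read described above typically looks something like the sketch below. It is only an illustration of the pattern, not the asker's actual code: the class name, method name, and `dictionaryName` parameter are made up for the example, and it assumes an `IReliableDictionary<string, string>`.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Data.Collections;

public static class DictionaryScan
{
    // Reads every value of a reliable dictionary.
    // Each record costs one MoveNextAsync call, so N records mean N awaits.
    public static async Task<List<string>> ReadAllAsync(
        IReliableStateManager stateManager,
        string dictionaryName, // hypothetical: whatever name the service uses
        CancellationToken cancellationToken)
    {
        var dictionary = await stateManager
            .GetOrAddAsync<IReliableDictionary<string, string>>(dictionaryName);

        var values = new List<string>();

        // A read-only transaction: it is disposed without being committed.
        using (ITransaction tx = stateManager.CreateTransaction())
        {
            var enumerable = await dictionary.CreateEnumerableAsync(tx);
            using (var enumerator = enumerable.GetAsyncEnumerator())
            {
                while (await enumerator.MoveNextAsync(cancellationToken))
                {
                    values.Add(enumerator.Current.Value);
                }
            }
        }

        return values;
    }
}
```

Wrapping a call to this method in a `Stopwatch` is an easy way to compare timings on your own hardware.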

As for workarounds, here is what comes to mind:

  • Chunking: I tried the same thing with the records split into string arrays capped at 10 elements (IReliableDictionary<string, string[]>). It was essentially the same amount of data, but the read time dropped from 5 seconds to 7 ms. I guess that if you keep each item below 80 KB, which reduces the number of round trips and keeps the Large Object Heap (LOH) small, you should see better performance.

  • Filtering: CreateEnumerableAsync has an overload that takes a filter delegate, so values are not retrieved from disk for keys that do not match the filter.

  • State serializer: If you go beyond simple strings, you could write your own serializer and try to reduce the I/O incurred for your type.
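
The chunking and filtering ideas can be combined roughly as follows. This is a minimal sketch under stated assumptions rather than the exact code I measured: `ChunkedStore`, its method names, and the `prefix + index` key scheme are hypothetical, and writing everything in a single transaction is done only for brevity (a large data set would probably need several smaller transactions).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Data.Collections;

public static class ChunkedStore
{
    // Splits the records into arrays of at most chunkSize elements.
    public static List<string[]> ToChunks(IReadOnlyList<string> records, int chunkSize = 10)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunks = new List<string[]>();
        for (int start = 0; start < records.Count; start += chunkSize)
        {
            int length = Math.Min(chunkSize, records.Count - start);
            var chunk = new string[length];
            for (int i = 0; i < length; i++) chunk[i] = records[start + i];
            chunks.Add(chunk);
        }
        return chunks;
    }

    // Stores each chunk under a key of the form prefix + chunk index (hypothetical scheme).
    public static async Task SaveAsync(
        IReliableStateManager stateManager,
        IReliableDictionary<string, string[]> dictionary,
        string prefix,
        IReadOnlyList<string> records)
    {
        List<string[]> chunks = ToChunks(records);
        using (ITransaction tx = stateManager.CreateTransaction())
        {
            for (int i = 0; i < chunks.Count; i++)
            {
                await dictionary.SetAsync(tx, prefix + i, chunks[i]);
            }
            await tx.CommitAsync();
        }
    }

    // Reads back only the chunks whose key starts with prefix, using the
    // CreateEnumerableAsync overload that accepts a key filter.
    public static async Task<List<string>> ReadAsync(
        IReliableStateManager stateManager,
        IReliableDictionary<string, string[]> dictionary,
        string prefix,
        CancellationToken cancellationToken)
    {
        var records = new List<string>();
        using (ITransaction tx = stateManager.CreateTransaction())
        {
            var enumerable = await dictionary.CreateEnumerableAsync(
                tx,
                key => key.StartsWith(prefix, StringComparison.Ordinal),
                EnumerationMode.Unordered);

            using (var enumerator = enumerable.GetAsyncEnumerator())
            {
                while (await enumerator.MoveNextAsync(cancellationToken))
                {
                    records.AddRange(enumerator.Current.Value);
                }
            }
        }
        return records;
    }
}
```

With 10 records per chunk, the enumerator advances once per chunk instead of once per record, which is where the reduction in round trips comes from. The chunk size is a tuning knob, so measure a few values against your own data.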

Hopefully it makes sense.
