Need to know pros and cons of using RAMDirectory

Submitted by 冷眼眸甩不掉的悲伤 on 2019-12-03 17:39:08

Question


I need to improve the performance of my Lucene search queries. Can I use RAMDirectory? Does it improve performance? Is there an index size limit for it? I would appreciate it if someone could list the pros and cons of using a RAMDirectory.

Thanks.


Answer 1:


I compared FSDirectory and RAMDirectory.

  • Index size: 1.4 GB
  • CentOS, 5 GB memory

Searching for 1000 keywords, the average/min/max response times (ms) were:

  • FSDirectory
    • first run: 351/7/2611
    • second run: 47/7/837
    • third run (after app restart): 53/7/2343
  • RAMDirectory
    • first run: 38/7/1133
    • second run: 34/7/189
    • third run (after app restart): 38/7/959

So you can see that RAMDirectory is faster than FSDirectory, but once the OS file cache has warmed up, the gap is not so pronounced. What are the disadvantages of RAMDirectory? In my test:

  • It uses much more memory: a 1.4 GB index needs about 2 GB once loaded into memory, while FSDirectory uses only 700 MB. That in turn means longer full GC pauses.
  • It takes longer to load, especially when the index is large, because the data has to be copied from file to memory when the index is opened. That means requests are blocked for longer when the app restarts.
  • It is not very practical to maintain two indexes at the same time. Our app switches indexes every few hours, and we want the new index to warm up while the old one is still serving requests in the same Tomcat.
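
The average/min/max figures in this answer can be gathered with a small timing loop. The sketch below is not the code used for the test above; `SearchTimer`, `time`, and the stand-in `fakeSearch` are hypothetical names, and the real search call (for example a query against an FSDirectory- or RAMDirectory-backed searcher) is assumed to be passed in as a `Consumer<String>`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.function.Consumer;

/** Hypothetical helper: times one search call per keyword and summarizes the latencies. */
final class SearchTimer {

    /**
     * Runs {@code search} once for each keyword and records the elapsed
     * wall-clock time of each call in whole milliseconds.
     */
    static LongSummaryStatistics time(List<String> keywords, Consumer<String> search) {
        LongSummaryStatistics stats = new LongSummaryStatistics();
        for (String keyword : keywords) {
            long start = System.nanoTime();
            search.accept(keyword);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            stats.accept(elapsedMs);
        }
        return stats;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("lucene", "index", "directory");

        // Assumption: a stand-in for the real query; replace with a call
        // into your own searcher.
        Consumer<String> fakeSearch = keyword -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        // Repeat the run so that cold and warm numbers can be compared.
        for (int run = 1; run <= 2; run++) {
            LongSummaryStatistics s = time(keywords, fakeSearch);
            System.out.printf("run %d: %.0f/%d/%d (avg/min/max ms)%n",
                    run, s.getAverage(), s.getMin(), s.getMax());
        }
    }
}
```

Running the same keyword list several times, and once more after restarting the app, separates the cold-cache cost from the steady-state cost, which is the comparison that matters when deciding between the two directory types.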



Answer 2:


A RAMDirectory is faster, but doesn't get written to the disk. It only exists as long as your program is running, and has to be created from scratch every time your program runs.

If your index is small enough to fit comfortably into RAM, and you don't update it frequently, you can maintain an index on the disk and then create a RAMDirectory from it using the RAMDirectory(Directory dir) constructor. Querying that should then be faster than querying the one on disk, once you've paid the penalty of loading it up. But do measure the difference - if the index can fit into memory as a RAMDirectory, then it can fit in the disk cache as well, so you might not see much difference.




Answer 3:


You should profile the use of RAMDirectory. At least in Linux, using RAMDirectory is not any faster than using the default FSDirectory, due to the way the OS buffers I/O.



Source: https://stackoverflow.com/questions/1582377/need-to-know-pros-and-cons-of-using-ramdirectory
