Question
We have a Solr core that has about 250 TrieIntFields (declared as dynamicField). There are about 14M docs in our Solr index, and many documents have a value in many of these fields. We need to sort on all 250 of these fields over a period of time.
The issue we are facing is that the underlying Lucene fieldCache gets filled up very quickly. We have a 4 GB box and the index size is 18 GB. After a sort on 40 or 45 of these dynamic fields, the memory consumption is about 90% and we start getting OutOfMemory errors.
For now, we have a cron job that runs every minute and restarts Tomcat if the total memory consumed is more than 80%.
From what I have read, I understand that restricting the number of distinct values on sortable Solr fields will bring down the fieldCache space. The values in these sortable fields can be any integer from 0 to 33000 and are quite widely distributed. We have a few scaling solutions in mind, but what is the best way to handle this whole issue?
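The numbers above can be checked with a back-of-the-envelope estimate. This is only a sketch: it assumes the fieldCache holds one 4-byte int per document for every field that has been sorted on, which is an assumption about how numeric sort fields are cached, not a measured figure. The function name and constants are illustrative.

```python
# Rough fieldCache estimate for integer sort fields.
# ASSUMPTION: one 4-byte int is cached per document per sorted field;
# real overhead depends on the Lucene/Solr version and JVM.

BYTES_PER_INT = 4
NUM_DOCS = 14_000_000  # "about 14M docs" from the question


def field_cache_bytes(num_sorted_fields, num_docs=NUM_DOCS,
                      bytes_per_value=BYTES_PER_INT):
    """Estimated bytes held once `num_sorted_fields` fields have been sorted on."""
    return num_sorted_fields * num_docs * bytes_per_value


if __name__ == "__main__":
    for fields in (1, 45, 250):
        size = field_cache_bytes(fields)
        print(f"{fields:>3} sorted field(s): {size / 1e9:.2f} GB")
```

Under this assumption each sorted field costs about 56 MB, 45 fields cost roughly 2.5 GB, and all 250 fields would need about 14 GB, which is consistent with a 4 GB box running out of memory after 40 to 45 sorts. It also suggests that the footprint is driven by the document count rather than by the number of distinct values.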
UPDATE: We thought that if we used boosting instead of sorting, the query would not go to the fieldCache. So instead of issuing a query like
select?q=name:alba&sort=relevance_11 desc
we tried
select?q={!boost relevance_11}name:alba
but unfortunately boosting also populates the field cache :(
Answer 1:
I think you have two options:
1) Add more memory.
2) Force Solr not to use the field cache by specifying facet.method=enum, as per documentation.
There's also a solr-user mailing list thread discussing the same problem.
Unless your index is huge, I'd go with option 1). RAM is cheap these days.
Answer 2:
We have a way to rework the schema by keeping a single sort field. The dynamic fields we have are named like relevance_CLASSID. The current schema has a unique key NODEID and a multi-valued field CLASSID; the relevance scores are for these class IDs. If we instead keep one document per CLASSID per NODEID, i.e. the new schema has NODEID:CLASSID as the unique key and stores some redundant information across documents with the same NODEID, then we can sort on a single field relevance and do a filter query on CLASSID.
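The reshaping described above could be sketched as follows. This is a minimal illustration, not the actual indexing code: the function name `flatten`, the unique-key field name `id`, and the sample document are hypothetical, while NODEID, CLASSID, relevance_CLASSID and relevance come from the description.

```python
# Split one document per NODEID into one document per (NODEID, CLASSID),
# replacing the relevance_<CLASSID> dynamic fields with a single "relevance" field.
# ASSUMPTION: the unique key field is called "id"; the real schema may differ.

def flatten(doc, prefix="relevance_"):
    """Return one flat document per class ID that has a relevance score."""
    # Fields shared by all class IDs are copied (redundantly) into every output doc.
    shared = {k: v for k, v in doc.items()
              if not k.startswith(prefix) and k != "CLASSID"}
    flattened = []
    for class_id in doc["CLASSID"]:
        score = doc.get(f"{prefix}{class_id}")
        if score is None:
            continue  # no relevance score stored for this class ID
        flat = dict(shared)
        flat["id"] = f"{doc['NODEID']}:{class_id}"  # NODEID:CLASSID unique key
        flat["CLASSID"] = class_id
        flat["relevance"] = score
        flattened.append(flat)
    return flattened


if __name__ == "__main__":
    original = {
        "NODEID": "n1",
        "name": "alba",
        "CLASSID": [11, 12],
        "relevance_11": 300,
        "relevance_12": 7,
    }
    for d in flatten(original):
        print(d)
```

With documents in this shape, the query from the question would presumably become something like select?q=name:alba&fq=CLASSID:11&sort=relevance desc, so only one sort field ever enters the fieldCache. The trade-off is a larger document count, since every NODEID is repeated once per CLASSID.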
Source: https://stackoverflow.com/questions/13393248/solr-lucene-fieldcache-outofmemory-error-sorting-on-dynamic-field