We have a Solr instance with 86,315,770 documents. It's using up to 4 GB of memory, and we need it for faceting on a tokenized field called content. The index size on disk is 23 GB.
You could use the topTerms feature of the LukeRequestHandler, which reports the most frequent terms of a field directly from the index.
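As a rough sketch of what such a request might look like, the Python below builds a Luke URL for the content field and pulls the topTerms entry out of a JSON response. The host, the core name `mycore`, and both helper function names are hypothetical. The response shape (a flat `[term, count, term, count, ...]` list, or a map depending on the `json.nl` setting) is an assumption that should be checked against your Solr version.

```python
import json
from urllib.parse import urlencode


def luke_top_terms_url(base_url, core, field, num_terms=20):
    """Build a LukeRequestHandler URL asking for the top terms of one field.

    base_url and core are placeholders for your own deployment.
    numTerms controls how many top terms Luke returns per field.
    """
    params = {"fl": field, "numTerms": num_terms, "wt": "json"}
    return f"{base_url.rstrip('/')}/{core}/admin/luke?{urlencode(params)}"


def parse_top_terms(response_text, field):
    """Extract (term, count) pairs from a Luke JSON response.

    Assumption: topTerms arrives either as a flat list
    [term, count, term, count, ...] or as a {term: count} map.
    """
    data = json.loads(response_text)
    top = data["fields"][field].get("topTerms", [])
    if isinstance(top, dict):
        return list(top.items())
    # Pair up alternating term/count entries of the flat list.
    return list(zip(top[0::2], top[1::2]))


if __name__ == "__main__":
    print(luke_top_terms_url("http://localhost:8983/solr", "mycore", "content", 50))
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the body to `parse_top_terms` would give the term list. Note that the counts Luke reports are index-wide document frequencies, not counts restricted to a query, so this is a substitute for faceting only when you need the top terms across the whole index.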