How can I address the 10GB limit on Google App Engine?

Posted by 橙三吉。 on 2019-12-11 10:32:35

Question


We are indexing inboxes on top of Gmail using the App Engine Search API, but we are hitting the 10 GB limit. This is because we index the whole organization's email so that we can search across the entire team's inboxes. How can we work around this? One option might be to keep a separate index per person and combine the results manually, but we are worried that merging the results could be really complex. What options are available?


Answer 1:


This is a typical problem in any document retrieval system, and the solution is to slice the entire corpus into multiple buckets. You should choose a slicing strategy based on your requirements/usage pattern.

One possibility is to slice messages by their date. You keep adding messages to an index until you come close to the limit, at which point you start a new index for newer messages. Or you can do it by calendar intervals (per year, per quarter or per month, depending on your volume).
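The calendar-interval variant can be sketched as a function that maps a message date to an index name. This is only an illustration: the `mail` prefix, the name format, and the function name `index_name_for` are assumptions, not anything prescribed by the Search API.

```python
from datetime import date


def index_name_for(received: date, granularity: str = "quarter",
                   prefix: str = "mail") -> str:
    """Return the (hypothetical) name of the index slice for a message date."""
    if granularity == "year":
        return f"{prefix}-{received.year}"
    if granularity == "quarter":
        # Months 1-3 -> Q1, 4-6 -> Q2, 7-9 -> Q3, 10-12 -> Q4.
        quarter = (received.month - 1) // 3 + 1
        return f"{prefix}-{received.year}-Q{quarter}"
    if granularity == "month":
        return f"{prefix}-{received.year}-{received.month:02d}"
    raise ValueError(f"unknown granularity: {granularity!r}")
```

The resulting name would then be used when opening the index (in the Python SDK, via `search.Index(name=...)`), so that each message is written to the slice covering its date.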

Merging results from several indexes is simple. You can also give users a chance to choose how far back in time they want to go in their search. Often people know that they are looking for something recent or something that happened a long time ago.
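As a rough sketch of the merge step, assume each index is queried separately and returns its hits already sorted newest first. The hit dictionaries with `id` and `received` keys and the function name `merge_results` are hypothetical stand-ins for whatever the real queries return. Ordering by date, rather than by relevance score, sidesteps the question of whether scores from different indexes are comparable.

```python
import heapq
from itertools import islice


def merge_results(per_index_hits, limit=10):
    """Merge several newest-first hit lists into one newest-first list.

    per_index_hits: iterable of lists, one per index, each already sorted
    by the "received" key in descending order (an assumption of this sketch).
    """
    merged = heapq.merge(*per_index_hits,
                         key=lambda hit: hit["received"],
                         reverse=True)
    # heapq.merge is lazy, so only the first `limit` hits are materialized.
    return list(islice(merged, limit))


# Example input: hits from two quarterly slices, ISO dates sort as strings.
q4_hits = [{"id": "a", "received": "2014-11-08"},
           {"id": "b", "received": "2014-10-01"}]
q3_hits = [{"id": "c", "received": "2014-09-30"},
           {"id": "d", "received": "2014-07-02"}]
top_hits = merge_results([q3_hits, q4_hits], limit=3)
```

Letting users pick how far back to search then amounts to choosing which slices are queried before this merge runs.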




Answer 2:


File a feature request:

https://code.google.com/p/googleappengine/wiki/FilingIssues?tm=3

A related issue has already been filed, so consider starring it: https://code.google.com/p/googleappengine/issues/detail?id=10667



Source: https://stackoverflow.com/questions/26816366/how-can-i-address-the-10gb-limit-on-google-app-engine
