Google App Engine - SiteMap Creation for a social network

£可爱£侵袭症+ submitted on 2019-12-08 04:17:25
Phil H

Update frequency

Cache invalidation is a hard problem, see: Cache Invalidation - Is there a General Solution?

As far as I can see, the real decision is how often you want search bots to recrawl your site, not how often pages actually change. If a user's page may contain information they want removed at short notice, you want the search bot to recrawl within a couple of days, even though profiles change rarely on average.
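One way to express that decision is through the `<changefreq>` hint in the sitemap itself. The sketch below is a minimal illustration, assuming you start from a "maximum tolerable staleness" per page type; the function name and the thresholds are hypothetical, while the returned labels are the values the sitemaps.org protocol defines. Crawlers treat `<changefreq>` as a hint, not a command.

```python
from datetime import timedelta

# <changefreq> labels from the sitemaps.org protocol, most to least frequent.
# The interval attached to each label is an assumption made for this sketch.
_STEPS = [
    (timedelta(hours=1), "hourly"),
    (timedelta(days=1), "daily"),
    (timedelta(weeks=1), "weekly"),
    (timedelta(days=30), "monthly"),
    (timedelta(days=365), "yearly"),
]


def changefreq_for(max_staleness):
    """Pick the loosest label whose interval still fits within max_staleness.

    max_staleness: timedelta saying how long outdated content may linger
    in a search index (a policy input, not something the protocol defines).
    """
    chosen = "always"  # fallback when even hourly recrawls are too slow
    for interval, label in _STEPS:
        if interval <= max_staleness:
            chosen = label
    return chosen
```

For the case above, where removals should be picked up within a couple of days, `changefreq_for(timedelta(days=2))` returns `"daily"`, regardless of how rarely profiles actually change.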

Keeping an up-to-date map

Since the speed of your website now figures in its Google search ranking, it's worth keeping a static file ready to serve up to the spiders. Perhaps have one script that continually updates a db table with sitemap entries, and another that periodically regenerates the static file(s) from the db table. That way, there is always a static version available for the spiders, and all the updating happens asynchronously.
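A rough sketch of that two-script split follows. It uses `sqlite3` purely as a stand-in for "a db table"; the table name, column names, and function names are all hypothetical. `record_change` plays the role of the script that keeps the table current, and `regenerate_file` plays the periodic job that rebuilds the static file.

```python
import os
import sqlite3
import tempfile
from xml.sax.saxutils import escape


def open_store(path=":memory:"):
    """Open the (hypothetical) sitemap table, creating it if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sitemap_entries "
        "(url TEXT PRIMARY KEY, lastmod TEXT NOT NULL)"
    )
    return conn


def record_change(conn, url, lastmod):
    """Writer side: insert or overwrite one entry whenever a page changes."""
    conn.execute(
        "INSERT OR REPLACE INTO sitemap_entries (url, lastmod) VALUES (?, ?)",
        (url, lastmod),
    )
    conn.commit()


def render_sitemap(conn):
    """Build the sitemap XML string from the current table contents."""
    rows = conn.execute(
        "SELECT url, lastmod FROM sitemap_entries ORDER BY url"
    ).fetchall()
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url, lastmod in rows:
        parts.append(
            "  <url><loc>%s</loc><lastmod>%s</lastmod></url>"
            % (escape(url), escape(lastmod))
        )
    parts.append("</urlset>")
    return "\n".join(parts) + "\n"


def regenerate_file(conn, path):
    """Periodic side: write to a temp file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(render_sitemap(conn))
    os.replace(tmp_path, path)  # readers never see a half-written file
```

Writing to a temporary file and renaming it is one way to make sure a crawler that arrives mid-regeneration still gets the previous complete version.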

Static pages on App Engine

I forgot that you can't generate static page files at runtime on App Engine. According to this SO question, the best way is to generate your file and push it to memcache. Also see the documentation on using memcache with App Engine.
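A minimal sketch of the memcache approach, written against a generic cache object so it can run anywhere. The key name, the expiry, and the `DictCache` stand-in are assumptions for illustration. On App Engine, the `google.appengine.api.memcache` module offers `get(key)` and `set(key, value, time=...)` and could be passed in as `cache`.

```python
SITEMAP_KEY = "sitemap.xml"   # hypothetical cache key
SITEMAP_TTL = 6 * 60 * 60     # seconds; hypothetical refresh interval


def get_sitemap(cache, build):
    """Return the cached sitemap XML, rebuilding it on a cache miss.

    cache: object with get(key) and set(key, value, time=seconds).
    build: zero-argument callable returning the sitemap XML string.
    """
    xml = cache.get(SITEMAP_KEY)
    if xml is None:
        xml = build()
        cache.set(SITEMAP_KEY, xml, time=SITEMAP_TTL)
    return xml


class DictCache(object):
    """In-process stand-in for memcache, for local runs only (ignores TTL)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, time=0):
        self._data[key] = value
        return True
```

Memcache entries can be evicted at any time, so `build` has to be able to regenerate the sitemap from the datastore on demand. Individual memcache values are also size-limited (about 1 MB on App Engine), which is another reason to split a large sitemap into several files.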

What you describe is very similar to how Django implements its sitemap framework: http://docs.djangoproject.com/en/dev/ref/contrib/sitemaps/ See specifically the section on creating index files: http://docs.djangoproject.com/en/dev/ref/contrib/sitemaps/#creating-a-sitemap-index

If you want to see it on App Engine with a patched version of the helper, you can look here: http://code.google.com/p/dherbst-app-engine-django/wiki/Sitemaps

These are the changes applied to the helper: http://code.google.com/p/dherbst-app-engine-django/source/detail?r=509403105ec97fb1f3dfeadfada808f2cf1ff9a7
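For readers not using Django, the sitemap index idea can be sketched in plain Python. This is not Django's API, just an illustration of the structure: URLs are split into pages, and an index file points at each page. The 50,000-URL cap comes from the sitemaps.org protocol; the `sitemap-<n>.xml` naming scheme and the function names are hypothetical.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50000  # per-file limit set by the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive pages of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def render_index(base_url, page_count):
    """Build a sitemap index that points at page_count child sitemaps.

    Child sitemaps are assumed to live at <base_url>/sitemap-<n>.xml,
    numbered from 1; that naming scheme is made up for this sketch.
    """
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for n in range(1, page_count + 1):
        loc = "%s/sitemap-%d.xml" % (base_url.rstrip("/"), n)
        parts.append("  <sitemap><loc>%s</loc></sitemap>" % escape(loc))
    parts.append("</sitemapindex>")
    return "\n".join(parts) + "\n"
```

For a social network, a natural usage is `pages = chunk(all_profile_urls)` followed by `render_index(site_root, len(pages))`, with each page rendered and cached separately.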
