How would you design an AppEngine datastore for a social site like Twitter?

馋奶兔 submitted on 2019-11-28 15:07:03

Take a look at Building Scalable, Complex Apps on App Engine (pdf), a fascinating talk given at Google I/O by Brett Slatkin. He addresses the problem of building a scalable messaging service like Twitter.

Here's his solution using a list property:

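from google.appengine.ext import db

class Message(db.Model):
    sender = db.StringProperty()
    body = db.TextProperty()

class MessageIndex(db.Model):
    # parent = a Message
    receivers = db.StringListProperty()

# Keys-only query: returns MessageIndex keys, not the entities themselves
indexes = MessageIndex.all(keys_only=True).filter('receivers =', user_id)
# The parent of each index key is the key of the Message it belongs to
keys = [k.parent() for k in indexes]
messages = db.get(keys)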

This keys-only query finds the MessageIndex entities whose receivers list contains the user you specified, without deserializing the receivers list itself. You then take the parent key of each index entity and fetch only the messages you want.
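To make the two-step lookup concrete, here is a minimal plain-Python sketch of the same idea. It is not the datastore API: the dictionaries, the `parent_key` field and the `inbox` helper are hypothetical stand-ins, and the membership test done in Python here would be answered by the datastore's index in the real service.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    sender: str
    body: str

@dataclass
class MessageIndex:
    parent_key: str  # hypothetical: key of the Message this index belongs to
    receivers: list = field(default_factory=list)

# Hypothetical in-memory "datastore": key -> entity
messages = {
    "m1": Message("alice", "hello"),
    "m2": Message("bob", "lunch?"),
}
indexes = {
    "i1": MessageIndex("m1", ["bob", "carol"]),
    "i2": MessageIndex("m2", ["carol"]),
}

def inbox(user_id):
    # Step 1: stand-in for the keys-only query. Only parent keys are
    # returned; the receivers lists are never handed back to the caller.
    parent_keys = [idx.parent_key for idx in indexes.values()
                   if user_id in idx.receivers]
    # Step 2: stand-in for the batch get of the small Message entities.
    return [messages[k] for k in parent_keys]

print([m.body for m in inbox("carol")])
```

The point of the split is that the large receivers list lives on an entity the application never has to load; the application only ever materializes Message entities, which stay small.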

Here's the wrong way to do it:

class Message(db.Model):
    sender = db.StringProperty()
    receivers = db.StringListProperty()
    body = db.TextProperty()

messages = Message.all().filter('receivers =', user_id)

This is inefficient because the query has to deserialize every entity it returns, including the whole receivers list. If it returned 100 messages with 1,000 users in each receivers list, you would have to deserialize 100,000 (100 x 1,000) list property values. That is far too expensive in datastore latency and CPU.
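The arithmetic behind that comparison can be written out as a small sketch. The function name and the simple cost model (one unit per list value decoded, zero for a keys-only query) are assumptions for illustration, not measurements of the datastore.

```python
def list_values_deserialized(num_messages, receivers_per_message, keys_only):
    # Assumed cost model: a keys-only query returns keys only, so no
    # receivers values are decoded; otherwise every value of every
    # returned entity's list is decoded.
    if keys_only:
        return 0
    return num_messages * receivers_per_message

naive = list_values_deserialized(100, 1000, keys_only=False)
indexed = list_values_deserialized(100, 1000, keys_only=True)
print(naive, indexed)  # prints: 100000 0
```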

I was pretty confused by all of this at first, so I wrote up a short tutorial about using the list property. Enjoy :)

I don't know whether it is the best design for a social application, but Jaiku was ported to App Engine by its original creator when the company was acquired by Google, so it should be reasonable.

See the section Actors and Tigers and Bears, Oh My! in design_funument.txt. The entities are defined in common/models.py and the queries are in common/api.py.

Robert, about your proposed solution:

messages = Message.query(Message.receivers == user_id).fetch(projection=[Message.body])

I think the ndb.TextProperty "body" can't be used in a projection because it is not indexed; projection queries only support indexed properties. The valid solution would still be to maintain the two models, Message and MessageIndex.

I think this can now be solved with the new Projection Queries in NDB.

class Message(ndb.Model):
    sender = ndb.StringProperty()
    receivers = ndb.StringProperty(repeated=True)
    body = ndb.TextProperty()

messages = Message.query(Message.receivers == user_id).fetch(projection=[Message.body])

Now you don't have to deal with the expensive cost of deserializing the list property.
