Question
I am new to ndb and GAE and am having trouble coming up with a good way to set up indexes. Let's say we have a user model like this:
class User(ndb.Model):
    name = ndb.StringProperty()
    email = ndb.StringProperty(required=True)
    fb_id = ndb.StringProperty()
Upon login, if I were to look the user up by email address with a query, I believe this would be quite slow and inefficient. Possibly it would have to do a full table scan.
q = User.query(User.email == EMAIL)
user = q.fetch(1)
I believe it would be much faster, if User models were saved with the email as their key.
user = User(id=EMAIL)
user.put()
That way I could retrieve them like this, a lot faster (or so I believe):
key = ndb.Key('User', EMAIL)
user = key.get()
Please correct me if I am wrong so far. After implementing this, though, I realized that Facebook users may change their email address. On a new OAuth 2.0 connection their new email would not be recognized by the system, and they would be created as a new user. So maybe I should use a different approach:
- using the social-media provider-id (unique among that provider's users)
and
- the provider-name (for the rare case that a Twitter user and a Facebook user share the same provider-id)
However, to achieve this I would need to set two indexes, which I believe is not possible.
So what can I do? Should I concatenate both fields into a single key and index on that?
E.g. the new idea would be:
class User(ndb.Model):
    name = ndb.StringProperty()
    email = ndb.StringProperty(required=True)
    provider_id = ndb.StringProperty()
    provider_type = ndb.StringProperty()
saving:
provider_id = '1234'
provider_type = 'fb'
user = User(id=provider_id + provider_type)
user.put()
retrieval:
provider_id = '1234'
provider_type = 'fb'
key = ndb.Key('User', provider_id + provider_type)
user = key.get()
This way we don't care any more if the user changes the email address on his social media. Is this idea sound?
Thanks,
UPDATE
So far, Tim's solution sounds the cleanest and likely also the fastest to me. But I came across a problem.
class AuthProvider(polymodel.PolyModel):
    user_key = ndb.KeyProperty(kind=User)
    active = ndb.BooleanProperty(default=True)
    date_created = ndb.DateTimeProperty(auto_now_add=True)
    @property
    def user(self):
        return self.user_key.get()
class FacebookLogin(AuthProvider):
    pass
View.py, within the facebook_callback method:
provider = ndb.Key('FacebookLogin', fb_id).get()
# The problem is right here: provider is always None. It only works if I use the PolyModel like this:
# ndb.Key('AuthProvider', fb_id).get()
# But this defeats the whole purpose of having different subclasses as different providers.
# Maybe I am handling the key wrong?
if provider:
    user = provider.user
else:
    provider = FacebookLogin(id=fb_id)
    if not user:
        user = User()
    user_key = user.put()
    provider.user_key = user_key
    provider.put()
return user
Answer 1:
One slight variation on your approach, which allows a more flexible model, is to create a separate entity that uses provider_id and provider_type (or any other auth scheme you come up with) as its key.
This entity then holds a reference (key) to the actual user details.
You can then:
- do a direct get() for the auth details, then get() the actual user details;
- change the auth details without rewriting or re-keying the user details;
- support multiple auth schemes for a single user.
I use this approach for an application that has more than 2000 users; most use a custom auth scheme (app-specific userid/passwd) or a Google account.
E.g.
from google.appengine.ext import ndb
from google.appengine.ext.ndb import polymodel
class AuthLogin(polymodel.PolyModel):
    user_key = ndb.KeyProperty(kind=User)
    status = ndb.StringProperty()  # maybe you need to disable a particular login without deleting it
    date_created = ndb.DateTimeProperty(auto_now_add=True)
    @property
    def user(self):
        return self.user_key.get()
class FacebookLogin(AuthLogin):
    pass  # some additional Facebook-specific properties
class TwitterLogin(AuthLogin):
    pass  # some additional Twitter-specific properties
etc...
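The two-step lookup described above (get the login, then get the user it points to) can be illustrated without the datastore. The following is only a toy sketch: plain dictionaries stand in for the two entity kinds, and the names logins, users, find_or_create_user and add_login are hypothetical, not part of ndb.

```python
# Plain dicts stand in for the datastore here; this is NOT ndb code.
logins = {}  # login id (e.g. 'fb|1234') -> user id
users = {}   # user id -> user details


def find_or_create_user(provider_type, provider_id, name=None):
    """Return the user id for a login, creating user and login if needed."""
    login_id = '%s|%s' % (provider_type, provider_id)
    user_id = logins.get(login_id)      # first lookup: the auth details
    if user_id is None:
        user_id = len(users) + 1        # toy id allocation
        users[user_id] = {'name': name}
        logins[login_id] = user_id
    return user_id


def add_login(user_id, provider_type, provider_id):
    """Attach another auth scheme to an existing user."""
    logins['%s|%s' % (provider_type, provider_id)] = user_id


uid = find_or_create_user('fb', '1234', name='Alice')
add_login(uid, 'tw', '1234')
# Both logins resolve to the same user, and the user record was written once.
print(find_or_create_user('tw', '1234') == uid)
print(users[uid])                       # second lookup: the user details
```

Because the user record is keyed independently of any login, a login can be added, replaced or disabled without touching the user.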
By using PolyModel as the base class you can run AuthLogin.query().filter(AuthLogin.user_key == user.key) and get all auth types defined for that user, since they all share the same base class AuthLogin. Without it you would have to query each supported auth type in turn, because you cannot do a kindless query without an ancestor, and in this case we can't use the User as the ancestor because then we couldn't do a simple get() from the login id.
However, some things to note: all subclasses of AuthLogin share the same kind in the key, "AuthLogin", so you still need to concatenate the auth_provider and auth_type for the key's id to ensure you have unique keys. E.g.
dev~fish-and-lily> from google.appengine.ext.ndb.polymodel import PolyModel
dev~fish-and-lily> class X(PolyModel):
...     pass
...
dev~fish-and-lily> class Y(X):
...     pass
...
dev~fish-and-lily> class Z(X):
...     pass
...
dev~fish-and-lily> y = Y(id="abc")
dev~fish-and-lily> y.put()
Key('X', 'abc')
dev~fish-and-lily> z = Z(id="abc")
dev~fish-and-lily> z.put()
Key('X', 'abc')
dev~fish-and-lily> y.key.get()
Z(key=Key('X', 'abc'), class_=[u'X', u'Z'])
dev~fish-and-lily> z.key.get()
Z(key=Key('X', 'abc'), class_=[u'X', u'Z'])
This is the problem you ran into. By adding the provider type as part of the key you now get distinct keys.
dev~fish-and-lily> z = Z(id="Zabc")
dev~fish-and-lily> z.put()
Key('X', 'Zabc')
dev~fish-and-lily> y = Y(id="Yabc")
dev~fish-and-lily> y.put()
Key('X', 'Yabc')
dev~fish-and-lily> y.key.get()
Y(key=Key('X', 'Yabc'), class_=[u'X', u'Y'])
dev~fish-and-lily> z.key.get()
Z(key=Key('X', 'Zabc'), class_=[u'X', u'Z'])
dev~fish-and-lily>
I don't believe this is any less convenient a model for you.
Does all that make sense ;-)
Answer 2:
While @Greg's answer seems OK, I think it's actually a bad idea to use an external type/id as the key for your entity, because this solution doesn't scale very well.
- What if you would like to implement your own username/password at some point?
- What if the user deletes their Facebook account?
- What if the same user wants to sign in with a Twitter account as well?
- What if the user has more than one Facebook account?
So the idea of using the type/id as the key looks weak. A better solution would be to have one field per provider type that stores only the id, for example facebook_id, twitter_id, google_id, etc., and then query on these fields to retrieve the actual user. This happens only during the sign-in and sign-up process, so it's not that frequent. Of course, you will have to add some logic to attach another provider to an already existing user, or to merge users if the same user signed in with a different provider.
Still, this last solution won't work if you want to support multiple sign-ins from the same provider. To achieve that, you will have to create another model that stores only the external providers/ids and associates them with your user model.
As an example of the second solution you could check my gae-init project, where I store the 3 different providers in the User model and work on them in the auth.py module. Again, this solution doesn't scale very well with more providers and doesn't support multiple IDs from the same provider.
Answer 3:
Concatenating the user-type with their ID is sensible.
You can save on your read and write costs by not duplicating the type and ID as properties, though: when you need to use them, just split the ID back up. (Doing this will be simpler if you include a separator between the parts, '%s|%s' % (provider_type, provider_id) for example.)
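A minimal sketch of that round trip, assuming a '|' separator and that the provider type never contains it. The helper names make_login_id and split_login_id are hypothetical and not part of ndb.

```python
SEPARATOR = '|'  # assumed separator; must never appear in a provider type


def make_login_id(provider_type, provider_id):
    """Build the string used as the entity key id, e.g. 'fb|1234'."""
    provider_type = str(provider_type)
    if SEPARATOR in provider_type:
        raise ValueError('provider_type must not contain %r' % SEPARATOR)
    return '%s%s%s' % (provider_type, SEPARATOR, provider_id)


def split_login_id(login_id):
    """Recover (provider_type, provider_id) from a key id."""
    # Split on the first separator only, so a provider id that happens
    # to contain the separator still comes back in one piece.
    provider_type, provider_id = login_id.split(SEPARATOR, 1)
    return provider_type, provider_id


login_id = make_login_id('fb', 1234)
print(login_id)
print(split_login_id(login_id))
```

The resulting string can be used as the id in ndb.Key('User', login_id), as in the question. Note that the provider id comes back as a string, even if it was passed in as a number.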
Answer 4:
If you want to use a single model, you can do something like:
class User(ndb.Model):
    name = ndb.StringProperty()
    email = ndb.StringProperty(required=True)
    providers = ndb.KeyProperty(repeated=True)
auser = User(id="auser", name="A user", email="auser@example.com")
auser.providers = [
    ndb.Key("ProviderName", "fb", "ProviderId", 123),
    ndb.Key("ProviderName", "tw", "ProviderId", 123),
]
auser.put()
To query for a specific FB login, you simply do:
fbkey = ndb.Key("ProviderName", "fb", "ProviderId", 123)
for entry in User.query(User.providers == fbkey):
    pass  # Do something with the entry
As ndb does not provide an easy way to create a unique constraint, you could use the _pre_put_hook to ensure that providers is unique.
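One possible reading of "unique" is that no provider key appears twice in a single user's list. Under that assumption, the check itself can be sketched as plain Python. The function names find_duplicates and check_unique are hypothetical, and tuples stand in for ndb.Key objects so that the sketch runs without the SDK.

```python
def find_duplicates(providers):
    """Return the items that appear more than once, in first-seen order."""
    seen = set()
    duplicates = []
    for provider in providers:
        if provider in seen and provider not in duplicates:
            duplicates.append(provider)
        seen.add(provider)
    return duplicates


def check_unique(providers):
    """Raise ValueError if any provider is listed twice."""
    duplicates = find_duplicates(providers)
    if duplicates:
        raise ValueError('duplicate providers: %r' % (duplicates,))


# Tuples stand in for ndb.Key objects in this sketch.
fb = ('ProviderName', 'fb', 'ProviderId', 123)
tw = ('ProviderName', 'tw', 'ProviderId', 123)
check_unique([fb, tw])                 # no duplicates, so nothing is raised
print(find_duplicates([fb, tw, fb]))   # lists fb once
```

In the model, a _pre_put_hook(self) method could call check_unique(self.providers) so that put() fails before anything is written. Making sure a provider key is not already claimed by a different user is a separate problem; it would additionally need a query like the one above.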
Source: https://stackoverflow.com/questions/17433607/is-it-possible-to-set-two-fields-as-indexes-on-an-entity-in-ndb