Is it possible to set two fields as indexes on an entity in ndb?

Posted by 旧时模样 on 2019-12-13 16:14:44

Question


I am new to ndb and GAE and am struggling to come up with a good way to set up indexes. Let's say we have a user model like this:

class User(ndb.Model):
    name = ndb.StringProperty()    
    email = ndb.StringProperty(required = True)    
    fb_id = ndb.StringProperty()

If I checked the email address with a query at login, I believe it would be quite slow and inefficient, possibly requiring a full table scan.

q = User.query(User.email == EMAIL)
user = q.fetch(1)

I believe it would be much faster if User entities were saved with the email as their key.

user = User(id=EMAIL)
user.put()

That way I could retrieve them like this, which I believe is a lot faster:

key = ndb.Key('User', EMAIL) 
user = key.get()

Please correct me if I am wrong so far. After implementing this, though, I realized that Facebook users may change their email address. On a new OAuth 2.0 connection their new email would not be recognized by the system, and they would be created as a new user. So maybe I should use a different approach:

  • Using the social-media-provider-id (unique for all provider users)

and

  • provider-name (for the rare case that a Twitter user and a Facebook user share the same provider-id)

However, to achieve this I would need to set two indexes, which I believe is not possible.

So what could I do? Should I concatenate both fields into a single key and index on that?

e.g. the new idea would be:

class User(ndb.Model):
    name = ndb.StringProperty()    
    email = ndb.StringProperty(required = True)    
    provider_id = ndb.StringProperty()
    provider_type = ndb.StringProperty()

saving:

provider_id = '1234'
provider_type = 'fb'
user = User(id=provider_id + provider_type)
user.put()

retrieval:

provider_id = '1234'
provider_type = 'fb'
key = ndb.Key('User', provider_id + provider_type)
user = key.get()

This way it no longer matters if the user changes the email address on their social media account. Is this idea sound?

Thanks,

UPDATE

So far, Tim's solution sounds the cleanest and likely also the fastest to me. But I came across a problem.

class AuthProvider(polymodel.PolyModel):
    user_key = ndb.KeyProperty(kind=User)
    active = ndb.BooleanProperty(default=True)  
    date_created = ndb.DateTimeProperty(auto_now_add=True)

    @property
    def user(self):
        return self.user_key.get()

class FacebookLogin(AuthProvider):
    pass

View.py: Within facebook_callback method

provider = ndb.Key('FacebookLogin', fb_id).get()

# The problem is right here: provider is always None. The lookup only works
# if I use the PolyModel base class in the key, like this:
#     ndb.Key('AuthProvider', fb_id).get()
# But this defeats the whole purpose of having different subclasses for
# different providers. Maybe I am handling the keys wrong?

user = None
if provider:
    user = provider.user
else:
    provider = FacebookLogin(id=fb_id)
if not user:
    user = User()
    user_key = user.put()
    provider.user_key = user_key
    provider.put()
return user

Answer 1:


One slight variation on your approach, which allows a more flexible model, is to create a separate entity keyed by provider_id and provider_type (or by whatever other auth scheme you come up with).

This entity then holds a reference (key) to the actual user details.

This gives you three things:

  1. You can do a direct get() for the auth details, then get() the actual user details.
  2. The auth details can be changed without rewriting or re-keying the user details.
  3. You can support multiple auth schemes for a single user.

I use this approach for an application that has more than 2000 users; most use a custom auth scheme (app-specific userid/password) or a Google account.

e.g.

from google.appengine.ext import ndb
from google.appengine.ext.ndb import polymodel

class AuthLogin(polymodel.PolyModel):
    user_key = ndb.KeyProperty(kind=User)
    status = ndb.StringProperty()  # maybe you need to disable a particular login without deleting it.
    date_created = ndb.DateTimeProperty(auto_now_add=True)

    @property
    def user(self):
        return self.user_key.get()


class FacebookLogin(AuthLogin):
    # some additional facebook properties
    pass

class TwitterLogin(AuthLogin):
    # Some additional twitter specific properties
    pass

etc...

By using PolyModel as the base class you can do AuthLogin.query().filter(AuthLogin.user_key == user.key) and get all auth types defined for that user, as they all share the same base class AuthLogin. You need this because otherwise you would have to query each supported auth type in turn: you cannot do a kindless query without an ancestor, and in this case we can't use the User as the ancestor because then we couldn't do a simple get() from the login id.

However, note that all subclasses of AuthLogin share the same kind in the key, "AuthLogin", so you still need to concatenate the provider id and the provider type to form the key's id and ensure you have unique keys. E.g.:

dev~fish-and-lily> from google.appengine.ext.ndb.polymodel import PolyModel
dev~fish-and-lily> class X(PolyModel):
...    pass
... 
dev~fish-and-lily> class Y(X):
...    pass
... 
dev~fish-and-lily> class Z(X):
...    pass
... 
dev~fish-and-lily> y = Y(id="abc")
dev~fish-and-lily> y.put()
Key('X', 'abc')
dev~fish-and-lily> z = Z(id="abc")
dev~fish-and-lily> z.put()
Key('X', 'abc')
dev~fish-and-lily> y.key.get()
Z(key=Key('X', 'abc'), class_=[u'X', u'Z'])

dev~fish-and-lily> z.key.get()
Z(key=Key('X', 'abc'), class_=[u'X', u'Z'])

This is the problem you ran into. By adding the provider type as part of the key you now get distinct keys.

dev~fish-and-lily> z = Z(id="Zabc")
dev~fish-and-lily> z.put()
Key('X', 'Zabc')
dev~fish-and-lily> y = Y(id="Yabc")
dev~fish-and-lily> y.put()
Key('X', 'Yabc')
dev~fish-and-lily> y.key.get()
Y(key=Key('X', 'Yabc'), class_=[u'X', u'Y'])
dev~fish-and-lily> z.key.get()
Z(key=Key('X', 'Zabc'), class_=[u'X', u'Z'])
dev~fish-and-lily> 
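The login flow this enables can be sketched without the datastore. The example below is a plain-Python stand-in, not ndb code: the dict `logins` plays the role of the AuthLogin kind, `users` plays the role of the User kind, and the function names `login_key_id` and `get_or_create_user` are hypothetical. It only illustrates the two direct lookups by key and the provider-prefixed id that keeps Facebook and Twitter ids apart.

```python
import itertools

logins = {}   # stand-in for AuthLogin entities: key id -> user id
users = {}    # stand-in for User entities: user id -> user record
_next_user_id = itertools.count(1)


def login_key_id(provider_type, provider_id):
    # Prefix the provider type so equal ids from different providers
    # do not collide under the shared base kind.
    return '%s|%s' % (provider_type, provider_id)


def get_or_create_user(provider_type, provider_id):
    key_id = login_key_id(provider_type, provider_id)
    user_id = logins.get(key_id)          # first lookup: the auth record
    if user_id is None:
        user_id = next(_next_user_id)     # unknown login: create a new user
        users[user_id] = {'id': user_id}
        logins[key_id] = user_id
    return users[user_id]                 # second lookup: the user record
```

With real ndb entities the two dict lookups would correspond to a get() on the login key followed by a get() on its user_key.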

I don't believe this is any less convenient a model for you.

Does all that make sense ;-)




Answer 2:


While @Greg's answer seems OK, I think it's actually a bad idea to associate an external type/id as a key for your entity, because this solution doesn't scale very well.

  • What if you would like to implement your own username/password at one point?
  • What if the user deletes their Facebook account?
  • What if the same user wants to sign in with a Twitter account as well?
  • What if the user has more than one Facebook account?

So the idea of having the type/id as the key looks weak. A better solution would be to have one field per provider that stores only the id, for example facebook_id, twitter_id, google_id, etc., and then query on these fields to retrieve the actual user. These queries only run during sign-in and sign-up, so they are infrequent. Of course, you will have to add some logic to attach another provider to an existing user, or to merge users if the same person signed in with a different provider.
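The per-provider-field lookup can be sketched in plain Python. This is an illustration under assumptions, not ndb code: the list `users` stands in for the User kind, and `find_user` and `sign_in` are hypothetical names. Merging two existing users is left out.

```python
PROVIDER_FIELDS = ('facebook_id', 'twitter_id', 'google_id')

users = []  # stand-in for the User kind; each item is one user record


def find_user(field, external_id):
    """Stand-in for a query such as User.query(User.facebook_id == x)."""
    if field not in PROVIDER_FIELDS:
        raise ValueError('unknown provider field: %r' % field)
    for user in users:
        if field in user and user[field] == external_id:
            return user
    return None


def sign_in(field, external_id, current_user=None):
    """Return the matching user, linking or creating one if needed."""
    user = find_user(field, external_id)
    if user is not None:
        return user
    if current_user is not None:
        # Already signed in: attach the new provider to the same user.
        current_user[field] = external_id
        return current_user
    user = {field: external_id}
    users.append(user)
    return user
```

Because each user record has only one slot per provider, a second Facebook id for the same user has nowhere to go.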

Still, the last solution won't work if you want to support multiple sign-ins from the same provider. To achieve that you will have to create another model that stores only the external providers/ids and associates them with your user model.

As an example of the second solution, you could check my gae-init project, where I'm storing the 3 different providers in the User model and working on them in the auth.py module. Again, this solution doesn't scale very well with more providers and doesn't support multiple IDs from the same provider.




Answer 3:


Concatenating the user-type with their ID is sensible.

You can save on your read and write costs by not duplicating the type and ID as properties though - when you need to use them, just split the ID back up. (Doing this will be simpler if you include a separator between the parts, '%s|%s' % (provider_type, provider_id) for example)
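A minimal sketch of the separator approach, in plain Python. The helper names `make_login_id` and `split_login_id` are hypothetical, and the sketch assumes the provider type never contains the `|` character:

```python
SEPARATOR = '|'  # assumed never to appear in a provider type


def make_login_id(provider_type, provider_id):
    """Build the string used as the entity's key id, e.g. 'fb|1234'."""
    if SEPARATOR in provider_type:
        raise ValueError('provider_type must not contain %r' % SEPARATOR)
    return '%s%s%s' % (provider_type, SEPARATOR, provider_id)


def split_login_id(login_id):
    """Recover (provider_type, provider_id) from a key id."""
    provider_type, sep, provider_id = login_id.partition(SEPARATOR)
    if not sep:
        raise ValueError('not a composite login id: %r' % login_id)
    return provider_type, provider_id
```

The resulting string would be passed as the id, as in ndb.Key('User', make_login_id('fb', '1234')). Since partition splits on the first separator only, a provider id that happens to contain `|` still round-trips.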




Answer 4:


If you want to use a single model, you can do something like:

class User(ndb.Model):
    name = ndb.StringProperty()
    email = ndb.StringProperty(required = True)
    providers = ndb.KeyProperty(repeated=True)

auser = User(id="auser", name="A user", email="auser@example.com")
auser.providers = [
    ndb.Key("ProviderName", "fb", "ProviderId", 123),
    ndb.Key("ProviderName", "tw", "ProviderId", 123)
]
auser.put()

To query for a specific FB login, you simply do:

fbkey = ndb.Key("ProviderName", "fb", "ProviderId", 123)
for entry in User.query(User.providers == fbkey):
    # Do something with the entry
    pass

As ndb does not provide an easy way to create a unique constraint, you could use the _pre_put_hook to ensure that providers is unique.



Source: https://stackoverflow.com/questions/17433607/is-it-possible-to-set-two-fields-as-indexes-on-an-entity-in-ndb
