SqlAlchemy optimizations for read-only object models

拥有回忆 提交于 2019-12-03 07:36:13

问题


I have a complex network of objects being spawned from a sqlite database using sqlalchemy ORM mappings. I have quite a few deeply nested:

for parent in owner.collection: 
    for child in parent.collection: 
        for foo in child.collection: 
            do lots of calcs with foo.property 

My profiling is showing me that the sqlalchemy instrumentation is taking a lot of time in this use case.

The thing is: I don't ever change the object model (mapped properties) at runtime, so once they are loaded I don't NEED the instrumentation, or indeed any sqlalchemy overhead at all. After much research, I'm thinking I might have to clone a 'pure python' set of objects from my already loaded 'instrumented objects', but that would be a pain.

Performance is really crucial here (it's a simulator), so maybe writing those layers as C extensions using sqlite api directly would be best. Any thoughts?


回答1:


If you reference a single attribute of a single instance lots of times, a simple trick is to store it in a local variable.

If you want a way to create cheap pure python clones, share the dict object with the original object:

class CheapClone(object):
    def __init__(self, original):
        self.__dict__ = original.__dict__

Creating a copy like this costs about half of the instrumented attribute access and attribute lookups are as fast as normal.

There might also be a way to have the mapper create instances of an uninstrumented class instead of the instrumented one. If I have some time, I might take a look how deeply ingrained is the assumption that populated instances are of the same type as the instrumented class.


Found a quick and dirty way that seems to at least somewhat work on 0.5.8 and 0.6. Didn't test it with inheritance or other features that might interact badly. Also, this touches some non-public API's, so beware of breakage when changing versions.

from sqlalchemy.orm.attributes import ClassManager, instrumentation_registry

class ReadonlyClassManager(ClassManager):
    """Enables configuring a mapper to return instances of uninstrumented 
    classes instead. To use add a readonly_type attribute referencing the
    desired class to use instead of the instrumented one."""
    def __init__(self, class_):
        ClassManager.__init__(self, class_)
        self.readonly_version = getattr(class_, 'readonly_type', None)
        if self.readonly_version:
            # default instantiation logic doesn't know to install finders
            # for our alternate class
            instrumentation_registry._dict_finders[self.readonly_version] = self.dict_getter()
            instrumentation_registry._state_finders[self.readonly_version] = self.state_getter()

    def new_instance(self, state=None):
        if self.readonly_version:
            instance = self.readonly_version.__new__(self.readonly_version)
            self.setup_instance(instance, state)
            return instance
        return ClassManager.new_instance(self, state)

Base = declarative_base()
Base.__sa_instrumentation_manager__ = ReadonlyClassManager

Usage example:

class ReadonlyFoo(object):
    pass

class Foo(Base, ReadonlyFoo):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    name = Column(String(32))

    readonly_type = ReadonlyFoo

assert type(session.query(Foo).first()) is ReadonlyFoo



回答2:


You should be able to disable lazy loading on the relationships in question and sqlalchemy will fetch them all in a single query.




回答3:


Try using a single query with JOINs instead of the python loops.



来源:https://stackoverflow.com/questions/2322437/sqlalchemy-optimizations-for-read-only-object-models

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!