How do I produce nested JSON from database query with joins? Using Python / SQLAlchemy

≯℡__Kan透↙ 提交于 2019-11-28 14:36:54

Look into marshmallow-sqlalchemy, as it does exactly what you're looking for.

I strongly advise against baking your serialization directly into your model, as you will eventually have two services requesting the same data, but serialized in a different way (including fewer or more nested relationships for performance, for instance), and you will either end up with either (1) a lot of bugs that your test suite will miss unless you're checking for literally every field or (2) more data serialized than you need and you'll run into performance issues as the complexity of your application scales.

With marshmallow-sqlalchemy, you'll need to define a schema for each model you'd like to serialize. Yes, it's a bit of extra boilerplate, but believe me - you will be much happier in the end.

We build applications using flask-sqlalchemy and marshmallow-sqlalchemy like this (also highly recommend factory_boy so that you can mock your service and write unit tests in place of of integration tests that need to touch the database):

# models

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship('Parent', back_populates='children',
                          foreign_keys=[parent_id])

# schemas. Don't put these in your models. Avoid tight coupling here

from marshmallow_sqlalchemy import ModelSchema
import marshmallow as ma


class ParentSchema(ModelSchema):
    children = ma.fields.Nested(
        'myapp.schemas.child.Child', exclude=('parent',), many=True)
    class Meta(ModelSchema.Meta):
        model = Parent
        strict = True
        dump_only = ('id',)


class ChildSchema(ModelSchema):
    parent = ma.fields.Nested(
        'myapp.schemas.parent.Parent', exclude=('children',))
    class Meta(ModelSchema.Meta):
        model = Child
        strict = True
        dump_only = ('id',)

# services

class ParentService:
    '''
    This service intended for use exclusively by /api/parent
    '''
    def __init__(self, params, _session=None):
        # your unit tests can pass in _session=MagicMock()
        self.session = _session or db.session
        self.params = params

    def _parents(self) -> typing.List[Parent]:
        return self.session.query(Parent).options(
            joinedload(Parent.children)
        ).all()

    def get(self):
        schema = ParentSchema(only=(
            # highly recommend specifying every field explicitly
            # rather than implicit
            'id',
            'children.id',
        ))
        return schema.dump(self._parents()).data

# views

@app.route('/api/parent')
def get_parents():
    service = ParentService(params=request.get_json())
    return jsonify(data=service.get())


# test factories
class ModelFactory(SQLAlchemyModelFactory):
    class Meta:
        abstract = True
        sqlalchemy_session = db.session

class ParentFactory(ModelFactory):
    id = factory.Sequence(lambda n: n + 1)
    children = factory.SubFactory('tests.factory.children.ChildFactory')

class ChildFactory(ModelFactory):
    id = factory.Sequence(lambda n: n + 1)
    parent = factory.SubFactory('tests.factory.parent.ParentFactory')

# tests
from unittest.mock import MagicMock, patch

def test_can_serialize_parents():
    parents = ParentFactory.build_batch(4)
    session = MagicMock()
    service = ParentService(params={}, _session=session)
    assert service.session is session
    with patch.object(service, '_parents') as _parents:
        _parents.return_value = parents
        assert service.get()[0]['id'] == parents[0].id
        assert service.get()[1]['id'] == parents[1].id
        assert service.get()[2]['id'] == parents[2].id
        assert service.get()[3]['id'] == parents[3].id

I would add a .json() method to each model, so that they call each other. It's essentially your "hacked" solution but a bit more readable/maintainable. Your Order model could have:

def json(self):
    return {
        "id": self.id,
        "order_lines": [line.json() for line in self.order_lines]
    }

Your OrderLine model could have:

def json(self):
    return {
        "product_id": self.product_id,
        "product_name": self.product.name,
        "quantity": self.quantity
    }

Your resource at the top level (where you're making the request for orders) could then do:

...
orders = Order.query.all()
return {"orders": [order.json() for order in orders]}
...

This is how I normally structure this JSON requirement.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!