Proper Repository Pattern Design in PHP?

天涯浪人 2020-11-29 14:22

Preface: I'm attempting to use the repository pattern in an MVC architecture with relational databases.

I've recently started learning TDD in PHP, and I'

11 Answers
  • 2020-11-29 14:56

    I use the following interfaces:

    • Repository - loads, inserts, updates and deletes entities
    • Selector - finds entities based on filters, in a repository
    • Filter - encapsulates the filtering logic

    My Repository is database agnostic; in fact, it doesn't specify any persistence mechanism. It could be anything: an SQL database, an XML file, a remote service, an alien from outer space, etc. For searching, the Repository constructs a Selector which can be filtered, limited, sorted and counted. In the end, the Selector fetches one or more Entities from the persistence layer.

    Here is some sample code:

    <?php
    interface Repository
    {
        public function addEntity(Entity $entity);
    
        public function updateEntity(Entity $entity);
    
        public function removeEntity(Entity $entity);
    
        /**
         * @return Entity
         */
        public function loadEntity($entityId);
    
        public function factoryEntitySelector(): Selector;
    }
    
    
    interface Selector extends \Countable
    {
        public function count();
    
        /**
         * @return Entity[]
         */
        public function fetchEntities();
    
        /**
         * @return Entity
         */
        public function fetchEntity();
        public function limit(...$limit);
        public function filter(Filter $filter);
        public function orderBy($column, $ascending = true);
        public function removeFilter($filterName);
    }
    
    interface Filter
    {
        public function getFilterName();
    }
    

    Then, one implementation:

    class SqlEntityRepository implements Repository
    {
        // ...
        public function factoryEntitySelector(): Selector
        {
            return new SqlSelector($this);
        }
        // ...
    }
    
    class SqlSelector implements Selector
    {
        // ...
        private function adaptFilter(Filter $filter): SqlQueryFilter
        {
            return (new SqlSelectorFilterAdapter())->adaptFilter($filter);
        }
        // ...
    }
    class SqlSelectorFilterAdapter
    {
        public function adaptFilter(Filter $filter):SqlQueryFilter
        {
            $concreteClass = (new StringRebaser(
                'Filter\\', 'SqlQueryFilter\\'))
                ->rebase(get_class($filter));
    
            return new $concreteClass($filter);
        }
    }
    

    The idea is that the generic Selector uses Filter, but the SqlSelector implementation uses SqlQueryFilter; the SqlSelectorFilterAdapter adapts a generic Filter to a concrete SqlQueryFilter.

    The client code creates Filter objects (which are generic filters), but in the concrete implementation of the selector those filters are transformed into SQL filters.

    Other selector implementations, like InMemorySelector, transform from Filter to InMemoryFilter using their specific InMemorySelectorFilterAdapter; so, every selector implementation comes with its own filter adapter.
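
    As a rough illustration of that last point, here is a minimal sketch of what an in-memory filter adapter might look like. It is not my real code: InMemoryAttributeEquals, the public properties on AttributeEquals, and the "InMemory" class-name prefix convention (used here instead of the namespace rebasing shown above) are all assumptions made for the example.

    ```php
    <?php
    // Hypothetical sketch: every name except Filter, AttributeEquals and
    // InMemorySelectorFilterAdapter is invented for illustration.

    interface Filter
    {
        public function getFilterName();
    }

    // Generic filter created by client code; it knows nothing about storage.
    class AttributeEquals implements Filter
    {
        public $attribute;
        public $value;

        public function __construct($attribute, $value)
        {
            $this->attribute = $attribute;
            $this->value = $value;
        }

        public function getFilterName()
        {
            return 'equals:' . $this->attribute;
        }
    }

    // In-memory counterpart: turns the generic filter into a predicate over rows.
    class InMemoryAttributeEquals
    {
        private $filter;

        public function __construct(AttributeEquals $filter)
        {
            $this->filter = $filter;
        }

        public function matches(array $row)
        {
            return array_key_exists($this->filter->attribute, $row)
                && $row[$this->filter->attribute] == $this->filter->value;
        }
    }

    class InMemorySelectorFilterAdapter
    {
        public function adaptFilter(Filter $filter)
        {
            // Assumed convention: prefix the generic class name with "InMemory".
            $concreteClass = 'InMemory' . get_class($filter);

            return new $concreteClass($filter);
        }
    }

    $rows = [
        ['username' => 'ann', 'activated' => 1],
        ['username' => 'bob', 'activated' => 0],
        ['username' => 'cid', 'activated' => 1],
    ];

    $concrete = (new InMemorySelectorFilterAdapter())
        ->adaptFilter(new AttributeEquals('activated', 1));
    $activated = array_values(array_filter($rows, [$concrete, 'matches']));
    ```

    The client only ever builds AttributeEquals; the adapter decides which concrete class does the matching.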

    Using this strategy, my client code (in the business layer) doesn't care about a specific repository or selector implementation.

    /** @var Repository $repository*/
    $selector = $repository->factoryEntitySelector();
    $selector->filter(new AttributeEquals('activated', 1))->limit(2)->orderBy('username');
    $activatedUserCount = $selector->count(); // evaluates to 100, ignores the limit()
    $activatedUsers = $selector->fetchEntities();
    

    P.S. This is a simplification of my real code.

  • 2020-11-29 14:56

    I can only comment on the way we (at my company) deal with this. Performance is not much of an issue for us, but having clean, proper code is.

    First, we define models such as a UserModel that uses an ORM to create UserEntity objects. When a UserEntity is loaded from a model, all of its fields are loaded. For fields referencing foreign entities, we use the appropriate foreign model to create the respective entities; for those entities the data is loaded on demand. Your initial reaction might be ...???...!!!, so let me give you a bit of an example:

    class UserEntity extends PersistentEntity
    {
        public function getOrders()
        {
            return $this->getField('orders'); // OrderModel creates OrderEntities with only the ids set
        }
    }
    
    class UserModel {
        protected $orm;
    
        public function findUsers(IGetOptions $options = null)
        {
            return $this->orm->getAllEntities(/*...*/); // ORM creates a list of UserEntities
        }
    }
    
    class OrderEntity extends PersistentEntity {} // use your imagination
    class OrderModel
    {
        public function findOrdersById(array $ids, IGetOptions $options = null)
        {
            //...
        }
    }
    

    In our case $orm is an ORM that is able to load entities. The model instructs the ORM to load a set of entities of a specific type. The ORM contains a mapping and uses it to inject all the fields for that entity into the entity. For foreign fields, however, only the ids of those objects are loaded. In this case the OrderModel creates OrderEntity objects with only the ids of the referenced orders. When PersistentEntity::getField gets called by the OrderEntity, the entity instructs its model to lazy load all the fields into the OrderEntity objects. All the OrderEntity objects associated with one UserEntity are treated as one result set and are loaded at once.

    The magic here is that our model and ORM inject all data into the entities, and the entities merely provide wrapper functions for the generic getField method supplied by PersistentEntity. To summarize: we always load all the fields, but fields referencing a foreign entity are loaded only when necessary. Just loading a bunch of fields is not really a performance issue. Loading all possible foreign entities, however, would be a HUGE performance hit.
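
    To make the "loaded at once" part more concrete, here is a minimal, self-contained sketch of batch lazy loading. It is not our framework code: LazyBatch, the loader callback and the standalone OrderEntity below are hypothetical stand-ins, and the "query" is faked with an array.

    ```php
    <?php
    // Hypothetical sketch of on-demand loading for a group of related entities.

    class LazyBatch
    {
        private $ids;
        private $loader;
        private $rows = null;

        public function __construct(array $ids, callable $loader)
        {
            $this->ids = $ids;
            $this->loader = $loader;
        }

        public function rowFor($id)
        {
            if ($this->rows === null) {
                // First access: fetch every row of the batch in a single call.
                $this->rows = call_user_func($this->loader, $this->ids);
            }

            return $this->rows[$id];
        }
    }

    class OrderEntity
    {
        private $id;
        private $batch;

        public function __construct($id, LazyBatch $batch)
        {
            $this->id = $id;
            $this->batch = $batch;
        }

        public function getField($name)
        {
            if ($name === 'id') {
                return $this->id; // already known, no load required
            }

            $row = $this->batch->rowFor($this->id);

            return $row[$name];
        }
    }

    $loadCalls = 0;
    $loader = function (array $ids) use (&$loadCalls) {
        $loadCalls++;
        $rows = [];
        foreach ($ids as $id) {
            $rows[$id] = ['id' => $id, 'total' => $id * 10]; // stand-in for a real query
        }

        return $rows;
    };

    $batch = new LazyBatch([1, 2, 3], $loader);
    $orders = [new OrderEntity(1, $batch), new OrderEntity(2, $batch), new OrderEntity(3, $batch)];

    $firstId = $orders[0]->getField('id');    // nothing loaded yet
    $total2 = $orders[1]->getField('total');  // triggers the single batch load
    $total3 = $orders[2]->getField('total');  // served from the already loaded batch
    ```

    Reading the id never touches storage, and the first non-id field read loads the whole group in one go.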

    Now on to loading a specific set of users based on a WHERE clause. We provide an object-oriented package of classes that lets you specify simple expressions that can be glued together. In the example code I named it GetOptions (type-hinted as IGetOptions). It is a wrapper for all possible options of a SELECT query: a collection of WHERE clauses, a GROUP BY clause and everything else. Our WHERE clauses are quite complicated, but you could easily make a simpler version.

    $objOptions->getConditionHolder()->addConditionBind(
        new ConditionBind(
            new Condition('orderProduct.product', ICondition::OPERATOR_IS, $argObjProduct)
        )
    );
    

    The simplest version of this system would be to pass the WHERE part of the query as a string directly to the model.
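
    Somewhere between a raw string and our full condition classes, a small options object could collect conditions and produce a parameterized WHERE fragment. The sketch below is only an illustration under that assumption; SimpleCondition and SimpleGetOptions are hypothetical names, not the classes from our framework.

    ```php
    <?php
    // Hypothetical sketch: a pared-down condition holder, not the real GetOptions.

    class SimpleCondition
    {
        const OPERATOR_IS = '=';
        const OPERATOR_GREATER = '>';

        public $field;
        public $operator;
        public $value;

        public function __construct($field, $operator, $value)
        {
            $this->field = $field;
            $this->operator = $operator;
            $this->value = $value;
        }
    }

    class SimpleGetOptions
    {
        private $conditions = [];

        public function addCondition(SimpleCondition $condition)
        {
            $this->conditions[] = $condition;

            return $this;
        }

        /**
         * @return array [WHERE fragment with placeholders, values to bind]
         */
        public function toWhere()
        {
            $parts = [];
            $params = [];
            foreach ($this->conditions as $condition) {
                // Field names and operators must come from trusted code;
                // only the values are bound as parameters.
                $parts[] = $condition->field . ' ' . $condition->operator . ' ?';
                $params[] = $condition->value;
            }

            return [implode(' AND ', $parts), $params];
        }
    }

    $options = (new SimpleGetOptions())
        ->addCondition(new SimpleCondition('status', SimpleCondition::OPERATOR_IS, 'open'))
        ->addCondition(new SimpleCondition('total', SimpleCondition::OPERATOR_GREATER, 100));

    list($where, $params) = $options->toWhere();
    ```

    The model can then append the fragment to its query and bind the values, which avoids concatenating user input into SQL.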

    I'm sorry for this quite complicated response. I tried to summarize our framework as quickly and clearly as possible. If you have any additional questions, feel free to ask them and I'll update my answer.

    EDIT: Additionally, if you really don't want to load some fields right away, you could specify a lazy loading option in your ORM mapping. Because all fields are eventually loaded through the getField method, you could load some fields at the last minute, when that method is called. This is not a very big problem in PHP, but I would not recommend it for other systems.

  • 2020-11-29 15:02

    I thought I'd take a crack at answering my own question. What follows is just one way of solving the issues 1-3 in my original question.

    Disclaimer: I may not always use the right terms when describing patterns or techniques. Sorry for that.

    The Goals:

    • Create a complete example of a basic controller for viewing and editing Users.
    • All code must be fully testable and mockable.
    • The controller should have no idea where the data is stored (meaning it can be changed).
    • Example to show a SQL implementation (most common).
    • For maximum performance, controllers should only receive the data they need—no extra fields.
    • Implementation should leverage some type of data mapper for ease of development.
    • Implementation should have the ability to perform complex data lookups.

    The Solution

    I'm splitting my persistent storage (database) interaction into two categories: R (Read) and CUD (Create, Update, Delete). My experience has been that reads are really what cause an application to slow down. And while data manipulation (CUD) is actually slower, it happens much less frequently, and is therefore much less of a concern.

    CUD (Create, Update, Delete) is easy. This will involve working with actual models, which are then passed to my Repositories for persistence. Note, my repositories will still provide a Read method, but simply for object creation, not display. More on that later.

    R (Read) is not so easy. No models here, just value objects. Use arrays if you prefer. These objects may represent a single model or a blend of many models, anything really. These are not very interesting on their own, but how they are generated is. I'm using what I'm calling Query Objects.

    The Code:

    User Model

    Let's start simple with our basic user model. Note that there is no ORM extending or database stuff at all. Just pure model glory. Add your getters, setters, validation, whatever.

    class User
    {
        public $id;
        public $first_name;
        public $last_name;
        public $gender;
        public $email;
        public $password;
    }
    

    Repository Interface

    Before I create my user repository, I want to create my repository interface. This will define the "contract" that repositories must follow in order to be used by my controller. Remember, my controller will not know where the data is actually stored.

    Note that my repositories will only ever contain these three methods. The save() method is responsible for both creating and updating users, depending simply on whether or not the user object has an id set.

    interface UserRepositoryInterface
    {
        public function find($id);
        public function save(User $user);
        public function remove(User $user);
    }
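
    To illustrate the contract, and the create-or-update behaviour of save() in particular, here is a minimal in-memory implementation of the kind one might use in tests. InMemoryUserRepository is a hypothetical class added for this sketch; it assumes that "no id set" means the id property is null.

    ```php
    <?php
    // Hypothetical sketch: an array-backed repository, e.g. for unit tests.

    class User
    {
        public $id;
        public $first_name;
        public $last_name;
        public $gender;
        public $email;
        public $password;
    }

    interface UserRepositoryInterface
    {
        public function find($id);
        public function save(User $user);
        public function remove(User $user);
    }

    class InMemoryUserRepository implements UserRepositoryInterface
    {
        private $users = [];
        private $nextId = 1;

        public function find($id)
        {
            // Return a copy so that changes only persist through save().
            return isset($this->users[$id]) ? clone $this->users[$id] : null;
        }

        public function save(User $user)
        {
            if ($user->id === null) {
                $user->id = $this->nextId++; // no id yet: this is a create
            }

            // Existing id: overwrite the stored copy (update).
            $this->users[$user->id] = clone $user;
        }

        public function remove(User $user)
        {
            unset($this->users[$user->id]);
        }
    }

    $repository = new InMemoryUserRepository();

    $user = new User();
    $user->first_name = 'Ada';
    $repository->save($user);        // create: an id is assigned

    $user->first_name = 'Grace';
    $repository->save($user);        // update: same id, new data

    $reloaded = $repository->find($user->id);
    $repository->remove($user);
    $afterRemove = $repository->find($user->id);
    ```

    Because the controller only depends on UserRepositoryInterface, it cannot tell this version apart from an SQL-backed one.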
    

    SQL Repository Implementation

    Now to create my implementation of the interface. As mentioned, my example was going to be with an SQL database. Note the use of a data mapper to prevent having to write repetitive SQL queries.

    class SQLUserRepository implements UserRepositoryInterface
    {
        protected $db;
    
        public function __construct(Database $db)
        {
            $this->db = $db;
        }
    
        public function find($id)
        {
            // Find a record with the id = $id
            // from the 'users' table
            // and return it as a User object
            return $this->db->find($id, 'users', 'User');
        }
    
        public function save(User $user)
        {
            // Insert or update the $user
            // in the 'users' table
            $this->db->save($user, 'users');
        }
    
        public function remove(User $user)
        {
            // Remove the $user
            // from the 'users' table
            $this->db->remove($user, 'users');
        }
    }
    

    Query Object Interface

    Now with CUD (Create, Update, Delete) taken care of by our repository, we can focus on the R (Read). Query objects are simply an encapsulation of some type of data lookup logic. They are not query builders. By abstracting a query object the same way as our repository, we can change its implementation and test it more easily. An example of a Query Object might be an AllUsersQuery or AllActiveUsersQuery, or even MostCommonUserFirstNames.

    You may be thinking "can't I just create methods in my repositories for those queries?" Yes, but here is why I'm not doing this:

    • My repositories are meant for working with model objects. In a real world app, why would I ever need to get the password field if I'm looking to list all my users?
    • Repositories are often model specific, yet queries often involve more than one model. So what repository do you put your method in?
    • This keeps my repositories very simple, not a bloated class of methods.
    • All queries are now organized into their own classes.
    • Really, at this point, repositories exist simply to abstract my database layer.

    For my example I'll create a query object to lookup "AllUsers". Here is the interface:

    interface AllUsersQueryInterface
    {
        public function fetch($fields);
    }
    

    Query Object Implementation

    This is where we can use a data mapper again to help speed up development. Notice that I am allowing one tweak to the returned dataset—the fields. This is about as far as I want to go with manipulating the performed query. Remember, my query objects are not query builders. They simply perform a specific query. However, since I know that I'll probably be using this one a lot, in a number of different situations, I'm giving myself the ability to specify the fields. I never want to return fields I don't need!

    class AllUsersQuery implements AllUsersQueryInterface
    {
        protected $db;
    
        public function __construct(Database $db)
        {
            $this->db = $db;
        }
    
        public function fetch($fields)
        {
            return $this->db->select($fields)->from('users')->orderBy('last_name, first_name')->rows();
        }
    }
    

    Before moving on to the controller, I want to show another example to illustrate how powerful this is. Maybe I have a reporting engine and need to create a report for AllOverdueAccounts. This could be tricky with my data mapper, and I may want to write some actual SQL in this situation. No problem, here is what this query object could look like:

    class AllOverdueAccountsQuery implements AllOverdueAccountsQueryInterface
    {
        protected $db;
    
        public function __construct(Database $db)
        {
            $this->db = $db;
        }
    
        public function fetch()
        {
            return $this->db->query($this->sql())->rows();
        }
    
        public function sql()
        {
            return "SELECT...";
        }
    }
    

    This nicely keeps all my logic for this report in one class, and it's easy to test. I can mock it to my heart's content, or even use a different implementation entirely.
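
    As an example of "a different implementation entirely", here is a rough sketch of an array-backed stand-in for the all-users query that could be handed to a controller in a test. ArrayAllUsersQuery and the sample rows are hypothetical; the sketch only mirrors the field selection and the last_name, first_name ordering of the SQL version.

    ```php
    <?php
    // Hypothetical sketch: a query object that reads from an array instead of SQL.

    interface AllUsersQueryInterface
    {
        public function fetch($fields);
    }

    class ArrayAllUsersQuery implements AllUsersQueryInterface
    {
        private $rows;

        public function __construct(array $rows)
        {
            $this->rows = $rows;
        }

        public function fetch($fields)
        {
            $rows = $this->rows;

            // Same ordering as the SQL version: last_name, then first_name.
            usort($rows, function (array $a, array $b) {
                return [$a['last_name'], $a['first_name']]
                    <=> [$b['last_name'], $b['first_name']];
            });

            // Keep only the requested fields, like select($fields) would.
            $wanted = array_flip($fields);
            $result = [];
            foreach ($rows as $row) {
                $result[] = array_intersect_key($row, $wanted);
            }

            return $result;
        }
    }

    $query = new ArrayAllUsersQuery([
        ['id' => 1, 'first_name' => 'Grace', 'last_name' => 'Hopper', 'password' => 'x'],
        ['id' => 2, 'first_name' => 'Ada', 'last_name' => 'Lovelace', 'password' => 'y'],
        ['id' => 3, 'first_name' => 'Alan', 'last_name' => 'Hopper', 'password' => 'z'],
    ]);

    $users = $query->fetch(['first_name', 'last_name']);
    ```

    The password column never reaches the caller, and the controller under test receives the same shape of data it would get from the SQL-backed query.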

    The Controller

    Now the fun part—bringing all the pieces together. Note that I am using dependency injection. Typically dependencies are injected into the constructor, but I actually prefer to inject them right into my controller methods (routes). This minimizes the controller's object graph, and I actually find it more legible. Note, if you don't like this approach, just use the traditional constructor method.

    class UsersController
    {
        public function index(AllUsersQueryInterface $query)
        {
            // Fetch user data
            $users = $query->fetch(['first_name', 'last_name', 'email']);
    
            // Return view
            return Response::view('all_users.php', ['users' => $users]);
        }
    
        public function add()
        {
            return Response::view('add_user.php');
        }
    
        public function insert(UserRepositoryInterface $repository)
        {
            // Create new user model
            $user = new User;
            $user->first_name = $_POST['first_name'];
            $user->last_name = $_POST['last_name'];
            $user->gender = $_POST['gender'];
            $user->email = $_POST['email'];
    
            // Save the new user
            $repository->save($user);
    
            // Return the id
            return Response::json(['id' => $user->id]);
        }
    
        public function view(SpecificUserQueryInterface $query, $id)
        {
            // Load user data
            if (!$user = $query->fetch($id, ['first_name', 'last_name', 'gender', 'email'])) {
                return Response::notFound();
            }
    
            // Return view
            return Response::view('view_user.php', ['user' => $user]);
        }
    
        public function edit(SpecificUserQueryInterface $query, $id)
        {
            // Load user data
            if (!$user = $query->fetch($id, ['first_name', 'last_name', 'gender', 'email'])) {
                return Response::notFound();
            }
    
            // Return view
            return Response::view('edit_user.php', ['user' => $user]);
        }
    
        public function update(UserRepositoryInterface $repository, $id)
        {
            // Load user model
            if (!$user = $repository->find($id)) {
                return Response::notFound();
            }
    
            // Update the user
            $user->first_name = $_POST['first_name'];
            $user->last_name = $_POST['last_name'];
            $user->gender = $_POST['gender'];
            $user->email = $_POST['email'];
    
            // Save the user
            $repository->save($user);
    
            // Return success
            return true;
        }
    
        public function delete(UserRepositoryInterface $repository, $id)
        {
            // Load user model
            if (!$user = $repository->find($id)) {
                return Response::notFound();
            }
    
            // Delete the user
            $repository->remove($user);
    
            // Return success
            return true;
        }
    }
    

    Final Thoughts:

    The important thing to note here is that when I'm modifying (creating, updating or deleting) entities, I'm working with real model objects and performing the persistence through my repositories.

    However, when I'm displaying (selecting data and sending it to the views), I'm not working with model objects but rather plain old value objects. I only select the fields I need, and it's designed so I can maximize my data lookup performance.

    My repositories stay very clean, and instead this "mess" is organized into my model queries.

    I use a data mapper to help with development, as it's just ridiculous to write repetitive SQL for common tasks. However, you absolutely can write SQL where needed (complicated queries, reporting, etc.). And when you do, it's nicely tucked away into a properly named class.

    I'd love to hear your take on my approach!


    July 2015 Update:

    I've been asked in the comments where I ended up with all this. Well, not that far off actually. Truthfully, I still don't really like repositories. I find them overkill for basic lookups (especially if you're already using an ORM), and messy when working with more complicated queries.

    I generally work with an ActiveRecord style ORM, so most often I'll just reference those models directly throughout my application. However, in situations where I have more complex queries, I'll use query objects to make these more reusable. I should also note that I always inject my models into my methods, making them easier to mock in my tests.

  • 2020-11-29 15:03

    Issue #3: Impossible to match an interface

    I see the benefit in using interfaces for repositories, so I can swap out my implementation (for testing purposes or other). My understanding of interfaces is that they define a contract that an implementation must follow. This is great until you start adding additional methods to your repositories like findAllInCountry(). Now I need to update my interface to also have this method, otherwise other implementations may not have it, and that could break my application. But this feels insane...a case of the tail wagging the dog.

    My gut tells me this may call for an interface that declares query-optimized methods alongside generic methods. Performance-sensitive queries should have targeted methods, while infrequent or lightweight queries get handled by a generic handler, maybe at the expense of the controller doing a little more juggling.

    The generic methods would allow any query to be implemented, and so would prevent breaking changes during a transition period. The targeted methods allow you to optimize a call when it makes sense to, and it can be applied to multiple service providers.

    This approach would be akin to hardware implementations performing specific optimized tasks, while software implementations do the light work or flexible implementation.
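
    One possible reading of this idea, sketched under my own assumptions: every implementation supports a generic findBy(), and a separate optional interface carries the targeted method. All names below (UserFinder, FindsUsersByCountry, ArrayUserFinder, IndexedUserFinder, usersInCountry) are hypothetical.

    ```php
    <?php
    // Hypothetical sketch: generic method for everyone, targeted method where it pays off.

    interface UserFinder
    {
        // Generic method every implementation must support.
        public function findBy(array $criteria);
    }

    interface FindsUsersByCountry
    {
        // Targeted method for implementations that can optimize this lookup.
        public function findAllInCountry($country);
    }

    class ArrayUserFinder implements UserFinder
    {
        private $rows;

        public function __construct(array $rows)
        {
            $this->rows = $rows;
        }

        public function findBy(array $criteria)
        {
            return array_values(array_filter($this->rows, function (array $row) use ($criteria) {
                foreach ($criteria as $field => $value) {
                    if (!isset($row[$field]) || $row[$field] !== $value) {
                        return false;
                    }
                }

                return true;
            }));
        }
    }

    class IndexedUserFinder extends ArrayUserFinder implements FindsUsersByCountry
    {
        private $byCountry = [];

        public function __construct(array $rows)
        {
            parent::__construct($rows);
            foreach ($rows as $row) {
                $this->byCountry[$row['country']][] = $row; // precomputed index
            }
        }

        public function findAllInCountry($country)
        {
            return isset($this->byCountry[$country]) ? $this->byCountry[$country] : [];
        }
    }

    // The caller does the "juggling": use the targeted method when it exists.
    function usersInCountry(UserFinder $finder, $country)
    {
        if ($finder instanceof FindsUsersByCountry) {
            return $finder->findAllInCountry($country);
        }

        return $finder->findBy(['country' => $country]);
    }

    $rows = [
        ['name' => 'Ann', 'country' => 'NL'],
        ['name' => 'Bob', 'country' => 'US'],
        ['name' => 'Cas', 'country' => 'NL'],
    ];

    $generic = usersInCountry(new ArrayUserFinder($rows), 'NL');
    $targeted = usersInCountry(new IndexedUserFinder($rows), 'NL');
    ```

    Adding a new targeted method then does not break implementations that only offer the generic path.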

  • 2020-11-29 15:03
       class Criteria {}
       class Select {}
       class Count {}
       class Delete {}
       class Update {}
       class FieldFilter {}
       class InArrayFilter {}
       // ...
    
       $criteria = new Criteria();
       $filter = new FieldFilter();
       $filter->set($criteria, $entity, $property, $value);
       $select = new Select($criteria);
       $count = new Count($criteria);
       $count->getRowCount();
       $select->fetchOne(); // fetchAll();
    

    So I think
