Should internal class methods return values or just modify instance variables?

I am creating a query builder class that will help in constructing a query for MongoDB from URL params. I have never done much object-oriented programming or designed classes for consumption by people other than myself, beyond using basic language constructs and Django's built-in models.

So I have this QueryHelper class:

class QueryHelper():
    """
    Help abstract out the problem of querying over vastly
    different dataschemas.
    """

    def __init__(self, collection_name, field_name, params_dict):
        self.query_dict = {}
        self.params_dict = params_dict
        db = connection.get_db()
        self.collection = db[collection_name]

    def _build_query(self):
        # check params dict and build a mongo query
        pass

Now in _build_query I will be checking the params_dict and populating query_dict so that it can be passed to mongo's find() function. In doing this I was wondering whether there is an absolutely correct approach as to whether _build_query should return a dictionary or just modify self.query_dict. Since it is an internal method I would assume it is OK to just modify self.query_dict. Is there a right (Pythonic) way of approaching this? Or is this just silly and not an important design decision? Any help is appreciated.

Returning a value is preferable as it allows you to keep all the attribute modifying in one place (__init__). Also, this makes it easier to extend the code later; suppose you want to override _build_query in a subclass, then the overriding method can just return a value, without needing to know which attribute to set. Here's an example:

class QueryHelper(object):
    def __init__(self, param, text):
        self._param = param
        self._query = self._build_query(text)

    def _build_query(self, text):
        return text + " and ham!"

class RefinedQueryHelper(QueryHelper):
    def _build_query(self, text):
        # no need to know how the query object is going to be used
        q = super(RefinedQueryHelper, self)._build_query(text)
        return q.replace("ham", "spam")

vs. the "setter version":

class QueryHelper(object):
    def __init__(self, param, text):
        self._param = param
        self._build_query(text)

    def _build_query(self, text):
        self._query = text + " and ham!"

class RefinedQueryHelper(QueryHelper):
    def _build_query(self, text):
        # what if we want to store the query in __query instead?
        # then we need to modify two classes...
        super(RefinedQueryHelper, self)._build_query(text)
        self._query = self._query.replace("ham", "spam")

If you do choose to set an attribute, you might want to call the method _set_query for clarity.

It's perfectly fine to modify self.query_dict, since the whole idea of object-oriented programming is that methods can modify an object's state. As long as the object is in a consistent state after a method has finished, you're fine. The fact that _build_query is an internal method does not matter. You can also call _build_query in __init__ so that the query is constructed as soon as the object is created.

The decision mostly matters for testing. For testing purposes, it's convenient to test each method individually without necessarily having to inspect the whole object's state. But that does not apply in this case, because we're talking about an internal method: you alone decide when to call it, not other objects or other code.
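To illustrate the testing convenience mentioned above, here is a minimal sketch of a value-returning builder that can be checked without constructing the object or touching a database. The simplified constructor and the "min_" prefix rule are assumptions made up for this example; they are not from the question.

```python
class QueryHelper(object):
    """Simplified stand-in for the question's class (no database connection)."""

    def __init__(self, params_dict):
        self.params_dict = params_dict
        # The only place where the attribute is assigned.
        self.query_dict = self._build_query(params_dict)

    @staticmethod
    def _build_query(params):
        # Hypothetical rule: "min_<field>" becomes a $gte filter,
        # any other key becomes a plain equality match.
        query = {}
        for key, value in params.items():
            if key.startswith("min_"):
                query[key[len("min_"):]] = {"$gte": value}
            else:
                query[key] = value
        return query


# The builder is exercised on its own, with no instance involved.
result = QueryHelper._build_query({"min_age": 21, "city": "Oslo"})
assert result == {"age": {"$gte": 21}, "city": "Oslo"}
```

With the state-modifying version, the same check would need a fully constructed instance followed by an inspection of its query_dict attribute.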

If you return anything at all, I'd suggest self. Returning self from instance methods is convenient for method chaining, since each return value allows another method call on the same object:

foo.add_thing(x).add_thing(y).set_goal(42).execute()

This is sometimes referred to as a "fluent" API.
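A minimal sketch of how such a chain can be supported. The Task class and its attributes are hypothetical; only the method names are taken from the one-liner above.

```python
class Task(object):
    """Hypothetical class whose mutators return self to allow chaining."""

    def __init__(self):
        self.things = []
        self.goal = None

    def add_thing(self, thing):
        self.things.append(thing)
        return self  # returning self is what makes the next call possible

    def set_goal(self, goal):
        self.goal = goal
        return self

    def execute(self):
        # Terminal call: returns a result instead of self.
        return "{} things, goal {}".format(len(self.things), self.goal)


foo = Task()
summary = foo.add_thing("x").add_thing("y").set_goal(42).execute()
print(summary)  # prints "2 things, goal 42"
```

Every call in the chain acts on the same foo object; nothing is copied along the way.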

However, while Python allows method chaining for immutable types such as int and str, it does not provide it for methods of mutable containers such as list and set—by design—so it is arguably not "Pythonic" to do it for your own mutable type. Still, lots of Python libraries do have "fluent" APIs.
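The difference is easy to see with built-in types. A short sketch using only standard str and list methods:

```python
# str methods return new strings, so calls chain naturally.
shout = "  ham and eggs ".strip().upper().replace("HAM", "SPAM")
assert shout == "SPAM AND EGGS"

# list.sort() mutates the list in place and returns None,
# so there is nothing to chain another call onto.
items = [3, 1, 2]
assert items.sort() is None
assert items == [1, 2, 3]

# sorted() leaves its argument alone and returns a new list instead.
original = [3, 1, 2]
assert sorted(original) == [1, 2, 3]
assert original == [3, 1, 2]
```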

A downside is that such an API can make debugging harder. Since you execute the whole statement or none of it, you can't easily see the object at intermediate points within the statement. Of course, I usually find print perfectly adequate for debugging Python code, so I'd just throw a print in any method whose return value I was interested in!

While it's common for methods of an object to directly modify its state, it can sometimes be advantageous for an object to be its own "client" and access its state indirectly through (typically) private accessor methods. In Python you can do this easily by using the built-in property() class/function.

Doing this provides better encapsulation and the benefits that follow from it (insulation from implementation details being the major one). However, it may be impractical because it requires additional code and is often slower, which might hurt performance by an unacceptable amount, so trade-offs often have to be made with respect to this ideal.
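A minimal sketch of that idea, assuming a simplified QueryHelper without the database connection. The type check in the setter and the copy in _build_query are illustrative choices, not something the question specifies.

```python
class QueryHelper(object):
    def __init__(self, params_dict):
        self.params_dict = params_dict
        self._query_dict = {}
        self._build_query()

    @property
    def query_dict(self):
        return self._query_dict

    @query_dict.setter
    def query_dict(self, value):
        # One place to validate, however many methods assign the query.
        if not isinstance(value, dict):
            raise TypeError("query must be a dict")
        self._query_dict = value

    def _build_query(self):
        # The method goes through the property like any outside caller,
        # so it does not depend on how the query is actually stored.
        self.query_dict = dict(self.params_dict)


helper = QueryHelper({"city": "Oslo"})
assert helper.query_dict == {"city": "Oslo"}
```

If the storage later moves to a different attribute, only the two property functions need to change.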

Source: https://stackoverflow.com/questions/11017364/should-internal-class-methods-return-values-or-just-modify-instance-variables

Tags: python, oop, instance-variables