I have some code where instances of classes have parent<->child references to each other, e.g.:
class Node(object):
def __init__(self):
self.parent
I wanted to clarify which references can be weak. The following approach is general, but I use the doubly-linked tree in all examples.
Logical Step 1.
You need to ensure that there are strong references to keep all the objects alive as long as you need them. It could be done in many ways, for example by:
Logical Step 2.
Now you add references to represent information, if required.
For instance, if you used [container] approach in Step 1, you still have to represent the edges. An edge between nodes A and B can be represented with a single reference; it can go in either direction. Again, there are many options, for example:
Of course, if you used [root + children] approach in Step 1, all your information is already fully represented, so you skip this step.
Logical Step 3.
Now you add references to improve performance, if desired.
For instance, if you used [container] approach in Step 1, and [children] approach in Step 2, you might desire to improve the speed of certain algorithms, and add references between each each node and its parent. Such information is logically redundant, since you could (at a cost in performance) derive it from existing data.
All the references in Step 1 must be strong.
All the references in Steps 2 and 3 may be weak or strong. There is no advantage to using strong references. There is an advantage to using weak references until you know that cycles are no longer possible. Strictly speaking, once you know that cycles are impossible, it makes no difference whether to use weak or strong references. But to avoid thinking about it, you might as well use exclusively weak references in the Steps 2 and 3.
I suggest using child.parent = weakref.proxy(self)
. This is good solution to avoid circular references in case when lifetime of (external references to) parent
covers lifetime of child
. Contrary, use weakref
for child
(as Alex suggested) when lifetime of child
covers lifetime of parent
. But never use weakref
when both parent
and child
can be alive without other.
Here these rules are illustrated with examples. Use weakref-ed parent if you store root in some variable and pass it around, while children are accessed from it:
def Run():
root, c1, c2 = Node(), Node(), Node()
root.AddChild("first", c1)
root.AddChild("second", c2)
return root # Note that only root refers to c1 and c2 after return,
# so this references should be strong
Use weakref-ed children if you bind all them to variables, while root is accessed through them:
def Run():
root, c1, c2 = Node(), Node(), Node()
root.AddChild("first", c1)
root.AddChild("second", c2)
return c1, c2
But neither will work for the following:
def Run():
root, c1, c2 = Node(), Node(), Node()
root.AddChild("first", c1)
root.AddChild("second", c2)
return c1
Yep, weakref's excellent here. Specifically, instead of:
self.children = {}
use:
self.children = weakref.WeakValueDictionary()
Nothing else needs change in your code. This way, when a child has no other differences, it just goes away -- and so does the entry in the parent's children
map that has that child as the value.
Avoiding reference loops is up high on a par with implementing caches as a motivation for using the weakref
module. Ref loops won't kill you, but they may end up clogging your memory, esp. if some of the classes whose instances are involved in them define __del__
, since that interferes with the gc
's module ability to dissolve those loops.