How to efficiently reference count cons cells (detecting cycles)?

Posted by 一世执手 on 2020-01-05 08:37:33

Question


I need to make some sort of liblisp (in C11), and it will need to handle the basic functions, pretty much like what libobjc does for the Objective-C language.

Edit

I'm rewriting the question as something less generic.

I have an implementation like this:

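#include <stdbool.h>  /* bool */

/* A cons cell: two untyped slots that may hold conses or other data. */
typedef struct cons {
  void *car, *cdr;
} *cons_t;

cons_t cons_init(void *, void *);   /* allocate a cell from the pool */
void *cons_get_car(cons_t);
void *cons_get_cdr(cons_t);
void cons_set_car(cons_t, void *);
void cons_set_cdr(cons_t, void *);
void cons_free(cons_t);
bool cons_is_managed(cons_t);       /* true if the cell lives in the pool */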

With this I can create a cons cell (cons_init uses a memory pool of reference-counted objects). I can also use cons_is_managed to check whether a cons cell lives inside the memory pool, so externally defined cells that were not created with cons_init (static data, for example) can be used as well.

How could I efficiently implement automatic reference counting here, so that a call to cons_set_car or cons_set_cdr increments the reference count when the void * argument is a managed cons cell?

The tortoise and the hare algorithm wouldn't be useful here, because each cell has two possible ways to go (or none, if neither car nor cdr is a cons); the structures can be lists, trees, or graphs.

I should probably register the external (non-managed) conses passed to cons_set_car/cons_set_cdr in order to find cycles that involve them, but I'm still not sure how to do this efficiently.

Since this is a more controlled context than general cycles in graphs (at most two outgoing edges per node), is there any chance I could do this in linear time and avoid a garbage collector (which is my plan B)?

The main problem is that this is the core of any functional language, so these functions will be called very often (like objc_msgSend); they are the bottleneck.

Thanks.


To simplify the question with a different approach: how could one implement a cons cell in a language based on reference counting, like Objective-C + ARC or Vala?


Answer 1:


I'm assuming the main goal of the reference counting that you are aiming to implement is efficient garbage collection (even though you say "avoid garbage collection", it is clear that you are aiming to implement automatic memory management).

First, I would advise you to consider switching to some form of tracing garbage collection, which most modern Lisp implementations use. The basic difference between tracing and reference counting is a positive versus negative relationship to memory: with reference counting, allocated elements are assumed to be live until proved otherwise (by the count dropping to zero or, for cycles, by a graph traversal algorithm). With tracing, allocated elements are assumed to be garbage until proved otherwise (by reachability from a root set of objects, such as the REPL interface).

Yes, you can take an occasional significant performance hit while a mark-and-sweep algorithm is running, but depending on the use you are aiming for with your library, that may be worth it. Similarly, if you manage threading carefully, you can have one core handle garbage collection while the others continue execution. Often the most efficient option is a hybrid strategy: perform "cheap" reference counting that cannot handle cycles to take care of the high-frequency easy cases, then use tracing methods to collect the cyclic garbage as it accumulates.

As for how to do it efficiently... if you want to do reference counting, you need to store one number per cons. Why not just store it in the struct?

#include <stddef.h>  /* size_t */

typedef struct cons {
  void *car, *cdr;
  size_t reference_count;  /* number of live references to this cell */
} *cons_t;
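As a rough sketch of how that stored count could drive the setters, assuming a fixed-size static pool: the pool layout, POOL_SIZE, free_list, cons_retain and cons_release below are hypothetical names that do not appear in the question, and cons_set_cdr would mirror cons_set_car. This version does not detect cycles.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct cons {
  void *car, *cdr;
  size_t reference_count;
} *cons_t;

/* Hypothetical pool: the question does not show its real layout. */
#define POOL_SIZE 1024
static struct cons pool[POOL_SIZE];
static size_t pool_next = 0;       /* next never-used slot */
static cons_t free_list = NULL;    /* freed slots, linked through cdr */

/* A pointer is managed if it points exactly at a slot of the pool. */
bool cons_is_managed(cons_t c) {
  uintptr_t p = (uintptr_t)c;
  uintptr_t lo = (uintptr_t)pool, hi = (uintptr_t)(pool + POOL_SIZE);
  return p >= lo && p < hi && (p - lo) % sizeof(struct cons) == 0;
}

/* Increment the count only when p is a managed cons. */
void cons_retain(void *p) {
  if (cons_is_managed(p)) ((cons_t)p)->reference_count++;
}

/* Decrement the count; on zero, free the cell and release its children.
   Iterates along cdr so long lists do not recurse deeply. */
void cons_release(void *p) {
  while (cons_is_managed(p)) {
    cons_t c = p;
    if (--c->reference_count > 0) return;
    void *car = c->car, *next = c->cdr;
    c->car = NULL;
    c->cdr = free_list;            /* reuse cdr as the free-list link */
    free_list = c;
    cons_release(car);             /* recurse on car only */
    p = next;                      /* continue with the old cdr */
  }
}

/* Allocate a cell holding one reference owned by the caller. */
cons_t cons_init(void *car, void *cdr) {
  cons_t c;
  if (free_list) { c = free_list; free_list = c->cdr; }
  else if (pool_next < POOL_SIZE) c = &pool[pool_next++];
  else return NULL;                /* pool exhausted */
  c->car = car;
  c->cdr = cdr;
  c->reference_count = 1;
  cons_retain(car);
  cons_retain(cdr);
  return c;
}

void cons_set_car(cons_t c, void *v) {
  cons_retain(v);                  /* retain first: safe when v is the old car */
  void *old = c->car;
  c->car = v;
  cons_release(old);
}

void cons_free(cons_t c) { cons_release(c); }
```

Each setter call costs one address-range check plus an increment and a decrement, so the hot path stays constant time. External (non-managed) cells are simply ignored by the counting here.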

If you adopt a hybrid strategy, then high-frequency operations like list processing in maps, reductions, and recursive functions can be handled in O(n) time where n is the number of elements to be garbage collected.
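The tracing half of such a hybrid could look roughly like the sketch below, which marks every cell reachable from a root set and sweeps the rest of the pool. It is an illustration under stated assumptions, not a definitive design: the in_use and marked flags, POOL_SIZE, cons_alloc and gc_collect are hypothetical names, the roots are passed in explicitly, and the reference count is left out to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 256              /* hypothetical pool size */

struct cons {
  void *car, *cdr;
  bool in_use, marked;             /* hypothetical collector bookkeeping */
};
typedef struct cons *cons_t;

static struct cons pool[POOL_SIZE];

/* True if p points exactly at a slot of the pool. */
static bool is_managed(const void *p) {
  uintptr_t a = (uintptr_t)p;
  uintptr_t lo = (uintptr_t)pool, hi = (uintptr_t)(pool + POOL_SIZE);
  return a >= lo && a < hi && (a - lo) % sizeof(struct cons) == 0;
}

/* Linear-scan allocation: slow, but enough for a sketch. */
cons_t cons_alloc(void *car, void *cdr) {
  for (size_t i = 0; i < POOL_SIZE; i++) {
    if (!pool[i].in_use) {
      pool[i] = (struct cons){ car, cdr, true, false };
      return &pool[i];
    }
  }
  return NULL;                     /* pool exhausted */
}

/* Mark everything reachable from p. Already-marked cells stop the walk,
   which is what makes cycles terminate. */
static void mark(void *p) {
  while (is_managed(p)) {
    struct cons *c = p;
    if (!c->in_use || c->marked) return;
    c->marked = true;
    mark(c->car);                  /* recurse on car */
    p = c->cdr;                    /* iterate on cdr */
  }
}

/* Mark from the roots, then free every in-use cell left unmarked.
   Returns the number of cells reclaimed. */
size_t gc_collect(cons_t *roots, size_t nroots) {
  for (size_t i = 0; i < nroots; i++) mark(roots[i]);
  size_t freed = 0;
  for (size_t i = 0; i < POOL_SIZE; i++) {
    struct cons *c = &pool[i];
    if (c->in_use && !c->marked) {
      c->in_use = false;
      c->car = c->cdr = NULL;
      freed++;
    }
    c->marked = false;             /* reset for the next collection */
  }
  return freed;
}
```

Two cells that point at each other each keep the other's count at one under plain reference counting, so they are never freed by it. A pass like gc_collect reclaims them once nothing in the root set reaches them, and it only needs to run occasionally.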



Source: https://stackoverflow.com/questions/19142499/how-to-efficiently-reference-count-cons-cells-detecting-cycles
