How to implement a garbage collector?

Could anyone point me to a good source on how to implement garbage collection? I am making a lisp-like interpreted language. It currently uses reference counting, but of course that fails at freeing circularly dependent objects.

I've been reading of mark and sweep, tricolor marking, moving and nonmoving, incremental and stop-the-world, but... I don't know what the best way to keep the objects neatly separated into sets while keeping per-object memory overhead at a minimum, or how to do things incrementally.

I've read some languages with reference counting use circular reference detection, which I could use. I am aware I could use freely available collectors like Boehm, but I would like to learn how to do it myself.

I would appreciate any online material with some sort of tutorial or help for people with no experience on the topic like myself.


Could anyone point me to a good source on how to implement garbage collection?

There's a lot of advanced material about garbage collection out there. The Garbage Collection Handbook is great. But I found there was precious little basic introductory information so I wrote some articles about it. Prototyping a mark-sweep garbage collector describes a minimal mark-sweep GC written in F#. The Very Concurrent Garbage Collector describes a more advanced concurrent collector. HLVM is a virtual machine I wrote that includes a stop-the-world collector that handles threading.

The simplest way to implement a garbage collector is:

  • Make sure you can collate the global roots. These are the local and global variables that contain references into the heap. For local variables, push them on to a shadow stack for the duration of their scope.

  • Make sure you can traverse the heap, eg every value in the heap is an object that implements a Visit method that returns all of the references from that object.

  • Keep the set of all allocated values.

  • Allocate by calling malloc and inserting the pointer into the set of all allocated values.

  • When the total size of all allocated values exceeds a quota, kick off the mark and then sweep phases. This recursively traverses the heap accumulating the set of all reachable values.

  • The set difference of the allocated values minus the reachable values is the set of unreachable values. Iterate over them calling free and removing them from the set of allocated values.

  • Set the quota to twice the total size of all allocated values.


  • Check out the following page. It has many links. http://lua-users.org/wiki/GarbageCollection


    As suggested by delnan, I started with a very naïve stop-the-world tri-color mark and sweep algorithm. I managed to keep the objects in the sets by making them linked-list nodes, but it does add a lot of data to each object (the virtual pointer, two pointers to nodes, one enum to hold the color). It works perfectly, no memory lost on valgrind :) From here I might try to add a free list for recycling, or some sort of thing that detects when it is convenient to stop the world, or an incremental approach, or a special allocator to avoid fragmentation, or something else. If you can point me where to find info or advice (I don't know whether you can comment on an answered question) on how to do these things or what to do, I'd be very thankful. I'll be checking Lua's GC in the meantime.

    链接地址: http://www.djcxy.com/p/27162.html

    上一篇: newLISP是否使用垃圾收集?

    下一篇: 如何实现垃圾收集器?