Clean up re.match objects

This loop is used in barcode scanning software. It runs once per scanned barcode, which can be hundreds of times an hour.

# locpats is a list of regular expression patterns of possible depot locations

for pat in locpats:
    q = re.match(pat, scannedcode)
    if q:
        print(q)
        return True

q is a Match object. The output of print(q) tells me that every Match object gets its own little piece of memory. They'll add up, and I have no idea to what total.

I don't need the Match object anymore once I'm inside the if. Should I wipe it, like so?

    q = re.match(pat, scannedcode)
    if q:
        q = None
        return True

Or is there a cleaner way? Should I bother at all?

If I understand right (from this), garbage collection with gc.collect() won't happen until a process terminates, which in my case is at the end of the day when the user is done scanning. Until then, these objects won't even be regarded as garbage.


CPython uses reference counting (plus a cycle detector, which is not relevant here) to manage object lifetimes. Once an object has no remaining references, it is freed immediately.

In the case of your loop:

for pat in locpats:
    q = re.match(pat, scannedcode)

Each successive pat in locpats binds a new Match object to q. The previous Match object then has no remaining references and is freed immediately. The same thing happens to the last Match object when you return from your function.

This is all an implementation detail of CPython; other Python implementations handle garbage collection differently. Either way, don't optimize prematurely. Unless you can pinpoint a specific reason to intervene, leaving the garbage collector alone is likely to be the most performant solution.
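To see the rebinding behaviour described above, here is a minimal sketch. Match objects have no hook for observing their own destruction, so it uses a hypothetical stand-in class, Tracked, whose __del__ method records when the object is finalized. The class name, the labels, and the events list are all made up for illustration.

```python
class Tracked:
    """Hypothetical stand-in for a Match object that logs its own finalization."""

    def __init__(self, label, events):
        self.label = label
        self.events = events

    def __del__(self):
        # Runs when the object is finalized.
        self.events.append(f"freed {self.label}")


def scan(labels, events):
    for label in labels:
        # Rebinding q drops the only reference to the previous object.
        q = Tracked(label, events)
        events.append(f"bound {label}")
    return True  # q goes out of scope here, releasing the last object


events = []
scan(["first", "second"], events)
print(events)
```

On CPython, "freed first" should appear before "bound second", and "freed second" should appear as soon as scan returns. Other implementations may finalize the objects later.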


This is not a problem, since q is local, and therefore won't persist after you return.

If you want to make yourself feel better, you can try

if re.match(pat, scannedcode):
    return True

which will do what you're doing now without ever naming the match, but it won't change your memory footprint.

(I'm assuming that you don't care about the printed value at all and that it's just diagnostic.)
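As a self-contained sketch of the same idea, the whole loop can be written so that no Match object is ever bound to a name. The patterns and the function name is_depot_location are hypothetical, since the real locpats list isn't shown in the question.

```python
import re

# Hypothetical patterns; the real locpats list is not shown in the question.
locpats = [r"A\d{3}$", r"B\d{3}$", r"DEPOT-\d+$"]


def is_depot_location(scannedcode):
    # any() stops at the first pattern that matches and returns a plain bool;
    # each Match object is discarded as soon as its truth value has been tested.
    return any(re.match(pat, scannedcode) for pat in locpats)


print(is_depot_location("A123"))
print(is_depot_location("ZZZ"))
```

One small difference from the original loop: when nothing matches, this returns False rather than falling through. As with the snippet above, this is about readability only and shouldn't change memory use.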


If your print statement is showing that each match is getting its own piece of memory, then it looks like one of two things is happening:

1) As others have mentioned, you are not using CPython as your interpreter, and the interpreter you have chosen is doing something strange with garbage collection.

2) There is code you haven't shown us here that keeps a reference to the match object, so its reference count never reaches zero and it is never freed.

Is either of these the case?
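One way to check whether match objects are really piling up is to measure allocated memory before and after a large number of scans. The sketch below uses the standard tracemalloc module. The patterns, the scanned code "DEPOT-42", and the function names is_location and growth_after_scans are assumptions made up for the example.

```python
import re
import tracemalloc

# Hypothetical patterns standing in for the real locpats list.
locpats = [r"A\d{3}", r"B\d{3}", r"DEPOT-\d+"]


def is_location(scannedcode):
    for pat in locpats:
        if re.match(pat, scannedcode):
            return True
    return False


def growth_after_scans(n):
    """Return the change in traced memory (bytes) across n simulated scans."""
    is_location("DEPOT-42")  # warm up re's internal cache of compiled patterns
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(n):
        is_location("DEPOT-42")
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


print(growth_after_scans(10_000))
```

If every Match object were being kept alive, the growth would scale with the number of scans. A figure that stays small and roughly constant as n increases suggests nothing is holding on to them; steady growth would point to a stray reference elsewhere in the code.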
