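Memory leak when using strings < 128KB in Python?

Original title: Memory leak when opening files < 128KB in Python?

Original question

While running my Python script, I found what I think is a memory leak. Here is my script: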

import sys
import time


class MyObj(object):
    def __init__(self, filename):
        with open(filename) as f:
            self.att = f.read()


def myfunc(filename):
    mylist = [MyObj(filename) for x in xrange(100)]
    len(mylist)
    return []


def main():
    filename = sys.argv[1]
    myfunc(filename)
    time.sleep(3600)


if __name__ == '__main__':
    main()

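The main function calls myfunc(), which creates a list of 100 objects, each of which opens and reads a file. After myfunc() returns, I would expect the memory used by the 100-item list and by the file contents to be freed, since nothing references them any longer. However, when I check memory usage with the ps command, the Python process uses about 10,000 KB more memory than a Python process running the same script with lines 12 and 13 (the two lines in myfunc() that build mylist and call len(mylist)) commented out.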
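
One way to check this expectation at the Python level is to keep only weak references to the objects and see whether any of them survive the function call. The following is a minimal sketch, written for Python 3 (the scripts in this question target Python 2.7), with an in-memory string standing in for the file contents. The names make_objects and size_kb are hypothetical and were introduced for illustration.

```python
import weakref


class MyObj(object):
    def __init__(self, size_kb):
        # Stand-in for f.read(): a string of roughly size_kb kilobytes.
        self.att = ' ' * (size_kb * 1024)


def make_objects(size_kb, count=100):
    """Build the objects like myfunc(), but return only weak references."""
    mylist = [MyObj(size_kb) for _ in range(count)]
    # Weak references do not keep the objects alive.
    return [weakref.ref(obj) for obj in mylist]


refs = make_objects(64)
# In CPython, reference counting frees the objects as soon as mylist
# goes out of scope, so the weak references should all be dead here.
alive = sum(1 for r in refs if r() is not None)
print('objects still alive:', alive)
```

If no objects are alive while ps still reports a high RSS, the extra memory is being held below the Python object layer rather than by lingering references.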

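Strangely, the memory leak (if that is what it is) seems to occur only for files smaller than 128KB. I wrote a bash script that runs this script against files ranging in size from 1KB to 200KB, and the memory increase stops once the file size reaches 128KB. Here is the bash script: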

#!/bin/bash

echo "PID RSS S TTY TIME COMMAND" > output.txt

for i in `seq 1 200`;
do
    python debug_memory.py "data/stuff_${i}K.txt" &
    pid=$!
    sleep 0.1
    ps -e -O rss | grep $pid | grep -v grep >> output.txt
    kill $pid
done   

Here is the output of the bash script:

PID RSS S TTY TIME COMMAND
28471  5552 S pts/16   00:00:00 python debug_memory.py data/stuff_1K.txt
28477  5656 S pts/16   00:00:00 python debug_memory.py data/stuff_2K.txt
28483  5756 S pts/16   00:00:00 python debug_memory.py data/stuff_3K.txt
28488  5852 S pts/16   00:00:00 python debug_memory.py data/stuff_4K.txt
28494  5952 S pts/16   00:00:00 python debug_memory.py data/stuff_5K.txt
28499  6052 S pts/16   00:00:00 python debug_memory.py data/stuff_6K.txt
28505  6156 S pts/16   00:00:00 python debug_memory.py data/stuff_7K.txt
28511  6256 S pts/16   00:00:00 python debug_memory.py data/stuff_8K.txt
28516  6356 S pts/16   00:00:00 python debug_memory.py data/stuff_9K.txt
28522  6452 S pts/16   00:00:00 python debug_memory.py data/stuff_10K.txt
28527  6552 S pts/16   00:00:00 python debug_memory.py data/stuff_11K.txt
28533  6656 S pts/16   00:00:00 python debug_memory.py data/stuff_12K.txt
28539  6756 S pts/16   00:00:00 python debug_memory.py data/stuff_13K.txt
28544  6852 S pts/16   00:00:00 python debug_memory.py data/stuff_14K.txt
28550  6952 S pts/16   00:00:00 python debug_memory.py data/stuff_15K.txt
28555  7056 S pts/16   00:00:00 python debug_memory.py data/stuff_16K.txt
28561  7156 S pts/16   00:00:00 python debug_memory.py data/stuff_17K.txt
28567  7252 S pts/16   00:00:00 python debug_memory.py data/stuff_18K.txt
28572  7356 S pts/16   00:00:00 python debug_memory.py data/stuff_19K.txt
28578  7452 S pts/16   00:00:00 python debug_memory.py data/stuff_20K.txt
28584  7556 S pts/16   00:00:00 python debug_memory.py data/stuff_21K.txt
28589  7652 S pts/16   00:00:00 python debug_memory.py data/stuff_22K.txt
28595  7756 S pts/16   00:00:00 python debug_memory.py data/stuff_23K.txt
28600  7852 S pts/16   00:00:00 python debug_memory.py data/stuff_24K.txt
28606  7952 S pts/16   00:00:00 python debug_memory.py data/stuff_25K.txt
28612  8052 S pts/16   00:00:00 python debug_memory.py data/stuff_26K.txt
28617  8152 S pts/16   00:00:00 python debug_memory.py data/stuff_27K.txt
28623  8252 S pts/16   00:00:00 python debug_memory.py data/stuff_28K.txt
28629  8356 S pts/16   00:00:00 python debug_memory.py data/stuff_29K.txt
28634  8452 S pts/16   00:00:00 python debug_memory.py data/stuff_30K.txt
28640  8556 S pts/16   00:00:00 python debug_memory.py data/stuff_31K.txt
28645  8656 S pts/16   00:00:00 python debug_memory.py data/stuff_32K.txt
28651  8756 S pts/16   00:00:00 python debug_memory.py data/stuff_33K.txt
28657  8856 S pts/16   00:00:00 python debug_memory.py data/stuff_34K.txt
28662  8956 S pts/16   00:00:00 python debug_memory.py data/stuff_35K.txt
28668  9056 S pts/16   00:00:00 python debug_memory.py data/stuff_36K.txt
28674  9156 S pts/16   00:00:00 python debug_memory.py data/stuff_37K.txt
28679  9256 S pts/16   00:00:00 python debug_memory.py data/stuff_38K.txt
28685  9352 S pts/16   00:00:00 python debug_memory.py data/stuff_39K.txt
28691  9452 S pts/16   00:00:00 python debug_memory.py data/stuff_40K.txt
28696  9552 S pts/16   00:00:00 python debug_memory.py data/stuff_41K.txt
28702  9656 S pts/16   00:00:00 python debug_memory.py data/stuff_42K.txt
28707  9756 S pts/16   00:00:00 python debug_memory.py data/stuff_43K.txt
28713  9852 S pts/16   00:00:00 python debug_memory.py data/stuff_44K.txt
28719  9952 S pts/16   00:00:00 python debug_memory.py data/stuff_45K.txt
28724 10052 S pts/16   00:00:00 python debug_memory.py data/stuff_46K.txt
28730 10156 S pts/16   00:00:00 python debug_memory.py data/stuff_47K.txt
28739 10256 S pts/16   00:00:00 python debug_memory.py data/stuff_48K.txt
28746 10352 S pts/16   00:00:00 python debug_memory.py data/stuff_49K.txt
28752 10452 S pts/16   00:00:00 python debug_memory.py data/stuff_50K.txt
28757 10556 S pts/16   00:00:00 python debug_memory.py data/stuff_51K.txt
28763 10656 S pts/16   00:00:00 python debug_memory.py data/stuff_52K.txt
28769 10752 S pts/16   00:00:00 python debug_memory.py data/stuff_53K.txt
28774 10852 S pts/16   00:00:00 python debug_memory.py data/stuff_54K.txt
28780 10952 S pts/16   00:00:00 python debug_memory.py data/stuff_55K.txt
28786 11052 S pts/16   00:00:00 python debug_memory.py data/stuff_56K.txt
28791 11152 S pts/16   00:00:00 python debug_memory.py data/stuff_57K.txt
28797 11256 S pts/16   00:00:00 python debug_memory.py data/stuff_58K.txt
28802 11356 S pts/16   00:00:00 python debug_memory.py data/stuff_59K.txt
28808 11452 S pts/16   00:00:00 python debug_memory.py data/stuff_60K.txt
28814 11556 S pts/16   00:00:00 python debug_memory.py data/stuff_61K.txt
28819 11656 S pts/16   00:00:00 python debug_memory.py data/stuff_62K.txt
28825 11752 S pts/16   00:00:00 python debug_memory.py data/stuff_63K.txt
28831 11852 S pts/16   00:00:00 python debug_memory.py data/stuff_64K.txt
28836 11956 S pts/16   00:00:00 python debug_memory.py data/stuff_65K.txt
28842 12052 S pts/16   00:00:00 python debug_memory.py data/stuff_66K.txt
28847 12152 S pts/16   00:00:00 python debug_memory.py data/stuff_67K.txt
28853 12256 S pts/16   00:00:00 python debug_memory.py data/stuff_68K.txt
28859 12356 S pts/16   00:00:00 python debug_memory.py data/stuff_69K.txt
28864 12452 S pts/16   00:00:00 python debug_memory.py data/stuff_70K.txt
28871 12556 S pts/16   00:00:00 python debug_memory.py data/stuff_71K.txt
28877 12652 S pts/16   00:00:00 python debug_memory.py data/stuff_72K.txt
28883 12756 S pts/16   00:00:00 python debug_memory.py data/stuff_73K.txt
28889 12856 S pts/16   00:00:00 python debug_memory.py data/stuff_74K.txt
28894 12952 S pts/16   00:00:00 python debug_memory.py data/stuff_75K.txt
28900 13056 S pts/16   00:00:00 python debug_memory.py data/stuff_76K.txt
28906 13156 S pts/16   00:00:00 python debug_memory.py data/stuff_77K.txt
28911 13256 S pts/16   00:00:00 python debug_memory.py data/stuff_78K.txt
28917 13352 S pts/16   00:00:00 python debug_memory.py data/stuff_79K.txt
28922 13452 S pts/16   00:00:00 python debug_memory.py data/stuff_80K.txt
28928 13556 S pts/16   00:00:00 python debug_memory.py data/stuff_81K.txt
28934 13652 S pts/16   00:00:00 python debug_memory.py data/stuff_82K.txt
28939 13752 S pts/16   00:00:00 python debug_memory.py data/stuff_83K.txt
28945 13852 S pts/16   00:00:00 python debug_memory.py data/stuff_84K.txt
28951 13952 S pts/16   00:00:00 python debug_memory.py data/stuff_85K.txt
28956 14052 S pts/16   00:00:00 python debug_memory.py data/stuff_86K.txt
28962 14152 S pts/16   00:00:00 python debug_memory.py data/stuff_87K.txt
28967 14256 S pts/16   00:00:00 python debug_memory.py data/stuff_88K.txt
28973 14352 S pts/16   00:00:00 python debug_memory.py data/stuff_89K.txt
28979 14456 S pts/16   00:00:00 python debug_memory.py data/stuff_90K.txt
28984 14552 S pts/16   00:00:00 python debug_memory.py data/stuff_91K.txt
28990 14652 S pts/16   00:00:00 python debug_memory.py data/stuff_92K.txt
28996 14756 S pts/16   00:00:00 python debug_memory.py data/stuff_93K.txt
29001 14852 S pts/16   00:00:00 python debug_memory.py data/stuff_94K.txt
29007 14956 S pts/16   00:00:00 python debug_memory.py data/stuff_95K.txt
29012 15052 S pts/16   00:00:00 python debug_memory.py data/stuff_96K.txt
29018 15156 S pts/16   00:00:00 python debug_memory.py data/stuff_97K.txt
29024 15252 S pts/16   00:00:00 python debug_memory.py data/stuff_98K.txt
29029 15360 S pts/16   00:00:00 python debug_memory.py data/stuff_99K.txt
29035 15456 S pts/16   00:00:00 python debug_memory.py data/stuff_100K.txt
29040 15556 S pts/16   00:00:00 python debug_memory.py data/stuff_101K.txt
29046 15652 S pts/16   00:00:00 python debug_memory.py data/stuff_102K.txt
29052 15756 S pts/16   00:00:00 python debug_memory.py data/stuff_103K.txt
29057 15852 S pts/16   00:00:00 python debug_memory.py data/stuff_104K.txt
29063 15952 S pts/16   00:00:00 python debug_memory.py data/stuff_105K.txt
29069 16056 S pts/16   00:00:00 python debug_memory.py data/stuff_106K.txt
29074 16152 S pts/16   00:00:00 python debug_memory.py data/stuff_107K.txt
29080 16256 S pts/16   00:00:00 python debug_memory.py data/stuff_108K.txt
29085 16356 S pts/16   00:00:00 python debug_memory.py data/stuff_109K.txt
29091 16452 S pts/16   00:00:00 python debug_memory.py data/stuff_110K.txt
29097 16552 S pts/16   00:00:00 python debug_memory.py data/stuff_111K.txt
29102 16652 S pts/16   00:00:00 python debug_memory.py data/stuff_112K.txt
29108 16756 S pts/16   00:00:00 python debug_memory.py data/stuff_113K.txt
29113 16852 S pts/16   00:00:00 python debug_memory.py data/stuff_114K.txt
29119 16952 S pts/16   00:00:00 python debug_memory.py data/stuff_115K.txt
29125 17056 S pts/16   00:00:00 python debug_memory.py data/stuff_116K.txt
29130 17156 S pts/16   00:00:00 python debug_memory.py data/stuff_117K.txt
29136 17256 S pts/16   00:00:00 python debug_memory.py data/stuff_118K.txt
29141 17356 S pts/16   00:00:00 python debug_memory.py data/stuff_119K.txt
29147 17452 S pts/16   00:00:00 python debug_memory.py data/stuff_120K.txt
29153 17556 S pts/16   00:00:00 python debug_memory.py data/stuff_121K.txt
29158 17656 S pts/16   00:00:00 python debug_memory.py data/stuff_122K.txt
29164 17756 S pts/16   00:00:00 python debug_memory.py data/stuff_123K.txt
29170 17856 S pts/16   00:00:00 python debug_memory.py data/stuff_124K.txt
29175 17952 S pts/16   00:00:00 python debug_memory.py data/stuff_125K.txt
29181 18056 S pts/16   00:00:00 python debug_memory.py data/stuff_126K.txt
29186 18152 S pts/16   00:00:00 python debug_memory.py data/stuff_127K.txt
29192  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_128K.txt
29198  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_129K.txt
29203  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_130K.txt
29209  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_131K.txt
29215  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_132K.txt
29220  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_133K.txt
29226  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_134K.txt
29231  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_135K.txt
29237  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_136K.txt
29243  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_137K.txt
29248  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_138K.txt
29254  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_139K.txt
29260  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_140K.txt
29265  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_141K.txt
29271  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_142K.txt
29276  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_143K.txt
29282  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_144K.txt
29288  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_145K.txt
29293  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_146K.txt
29299  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_147K.txt
29305  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_148K.txt
29310  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_149K.txt
29316  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_150K.txt
29321  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_151K.txt
29327  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_152K.txt
29333  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_153K.txt
29338  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_154K.txt
29344  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_155K.txt
29349  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_156K.txt
29355  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_157K.txt
29361  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_158K.txt
29366  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_159K.txt
29372  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_160K.txt
29378  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_161K.txt
29383  5460 S pts/16   00:00:00 python debug_memory.py data/stuff_162K.txt
29389  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_163K.txt
29394  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_164K.txt
29400  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_165K.txt
29406  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_166K.txt
29411  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_167K.txt
29417  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_168K.txt
29423  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_169K.txt
29428  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_170K.txt
29434  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_171K.txt
29439  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_172K.txt
29445  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_173K.txt
29451  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_174K.txt
29456  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_175K.txt
29463  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_176K.txt
29483  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_177K.txt
29489  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_178K.txt
29496  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_179K.txt
29501  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_180K.txt
29507  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_181K.txt
29512  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_182K.txt
29518  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_183K.txt
29524  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_184K.txt
29529  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_185K.txt
29535  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_186K.txt
29541  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_187K.txt
29546  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_188K.txt
29552  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_189K.txt
29557  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_190K.txt
29563  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_191K.txt
29569  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_192K.txt
29574  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_193K.txt
29580  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_194K.txt
29586  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_195K.txt
29591  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_196K.txt
29597  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_197K.txt
29602  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_198K.txt
29608  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_199K.txt
29614  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_200K.txt
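
The point at which the RSS column collapses can also be located programmatically. The sketch below assumes the line layout shown above (PID, RSS, state, TTY, time, command, with the file name as the last field). The helper names parse_ps_line and find_drop and the min_drop_kb parameter are hypothetical, and the sample lines are copied from the listing.

```python
def parse_ps_line(line):
    """Return (rss_kb, size_kb) from one line of output.txt, or None."""
    fields = line.split()
    # Data lines have 8 fields; the header line has fewer.
    if len(fields) < 8 or not fields[1].isdigit():
        return None
    rss_kb = int(fields[1])
    name = fields[-1]  # e.g. data/stuff_128K.txt
    size_kb = int(name.rsplit('_', 1)[1].split('K')[0])
    return rss_kb, size_kb


def find_drop(lines, min_drop_kb=1024):
    """Return the first file size whose RSS falls by at least min_drop_kb."""
    previous = None
    for line in lines:
        parsed = parse_ps_line(line)
        if parsed is None:
            continue
        rss_kb, size_kb = parsed
        if previous is not None and previous - rss_kb >= min_drop_kb:
            return size_kb
        previous = rss_kb
    return None


sample = [
    'PID RSS S TTY TIME COMMAND',
    '29186 18152 S pts/16   00:00:00 python debug_memory.py data/stuff_127K.txt',
    '29192  5452 S pts/16   00:00:00 python debug_memory.py data/stuff_128K.txt',
    '29198  5456 S pts/16   00:00:00 python debug_memory.py data/stuff_129K.txt',
]
print(find_drop(sample))  # prints 128
```

To run it against the real data, pass open('output.txt') instead of the sample list.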

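Can anyone explain what is going on? Why do I see increased memory usage when using files < 128KB?

My full test environment is here: https://github.com/saltycrane/debugging-python-memory-usage/tree/50f73358c7a84a504333ce9c4071b0f3537bbc0f

I am running Python 2.7.3 on Ubuntu 12.04.

Update 1

This problem is not specific to reading files smaller than 128KB. I get the same results when I set the object attribute to a string of the same size as the data read from the file. Here is the updated code: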

import sys
import time


class MyObj(object):
    def __init__(self, size_kb):
        self.att = ' ' * int(size_kb) * 1024


def myfunc(size_kb):
    mylist = [MyObj(size_kb) for x in xrange(100)]
    len(mylist)
    return []


def main():
    size_kb = sys.argv[1]
    myfunc(size_kb)
    time.sleep(3600)


if __name__ == '__main__':
    main()

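Running this script gives similar results. The updated test environment is here: https://github.com/saltycrane/debugging-python-memory-usage/tree/59b7ff61134dfc11c4195e9201b2c1728ed4fcce

Update 2

I simplified my test script further: (1) I removed the class and simply create a list of strings, and (2) I removed myfunc() and use del to delete the mylist object: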

import sys
import time

def main():
    size_kb = sys.argv[1]

    mylist = []
    for x in xrange(100):
        mystr = ' ' * int(size_kb) * 1024
        mylist.append(mystr)

    del mylist

    time.sleep(3600)

if __name__ == '__main__':
    main()

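My simplified script also gives results similar to the original. However, if I do not create a separate string variable, I do not see the memory increase. Here is the script that does not show the increase: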

import sys
import time

def main():
    size_kb = sys.argv[1]

    mylist = []
    for x in xrange(100):
        mylist.append(' ' * int(size_kb) * 1024)

    del mylist

    time.sleep(3600)

if __name__ == '__main__':
    main()

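The updated test environment is here: https://github.com/saltycrane/debugging-python-memory-usage/tree/423ca6a50dccbe32572a9d0dea1068ddcb06663b

More questions:

  • Can others reproduce my results?
  • Is the memory increase reported by ps expected?

    Hints about what is going on

    I found some interesting information about "free lists" that looks like it may be related to this problem:

  • Why doesn't Python release the memory when I delete a large object?
  • How can I explicitly free memory in Python?
  • Python memory management - Theano v0.6rc3 documentation

    From the last link:

    To speed up memory allocation (and reuse), Python uses a number of lists for small objects. Each list contains objects of similar size.

    Indeed: if an item (of size x) is deallocated (freed for lack of references), its location is not returned to Python's global memory pool (and even less to the system), but is merely marked as free and added to the free list of items of size x.

    If the memory of small objects is never freed, then the inescapable conclusion is that, like goldfish, these small-object lists only keep growing, never shrinking, and that the memory footprint of your application is dominated by the largest number of small objects allocated at any given point.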
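
Whether these small-object lists apply to the strings in this test can be checked by comparing the string sizes with the cutoff of CPython's small-object allocator. This is a minimal sketch for Python 3. It assumes a default CPython build and the 512-byte cutoff (SMALL_REQUEST_THRESHOLD in Objects/obmalloc.c for CPython 3.3 and later, 256 bytes in 2.7). The constant is hard-coded as an assumption because it is not exposed to Python code.

```python
import sys

# Assumption: cutoff of CPython's small-object allocator (pymalloc).
# Requests larger than this are passed on to the C library's malloc().
SMALL_REQUEST_THRESHOLD = 512  # bytes; 256 in CPython 2.7

for size_kb in (1, 64, 127, 128, 200):
    s = ' ' * (size_kb * 1024)
    nbytes = sys.getsizeof(s)
    handled_by = 'pymalloc' if nbytes <= SMALL_REQUEST_THRESHOLD else 'malloc'
    print('%4d KB string: %7d bytes -> %s' % (size_kb, nbytes, handled_by))
```

Every string in the 1KB to 200KB test range is well above the cutoff, which suggests that the behavior of the C library's malloc(), rather than these free lists, decides what happens to their memory.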

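    Update 3

    It turns out the code in Update 2 was oversimplified: adding a del mystr line at the end of that script freed the memory. (See: https://github.com/saltycrane/debugging-python-memory-usage/blob/dd058e4774802cae7cbfca520fb835ea46b645e8/debug_memory_leaks.py)

    I updated the script to keep the problem sufficiently complex. The following code still shows the problem. The latest code/environment is here: https://github.com/saltycrane/debugging-python-memory-usage/tree/fc0c8ce9ba621cb86b6abb93adf1b297a7c0230b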

    import gc
    import sys
    import time
    
    
    def main():
        size_kb = sys.argv[1]
    
        mylist = []
        for x in xrange(100):
            mystr = ' ' * int(size_kb) * 1024
            mydict = {'mykey': mystr}
            mylist.append(mydict)
    
        del mystr
        del mydict
        del mylist
    
        gc.collect()
    
        time.sleep(3600)
    
    
    if __name__ == '__main__':
        main()
    

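    I also ran the script in a few other environments. An odd result came from running inside a clean virtualenv: in that case, the memory drop-off happens at 260KB instead of 128KB. See https://github.com/saltycrane/debugging-python-memory-usage/tree/52fbd5d57ff45affdcd70623ddb74fa1f1ffbbc2

    Environments:

  • Ubuntu 12.04 64-bit, system Python 2.7.3: original run
  • Ubuntu 12.04 64-bit, Python 3.3.0 compiled from source: similar results
  • Scientific Linux 6 64-bit, Python 2.6.6: similar results
  • Ubuntu 12.04 64-bit, Python 2.7.3 from a virtualenv: memory drop-off occurs at 260KB instead of 128KB

    More references: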

  • http://revista.python.org.ar/2/en/html/memory-fragmentation.html
  • http://www.evanjones.ca/python-memory.html
  • http://mail.python.org/pipermail/python-dev/2004-October/049480.html (note: this is from 2004)
  • http://mail.python.org/pipermail/python-dev/2006-March/061991.html
  • http://www.evanjones.ca/memoryallocator/
  • http://www.evanjones.ca/memory-allocator.pdf
  • http://hg.python.org/releasing/2.7.3/file/7bb96963d067/Objects/obmalloc.c

    After reading some of these, I saw a 256KB "arena size" mentioned. Maybe that is related?

    Update 4 (most effective solution)

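    schlenk found the reason the memory usage drops at 128KB. 128KB is the point at which the "memory allocation function" (malloc?) uses mmap instead of raising the program break with sbrk. Interestingly, the threshold can be changed through an environment variable. I ran a test with the MALLOC_MMAP_THRESHOLD_ environment variable set to different values, and the drop-off in memory usage matched the value each time. See the results: https://github.com/saltycrane/debugging-python-memory-usage/blob/97d93cd165a139a6b6f96720de63a92561dd2f05/output_debug_memory_leaks.py.txt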
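
A test like this can be driven from Python by launching the script with a modified environment. The following is a minimal sketch for Python 3, assuming a Linux system whose C library honors MALLOC_MMAP_THRESHOLD_. The helper names env_with_mmap_threshold and launch are hypothetical, and the script path passed to launch is whatever test script you want to run.

```python
import os
import subprocess
import sys


def env_with_mmap_threshold(threshold_bytes, base_env=None):
    """Copy of the environment with the malloc mmap threshold overridden."""
    env = dict(os.environ if base_env is None else base_env)
    # Note the trailing underscore in the variable name.
    env['MALLOC_MMAP_THRESHOLD_'] = str(threshold_bytes)
    return env


def launch(script, size_kb, threshold_bytes):
    """Start the test script in a child process with the given threshold."""
    return subprocess.Popen(
        [sys.executable, script, str(size_kb)],
        env=env_with_mmap_threshold(threshold_bytes),
    )


env = env_with_mmap_threshold(64 * 1024, base_env={})
print(env)  # prints {'MALLOC_MMAP_THRESHOLD_': '65536'}
```

For example, launch('debug_memory_leaks.py', 100, 64 * 1024) starts the script with 100KB strings and a 64KB threshold; the child's RSS can then be inspected with ps as in the bash script, and the child should be killed afterwards.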

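    I still wonder whether it is expected behavior for my script to leak memory for string values < 128KB.

    More links:

  • mallopt(3) - Linux manual page (from schlenk)
  • Python memory management and TCMalloc | Pushing the Web
  • Re: Setting x to None and del x does not release memory in Python 2.7.1 (HPUX 11.23, ia64) « ActiveState List Archives
  • Issue 3526: Custom malloc implementation on SunOS and AIX - Python tracker
  • Make the mmap/brk threshold in malloc dynamic to improve performance

    Note: according to the last two links, using mmap instead of sbrk comes with a performance (speed) hit.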


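    You might simply be hitting the default behavior of the Linux memory allocator.

    Basically, Linux has two allocation strategies: sbrk() for small blocks of memory and mmap() for larger blocks. Memory blocks allocated with sbrk() cannot easily be returned to the system, while mmap()-based ones can (just unmap the pages).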
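
The effect can be observed from inside a single process by reading its resident set size before allocating, while holding, and after freeing a batch of blocks. This is a rough sketch for Python 3 on Linux (it reads /proc/self/statm). The helper names rss_kb and measure are hypothetical, and the numbers it prints depend on the C library, its configuration, and the Python version, so no particular output is implied.

```python
import os


def rss_kb():
    """Resident set size of this process in KB (Linux only, via /proc)."""
    with open('/proc/self/statm') as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf('SC_PAGE_SIZE') // 1024


def measure(size_kb, count=100):
    """Return RSS before allocating, while holding, and after freeing."""
    before = rss_kb()
    blocks = [b' ' * (size_kb * 1024) for _ in range(count)]
    holding = rss_kb()
    del blocks
    after = rss_kb()
    return before, holding, after


if os.path.exists('/proc/self/statm'):
    for size_kb in (64, 256):
        print(size_kb, 'KB blocks:', measure(size_kb))
```

Whether the last number falls back to the first depends on whether the freed blocks came from mmap() or from the sbrk()-grown heap, and in the latter case on whether anything still in use sits above them.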

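    So if you allocate memory blocks larger than the value at which the malloc() allocator in your libc decides to switch from sbrk() to mmap(), you see this effect. See the mallopt() call, especially M_MMAP_THRESHOLD (http://man7.org/linux/man-pages/man3/mallopt.3.html).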
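
The same threshold can be adjusted from within a running Python process by calling mallopt() through ctypes. This is a hedged sketch that assumes glibc on Linux, where M_MMAP_THRESHOLD has the value -3 in malloc.h. The wrapper name set_mmap_threshold is hypothetical, and on C libraries without mallopt() it simply returns None.

```python
import ctypes

# Assumption: value of M_MMAP_THRESHOLD in glibc's <malloc.h>.
M_MMAP_THRESHOLD = -3


def set_mmap_threshold(threshold_bytes):
    """Call mallopt(M_MMAP_THRESHOLD, threshold_bytes).

    Returns True on success, False on failure, and None when the
    C library loaded into this process does not provide mallopt().
    """
    try:
        libc = ctypes.CDLL(None)  # symbols already loaded into this process
    except (OSError, TypeError):
        return None
    mallopt = getattr(libc, 'mallopt', None)
    if mallopt is None:
        return None
    mallopt.argtypes = [ctypes.c_int, ctypes.c_int]
    mallopt.restype = ctypes.c_int
    # mallopt() returns 1 on success and 0 on error.
    return mallopt(M_MMAP_THRESHOLD, threshold_bytes) == 1


print(set_mmap_threshold(128 * 1024))
```

The call only affects allocations made after it, so it belongs at the very start of a test script.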

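    Update: To answer your additional question: yes, this kind of memory leak is to be expected if the memory allocator works like the one in libc on Linux. If you were using the Windows Low Fragmentation Heap instead, it would probably not leak; the same goes for AIX, depending on which malloc is configured. Maybe one of the other allocators (tcmalloc, etc.) also avoids such issues. sbrk() is very fast, but has problems with memory fragmentation. CPython cannot do much about it, because it has no compacting garbage collector, just simple reference counting.

    Python offers a few ways to reduce buffer allocations; see for example this blog post: http://eli.thegreenplace.net/2011/11/28/less-copies-in-python-with-the-buffer-protocol-and-memoryviews/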


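    I would look into garbage collection. It may be that the larger files trigger garbage collection more often, while the small files are being freed but collectively stay under some threshold. Specifically, call gc.collect() and then call gc.get_referrers() on the object, which will hopefully reveal what is keeping an instance alive. See the Python documentation here: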

    http://docs.python.org/2/library/gc.html?highlight=gc#gc.get_referrers
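
A minimal sketch of that suggestion, written for Python 3: it forces a collection and then asks the gc module which containers still refer to an object. The helper name find_referrers is hypothetical; the class and dict key mirror the test scripts in the question.

```python
import gc


class MyObj(object):
    def __init__(self, size_kb):
        self.att = ' ' * (size_kb * 1024)


def find_referrers(obj):
    """Run a collection, then list the objects that still refer to obj."""
    gc.collect()
    return gc.get_referrers(obj)


obj = MyObj(1)
holder = {'mykey': obj}
referrers = find_referrers(obj)
held_by_dict = any(r is holder for r in referrers)
print('referrers:', [type(r).__name__ for r in referrers])
print('held by the dict:', held_by_dict)
```

The list may also contain the namespace dict in which obj is defined. gc.get_referrers() only reports containers tracked by the garbage collector.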

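    Update:

    This issue involves garbage collection, namespaces and reference counting. The bash script you posted gives a rather narrow view of the garbage collector's behavior. Try a larger range and you will see patterns in how much memory particular ranges take. For example, change the bash for loop to cover a larger range, like: seq 0 16 2056

    If you del mystr, you notice that memory usage goes down, because you have removed every reference to it. A similar result will probably occur if you confine the mystr variable to its own function, like this: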

    def loopy():
        mylist = []
        for x in xrange(100):
            mystr = ' ' * int(size_kb) * 1024
            mydict = {x: mystr}
            mylist.append(mydict)
        return mylist
    

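    I think you can get more useful information from a memory profiler than from a bash script. Here are a couple of examples using Pympler. This first version is similar to the code in Update 3: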

    import gc
    import sys
    import time
    from pympler import tracker
    
    tr = tracker.SummaryTracker()
    print 'begin:'
    tr.print_diff()
    
    size_kb = sys.argv[1]
    
    mylist = []
    mydict = {}
    
    print 'empty list & dict:'
    tr.print_diff()
    
    for x in xrange(100):
        mystr = ' ' * int(size_kb) * 1024
        mydict = {x: mystr}
        mylist.append(mydict)
    
    print 'after for loop:'
    tr.print_diff()
    
    del mystr
    del mydict
    del mylist
    
    print 'after deleting stuff:'
    tr.print_diff()
    
    collected = gc.collect()
    print 'after garbage collection (collected: %d):' % collected
    tr.print_diff()
    
    time.sleep(2)
    print 'took a short nap after all that work:'
    tr.print_diff()
    
    mylist = []
    print 'create an empty list for some reason:'
    tr.print_diff()
    

    Output:

    $ python mem_test.py 256
    begin:
                      types |   # objects |    total size
    ======================= | =========== | =============
                       list |         957 |      97.44 KB
                        str |         951 |      53.65 KB
                        int |         118 |       2.77 KB
         wrapper_descriptor |           8 |     640     B
                    weakref |           3 |     264     B
          member_descriptor |           2 |     144     B
          getset_descriptor |           2 |     144     B
      function (store_info) |           1 |     120     B
                       cell |           2 |     112     B
             instancemethod |          -1 |     -80     B
           _sre.SRE_Pattern |          -2 |    -176     B
                      tuple |          -1 |    -216     B
                       dict |           2 |   -1744     B
    empty list & dict:
      types |   # objects |   total size
    ======= | =========== | ============
       list |           2 |    168     B
        str |           2 |     97     B
        int |           1 |     24     B
    after for loop:
      types |   # objects |   total size
    ======= | =========== | ============
        str |           1 |    256.04 KB
       list |           0 |    848     B
    after deleting stuff:
      types |   # objects |      total size
    ======= | =========== | ===============
       list |          -1 |      -920     B
        str |          -1 |   -262181     B
    after garbage collection (collected: 0):
      types |   # objects |   total size
    ======= | =========== | ============
    took a short nap after all that work:
      types |   # objects |   total size
    ======= | =========== | ============
    create an empty list for some reason:
      types |   # objects |   total size
    ======= | =========== | ============
       list |           1 |     72     B
    

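    Notice that after the for loop, the total size for the str class is 256 KB, essentially the same as the argument I passed in. After the reference to mystr is explicitly removed with del mystr, the memory is freed. By then the garbage has already been picked up, so there is no further reduction after gc.collect().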
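
The lingering loop variable can be made visible without a profiler. The following Python 3 sketch uses a small class as a weak-referenceable stand-in for each large string (plain str objects cannot be weakly referenced) and counts how many instances are still alive at each step. Block, live and myblock are hypothetical names introduced for illustration.

```python
import weakref


class Block(object):
    """Weak-referenceable stand-in for one large string."""
    def __init__(self, size_kb):
        self.data = ' ' * (size_kb * 1024)


live = weakref.WeakSet()  # tracks which Block objects are still alive

mylist = []
for _ in range(100):
    myblock = Block(256)
    live.add(myblock)
    mylist.append(myblock)

del mylist
# The loop variable still refers to the last Block created.
alive_after_del_list = len(live)

del myblock
alive_after_del_var = len(live)

print(alive_after_del_list, alive_after_del_var)  # prints 1 0 on CPython
```

One object out of a hundred survives del mylist, and it is the most recently allocated one.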

    The next version uses a function to create a separate namespace for the string.

    import gc
    import sys
    import time
    from pympler import tracker
    
    def loopy():
        mylist = []
        for x in xrange(100):
            mystr = ' ' * int(size_kb) * 1024
            mydict = {x: mystr}
            mylist.append(mydict)
        return mylist
    
    
    tr = tracker.SummaryTracker()
    print 'begin:'
    tr.print_diff()
    
    size_kb = sys.argv[1]
    
    mylist = loopy()
    
    print 'after for loop:'
    tr.print_diff()
    
    del mylist
    
    print 'after deleting stuff:'
    tr.print_diff()
    
    collected = gc.collect()
    print 'after garbage collection (collected: %d):' % collected
    tr.print_diff()
    
    time.sleep(2)
    print 'took a short nap after all that work:'
    tr.print_diff()
    
    mylist = []
    print 'create an empty list for some reason:'
    tr.print_diff()
    

    And finally, the output from this version:

    $ python mem_test_2.py 256
    begin:
                      types |   # objects |    total size
    ======================= | =========== | =============
                       list |         958 |      97.53 KB
                        str |         952 |      53.70 KB
                        int |         118 |       2.77 KB
         wrapper_descriptor |           8 |     640     B
                    weakref |           3 |     264     B
          member_descriptor |           2 |     144     B
          getset_descriptor |           2 |     144     B
      function (store_info) |           1 |     120     B
                       cell |           2 |     112     B
             instancemethod |          -1 |     -80     B
           _sre.SRE_Pattern |          -2 |    -176     B
                      tuple |          -1 |    -216     B
                       dict |           2 |   -1744     B
    after for loop:
      types |   # objects |   total size
    ======= | =========== | ============
       list |           2 |   1016     B
        str |           2 |     97     B
        int |           1 |     24     B
    after deleting stuff:
      types |   # objects |   total size
    ======= | =========== | ============
       list |          -1 |   -920     B
    after garbage collection (collected: 0):
      types |   # objects |   total size
    ======= | =========== | ============
    took a short nap after all that work:
      types |   # objects |   total size
    ======= | =========== | ============
    create an empty list for some reason:
      types |   # objects |   total size
    ======= | =========== | ============
       list |           1 |     72     B
    

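    Now we do not have to clean up the str, and I think this example shows why using functions is a good idea. Writing code as one big chunk in a single namespace really does prevent the garbage collector from doing its job. It will not come into your house and start assuming things are trash :) It has to know that things are safe to collect.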
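
A similar comparison can be made without Pympler by using tracemalloc from the standard library (Python 3.4 and later). This is a swapped-in tool, not the one used in this answer, and it reports memory allocated through Python's allocators rather than the process RSS shown by ps. A minimal sketch, with loopy taking size_kb as a parameter instead of reading a global:

```python
import tracemalloc


def loopy(size_kb):
    mylist = []
    for x in range(100):
        mystr = ' ' * (size_kb * 1024)
        mydict = {x: mystr}
        mylist.append(mydict)
    return mylist


tracemalloc.start()

mylist = loopy(64)
held, _ = tracemalloc.get_traced_memory()  # bytes currently traced

del mylist
released, peak = tracemalloc.get_traced_memory()

tracemalloc.stop()

print('while holding the list: %d KB' % (held // 1024))
print('after del mylist:       %d KB' % (released // 1024))
print('peak:                   %d KB' % (peak // 1024))
```

Because mystr and mydict are local to loopy(), deleting mylist is enough to release all of the strings at the Python level.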

    By the way, the Evan Jones links are very interesting.
