Tools for Troubleshooting Memory Usage in Python Applications

I recently ran into a memory leak in a Python application, so here are some notes on the tools I used to track it down.

tracemalloc

tracemalloc has been part of the standard library since Python 3.4 and traces the memory allocations made by Python.

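import tracemalloc

# Start tracing memory allocations
tracemalloc.start()

def allocate_memory():
    # Create a few data structures that take up memory
    a = [i for i in range(100000)]
    b = {str(i): i for i in range(100000)}
    # Note: c is a lazy generator, so it holds almost no memory until iterated
    c = (i for i in range(100000))
    return a, b, c

# Allocate the memory
a, b, c = allocate_memory()

# Take a snapshot of the current allocations
snapshot = tracemalloc.take_snapshot()

# Group the allocations by source line, largest first
top_stats = snapshot.statistics('lineno')

print("Top 10 lines with the most memory usage:")
for stat in top_stats[:10]:
    print(stat)

# Dump the raw Statistic objects as well
print(top_stats)

# Stop tracing
tracemalloc.stop()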

Output:

Top 10 lines with the most memory usage:
/Users/taoxi/Desktop/动手1.py:201: size=11.1 MiB, count=199744, average=58 B
/Users/taoxi/Desktop/动手1.py:200: size=3899 KiB, count=99744, average=40 B
/Users/taoxi/Desktop/动手1.py:202: size=392 B, count=3, average=131 B
/Users/taoxi/Desktop/动手1.py:198: size=232 B, count=2, average=116 B
/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/tracemalloc.py:551: size=72 B, count=1, average=72 B
[<Statistic traceback=<Traceback (<Frame filename='/Users/taoxi/Desktop/动手1.py' lineno=201>,)> size=11625466 count=199744>, <Statistic traceback=<Traceback (<Frame filename='/Users/taoxi/Desktop/动手1.py' lineno=200>,)> size=3992704 count=99744>, <Statistic traceback=<Traceback (<Frame filename='/Users/taoxi/Desktop/动手1.py' lineno=202>,)> size=392 count=3>, <Statistic traceback=<Traceback (<Frame filename='/Users/taoxi/Desktop/动手1.py' lineno=198>,)> size=232 count=2>, <Statistic traceback=<Traceback (<Frame filename='/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/tracemalloc.py' lineno=551>,)> size=72 count=1>]

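Where:
● size is the total size of the memory blocks allocated by that line that are still alive when the snapshot is taken
● count is the number of those memory blocks
● average is the average size of one block (size / count)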
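
A single snapshot only shows what is allocated at that moment. When hunting a leak it is often more telling to compare two snapshots taken some time apart. The following is a minimal sketch of that idea; leaky_step and cache are hypothetical names made up for illustration and simply stand in for whatever code keeps growing in a real application.

```python
import tracemalloc


def leaky_step(cache):
    # Hypothetical "leak": every call stores data that is never released.
    cache.append([0] * 10_000)


tracemalloc.start()
cache = []

before = tracemalloc.take_snapshot()
for _ in range(50):
    leaky_step(cache)
after = tracemalloc.take_snapshot()

# Per-line differences between the two snapshots, biggest change first.
diff_stats = after.compare_to(before, "lineno")
for stat in diff_stats[:5]:
    print(stat)

tracemalloc.stop()
```

Lines whose size keeps growing from one comparison to the next are the first places worth looking at.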

pympler

pympler is a third-party library on PyPI and has to be installed separately: pip install pympler.

Usage:

from pympler.tracker import SummaryTracker

tracker = SummaryTracker()

# Create a few data structures that take up memory
a = [i for i in range(100000)]
b = {str(i): i for i in range(100000)}
c = (i for i in range(100000))

tracker.print_diff()

Output:

                  types |   # objects |     total size
======================= | =========== | ==============
                   dict |           5 |        5.00 MB
                    int |      101749 |        2.72 MB
                   list |        9308 |        1.54 MB
                    str |        9307 |      665.85 KB
      method_descriptor |          13 |          936 B
                weakref |           2 |          144 B
  function (store_info) |           1 |          136 B
                   code |          -2 |           86 B
     wrapper_descriptor |           1 |           72 B
      member_descriptor |           1 |           64 B
                 method |           1 |           64 B
          list_iterator |          -2 |          -96 B
              generator |          -1 |         -112 B
                   cell |        -156 |        -6240 B
                  tuple |        -142 |       -10288 B

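As you can see, the diff lists, for each type, how the number of objects and their total size have changed since the tracker was created, which helps narrow down what is growing.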
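
To make it clearer what a per-type summary like this measures, here is a rough standard-library sketch of the same idea. It is not how pympler is implemented: summarize_by_type is a hypothetical helper, it only sees objects tracked by the garbage collector (so plain int and str objects do not show up), and it uses shallow sizes from sys.getsizeof.

```python
import gc
import sys
from collections import Counter


def summarize_by_type():
    """Return {type name: (object count, total shallow size in bytes)}."""
    gc.collect()  # drop unreachable cycles first so they do not skew the counts
    counts, sizes = Counter(), Counter()
    for obj in gc.get_objects():  # only objects tracked by the garbage collector
        name = type(obj).__name__
        counts[name] += 1
        sizes[name] += sys.getsizeof(obj, 0)  # 0 if the size is unavailable
    return {name: (counts[name], sizes[name]) for name in counts}


before = summarize_by_type()
data = [[i] for i in range(10_000)]  # 10,000 small lists plus the outer list
after = summarize_by_type()

list_count_diff = after["list"][0] - before["list"][0]
list_size_diff = after["list"][1] - before["list"][1]
print(f"list: {list_count_diff:+d} objects, {list_size_diff:+d} bytes")
```

For real investigations the pympler tracker is the better choice; the sketch is only meant to show where numbers of this kind come from.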

memory_profiler

memory_profiler is a third-party library on PyPI and has to be installed separately: pip install memory_profiler.

Usage:

from memory_profiler import profile

@profile
def fake_func(n):
    lst = [i for i in range(n)]
    return lst

if __name__ == "__main__":
    fake_func(1000000)

Output:

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     3     76.4 MiB     76.4 MiB           1   @profile
     4                                         def fake_func(n):
     5    115.0 MiB     38.5 MiB     1000003       lst = [i for i in range(n)]
     6    115.0 MiB      0.0 MiB           1       return lst

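This one feels a bit slicker: it reports memory usage line by line inside the decorated function, so it pinpoints the culprit more precisely.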
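
If installing a third-party package is not an option, a coarser function-level measurement can be put together with tracemalloc alone. The sketch below is not memory_profiler and is not a line-by-line profile: report_peak is a hypothetical decorator that prints the current and peak size of Python allocations traced during one call. It assumes tracing is not already running elsewhere, because it starts and stops tracemalloc itself.

```python
import functools
import tracemalloc


def report_peak(func):
    """Hypothetical decorator: print the traced memory of a single call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Assumption: nothing else is using tracemalloc at the same time.
        tracemalloc.start()
        try:
            return func(*args, **kwargs)
        finally:
            current, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            print(f"{func.__name__}: current={current / 2**20:.1f} MiB, "
                  f"peak={peak / 2**20:.1f} MiB")
    return wrapper


@report_peak
def fake_func(n):
    return [i for i in range(n)]


result = fake_func(1_000_000)
```

The numbers will not match memory_profiler's, since that tool reports the memory of the whole process while tracemalloc only counts allocations made through Python's allocators.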
