
Strange memory growth with ThreadPoolExecutor

  Multicom · 2021-06-18 02:20:37 +08:00 · 1674 views

    After seeing the thread "It's 2021: has the requests memory leak been fixed? If not, how do you fix it?", I ran a test myself.

    from concurrent.futures import ThreadPoolExecutor, wait, ALL_COMPLETED
    import requests
    from memory_profiler import profile
    
    s = requests.Session()
    
    def get(i):
        s.get('URL')
    
    @profile
    def test():
        executor = ThreadPoolExecutor(max_workers = 100)
        task = [executor.submit(get, (i)) for i in range(5000)]
        wait(task, return_when = ALL_COMPLETED)
        s.close()
    
    if __name__ == '__main__':
        test()
    

    5000 tasks in a single batch: 116.3 MiB

    Line #    Mem usage    Increment  Occurences   Line Contents
    ============================================================
        10     25.6 MiB     25.6 MiB           1   @profile
        11                                         def test():
        12     25.6 MiB      0.0 MiB           1       executor = ThreadPoolExecutor(max_workers = 100)
        13    115.9 MiB     90.3 MiB        5003       task = [executor.submit(get, (i)) for i in range(5000)]
        14    116.3 MiB      0.3 MiB           1       wait(task, return_when = ALL_COMPLETED)
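
The test above keeps all 5000 futures in one list and never shuts the executor down. A minimal sketch of a more bounded pattern follows, assuming the work can be split into batches. `fetch` and `run_in_batches` are hypothetical names, and `fetch` is a stand-in for the HTTP call in `get()`. This only limits how many futures are alive at once and joins the worker threads on exit; it is not claimed to explain or remove the growth reported here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import islice


def fetch(i):
    # Hypothetical stand-in for the HTTP request made in get()
    return i * 2


def run_in_batches(func, items, max_workers=100, batch_size=500):
    """Submit items in fixed-size batches so only one batch of futures is alive at a time."""
    results = []
    iterator = iter(items)
    # Leaving the with-block calls executor.shutdown(wait=True), joining the worker threads
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        while True:
            batch = list(islice(iterator, batch_size))
            if not batch:
                break
            futures = [executor.submit(func, item) for item in batch]
            for future in as_completed(futures):
                results.append(future.result())
    return results


if __name__ == "__main__":
    out = run_in_batches(fetch, range(5000))
    print(len(out))
```

Results arrive in completion order, so sort them or carry the index along if order matters.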
    

    500 tasks × 10 rounds: 141.3 MiB, growing gradually

    Line #    Mem usage    Increment  Occurences   Line Contents
    ============================================================
        10     25.7 MiB     25.7 MiB           1   @profile
        11                                         def test():
        12     25.7 MiB      0.0 MiB           1       executor = ThreadPoolExecutor(max_workers = 100)
        13     94.8 MiB     69.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        14    106.2 MiB     11.4 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        15    109.6 MiB      3.4 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        16    111.7 MiB      2.1 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        17    114.3 MiB      2.6 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        18    115.8 MiB      1.5 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        19    120.0 MiB      4.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        20    121.0 MiB      1.0 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        21    124.1 MiB      3.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        22    124.6 MiB      0.5 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        23    126.7 MiB      2.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        24    127.4 MiB      0.8 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        25    130.5 MiB      3.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        26    131.8 MiB      1.3 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        27    135.2 MiB      3.4 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        28    136.7 MiB      1.5 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        29    137.7 MiB      1.0 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        30    138.0 MiB      0.3 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        31    139.8 MiB      1.8 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
        32    141.3 MiB      1.5 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
        33    141.3 MiB      0.0 MiB           1       s.close()
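
memory_profiler reports process-level memory, which can keep rising even after Python objects are freed, because the allocator does not always return memory to the operating system. One way to check whether live Python objects are actually accumulating is to read `tracemalloc` after each round. The sketch below assumes a hypothetical `work` function standing in for `get(i)`; `measure_rounds` is likewise an illustrative name, not something from the original post.

```python
import tracemalloc
from concurrent.futures import ThreadPoolExecutor, wait


def work(i):
    # Hypothetical stand-in for get(i); allocates and discards a temporary buffer
    return len(bytes(1024))


def measure_rounds(rounds=3, tasks=50, max_workers=8):
    """Return the traced Python heap size (in bytes) after each round of tasks."""
    sizes = []
    tracemalloc.start()
    try:
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            for _ in range(rounds):
                futures = [executor.submit(work, i) for i in range(tasks)]
                wait(futures)
                del futures  # drop references so finished futures can be collected
                current, _peak = tracemalloc.get_traced_memory()
                sizes.append(current)
    finally:
        tracemalloc.stop()
    return sizes


if __name__ == "__main__":
    for n, size in enumerate(measure_rounds(), start=1):
        print(f"round {n}: {size / 1024:.1f} KiB traced")
```

If the traced size stays roughly flat across rounds while the process figure keeps climbing, the growth is more likely allocator or native-library behavior than a leak of Python objects.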
    
    0x0208v0 · #1 · 2021-06-18 07:59:48 +08:00
    A requests Session is not thread-safe, so sharing one across threads like this doesn't look right either.
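
One common workaround for sharing a non-thread-safe object is to give each worker thread its own instance via `threading.local`. The sketch below is an assumption-laden illustration: `get_thread_session` and `FakeSession` are hypothetical names, and `FakeSession` stands in for `requests.Session` so the example runs without network access. Whether this changes the memory growth is not established anywhere in this thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()


def get_thread_session(factory):
    """Return the calling thread's own session, creating it with factory() on first use."""
    session = getattr(_local, "session", None)
    if session is None:
        session = factory()
        _local.session = session
    return session


class FakeSession:
    # Hypothetical stand-in; in real use the factory would be requests.Session
    def get(self, url):
        return f"{threading.get_ident()}:{url}"


def get(i):
    session = get_thread_session(FakeSession)
    return id(session), session.get("URL")


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(get, range(20)))
    # At most one session is created per worker thread
    print(len({sid for sid, _ in results}))
```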
    ospider · #2 · 2021-06-18 09:21:40 +08:00
    requests is a memory-leaking trap; I'd suggest switching to httpx as soon as you can.
    warcraft1236 · #3 · 2021-06-18 11:22:33 +08:00
    @ospider Why would requests leak memory?
    Multicom (OP) · #4 · 2021-06-18 20:54:33 +08:00
    @ospider After replacing requests.Session() with httpx.Client(), memory usage is lower, but it still keeps growing:
    ```
    Line #    Mem usage    Increment  Occurences   Line Contents
    ============================================================
        10     25.2 MiB     25.2 MiB           1   @profile
        11                                         def test():
        12     25.2 MiB      0.0 MiB           1       executor = ThreadPoolExecutor(max_workers = 100)
        13     28.2 MiB      3.1 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        14     28.5 MiB      0.3 MiB           1       wait(task)
        15     28.8 MiB      0.3 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        16     28.8 MiB      0.0 MiB           1       wait(task)
        17     29.0 MiB      0.3 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        18     31.3 MiB      2.3 MiB           1       wait(task)
        19     31.6 MiB      0.3 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        20     35.9 MiB      4.3 MiB           1       wait(task)
        21     35.9 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        22     39.0 MiB      3.0 MiB           1       wait(task)
        23     39.2 MiB      0.3 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        24     41.0 MiB      1.8 MiB           1       wait(task)
        25     41.0 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        26     43.8 MiB      2.8 MiB           1       wait(task)
        27     43.8 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        28     45.8 MiB      2.0 MiB           1       wait(task)
        29     45.8 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        30     48.1 MiB      2.3 MiB           1       wait(task)
        31     48.1 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
        32     49.4 MiB      1.3 MiB           1       wait(task)
        33     49.4 MiB      0.0 MiB           1       s.close()
    ```
    Multicom (OP) · #5 · 2021-06-18 20:56:34 +08:00
    @v2exblog I've only just started learning, so I didn't know much about this. I see now that this usage is wrong; lesson learned.