Python getting meaningful results from cProfile

予麋鹿 2020-12-08 00:34

I have a Python script in a file which takes just over 30 seconds to run. I am trying to profile it as I would like to cut down this time dramatically.

I am tryin

1 answer
  • 2020-12-08 00:58

    As I mentioned in a comment, when you can't get cProfile to work externally, you can often use it internally instead. It's not that hard.

    For example, when I run with -m cProfile in my Python 2.7, I get effectively the same results you did. But when I manually instrument your example program:

    import fileinput
    import cProfile
    
    pr = cProfile.Profile()
    pr.enable()
    for line in fileinput.input():
        for i in range(10):
            y = int(line.strip()) + int(line.strip())
    pr.disable()
    pr.print_stats(sort='time')
    

    … here's what I get:

             22002533 function calls (22001691 primitive calls) in 3.352 seconds
    
       Ordered by: internal time
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     20000000    2.326    0.000    2.326    0.000 {method 'strip' of 'str' objects}
      1000001    0.646    0.000    0.700    0.000 fileinput.py:243(next)
      1000000    0.325    0.000    0.325    0.000 {range}
          842    0.042    0.000    0.042    0.000 {method 'readlines' of 'file' objects}
     1684/842    0.013    0.000    0.055    0.000 fileinput.py:292(readline)
            1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
            1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
            1    0.000    0.000    0.000    0.000 {isinstance}
            1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
            1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    

    That's a lot more useful: It tells you what you probably already expected, that more than half your time is spent calling str.strip().
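
    When the full table is long, the same Profile object can be handed to pstats so that only the most expensive rows are printed. The sketch below is written for Python 3 (the run above was on 2.7) and, so that it is self-contained, swaps fileinput for a hypothetical in-memory list named lines. The cut-off of 5 rows is arbitrary.

```python
import cProfile
import io
import pstats

# Hypothetical in-memory input standing in for the lines read by fileinput.
lines = ["%d\n" % n for n in range(20000)]

pr = cProfile.Profile()
pr.enable()
for line in lines:
    for i in range(10):
        y = int(line.strip()) + int(line.strip())
pr.disable()

# Sort by internal time and print only the 5 most expensive entries.
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf)
stats.sort_stats('tottime').print_stats(5)
top_report = buf.getvalue()
print(top_report)
```

    Writing the report to a StringIO instead of stdout is optional; it just makes the text easy to save or search afterwards.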


    Also, note that if you can't edit the file containing code you wish to profile (mwe.py), you can always do this:

    import cProfile
    pr = cProfile.Profile()
    pr.enable()
    import mwe
    pr.disable()
    pr.print_stats(sort='time')
    

    Even that doesn't always work. If your program calls exit(), for example, you'll have to use a try:/finally: wrapper and/or an atexit handler. And if it calls os._exit(), or segfaults, you're probably completely hosed. But that isn't very common.
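
    A try:/finally: wrapper of the kind mentioned above might look roughly like this. It is only a sketch for Python 3: work() is a hypothetical stand-in for the script being profiled, and profile_despite_exit() is a helper name invented here, not part of cProfile.

```python
import cProfile
import io
import pstats
import sys


def work():
    # Hypothetical stand-in for the real script: does some work, then exits.
    total = 0
    for i in range(100000):
        total += int(str(i).strip())
    sys.exit(0)


def profile_despite_exit(func):
    """Run func under cProfile and return the report as a string,
    even if func raises SystemExit."""
    pr = cProfile.Profile()
    out = io.StringIO()
    pr.enable()
    try:
        func()
    except SystemExit:
        # Swallowed only so this sketch can return the report;
        # a real script would probably let the exit propagate.
        pass
    finally:
        # Runs whether func returned normally or tried to exit.
        pr.disable()
        pstats.Stats(pr, stream=out).sort_stats('time').print_stats()
    return out.getvalue()


report = profile_despite_exit(work)
print(report)
```

    This works because exit() and sys.exit() raise SystemExit, which unwinds through finally: like any other exception. os._exit() raises nothing, so nothing here would catch it.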


    However, something I discovered later: If you move all code out of the global scope, -m cProfile seems to work, at least for this case. For example:

    import fileinput
    
    def f():
        for line in fileinput.input():
            for i in range(10):
                y = int(line.strip()) + int(line.strip())
    f()
    

    Now the output from -m cProfile includes, among other things:

    2000000 4.819 0.000 4.819 0.000 :0(strip)
    100001 0.288 0.000 0.295 0.000 fileinput.py:243(next)

    I have no idea why this also made it twice as slow… or maybe that's just a cache effect; it's been a few minutes since I last ran it, and I've done lots of web browsing in between. But that's not important; what matters is that most of the time is now being charged to reasonable places.

    But if I change this to move the outer loop to the global level, and only its body into a function, most of the time disappears again.


    Another alternative, which I wouldn't suggest except as a last resort…

    I notice that if I use profile instead of cProfile, it works both internally and externally, charging time to the right calls. However, those calls are also about 5x slower. And there seems to be an additional 10 seconds of constant overhead (which gets charged to import profile if used internally, or to whatever's on line 1 if used externally). So, to find out that strip is using about 70% of my time, instead of waiting 4 seconds and doing 2.326 / 3.352, I have to wait 27 seconds, and do 10.93 / (26.34 - 10.01). Not much fun…
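
    For reference, the two divisions above work out to roughly 69% and 67%. The snippet below just redoes that arithmetic with the figures quoted in this answer; the variable names are made up for illustration.

```python
# Figures quoted in the answer, in seconds.
cprofile_strip, cprofile_total = 2.326, 3.352
profile_strip, profile_total, profile_overhead = 10.93, 26.34, 10.01

# Share of run time spent in str.strip() according to each profiler.
share_cprofile = cprofile_strip / cprofile_total
# For profile, subtract the constant overhead before dividing.
share_profile = profile_strip / (profile_total - profile_overhead)

print("cProfile: %.1f%%" % (100 * share_cprofile))
print("profile:  %.1f%%" % (100 * share_profile))
```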


    One last thing: I get the same results with a CPython 3.4 dev build, namely correct results when used internally and everything charged to the first line of code when used externally. But PyPy 2.2/2.7.3 and PyPy3 2.1b1/3.2.3 both seem to give me correct results with -m cProfile. This may just mean that PyPy's cProfile is faked on top of profile because the pure-Python code is fast enough.


    Anyway, if someone can figure out/explain why -m cProfile isn't working, that would be great… but otherwise, this is usually a perfectly good workaround.
