* Performance Profiling in Python using cProfile
@ 2024-03-22 21:24 Housam Alamour
0 siblings, 0 replies; only message in thread
From: Housam Alamour @ 2024-03-22 21:24 UTC (permalink / raw)
To: bunsen
[-- Attachment #1: Type: text/plain, Size: 8660 bytes --]
Performance Profiling in Python using cProfile
In this brief post, I will describe how to profile and test the performance
of code in Python, and how I used this method to solve a major performance
issue in my code. To profile and test the performance of Python code, you
can use the built-in cProfile library, and then use the pstats library to
analyze the results. These libraries enable you to view statistics related
to the execution time and frequency of different parts of Python code. I
used cProfile to work out a performance issue on my new additions to the
bunsen webview, the r-grid-testcase tool.
To start, cProfile can be run on any python file from the command line for
the whole program you want to profile with
python -m cProfile main.py
However, you may also want to profile more specific parts of your code. To
do this, you must import the library as well as pstats, with
import cProfile
import pstats
Next, navigate to the part of the code you want to profile. In my case it
was the @app.route r_grid_testcase decorator within the r-httpd-browse
flask web app. We instantiate a cProfile profiler at the beginning of the
code we want to profile.
@app.route('/r_grid_testcase/')
def r_grid_testcase():
with cProfile.Profile() as profile:
...
Finally, collect the results from the profiler using a pstats object at the
end of the function, sort the results based on the variable of interest (in
this case, cumulative time), and then print the results. If the function
you are profiling has a return value, ensure the return line is indented
one line less than the cProfile line to ensure it is outside the statements
block and the function returns correctly.
...
results = pstats.Stats(profile)
results.sort_stats(pstats.SortKey.CUMULATIVE)
results.print_stats()
return response
You can sort the results by various attributes, such as NAME for function
name, CUMULATIVE for cumulative time, or FILENAME, among others.
Now, when you run the program, the statistics will be printed to stdout
each time that function returns. Here are the top results of the old
r-httpd-browse code:
119461 function calls (112011 primitive calls) in 12.938 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 12.795 12.795
/usr/lib64/python3.11/subprocess.py:506(run)
1 0.000 0.000 12.791 12.791
/usr/lib64/python3.11/subprocess.py:1165(communicate)
1 0.000 0.000 12.791 12.791
/usr/lib64/python3.11/subprocess.py:2060(_communicate)
4 0.000 0.000 12.790 3.198
/usr/lib64/python3.11/selectors.py:402(select)
4 12.790 3.198 12.790 3.198 {method 'poll' of
'select.poll' objects}
1 0.000 0.000 0.138 0.138
/home/halamour/.local/lib/python3.11/site-packages/flask/templating.py:139(render_template)
1 0.000 0.000 0.092 0.092
/home/halamour/.local/lib/python3.11/site-packages/jinja2/environment.py:1066(get_or_select_template)
1 0.000 0.000 0.092 0.092
/home/halamour/.local/lib/python3.11/site-packages/jinja2/environment.py:975(get_template)
1 0.000 0.000 0.092 0.092
/home/halamour/.local/lib/python3.11/site-packages/jinja2/environment.py:950(_load_template)
You can see a few different statistics shown: ncalls is the number of calls
made to the function, tottime is the total time spent in the function
excluding time made in calls to sub-functions, percall is the per call
time, cumtime is the cumulative time spent in this and all subfunctions,
and filename:lineno(function) is the function details.
In our case, we can see that by far the most amount of time was spent on
the poll method, which means r-httpd-browse is waiting on some other
program to run. With this knowledge, we can see that the only call to
another program is the subprocess.run command. This is spawning a new
process to run the r-grid-testcase command which is what extracts the
required test results for the webview.
p = subprocess.run(...)
So following the breadcrumbs I applied the same profiling steps above to
find the source of the slowdown in the r-grid-testcase file. I discovered
that the r-grid-test case was running many small database queries to get
the test information it was returning, which for a large search limit
resulted in huge overhead. To rectify this, I refactored r-grid-test case
to run a single large query instead of many smaller queries. This improved
performance noticeably. The time to run the r_grid_testcase decorator of
r-httpd-browse before and after the change went from almost 13 seconds to
below 0.5 seconds.
New r-httpd-browse profile:
42320 function calls (41152 primitive calls) in 0.433 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.385 0.385
/usr/lib64/python3.11/subprocess.py:506(run)
1 0.000 0.000 0.384 0.384
/usr/lib64/python3.11/subprocess.py:1165(communicate)
1 0.000 0.000 0.384 0.384
/usr/lib64/python3.11/subprocess.py:2060(_communicate)
4 0.000 0.000 0.383 0.096
/usr/lib64/python3.11/selectors.py:402(select)
4 0.383 0.096 0.383 0.096 {method 'poll' of
'select.poll' objects}
1 0.000 0.000 0.045 0.045
/home/halamour/.local/lib/python3.11/site-packages/flask/templating.py:139(render_template)
1 0.000 0.000 0.045 0.045
/home/halamour/.local/lib/python3.11/site-packages/flask/templating.py:127(_render)
1 0.000 0.000 0.045 0.045
/home/halamour/.local/lib/python3.11/site-packages/jinja2/environment.py:1269(render)
197/8 0.001 0.000 0.044 0.006 {method 'join' of 'str'
objects}
2785 0.004 0.000 0.043 0.000 <template>:4(root)
431 0.003 0.000 0.029 0.000
/home/halamour/.local/lib/python3.11/site-packages/jinja2/runtime.py:260(call)
195 0.001 0.000 0.024 0.000
/home/halamour/.local/lib/python3.11/site-packages/flask/app.py:1580(url_for)
Old r-grid-testcase profile:
105721 function calls (103399 primitive calls) in 11.551 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
50/1 0.001 0.000 11.551 11.551 {built-in method
builtins.exec}
1 0.000 0.000 11.551 11.551 r-grid-testcase:1(<module>)
1 0.012 0.012 11.426 11.426 r-grid-testcase:207(main)
201 7.564 0.038 7.564 0.038 {method 'fetchall' of
'sqlite3.Cursor' objects}
235 3.734 0.016 3.734 0.016 {method 'execute' of
'sqlite3.Connection' objects}
6 0.000 0.000 0.129 0.021 __init__.py:1(<module>)
68/4 0.001 0.000 0.125 0.031 <frozen
importlib._bootstrap>:1165(_find_and_load)
68/4 0.000 0.000 0.125 0.031 <frozen
importlib._bootstrap>:1120(_find_and_load_unlocked)
65/4 0.001 0.000 0.123 0.031 <frozen
importlib._bootstrap>:666(_load_unlocked)
48/4 0.000 0.000 0.123 0.031 <frozen
importlib._bootstrap_external>:934(exec_module)
151/8 0.000 0.000 0.120 0.015 <frozen
importlib._bootstrap>:233(_call_with_frames_removed)
New r-grid-testcase profile:
75916 function calls (73594 primitive calls) in 0.213 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
50/1 0.001 0.000 0.213 0.213 {built-in method
builtins.exec}
1 0.000 0.000 0.213 0.213 r-grid-testcase:1(<module>)
6 0.000 0.000 0.120 0.020 __init__.py:1(<module>)
68/4 0.001 0.000 0.116 0.029 <frozen
importlib._bootstrap>:1165(_find_and_load)
68/4 0.000 0.000 0.116 0.029 <frozen
importlib._bootstrap>:1120(_find_and_load_unlocked)
65/4 0.001 0.000 0.115 0.029 <frozen
importlib._bootstrap>:666(_load_unlocked)
48/4 0.000 0.000 0.114 0.029 <frozen
importlib._bootstrap_external>:934(exec_module)
151/8 0.000 0.000 0.112 0.014 <frozen
importlib._bootstrap>:233(_call_with_frames_removed)
It was even obvious from using the webview, as the webpage loaded in much
faster.
In conclusion, cProfile is a simple yet powerful tool for profiling Python
code and quickly debugging performance issues.
You can explore the tool in more depth on the Python docs:
https://docs.python.org/3/library/profile.html#module-cProfile
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2024-03-22 21:25 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-22 21:24 Performance Profiling in Python using cProfile Housam Alamour
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).