Formatting Collected Runtime Statistics With pstats
00:00
In this lesson, you will use pstats and cProfile together to get a better grip on your profiling output. pstats is a Python module that lets you load and explore profiling results collected by cProfile.
00:14
It lets you sort, filter, and inspect function calls to better understand your program’s performance. This time, you will integrate the pstats module with cProfile.
00:26
Use pstats to analyze the profiling data from fib(35) sorted by cumulative time and print the top function calls.
00:37
Here you have a file called pstats_fib.py with the same fib function from before. In the previous lesson, you used the run() function of cProfile and learned that another way to use cProfile is as a context manager.
00:54
This approach gives you a lot more room to work with the results of your profiler, especially when you want to take advantage of pstats too.
01:02
But note that context manager support is only available in cProfile and not in the pure Python implementation, profile. Okay, now that you know all that, time to use pstats. But before that, to use the context manager support of cProfile, you need to import Profile: from cProfile import Profile. Then you need to import SortKey and Stats from pstats.
01:36
Import SortKey with S and K capitalized, and then Stats with a capital S. You’re going to be using SortKey and Stats to sort the results of the profiler.
01:51
After your function definition, you can create the context manager: with Profile()
02:00
as profile, followed by a colon. Inside the with block, you first print the actual result of fib(35). So print() with an f-string, and inside the curly brackets, fib(35)
02:19
followed by an equal sign: print(f"{fib(35) = }"). And now comes pstats. On the next line, type an opening and a closing parenthesis so that the chained method calls can span several lines. Inside these parentheses, you create a Stats object from the profile data.
02:34
Stats(profile). This object can sort, format, and display profiling statistics. The first thing you will do is make the results more readable.
02:48
You can strip long directory paths from file names in the output. For that, you need the strip_dirs() method, so .strip_dirs().
03:02
This keeps only the base filename. Then you want the Stats object to sort the function list by the number of calls. For that, you need the sort_stats() method.
03:18
Inside the parentheses, you need the SortKey that you imported before, and you want to sort by the number of calls. So SortKey.CALLS, where CALLS is all caps.
03:32
This makes the functions with the most calls appear at the top. And finally, you want to print the actual profile statistics table to standard output.
03:45
For that, you need the print_stats() method, so .print_stats().
03:57
Okay, now you can go to the terminal and run this file.
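Put together, the steps above could look roughly like the following sketch. The body of fib is an assumption here (the usual naive recursive definition), and fib(20) stands in for fib(35) only so that the run finishes quickly.

```python
from cProfile import Profile
from pstats import SortKey, Stats


def fib(n):
    # Assumed definition: the usual naive recursive Fibonacci.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


with Profile() as profile:
    # The lesson profiles fib(35); 20 is used here only to keep the run short.
    print(f"{fib(20) = }")
    (
        Stats(profile)                # build a Stats object from the profiler
        .strip_dirs()                 # keep only base filenames
        .sort_stats(SortKey.CALLS)    # order rows by call count
        .print_stats()                # write the table to standard output
    )
```

The outer parentheses only group the chained method calls so that each step can sit on its own line.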
04:09
Okay, let’s take a closer look here. As you can see, this table is now showing the results ordered by the call count. You can order the results by other things.
04:19
For example, if you wanted to order the stats by, say, cumulative time, you’d need to use the SortKey.CUMULATIVE key. For more information, you can always take a look at the Python Profilers page of the official Python docs.
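As a hedged illustration of switching the sort order, the sketch below sorts by SortKey.CUMULATIVE and limits the table to the first five rows. The fib body and the smaller argument are assumptions, and the report is written to an in-memory buffer through the stream argument of Stats so that it can be inspected as a string.

```python
import io
from cProfile import Profile
from pstats import SortKey, Stats


def fib(n):
    # Assumed definition: the usual naive recursive Fibonacci.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


with Profile() as profile:
    fib(20)  # smaller than the lesson's fib(35) to keep the run short

buffer = io.StringIO()
(
    Stats(profile, stream=buffer)      # send the report to the buffer
    .strip_dirs()
    .sort_stats(SortKey.CUMULATIVE)    # order rows by cumulative time
    .print_stats(5)                    # show at most the first five rows
)
report = buffer.getvalue()
print(report)
```

Passing an integer to print_stats() restricts the output to that many rows, which is handy when you only care about the top function calls.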
04:35
When you take a closer look here, you realize that the numbers you’re seeing are a bit different from when you ran cProfile.run(). For example, now you have 10 primitive calls instead of four.
04:49
The reason is that here you’re profiling everything inside that with block. So you are profiling the call to fib(35). You are profiling the print() function that shows the results, the creation of the Stats object, and so on.
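If you would rather keep the printing and reporting out of the measurements, one option is to leave only the call to fib inside the with block. This is a sketch under the same assumptions as before (assumed fib body, smaller argument), not the code from the lesson.

```python
from cProfile import Profile
from pstats import SortKey, Stats


def fib(n):
    # Assumed definition: the usual naive recursive Fibonacci.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


with Profile() as profile:
    result = fib(20)  # only this call runs while the profiler is enabled

# Everything below runs after the profiler has been disabled,
# so print() and the Stats machinery do not show up in the table.
print(f"{result = }")
Stats(profile).strip_dirs().sort_stats(SortKey.CALLS).print_stats()
```

The table should then be dominated by fib, with only a little profiler bookkeeping next to it.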
05:06
But still, when you take a look at the results that are now ordered by the call count, you can see that the main fib function is still the hotspot of your program, with barely any primitive calls and around 30 million recursive calls.
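To read the total and primitive call counts programmatically instead of from the printed table, a sketch like the following could be used. It relies on Stats.get_stats_profile(), which is available from Python 3.9 on, and again assumes the naive recursive fib with a smaller argument.

```python
from cProfile import Profile
from pstats import Stats


def fib(n):
    # Assumed definition: the usual naive recursive Fibonacci.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


with Profile() as profile:
    fib(20)  # smaller than the lesson's fib(35) to keep the run short

# func_profiles maps function names to per-function statistics.
fib_profile = Stats(profile).get_stats_profile().func_profiles["fib"]

# For recursive functions, ncalls is a string of the form "total/primitive".
total_calls, primitive_calls = fib_profile.ncalls.split("/")
print(f"total calls: {total_calls}, primitive calls: {primitive_calls}")
```

The primitive count covers only the calls that were not made recursively, which is why it stays tiny while the total grows quickly with the argument.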
05:24
Before you move on from cProfile, pay attention to one thing. cProfile gave you useful runtime statistics about fib, but did you get any insight at the line level?
05:37
No, you didn’t. When you run cProfile, Python uses a profiling hook called sys.setprofile behind the scenes. This hook is notified each time a function starts and ends, but it does not track what happens inside each function line by line.
05:58
So if you need to measure how much time each line of your function takes, you can either build your own tracer on top of sys.settrace, which does report individual lines, or use a tool like line_profiler.
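To give a feel for what sys.settrace reports, here is a minimal sketch that counts line events inside fib. It only counts how often each line starts executing and does not measure time; the names tracer and line_hits are made up for this example, and the fib body is assumed as before.

```python
import sys
from collections import Counter


def fib(n):
    # Assumed definition: the usual naive recursive Fibonacci.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


line_hits = Counter()  # hypothetical helper: line number -> number of line events


def tracer(frame, event, arg):
    # Ignore frames that do not belong to fib.
    if frame.f_code.co_name != "fib":
        return None
    if event == "line":
        line_hits[frame.f_lineno] += 1
    # Returning the tracer keeps line events coming for this frame.
    return tracer


sys.settrace(tracer)
try:
    fib(10)
finally:
    sys.settrace(None)  # always remove the trace function again

for lineno, hits in sorted(line_hits.items()):
    print(f"line {lineno}: {hits} line events")
```

A real line profiler would also record timestamps between events, which is the kind of work that line_profiler takes care of for you.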
06:12
In this lesson, you learned that cProfile is a deterministic profiler. It tracks every single function call and measures exactly how much time each one takes, but that’s just one way to profile your code.
06:26
In the next lesson, you’ll look at statistical profiling with Pyinstrument. It works by taking quick snapshots of your program while it runs, giving you a big-picture overview of where most of the time is spent.
06:41
And because it doesn’t track every single call, it runs with less overhead, meaning it’s faster and affects your program less. Let’s dive in.