Formatting Collected Runtime Statistics With pstats

Negar Vahid

Profiling Performance in Python Negar Vahid 06:52

Transcript

00:00 In this lesson, you will use pstats and cProfile together to get a better grip on your profiling output. pstats is a Python module that lets you load and explore profiling results collected by cProfile.

00:14 It lets you sort, filter, and inspect function calls to better understand your program’s performance. This time, you will integrate the pstats module with cProfile.
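As a rough sketch of the "load and explore" idea, the snippet below saves profiling data to a file and then loads it with pstats. The fib definition, the small argument 15, and the file name fib.prof are assumptions made for illustration; the lesson's own file is not shown here.

```python
import cProfile
import os
import pstats
import tempfile


def fib(n):
    # Assumed naive recursive definition, standing in for the lesson's fib.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


# Collect runtime statistics with cProfile.
profiler = cProfile.Profile()
profiler.enable()
result = fib(15)
profiler.disable()

# Save the raw data to a file (hypothetical name), then load it with pstats.
stats_path = os.path.join(tempfile.mkdtemp(), "fib.prof")
profiler.dump_stats(stats_path)

stats = pstats.Stats(stats_path)
# Sort by call count and print at most five rows.
stats.strip_dirs().sort_stats(pstats.SortKey.CALLS).print_stats(5)
```

Saving to a file like this is handy when you want to profile once and inspect the results several times with different sort orders.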

00:26 Use pstats to analyze the profiling data from fib(35) sorted by cumulative time and print the top function calls.

00:37 Here you have a file called pstats_fib.py with the same fib function from before. In the previous lesson, you used the run() function of cProfile and learned that another way to use cProfile is as a context manager.

00:54 This approach gives you a lot more room to work with the results of your profiler, especially when you want to take advantage of pstats too.

01:02 But note that context manager support is only available in cProfile, not in the pure Python implementation, profile. Okay, now that you know all that, it’s time to use pstats. But before that, to use the context manager support of cProfile, you need to import Profile: from cProfile import Profile. Then you need to import SortKey and Stats from pstats.

01:36 That’s from pstats import SortKey, Stats, with the S and K in SortKey capitalized and a capital S in Stats. You’re going to use SortKey and Stats to sort the results of the profiler.

01:51 After your function definition, you can create the context manager: with Profile() as profile:

02:00 First, you need to print the actual result of fib(35). So print(f"{fib(35) = }"), which is an f-string with fib(35) followed by an equals sign inside the curly brackets.

02:19 And now comes pstats. First, type an opening and a closing parenthesis. Inside these parentheses, you need to create a Stats object from the profile data.

02:34 Stats(profile). This object can sort, format, and display profiling statistics. The first thing that you will do is to make the results more readable.

02:48 You can strip long directory paths from file names in the output. For that, you need the strip_dirs() method, so .strip_dirs().

03:02 This keeps only the base filename. Then you need to sort the function list by the number of calls. For that, you need the sort_stats() method.

03:13 .sort_stats(

03:18 Inside the parentheses, you need the SortKey that you imported before. You want to sort by the number of calls, so SortKey.CALLS, with CALLS in all caps.

03:32 This makes the functions with the most calls appear at the top. And finally, you want to print the actual profile statistics table to standard output.

03:45 For that, you need the print_stats() method: .print_stats()
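Put together, the steps dictated so far might look roughly like the sketch below. The fib definition is an assumption (the lesson only says it's the same function as before), and the sketch calls fib(20) rather than fib(35) so that it finishes quickly. Swap in 35 to match the lesson.

```python
from cProfile import Profile
from pstats import SortKey, Stats


def fib(n):
    # Assumed naive recursive definition, standing in for the lesson's fib.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


with Profile() as profile:
    # Print the result of the call being profiled.
    print(f"{fib(20) = }")
    (
        Stats(profile)              # build a Stats object from the profile data
        .strip_dirs()               # keep only base filenames
        .sort_stats(SortKey.CALLS)  # order rows by call count
        .print_stats()              # print the table to standard output
    )
```

The outer parentheses are only there so that the method chain can be split over several lines.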

03:57 Okay, now you can go to the terminal and run this file.

04:02 python pstats_fib.py

04:09 Okay, let’s take a closer look here. As you can see, this table is now showing the results ordered by the call count. You can order the results by other things.

04:19 For example, if you wanted to order the stats by, say, cumulative time, you’d use the SortKey.CUMULATIVE key. For more information, you can always take a look at the Python Profilers page of the official Python docs.
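Here's a minimal sketch of sorting by cumulative time instead, using SortKey.CUMULATIVE from pstats. The fib definition and the small argument 20 are assumptions chosen to keep the run short. Passing a number to print_stats() limits how many rows are printed.

```python
from cProfile import Profile
from pstats import SortKey, Stats


def fib(n):
    # Assumed naive recursive definition, standing in for the lesson's fib.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


with Profile() as profile:
    fib(20)

# Order rows by cumulative time rather than by call count.
stats = Stats(profile).strip_dirs().sort_stats(SortKey.CUMULATIVE)
stats.print_stats(5)  # print at most five rows
```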

04:35 When you take a closer look here, you realize that the numbers you’re seeing here are a bit different from when you ran cProfile.run(). For example, now you have 10 primitive calls instead of four.

04:49 The reason is that here you’re profiling everything inside that with block. So you are profiling the call to fib(35). You are profiling the print() function that shows the results, the creation of the Stats object, and so on.
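To see the effect of what sits inside the with block, you could compare a wide block with a narrow one, as in the sketch below. The fib definition, the argument 20, and the names wide and narrow are assumptions for illustration. The exact counts depend on your Python version, so none are shown here.

```python
from cProfile import Profile
from pstats import Stats


def fib(n):
    # Assumed naive recursive definition, standing in for the lesson's fib.
    return n if n < 2 else fib(n - 2) + fib(n - 1)


# Wide block: print() and the Stats creation are profiled along with fib().
with Profile() as wide_profile:
    print(f"{fib(20) = }")
    wide = Stats(wide_profile)

# Narrow block: only the fib() call runs while the profiler is active.
with Profile() as narrow_profile:
    result = fib(20)
narrow = Stats(narrow_profile)

# prim_calls counts the primitive (non-recursive) calls that were recorded.
print(f"wide block:   {wide.prim_calls} primitive calls")
print(f"narrow block: {narrow.prim_calls} primitive calls")
```

Keeping the block narrow means the table mostly reflects the code you care about, at the cost of a little extra bookkeeping outside the block.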

05:06 But still, when you take a look at the results that are now ordered by the call count, you can see that the main fib function is still the hotspot of your program by having barely any primitive calls and around 30 million recursive calls.

05:24 Before you move on from cProfile, pay attention to one thing. cProfile gave you useful runtime statistics about fib, but did you get any insight at the line level?

05:37 No, you didn’t. When you run cProfile, Python uses a tool called sys.setprofile behind the scenes. This tool tracks each time a function starts and ends, but it does not track what happens inside each function line by line.

05:58 So if you need to measure how much time each line of your function takes, you can either write your own tracer with sys.settrace, which does report line events, or use a tool like line_profiler.
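As a small illustration of line-level events, the sketch below uses sys.settrace to count how often each line of fib runs. It counts executions rather than measuring time, and the tracer function, the line_hits counter, and the three-line fib are all assumptions made for this example. A dedicated tool such as line_profiler does the timing for you.

```python
import sys
from collections import Counter


def fib(n):
    if n < 2:
        return n
    return fib(n - 2) + fib(n - 1)


# Maps a line offset inside fib (1 = first body line) to its hit count.
line_hits = Counter()


def tracer(frame, event, arg):
    # Called with a "call" event for every new frame. Returning the tracer
    # turns on "line" events for that frame; returning None skips the frame.
    if frame.f_code is not fib.__code__:
        return None
    if event == "line":
        line_hits[frame.f_lineno - frame.f_code.co_firstlineno] += 1
    return tracer


sys.settrace(tracer)
try:
    result = fib(10)
finally:
    sys.settrace(None)  # always remove the tracer again

for offset, hits in sorted(line_hits.items()):
    print(f"line +{offset}: executed {hits} times")
```

Tracing every line is much slower than profiling at the function level, which is one reason cProfile doesn't do it.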

06:12 In this lesson, you learned that cProfile is a deterministic profiler. It tracks every single function call and measures exactly how much time each one takes, but that’s just one way to profile your code.

06:26 In the next lesson, you’ll look at statistical profiling with Pyinstrument. It works by taking quick snapshots of your program while it runs, giving you a big picture overview of where most of the time is spent.

06:41 And because it doesn’t track every single call, it runs with less overhead, meaning it’s faster and affects your program less. Let’s dive in.
