Locked learning resources

Join us and get access to thousands of tutorials and a community of expert Pythonistas.

Unlock This Lesson

Locked learning resources

This lesson is for members only. Join us and get access to thousands of tutorials and a community of expert Pythonistas.

Unlock This Lesson

Collecting Detailed Runtime Statistics With cProfile

00:00 In the last lesson, you saw how timeit can help you measure how long a block of code takes to run on average. That’s great for quick timing, but it doesn’t tell you why something is slow.

00:11 You need more details, something that shows you which parts of your program are doing the heavy lifting, and that’s where cProfile comes in.

00:21 cProfile is a built-in deterministic profiler that tracks every single function call in your program. For each function, it records how long it took, how many times it ran, and even which function called it.

00:37 Python’s standard library gives you two main profiling modules. First one is profile, which is a pure Python implementation. Then you have cProfile, which is a faster C extension with the same interface.

00:52 Generally, it’s a good idea to go with cProfile unless you need to extend or modify the profiling behavior at the source level or cProfile is not available on your system.

01:05 cProfile is way more efficient, especially for larger programs.

01:11 Now, it’s time for you to write some code and see how cProfile works in action.

01:17 Use cProfile to collect runtime statistics for fib(35).

01:24 You can run cProfile either in the command line or you can use it directly in a Python file. Here you’re going to use it directly in a Python file named cprofile_fib.py.

01:36 You already have your fib function in lines three and four. Now you can run cProfile in a few different ways. For example, the quickest way is to use cProfile.run().

01:47 It’s fast and gets the job done, but you don’t have much control over what gets profiled or how the output is handled. If you want more flexibility, you could also use cProfile as a context manager.

02:02 That gives you full control over the scope you’re profiling and lets you work with the results directly. For this task, you’re going to stick with the first option to get a general understanding of what cProfile does. For that, first you need to import cProfile.

02:20 Then after your function definition, you can actually run cProfile. cProfile.run(), make sure P is capitalized, pass in the function that you want to profile.

02:32 In this case, fib( 35) as a string. As you’re all done, you can go to the terminal and run this file: python cProfile_fib.py.

02:52 Okay. Here you see the results from cProfile. You have 29.8 million function calls, but only four of them are primitive, meaning non-recursive.

03:06 All of these function calls were made in 3.1 seconds. You can see five columns and four rows sorted by the standard name of each function. Each row corresponds to a specific function and each column presents key statistics about that function.

03:25 Take a look at the columns. First one is ncalls. This is the number of times the function was called. Then you have tottime. This is the total time spent inside the function itself, so excluding time spent in sub-function calls.

03:41 Then you have percall. This is the average time spent per function invocation just for internal work is equal to total time divided by ncalls.

03:53 Then you have another important column. Cumulative time is the total time in this function and all sub-functions it calls. Then you have another percall column.

04:04 This is equal to cumtime divided by primitive calls. It gives you the average time per call, including subfunction`.

04:14 And the last column shows you where the function is defined. It gives you the filename, line number, and function name.

04:23 Now you can interpret what these columns are showing for each function in your profiler output. First one is string column one module. This refers to line one of the string passed to cProfile.run().

04:41 ncalls is telling you only one top-level line was run, fib(35) as a string and cumulative time is equal to 3.1 seconds, meaning all time was spent in nested calls, which you know that it was inside of fib().

05:00 The second row is cProfile_ fib.py:3 (fib) This is the third line in your cProfile_ fib.py file, which is your main function fib().

05:16 Pay attention to the first column: ncalls. This time you have two numbers. The first number is close to 29.8 million, that’s the number of recursive calls.

05:28 Then you have a slash, and then the second number, which is one. That’s the number of non-recursive or primitive function calls. Whenever your function is recursive, you get the ncalls column in this exact same format.

05:44 So now you know that fib() was called about 29.8 million times in total, but only once from the outside.

05:53 Both the total time and cumulative time are 3.1 seconds, which shows you that all work is done inside of the recursive tree. The third row, built-in method builtins. exec() is Python using the exec() method to run the string fib(35), and the last row is the disabled methods from your profiler.

06:19 This is just a profiler shutting itself off.

06:23 That is a lot of information that you got from cProfile.

06:28 Here is a quick cheat sheet in case you ever need to recheck what each column means. You can find the cheat sheet with the materials in the dropdown below this video.

06:39 Again, you just got a ton of detailed information about every function in your script, but as the number of functions grows, you’ll probably want more control over what gets shown.

06:50 Maybe you want to sort the results by cumulative time instead of the default order, or maybe you want to focus on a specific function instead of looking at everything. So how do you get a better grip on the output?

07:05 The answer is pstats.

Become a Member to join the conversation.