Understanding Software Profiling
00:00 In this lesson, you’ll learn about software profiling and then compare it to other things like testing and refactoring.
00:08 Software profiling is the process of collecting and analyzing various measures of a running program to identify performance bottlenecks, also known as hotspots.
00:19 To really grasp this, try to figure out whether each of the following cases is profiling or something else, like refactoring or testing.
00:31 The first case is you’ve written a function called find_prime_numbers(), and you want to make sure it returns the correct primes in a given range.
00:40 A quick way to do this is with assert. For example, the first line checks whether calling the function with the range zero to ten returns exactly a list that contains 2, 3, 5, and 7.
00:56 If it does, nothing really happens. If not, Python raises an AssertionError, telling you the function isn’t behaving as expected.
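As a minimal sketch of this kind of check: the body of find_prime_numbers() below is a hypothetical implementation written only for illustration (the lesson doesn't show one), and it assumes the upper bound is exclusive.

```python
def find_prime_numbers(start, stop):
    """Return the primes in the half-open range [start, stop).

    Hypothetical implementation; the lesson only names the function.
    """
    primes = []
    for n in range(max(start, 2), stop):
        # n is prime if no integer from 2 up to sqrt(n) divides it evenly
        if all(n % d != 0 for d in range(2, int(n ** 0.5) + 1)):
            primes.append(n)
    return primes


# Passing assertions are silent; a failing one raises AssertionError.
assert find_prime_numbers(0, 10) == [2, 3, 5, 7]
assert find_prime_numbers(10, 20) == [11, 13, 17, 19]
```

Running this prints nothing when the function is correct, which is exactly the "nothing really happens" behavior described above.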
01:07 Now, do you think this is profiling? In fact, it’s not. You are not getting any information about how long it takes for your function to run. This is in fact testing.
01:21 The second case is that you have a function called calc(), a quadratic function that computes the square of x plus one, that is, (x + 1) squared.
01:30 You have def calc that takes in (x) and then returns x * x + 2 * x + 1, and it does work, but it could be more readable.
01:43 So you change its name to calculate_quadratic(), replace x * x with x ** 2, and fix the indentation. Now your function looks a lot more readable, but is this profiling?
01:58 No, it is not. This doesn’t give you any information about how slow or fast your function is. This is called refactoring.
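The before and after versions might look roughly like the sketch below. The original's broken indentation isn't reproduced here, since the exact on-screen code isn't part of this transcript, and the loop at the end is an added sanity check that the refactoring didn't change behavior.

```python
# Before: a terse name and x * x
def calc(x):
    return x * x + 2 * x + 1


# After: a descriptive name and x ** 2
def calculate_quadratic(x):
    return x ** 2 + 2 * x + 1


# Refactoring should not change results: both match (x + 1) squared.
for value in range(-5, 6):
    assert calc(value) == calculate_quadratic(value) == (value + 1) ** 2
```

Note that nothing here measures time. The check confirms correctness only, which is why this is refactoring (plus a bit of testing) and not profiling.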
02:08 Your third case is you’re working on a script that downloads data, processes it, and then saves the results to disk. It does this through two functions: download_data() and process_data().
02:20 It works, but it feels slow, so you run something called cProfile, which you’ll learn about later on, and see this. The total time column shows you that a function called download_data() is the culprit behind your program being slow, and process_data() is innocent.
02:40 Now, do you think this one is profiling? Yes. Yes, it is. You found out the reason your program is slow.
02:49 Notice that profiling just tells you what’s making your program slow and doesn’t show you how to fix it. You have to figure that out on your own.
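Here is a rough sketch of what such a run could look like. The function bodies are assumptions made for illustration: time.sleep() stands in for a slow network download, and main() is a hypothetical entry point. Because the waiting happens inside time.sleep(), this sketch sorts by cumulative time, where download_data() shows up near the top.

```python
import cProfile
import io
import pstats
import time


def download_data():
    time.sleep(0.2)  # stand-in for a slow network request (assumption)
    return list(range(1000))


def process_data(data):
    return [value * 2 for value in data]


def main():
    data = download_data()
    return process_data(data)


# Profile only the call to main()
profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Write the five most expensive entries (by cumulative time) to a string
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report points at download_data() as the slow part, but deciding what to do about it (caching, parallel requests, a faster endpoint) is still up to you.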
02:59 Okay, now you have a better understanding of the core purpose of profiling. It’s time to zoom in on it a little bit more. Profilers come in two main types: deterministic and statistical.
03:12 Deterministic profilers trace every single function call in your program. They measure exactly how long each function runs and how often it’s called. You get precise, reproducible numbers.
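To get a feel for the "trace every call" idea, here's a toy sketch built on sys.setprofile(), which asks the interpreter to report every function call to a callback. This is not how you'd profile in practice, and workload() and helper() are made-up functions for the demonstration. It only counts calls and skips timing.

```python
import sys
from collections import Counter

call_counts = Counter()


def count_calls(frame, event, arg):
    # The "call" event fires each time a Python function is entered
    if event == "call":
        call_counts[frame.f_code.co_name] += 1


def helper():
    return sum(range(100))


def workload():
    for _ in range(3):
        helper()


sys.setprofile(count_calls)  # start tracing every call
workload()
sys.setprofile(None)  # stop tracing

print(call_counts["workload"], call_counts["helper"])
```

Since every call is observed, the counts are the same on every run: one call to workload() and three to helper().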
03:25 A good example of a deterministic profiler is cProfile, which you’ll explore in the next lessons. On the other hand, statistical profilers work differently. Instead of tracking every call, they take snapshots of your program at regular intervals.
03:41 They check which function is running at that moment and use those samples to estimate where time is being spent overall.
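The sampling idea can be sketched in a few lines, assuming CPython, where sys._current_frames() exposes the currently running frame of each thread. A background thread peeks at the main thread every 10 milliseconds and tallies which function it finds. The function names and the interval are illustrative choices, and real statistical profilers are far more careful than this.

```python
import sys
import threading
import time
from collections import Counter

samples = Counter()
stop_sampling = threading.Event()


def sample_thread(thread_id, interval=0.01):
    # Take a snapshot of the target thread at regular intervals
    while not stop_sampling.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            samples[frame.f_code.co_name] += 1
        time.sleep(interval)


def slow_function():
    time.sleep(0.3)  # stand-in for slow work (assumption)


def fast_function():
    return sum(range(100))


sampler = threading.Thread(target=sample_thread, args=(threading.get_ident(),))
sampler.start()
slow_function()
fast_function()
stop_sampling.set()
sampler.join()

print(samples.most_common())
```

The exact counts vary from run to run, which is the trade-off: you get an estimate of where time goes, not exact numbers, but the program isn't slowed down by tracing every call.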
03:49 In the next lessons, you’ll start with one of Python’s built-in timing tools called timeit to measure the runtime of small code snippets. Then you’ll gradually move on to more powerful tools like deterministic and statistical profilers that help you analyze the performance of entire programs, identify slow functions, and pinpoint bottlenecks in practice.