Multithreading in A Nutshell (Python)
Multithreading allows for concurrency. It allows you to use multiple threads within a single core.
- Concurrency: the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order.
- Thread: a stream of execution in a program.
- Core: a CPU within a computer.
Multithreading takes advantage of the fact that there tends to be downtime in some programs. If your single threaded model already uses all of the processing power in a single core, adding threads will not help.
Use multithreading when there is downtime.
Examples:
- Waiting for user input
- Waiting for an API response
Concept
Single Thread — You have three API calls: A, B, and C. You run A. Wait for a response. Get a Response. Then do the same for B and C. The total time is the sum of the time needed for A, B, and C.
Multithreading— You have three API calls: A, B, and C. You run A. There is downtime as you wait for the response from the server so you run B. There is downtime as you wait for the response from the server so you run C. The three requests happen almost instantly. You will receive the responses for all three of these requests as they come in. They do not wait on other tasks to finish. The total time is the maximum of the times for A, B, and C.
Example
The following is a program that gets a list of 20 random dumb Donald Trump quotes
No Multithreading (single thread)
Note that the elapsed for this program takes 3.721 seconds.
Multithreading
As you can see, with multithreading, our program’s runtime drops down to 0.626 seconds.
Multithreading Over A List Of Data
Instead of calling the same API each time (a random quote), we will instead call quotes by their specific IDs. We will call our function on each of our quote URLs.
Common Hangups With Multithreading
Overhead — The point of threading is often to make your program run faster. However, it takes time and processing power to start a thread. If what you are threading (ex: API request) is faster than the actual process of making the thread, you could actually be slowing your program down. When working with threading, measure everything. Measure the speed of a piece of your program with threading. Then measure the speed without threading. You should only consider using threading if you are sure that it will offer a performance boost.
Invalid Order — There are some situations where things might not be in the order you want when using threading. For example, you can thread 3 API calls A, then B, then C. However, the server may respond to C before it responds to A. Make sure that you are aware of this caveat.
No Parentheses! — Common syntax error in the following lines:
- TPE.map(get_quote, quote_urls)
- [TPE.submit(get_quote, quote_url) for _ in range(20)]
Notice the lack of parentheses for the get_quote function. You want to pass in the function. Not invoke it.