Multiprocessing with Python – A Complete Guide
Python’s built-in multiprocessing library lets a program run several processes at once. It makes it possible to divide an application into smaller, independent processes. The operating system can then assign each of these processes to a processor to run concurrently, improving performance and efficiency.
Consider a computer vision project in which a large amount of image data must be preprocessed. Because handling the photographs one at a time is slow, it is preferable to process several of them simultaneously. The capacity of a system to use many processors simultaneously is known as multiprocessing.
A computer with a single processor would cycle between various tasks to keep them all active. However, most modern computers have a multi-core CPU, which enables the simultaneous execution of multiple tasks. You can use the Python multiprocessing module to assign tasks to separate processes, which can help your scripts run more quickly.
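As a quick check of how much parallelism a machine offers, the short sketch below asks the multiprocessing module how many logical CPUs the operating system reports. The helper name available_cores is only an illustration and is not part of the library.

```python
import multiprocessing


def available_cores():
    """Return the number of logical CPUs reported by the operating system."""
    return multiprocessing.cpu_count()


if __name__ == "__main__":
    print(f"Logical CPUs reported: {available_cores()}")
```

The number is an upper bound on how many processes can truly run at the same moment; starting more processes than cores is allowed, but they will take turns.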
Why Use Multiprocessing In Python?
Running several processes on a single processor is difficult. To keep up with a growing number of processes, the processor has to pause the current one and switch to the next. Each switch interrupts the task in progress, which lowers performance.
It is comparable to a worker in a company who is expected to handle tasks from various departments. For example, if one employee is responsible for managing sales, accounts, and even the backend, he will have to stop selling while working on accounts, and vice versa.
Now assume that there are several employees, each with a distinct job description. The work gets easier. This is why Python multiprocessing is useful: the separate processes function like distinct employees, so diverse operations are simpler to handle and manage. Examples of multiprocessing systems include:
- A system that uses several central processors
- A single computing device containing several separate processing cores, known as a “multi-core processor”
The system can divide and distribute tasks to other processors in multiprocessing.
Basic Multiprocessing Using Python
Using the Python Multiprocessing module, let’s create a simple program that exemplifies concurrent programming.
Let’s examine the task() function, which prints before and after a 0.5-second sleep.
import time

def task():
    print('Sleeping for 0.5 seconds')
    time.sleep(0.5)
    print('Finished')
Using the multiprocessing module, we can easily declare a process:
import multiprocessing

p1 = multiprocessing.Process(target=task)
p2 = multiprocessing.Process(target=task)
The target parameter of the Process() constructor specifies the function that the process will run. Creating the Process objects does not launch them, however; the processes begin only when we call start():
p1.start()
p2.start()
A full concurrent program might look like this:
import multiprocessing
import time

def task():
    print('Sleeping for 0.5 seconds')
    time.sleep(0.5)
    print('Finished sleeping')

if __name__ == "__main__":
    start_time = time.perf_counter()

    # Creates two processes
    p1 = multiprocessing.Process(target=task)
    p2 = multiprocessing.Process(target=task)

    # Starts both processes
    p1.start()
    p2.start()

    finish_time = time.perf_counter()
    print(f"It took the programme {finish_time-start_time} seconds to complete.")
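We must place the main program under the if __name__ == "__main__": guard. When a child process is created with the spawn start method (the default on Windows and macOS), it imports the main module again. Without the guard, that import would re-run the code that creates processes, and the multiprocessing module would raise an error. The guard ensures that this code runs only in the parent process.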
The code has a flaw, though: the elapsed time is printed before the processes we created have finished. Here is the output that the code above produced:
The program finished in 0.01618915414532 seconds
Sleeping for 0.5 seconds
Sleeping for 0.5 seconds
Finished sleeping
Finished sleeping
To make the two processes finish before the time is printed, we must call join() on both of them. Three processes are active here: p1, p2, and the main process. The main process is the one that tracks the elapsed time and prints the execution time. The line that sets finish_time should not run until p1 and p2 have completed. Just add the following lines of code right after the start() calls:
p1.join()
p2.join()
Calling join() on a process makes the calling process (here, the main process) wait until that process has finished. The output with the join statements added is shown below:
Sleeping for 0.5 seconds
Sleeping for 0.5 seconds
Finished sleeping
Finished sleeping
Program finished in 0.5688213340181392 seconds
Similar thinking allows us to run more processes. For example, the entire code from above, modified to have ten processes, is as follows:
import multiprocessing
import time

def task():
    print('Sleeping for 0.5 seconds')
    time.sleep(0.5)
    print('Finished')

if __name__ == "__main__":
    start_time = time.perf_counter()
    processes = []

    # Creates 10 processes then starts them
    for i in range(10):
        p = multiprocessing.Process(target=task)
        p.start()
        processes.append(p)

    # Joins all the processes
    for p in processes:
        p.join()

    finish_time = time.perf_counter()
    print(f"It took the programme {finish_time-start_time} seconds to complete.")
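The processes above all run the same task with no input. As a minimal sketch of a common variation, the version below passes different arguments to each process through the args parameter of Process and checks each process’s exitcode after join(). The names run_workers, worker_id, and durations are made up for this illustration.

```python
import multiprocessing
import time


def task(worker_id, seconds):
    """Sleep for the given time, reporting which worker is running."""
    print(f"Worker {worker_id} sleeping for {seconds} seconds")
    time.sleep(seconds)
    print(f"Worker {worker_id} finished")


def run_workers(durations):
    """Start one process per duration and return their exit codes."""
    processes = [
        multiprocessing.Process(target=task, args=(i, seconds))
        for i, seconds in enumerate(durations)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()  # wait for every child before reading exitcode
    # An exit code of 0 means the child ended without an error
    return [p.exitcode for p in processes]


if __name__ == "__main__":
    print(run_workers([0.1, 0.2, 0.3]))
```

Because args is a tuple, a single argument needs a trailing comma, for example args=(0.5,).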
Python Multiprocessing Module: What Is It?
The Python multiprocessing module provides several classes for writing parallel programs. It offers an easy-to-use API for distributing tasks among multiple processors. It gets around the limitations of the Global Interpreter Lock (GIL) by using sub-processes rather than threads. The Python multiprocessing module’s main classes are:
- Process
- Queue
- Lock
What Purposes Do Python’s Multiprocessing Pipes Serve?
Python’s multiprocessing uses Pipes as a channel for communication. Pipes are useful when you want two processes to exchange messages. Calling Pipe() returns two connection objects, one for each end of the Pipe, and the processes use the send() and recv() methods on them to exchange information. For a better understanding, let’s look at an example. The code below uses a Pipe to deliver data from the child connection to the parent connection.
from multiprocessing import Process, Pipe

def exm_function(c):
    c.send(['Welcome to TechVidvan'])
    c.close()

if __name__ == '__main__':
    par_c, chi_c = Pipe()
    mp1 = Process(target=exm_function, args=(chi_c,))
    mp1.start()
    print(par_c.recv())
    mp1.join()
Output:
['Welcome to TechVidvan']
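A Pipe created with the default settings is two-way, so both ends can send and receive. The sketch below illustrates this under that assumption: the parent sends a number, and the child replies with its square. The function names square_worker and square_via_pipe are invented for the example.

```python
from multiprocessing import Process, Pipe


def square_worker(conn):
    """Receive one number from the pipe and send back its square."""
    number = conn.recv()
    conn.send(number * number)
    conn.close()


def square_via_pipe(number):
    """Ask a child process to square a number over a two-way Pipe."""
    parent_conn, child_conn = Pipe()  # duplex (two-way) by default
    p = Process(target=square_worker, args=(child_conn,))
    p.start()
    parent_conn.send(number)      # parent -> child
    result = parent_conn.recv()   # child -> parent
    p.join()
    return result


if __name__ == "__main__":
    print(square_via_pipe(7))
```

Passing duplex=False to Pipe() would instead give a one-way channel, where the first connection can only receive and the second can only send.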
What Purposes Do Queues Serve in Python Multiprocessing?
A queue is a data structure based on the FIFO (First-In-First-Out) principle. Like a Pipe, a queue supports communication between processes in Python multiprocessing. It offers the put() method to add data to the queue and the get() method to retrieve it. Here is an illustration of how a queue can be used with multiprocessing. The code below defines a function that checks whether each number is even and, if so, adds it to the queue. The process is then started, and the numbers are printed.
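import multiprocessing

def even_no(num, n):
    # Put every even number on the queue
    for i in num:
        if i % 2 == 0:
            n.put(i)
    # None marks the end of the data
    n.put(None)

if __name__ == "__main__":
    n = multiprocessing.Queue()
    p = multiprocessing.Process(target=even_no, args=(range(10), n))
    p.start()
    # A Queue object is always truthy, so read until the end marker instead
    for item in iter(n.get, None):
        print(item)
    p.join()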
Output:
0
2
4
6
8
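A single queue can also gather results from several processes at once. The following is a minimal sketch of that pattern, not the only way to do it: each child puts one (number, square) pair on a shared queue, and the parent reads exactly as many items as it started processes. The names square_into_queue and collect_squares are hypothetical.

```python
import multiprocessing


def square_into_queue(number, results):
    """Compute a square and put it on the shared queue with its input."""
    results.put((number, number * number))


def collect_squares(numbers):
    """Run one process per number and collect the results from a queue."""
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=square_into_queue, args=(n, results))
        for n in numbers
    ]
    for w in workers:
        w.start()
    # Read one result per worker before joining, so no child is left
    # waiting to flush its data into the queue.
    pairs = [results.get() for _ in workers]
    for w in workers:
        w.join()
    return dict(pairs)


if __name__ == "__main__":
    print(collect_squares([1, 2, 3]))
```

The results may arrive in any order, which is why each item carries its input number along with the answer.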
What Is the Queue Class in Python Multiprocessing?
A process is a running instance of computer software. Every Python program is executed in a process, which is a new instance of the Python interpreter. This process, known as MainProcess, executes the program’s commands using a single thread, known as the MainThread. Processes and threads are both created and managed by the underlying operating system.
To run code concurrently, we may need to create new child processes in our application. Python’s multiprocessing.Process class enables users to create and control new processes.
We frequently need to transfer data between processes when programming for several processors. Using a queue data structure is one way to share data.
To use a multiprocessing.Queue, first create an instance of the class. By default, this creates an unbounded queue, that is, a queue with no maximum size.
# create an unbounded queue
queue = multiprocessing.Queue()
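A queue can also be given a maximum size through the maxsize argument. The sketch below, with a made-up helper named fill_bounded_queue, shows what happens when more items are offered than the queue can hold: a non-blocking put() raises the queue.Full exception from the standard queue module.

```python
import multiprocessing
import queue  # needed only for the queue.Full exception


def fill_bounded_queue(maxsize, items):
    """Try to put items on a bounded queue without blocking."""
    q = multiprocessing.Queue(maxsize=maxsize)
    accepted, rejected = [], []
    for item in items:
        try:
            q.put(item, block=False)  # raises queue.Full when no room is left
            accepted.append(item)
        except queue.Full:
            rejected.append(item)
    # Take the accepted items back off the queue in FIFO order
    drained = [q.get(timeout=1) for _ in accepted]
    return accepted, rejected, drained


if __name__ == "__main__":
    print(fill_bounded_queue(2, [1, 2, 3]))
```

With the default blocking put(), the call would instead wait until another process removed an item and made room.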
What Is the Lock Class in Python Multiprocessing?
The multiprocessing package lets users launch new processes through an API similar to that of the threading module. To use it, we must import the multiprocessing module into the Python script. Problems can arise when two processes or threads try to access a shared resource, such as memory, files, or other data, at the same time. Therefore, we need a lock to protect that access. In a multiprocessing system, the processor units share the main memory and peripherals while running programs simultaneously.
A multiprocessing application is split into smaller chunks, each running independently, and the operating system assigns each process to a processor. We use the multiprocessing Lock class so that one process can hold a lock, preventing other processes from running the protected code until the lock is released.
The Lock class’s job is relatively straightforward. It lets code claim a lock so that no other process can run the same protected code until the lock is released. The Lock class therefore has two main operations: claiming the lock and releasing it. The acquire() method claims the lock, and the release() method releases it.
A process creates an instance of the lock, acquires it before entering a critical section, and releases it afterwards.
# create a lock
lock = multiprocessing.Lock()
# acquire the lock
lock.acquire()
# release the lock
lock.release()
Only one process can hold the lock at a time. If a process acquires the lock and never releases it, no other process can acquire it.
If another process holds the lock, a process that tries to acquire it will block until the lock is released.
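To make the idea concrete, here is a minimal sketch in which several processes increment one shared counter. It assumes a shared integer created with multiprocessing.Value (a class not covered elsewhere in this article) and uses the lock as a context manager, which acquires it on entry and releases it on exit. The names add_many and count_with_lock are illustrative only.

```python
import multiprocessing


def add_many(counter, lock, times):
    """Increment the shared counter, holding the lock for each update."""
    for _ in range(times):
        with lock:  # acquire() on entry, release() on exit
            counter.value += 1


def count_with_lock(n_processes=4, times=1000):
    """Run several processes that all update one shared integer."""
    lock = multiprocessing.Lock()
    # lock=False gives a plain shared int, so our own Lock does the protecting
    counter = multiprocessing.Value("i", 0, lock=False)
    workers = [
        multiprocessing.Process(target=add_many, args=(counter, lock, times))
        for _ in range(n_processes)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value


if __name__ == "__main__":
    print(count_with_lock())
```

Without the lock, two processes could read the same old value and both write back the same new one, so some increments would be lost.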
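Setting the “block” argument to False makes acquire() try to obtain the lock without blocking. A value of False is returned if the lock cannot be acquired. Note that multiprocessing.Lock names this argument block, whereas threading.Lock names it blocking.
# acquire the lock without blocking
lock.acquire(block=False)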
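The return value of a non-blocking acquire can be observed even within a single process. In this small sketch, the hypothetical helper try_lock_twice acquires a lock and then tries again without blocking. It uses the keyword block, which is the name multiprocessing.Lock gives this argument. Because the lock is not recursive, the second attempt fails rather than waiting forever.

```python
import multiprocessing


def try_lock_twice():
    """Acquire a lock, then attempt a second non-blocking acquire."""
    lock = multiprocessing.Lock()
    first = lock.acquire(block=False)   # lock is free, so this succeeds
    second = lock.acquire(block=False)  # lock is already held, so this fails
    lock.release()
    return first, second


if __name__ == "__main__":
    print(try_lock_twice())
```

A process can use the returned value to do other work when the lock is busy and try again later.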
Conclusion
In this article, you have learned what multiprocessing in Python is and how to use it. Practical applications of multiprocessing include sharing CPU resources among tasks and handling ATM operations. Because the multiprocessing module makes it simple to manage several processes, gaining a thorough understanding of it and some practical experience is a smart investment.
