Python Multiprocessing Example
Python Multiprocessing
Parallel processing is receiving more and more attention nowadays. If you are not yet familiar with parallel processing, the Wikipedia article on the topic is a good starting point. As CPU manufacturers keep adding more cores to their processors, writing parallel code is a great way to improve performance. Python provides the multiprocessing module to let us write parallel code. To understand the main motivation behind this module, we need to know some basics of parallel programming. After reading this article, you should have a solid grasp of the topic.
Python Multiprocessing Process, Queue, and Lock
There are plenty of classes in the Python multiprocessing module for building a parallel program. Among them, three basic classes are Process, Queue, and Lock; these classes will help you build a parallel program. But before describing them, let us start with some simple code. To make a parallel program useful, you have to know how many cores your machine has, and the Python multiprocessing module lets you find that out. The following simple code prints the number of cores on your machine.
import multiprocessing

print("Number of cpu : ", multiprocessing.cpu_count())
The output will vary depending on your machine.
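For me, the number of cores is 8, so the code above prints:

Number of cpu :  8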
Python multiprocessing Process class
The Python multiprocessing Process class is an abstraction that sets up another Python process, provides it with code to run, and gives the parent application a way to control its execution. There are two important methods that belong to the Process class: start() and join(). First, we write a function that the process will run. Then, we instantiate a Process object. Creating the Process object does nothing on its own; the process only runs once we tell it to start via the start() method. The process then executes the target function.
After that, we wait for the process to complete via the join() method. Without the join() call, the parent does not wait for its children to finish, and processes that are never joined can linger and waste resources; if you create many of them, you may run short of resources and have to kill them manually. One important thing: if you want to pass any arguments to the process, you need to use the args keyword argument. The following code will help you understand the usage of the Process class.
from multiprocessing import Process


def print_func(continent='Asia'):
    print('The name of continent is : ', continent)


if __name__ == "__main__":  # confirms that the code is under main function
    names = ['America', 'Europe', 'Africa']
    procs = []

    # instantiating without any argument
    proc = Process(target=print_func)
    procs.append(proc)
    proc.start()

    # instantiating process with arguments
    for name in names:
        # print(name)
        proc = Process(target=print_func, args=(name,))
        procs.append(proc)
        proc.start()

    # complete the processes
    for proc in procs:
        proc.join()
The output of the above code will be similar to the following.
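The exact order of the lines may differ between runs because the processes execute concurrently; a typical run prints:

The name of continent is :  Asia
The name of continent is :  America
The name of continent is :  Europe
The name of continent is :  Africa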
Python multiprocessing Queue class
If you have basic knowledge of computer data structures, you probably know about queues. The Python multiprocessing module provides the Queue class, which is exactly a First-In-First-Out data structure. A Queue can store any picklable Python object (though simple ones are best) and is extremely useful for sharing data between processes. Queues are especially useful when passed as a parameter to a Process' target function so that the process can consume data from them. We insert data into the queue with the put() method and retrieve items from it with the get() method. See the following code for a quick example.
from multiprocessing import Queue

colors = ['red', 'green', 'blue', 'black']
cnt = 1

# instantiating a queue object
queue = Queue()
print('pushing items to queue:')
for color in colors:
    print('item no: ', cnt, ' ', color)
    queue.put(color)
    cnt += 1

print('\npopping items from queue:')
cnt = 0
while not queue.empty():
    print('item no: ', cnt, ' ', queue.get())
    cnt += 1
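Because everything here happens in a single process, the output should look like the following (note that the counter in the popping loop restarts at 0, as in the code above):

pushing items to queue:
item no:  1   red
item no:  2   green
item no:  3   blue
item no:  4   black

popping items from queue:
item no:  0   red
item no:  1   green
item no:  2   blue
item no:  3   black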
Python multiprocessing Lock class
The task of the Lock class is quite simple: it allows code to claim a lock so that no other process can execute the protected section of code until the lock has been released. So the Lock class really has two jobs. One is to claim the lock and the other is to release it. To claim the lock, the acquire() function is used, and to release the lock, the release() function is used.
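As a minimal sketch of acquire() and release() in action, the following code makes four processes share one Lock so that only one of them prints at a time. The print_with_lock function and its messages are just illustrative names, and the try/finally simply ensures the lock is always released even if an error occurs.

from multiprocessing import Lock, Process


def print_with_lock(lock, msg):
    # claim the lock so that only one process prints at a time
    lock.acquire()
    try:
        print(msg)
    finally:
        # always release the lock, even if printing fails
        lock.release()


if __name__ == '__main__':
    lock = Lock()
    procs = [Process(target=print_with_lock, args=(lock, 'message no ' + str(i))) for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()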
Python multiprocessing example
In this Python multiprocessing example, we will merge all of our knowledge together. Suppose we have some tasks to accomplish. To get them done, we will use several processes and maintain two queues: one will contain the tasks and the other will contain a log of the completed tasks. Then we instantiate the processes to complete the tasks. Note that the multiprocessing Queue class is already synchronized, which means we don't need the Lock class to prevent multiple processes from accessing the same queue object at the same time. Below is the implementation, where we add tasks to the first queue, create the processes and start them, use join() to wait for the processes to complete, and finally print the log from the second queue.
from multiprocessing import Lock, Process, Queue, current_process
import time
import queue  # imported for using queue.Empty exception


def do_job(tasks_to_accomplish, tasks_that_are_done):
    while True:
        try:
            '''
                try to get a task from the queue. get_nowait() will raise
                queue.Empty if the queue is empty. get(False) would do the
                same thing.
            '''
            task = tasks_to_accomplish.get_nowait()
        except queue.Empty:
            break
        else:
            '''
                if no exception has been raised, add the task completion
                message to the tasks_that_are_done queue
            '''
            print(task)
            tasks_that_are_done.put(task + ' is done by ' + current_process().name)
            time.sleep(.5)
    return True


def main():
    number_of_task = 10
    number_of_processes = 4
    tasks_to_accomplish = Queue()
    tasks_that_are_done = Queue()
    processes = []

    for i in range(number_of_task):
        tasks_to_accomplish.put("Task no " + str(i))

    # creating processes
    for w in range(number_of_processes):
        p = Process(target=do_job, args=(tasks_to_accomplish, tasks_that_are_done))
        processes.append(p)
        p.start()

    # completing process
    for p in processes:
        p.join()

    # print the output
    while not tasks_that_are_done.empty():
        print(tasks_that_are_done.get())

    return True


if __name__ == '__main__':
    main()
Depending on the number of tasks, the code will take some time to produce its output, and the output will vary from run to run.
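The pairing of tasks and worker processes is nondeterministic, so the process names next to each task will differ between runs, but a run typically looks something like this:

Task no 0
Task no 1
Task no 2
Task no 3
Task no 4
Task no 5
Task no 6
Task no 7
Task no 8
Task no 9
Task no 0 is done by Process-1
Task no 1 is done by Process-2
Task no 2 is done by Process-3
Task no 3 is done by Process-4
Task no 4 is done by Process-1
Task no 5 is done by Process-2
Task no 6 is done by Process-3
Task no 7 is done by Process-4
Task no 8 is done by Process-1
Task no 9 is done by Process-2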
Python multiprocessing Pool
Python multiprocessing Pool can be used for the parallel execution of a function across multiple input values, distributing the input data across processes (data parallelism). Below is a simple Python multiprocessing Pool example.
from multiprocessing import Pool
import time

work = (["A", 5], ["B", 2], ["C", 1], ["D", 3])


def work_log(work_data):
    print(" Process %s waiting %s seconds" % (work_data[0], work_data[1]))
    time.sleep(int(work_data[1]))
    print(" Process %s Finished." % work_data[0])


def pool_handler():
    p = Pool(2)
    p.map(work_log, work)


if __name__ == '__main__':
    pool_handler()
In the output of the above program, notice that the pool size is 2, so two executions of the work_log function happen in parallel. When one of them finishes, that worker picks up the next argument, and so on.
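A run should print something close to the following: A and B start together, B finishes first and its worker picks up C, then D, while A keeps running.

 Process A waiting 5 seconds
 Process B waiting 2 seconds
 Process B Finished.
 Process C waiting 1 seconds
 Process C Finished.
 Process D waiting 3 seconds
 Process A Finished.
 Process D Finished.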