In the last post we looked at the differences between Multiprocessing and Multithreading in Python3, along with a code sample that uses the concurrent.futures library to crawl URLs in separate threads. Here we use the multiprocessing library for a Multiprocessing example.
import logging
from multiprocessing import Process, Queue, current_process
import time
import queue

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

JOBS_NUM = 7
PROCESSES_NUM = 3


def work(jobs_to_do, jobs_done):
    while True:
        try:
            # non-blocking read; raises queue.Empty once the queue is drained
            job = jobs_to_do.get_nowait()
        except queue.Empty:
            break
        else:
            logger.info(job)
            jobs_done.put(job + ' is done by ' + current_process().name)
            time.sleep(.7)
    return True


def main():
    jobs_to_do = Queue()
    jobs_done = Queue()
    processes = []

    # fill the queue with the jobs to be processed
    for i in range(JOBS_NUM):
        jobs_to_do.put("job #" + str(i))

    # start the worker processes
    for w in range(PROCESSES_NUM):
        p = Process(target=work, args=(jobs_to_do, jobs_done))
        processes.append(p)
        p.start()

    # wait for all workers to finish
    for p in processes:
        p.join()

    # print the results collected by the workers
    while not jobs_done.empty():
        logger.info(jobs_done.get())

    return True


if __name__ == '__main__':
    main()
So here we defined two Queues – jobs_to_do and jobs_done. The Queue class is synchronized, so there is no need to use a Lock to block access to the queue object from different processes.
First we add the jobs we want done to the jobs_to_do queue, and then, using the available number of processes, we start the work function, which picks up a job from the queue with get_nowait(), performs it (in our case it just logs it), and finally puts the completed job on the jobs_done queue.
By calling p.join() we tell Python to wait for the processes to complete. (A stripped-down sketch of the Queue and join() ideas follows after the output below.)
The output:
2021-03-22 16:55:19,002 - __mp_main__ - INFO - job #0
2021-03-22 16:55:19,013 - __mp_main__ - INFO - job #1
2021-03-22 16:55:19,711 - __mp_main__ - INFO - job #2
2021-03-22 16:55:19,719 - __mp_main__ - INFO - job #3
2021-03-22 16:55:20,415 - __mp_main__ - INFO - job #4
2021-03-22 16:55:20,422 - __mp_main__ - INFO - job #5
2021-03-22 16:55:21,119 - __mp_main__ - INFO - job #6
2021-03-22 16:55:22,554 - __main__ - INFO - job #0 is done by Process-1
2021-03-22 16:55:22,556 - __main__ - INFO - job #1 is done by Process-2
2021-03-22 16:55:22,556 - __main__ - INFO - job #2 is done by Process-1
2021-03-22 16:55:22,556 - __main__ - INFO - job #3 is done by Process-2
2021-03-22 16:55:22,557 - __main__ - INFO - job #4 is done by Process-1
2021-03-22 16:55:22,557 - __main__ - INFO - job #5 is done by Process-2
2021-03-22 16:55:22,557 - __main__ - INFO - job #6 is done by Process-1
Did you notice that in this case two processes were enough to complete all the jobs? If you run the code several times you will see that the output differs from run to run. The result depends on your machine's hardware and on its load at the moment you run the script; your OS schedules the processes accordingly.
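To show the two ideas from the walkthrough in isolation (sharing a Queue between processes without an explicit Lock, and join() blocking until a child finishes), here is a minimal sketch. The drain function, the tasks/results names and the doubling of each item are made up for this illustration and are not part of the example above:

import queue
from multiprocessing import Process, Queue


def drain(tasks, results):
    while True:
        try:
            item = tasks.get_nowait()  # non-blocking; raises queue.Empty when drained
        except queue.Empty:
            break
        # Queue handles the locking internally, so no Lock is needed here
        results.put(item * 2)


if __name__ == '__main__':
    tasks, results = Queue(), Queue()
    for n in range(5):
        tasks.put(n)

    p = Process(target=drain, args=(tasks, results))
    p.start()
    p.join()  # blocks here until drain() returns in the child process

    while not results.empty():
        print(results.get())  # 0, 2, 4, 6, 8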
In the next article we'll learn how to write concurrent code using the async/await syntax.