The first way to create a process is:
from multiprocessing import Process

def f(name):
    print(name, "In subprocess")

if __name__ == "__main__":
    p = Process(target=f, args=("aaa",))
    p.start()
    print("Execute main process content")

# Output:
# Execute main process content
# aaa In subprocess
From the output we can see that the main process's print runs before the child process's print. This is mainly because the operating system needs some time to create the child process, so the main process's print executes first.
The second way to create a process is:
from multiprocessing import Process

class MyProcess(Process):
    def __init__(self, name):
        # Process.__init__ must be called to initialize the parent class,
        # otherwise starting the process raises an error
        super().__init__()
        self.name = name

    def run(self):
        print(f"I am a child process {self.name}")

if __name__ == "__main__":
    p = MyProcess("aaa")
    p.start()
    print("I am the master process")

# Output:
# I am the master process
# I am a child process aaa
After creating a child process you can call its join method, so that the program waits for the child process to finish before executing the code below the join.
from multiprocessing import Process

def f(name):
    print("I am a child process", name)

if __name__ == "__main__":
    p = Process(target=f, args=("aaa",))
    p.start()
    p.join()  # Wait for the subprocess to finish before executing the code below
    print("I am the master process")

# Output:
# I am a child process aaa
# I am the master process
Use the os module to view the process PID number.
from multiprocessing import Process
import os

def f():
    print(f"Parent process PID: {os.getppid()}, subprocess PID: {os.getpid()}")

if __name__ == "__main__":
    p = Process(target=f, args=())
    p.start()
    p.join()  # Wait for the subprocess to finish before executing the code below
    print("Main process content")

# Output:
# Parent process PID: 1588, subprocess PID: 3292
# Main process content
Execute multiple processes:
from multiprocessing import Process

def f(process_name):
    print(f"{process_name}")

if __name__ == "__main__":
    for i in range(3):
        p = Process(target=f, args=("Subprocess-" + str(i),))
        p.start()
    print("Main Process")

# Output:
# Main Process
# Subprocess-0
# Subprocess-1
# Subprocess-2
We will find that the main process prints before any of the child processes. What if we want all the child processes to finish before the parent process continues? That is what join is for.
Example 1:
from multiprocessing import Process
import time

def f(process_name):
    print(f"{process_name}")

if __name__ == "__main__":
    start_time = time.time()
    for i in range(3):
        p = Process(target=f, args=("Subprocess-" + str(i),))
        p.start()
        p.join()
    end_time = time.time()
    print(f"Executed {end_time - start_time}")

# Output:
# Subprocess-0
# Subprocess-1
# Subprocess-2
# Executed 0.4480257034301758
From the output we can see that the main process did not continue until all the sub-processes had run. However, it is noticeably slow, because placing join inside the loop turns a program that should run in multiple processes simultaneously (asynchronously) into a synchronous one: we have to wait for one process to finish before starting the next. This runs counter to our original intention of having multiple processes execute at the same time, so join cannot be placed there.
See Example 2 below:
from multiprocessing import Process
import time

def f(process_name):
    print(f"{process_name}")

if __name__ == "__main__":
    start_time = time.time()
    pro_list = []
    for i in range(3):
        p = Process(target=f, args=("Subprocess-" + str(i),))
        p.start()
        pro_list.append(p)  # Add the process objects to a list
    for i in pro_list:      # Loop and wait for all processes to finish
        i.join()
    end_time = time.time()
    print(f"Executed {end_time - start_time}")

# Output:
# Subprocess-1
# Subprocess-2
# Subprocess-0
# Executed 0.18201017379760742
By comparing Example 1 with Example 2, we can clearly see that Example 2 really achieves the concurrent effect of multiple processes.
Process creation (second way, rarely used)
import os
from multiprocessing import Process

class MyProcess(Process):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def run(self):
        print(f"Subprocess-{self.name},PID:{os.getpid()}")

if __name__ == "__main__":
    p1 = MyProcess("aaa")
    p2 = MyProcess("bbb")
    p3 = MyProcess("ccc")
    p1.start()
    p2.start()
    p3.start()
    print("Main Thread")

# Output:
# Subprocess-aaa,PID:7360
# Subprocess-bbb,PID:6956
# Subprocess-ccc,PID:4912
Although not commonly used, it's good to know that processes can be created this way.
Daemon processes:
A daemon process has two characteristics:
1. The daemon process terminates as soon as the main process's code finishes executing.
2. A daemon process cannot spawn child processes of its own; attempting to do so raises an exception.
Creating a daemon process is as simple as the following:
from multiprocessing import Process

def f():
    print("Daemon")

if __name__ == "__main__":
    p = Process(target=f, args=())
    p.daemon = True  # daemon=True must be set before start()
    p.start()
    print("Main Process")

# Output:
# Main Process
We find that the daemon process did not execute, or rather, it was terminated before it had a chance to run. The operating system takes a while to start the process, and during that time the main process's code finishes, so the daemon process is ended before it can execute. You can use join to wait for the daemon process to finish before the main process ends.
from multiprocessing import Process

def f():
    print("Daemon")

if __name__ == "__main__":
    p = Process(target=f, args=())
    p.daemon = True  # daemon=True must be set before start()
    p.start()
    p.join()  # Wait for the daemon process to finish
    print("Main Process")

# Output:
# Daemon
# Main Process
Process locks:
To keep data safe, a process lock turns parallel execution into serial execution in some cases, which reduces the program's efficiency but guarantees data integrity. When weighing data safety against program efficiency, data safety matters more.
For example, suppose there is only one ticket left:
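The example below reads a JSON file named db from the current directory. That file is not shown in the original code; judging from the output it presumably contains {"count": 1}, so a minimal sketch to create it (an assumption, not part of the original example) is:

import json

# Assumed contents of the "db" file: one ticket left
json.dump({"count": 1}, open("db", "w"))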
from multiprocessing import Process
import time, json

def search(name):  # Check the remaining tickets
    di = json.load(open("db"))
    print(f"{name} checked, remaining tickets {di['count']}")

def get(name):  # Buy a ticket
    di = json.load(open("db"))
    time.sleep(0.1)
    if di["count"] > 0:
        di["count"] -= 1
        time.sleep(0.2)
        json.dump(di, open("db", "w"))
        print(f"{name} ticket successfully purchased")

def task(name):
    search(name)
    get(name)

if __name__ == "__main__":
    for i in range(5):  # Simulate only five people grabbing a ticket
        p = Process(target=task, args=("Tourist-" + str(i),))
        p.start()

# Output:
# Tourist-2 checked, remaining tickets 1
# Tourist-1 checked, remaining tickets 1
# Tourist-0 checked, remaining tickets 1
# Tourist-4 checked, remaining tickets 1
# Tourist-3 checked, remaining tickets 1
# Tourist-2 ticket successfully purchased
# Tourist-1 ticket successfully purchased
# Tourist-0 ticket successfully purchased
# Tourist-4 ticket successfully purchased
# Tourist-3 ticket successfully purchased
Everyone purchasing a ticket successfully is a data-safety problem. Originally there was only one ticket, yet all five people were told the purchase succeeded, which is certainly not the result we want. The problem is that all the tourists buy tickets at almost the same time: everyone sees one remaining ticket, and before the first user's purchase has been written back to the file (subtracting one ticket), the other users have also bought tickets. While the remaining count of 0 is being written to the file, the other users also complete their purchases and write their results, leaving the data in an inconsistent state.
Here we use the process lock Lock, also known as a mutex, to solve the problem.
from multiprocessing import Process, Lock
import time, json

def search(name):  # Check the remaining tickets
    di = json.load(open("db"))
    print(f"{name} checked, remaining tickets {di['count']}")

def get(name):  # Buy a ticket
    di = json.load(open("db"))
    time.sleep(0.1)
    if di["count"] > 0:
        di["count"] -= 1
        time.sleep(0.2)
        json.dump(di, open("db", "w"))
        print(f"{name} ticket successfully purchased")

def task(name, lock):
    search(name)     # Check tickets
    lock.acquire()   # Lock
    get(name)        # Buy a ticket
    lock.release()   # Unlock

if __name__ == "__main__":
    lock = Lock()  # Create the lock
    for i in range(5):  # Simulate five people grabbing a ticket
        p = Process(target=task, args=("Tourist-" + str(i), lock))
        p.start()

# Output:
# Tourist-0 checked, remaining tickets 1
# Tourist-1 checked, remaining tickets 1
# Tourist-2 checked, remaining tickets 1
# Tourist-3 checked, remaining tickets 1
# Tourist-4 checked, remaining tickets 1
# Tourist-0 ticket successfully purchased
A mutex lock is added around the purchase step: while one process is buying, other processes can only check tickets but cannot buy, which keeps the data safe, and the final result is correct. This is also why we can clearly see tickets available but are told there are none when we click to buy. Although the lock turns the originally parallel program into a serial one, we need to remember that efficiency means nothing when data safety is not guaranteed.
Inter-process communication: Queue
Queue: Queue is a multiprocess-safe queue that can be used to transfer data between multiple processes.
Common methods of queues:
Queue([maxsize])
Creates a shared process queue. maxsize is the maximum number of items allowed in the queue; if omitted, there is no size limit. The underlying queue is implemented with a pipe and locks. In addition, a feeder thread is started so that data placed in the queue can be transferred into the underlying pipe.
Queue's instance q has the following methods:
q.get([block[, timeout]])
Returns an item from q. If the queue is empty, this method blocks until an item is available. block controls the blocking behavior and defaults to True; if set to False and the queue is empty, a queue.Empty exception (defined in the standard queue module) is raised. timeout is an optional timeout used in blocking mode; if no item becomes available within the specified interval, queue.Empty is raised.
q.get_nowait() is equivalent to q.get(False).
q.put(item[, block[, timeout]])
Puts item on the queue. If the queue is full, this method blocks until space is available. block controls the blocking behavior and defaults to True; if set to False and the queue is full, a queue.Full exception (defined in the standard queue module) is raised. timeout specifies how long to wait for free space in blocking mode; queue.Full is raised when the timeout expires.
q.qsize()
Returns the number of items currently in the queue. The result is unreliable, because items may have been added to or removed from the queue between the time the result is returned and the time it is used. On some systems (such as macOS), this method may raise a NotImplementedError exception.
q.empty()
Returns True if q is empty at the moment this method is called. If other processes or threads are adding items to the queue, the result is unreliable; that is, new items may be added to the queue between the time the result is returned and the time it is used.
q.full()
Returns True if q is full. The result may likewise be unreliable because of other processes or threads (see q.empty()).
q.close()
Closes the queue to prevent more data from being added to it. When this method is called, the background thread continues to write any data that has been queued but not yet written to the pipe, and then terminates as soon as it is done. This method is called automatically if q is garbage collected. Closing the queue does not generate any end-of-data signal or exception for consumers of the queue; for example, if a consumer is blocked on a get() call, closing the queue in the producer does not cause get() to return an error.
q.cancel_join_thread()
Prevents the background thread from being joined automatically when the process exits, so that the join_thread() method does not block.
q.join_thread()
Joins the queue's background thread. This method is used after calling q.close() to wait until all buffered items have been flushed. By default, it is called on exit by any process that is not the original creator of q. This behavior can be disabled by calling q.cancel_join_thread().
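As a quick sketch tying the most common methods above together (single-process, just to show the calls; as noted, qsize()/empty()/full() are only approximations when several processes use the queue):

from multiprocessing import Queue
import queue      # the Empty and Full exceptions live in the standard queue module
import time

q = Queue(maxsize=2)        # at most 2 items are allowed in the queue
q.put("a")
q.put("b")
time.sleep(0.1)             # give the feeder thread time to move the items into the pipe
print(q.full())             # True
try:
    print(q.qsize())        # 2; may raise NotImplementedError on some systems (e.g. macOS)
except NotImplementedError:
    pass
try:
    q.put("c", block=False) # the queue is full, so this raises queue.Full
except queue.Full:
    print("queue is full")
print(q.get())              # "a"
print(q.get_nowait())       # "b", same as q.get(False)
try:
    q.get(timeout=0.1)      # the queue is now empty; queue.Empty is raised after the timeout
except queue.Empty:
    print("queue is empty")
print(q.empty())            # True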
Here we have a producer-consumer model to demonstrate:
from multiprocessing import Process, Queue
import time, random

def consumer(name, q):  # Consumer
    while True:
        task = q.get()  # Take data out of the queue
        if task is None:
            break
        print(f"{name} got data {task}")
        time.sleep(random.random())  # The consumer is faster than the producer

def producer(name, q):  # Producer
    for i in range(3):
        q.put(i)  # Put data into the queue
        print(f"{name} produced data {i}")
        time.sleep(random.uniform(1, 2))  # Simulate the producer being slower than the consumer

if __name__ == "__main__":
    q = Queue()  # Create a Queue
    pro = []
    for i in range(3):  # Start the producer processes
        p = Process(target=producer, args=("Producer" + str(i), q))
        p.start()
        pro.append(p)
    # Start the consumer processes
    p1 = Process(target=consumer, args=("aaa", q))
    p2 = Process(target=consumer, args=("bbb", q))
    p1.start()
    p2.start()
    for i in pro:  # Wait for the producers to finish
        i.join()
    q.put(None)  # Put one None for each consumer process, so every consumer can exit
    q.put(None)
JoinableQueue([maxsize])
Creates a joinable shared process queue. This is like a Queue object, but the queue allows the consumer of an item to notify the producer that the item has been successfully processed. The notification is implemented using a shared semaphore and a condition variable.
In addition to the same methods as a Queue object, an instance q of JoinableQueue has the following methods:
q.task_done()
The consumer uses this method to signal that an item returned by q.get() has been processed. If this method is called more times than there were items removed from the queue, a ValueError exception is raised.
q.join()
The producer uses this method to block until all items in the queue have been processed. Blocking continues until q.task_done() has been called for every item that was put into the queue.
The following example shows how to set up consumer processes that run forever, taking and processing items from a queue; the producer puts items on the queue and waits for them to be processed. We reimplement the producer-consumer model above using JoinableQueue.
from multiprocessing import Process, JoinableQueue
import time, random

def consumer(name, q):  # Consumer
    while True:
        task = q.get()  # Take data out of the queue
        q.task_done()   # Notify the producer that this item has been processed
        print(f"{name} got data {task}")
        time.sleep(random.random())  # The consumer is faster than the producer

def producer(name, q):  # Producer
    for i in range(1):
        q.put(i)  # Put data into the queue
        print(f"{name} produced data {i}")
        time.sleep(random.uniform(1, 2))  # Simulate the producer being slower than the consumer
    q.join()  # Production is complete; wait until every item has been processed

if __name__ == "__main__":
    q = JoinableQueue()  # Create a JoinableQueue
    pro = []
    for i in range(1):  # Start the producer processes
        p = Process(target=producer, args=("Producer" + str(i), q))
        p.start()
        pro.append(p)
    # Start the consumer processes
    p1 = Process(target=consumer, args=("aaa", q))
    p2 = Process(target=consumer, args=("bbb", q))
    p1.daemon = True  # Without the daemon flag these two processes never end,
    p2.daemon = True  # because task_done() only notifies the producer; it does not break the loop
    p1.start()
    p2.start()
    for i in pro:  # Wait for the producers to finish
        i.join()
Again, this is why the consumers are set as daemon processes: q.task_done() only notifies the producer that an item has been processed, nothing more, so the while loop never exits on its own. Without the daemon flag, the program would be stuck in the while loop.
Process Pool
A process pool is a group of processes created in advance; when tasks arrive, they are handed to processes from the pool. When the number of tasks exceeds the number of processes in the pool, new tasks must wait until a process in the pool becomes idle and can take them.
Advantages of process pools:
1. Make full use of CPU resources.
2. Multiple processes can execute at the same time, achieving concurrency.
Disadvantages of process pools: creating and destroying processes costs CPU time. Multiprocessing is appropriate for heavy computation with little I/O blocking. If the program does not involve heavy computation, it is usually better to use a thread pool, as sketched below.
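For I/O-bound work, a minimal thread-pool sketch using the standard library's concurrent.futures.ThreadPoolExecutor (my choice for illustration; the original text does not specify a thread-pool API):

from concurrent.futures import ThreadPoolExecutor
import time

def io_task(n):
    time.sleep(1)  # simulate blocking I/O (network, disk, ...)
    return n * 2

if __name__ == "__main__":
    start = time.time()
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = list(pool.map(io_task, range(3)))
    print(results, f"took {time.time() - start:.2f}s")  # roughly 1s instead of 3s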
Some methods of the process pool class multiprocessing.Pool:
apply(func[, args[, kwargs]]):
Calls func(*args, **kwargs) in a worker process and then returns the result.
It is important to note that apply is a synchronous operation: it must wait for one task to finish before the next one can execute.
apply_async(func[, args[, kwargs]]):
Calls func(*args, **kwargs) in a worker process and returns immediately without waiting for the result.
apply_async is asynchronous, so all worker processes can execute at the same time. The return value of this method is an instance of the AsyncResult class. callback is a callable that accepts a single input argument: when the result of func becomes available, it is immediately passed to callback. callback must not perform any blocking operations, or it will delay the handling of results from other asynchronous operations (see the callback sketch after the asynchronous example below).
p.close(): Closes the process pool so that no further tasks can be submitted. Tasks that are still pending are completed before the worker processes terminate.
p.join(): Waits for all worker processes to exit. This method can only be called after close() or terminate().
Synchronous process pool example:
import os, time
from multiprocessing import Pool

def work(n):
    print("PID:%s run" % os.getpid())
    time.sleep(1)
    return n ** 2

if __name__ == "__main__":
    p = Pool(3)  # Create a process pool with 3 worker processes
    res = []
    for i in range(3):
        res.append(p.apply(work, args=(i,)))  # Synchronous mode
    print(res)  # Print the returned results

# Output:
# PID:6180 run
# PID:9728 run
# [0, 1, 4]
Because apply is synchronous, the tasks run in order: one task must finish before the next one starts.
Asynchronous process pool example:
import os, time
from multiprocessing import Pool

def work(n):
    print("PID:%s run" % os.getpid())
    time.sleep(1)
    return n ** 2

if __name__ == "__main__":
    p = Pool(3)  # Create a process pool with 3 worker processes
    res = []
    for i in range(5):  # Asynchronous mode
        res.append(p.apply_async(work, args=(i,)))
    # Because the calls are asynchronous, the main process reaches this point quickly,
    # so close the pool and wait before collecting all the results
    p.close()  # Close the process pool
    p.join()   # Wait for the process pool to finish
    for i in res:
        print(i.get(), end=" ")

# Output:
# PID:7512 run
# PID:10176 run
# PID:7240 run
# PID:10176 run
# PID:7512 run
# 0 1 4 9 16
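The callback parameter of apply_async described above can also be used to collect results as soon as they are ready, instead of calling get() on each AsyncResult. A minimal sketch (the collect helper is just for illustration, not from the original text):

import time
from multiprocessing import Pool

def work(n):
    time.sleep(1)
    return n ** 2

results = []

def collect(value):
    # Runs in the main process as soon as a worker's result becomes available
    results.append(value)

if __name__ == "__main__":
    p = Pool(3)
    for i in range(5):
        p.apply_async(work, args=(i,), callback=collect)
    p.close()
    p.join()
    print(results)  # e.g. [0, 1, 4, 9, 16] (order may vary)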
One caveat with process pools (and multiprocessing child processes in general) is that the input() function cannot be used in a child process.
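A minimal sketch of the pitfall (my illustration, not from the original text): multiprocessing redirects the child's standard input, so calling input() there typically raises EOFError.

from multiprocessing import Pool

def ask(_):
    try:
        return input()  # stdin is closed/redirected in the worker process
    except EOFError:
        return "input() is not available in a child process"

if __name__ == "__main__":
    with Pool(1) as p:
        print(p.apply(ask, args=(0,)))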