Python processes and process pools (code section)


The first way to create a process is:

from multiprocessing import Process
def f(name):
    print(name,"In subprocess")
if __name__ == "__main__":
    p = Process(target=f,args=("aaa",))
    p.start()
    print("Execute main process content")

# Print as follows
Execute main process content
aaa In subprocess

From the printed results, we can see that the program executed the main process's print before the child process's print. This is mainly because the operating system takes some time to create the new process, so the main process's print runs before the child process has had a chance to run.
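To see this, here is a minimal sketch (not from the original article): if the main process pauses for a moment before printing, the child process's output appears first.

from multiprocessing import Process
import time

def f(name):
    print(name, "In subprocess")

if __name__ == "__main__":
    p = Process(target=f, args=("aaa",))
    p.start()
    time.sleep(1)  # give the OS time to create and run the child process
    print("Execute main process content")

# Expected output
aaa In subprocess
Execute main process content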

The second way to create a process is:

from multiprocessing import Process

class MyProcess(Process):
    # Process's __init__ must be called to initialize the parent class,
    # otherwise the process cannot be started properly
    def __init__(self, name):
        super().__init__()  # must be called
        self.name = name
    def run(self):
        print(f"I am a child process {self.name}")

if __name__ == "__main__":
    p = MyProcess("aaa")
    p.start()
    print("I am the master process")

# Print as follows
I am the master process
I am a child process aaa

You can use the join method after creating a child process so that the program waits for the child process to finish before executing the code below the join.

from multiprocessing import Process
def f(name):
    print("I am a child process",name)
if __name__ == "__main__":
    p = Process(target=f,args=("aaa",))
    p.start()
    p.join()  # wait for the child process to finish before executing the code below
    print("I am the master process")

# Print as follows
I am a child process aaa
I am the master process

Use the os module to view process PIDs.

from multiprocessing import Process
import os
def f():
    print(f"Parent Process PID: {os.getppid()},Subprocess PID: {os.getpid()}")
if __name__ == "__main__":
    p = Process(target=f,args=())
    p.start()
    p.join()  # wait for the child process to finish before executing the code below
    print("Main process content")

# Print as follows
Parent process PID: 1588, child process PID: 3292
Main process content

Execute multiple processes:

from multiprocessing import Process
def f(process_name):
    print(f"{process_name}")
if __name__ == "__main__":
    for i in range(3):
        p = Process(target=f, args=("Subprocess-"+str(i),))
        p.start()
    print("Main Process")

# Print as follows
Main Process
Subprocess-0
Subprocess-1
Subprocess-2

We will find that the main process's print comes before all of the child processes. What if we want all the child processes to finish before the main process continues? Again, join is used.

Example 1:

from multiprocessing import Process
import time
def f(process_name):
    print(f"{process_name}")
if __name__ == "__main__":
    start_time = time.time()
    for i in range(3):
        p = Process(target=f, args=("Subprocess-"+str(i),))
        p.start()
        p.join()
    end_time = time.time()
    print(f"Executed{end_time - start_time}")

# Print as follows
Subprocess-0
Subprocess-1
Subprocess-2
Executed 0.4480257034301758

From the printed results, we can see that the main process did not continue until all the child processes had finished running. However, this is noticeably slower, because placing join inside the loop turns a program that should run multiple processes concurrently (asynchronously) into a synchronous one: we must wait for one process to finish before starting the next. This runs counter to our intention of having multiple processes execute at the same time, so join cannot be placed there.

See Example 2 below:

from multiprocessing import Process
import time
def f(process_name):
    print(f"{process_name}")
if __name__ == "__main__":
    start_time = time.time()
    pro_list = []
    for i in range(3):
        p = Process(target=f, args=("Subprocess-"+str(i),))
        p.start()
        pro_list.append(p) # Add process objects to a list

    for i in pro_list: # Loop waits for all processes to end
        i.join()
    end_time = time.time()
    print(f"Executed{end_time - start_time}")

# Print as follows
Subprocess-1
Subprocess-2
Subprocess-0
Executed 0.18201017379760742

By comparing Example 1 with Example 2, we can clearly see that Example 2 actually achieves concurrent execution of multiple processes.

Process creation using the second way (subclassing Process, rarely used):

import os
from multiprocessing import Process
class MyProcess(Process):
    def __init__(self,name):
        super().__init__()
        self.name = name
    def run(self):
        print(f"Subprocess-{self.name},PID:{os.getpid()}")

if __name__ == "__main__":
    p1 = MyProcess("aaa")
    p2 = MyProcess("bbb")
    p3 = MyProcess("ccc")
    p1.start()
    p2.start()
    p3.start()
    print("Main Thread")

# Print as follows
Subprocess-aaa,PID:7360
Subprocess-bbb,PID:6956
Subprocess-ccc,PID:4912

Although it is not commonly used, it is good to know that you can also create processes in this way.

Daemon:

There are two features:

1. A daemon process terminates as soon as the main process's code finishes executing.

2. A daemon process cannot create child processes of its own; attempting to do so raises an exception (see the sketch below).
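As a quick illustration of the second point, here is a minimal sketch (not from the original article). In CPython's multiprocessing, starting a process from inside a daemon process fails with an AssertionError saying that daemonic processes are not allowed to have children:

from multiprocessing import Process

def grandchild():
    print("This never runs")

def child():
    try:
        # starting a process from inside a daemon process raises AssertionError
        p = Process(target=grandchild)
        p.start()
    except AssertionError as e:
        print("Failed to start child:", e)

if __name__ == "__main__":
    p = Process(target=child)
    p.daemon = True
    p.start()
    p.join()  # wait so the daemon has time to run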

Creating a daemon is as simple as the following:

from multiprocessing import Process
def f():
    print("Daemon")
if __name__ == "__main__":
    p = Process(target=f,args=())
    p.daemon = True  # daemon=True must be set before calling start()
    p.start()
    print("Main Process")

# Print as follows
Main Process

We find that the daemon process's print never executed, or rather, the daemon ended before it had a chance to run. The operating system takes a moment to start a new process, and during that time the main process's code finishes, so the daemon process is terminated before it has time to execute. You can use join to wait for the daemon to finish before the main process ends.

from multiprocessing import Process
def f():
    print("Daemon")
if __name__ == "__main__":
    p = Process(target=f,args=())
    p.daemon = True  # daemon=True must be set before calling start()
    p.start()
    p.join()  # Waiting for the daemon to end
    print("Main Process")

# Print as follows
Daemon
Main Process

Process locks:

To ensure data safety, in some situations a process lock turns parallel execution into serial execution, which reduces the program's efficiency but guarantees the integrity of the data. When choosing between data safety and program efficiency, data safety matters more.

Now suppose, for example, that there is only one ticket left (the db file contains the JSON {"count": 1}):

 

from multiprocessing import Process
import time, json

def search(name):  # check tickets
    di = json.load(open("db"))
    print(f"{name} checked, remaining tickets {di['count']}")

def get(name):  # buy a ticket
    di = json.load(open("db"))
    time.sleep(0.1)
    if di["count"] > 0:
        di["count"] -= 1
        time.sleep(0.2)
        json.dump(di, open("db", "w"))
        print(f"{name} ticket purchased successfully")

def task(name):
    search(name)
    get(name)

if __name__ == "__main__":
    for i in range(5):  # simulate five people grabbing tickets
        p = Process(target=task, args=("Tourist-" + str(i),))
        p.start()

# Print as follows
Tourist-2 checked, remaining tickets 1
Tourist-1 checked, remaining tickets 1
Tourist-0 checked, remaining tickets 1
Tourist-4 checked, remaining tickets 1
Tourist-3 checked, remaining tickets 1
Tourist-2 ticket purchased successfully
Tourist-1 ticket purchased successfully
Tourist-0 ticket purchased successfully
Tourist-4 ticket purchased successfully
Tourist-3 ticket purchased successfully

Everyone's purchase succeeding is a data-safety problem. There was originally only one ticket, yet five people were told the purchase succeeded, which is certainly not the result we want. The problem is that all the tourists bought tickets at almost the same time: everyone saw one remaining ticket, and before the first buyer had time to write the decremented count back to the file, the other users had already bought tickets as well. While the remaining count of 0 was being written to the file, the other users also "successfully" purchased and wrote their results, leaving the data in a mess.

Here we use the process lock Lock, also known as a mutex, to solve the problem.

from multiprocessing import Process, Lock
import time, json

def search(name):  # check tickets
    di = json.load(open("db"))
    print(f"{name} checked, remaining tickets {di['count']}")

def get(name):  # buy a ticket
    di = json.load(open("db"))
    time.sleep(0.1)
    if di["count"] > 0:
        di["count"] -= 1
        time.sleep(0.2)
        json.dump(di, open("db", "w"))
        print(f"{name} ticket purchased successfully")

def task(name, lock):
    search(name)    # check tickets
    lock.acquire()  # lock
    get(name)       # buy a ticket
    lock.release()  # unlock

if __name__ == "__main__":
    lock = Lock()  # create the lock
    for i in range(5):  # simulate five people grabbing tickets
        p = Process(target=task, args=("Tourist-" + str(i), lock))
        p.start()

# Print as follows
Tourist-0 checked, remaining tickets 1
Tourist-1 checked, remaining tickets 1
Tourist-2 checked, remaining tickets 1
Tourist-3 checked, remaining tickets 1
Tourist-4 checked, remaining tickets 1
Tourist-0 ticket purchased successfully

By adding a mutex around the purchase, while one process is buying a ticket the other processes can only check but cannot buy, which keeps the data safe, and the final result is correct. This is also why you can clearly see tickets available, yet be told there are none when you click to buy. The lock does turn the originally parallel code into serial code, but remember: when data safety is not guaranteed, all that efficiency counts for nothing.
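As a side note (not from the original article), multiprocessing.Lock can also be used as a context manager, which acquires and releases the lock automatically and never leaves it held if get() raises an exception. A minimal sketch of task() rewritten this way:

def task(name, lock):
    search(name)   # check tickets (reading does not need the lock)
    with lock:     # acquired on entry, released on exit, even if an exception occurs
        get(name)  # buy a ticket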

Inter-process communication: Queue

Queue: a process-safe queue that can be used to transfer data between multiple processes.

Common methods of queues:

Queue([maxsize])
Creates a shared process queue. maxsize is the maximum number of items allowed in the queue. If this argument is omitted, there is no size limit. The queue is implemented on top of a pipe plus locks, and a feeder thread is used to transfer queued data to the underlying pipe.
A Queue instance q has the following methods:
q.get([block[, timeout]])
Returns an item from q. If the queue is empty, this method blocks until an item is available. block controls the blocking behavior and defaults to True; if set to False, a Queue.Empty exception (defined in the standard queue module) is raised instead. timeout is an optional timeout used in blocking mode; if no item becomes available within the given interval, a Queue.Empty exception is raised.
q.get_nowait()
The same as q.get(False).
q.put(item[, block[, timeout]])
Puts item on the queue. If the queue is full, this method blocks until space is available. block controls the blocking behavior and defaults to True; if set to False, a Queue.Full exception (defined in the standard queue module) is raised instead. timeout specifies how long to wait for free space in blocking mode; a Queue.Full exception is raised once the timeout expires.
q.qsize()
Returns the number of items currently in the queue. The result is not reliable, because items may be added to or removed from the queue between the moment the result is returned and the moment it is used. On some systems this method may raise a NotImplementedError.
q.empty()
Returns True if q is empty at the moment this method is called. If other processes or threads are adding items to the queue, the result is unreliable: new items may be added between returning the result and using it.
q.full()
Returns True if q is full. The result may also be unreliable, for the same reason as q.empty().
q.close()
Closes the queue, preventing more data from being added to it. When this method is called, the background feeder thread continues writing any data already queued but not yet flushed, and shuts down as soon as that is done. This method is called automatically if q is garbage-collected. Closing the queue does not produce any kind of end-of-data signal or exception for consumers of the queue; for example, if a consumer is blocked in a get() call, closing the queue in the producer does not cause get() to return an error.
q.cancel_join_thread()
Prevents the background thread from being joined automatically when the process exits, which keeps the join_thread() method from blocking.
q.join_thread()
Joins the queue's background thread. This method is used to wait for all queued items to be flushed after calling q.close(). By default, it is called by all processes that are not the original creator of q; this behavior can be disabled by calling q.cancel_join_thread().
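A quick, minimal sketch (not from the original article) of the basic put/get behavior described above:

import queue
from multiprocessing import Queue

if __name__ == "__main__":
    q = Queue(3)        # a queue that holds at most 3 items
    q.put("a")
    q.put("b")
    print(q.get())      # "a" (FIFO order)
    print(q.get())      # "b"
    try:
        q.get_nowait()  # the queue is now empty, so this raises queue.Empty
    except queue.Empty:
        print("queue is empty")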

Here we use a producer-consumer model to demonstrate:

 

from multiprocessing import Process, Queue
import time, random

def consumer(name, q):  # consumer
    while True:
        task = q.get()  # take data out of the queue
        if task is None:
            break
        print(f"{name} got data {task}")
        time.sleep(random.random())  # the consumer is faster than the producer

def producer(name, q):  # producer
    for i in range(3):
        q.put(i)  # put data into the queue
        print(f"{name} produced data {i}")
        time.sleep(random.uniform(1, 2))  # simulate the producer being slower than the consumer

if __name__ == "__main__":
    q = Queue()  # create a Queue
    pro = []
    for i in range(3):  # start the producer processes
        p = Process(target=producer, args=("Producer" + str(i), q))
        p.start()
        pro.append(p)
    # start the consumer processes
    p1 = Process(target=consumer, args=("aaa", q))
    p2 = Process(target=consumer, args=("bbb", q))
    p1.start()
    p2.start()
    for i in pro:  # wait for the producers to finish
        i.join()

    q.put(None)  # put one None for each consumer process so each can exit its loop
    q.put(None)

JoinableQueue([maxsize])
Creates a joinable shared process queue. This is like a Queue object, but the queue allows consumers of items to notify the producer that an item has been processed successfully. The notification is implemented using a shared semaphore and condition variable.

In addition to the same methods as a Queue object, an instance q of JoinableQueue has the following methods:
q.task_done()
The consumer uses this method to signal that an item returned by q.get() has been processed. If this method is called more times than there were items removed from the queue, a ValueError is raised.
q.join()
The producer uses this method to block until every item in the queue has been processed. Blocking continues until q.task_done() has been called for each item that was put on the queue.
The following example shows how to set up processes that run forever, consuming and processing items from a queue. The producer puts items on the queue and waits for them to be processed.

Now we re-implement the producer-consumer model above using JoinableQueue.

from multiprocessing import Process, JoinableQueue
import time, random

def consumer(name, q):  # consumer
    while True:
        task = q.get()  # take data out of the queue
        q.task_done()   # notify the producer that this item has been processed
        print(f"{name} got data {task}")
        time.sleep(random.random())  # the consumer is faster than the producer

def producer(name, q):  # producer
    for i in range(1):
        q.put(i)  # put data into the queue
        print(f"{name} produced data {i}")
        time.sleep(random.uniform(1, 2))  # simulate the producer being slower than the consumer
    q.join()  # production finished; block until the consumers have marked every item as done

if __name__ == "__main__":
    q = JoinableQueue()  # create a JoinableQueue
    pro = []
    for i in range(1):  # start the producer process
        p = Process(target=producer, args=("Producer" + str(i), q))
        p.start()
        pro.append(p)
    # start the consumer processes
    p1 = Process(target=consumer, args=("aaa", q))
    p2 = Process(target=consumer, args=("bbb", q))
    p1.daemon = True  # without daemon=True these two consumer processes would never end,
    p2.daemon = True  # because their while loops never break; they only call task_done()
    p1.start()
    p2.start()
    for i in pro:  # wait for the producer to finish
        i.join()

Again, this is why the consumers are set as daemon processes: q.task_done() only notifies the producer that an item has been processed, nothing more, so the consumer's while loop never exits. Without the daemon flag, the program would hang in that loop forever.

 

Process Pool

A process pool is a group of processes created in advance; when tasks arrive, processes from the pool are assigned to execute them. When the number of tasks exceeds the size of the pool, a task must wait until a process in the pool becomes idle and can pick it up.

Advantages of a process pool:

1. Make full use of CPU resources.

2. Multiple processes can execute at the same time, which achieves concurrency.

Disadvantages of a process pool: creating and destroying processes costs CPU time. Multiple processes are appropriate for complex, CPU-bound computations with little I/O blocking. If the program does not involve heavy computation, a thread pool is usually the better choice.

Some methods of the process pool, multiprocessing.Pool:

apply(func[, args[, kwargs]]):
Calls func(*args, **kwargs) in a worker process and returns the result.
Note that apply is synchronous: it waits for one call to finish before the next one can execute.
apply_async(func[, args[, kwargs[, callback]]]):
Calls func(*args, **kwargs) in a worker process without blocking.
apply_async is asynchronous: all worker processes can execute at the same time. The return value of this method is an instance of the AsyncResult class, and callback is a callable that receives a single argument. When the result of func becomes available, it is passed to callback immediately. The callback must not perform any blocking operations, otherwise it will delay the delivery of results from other asynchronous calls.
p.close(): Closes the process pool, preventing any further tasks from being submitted. Tasks that are still pending will be completed before the worker processes terminate.
p.join(): Waits for all worker processes to exit. This method can only be called after close() or terminate().
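Since the callback parameter is described above but not used in the examples below, here is a minimal sketch (not from the original article) showing apply_async with a callback that collects results as they become available:

from multiprocessing import Pool

def square(n):
    return n ** 2

results = []

def collect(value):  # runs in the main process as each result arrives
    results.append(value)

if __name__ == "__main__":
    p = Pool(3)
    for i in range(5):
        p.apply_async(square, args=(i,), callback=collect)
    p.close()
    p.join()
    print(sorted(results))  # [0, 1, 4, 9, 16]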

Example of a synchronous process pool:

import os,time
from multiprocessing import Pool
def work(n):
    print("PID:%s run" %os.getpid())
    time.sleep(1)
    return n ** 2

if __name__ == "__main__":
    p = Pool(3)  # Open process pool
    res = []
    for i in range(3):
        res.append(p.apply(work,args=(i,)))  # Process Synchronization Mode
    print(res)  # Print Return Results

 # Print as follows
PID:6180 run
PID:9728 run
[0, 1, 4]

Because the pool is used synchronously, the calls execute in order: one must finish before the next one starts.

Example of an asynchronous process pool:

import os,time
from multiprocessing import Pool
def work(n):
    print("PID:%s run" %os.getpid())
    time.sleep(1)
    return n ** 2

if __name__ == "__main__":
    p = Pool(3)  # Open process pool
    res = []
    for i in range(5):
        # submit the task asynchronously
        res.append(p.apply_async(work, args=(i,)))
    # Because the calls are asynchronous, the main process reaches this point quickly,
    # so we close the pool and wait for all tasks to finish before printing the results
    p.close()  # close the process pool
    p.join()   # wait for all tasks in the pool to finish
    for i in res:
        print(i.get(),end=" ")

# Print as follows
PID:7512 run
PID:10176 run
PID:7240 run
PID:10176 run
PID:7512 run
0 1 4 9 16

One caveat with process pools (and multiprocessing child processes in general): the built-in input() function cannot be used in the child processes. Their stdin is redirected to the null device, so calling input() there raises an EOFError.
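A minimal sketch (not from the original article) illustrating this behavior:

from multiprocessing import Pool

def ask(_):
    try:
        return input("Enter something: ")  # stdin is redirected to the null device in workers
    except EOFError:
        return "EOFError: input() is not available in a pool worker"

if __name__ == "__main__":
    with Pool(1) as p:
        print(p.apply(ask, args=(0,)))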
