Using the multiprocessing module
Simple example
import multiprocessing

def test(n):
    name = multiprocessing.current_process().name
    print(name, "starting")
    print("is", n)
    return

if __name__ == '__main__':
    num_list = []
    for i in range(10):
        p = multiprocessing.Process(target=test, args=(i,))
        num_list.append(p)
        p.start()
        # join() blocks the main process until the child process returns
        p.join()
    print(multiprocessing.cpu_count())
    print("End")
    print(num_list)
In the example above, multiprocessing.Process creates a process. target is the function to execute and args holds the arguments passed to it; group, name, and other parameters can also be set, though group is generally left unused. join() blocks the main process until the child finishes, cpu_count() returns the number of CPU cores, and name names the child process. A process goes through five states: created, ready, running, blocked, and finished. The start() method only moves the process to ready, not running: it tells the operating system that the process may be scheduled onto a CPU.
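As a quick sketch of the name parameter mentioned above (the label "worker-1" is just an illustrative choice, not anything the module requires):

import multiprocessing

def work():
    # current_process() reports the name assigned below
    print(multiprocessing.current_process().name, "running")

if __name__ == '__main__':
    # name is optional; without it processes are called Process-1, Process-2, ...
    p = multiprocessing.Process(target=work, name="worker-1")
    p.start()
    p.join()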
Use of process pools
from multiprocessing import Pool

def func(n):
    return n**2

if __name__ == '__main__':
    pool = Pool(processes=4)
    # apply_async submits the task asynchronously without blocking
    result = pool.apply_async(func, [10])
    print(result.get(timeout=1))
    print(pool.map(func, range(10)))
processes=4 means the pool starts four worker processes, roughly one per CPU core, and apply_async submits a task asynchronously. For example, if a task has four steps, asynchronous execution does not have to run them in order: step 2 can run while step 1 is still executing, with no waiting. The pool also has close(), join(), map(), and other methods. close() forbids submitting new tasks to the pool, join() waits for the workers much like join() on an ordinary process, and map(func, iterable) works like Python's built-in map, applying func to every item of the iterable.
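To make close() and join() concrete, here is a minimal sketch reusing the func above; the task count of 8 is arbitrary:

from multiprocessing import Pool

def func(n):
    return n**2

if __name__ == '__main__':
    pool = Pool(processes=4)
    results = [pool.apply_async(func, [i]) for i in range(8)]
    pool.close()  # no new tasks may be submitted after this
    pool.join()   # wait for every submitted task to finish
    print([r.get() for r in results])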
Multiprocess queue
from multiprocessing import Process, Queue

def set_data(queue):
    for i in range(10):
        queue.put("hello" + str(i))

def get_data(queue):
    for i in range(10):
        print(queue.get())

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=set_data, args=(q,))
    p2 = Process(target=get_data, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print("Is the queue empty?", q.empty())
One process puts data into the queue while another takes it out. In a crawler, for example, one process can keep enqueueing URLs to be crawled while another dequeues and fetches them, which speeds up data collection.
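A sketch of that crawler pattern, with placeholder URLs and a print standing in for the real fetch:

from multiprocessing import Process, Queue

def produce(queue):
    # enqueue URLs to crawl; these addresses are placeholders
    for i in range(5):
        queue.put("http://example.com/page/%d" % i)
    queue.put(None)  # sentinel: nothing left to crawl

def consume(queue):
    while True:
        url = queue.get()
        if url is None:
            break
        print("crawling", url)  # a real crawler would fetch the page here

if __name__ == '__main__':
    q = Queue()
    Process(target=produce, args=(q,)).start()
    p = Process(target=consume, args=(q,))
    p.start()
    p.join()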
Process lock
from multiprocessing import Process, Lock, Semaphore

def test_lock(lock, n):
    lock.acquire()  # acquire the lock before touching shared state
    print("this is ", n)
    lock.release()  # release the lock when finished

if __name__ == '__main__':
    lock = Lock()
    # s = Semaphore(3)
    for i in range(10):
        Process(target=test_lock, args=(lock, i)).start()
With multiple processes, two processes may access and modify the same variable at the same time. A lock prevents one side from reading stale, unmodified data while the other side is in the middle of modifying and committing it, keeping shared data consistent and safe. multiprocessing provides two locking primitives: Lock and Semaphore. A Semaphore additionally lets you set how many processes may hold it at the same time.
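A sketch of the Semaphore variant of the example above; the sleep is only there to make the three-at-a-time limit visible:

from multiprocessing import Process, Semaphore
import time

def test_sem(sem, n):
    sem.acquire()  # at most 3 processes get past this point at once
    print("this is ", n)
    time.sleep(1)  # hold the slot briefly so the throttling shows
    sem.release()

if __name__ == '__main__':
    sem = Semaphore(3)  # allow three concurrent holders
    for i in range(10):
        Process(target=test_sem, args=(sem, i)).start()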
Message passing
from multiprocessing import Process, Event

def wait(event):
    print("wait~~~")
    event.wait()  # block until the event flag is set
    print("event.is_set", event.is_set())

def wait_timeout(event, t):
    print("timeout!!!")
    print()
    print("event.is_set", event.is_set())
    event.set()  # set the flag to True, releasing any waiters

if __name__ == '__main__':
    event = Event()
    print(event.is_set())
    t1 = Process(target=wait, args=(event,))
    t1.start()
    t2 = Process(target=wait_timeout, args=(event, 2))
    t2.start()
    print("set event")
In multiprocessing, processes can signal each other through an Event. The is_set() method reports the current state of the flag, so a process can decide what to do next. wait() blocks: if the signal never arrives, it blocks forever. It also accepts an optional timeout argument, after which it stops waiting and returns regardless of whether the flag was set. You can comment out the event.set() call in wait_timeout to see the difference.
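A minimal sketch of the timeout form; since set() is never called here, the child unblocks after two seconds and wait() returns False:

from multiprocessing import Process, Event

def waiter(event):
    # wait at most 2 seconds; returns False if the flag was never set
    fired = event.wait(timeout=2)
    print("event fired?", fired)

if __name__ == '__main__':
    event = Event()
    p = Process(target=waiter, args=(event,))
    p.start()
    # no event.set() on purpose, so waiter gives up after the timeout
    p.join()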
Pipes
from multiprocessing import Process, Pipe

def p1(pipe):
    pipe.send('pipe1')
    print("pipe1 received: %s" % pipe.recv())
    pipe.send("who are you")
    print("pipe1 received: %s" % pipe.recv())

def p2(pipe):
    pipe.send('pipe2')
    print("pipe2 received: %s" % pipe.recv())
    pipe.send("this is a bad problem")
    print("pipe2 received: %s" % pipe.recv())

if __name__ == '__main__':
    pipe = Pipe()
    # the first connection object goes to the first process
    process1 = Process(target=p1, args=(pipe[0],))
    # the second connection object goes to the second process
    process2 = Process(target=p2, args=(pipe[1],))
    process1.start()
    process2.start()
    process1.join()
    process2.join()
A pipe can be thought of as a message channel between two processes. Pipe() creates a two-way pipe by default; passing duplex=False makes it one-way. Each end of the pipe should be used by only one process, otherwise errors can occur. send() sends a message and recv() receives one.
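A sketch of the one-way form; with duplex=False the first connection Pipe() returns is receive-only and the second is send-only:

from multiprocessing import Process, Pipe

def sender(conn):
    conn.send("one-way message")
    conn.close()

if __name__ == '__main__':
    # duplex=False makes the pipe one-way
    recv_end, send_end = Pipe(duplex=False)
    p = Process(target=sender, args=(send_end,))
    p.start()
    print(recv_end.recv())
    p.join()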
Multiprocess shared variables
from multiprocessing import Process, Value, Array

def func(n, a):
    n.value = 200  # modify the shared value
    for i in range(len(a)):
        a[i] = -a[i]  # negate every element of the shared array

if __name__ == '__main__':
    num = Value('d', 0.0)
    print(num.value)
    arr = Array('i', range(10))
    print(arr[:])
    p = Process(target=func, args=(num, arr))
    p.start()
    p.join()
    print(num.value)
    print(arr[:])
In multiprocessing, Value and Array create shared variables: Value allocates shared memory for a single value and Array for an array of values. The first argument is a typecode describing the type to create ('d' for a double, 'i' for a signed int). Variables inside a process are normally independent, but num and arr here live in shared memory created by the main process, so the child process can modify them through Value and Array.
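One caveat worth a sketch: a bare += on a shared Value is not atomic, but the wrapper Value returns carries its own lock, reachable via get_lock():

from multiprocessing import Process, Value

def add(counter):
    for _ in range(1000):
        with counter.get_lock():  # makes the read-modify-write atomic
            counter.value += 1

if __name__ == '__main__':
    counter = Value('i', 0)  # 'i' is the typecode for a signed int
    workers = [Process(target=add, args=(counter,)) for _ in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    print(counter.value)  # 4000, since each increment held the lock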