Multiprocessing - multitasking and multiprocessing notes

Keywords: Python queue multiple processes processing

In the previous section we looked at multithreading; multiprocessing works along the same lines and mainly uses the multiprocessing library. The basic code flow is as follows:
1. Import the library:
import multiprocessing
2. Create and start a process:
t1 = multiprocessing.Process(target=test1)
t1.start()

import multiprocessing
import time

def test1():
    """First child process: print a marker, then sleep for a second."""
    print("----1----")
    time.sleep(1)

def test2():
    """Second child process: print a marker, then sleep for a second."""
    print("----2----")
    time.sleep(1)

def main():
    # Create one Process object per target function
    t1 = multiprocessing.Process(target=test1)
    t2 = multiprocessing.Process(target=test2)
    # start() launches the children; they run concurrently with the parent
    t1.start()
    t2.start()

if __name__ == "__main__":
    main()
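
Here main() only starts the children and returns; the non-daemonic child processes are then waited for implicitly when the interpreter exits. If main() should explicitly block until both children have finished, a join() call can be added after each start(). A minimal sketch of that variant, reusing test1 and test2 from above:

def main():
    t1 = multiprocessing.Process(target=test1)
    t2 = multiprocessing.Process(target=test2)
    t1.start()
    t2.start()
    # join() blocks the parent until the corresponding child process has exited
    t1.join()
    t2.join()
    print("Both child processes have finished")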

Processes do not share memory, so they need an explicit mechanism to communicate with each other; sockets are one option. The Queue provided by multiprocessing is another way of inter-process communication; a queue is first in, first out (FIFO).
1. Import library:
import multiprocessing
2. Create queue:
q = multiprocessing.Queue()
3. Create a process and pass a reference to the queue as an argument
p1 = multiprocessing.Process(target=download_from_web, args=(q,))
args expects a tuple, so a trailing "," is needed when passing a single argument
4. Put data into queue
q.put()
5. Fetch data from queue
q.get()
6. Check whether the queue is empty; returns True or False
q.empty()
7. Check whether the queue is full; returns True or False
q.full()
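
These operations can be tried out on their own before involving multiple processes. A minimal sketch, using an illustrative maxsize of 3 (note that empty() and full() are only approximate once several processes use the queue at the same time):

import multiprocessing

q = multiprocessing.Queue(3)   # bounded queue: full() becomes True at 3 items
print(q.empty())               # True: nothing has been put in yet
q.put(11)
q.put(22)
q.put(33)
print(q.full())                # True: the queue holds its maximum of 3 items
print(q.get())                 # 11: first in, first out
print(q.full())                # False: one slot is free again
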
A complete example with two processes follows:

import multiprocessing

def download_from_web(q):
    """Simulate downloading data and hand it to another process via the queue"""
    data = [11, 22, 33, 44]
    # Put each downloaded item into the queue
    for temp in data:
        q.put(temp)
    print("...The downloader has finished and put the data into the queue...")

def analysis_data(q):
    """Process the data taken from the queue"""
    # Take data out of the queue until it is empty
    waiting_analysis_data = list()
    while True:
        data = q.get()
        waiting_analysis_data.append(data)
        if q.empty():
            break
    # Simulate processing the data
    print(waiting_analysis_data)
    print("Data processing completed")

def main():
    # 1. Create a queue
    q = multiprocessing.Queue()
    # 2. Create the processes and pass the queue reference as an argument
    p1 = multiprocessing.Process(target=download_from_web, args=(q,))
    p2 = multiprocessing.Process(target=analysis_data, args=(q,))
    p1.start()
    p2.start()

if __name__ == '__main__':
    main()

"" "concept of process pool" ""
When the number of child processes is too large and the workload of manually creating processes is huge, you can
Use the pool method provided by the multiprocessing module
When initializing the Pool, you can set a maximum number of processes. When a new request is submitted to the Pool
, if the pool is not full, a new process will be created to execute the request
If the number of processes has reached the specified maximum, the request will wait. Until a process ends in the pool

1. Import the library
from multiprocessing.pool import Pool
2. Create the process pool, setting the maximum number of processes to 3
po = Pool(3)
3. Submit tasks to the process pool
po.apply_async(target to be called, (tuple of arguments passed to the target,))
The arguments are passed as a tuple, so a trailing "," is needed for a single argument, for example:
po.apply_async(worker, (i,))
4. Close the process pool
After closing, po no longer accepts new tasks
po.close()
5. Block until the pool has finished
po.join() waits for all child processes in po to complete before the main process continues;
it must be placed after the close() call. Without join(), the main process does not wait for
the pool: when the main process ends, the whole program ends.
po.join()
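
Putting the five steps together, here is a minimal sketch of a process pool running a simple worker() function (the function name, the pool size of 3 and the count of 10 tasks are just illustrative choices):

from multiprocessing.pool import Pool
import os
import time

def worker(msg):
    """Illustrative task: report which process handles which message"""
    print("task %s running in process %d" % (msg, os.getpid()))
    time.sleep(0.5)

def main():
    po = Pool(3)                        # at most 3 worker processes
    for i in range(10):
        po.apply_async(worker, (i,))    # submit 10 tasks; extra tasks wait for a free process
    po.close()                          # no new tasks will be submitted
    po.join()                           # block until every submitted task has finished
    print("all tasks done")

if __name__ == "__main__":
    main()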

Process pool case: copying a folder

Before the full example, here are a few notes on the os-module operations used when copying the files in a folder:

To copy a folder, first work out which folder to copy (the snippets below assume the os module has been imported with import os). The current working directory can be checked with:
print(os.getcwd())  # show the current working directory

Change the current working directory to the "web" folder
os.chdir("web")

Create a new folder "test" under the web path
os.mkdir("test")

Change the current working directory to "test"; files written afterwards with relative paths land in the test folder
os.chdir("test")

Get the names of all files in the source folder (here file_path is the path of the folder being copied) as a list
file_list = os.listdir(file_path)

Iterate over the file names in the list (each entry is a plain string)
for file_name in file_list:

Find the .py files in the folder; names that do not contain ".py" are skipped. The same approach works for other extensions, such as .txt files.
index = file_name.rfind(".py")
if index > 0:

Build the new file name by inserting [backup] in front of .py
new_name = file_name[:index] + '[backup]' + file_name[index:]

Open the original file and read the contents of the file
old_f = open(file_path + file_name, 'rb')
content = old_f.read()

Open the backup file and write the contents read from the original into it
new_f = open(new_name, 'wb')
new_f.write(content)

Close both files
old_f.close()
new_f.close()
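
Taken together, the snippets above amount to a small single-process backup script. The following is only a sketch of that flow, not the author's exact code: the function name backup_py_files is made up, with statements replace the explicit close() calls, and it assumes a "web" folder containing .py files exists in the current working directory:

import os

def backup_py_files(src_dir):
    """Copy every .py file in src_dir into a "test" subfolder, adding [backup] to the name."""
    file_list = os.listdir(src_dir)                    # all file names in the source folder
    backup_dir = os.path.join(src_dir, "test")
    if not os.path.isdir(backup_dir):
        os.mkdir(backup_dir)                           # create the backup folder once
    for file_name in file_list:
        index = file_name.rfind(".py")
        if index > 0:                                  # skip anything without ".py" in the name
            new_name = file_name[:index] + "[backup]" + file_name[index:]
            with open(os.path.join(src_dir, file_name), "rb") as old_f:
                content = old_f.read()
            with open(os.path.join(backup_dir, new_name), "wb") as new_f:
                new_f.write(content)

if __name__ == "__main__":
    backup_py_files("web")   # assumes a "web" folder in the current working directory

The process-pool version of the same idea follows.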

import os
from multiprocessing.pool import Pool
import multiprocessing


def copy_file(q, file_name, old_file_name, new_file_name):
    """Copy one file from the source folder to the destination folder"""
    # Open the original file and read its contents
    old_f = open(old_file_name + file_name, "rb")
    content = old_f.read()
    old_f.close()
    os.chdir(new_file_name)  # not strictly needed: the destination file below is opened with its full path
    # Write the contents into a file of the same name in the destination folder
    new_f = open(new_file_name + file_name, "wb")
    new_f.write(content)
    new_f.close()
    # After copying, put the file name into the queue to signal that this copy is done
    q.put(file_name)


def main():
    # Get the name of the folder the user wants to copy
    file_path = "F://python Project file/web/"
    old_name = input("Please enter the name of the folder you want to copy:")
    old_file_name = file_path + str(old_name) + "/"

    # Create the destination folder, named after the source with [Copy] appended
    try:
        new_name = old_name + "[Copy]"
        os.mkdir(new_name)
    except FileExistsError:
        # The destination folder already exists; reuse it
        pass
    new_file_name = file_path + str(new_name) + "/"
    # Get the names of all files to be copied with listdir()
    file_names = os.listdir(old_file_name)
    # Create the process pool
    po = Pool(5)
    # Create a queue. Unlike before, when pool processes communicate with the main
    # process, the queue must come from multiprocessing.Manager().Queue()
    q = multiprocessing.Manager().Queue()
    # Add one copy task per file to the process pool
    for file_name in file_names:
        po.apply_async(copy_file, args=(q, file_name, old_file_name, new_file_name))

    po.close()
    file_len = len(file_names)
    copy_complete = 0
    # po.join()
    # Show the copy progress: each worker puts a file name into the queue when it
    # finishes, and the main process takes the names out here
    while True:
        file_name = q.get()
        copy_complete += 1
        # Completed copies divided by the total number of files gives the progress percentage
        # "\r" returns to the start of the line and end="" suppresses the newline,
        # so the progress stays on a single line
        print("\rCopy completed: %0.2f %%" % (copy_complete * 100 / file_len), end="")
        if copy_complete >= file_len:
            break


if __name__ == '__main__':
    # Assumes the script is started from the parent folder that contains "web"
    os.chdir("web")
    main()
