Synchronization in Python Threads

Keywords: Python

The Problem with Multithreaded Development

Suppose two threads, t1 and t2, each increment a shared variable num (initially 0) one million times. The final value of num should then be 2000000. However, because both threads access the variable concurrently, the following may occur:

from threading import Thread
import time

num = 0

def test1():
    global num
    for i in range(1000000):
        num += 1

    print("--test1--num=%d" % num)


def test2():
    global num
    for i in range(1000000):
        num += 1

    print("--test2--num=%d" % num)


if __name__ == '__main__':
    Thread(target=test1).start()
    Thread(target=test2).start()
    print("num = %d" % num)
"""
num = 134116
--test1--num=1032814
--test2--num=1166243
"""

The results vary from run to run, but they are rarely 2000000. The cause is that access by multiple threads to the same resource is not controlled, so the data is corrupted and the outcome of the thread operations becomes unpredictable. This phenomenon is called thread unsafety.
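The race exists because num += 1 is not a single atomic step: it compiles to several bytecode instructions (load the value, add, store it back), and the interpreter may switch threads between any two of them. The standard dis module makes this visible; a minimal sketch:

```python
import dis

num = 0

def inc():
    global num
    num += 1  # looks atomic, but is not

# Print the individual bytecode steps behind `num += 1`: the value is
# loaded, incremented, then stored back, and a thread switch can happen
# between any two of these instructions.
dis.dis(inc)

# Collect the opcode names to show there are several steps involved
ops = [ins.opname for ins in dis.get_instructions(inc)]
print(ops)
```

The exact opcodes differ between Python versions, but there is always a separate load and store, which is the window in which another thread can interleave.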

Thread Synchronization - Using Mutex Locks

If multiple threads modify the same data concurrently, unexpected results may occur, so the threads must be synchronized to keep the data correct.
Simple thread synchronization can be achieved with the Lock and RLock classes from the threading module. Both provide acquire and release methods; operations on data that only one thread may touch at a time are placed between acquire and release.
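The practical difference between the two can be sketched as follows: an RLock may be re-acquired by the thread that already holds it, while a plain Lock may not. A second blocking acquire on a Lock would deadlock, so the sketch probes with blocking=False instead:

```python
import threading

# RLock: the same thread may acquire it again without deadlocking
rlock = threading.RLock()
rlock.acquire()
reentrant_ok = rlock.acquire(blocking=False)  # True: we already hold it
rlock.release()
rlock.release()  # must release once per acquire

# Lock: a second acquire by the same thread is refused
lock = threading.Lock()
lock.acquire()
second_acquire = lock.acquire(blocking=False)  # False: not reentrant
lock.release()

print(reentrant_ok, second_acquire)  # True False
```

RLock is the right choice when a function that holds the lock may call another function that also takes the same lock.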

Use mutex to implement the above example:

from threading import Thread, Lock
import time

num = 0


def test1():
    global num
    # Acquire the lock
    mutex.acquire()
    for i in range(1000000):
        num += 1
    # Release the lock
    mutex.release()
    print("--test1--num=%d" % num)


def test2():
    global num
    mutex.acquire()
    for i in range(1000000):
        num += 1
    mutex.release()
    print("--test2--num=%d" % num)


start_time = time.time()  # start time
# Create a mutex (unlocked by default)
mutex = Lock()
p1 = Thread(target=test1)
p1.start()

# time.sleep(3)   # Uncomment this line and run again: the result changes. Why?

p2 = Thread(target=test2)
p2.start()
p1.join()
p2.join()
end_time = time.time()  # Ending time
print("num = %d" % num)

print("Running time:%fs" % (end_time - start_time))  # End time - start time

"""
//Output results:
--test1--num=1000000
--test2--num=2000000
num = 2000000
//Running time: 0.287206s
"""

Applying Synchronization: Making Multiple Threads Execute in Order

from threading import Lock, Thread
from time import sleep


class Task1(Thread):
    def run(self):
        while True:
            # acquire() blocks until the lock is available; it returns True once acquired
            if lock1.acquire():
                print("--task1--")
                sleep(0.5)
                lock2.release()


class Task2(Thread):
    def run(self):
        while True:
            if lock2.acquire():
                print("--task2--")
                sleep(0.5)
                lock3.release()


class Task3(Thread):
    def run(self):
        while True:
            if lock3.acquire():
                print("--task3--")
                sleep(0.5)
                lock1.release()

if __name__ == '__main__':    
    # Create a lock
    lock1 = Lock()
    
    # Create a lock and lock it
    lock2 = Lock()
    lock2.acquire()
    
    # Create a lock and lock it
    lock3 = Lock()
    lock3.acquire()
    
    t1 = Task1()
    t2 = Task2()
    t3 = Task3()
    
    t1.start()
    t2.start()
    t3.start()
"""
--task1--
--task2--
--task3--
--task1--
--task2--
--task3--
--task1--
--task2--
...
"""

Producer and Consumer Model

Why use the producer-consumer model

In the world of threads, a producer is a thread that produces data and a consumer is a thread that consumes it. In multithreaded development, if the producer is fast and the consumer is slow, the producer must wait for the consumer to finish before producing more data; likewise, if the consumer outpaces the producer, the consumer must wait. The producer-consumer model was introduced to solve this problem.

What is the producer-consumer model?

The producer-consumer model resolves the strong coupling between producers and consumers through a container. Producers and consumers do not communicate with each other directly; instead they communicate through a blocking queue: the producer puts finished data straight into the queue without waiting for the consumer, and the consumer takes data from the queue rather than asking the producer for it. The blocking queue acts as a buffer that balances the processing power of producers and consumers.

Python's queue module provides synchronized, thread-safe queue classes: the FIFO (first in, first out) queue Queue, the LIFO (last in, first out) queue LifoQueue, and the priority queue PriorityQueue. These queues all implement lock primitives (operations that are atomic: they either complete fully or not at all), so they can be used directly from multiple threads and can be used to synchronize threads.
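The ordering behavior of the three queue classes can be checked in a few lines (single-threaded here, just to show the order in which items come out):

```python
from queue import Queue, LifoQueue, PriorityQueue

# FIFO: items come out in the order they went in
fifo = Queue()
for x in (1, 2, 3):
    fifo.put(x)
fifo_order = [fifo.get() for _ in range(3)]
print(fifo_order)   # [1, 2, 3]

# LIFO: behaves like a stack
lifo = LifoQueue()
for x in (1, 2, 3):
    lifo.put(x)
lifo_order = [lifo.get() for _ in range(3)]
print(lifo_order)   # [3, 2, 1]

# Priority: smallest item first, regardless of insertion order
prio = PriorityQueue()
for x in (2, 3, 1):
    prio.put(x)
prio_order = [prio.get() for _ in range(3)]
print(prio_order)   # [1, 2, 3]
```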

The code to implement the producer-consumer problem with a FIFO queue is as follows:

import threading
import time
from queue import Queue


class Producer(threading.Thread):
    def run(self):
        global queue
        count = 0
        while True:
            if queue.qsize() < 1000:
                for i in range(100):
                    count += 1
                    msg = "Generating products" + str(count)
                    queue.put(msg)
                    print(msg)
            time.sleep(0.5)


class Consumer(threading.Thread):
    def run(self):
        global queue
        while True:
            if queue.qsize() > 100:
                for i in range(3):
                    msg = self.name + " consumed " + queue.get()
                    print(msg)
            time.sleep(0.5)


if __name__ == '__main__':
    queue = Queue()

    for i in range(500):
        queue.put("Initial product" + str(i))
    # Create 2 producer threads
    for i in range(2):
        p = Producer()
        p.start()
    # Create five consumption threads
    for i in range(5):
        c = Consumer()
        c.start()
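As written, both thread classes run forever. A common way to shut a consumer down cleanly is to have the producer put a sentinel value into the queue when it is done; the names SENTINEL, producer, and consumer in this sketch are illustrative:

```python
import threading
from queue import Queue

queue = Queue(maxsize=10)      # a bounded buffer
SENTINEL = None                # illustrative "no more data" marker
consumed = []

def producer(n):
    for i in range(n):
        queue.put(i)           # blocks if the queue is full
    queue.put(SENTINEL)        # tell the consumer we are done

def consumer():
    while True:
        item = queue.get()     # blocks if the queue is empty
        if item is SENTINEL:
            break
        consumed.append(item)

p = threading.Thread(target=producer, args=(5,))
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()
print(consumed)  # [0, 1, 2, 3, 4]
```

With multiple consumers, the producer would put one sentinel per consumer so that each of them eventually sees the shutdown signal.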

ThreadLocal

In a multithreaded environment, each thread has its own data. A thread should prefer its own local variables to global variables: local variables are visible only to that thread and cannot affect other threads, whereas modifying a global variable requires locking.
ThreadLocal solves the problem of passing a thread's own data between the functions that thread calls.

import threading
"""
//Although each ThreadLocal variable is a global variable, each thread can only read and write its own thread.
//Duplicates, do not disturb each other.
"""
# Create a global ThreadLocal object:
local_school = threading.local()


def process_student():
    # Get the student associated with the current thread:
    std = local_school.student
    print('Hello, %s (in %s)' % (std, threading.current_thread().name))


def process_thread(name):
    # student bound to ThreadLocal:
    local_school.student = name
    process_student()


t1 = threading.Thread(target=process_thread, args=('dongGe',), name="Thread-A")
t2 = threading.Thread(target=process_thread, args=('King Wang',), name="Thread-B")
t1.start()
t2.start()
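To see that each thread really gets its own copy, the sketch below (the worker function and results dict are illustrative) has every thread store and modify its own value through the same threading.local object:

```python
import threading

data = threading.local()       # one object, one independent copy per thread
results = {}                   # illustrative: collect what each thread saw

def worker(n):
    data.value = n             # writes only this thread's copy
    data.value += 100          # other threads never see this change
    results[threading.current_thread().name] = data.value

threads = [threading.Thread(target=worker, args=(i,), name="T%d" % i)
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # {'T0': 100, 'T1': 101, 'T2': 102} (key order may vary)
```

If data were a plain module-level variable instead of threading.local(), the three writes would race and the threads would read each other's values.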

Posted by Smruthi on Sat, 24 Aug 2019 04:02:03 -0700