Python threading guide


1. Thread Foundation

1.1. Thread status

Threads have five states. The process of state transition is shown in the following figure:

[Figure: thread_stat_simple (https://images.cnblogs.com/cnblogs_com/huxi/WindowsLiveWriter/Python_11F5/thread_stat_simple_3.png)]

1.2. Thread synchronization (lock)

The advantage of multithreading is that several tasks can run at the same time (or at least appear to). However, when threads share data, that data can get out of sync. Consider this situation: every element of a list is 0; the thread "set" changes the elements to 1 from back to front, and the thread "print" reads and prints the list from front to back. If "print" runs while "set" is halfway through, the output is half 0s and half 1s: the data is out of sync. Locks were introduced to avoid this situation.

A lock has two states: locked and unlocked. Whenever a thread such as "set" wants to access the shared data, it must first acquire the lock. If another thread, such as "print", already holds the lock, "set" is suspended; this is synchronization blocking. When "print" has finished its access and released the lock, "set" continues. With this in place, printing the list outputs either all 0s or all 1s, and the embarrassing half-0, half-1 output no longer appears.
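
The "set" and "print" scenario can be sketched as follows. This is a minimal illustration, not the author's code: it is written for Python 3 (the listings in this article use Python 2 syntax), and the names `items`, `items_lock`, `set_all` and `read_all` are made up for the example.

```python
import threading

# Hypothetical shared list and lock, named for this sketch only.
items = [0] * 1000
items_lock = threading.Lock()
snapshots = []

def set_all():
    # Change every element to 1, back to front, while holding the lock.
    with items_lock:
        for i in reversed(range(len(items))):
            items[i] = 1

def read_all():
    # Copy the list under the same lock, so a copy is never half-updated.
    with items_lock:
        snapshots.append(list(items))

threads = [threading.Thread(target=set_all)]
threads += [threading.Thread(target=read_all) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each snapshot contains a single distinct value: all 0s or all 1s.
uniform = all(len(set(s)) == 1 for s in snapshots)
print(uniform)
```

Because both functions take the same lock, a reader runs either entirely before or entirely after the writer.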

The interaction between thread and lock is shown in the following figure:

[Figure: thread_lock (https://images.cnblogs.com/cnblogs_com/huxi/WindowsLiveWriter/Python_11F5/thread_lock_3.png)]

1.3. Thread communication (condition variable)

There is another awkward situation, however: the list does not exist at the start and is created by the thread "create". If "set" or "print" accesses the list before "create" has run, an exception is raised. Locks alone can solve this, but "set" and "print" would each need to poll in a loop, because they do not know when "create" will run. It is clearly better for "create" to notify "set" and "print" once it has finished. Condition variables were introduced for this purpose.

A condition variable lets threads such as "set" and "print" wait while the condition is not met (the list is None). When the condition is met (the list has been created), a notification tells "set" and "print" that the condition now holds and that it is time to get up and work; "set" and "print" then continue running.
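
The wait-and-notify idea described above might look like this in code. The sketch assumes Python 3 (the article's listings use Python 2 syntax); `shared`, `ready`, `waiter` and `creator` are hypothetical names chosen for the illustration.

```python
import threading

shared = {"alist": None}      # hypothetical holder for the list
ready = threading.Condition()
seen = []

def waiter():
    with ready:
        # wait() releases the lock while blocked and reacquires it on wake-up.
        while shared["alist"] is None:
            ready.wait()
        seen.append(len(shared["alist"]))

def creator():
    with ready:
        shared["alist"] = [0] * 10
        ready.notify_all()    # wake every thread waiting on the condition

workers = [threading.Thread(target=waiter) for _ in range(2)]
for w in workers:
    w.start()
c = threading.Thread(target=creator)
c.start()
for t in workers + [c]:
    t.join()
```

The waiters re-check the condition in a `while` loop, so the sketch works whether `creator` runs before or after they start waiting.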

The interaction between thread and condition variable is shown in the following figure:

[Figure: thread_condition_wait (https://images.cnblogs.com/cnblogs_com/huxi/WindowsLiveWriter/Python_11F5/thread_condition_wait_3.png)]

[Figure: thread_condition_notify (https://images.cnblogs.com/cnblogs_com/huxi/WindowsLiveWriter/Python_11F5/thread_condition_notify_3.png)]

1.4. State transition of thread running and blocking

Finally, look at the transitions between the running state and the blocked states.

[Figure: thread_stat (https://images.cnblogs.com/cnblogs_com/huxi/WindowsLiveWriter/Python_11F5/thread_stat_3.png)]

There are three types of blocking:
Synchronization blocking is the state of competing for a lock. A thread enters this state when it requests a lock and returns to the running state once it obtains the lock.
Waiting blocking is the state of waiting for a notification from another thread. After acquiring the condition's lock, a thread calls wait() to enter this state. Once another thread sends the notification, the thread moves to the synchronization-blocked state and competes for the condition's lock again.
Other blocking covers calls to time.sleep() and otherthread.join(), and waiting for I/O. In this state the thread does not release any lock it holds.
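
The last point, that a sleeping thread keeps its lock, can be checked with a small experiment. This is a sketch for Python 3 (an assumption; the article's listings use Python 2 syntax), with hypothetical names `sleeper` and `holding`.

```python
import threading
import time

lock = threading.Lock()
holding = threading.Event()   # signals that the worker owns the lock

def sleeper():
    with lock:
        holding.set()
        time.sleep(0.2)       # "other blocking": the lock stays held

t = threading.Thread(target=sleeper)
t.start()
holding.wait()

# Non-blocking attempt while the worker sleeps: it fails, the lock is held.
got_during_sleep = lock.acquire(blocking=False)

t.join()
# After the worker has finished, the lock is free again.
got_after_join = lock.acquire(blocking=False)
if got_after_join:
    lock.release()
```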

Tip: if you understand this material, the topics that follow will be easy. These concepts are also the same in most popular programming languages. (That is, you really do need to understand it; if you find this author's explanation lacking, read someone else's tutorial, but make sure you understand it.)

2. thread

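Python provides thread support through two standard library modules, thread and threading. thread provides low-level, raw threads and a simple lock. (The listings in this article use Python 2 syntax; in Python 3 the thread module was renamed to _thread.)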

# encoding: UTF-8
import thread
import time

# A function used to execute in a thread
def func():
    for i in range(5):
        print 'func'
        time.sleep(1)
   
    # End current thread
    # This call is equivalent to thread.exit_thread()
    thread.exit() # When func returns, the thread will also end
       
# Start a thread and the thread starts running immediately
# This call is equivalent to thread.start_new_thread()
# The first argument is the function; the second is a tuple of its arguments
thread.start_new(func, ()) # Pass an empty tuple when the function takes no arguments

# Create a lock (LockType, cannot be instantiated directly)
# This call is equivalent to thread.allocate_lock()
lock = thread.allocate()

# Determine whether the lock is locked or released
print lock.locked()

# Locks are often used to control access to shared resources
count = 0

# Acquire the lock; returns True once the lock has been obtained
# With no argument, acquire() blocks until the lock is obtained
# With a false waitflag argument it returns at once, and returns False if the lock is held
if lock.acquire():
    count += 1
   
    # Release lock
    lock.release()

# Threads started through the thread module all end when the main thread ends
time.sleep(6)

**Other methods provided by thread module:**
thread.interrupt_main(): raises a KeyboardInterrupt in the main thread; another thread can call it to interrupt the main thread.
thread.get_ident(): returns a "magic number" that identifies the current thread, often used as a dictionary key for looking up thread-specific data. The number has no meaning in itself and may be reused by a new thread after this thread ends.

thread also provides a thread-local class for managing thread-specific data, named thread._local; threading references this class.

Because thread offers few features (its threads cannot keep running after the main thread ends, and it has no condition variables), the thread module is rarely used and is not covered further here.

3. threading

The design of threading is based on Java's thread model. In Java, locks and condition variables are basic behaviors of every object (each object carries its own lock and condition variable), while in Python they are separate objects. Python's Thread provides a subset of the behavior of Java's Thread: there are no priorities and no thread groups, and threads cannot be stopped, suspended, resumed or interrupted. Those static methods of Java's Thread that Python implements are provided as module-level functions in threading.

Common methods provided by threading module:
threading.currentThread(): returns the Thread object of the current thread.
threading.enumerate(): returns a list of running threads. Running refers to threads after starting and before ending, excluding threads before starting and after termination.
threading.activeCount(): returns the number of running threads, which is the same as len(threading.enumerate()).
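
A short sketch of these module-level functions follows. It assumes Python 3, where the preferred spellings are `threading.current_thread()` and `threading.active_count()` (the camelCase names shown above are the Python 2 era spellings); the thread names are hypothetical.

```python
import threading

stop = threading.Event()

def idle():
    stop.wait()               # keep the thread alive until told to stop

workers = [threading.Thread(target=idle, name="worker-%d" % i) for i in range(3)]
not_started = threading.Thread(target=idle, name="never-started")
for w in workers:
    w.start()

# enumerate() lists started, unfinished threads (including the main thread).
names = sorted(t.name for t in threading.enumerate())
count_matches = threading.active_count() == len(threading.enumerate())

stop.set()
for w in workers:
    w.join()

# Terminated threads are no longer listed.
still_listed = any(w in threading.enumerate() for w in workers)
```

The thread that was created but never started does not appear in `names`, which matches the definition of "running" given above.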

Classes provided by threading module: Thread, Lock, RLock, Condition, [Bounded]Semaphore, Event, Timer, local.

3.1. Thread

Thread is the thread class, similar to Java's. There are two ways to use it: pass the function to run to the constructor, or subclass Thread and override run():

# encoding: UTF-8
import threading

# Method 1: pass the method to be executed as a parameter to the Thread constructor
def func():
    print 'func() passed to Thread'

t = threading.Thread(target=func)
t.start()

# Method 2: inherit from Thread and override run()
class MyThread(threading.Thread):
    def run(self):
        print 'MyThread extended from Thread'

t = MyThread()
t.start()

**Construction method:**
Thread(group=None, target=None, name=None, args=(), kwargs={})
group: thread group; not implemented yet, and the library reference says it must be None;
target: function to execute;
name: thread name;
args/kwargs: arguments to pass to the function.

**Instance method:**
isAlive(): returns whether the thread is running, that is, after it has started and before it has terminated.
get/setName(name): gets / sets the thread name.
is/setDaemon(bool): gets / sets whether the thread is a daemon thread. The initial value is inherited from the thread that created this thread. The program terminates when no non-daemon thread is still running.
start(): starts the thread.
join([timeout]): blocks the calling thread until the thread whose join() was called terminates or the optional timeout expires.

An example of using join():

# encoding: UTF-8
import threading
import time

def context(tJoin):
    print 'in threadContext.'
    tJoin.start()
   
    # tContext is blocked here until tJoin terminates.
    tJoin.join()
   
    # tContext continues once tJoin has terminated.
    print 'out threadContext.'

def join():
    print 'in threadJoin.'
    time.sleep(1)
    print 'out threadJoin.'

tJoin = threading.Thread(target=join)
tContext = threading.Thread(target=context, args=(tJoin,))

tContext.start()

Output:

in threadContext.
in threadJoin.
out threadJoin.
out threadContext.
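
The join([timeout]) form and the daemon flag listed among the instance methods can be tried with a sketch like the one below. It assumes Python 3, where the attribute `daemon` and the method `is_alive()` are the current spellings of setDaemon() and isAlive(); `wait_for_release` is a hypothetical worker function.

```python
import threading

release = threading.Event()

def wait_for_release():
    release.wait()

t = threading.Thread(target=wait_for_release, name="waiter")
t.daemon = True            # attribute form of setDaemon(True)
alive_before_start = t.is_alive()

t.start()
t.join(0.1)                # returns after the timeout; the thread still runs
alive_after_timeout = t.is_alive()

release.set()
t.join()                   # no timeout: blocks until the thread terminates
alive_after_join = t.is_alive()
```

join() gives no direct indication of whether it returned because of the timeout, which is why the sketch checks `is_alive()` afterwards.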

3.2. Lock

Lock is the lowest-level synchronization primitive available. While locked, it is not owned by any particular thread. A Lock has two states, locked and unlocked, and two basic methods.

You can think of a Lock as having a lock pool. When a thread requests the lock, the thread is placed in the pool and stays there until it obtains the lock. Threads in the pool are in the synchronization-blocked state from the state diagram.

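**Construction method:**
Lock()

**Instance method:**
acquire([blocking]): puts the thread into the synchronization-blocked state and tries to obtain the lock. In Python 2 the optional argument is a blocking flag, not a timeout; a timeout parameter was added in Python 3.2.
release(): releases the lock. The lock must be in the locked state when release() is called; otherwise an exception is raised.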

# encoding: UTF-8
import threading
import time

data = 0
lock = threading.Lock()

def func():
    global data
    print '%s acquire lock...' % threading.currentThread().getName()
   
    # acquire() blocks the thread until the lock is obtained.
    # (In Python 2 the optional argument is a blocking flag; a timeout
    # parameter exists only from Python 3.2 on.) Returns whether the lock was obtained.
    if lock.acquire():
        print '%s get the lock.' % threading.currentThread().getName()
        data += 1
        time.sleep(2)
        print '%s release lock...' % threading.currentThread().getName()
       
        # Calling release() will release the lock.
        lock.release()

t1 = threading.Thread(target=func)
t2 = threading.Thread(target=func)
t3 = threading.Thread(target=func)
t1.start()
t2.start()
t3.start() 
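
In current Python, a Lock is usually taken through a `with` statement rather than explicit acquire()/release() calls. The following is a minimal sketch for Python 3 (an assumption; the listing above is Python 2), with a hypothetical `add_many` worker.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # "with" acquires the lock and releases it even if the body raises.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

The `with` form removes the risk of forgetting release() on an early return or an exception.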

3.3. RLock

RLock (reentrant lock) is a synchronization primitive that the same thread can acquire multiple times. RLock uses the concepts of an "owning thread" and a "recursion level". While locked, an RLock is owned by a thread. The thread that owns the RLock can call acquire() again; release() must then be called the same number of times to release the lock.

You can think of an RLock as containing a lock pool and a counter with an initial value of 0. Each successful acquire()/release() call increments/decrements the counter. When the counter is 0, the lock is unlocked.

**Construction method:**
RLock()

**Instance method:**
acquire([blocking])/release(): similar to Lock.

# encoding: UTF-8
import threading
import time

rlock = threading.RLock()

def func():
    # First request lock
    print '%s acquire lock...' % threading.currentThread().getName()
    if rlock.acquire():
        print '%s get the lock.' % threading.currentThread().getName()
        time.sleep(2)
       
        # Second request lock
        print '%s acquire lock again...' % threading.currentThread().getName()
        if rlock.acquire():
            print '%s get the lock.' % threading.currentThread().getName()
            time.sleep(2)
       
        # First release lock
        print '%s release lock...' % threading.currentThread().getName()
        rlock.release()
        time.sleep(2)
       
        # Second release lock
        print '%s release lock...' % threading.currentThread().getName()
        rlock.release()

t1 = threading.Thread(target=func)
t2 = threading.Thread(target=func)
t3 = threading.Thread(target=func)
t1.start()
t2.start()
t3.start()
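
Reentrancy matters most when a function that takes the lock calls itself or another function that takes the same lock. Below is a small sketch for Python 3 (an assumption; the listing above is Python 2); `depth` and `plain` are hypothetical names.

```python
import threading

rlock = threading.RLock()
plain = threading.Lock()

def depth(n):
    # Each recursion level acquires the same RLock again from the same thread.
    with rlock:
        return 0 if n == 0 else 1 + depth(n - 1)

levels = depth(5)

# A plain Lock is not reentrant: a second non-blocking acquire from the
# same thread fails.
plain.acquire()
second_try = plain.acquire(blocking=False)
plain.release()
```

With a blocking second acquire, the plain Lock would deadlock the thread against itself, which is the case RLock is designed to avoid.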

3.4. Condition

A Condition is usually associated with a lock. When several conditions need to share one lock, pass a Lock/RLock instance to the constructor; otherwise the Condition creates an RLock instance of its own.

You can think of a Condition as having, in addition to the lock pool that comes with its lock, a waiting pool. Threads in the waiting pool are in the waiting-blocked state from the state diagram until another thread calls notify()/notifyAll(); once notified, a thread moves to the lock pool and waits to obtain the lock.

**Construction method:**
Condition([lock/rlock])

**Instance method:**
acquire(*args)/release(): call the corresponding method of the associated lock.
wait([timeout]): puts the thread into the Condition's waiting pool to wait for a notification, and releases the lock. The calling thread must hold the lock; otherwise an exception is raised.
notify(): picks one thread from the waiting pool and notifies it. The notified thread automatically calls acquire() to try to obtain the lock (it enters the lock pool); the other threads keep waiting in the pool. This method does not release the lock. The calling thread must hold the lock; otherwise an exception is raised.
notifyAll(): notifies all threads in the waiting pool; they all enter the lock pool and try to obtain the lock. This method does not release the lock. The calling thread must hold the lock; otherwise an exception is raised.

An example is the common producer / consumer model:

# encoding: UTF-8
import threading
import time

# The product
product = None
# Condition variable
con = threading.Condition()

# Producer method
def produce():
    global product
   
    if con.acquire():
        while True:
            if product is None:
                print 'produce...'
                product = 'anything'
               
                # Inform consumers that the goods have been produced
                con.notify()
           
            # Waiting for notification
            con.wait()
            time.sleep(2)

# Consumer method
def consume():
    global product
   
    if con.acquire():
        while True:
            if product is not None:
                print 'consume...'
                product = None
               
                # Inform the producer that the goods are gone
                con.notify()
           
            # Waiting for notification
            con.wait()
            time.sleep(2)

t1 = threading.Thread(target=produce)
t2 = threading.Thread(target=consume)
t2.start()
t1.start()
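
The listing above runs forever. A variant that terminates, and that re-checks its condition in a `while` loop around wait(), could look like this. It is a sketch for Python 3 (an assumption), and the one-slot `queue`, the item count `N` and the function names are hypothetical.

```python
import threading
from collections import deque

queue = deque()                      # hypothetical one-slot buffer
cond = threading.Condition()
consumed = []
N = 5

def producer():
    for i in range(N):
        with cond:
            while queue:             # wait until the slot is empty
                cond.wait()
            queue.append(i)
            cond.notify_all()

def consumer():
    for _ in range(N):
        with cond:
            while not queue:         # wait until the slot is filled
                cond.wait()
            consumed.append(queue.popleft())
            cond.notify_all()

threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Checking the condition in a loop means a thread that wakes up and finds the condition no longer true simply waits again.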

3.5. Semaphore/BoundedSemaphore

Semaphore is one of the oldest synchronization primitives in the history of computer science. It manages an internal counter that is decremented by each acquire() call and incremented by each release() call. The counter can never go below 0; when it is 0, acquire() blocks the thread in the synchronization-blocked state until another thread calls release().

Because of this, Semaphore is often used to guard resources with a limited number of concurrent users, such as a connection pool.

The only difference between BoundedSemaphore and Semaphore is that BoundedSemaphore checks, on each release() call, whether the counter would exceed its initial value. If it would, an exception is raised.

**Construction method:**
Semaphore(value=1): value is the initial value of the counter.

**Instance method:**
acquire([blocking]): requests the Semaphore. If the counter is 0, the thread is blocked in the synchronization-blocked state; otherwise the counter is decremented and the call returns immediately.
release(): releases the Semaphore and increments the counter. With BoundedSemaphore, the number of releases is also checked. release() does not check whether the calling thread ever acquired the Semaphore.

# encoding: UTF-8
import threading
import time

# The initial value of the counter is 2
semaphore = threading.Semaphore(2)

def func():
   
    # Request Semaphore, counter - 1 after success; Blocking when counter is 0
    print '%s acquire semaphore...' % threading.currentThread().getName()
    if semaphore.acquire():
       
        print '%s get semaphore' % threading.currentThread().getName()
        time.sleep(4)
       
        # Release Semaphore, counter + 1
        print '%s release semaphore' % threading.currentThread().getName()
        semaphore.release()

t1 = threading.Thread(target=func)
t2 = threading.Thread(target=func)
t3 = threading.Thread(target=func)
t4 = threading.Thread(target=func)
t1.start()
t2.start()
t3.start()
t4.start()

time.sleep(2)

# The main thread that does not get the semaphore can also call release
# If BoundedSemaphore is used, t4 releasing semaphore will throw an exception
print 'MainThread release semaphore without acquire'
semaphore.release()
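
The difference between the two classes can be seen without any threads at all. This is a sketch for Python 3 (an assumption; the listing above is Python 2), with hypothetical variable names.

```python
import threading

plain = threading.Semaphore(1)
bounded = threading.BoundedSemaphore(1)

# A plain Semaphore accepts a release without a matching acquire;
# the counter grows to 2, so two acquires now succeed.
plain.release()
first = plain.acquire(blocking=False)
second = plain.acquire(blocking=False)
third = plain.acquire(blocking=False)   # counter is 0 now, so this fails

# BoundedSemaphore refuses to go above its initial value.
try:
    bounded.release()
    over_release_error = None
except ValueError as exc:
    over_release_error = exc
```

For a resource pool, BoundedSemaphore is often the safer choice, because an unmatched release() usually indicates a bug.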

3.6. Event

Event is one of the simplest thread communication mechanisms: one thread signals the event, and other threads wait for it. An Event has an internal flag that is initially False. set() sets it to True and clear() resets it to False. wait() puts the thread into the waiting-blocked state.

Event is effectively a simplified Condition. It has no lock for the caller to acquire, so it never puts a thread into the synchronization-blocked state.

**Construction method:**
Event()

**Instance method:**
isSet(): returns True when the internal flag is True.
set(): sets the flag to True and notifies all threads in the waiting-blocked state so that they resume running.
clear(): sets the flag to False.
wait([timeout]): returns immediately if the flag is True. Otherwise the thread is put into the waiting-blocked state until another thread calls set().

# encoding: UTF-8
import threading
import time

event = threading.Event()

def func():
    # Wait for the event and enter the wait blocking state
    print '%s wait for event...' % threading.currentThread().getName()
    event.wait()
   
    # Enter the running state after receiving the event
    print '%s recv event.' % threading.currentThread().getName()

t1 = threading.Thread(target=func)
t2 = threading.Thread(target=func)
t1.start()
t2.start()

time.sleep(2)

# Send event notification
print 'MainThread set event.'
event.set()
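
The timeout form of wait() and the effect of clear() can be observed from a single thread. The sketch assumes Python 3, where `is_set()` is the current spelling of isSet() and wait() returns the value of the flag.

```python
import threading

event = threading.Event()

# wait(timeout) returns the flag: False here, because nobody called set().
timed_out_result = event.wait(0.05)

event.set()
after_set = event.wait(0.05)     # the flag is True, so this returns at once
event.clear()
after_clear = event.is_set()
```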

3.7. Timer

Timer is a subclass of Thread that calls a function after a specified interval.

**Construction method:**
Timer(interval, function, args=[], kwargs={})
interval: delay in seconds before the function is called
function: function to execute
args/kwargs: arguments to pass to the function

**Instance method:**
Timer derives from Thread and adds one instance method, cancel(), which stops the timer if it has not fired yet.

# encoding: UTF-8
import threading

def func():
    print 'hello timer!'

timer = threading.Timer(5, func)
timer.start()
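
A Timer that has been started can still be stopped with its cancel() method, as long as the interval has not elapsed. The following is a sketch for Python 3 (an assumption; the listing above is Python 2), with a hypothetical `record` callback.

```python
import threading

fired = []

def record(label):
    fired.append(label)

# This timer is cancelled before its interval elapses, so it never fires.
cancelled = threading.Timer(5, record, args=("cancelled",))
cancelled.start()
cancelled.cancel()
cancelled.join()

# This timer is left alone and fires after about 0.05 seconds.
kept = threading.Timer(0.05, record, args=("kept",))
kept.start()
kept.join()
```

Since Timer is a Thread, join() works as usual and returns once the timer has either fired or been cancelled.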

3.8. local

local is a class (with a lowercase name) that manages thread-local data. For a given local instance, a thread cannot access attributes set by other threads, and an attribute set by one thread is never overwritten by an attribute of the same name set by another thread.

local can be thought of as a dictionary that maps each thread to its own attribute dictionary. It hides the details of looking up the attribute dictionary with the current thread as the key and then looking up the value with the attribute name as the key.

# encoding: UTF-8
import threading

local = threading.local()
local.tname = 'main'

def func():
    local.tname = 'notmain'
    print local.tname

t1 = threading.Thread(target=func)
t1.start()
t1.join()

print local.tname
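
The isolation between threads also applies to reading: an attribute set in one thread does not exist in another. This is a sketch for Python 3 (an assumption; the listing above is Python 2), where `store`, `seen` and `worker` are hypothetical names.

```python
import threading

store = threading.local()
store.value = "main"
seen = {}

def worker(name):
    # The attribute set by the main thread is not visible in this thread.
    seen[name + "_before"] = hasattr(store, "value")
    store.value = name
    seen[name + "_after"] = store.value

threads = [threading.Thread(target=worker, args=("t%d" % i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both workers finish, `store.value` in the main thread is still the value the main thread set.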

Mastering Thread, Lock and Condition is enough for most situations that call for threads. In some cases local is also very useful. To close the article, these classes are used to implement the scenario from the Thread Foundation section:

# encoding: UTF-8
import threading

alist = None
condition = threading.Condition()

def doSet():
    if condition.acquire():
        while alist is None:
            condition.wait()
        for i in range(len(alist))[::-1]:
            alist[i] = 1
        condition.release()

def doPrint():
    if condition.acquire():
        while alist is None:
            condition.wait()
        for i in alist:
            print i,
        print
        condition.release()

def doCreate():
    global alist
    if condition.acquire():
        if alist is None:
            alist = [0 for i in range(10)]
            condition.notifyAll()
        condition.release()

tset = threading.Thread(target=doSet,name='tset')
tprint = threading.Thread(target=doPrint,name='tprint')
tcreate = threading.Thread(target=doCreate,name='tcreate')
tset.start()
tprint.start()
tcreate.start()

Posted by keeve on Sun, 07 Nov 2021 19:21:41 -0800