One Python standard library per week

Technology blog: https://github.com/yongxinz/tech-blog

You are also welcome to follow my WeChat official account, AlwaysBeta, where more content is waiting for you.

In fact, multithreading is generally not recommended in Python. Unless your scenario clearly cannot use multiple processes, prefer multiple processes whenever possible. This article can be read alongside the multiprocessing article; the two share many similarities. After reading it, you should have a better understanding of concurrent programming.

GIL

Python's multithreaded code cannot take advantage of multiple cores because it is constrained by the famous global interpreter lock (GIL). For a CPU-bound task, using multiple threads under the GIL actually makes the program slower. Let's take the example of calculating Fibonacci numbers:

# coding=utf-8
import time
import threading


def profile(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        end = time.time()
        print('COST: {}'.format(end - start))
    return wrapper


def fib(n):
    if n<= 2:
        return 1
    return fib(n-1) + fib(n-2)


@profile
def nothread():
    fib(35)
    fib(35)


@profile
def hasthread():
    for i in range(2):
        t = threading.Thread(target=fib, args=(35,))
        t.start()
    main_thread = threading.current_thread()
    for t in threading.enumerate():
        if t is main_thread:
            continue
        t.join()

nothread()
hasthread()

# output
# COST: 5.05716490746
# COST: 6.75599503517

What do you make of the result? The multithreaded version is actually slower!

The GIL exists because of a design issue in Python: the Python interpreter is not thread-safe. This means a global lock must be held in order to access Python objects safely from within a thread. At any given time, only a single thread can access Python objects or call the C API. In Python 2, the interpreter releases and reacquires the lock every 100 bytecode instructions (since Python 3.2 the switch is time-based instead, every 5 ms by default), and it also releases the lock around potentially blocking I/O operations. Because of the lock, CPU-intensive code gains no performance from the threading library (but it can when it uses the multiprocessing library covered in the multiprocessing article).
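
As a small illustration of how often the interpreter offers to hand the GIL to another thread, the sketch below reads the switch interval. It assumes Python 3.2 or later, where this setting is time-based; the variable name interval is only for illustration.

```python
import sys

# Python 3.2+ only: the interval (in seconds) after which the interpreter
# asks the running thread to release the GIL so another thread can run.
interval = sys.getswitchinterval()
print('switch interval: {} s'.format(interval))
```

The value can be changed with sys.setswitchinterval(), although tuning it rarely helps CPU-bound code, since only one thread runs Python bytecode at a time either way.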

Does the GIL make the threading library a "chicken rib" (something of little value)? Of course not. In practice, many of the programs we deal with involve network communication or data input/output, such as web crawlers and text processing. In these cases, because of the limits of network conditions and I/O performance, the Python interpreter spends its time waiting for read and write calls to return, and the threading library can be used to improve concurrency.
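
The following is a minimal sketch of that I/O-bound case, not a benchmark. It assumes time.sleep() is an acceptable stand-in for a blocking network or disk call (sleep releases the GIL, as real blocking I/O does); fake_io, run_sequential and run_threaded are hypothetical names used only here.

```python
import threading
import time


def fake_io(delay):
    """Stand-in for a blocking network or disk call (hypothetical)."""
    time.sleep(delay)  # sleep() releases the GIL, like real blocking I/O


def run_sequential(n, delay):
    start = time.time()
    for _ in range(n):
        fake_io(delay)
    return time.time() - start


def run_threaded(n, delay):
    start = time.time()
    threads = [threading.Thread(target=fake_io, args=(delay,))
               for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


sequential_cost = run_sequential(5, 0.2)
threaded_cost = run_threaded(5, 0.2)
print('sequential: {:.2f}s'.format(sequential_cost))
print('threaded:   {:.2f}s'.format(threaded_cost))
```

The sequential version waits for each call in turn, while the threaded version overlaps the waits, so its total time should be close to a single delay.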

Thread object

Let's start with the simplest approach: instantiate Thread directly with a target function, then call start() to run it.

import threading


def worker():
    """thread worker function"""
    print('Worker')


threads = []
for i in range(5):
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()
    
# output
# Worker
# Worker
# Worker
# Worker
# Worker

When creating a thread, you can pass arguments to it. Arguments of any type can be used. The following example passes just one number:

import threading


def worker(num):
    """thread worker function"""
    print('Worker: %s' % num)


threads = []
for i in range(5):
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()
    
# output
# Worker: 0
# Worker: 1
# Worker: 2
# Worker: 3
# Worker: 4

There is another way to create a thread: subclass Thread and override the run() method. The code is as follows:

import threading
import logging


class MyThread(threading.Thread):

    def run(self):
        logging.debug('running')


logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

for i in range(5):
    t = MyThread()
    t.start()
    
# output
# (Thread-1  ) running
# (Thread-2  ) running
# (Thread-3  ) running
# (Thread-4  ) running
# (Thread-5  ) running

Because the args and kwargs passed to the Thread constructor are saved in private variables whose names are prefixed with "__", they cannot easily be accessed from a subclass. A custom Thread class should therefore redefine the constructor and save the values itself.

import threading
import logging


class MyThreadWithArgs(threading.Thread):

    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs=None, *, daemon=None):
        super().__init__(group=group, target=target, name=name,
                         daemon=daemon)
        self.args = args
        self.kwargs = kwargs

    def run(self):
        logging.debug('running with %s and %s',
                      self.args, self.kwargs)


logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

for i in range(5):
    t = MyThreadWithArgs(args=(i,), kwargs={'a': 'A', 'b': 'B'})
    t.start()
    
# output
# (Thread-1  ) running with (0,) and {'b': 'B', 'a': 'A'}
# (Thread-2  ) running with (1,) and {'b': 'B', 'a': 'A'}
# (Thread-3  ) running with (2,) and {'b': 'B', 'a': 'A'}
# (Thread-4  ) running with (3,) and {'b': 'B', 'a': 'A'}
# (Thread-5  ) running with (4,) and {'b': 'B', 'a': 'A'}

Determine current thread

Each thread has a name. You can keep the default name or specify one when creating the thread.

import threading
import time


def worker():
    print(threading.current_thread().getName(), 'Starting')
    time.sleep(0.2)
    print(threading.current_thread().getName(), 'Exiting')


def my_service():
    print(threading.current_thread().getName(), 'Starting')
    time.sleep(0.3)
    print(threading.current_thread().getName(), 'Exiting')


t = threading.Thread(name='my_service', target=my_service)
w = threading.Thread(name='worker', target=worker)
w2 = threading.Thread(target=worker)  # use default name

w.start()
w2.start()
t.start()

# output
# worker Starting
# Thread-1 Starting
# my_service Starting
# worker Exiting
# Thread-1 Exiting
# my_service Exiting

Daemon thread

By default, the main program does not exit until all child threads exit. Sometimes it's useful to start a background thread without preventing the main program from exiting, such as generating a "heartbeat" task for a monitoring tool.

To mark a thread as a daemon, pass daemon=True at creation time or call setDaemon(True) before starting it. By default, a thread is not a daemon.

import threading
import time
import logging


def daemon():
    logging.debug('Starting')
    time.sleep(0.2)
    logging.debug('Exiting')


def non_daemon():
    logging.debug('Starting')
    logging.debug('Exiting')


logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

d = threading.Thread(name='daemon', target=daemon, daemon=True)

t = threading.Thread(name='non-daemon', target=non_daemon)

d.start()
t.start()

# output
# (daemon    ) Starting
# (non-daemon) Starting
# (non-daemon) Exiting

The output does not contain the daemon thread's Exiting message, because all the other threads, including the main program, exit before the daemon thread wakes up from sleep().

If you want to wait for a daemon thread to finish its work, use the join() method.

import threading
import time
import logging


def daemon():
    logging.debug('Starting')
    time.sleep(0.2)
    logging.debug('Exiting')


def non_daemon():
    logging.debug('Starting')
    logging.debug('Exiting')


logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

d = threading.Thread(name='daemon', target=daemon, daemon=True)

t = threading.Thread(name='non-daemon', target=non_daemon)

d.start()
t.start()

d.join()
t.join()

# output
# (daemon    ) Starting
# (non-daemon) Starting
# (non-daemon) Exiting
# (daemon    ) Exiting

The output now includes the daemon thread's Exiting message.

By default, join() blocks indefinitely. You can also pass a floating-point value representing the number of seconds to wait for the thread to become inactive. If the thread does not complete within the timeout, join() returns anyway.

import threading
import time
import logging


def daemon():
    logging.debug('Starting')
    time.sleep(0.2)
    logging.debug('Exiting')


def non_daemon():
    logging.debug('Starting')
    logging.debug('Exiting')


logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

d = threading.Thread(name='daemon', target=daemon, daemon=True)

t = threading.Thread(name='non-daemon', target=non_daemon)

d.start()
t.start()

d.join(0.1)
print('d.isAlive()', d.is_alive())  # isAlive() was removed in Python 3.9
t.join()

# output
# (daemon    ) Starting
# (non-daemon) Starting
# (non-daemon) Exiting
# d.isAlive() True

Because the timeout passed to join() is shorter than the time the daemon thread sleeps, the thread is still alive after join() returns.

Enumerate all threads

The threading.enumerate() function returns a list of active Thread instances. The list includes the current thread, and joining the current thread would cause a deadlock, so it must be skipped.

import random
import threading
import time
import logging


def worker():
    """thread worker function"""
    pause = random.randint(1, 5) / 10
    logging.debug('sleeping %0.2f', pause)
    time.sleep(pause)
    logging.debug('ending')


logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

for i in range(3):
    t = threading.Thread(target=worker, daemon=True)
    t.start()

main_thread = threading.main_thread()
for t in threading.enumerate():
    if t is main_thread:
        continue
    logging.debug('joining %s', t.getName())
    t.join()
    
# output
# (Thread-1  ) sleeping 0.20
# (Thread-2  ) sleeping 0.30
# (Thread-3  ) sleeping 0.40
# (MainThread) joining Thread-1
# (Thread-1  ) ending
# (MainThread) joining Thread-3
# (Thread-2  ) ending
# (Thread-3  ) ending
# (MainThread) joining Thread-2

Timer thread

A Timer starts its work after a delay and can be cancelled at any point during that delay.

import threading
import time
import logging


def delayed():
    logging.debug('worker running')


logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

t1 = threading.Timer(0.3, delayed)
t1.setName('t1')
t2 = threading.Timer(0.3, delayed)
t2.setName('t2')

logging.debug('starting timers')
t1.start()
t2.start()

logging.debug('waiting before canceling %s', t2.getName())
time.sleep(0.2)
logging.debug('canceling %s', t2.getName())
t2.cancel()
logging.debug('done')

# output
# (MainThread) starting timers
# (MainThread) waiting before canceling t2
# (MainThread) canceling t2
# (MainThread) done
# (t1        ) worker running

The second timer in this example never runs, and the first timer appears to run after the main program has finished. Because it is not a daemon thread, it is joined implicitly when the main thread finishes.

Synchronization mechanism

Semaphore

In multithreaded programming, to prevent different threads from modifying a shared resource (such as a global variable) at the same time, the number of simultaneous accesses must be limited (usually to 1). A semaphore synchronizes threads using an internal counter. Each call to acquire() decreases the counter by 1, and each call to release() increases it by 1. When the counter is 0, a call to acquire() blocks.

import logging
import random
import threading
import time


class ActivePool:

    def __init__(self):
        super(ActivePool, self).__init__()
        self.active = []
        self.lock = threading.Lock()

    def makeActive(self, name):
        with self.lock:
            self.active.append(name)
            logging.debug('Running: %s', self.active)

    def makeInactive(self, name):
        with self.lock:
            self.active.remove(name)
            logging.debug('Running: %s', self.active)


def worker(s, pool):
    logging.debug('Waiting to join the pool')
    with s:
        name = threading.current_thread().getName()
        pool.makeActive(name)
        time.sleep(0.1)
        pool.makeInactive(name)


logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s (%(threadName)-2s) %(message)s',
)

pool = ActivePool()
s = threading.Semaphore(2)
for i in range(4):
    t = threading.Thread(
        target=worker,
        name=str(i),
        args=(s, pool),
    )
    t.start()
    
# output
# 2016-07-10 10:45:29,398 (0 ) Waiting to join the pool
# 2016-07-10 10:45:29,398 (0 ) Running: ['0']
# 2016-07-10 10:45:29,399 (1 ) Waiting to join the pool
# 2016-07-10 10:45:29,399 (1 ) Running: ['0', '1']
# 2016-07-10 10:45:29,399 (2 ) Waiting to join the pool
# 2016-07-10 10:45:29,399 (3 ) Waiting to join the pool
# 2016-07-10 10:45:29,501 (1 ) Running: ['0']
# 2016-07-10 10:45:29,501 (0 ) Running: []
# 2016-07-10 10:45:29,502 (3 ) Running: ['3']
# 2016-07-10 10:45:29,502 (2 ) Running: ['3', '2']
# 2016-07-10 10:45:29,607 (3 ) Running: ['2']
# 2016-07-10 10:45:29,608 (2 ) Running: []

In this case, the ActivePool class serves only to show that at most two threads are running at the same time.
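
The counter behaviour described in this section can also be observed without any worker threads. The sketch below is only an illustration: it uses non-blocking acquire() calls so that a counter of 0 shows up as a False return value instead of a blocked thread.

```python
import threading

s = threading.Semaphore(2)           # internal counter starts at 2

first = s.acquire(blocking=False)    # counter 2 -> 1
second = s.acquire(blocking=False)   # counter 1 -> 0
third = s.acquire(blocking=False)    # counter is 0: would block, returns False
print(first, second, third)

s.release()                          # counter 0 -> 1
fourth = s.acquire(blocking=False)   # succeeds again
print(fourth)

# output
# True True False
# True
```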

Lock

A Lock can also be called a mutex. It is equivalent to a semaphore with a counter of 1. Let's look at an example without a lock:

import time
from threading import Thread

value = 0


def getlock():
    global value
    new = value + 1
    time.sleep(0.001)  # Using sleep to give threads a chance to switch
    value = new


threads = []

for i in range(100):
    t = Thread(target=getlock)
    t.start()
    threads.append(t)

for t in threads:
    t.join()

print(value)  # 16

Without the lock, the result is far less than 100. Let's add the mutex:

import time
from threading import Thread, Lock

value = 0
lock = Lock()


def getlock():
    global value
    with lock:
        new = value + 1
        time.sleep(0.001)
        value = new

threads = []

for i in range(100):
    t = Thread(target=getlock)
    t.start()
    threads.append(t)

for t in threads:
    t.join()

print(value)  # 100

RLock

An RLock (reentrant lock) can have acquire() called multiple times by the same thread without blocking. Note, however, that release() must be called the same number of times as acquire() before the lock is actually released.

Let's first look at the use of Lock:

import threading

lock = threading.Lock()

print('First try :', lock.acquire())
print('Second try:', lock.acquire(0))

# output
# First try : True
# Second try: False

In this case, the second call passes 0 to acquire() (non-blocking mode) so that it returns immediately instead of blocking, because the lock is already held by the first call.

Let's look at RLock as an alternative.

import threading

lock = threading.RLock()

print('First try :', lock.acquire())
print('Second try:', lock.acquire(0))

# output
# First try : True
# Second try: True
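
To illustrate the rule that release() has to be called as many times as acquire(), here is a small sketch. It assumes a second thread making a non-blocking attempt is a fair way to probe whether the lock is free; try_acquire and check are hypothetical helper names.

```python
import threading

lock = threading.RLock()
results = []


def try_acquire(label):
    # Non-blocking attempt from a different thread
    got = lock.acquire(blocking=False)
    results.append((label, got))
    if got:
        lock.release()


def check(label):
    t = threading.Thread(target=try_acquire, args=(label,))
    t.start()
    t.join()


lock.acquire()
lock.acquire()            # same thread: recursion level is now 2
check('held twice')
lock.release()            # level 1: still owned by the main thread
check('released once')
lock.release()            # level 0: the lock is free
check('released twice')

print(results)

# output
# [('held twice', False), ('released once', False), ('released twice', True)]
```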

Condition

One thread waits for a particular condition, while another signals that a particular condition is met. The best example is the "producer / consumer" model:

import time
import threading

def consumer(cond):
    t = threading.current_thread()
    with cond:
        # wait() creates a lock named waiter and sets its state to locked.
        # This waiter lock is used for communication between threads.
        cond.wait()
        print('{}: Resource is available to consumer'.format(t.name))


def producer(cond):
    t = threading.current_thread()
    with cond:
        print('{}: Making resource available'.format(t.name))
        cond.notify_all()  # Release the waiter locks to wake up consumers


condition = threading.Condition()

c1 = threading.Thread(name='c1', target=consumer, args=(condition,))
c2 = threading.Thread(name='c2', target=consumer, args=(condition,))
p = threading.Thread(name='p', target=producer, args=(condition,))

c1.start()
time.sleep(1)
c2.start()
time.sleep(1)
p.start()

# output
# p: Making resource available
# c2: Resource is available to consumer
# c1: Resource is available to consumer

You can see that after the producer sends the notification, the consumer receives it.
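
The example above relies on sleep() calls so that both consumers are already waiting when the producer notifies them. A common variation, sketched here under the assumption that the shared resource is a plain list (items and received are hypothetical names), re-checks a predicate in a loop so the consumer also works if the producer happens to run first.

```python
import threading

items = []                 # shared resource (hypothetical)
received = []
cond = threading.Condition()


def consumer():
    with cond:
        # Re-check the predicate in a loop: this also covers the case
        # where the producer ran before the consumer started waiting.
        while not items:
            cond.wait()
        received.append(items.pop(0))


def producer():
    with cond:
        items.append('resource')
        cond.notify()      # wake one waiting consumer


c = threading.Thread(target=consumer)
p = threading.Thread(target=producer)
c.start()
p.start()
c.join()
p.join()
print(received)

# output
# ['resource']
```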

Event

One thread signals an event, and other threads wait for that event to be triggered. We again use the "producer/consumer" model as an example:

# coding=utf-8
import time
import threading
from random import randint


TIMEOUT = 2

def consumer(event, l):
    t = threading.current_thread()
    while True:
        event_is_set = event.wait(TIMEOUT)
        if event_is_set:
            try:
                integer = l.pop()
                print('{} popped from list by {}'.format(integer, t.name))
                event.clear()  # Reset the event state
            except IndexError:  # Tolerate an empty list at first startup
                pass


def producer(event, l):
    t = threading.current_thread()
    while True:
        integer = randint(10, 100)
        l.append(integer)
        print('{} appended to list by {}'.format(integer, t.name))
        event.set()  # Set the event
        time.sleep(1)


event = threading.Event()
l = []

threads = []

for name in ('consumer1', 'consumer2'):
    t = threading.Thread(name=name, target=consumer, args=(event, l))
    t.start()
    threads.append(t)

p = threading.Thread(name='producer1', target=producer, args=(event, l))
p.start()
threads.append(p)

for t in threads:
    t.join()
    
# output
# 77 appended to list by producer1
# 77 popped from list by consumer1
# 46 appended to list by producer1
# 46 popped from list by consumer2
# 43 appended to list by producer1
# 43 popped from list by consumer2
# 37 appended to list by producer1
# 37 popped from list by consumer2
# 33 appended to list by producer1
# 33 popped from list by consumer2
# 57 appended to list by producer1
# 57 popped from list by consumer1

You can see that the events were received and processed fairly evenly by the two consumers. If we use the wait() method, the thread waits until the event is set, which also helps ensure that the task gets completed.
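
The return value of wait() is what the consumer above branches on. The short sketch below shows it in isolation. Using a Timer to set the event from another thread is an assumption made only to keep the example small.

```python
import threading

event = threading.Event()

# Nothing has set the event yet, so wait() times out and returns False.
before = event.wait(0.1)

# Set the event from another thread after a short delay.
setter = threading.Timer(0.1, event.set)
setter.start()
after = event.wait(2)      # returns True as soon as set() is called
setter.join()

event.clear()              # reset the flag
print(before, after, event.is_set())

# output
# False True False
```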

Queue

Queues are the most commonly used tool in concurrent development. We use the "producer/consumer" model to understand them: the producer puts the "messages" it produces into the queue, and the consumer takes messages from the queue and processes them.

The four methods to focus on are:

put: adds an item to the queue.
get: removes and returns an item from the queue.
task_done: called each time a task is completed.
join: blocks until all items have been processed.

# coding=utf-8
import time
import threading
from random import random
from queue import Queue  # Python 3; the module is named Queue in Python 2

q = Queue()


def double(n):
    return n * 2


def producer():
    while True:
        wt = random()
        time.sleep(wt)
        q.put((double, wt))


def consumer():
    while True:
        task, arg = q.get()
        print(arg, task(arg))
        q.task_done()


for target in (producer, consumer):
    t = threading.Thread(target=target)
    t.start()

This is the simplest queue architecture.
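
The example above runs forever and never calls join(). As a complement, here is a minimal sketch of how task_done() and join() fit together. It assumes Python 3 (module name queue) and uses None as a stop sentinel, which is a convention of this sketch and not something the queue module requires.

```python
import threading
from queue import Queue   # Python 3 name of the Queue module

q = Queue()
results = []
STOP = None               # sentinel value (an assumption of this sketch)


def consumer():
    while True:
        item = q.get()
        if item is STOP:
            q.task_done()
            break
        results.append(item * 2)
        q.task_done()     # one task_done() per successful get()


t = threading.Thread(target=consumer)
t.start()

for n in range(5):
    q.put(n)

q.join()                  # blocks until every item has been marked done
print(results)

q.put(STOP)               # tell the consumer to exit
t.join()

# output
# [0, 2, 4, 6, 8]
```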

The Queue module (named queue in Python 3) also comes with two special queues: PriorityQueue (ordered by priority) and LifoQueue (last in, first out). Here we show how to use the thread-safe priority queue. PriorityQueue expects the data we put to have the format (priority_number, data). Let's take a look at the following example:

import time
import threading
from random import randint
from queue import PriorityQueue  # Python 3; the module is named Queue in Python 2


q = PriorityQueue()


def double(n):
    return n * 2


def producer():
    count = 0
    while True:
        if count > 5:
            break
        pri = randint(0, 100)
        print('put :{}'.format(pri))
        q.put((pri, double, pri))  # (priority, func, args)
        count += 1


def consumer():
    while True:
        if q.empty():
            break
        pri, task, arg = q.get()
        print('[PRI:{}] {} * 2 = {}'.format(pri, arg, task(arg)))
        q.task_done()
        time.sleep(0.1)


t = threading.Thread(target=producer)
t.start()
time.sleep(1)
t = threading.Thread(target=consumer)
t.start()

# output
# put :84
# put :86
# put :16
# put :93
# put :14
# put :93
# [PRI:14] 14 * 2 = 28
# [PRI:16] 16 * 2 = 32
# [PRI:84] 84 * 2 = 168
# [PRI:86] 86 * 2 = 172
# [PRI:93] 93 * 2 = 186
# [PRI:93] 93 * 2 = 186

To save space, only 6 random values are generated. You can see that the numbers are put in random order, but get() returns the highest-priority items first (a smaller number means a higher priority).
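
LifoQueue is mentioned above but not demonstrated. A minimal single-threaded sketch, assuming the Python 3 module name queue, shows its last-in, first-out order:

```python
from queue import LifoQueue   # Python 3 module name

q = LifoQueue()
for n in (1, 2, 3):
    q.put(n)

order = []
while not q.empty():          # safe here because only one thread uses q
    order.append(q.get())

print(order)

# output
# [3, 2, 1]
```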

Thread pool

In object-oriented development, creating and destroying objects is known to take time, because creating an object requires acquiring memory or other resources. Creating and destroying threads without restraint is a great waste. Can we reuse a thread that has finished its task instead of destroying it? That is like putting these threads into a pool. On the one hand, we can control the number of threads working at the same time, and on the other hand, we avoid the cost of creation and destruction.

A thread pool actually already exists in the standard library, although it is not prominently covered in the official documentation:

In : from multiprocessing.pool import ThreadPool
In : pool = ThreadPool(5)
In : pool.map(lambda x: x**2, range(5))
Out: [0, 1, 4, 9, 16]

Of course, we can also implement one ourselves:

# coding=utf-8
import time
import threading
from random import random
from queue import Queue  # Python 3; the module is named Queue in Python 2


def double(n):
    return n * 2


class Worker(threading.Thread):
    def __init__(self, queue):
        super(Worker, self).__init__()
        self._q = queue
        self.daemon = True
        self.start()
    def run(self):
        while True:
            f, args, kwargs = self._q.get()
            try:
                print('USE: {}'.format(self.name))  # Thread name
                print(f(*args, **kwargs))
            except Exception as e:
                print(e)
            self._q.task_done()


class ThreadPool(object):
    def __init__(self, num_t=5):
        self._q = Queue(num_t)
        # Create Worker Thread
        for _ in range(num_t):
            Worker(self._q)
    def add_task(self, f, *args, **kwargs):
        self._q.put((f, args, kwargs))
    def wait_complete(self):
        self._q.join()


pool = ThreadPool()
for _ in range(8):
    wt = random()
    pool.add_task(double, wt)
    time.sleep(wt)
pool.wait_complete()

# output
# USE: Thread-1
# 1.58762376489
# USE: Thread-2
# 0.0652918738849
# USE: Thread-3
# 0.997407997138
# USE: Thread-4
# 1.69333900685
# USE: Thread-5
# 0.726900613676
# USE: Thread-1
# 1.69110052253
# USE: Thread-2
# 1.89039743989
# USE: Thread-3
# 0.96281118122

The thread pool keeps 5 threads available to work at the same time. We have 8 tasks to complete, and we can see that the threads are reused in order.
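
As an alternative to both approaches above, and not something this article's own pool is built on, Python 3 also ships concurrent.futures.ThreadPoolExecutor. The sketch below assumes Python 3.2 or later and reuses the double function only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def double(n):
    return n * 2


# max_workers=5 mirrors the five workers in the pool above
with ThreadPoolExecutor(max_workers=5) as executor:
    # map() returns results in the order of the inputs
    results = list(executor.map(double, range(8)))

print(results)

# output
# [0, 2, 4, 6, 8, 10, 12, 14]
```

Leaving the with block waits for all submitted tasks to finish, which plays the same role as wait_complete() in the hand-written pool.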
