Ctrl+C Can't Terminate a multiprocessing Pool in Python: Solutions

Keywords: Python, Java

In theory, this article also applies to multiprocessing.dummy's Pool.

multiprocessing in Python 2.x provides a process pool, Pool. Pressing Ctrl+C does not stop all of its processes and exit; instead, you have to suspend the program with Ctrl+Z, hunt down the leftover child processes, and kill them by hand. First, look at a piece of code on which Ctrl+C has no effect:

#!/usr/bin/env python
import multiprocessing
import os
import time


def do_work(x):
    print 'Work Started: %s' % os.getpid()
    time.sleep(10)
    return x * x


def main():
    pool = multiprocessing.Pool(4)
    try:
        result = pool.map_async(do_work, range(8))
        pool.close()
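        # The parent then blocks in pool.join(); in Python 2 that wait is an
        # uninterruptible Condition.wait(), so Ctrl+C never raises here.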
        pool.join()
        print result
    except KeyboardInterrupt:
        print 'parent received control-c'
        pool.terminate()
        pool.join()
 

if __name__ == "__main__":
    main()

After this code starts, pressing ^C does not kill the processes. You end up with five processes (1 + 4) in total, counting the main process; only killing the main process makes them all exit. Clearly, KeyboardInterrupt cannot be caught by the parent while it is waiting on the pool. There are two solutions.

Scheme 1

The following snippet is taken from multiprocessing/pool.py in the Python source code. ApplyResult is the class Pool uses to store the result of a function call.

class ApplyResult(object):

    def __init__(self, cache, callback):
        self._cond = threading.Condition(threading.Lock())
        self._job = job_counter.next()
        self._cache = cache
        self._ready = False
        self._callback = callback
        cache[self._job] = self
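For context, the wait() and get() methods of the same class look approximately like this in Python 2.7 (paraphrased from the source); note that whatever timeout you pass to get() ends up in cond.wait():

    def wait(self, timeout=None):
        self._cond.acquire()
        try:
            if not self._ready:
                self._cond.wait(timeout)
        finally:
            self._cond.release()

    def get(self, timeout=None):
        self.wait(timeout)
        if not self._ready:
            raise TimeoutError      # multiprocessing.TimeoutError
        if self._success:
            return self._value
        else:
            raise self._value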

The following code is also immune to ^C:

if __name__ == '__main__':
    import threading

    cond = threading.Condition(threading.Lock())
    cond.acquire()
    cond.wait()
    print "done"

Clearly, a wait on a threading.Condition(threading.Lock()) object with no timeout cannot be interrupted by KeyboardInterrupt, but the slight modification of giving cond.wait() a timeout fixes that. With Pool, the timeout can be passed through the get() called after map_async.

result = pool.map_async(do_work, range(4))

Change to

result = pool.map_async(do_work, range(4)).get(1)

Now ^C is received successfully. In practice, pass a large value such as 99999 or 0xffff to get().
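Putting it together, a patched version of the first example might look like this (a minimal sketch; 0xffff is just an arbitrarily large timeout):

#!/usr/bin/env python
import multiprocessing
import os
import time


def do_work(x):
    print 'Work Started: %s' % os.getpid()
    time.sleep(10)
    return x * x


def main():
    pool = multiprocessing.Pool(4)
    try:
        # get() with a timeout waits in an interruptible way, so
        # KeyboardInterrupt reaches the parent while the workers run.
        result = pool.map_async(do_work, range(8)).get(0xffff)
        pool.close()
        pool.join()
        print result
    except KeyboardInterrupt:
        print 'parent received control-c'
        pool.terminate()
        pool.join()


if __name__ == "__main__":
    main()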

Scheme 2

The other way, of course, is to write your own process pool with queues. Here is a piece of code to get a feel for it:

#!/usr/bin/env python
import multiprocessing
import os
import signal
import time
import Queue

def do_work():
    print 'Work Started: %d' % os.getpid()
    time.sleep(2)
    return 'Success'

def manual_function(job_queue, result_queue):
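    # Workers ignore SIGINT so that only the parent handles Ctrl+C and
    # terminates them explicitly.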
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    while not job_queue.empty():
        try:
            job = job_queue.get(block=False)
            result_queue.put(do_work())
        except Queue.Empty:
            pass
        #except KeyboardInterrupt: pass

def main():
    job_queue = multiprocessing.Queue()
    result_queue = multiprocessing.Queue()

    for i in range(6):
        job_queue.put(None)

    workers = []
    for i in range(3):
        tmp = multiprocessing.Process(target=manual_function,
                                      args=(job_queue, result_queue))
        tmp.start()
        workers.append(tmp)

    try:
        for worker in workers:
            worker.join()
    except KeyboardInterrupt:
        print 'parent received ctrl-c'
        for worker in workers:
            worker.terminate()
            worker.join()

    while not result_queue.empty():
        print result_queue.get(block=False)

if __name__ == "__main__":
    main()

A Common Wrong Solution

I have to mention that I found some people on SegmentFault being misled by the following approach.

In theory, you pass an initializer function when the Pool is created so that the child processes ignore the SIGINT signal (that is, ^C), and then the parent terminate()s the Pool when it catches the interrupt. The code:

#!/usr/bin/env python
import multiprocessing
import os
import signal
import time


def init_worker():
    signal.signal(signal.SIGINT, signal.SIG_IGN)


def run_worker(x):
    print "child: %s" % os.getpid()
    time.sleep(20)
    return x * x


def main():
    pool = multiprocessing.Pool(4, init_worker)
    try:
        results = []
        print "Starting jobs"
        for x in range(8):
            results.append(pool.apply_async(run_worker, args=(x,)))

        time.sleep(5)
        pool.close()
        pool.join()
        print [x.get() for x in results]
    except KeyboardInterrupt:
        print "Caught KeyboardInterrupt, terminating workers"
        pool.terminate()
        pool.join()


if __name__ == "__main__":
    main()

However, this code can only be interrupted by Ctrl+C while it is running time.sleep(5). Pressing ^C within the first five seconds works; once it reaches pool.join(), it stops working entirely.
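If you do want to keep the SIGINT-ignoring initializer, one workaround, which is really Scheme 1 again, is to avoid blocking in pool.join() and wait on the results through get() with a long timeout instead. A rough sketch, reusing init_worker and run_worker from the block above:

def main():
    pool = multiprocessing.Pool(4, init_worker)
    try:
        results = [pool.apply_async(run_worker, args=(x,)) for x in range(8)]
        # Waiting via get() with a large timeout keeps the parent
        # interruptible, unlike blocking in pool.join().
        print [r.get(0xffff) for r in results]
        pool.close()
        pool.join()
    except KeyboardInterrupt:
        print "Caught KeyboardInterrupt, terminating workers"
        pool.terminate()
        pool.join()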

Recommendation

First, confirm whether multiple processes are really needed: for IO-bound programs, multithreading or coroutines are recommended, and multiprocessing is best reserved for CPU-bound work. If you do have to use multiple processes, the concurrent.futures package from Python 3 (also installable on Python 2.x as the futures backport) lets you write simpler, easier-to-use multithreaded/multiprocess code, somewhat like Java's concurrent framework.
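As a rough illustration (assuming Python 3, or Python 2.x with the futures backport installed), a ProcessPoolExecutor version of a similar job looks like this:

#!/usr/bin/env python
import concurrent.futures
import os
import time


def do_work(x):
    print('Work Started: %s' % os.getpid())
    time.sleep(2)
    return x * x


def main():
    # ProcessPoolExecutor manages the worker processes; submit() returns
    # Future objects and as_completed() yields them as they finish.
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(do_work, x) for x in range(8)]
        for f in concurrent.futures.as_completed(futures):
            print(f.result())


if __name__ == '__main__':
    main()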

Reference resources

  1. http://bryceboe.com/2010/08/26/python-multiprocessing-and-keyboardinterrupt/#georges

  2. http://stackoverflow.com/questions/1408356/keyboard-interrupts-with-pythons-multiprocessing-pool#comment12678760_6191991
