In theory, everything in this article also applies to the Pool in multiprocessing.dummy.
multiprocessing in Python 2.x provides a function-based process pool (multiprocessing.Pool). Pressing Ctrl+C does not stop all of its processes and exit; instead, you have to suspend the job with Ctrl+Z and then hunt down and kill the leftover child processes by hand. First, look at a piece of code on which Ctrl+C has no effect:
#!/usr/bin/env python
import multiprocessing
import os
import time

def do_work(x):
    print 'Work Started: %s' % os.getpid()
    time.sleep(10)
    return x * x

def main():
    pool = multiprocessing.Pool(4)
    try:
        result = pool.map_async(do_work, range(8))
        pool.close()
        pool.join()    # ^C does not cleanly interrupt this join
        print result
    except KeyboardInterrupt:
        print 'parent received control-c'
        pool.terminate()
        pool.join()

if __name__ == "__main__":
    main()
Once this code is running, pressing ^C kills nothing. You end up with five processes (1 + 4), counting the main process; killing the main process makes them all exit. Clearly, KeyboardInterrupt is not caught when a process pool is in use. There are two workarounds.
Option 1
The following snippet comes from multiprocessing/pool.py in the Python source code. ApplyResult is the class that Pool uses to store the result of a function call.
class ApplyResult(object):

    def __init__(self, cache, callback):
        self._cond = threading.Condition(threading.Lock())
        self._job = job_counter.next()
        self._cache = cache
        self._ready = False
        self._callback = callback
        cache[self._job] = self
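The companion wait() and get() methods of the same class (shown roughly as they appear in Python 2.7's pool.py) are where this condition variable is actually waited on; note that with no timeout argument the wait is untimed:

    def wait(self, timeout=None):
        self._cond.acquire()
        try:
            if not self._ready:
                self._cond.wait(timeout)
        finally:
            self._cond.release()

    def get(self, timeout=None):
        self.wait(timeout)
        if not self._ready:
            raise TimeoutError
        if self._success:
            return self._value
        else:
            raise self._value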
The following code is likewise immune to ^C:
if __name__ == '__main__':
    import threading
    cond = threading.Condition(threading.Lock())
    cond.acquire()
    cond.wait()
    print "done"
Evidently, waiting on a threading.Condition(threading.Lock()) without a timeout cannot be interrupted by KeyboardInterrupt. The fix is a slight modification: give cond.wait() a timeout, which, in the Pool case, you can supply from the outside by calling get() with a timeout after map_async.
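To see the difference directly, here is a sketch of the same hanging snippet with a timeout added (the value 10 is arbitrary); run it and ^C is delivered immediately:

if __name__ == '__main__':
    import threading
    cond = threading.Condition(threading.Lock())
    cond.acquire()
    # With a timeout, Python 2 waits in a polling loop,
    # so KeyboardInterrupt gets through.
    cond.wait(10)
    print "done"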
Applying the same idea to the Pool program, change

result = pool.map_async(do_work, range(8))

to

result = pool.map_async(do_work, range(8)).get(1)
and ^C is now received successfully. In practice, pass a large timeout such as 99999 or 0xffff to get() so that it does not expire before the work finishes.
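Putting it together, here is a minimal sketch of the first program with this fix applied (the 9999-second timeout is an arbitrary large value):

#!/usr/bin/env python
import multiprocessing
import os
import time

def do_work(x):
    print 'Work Started: %s' % os.getpid()
    time.sleep(10)
    return x * x

def main():
    pool = multiprocessing.Pool(4)
    try:
        # get() with a timeout waits in an interruptible loop,
        # so ^C now raises KeyboardInterrupt in the parent.
        result = pool.map_async(do_work, range(8)).get(9999)
        print result
    except KeyboardInterrupt:
        print 'parent received control-c'
        pool.terminate()
        pool.join()
    else:
        pool.close()
        pool.join()

if __name__ == "__main__":
    main()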
Option 2
The other way, of course, is to write your own process pool using queues. Here is a piece of code to give you a feel for it:
#!/usr/bin/env python
import multiprocessing, os, signal, time, Queue

def do_work():
    print 'Work Started: %d' % os.getpid()
    time.sleep(2)
    return 'Success'

def manual_function(job_queue, result_queue):
    # Each worker ignores SIGINT; only the parent handles ^C.
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    while not job_queue.empty():
        try:
            job = job_queue.get(block=False)
            result_queue.put(do_work())
        except Queue.Empty:
            pass
        #except KeyboardInterrupt: pass

def main():
    job_queue = multiprocessing.Queue()
    result_queue = multiprocessing.Queue()

    for i in range(6):
        job_queue.put(None)

    workers = []
    for i in range(3):
        tmp = multiprocessing.Process(target=manual_function,
                                      args=(job_queue, result_queue))
        tmp.start()
        workers.append(tmp)

    try:
        for worker in workers:
            worker.join()
    except KeyboardInterrupt:
        print 'parent received ctrl-c'
        for worker in workers:
            worker.terminate()
            worker.join()

    while not result_queue.empty():
        print result_queue.get(block=False)

if __name__ == "__main__":
    main()
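Two details of this design are worth noting. The workers install signal.SIG_IGN for SIGINT, so only the parent sees ^C and can terminate() the children in an orderly way. Also, empty() on a multiprocessing.Queue is only advisory (another process can race the check), which is why the non-blocking get() still handles Queue.Empty rather than trusting the while condition alone.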
A Common Mistaken Solution
I have to bring this up because I found that some answers on SegmentFault are misleading people.
The idea goes like this: when the Pool is created, pass an initializer function that makes each subprocess ignore the SIGINT signal (that is, ^C); the parent then catches KeyboardInterrupt and terminate()s the Pool. The code:
#!/usr/bin/env python
import multiprocessing
import os
import signal
import time

def init_worker():
    # Each child ignores SIGINT (^C); the parent is supposed to
    # catch KeyboardInterrupt and terminate the pool.
    signal.signal(signal.SIGINT, signal.SIG_IGN)

def run_worker(x):
    print "child: %s" % os.getpid()
    time.sleep(20)
    return x * x

def main():
    pool = multiprocessing.Pool(4, init_worker)
    try:
        results = []
        print "Starting jobs"
        for x in range(8):
            results.append(pool.apply_async(run_worker, args=(x,)))
        time.sleep(5)
        pool.close()
        pool.join()
        print [x.get() for x in results]
    except KeyboardInterrupt:
        print "Caught KeyboardInterrupt, terminating workers"
        pool.terminate()
        pool.join()

if __name__ == "__main__":
    main()
However, this code can be interrupted by Ctrl+C only while the parent is sitting in time.sleep(5): press ^C within the first five seconds and it works, but once the parent enters pool.join(), it becomes completely immune. The cause is the same as before: pool.join() blocks in untimed joins and condition waits that Python 2 cannot interrupt.
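For what it's worth, the initializer idea can be salvaged by combining it with Option 1: instead of blocking in pool.join(), wait on the results with a long get() timeout. A minimal sketch (the 99999 timeout is an arbitrary large value):

#!/usr/bin/env python
import multiprocessing
import os
import signal
import time

def init_worker():
    # Children ignore SIGINT; only the parent reacts to ^C.
    signal.signal(signal.SIGINT, signal.SIG_IGN)

def run_worker(x):
    print "child: %s" % os.getpid()
    time.sleep(20)
    return x * x

def main():
    pool = multiprocessing.Pool(4, init_worker)
    try:
        results = [pool.apply_async(run_worker, args=(x,))
                   for x in range(8)]
        # Interruptible wait: get() with a timeout instead of pool.join().
        print [r.get(99999) for r in results]
        pool.close()
        pool.join()
    except KeyboardInterrupt:
        print "Caught KeyboardInterrupt, terminating workers"
        pool.terminate()
        pool.join()

if __name__ == "__main__":
    main()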
Recommendation
First, confirm that multiple processes are really needed. For IO-bound programs, prefer multithreading or coroutines; reserve multiprocessing for CPU-bound computation. If you do need processes, consider the concurrent.futures package from Python 3 (a backport is installable on Python 2.x). It lets you write simpler, easier-to-use multithreaded/multiprocess code, somewhat like Java's concurrent framework.
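For reference, a minimal sketch of the concurrent.futures style (assuming the futures backport installed on Python 2.x via pip install futures; the worker function and counts are made up for illustration):

#!/usr/bin/env python
# Requires `pip install futures` on Python 2.x;
# on Python 3, concurrent.futures is in the standard library.
from concurrent.futures import ProcessPoolExecutor
import os
import time

def do_work(x):
    print 'Work Started: %s' % os.getpid()
    time.sleep(2)
    return x * x

def main():
    executor = ProcessPoolExecutor(max_workers=4)
    try:
        # map() yields results in input order.
        print list(executor.map(do_work, range(8)))
    finally:
        executor.shutdown()

if __name__ == '__main__':
    main()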