[ACCEPTED] Python: join a process without blocking parent (multiprocessing)
You can set up a separate thread which does the joining. Have it listen on a queue into which you push the subprocess handles:
```python
class Joiner(Thread):
    def __init__(self, q):
        super().__init__()
        self.__q = q

    def run(self):
        while True:
            child = self.__q.get()
            if child is None:
                return
            child.join()
```
Then, instead of `p.join()`, call `joinq.put(p)`, and do a `joinq.put(None)` to signal the thread to stop. Make sure you use a FIFO queue.
In your while loop, call `multiprocessing.active_children()`:

> Return list of all live children of the current process. Calling this has the side effect of "joining" any processes which have already finished.
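For example, a polling loop that reaps children as a side effect (`work()` here is a stand-in for your real child task):

```python
import multiprocessing
import time

def work():
    time.sleep(0.2)     # placeholder for the real child task

if __name__ == "__main__":
    for _ in range(3):
        multiprocessing.Process(target=work).start()

    # Do other work and poll periodically; each call to active_children()
    # quietly joins any children that have already exited, so none of them
    # linger as zombies.
    while multiprocessing.active_children():
        time.sleep(0.05)
    print("all children reaped")
```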
Instead of trying to shoehorn `multiprocessing.Process()` into working for you, perhaps you should use a different tool, like `apply_async()` with a `multiprocessing.Pool()`:
```python
import multiprocessing
import os
import time

def main(argv):
    # parse command line args ...
    # set up variables ...
    # set up multiprocessing Pool
    pool = multiprocessing.Pool()
    try:
        # watch_path/download_path come from your command line args above
        watch_dir(watch_path, download_path, pool)
    # catch whatever kind of exception you expect to end your infinite loop
    # you can omit this try/except if you really think your script will
    # run "forever" and you're okay with zombies should it crash
    except KeyboardInterrupt:
        pool.close()
        pool.join()

def watch_dir(wDir, dDir, pool):
    # Grab the current watch directory listing
    before = dict([(f, None) for f in os.listdir(wDir)])

    # Loop FOREVER
    while True:
        # sleep for 10 secs
        time.sleep(10)

        # Grab the current dir listing
        after = dict([(f, None) for f in os.listdir(wDir)])

        # Get the list of new files
        added = [f for f in after if f not in before]
        # Get the list of deleted files
        removed = [f for f in before if f not in after]

        if added:
            # We have new files, do your stuff
            print("Added: ", ", ".join(added))

            # launch the function in a subprocess - this is NON-BLOCKING
            pool.apply_async(child, (added, wDir, dDir))

        if removed:
            # tell the user the file was deleted
            print("Removed: ", ", ".join(removed))

        # Set before to the current
        before = after

def child(filename, wDir, dDir):
    # Open filename and extract the url ...
    # Download the file to the dDir directory ...
    # Delete filename from the watch directory ...
    # simply return to "exit cleanly"
    return
```
`multiprocessing.Pool()` is a pool of worker subprocesses that you can submit "jobs" to. The `pool.apply_async()` function call causes one of the subprocesses to run your function with the arguments provided, asynchronously, and it doesn't need to be joined until your script is done with all of its work and closes the whole pool. The library manages the details for you.
I think this will serve you better than the current accepted answer for the following reasons:

1. It removes the unnecessary complexity of launching extra threads and queues just to manage subprocesses.
2. It uses library routines that are made specifically for this purpose, so you get the benefit of future library improvements.
3. IMHO, it is much more maintainable.
4. It is more flexible. If you one day decide that you want to actually see a return value from your subprocesses, you can store the return value from the `apply_async()` call (a result object) and check it whenever you want. You could store a bunch of them in a list and process them as a batch when your list gets above a certain size. You can move the creation of the pool into the `watch_dir()` function and do away with the try/except if you don't really care what happens if the "infinite" loop is interrupted. If you put some kind of break condition in the (presently) infinite loop, you can simply add `pool.join()` after the loop and everything is cleaned up.
If you don't care about when and whether the child terminates, and you just want to avoid the child ending up as a zombie process, then you can do a double-fork, so that the grandchild ends up being a child of `init`. In code:
```python
import os
from multiprocessing import Process

def child(*args):
    p = Process(target=grandchild, args=args)
    p.start()
    os._exit(0)

def grandchild(filename, wDir, dDir):
    # Open filename and extract the url ...
    # Download the file to the dDir directory ...
    # Delete filename from the watch directory ...
    # exit cleanly
    os._exit(0)
```
You can also use `daemon=True` (a daemonic process); the `process.start()` method does not block, so your parent process can continue working without waiting for its child to finish.

The only caveat is that daemonic processes are not allowed to spawn children.
```python
from multiprocessing import Process

child_process = Process(
    target=my_func,
    daemon=True,
)
child_process.start()
# Keep doing your stuff
```