
Python, Subprocess: Launch New Process When One (in A Group) Has Terminated

I have n files to analyze separately and independently of each other with the same Python script analysis.py. In a wrapper script, wrapper.py, I loop over those files and call analysis.py on each of them as a subprocess.
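For context, here is a minimal sketch of the kind of wrapper loop the question describes, assuming analysis.py takes the file path as its only command-line argument (the excerpt does not show the actual call):

import subprocess
import sys

files_to_analyze = ["file0", "file1", "file2"]  # the n input files

# naive wrapper.py: runs analysis.py on each file one after another;
# the question is how to run several at once and launch a new process
# as soon as one in the running group terminates
for path in files_to_analyze:
    subprocess.run([sys.executable, "analysis.py", path], check=True)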

Solution 1:

This outlines how to use multiprocessing.Pool, which exists exactly for tasks like this:

from multiprocessing import Pool, cpu_count

# ...
all_files = ["file%d" % i for i in range(5)]


def process_file(file_name):
    # process the file
    return "finished file %s" % file_name

pool = Pool(cpu_count())

# this is a blocking call - when it's done, all files have been processed
results = pool.map(process_file, all_files)

# no more tasks can go in the pool
pool.close()

# wait for all workers to complete their task (though we used a blocking call...)
pool.join()


# ['finished file file0', 'finished file file1', ..., 'finished file file4']
print(results)
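Applied to the original problem, the function passed to pool.map() can simply launch analysis.py as a subprocess and wait for it; the pool then keeps cpu_count() analyses running and starts the next one as soon as a worker becomes free. A hedged sketch, again assuming analysis.py takes the file path as its only argument:

import subprocess
import sys
from multiprocessing import Pool, cpu_count

all_files = ["file%d" % i for i in range(5)]

def analyze(file_name):
    # run analysis.py on one file in a separate process and wait for it
    completed = subprocess.run([sys.executable, "analysis.py", file_name])
    return (file_name, completed.returncode)

if __name__ == "__main__":
    with Pool(cpu_count()) as pool:
        # as soon as one analysis finishes, the idle worker picks up the next file
        for file_name, returncode in pool.map(analyze, all_files):
            print("%s finished with exit code %d" % (file_name, returncode))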

Adding Joel's comment, which mentions a common pitfall:

Make sure that the function you pass to pool.map(), and the objects it uses, are defined at the module level. Python multiprocessing uses pickle to pass objects between processes, and pickle has issues with things like functions defined in a nested scope.

The docs for what can be pickled
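To make that pitfall concrete, here is a small illustrative demo (the function names are made up for the example): a module-level function can be pickled and sent to the workers, while a function defined inside another function makes pool.map() raise a pickling error.

from multiprocessing import Pool

def module_level(x):
    # defined at module level, so pickle can find it by name
    return x * 2

def main():
    def nested(x):
        # defined in a nested scope, so pickle cannot look it up by name
        return x * 2

    with Pool(2) as pool:
        print(pool.map(module_level, [1, 2, 3]))  # works: [2, 4, 6]

        try:
            pool.map(nested, [1, 2, 3])
        except Exception as exc:
            # e.g. "Can't pickle local object 'main.<locals>.nested'"
            print("pickling failed:", exc)

if __name__ == "__main__":
    main()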
