Categories
cmu sphinx multiprocessing Python

Python multiprocessing

As my readers may noticed, I haven’t updated this blog as I have pretty heavy workload. It doesn’t help that I was sick in the middle of March as well. Excuses aside though, I am happy to come back. If I couldn’t write much about Sphinx and programming, I think it’s still worth it to keep posting links.

I also come up with requests on writing more details on individual parts of Sphinx.   I love these requests so feel free to send me more.   Of course, it usually takes me some time to fully grok a certain part of Sphinx and I could describe it in an approachable way.   So before that, I could only ask for your patience.

Recently I come up with parallel processing a lot and was intrigued on how it works in the practice. In python, a natural choice is to use the library multiprocessing. So here is a simple example on how you can run multiple processes in python. It would be very useful in the modern days CPUs which has multi-cores.

Here is an example program on how that could be done:

1:  import multiprocessing  
2: import subprocess
3: jobs = []
4: for i in range (N):
5: p = multiprocessing.Process(target=process,
6: name = 'TASK' + str(i),
7: args=(i, ......
8: )
9: )
10: jobs.append(p)
11: p.start()
12: for j in jobs:
13: if j.is_alive():
14: print 'Waiting for job %s' %(j.name)
15: j.join()

The program is fairly trivial. Interesting enough, it is also quite similar to the multithreading version in python. Line 5 to 11 is where you run your task and I just wait for the tasks finished from Line 12 to 15.

It feels little bit less elegant than using Pool because it provides a waiting mechanism for the entire pool of task.  Right now, I am essentially waiting for job which is still running by the time job 1 is finished.

Is it worthwhile to go another path which is thread-based programming.  One thing I learned in this exercise is that older version of python, multi-threaded program can be paradoxically slower than the single-threaded one. (See this link from Eli Bendersky.) It could be an easier being resolved in recent python though.

Arthur