When I was working on CMU Sphinx, I was a more aggressive young guy who loved to start many projects (I still do). So I started many projects, and not many of them were completed. I wasn't completely insane: what we lacked at that point of development was passion and momentum, so working on many things gave a sense that we were moving forward.

One of those projects, and one I feel I should be responsible for, is the Hieroglyph. It was meant to be a complete set of documentation on how several Sphinx components work together. But when I finished the 3rd draft, my startup work kicked in. That's why what you can see is only an incomplete form of the document.

Fast-forward 6 years, and unfortunately the document is still the comprehensive source on Sphinx if you want to understand the underlying structure and methods of the CMU Sphinx C-based executables. The current CMU Sphinx encompasses way more than I decided to cover. For example, the Java-based Sphinx4 has gained a large following. And pocketsphinx is pretty much the de facto recognizer for embedded speech recognition.

If you have been following me (unlikely but possible), you know I have personally changed substantially. For example, my job experience taught me that Java is a very important language and that having a recognizer in Java significantly boosts the project. I also feel embedded speech recognition is probably the real future of our lives.

Back to Hieroglyph: suffice it to say, it is not yet a sufficient document. I hope that I can go back to it and ask what I can do to make it better.


  1. And, more importantly, this book is very much needed. While the requirements of speech recognition developers are different, researchers who need an in-depth understanding of the algorithms require a book that covers in depth the design of the CMUSphinx toolkit and the core decisions that shaped it.

    It would be really great to proceed with the book, either the existing variant or a new one. If the sources of the book were put in Subversion or on the wiki, updates could probably come faster.

  2. Hieroglyph was a rich treasure of knowledge. It would be helpful for many if work on it were resumed. With its sources available, many interested people from the community will, with your guidance, come forward to contribute.

