Love the current trend that Sphinx is everywhere.
I worked on Sphinx 3 a lot. These days, it is generally regarded as an "old-style" recognizer compared to Sphinx 4 and PocketSphinx. It is also not officially supported by the SF guys.
Speech recognition coders think a little bit differently. They usually stick to a codebase they feel comfortable with. For me, it is not just a personal preference; it also reflects how much I know about a given recognizer. For example, I know quite a bit about how Sphinx 3 performs. These days, I have been trying to learn how Sphinx 4 fares as well. So far, if you ask me to choose an accurate recognizer, I will still probably choose Sphinx 3, not because its search technology is better (Sphinx 4 is way superior), but because it can easily be made to support several advanced modeling types. This seems to be how the 2010 developer meeting concluded as well.
But that is just me. In fact, I am bullish on all Sphinx recognizers. One thing I want to note is the power of Sphinx 4 in development: many projects are based on it. These days, if you want to get a job in speech recognition, knowing Sphinx 4 is probably a good ticket. That's why I am quite keen on learning more about it, so that hopefully I can write more about both recognizers.
In any case, this is a Sphinx 3 article. I will probably write more on each component. Feel free to comment.
How Sphinx3 is initialized:
-> kbcore_init (*)
-> set operation mode
-> Look for feat.params very early on.
-> s3_am_init (*)
-> misc. models init, such as mgau_init
-> dict2pid_build <- Should put into search
-> read in mdef.
-> depends on -senmgau type: .semi or .s3cont.
- -hmmdir overrides all other sub-parameters.
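The override rule in the last item can be illustrated with a small sketch. It is written in Python for brevity, not in the recognizer's own C, and it is not the actual Sphinx 3 code. The function name resolve_model_paths, the sub-parameter flags in the table, and the file names inside the model directory are all assumptions made for illustration.

```python
import posixpath

# Hypothetical mapping from a model sub-parameter to the file expected
# inside the model directory. Flags and file names are assumptions.
HMMDIR_FILES = {
    "-mdef": "mdef",
    "-mean": "means",
    "-var": "variances",
    "-mixw": "mixture_weights",
    "-tmat": "transition_matrices",
}


def resolve_model_paths(args):
    """Return a copy of args in which -hmmdir, if given, wins over
    every individually specified model sub-parameter."""
    resolved = dict(args)
    hmmdir = args.get("-hmmdir")
    if hmmdir:
        # The directory takes precedence: each sub-parameter is
        # rewritten to point inside it, whatever the user passed.
        for flag, filename in HMMDIR_FILES.items():
            resolved[flag] = posixpath.join(hmmdir, filename)
    return resolved


if __name__ == "__main__":
    opts = {"-hmmdir": "/models/am", "-mdef": "/tmp/other.mdef"}
    for flag, path in sorted(resolve_model_paths(opts).items()):
        print(flag, path)
```

Without -hmmdir, the individually given paths pass through untouched, so the precedence only applies when the directory is actually supplied.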
As I update this blog more frequently, I have noticed that more and more people are directed here. Naturally, there are many questions about my past work, for example, "Are you still answering questions in the CMUSphinx forum?", as well as general requests for certain tutorials. So I guess it is time to clarify my current position and what I plan to do in the future.
- Sphinx 4 maintenance and refactoring
- PocketSphinx maintenance
- HTKBook-like documentation, i.e. Hieroglyphs.
- Regression tests on all tools in SphinxTrain.
- In general, modernization of Sphinx software, such as using a WFST-based approach.
Programming is a strange profession. If you are a doctor, you can usually carry your knowledge and skills from one place to another, provided that you have exactly the same tools. If you are a programmer, your speed and skill are partially determined by the tools you built in-house at a particular place. For example, I am not supposed to use any tool I built when I worked at the small video-advertising start-up. Even if I could do something in 1 second back then, once I change jobs I need to start over and rebuild the tool. We are probably talking about days to rebuild the tool and weeks to refine it again.
|File||Rev.||Age||Author||Last log entry|
|CLP/||10079||23 months||dhdfu||Finally add an -F argument to use the full path in the control file as the label…|
|PocketSphinxAndroidDemo/||11117||9 months||nshmyrev||Wrapper for nbest|
|SimpleLM/||22||12 years||rickyhoughton||Initial revision|
|Speech-Recognizer-SPX/||8933||3 years||nshmyrev||Update module to recent pocketsphinx API|
|SphinxTrain/||11350||9 days||nshmyrev||Extract warped features during 000 stage if VTLN is enabled. See for detailsht…|
|archive_s3/||7289||4 years||egouvea||Fixed error message in decoder script reporting failure in bw, and made result d…|
|cmuclmtk/||11035||10 months||nshmyrev||Fixes bug in wngram2idngram and adds a test for it|
|cmudict/||11348||3 weeks||air||cleaned up documentation and code (a bit) recompiled the dict|
|gst-sphinx/||7848||4 years||dhdfu||Support changing language models at runtime (maybe)|
|htk2s3conv/||11336||6 weeks||nshmyrev||Adds warning about different number of mixtures|
|jsgfparser/||7230||4 years||dhdfu||Fix the main program to output the only public rule if no rule is specified, and…|
|logios/||11339||4 weeks||tkharris||remove duplicated code|
|misc_scripts/||10147||22 months||dhdfu||handle zero references|
|multisphinx/||10945||12 months||dhdfu||clean up better and introduce vocabulary maps|
|pocketsphinx/||11351||8 days||nshmyrev||Updated lat2dot script. I need to move it to the other location though|
|pocketsphinx-extra/||9972||2 years||dhdfu||add sc models with mixture_weights and mdef.txt files|
|scons/||5868||5 years||egouvea||updated the scons support to reflect that plugin.jar is now part of the package|
|share/||5532||6 years||egouvea||Setting dsp and dsw files to have have windows EOL regardless where it's downloa…|
|sphinx2/||8767||3 years||egouvea||Updated the sphinx-2 MS files to MS .NET, consistent with the other packages, an…|
|sphinx3/||11329||2 months||nshmyrev||Patch to solve memory issues in python module. See for detailshttps://bugzilla…|
|sphinx4/||11344||3 weeks||nshmyrev||Properly sets logger for AudioFileDataSource. Thanks to Bandele Ola.|
|sphinx_fsttools/||10791||14 months||nshmyrev||Some bit in AM to FST conversion|
|sphinxbase/||11346||3 weeks||nshmyrev||Properly select buffer size when using audioresample. Thanks to balkce See fo…|
|tools/||9009||3 years||nshmyrev||Updated to the latest release of sphinx4|
|web/||10249||21 months||nshmyrev||There is no sphinx3 development anymore|