For the most part, working on machine learning means building programs that work in real life. In that regard, here are selected systems I have worked on. Notable work is bolded. Not all work is shown, so contact me if you have questions about specific machine learning problems; I may have worked on them before.
Automatic Speech Recognition/Speech-based Machine Learning Systems
- Grad-school years (2000-2002)
- My own implementation of the Viterbi algorithm that allows exactly K frames to be skipped; featured in several academic papers.
- Pronunciation learning system, also known as PLACER.
- Speechworks (2002-2003)
- Speechworks 6.5 Cantonese, Singaporean and Australian English model training.
- CMU (2003-2006)
- Maintainer of CMUSphinx, in particular Sphinx3, SphinxTrain.
- Scanscout (aka Tremor) (2006-2008)
- A system internally known as “Content Analyzer”, with speech recognition as one of its backend components.
- BBN (2009-2011)
- Unsupervised topic/dialogue classifier based on PTM segmentation. (Chosen as one of the best papers at Interspeech 2010.)
- British English model training.
- Voci (2012 – now)
- Architect/Maintainer of Voci’s high-speed speech recognizer, also known as V-Blaze.
- High-performance keyword spotter.
- Speech-based gender detector.
Statistical Machine Translation
- Voci (2012 – now)
- Statistical machine translator based on Moses.
Text Classification
- Scanscout (aka Tremor) (2006 – 2008)
- Controversy classifier based on text/tag.