My ML Portfolio

For the most part, working on machine learning means building programs that work in real life. In that regard, here are selected systems I have worked on. Notable work is bolded. Not all of my work is shown, so contact me if you have questions about specific machine learning problems; I may have worked on them before.

Automatic Speech Recognition/Speech-based Machine Learning Systems
  • Grad-school years (2000-2002)
    • My own implementation of the Viterbi algorithm, which allows exactly K frames to be skipped. It was featured in several academic papers.
    • Pronunciation learning system, also known as PLACER.
  • Speechworks (2002-2003)
    • Speechworks 6.5 Cantonese, Singaporean and Australian English model training.
  • CMU (2003-2006)
    • Maintainer of CMUSphinx, in particular Sphinx3 and SphinxTrain.
  • Scanscout (aka Tremor) (2006-2008)
    • A system internally known as "Content Analyzer", with speech recognition as one of its backend components.
  • BBN (2009-2011)
    • Unsupervised topic/dialogue classifier based on PTM segmentation. (Chosen as one of the best papers at Interspeech 2010.)
    • British English model training.
  • Voci (2012 - now)
    • Architect/Maintainer of Voci's high-speed speech recognizer, also known as V-Blaze.
    • High-performance keyword spotter.
    • Speech-based gender detector.
Statistical Machine Translation
  • Voci (2012 - now)
    • Statistical machine translator based on Moses.
Text Classification
  • Scanscout (aka Tremor) (2006-2008)
    • Controversy classifier based on text/tag.