My personal mission statement: apply human language technology and machine learning to improve everyone’s life.
Some highlights of my career:
• Architect of Voci’s speech recognition engine since 2012.
• Ex-maintainer of open source CMU Sphinx from 2004 to 2006.
• Early employee of two startups, Scanscout (#4) and Voci (#9). Scanscout was acquired by Tremor Video in 2010.
• Research staff at Raytheon BBN, Speechworks (now part of Nuance), Scanscout (now part of Tremor Video) and Voci.
• Coauthor of one of the best papers at a prestigious international conference.
Skills:
Programming: C, Perl, Python, C++, Java, with working knowledge of web/iOS programming.
Speech Recognition-Related:
- Architecture of Speech Recognition System (Decoder+Trainer), Very Fast Speech Recognition, Keyword Spotting, Speech-based Topic/Language/Emotion/Gender Classification, Robust Speech Recognition.
- Sphinx (2,3,4 and pocketsphinx), Kaldi, HTK, Julius, Speechworks (<OSR 2.0), Byblos, CMULM, SRILM, MITLM.
Deep Learning-Related:
- Speech recognition: source-code level handling of two major deep learning toolkits in speech recognition and language modeling.
- Image classification: DNN, CNN, RNN and variants; trained systems on MNIST, CIFAR-10 and CIFAR-100, including transfer learning.
- Art and deep learning: style transfer.
- Language generation: char-rnn style training.
- Administration: knowledgeable in setting up and installing multiple deep learning tools.
- Theano, TensorFlow, (Keras), Torch, neon, scikit-learn, libsvm, pandas, Dato/Turi’s GraphLab.
General Machine Learning-Related: Application of Machine Learning Algorithms (Regression, SVM, GMM), Sentiment Classification, Information Retrieval, Deep Learning.
Soft Skills: Startup, MVP building, Speech Applications, Speech Analytics, Business Use of Speech Recognition and Machine Learning (in particular, for startup scenarios), Open Source Speech Recognition.