Category Archives: HMM

Some Resources for Graphical Models

I have been taking a break from deep learning, and I am quite into graphical models (GM) lately.   So that's why I am gathering resources of understanding various concepts of GM.

Here are some useful courses one can use.  They are not sorted/categorized, it's just useful for me to look them through later.

Courses:

Note that except Koller's class, not all of the following classes have video available.

Books:

Radev's Coursera Introduction to Natural Language Processing - A Review

As I promised earlier, I was going to review Prof.  Dragomir Radev's introductory class on natural language processing.   Few words about Prof. Radev: from his Wikipedia entry, Prof Radev is an award winning Professor, who co-found North American Computational Linguistics Olympiad (NACLO), which is the equivalent of USAMO in computational linguistics. He was also the coach of U.S. coach of International Language Olympiad 2011 and helped the team won several medals [1].    I think these are great contributions to the speech and language community.  In late 90s, when I was still in undergraduate, there was not much recognition of computational language processing as an important computation skill.    With competition in high-school or college level, there will be a generation of young minds who would aspire to build intelligent conversation agent, robust speech recognizer and versatile question and answering machine.   (Or else everyone would think Linux kernel is the only cool hack in town. 🙂 )

The Class

So how about the class?  I got to say I am more than surprised and happy with it.   I was searching for an intro NLP class, so the two natural choices was Prof. Jurafsky' and Manning' s and Prof.  Collin's Natural Language Processing.   Both classes received great praise and comments and few of my friends recommend to take both.   Unfortunately, there was no class offering recently so I could only watch the material off-line.

Then there comes the Radev's class,  it is as Prof. Radev explains: "more introductory" than Collin's class and "more focused on Linguistics and resources" than Jurafsky and Manning.   So it is good for two types of learners:

  1. Those who just started out in NLP.
  2. Those who want to gather useful resources and start projects on NLP.

I belong to both types.   My job requires me to have more comprehensive knowledge of language and speech processing.

The Syllabus and The Lectures

The class itself is a brief survey of many important topics of NLP.   There are the basics:  parsing, tagging, language modeling.  There are the advanced topics such as summarization, statistical machine translation (SMT), semantic analysis and dialogue modeling.   The lectures, except occasionally mistakes, are quite well done and filled with interesting examples.

My only criticism is perhaps the length of videos, I would hope that most videos I watch would be less than 10 minutes.    That makes it easier to rotate with my other daily tasks.

The material is not too difficult to absorb for newcomers.   For starter, advanced topic such as  SMT is not covered in too much detail mathematically.  (So no need to derive EM on IBM models.)  That I think it's quite appropriate for first time learners like me.

One more unique feature of the lectures: it fills with interesting NACLO problems.    While NACLO is more a high-school level competition, most of the problems are challenging even for experienced practitioners.  And I found them quite stimulating.

The Prerequisites and The Homework Assignments

To me, the fun part is the homework.   There were 3 of them, they focus on,

  1. Nivre's Dependency Parser,
  2. Language Modeling and POS Tagging,
  3. Word Sense Disambiguation

All homework are based on python.   If you know what you are doing, they are not that difficult to do.   For me, I spent around 12-14 hours on each.   (Those are usually weekends.) Just like Ng's Machine Learning class,   you need to match numbers with  the golden reference.   I think that's the right approach to learn any machine learning task the first time.   Blindly come up with a system and hope it works never get you anywhere.

The homework does speak about an issue of the class, i.e. you do need to know the basics of Machine Learning .  Also, if you never had any programming experience would find the homework very difficult.   This probably described many linguistic students but never take any computer science classes.  [3]    You can still "power it through" and pass.  But it can be unnecessarily hard.

So I will recommend you first take the Ng's class or perhaps the latest Machine Learning specialization from Guestrin and Fox first.   Those are the classes which would give you some basics of programming as well as basic concept of Machine Learning.

If you didn't take any machine learning class, one way to go through more difficult classes like this is to read forum messages.   There are many nice people in the course was answering various questions.   To be frank, if the forum doesn't exist, then it will take me around 3 times more time to finish all assignments.

Final Word

All-in-all, I highly recommend Prof. Radev's class to anyone who is interested in NLP.    As I mentioned though, the class does require prerequisite such as basics of programming and machine learning.   So  I would recommend any learners to first take the Ng's class before taking this one.

In any case, I want to thank Prof. Radev and all teaching staffs who prepare this wonderful course.   I also thank to many classmates who help me through the homework.

Arthur

Postscript at 2017 April

After I wrote this review, Coursera had since upgraded to the new format.  It's a pity none of the NLP classes, including Prof. Radev's survive.   To bad for NLP lovers!

There is also a seismic shift in the field of NLP toward deep learning. While deep learning does not dominate evaluations like in computer vision or speech recognition, it is perhaps the most actively researched direction right now.  So if you are curious about what's new, consider to take the latest Standford cs224n 2017 or Oxford's Deep Learning for NLP.

[1] http://www.eecs.umich.edu/eecs/about/articles/2010/Radev-Linguistics.html

[2] Week 1 Lecture 1 Introduction

[3] One anecdote:  In the forum, some students was asking why you can't just sum all data points of a class together and pour into scikit-learn's fit().    I don't blame the student because she started late and lacks of prerequisite.   She later finished all assignment and I really admire her determination.

Friday Speech-related Links

Future Windows Phone speech recognition revealed in leaked video

Whether you like Softie, they are innovative in speech recognition in these few years.  I am looking forward for their integration of DBN in many of their products.

German Language Learning Startup Babbel Buys Disrupt Finalist PlaySay To Target The U.S. Market

Not exactly in ASR but language learning has been a main stay.  Look at EnglishCentral, they have been around and kicking well.

HMM with scipy-learn

When I first learned HMM, I was always hoping to use a scripting language to train the simplest HMM.  scipy-learn is one such software.

Google Keep

Voice memo is a huge market.  But mobile continus speech recognition is a very challenging task.  Yet, with Google technology, I think it should be better than its competitor, Evernote.

Arthur

Learning vs Writing

I haven't done any serious writings for a week.  Mostly post interesting readings just to keep up the momentum.   Work is busy so I slowed down.  Another concern is what to write.   Some of the topics I have been writing such as Sphinx4 and SphinxTrain take a little bit of research to get them right.

So far I think I am on the right track.  There are not many bloggers on  speech recognition.  (Nick is an exception.)   To really increase awareness of how ASR is done in practice, blogging is a good way to go.

I also describe myself as "recovering" because there are couple of years I hadn't seriously thought about open source Sphinx.  In fact though I was working on speech related stuffs, I didn't spend too much time on mainstream ASR neither because my topic is too esoteric.

Not to say, there are many new technologies emerged in the last few years.   The major one I would say is the use of neural network in speech recognition.  It probably won't replace HMM soon but it is a mainstay for many sites already.   WFST, with more tutorial type of literature available, has become more and more popular.    In programming, Python now is a mainstay plus job-proof type of language.  The many useful toolkit such as scipy, nltk by themselves deserves book-length treatment.  Java starts to be like C++, a kind of necessary evil you need to learn.  C++ has a new standard.   Ruby is huge in the Web world and by itself is fun to learn.

All of these new technologies took me back to a kind of learning mode.   So some of my articles become longer and in more detail.   For now, they probably cater to only a small group of people in the world.   But it's okay, when you blog, you just want to build landmarks on the blogosphere.   Let people come to search for them and get benefit from it.   That's my philosophy of going on with this site.

Arthur