All posts by grandjanitor

Should we go to College?

While reading James Altucher's "I Was Blind But Now I See", I came across a controversial point he makes: don't send kids to college. Before you throw stuff, hear him out; his point is sophisticated. You would think you could refute him by asking, "What about professions such as lawyer and doctor?" But Altucher counters that to become a professional, you just need to read the right books and ask the right questions of the right people. It is difficult to refute: if you want to learn programming, say, taking a university course and getting credits doesn't really help that much. Working on an open source project or doing an internship does. In speech recognition, classes may be useful, but at the end of the day, reading papers and talking with experts in the field are what really help.

So what is the point of university, then? Though I have many friends with graduate degrees and I have a master's myself, I do appreciate Altucher's point, because what he says highlights some of my doubts about the college education system. For example: Is a person really smarter after 5 years of college education? Do they learn better? Is it worth the $50,000 debt? When I look at many of my friends, most of the time the answer is no. The truth is that many who want to learn will seek out college education after they have some experience. They actively seek the knowledge they lack. On the other hand, when I look at many of my PhD friends, either they have no motivation to learn or their duties leave them no time to learn. It is a pity.

I believe learning is a lifelong pursuit and should be independent of any institution.
Arthur

January 2013 Write-up

Miraculously, I still have some momentum for this blog and I have kept to the daily posting schedule.

Here is a write-up for this month. Feel free to look at this post on how I plan to write this blog:

Some Vision of the Grand Janitor's Blog

Sphinx' Tutorials and Commentaries

SphinxTrain1.07's bw:

Commentary on SphinxTrain1.07's bw (Part I)
Commentary on SphinxTrain1.07's bw (Part II)

Part I describes the high-level layout; Part II describes half of how the state network was built.

Others:
Acoustic Score and Its Sign
Subword Units and their Occasionally Non-Trivial Meanings

Sphinx4:
Sphinx 4 from a C background : Material for Learning

News

Goldman Sachs not Liable
Aaron Swartz......

Other writings:

On Kurzweil : a perspective of an ASR practitioner

Enjoy!

Arthur

Speech-related Readings at Jan 30, 2013

Amazon acquired Ivona:

I am aware of Amazon's involvement in ASR, though which domain they are targeting remains a question.

Goldman-Dragon Trial:

I simply hope Dr. Baker gets closure on the whole thing. In fact, when you think about it, the whole L&H fallout is the reason why the ASR industry has a virtual monopoly now. So if you are interested in ASR, you should be concerned.

Arthur

Subword Units and their Occasionally Non-Trivial Meanings

While I was writing the next article on bw, I found myself forgetting the meaning of the different types of subword units (i.e. phones, biphones, diphones, triphones and such). So I decided to write a little note.

On this kind of topic, someone is likely to come up and say "X always means Y bla bla bla etc and so on." My view (and hope) is that the wording of a term should reflect what it means. So when I hear a term and can come up with multiple definitions in my head, I would say the naming convention is a little bit broken.

Phones vs Phonemes

Linguists distinguish between phoneme and phone. The former usually means an abstract categorization of a sound, whereas the latter usually means the actual realization of a sound.

In a decoder though, what you see most is the term phone.

Biphones vs Diphones

(Ref here) "..... one important difference between the two units. Biphones are understood to be just left or right context dependent (mono)phones. On the other hand, diphones represent the transition regions that stretch between the two "centres" of the subsequent phones."
So that's why there can be left-biphones and right-biphones. Diphones are intuitively a better fit for synthesis.
The number of possible left-biphones, right-biphones or diphones is N^2 in each case, where N is the number of phones.
Btw, the link I gave also has a term called "bi-diphone", which I don't think is a standard term.
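
To make the left/right-biphone distinction concrete, here is a minimal Python sketch. The toy phone set, the "SIL" boundary symbol and the "l-p" / "p+r" label notation are all my assumptions for illustration; they are not taken from the reference.

```python
from itertools import product

# Toy phone set, assumed purely for illustration.
PHONES = ["AH", "K", "T"]

def left_biphones(phones):
    """Enumerate every phone conditioned on its left neighbour, written 'l-p'."""
    return [f"{l}-{p}" for l, p in product(phones, repeat=2)]

def right_biphones(phones):
    """Enumerate every phone conditioned on its right neighbour, written 'p+r'."""
    return [f"{p}+{r}" for p, r in product(phones, repeat=2)]

def to_left_biphones(seq, boundary="SIL"):
    """Relabel a phone sequence with left context; `boundary` pads the start."""
    return [f"{l}-{p}" for l, p in zip([boundary] + seq[:-1], seq)]

def to_right_biphones(seq, boundary="SIL"):
    """Relabel a phone sequence with right context; `boundary` pads the end."""
    return [f"{p}+{r}" for p, r in zip(seq, seq[1:] + [boundary])]

if __name__ == "__main__":
    print(len(left_biphones(PHONES)), len(right_biphones(PHONES)))  # N^2 each
    print(to_left_biphones(["K", "AH", "T"]))
    print(to_right_biphones(["K", "AH", "T"]))
```

Both enumerations have N^2 entries, but the labels attached to the same utterance differ depending on which side the context comes from.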

Triphones

Most of the time, it means considering both the left and the right context. Possible combinations: N^3.

Quinphones

Most of the time, it means considering the two left and the two right contexts. Possible combinations: N^5.

Heptaphones


Most of the time, it means considering the three left and the three right contexts. Possible combinations: N^7.
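
The counts for triphones, quinphones and heptaphones all follow one pattern: one centre phone plus some number of context slots on each side. Here is a small Python sketch of that pattern. The function names, the "SIL" padding symbol and the phone set size of 40 are my assumptions for illustration, not anything a particular toolkit defines.

```python
def num_units(n_phones, left, right):
    """Upper bound on distinct units: one centre phone plus left+right context slots."""
    return n_phones ** (left + right + 1)

def context_units(seq, left, right, pad="SIL"):
    """Return (left_context, centre, right_context) for every phone in seq.

    The sequence is padded with `pad` so edge phones still get full contexts.
    """
    padded = [pad] * left + list(seq) + [pad] * right
    units = []
    for i in range(len(seq)):
        c = i + left  # index of the centre phone inside the padded list
        units.append((tuple(padded[c - left:c]),
                      padded[c],
                      tuple(padded[c + 1:c + 1 + right])))
    return units

if __name__ == "__main__":
    n = 40  # assumed phone set size
    for name, k in [("triphone", 1), ("quinphone", 2), ("heptaphone", 3)]:
        print(name, num_units(n, k, k))
    print(context_units(["K", "AH", "T"], 1, 1))
```

With 40 phones the bound goes from 64,000 triphones to over 100 million quinphones, which is one way to see why the wider contexts are rarely modelled exhaustively.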

"Quadphones" and Other possible confusions in terminology. 

I guess what I don't feel comfortable with are terms such as "quadphones". Even quinphones and heptaphones can potentially mean different things from time to time.
For example, if you look at the LID literature, you will occasionally see the term quadphone. But it seems the term "phone 4-gram" (or more correctly quadgram...... if you think too much) might be a nicer choice.
Then there is the question of what the context looks like: 2 left 1 right? 1 left 2 right? Come to think of it, this terminology is confusing even for triphones, because the term could also mean a phone that depends on 2 left or 2 right phones. ASR people probably don't feel that way because of a well-established convention. Of course, the same can be said for quinphones and heptaphones.
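
To show why a name like "quadphone" is ambiguous, here is a short Python sketch. It takes a window of 4 consecutive phones (a phone 4-gram) and reads it as a context-dependent unit in two different ways. The function names and the toy sequence are my assumptions for illustration.

```python
def phone_ngrams(seq, n):
    """Sliding windows of n consecutive phones (phone n-grams)."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def read_as_context_unit(window, left):
    """Read a window as (left context, centre, right context).

    `left` is the number of left-context slots; the rest after the
    centre phone is treated as right context.
    """
    return (window[:left], window[left], window[left + 1:])

if __name__ == "__main__":
    seq = ["S", "P", "IY", "CH"]  # toy phone sequence
    window = phone_ngrams(seq, 4)[0]
    print(read_as_context_unit(window, 2))  # 2 left, 1 right
    print(read_as_context_unit(window, 1))  # 1 left, 2 right
```

The same four phones give a unit centred on "IY" in one reading and on "P" in the other, so the name alone does not tell you which model is meant.
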
Arthur

Readings at Jan 28, 2013

Tools of the Trade : Mainly an iOS article but it has many tools on maintaining contacts, task lists and requests.
C11 : I had no idea C99 tried to introduce variable-length arrays. They certainly haven't been very successful in the past 10 years..... Another great book to check out is Ning's C book.
How to make iPhone App that actually sells : Again, probably not just for iOS but generally for writing free/shareware.
Bayesian vs Non-Bayesian: Nice post. I don't fully grok Bayesian/Non-Bayesian, but if you know better, they are essentially two schools of thought. (ASR? The whole training process starts from a flat-start, you figure.)