My Third Quick Impression on HODL - Interviews with Pieter Abbeel and Yuanqing Lin

My Third Quick Impression on Heroes of Deep Learning (HODL), from the course deeplearning.ai. This time on the interviews with Pieter Abbeel and Yuanqing Lin.
 
* This is my third write-up on HODL. Unlike the previous two (Hinton and Bengio), I will summarize two interviews, Pieter Abbeel's and Yuanqing Lin's, in one post because both interviews are short (<15 mins).

* Both researchers are comparatively less known than stars such as Hinton, Bengio, LeCun and Ng. But everyone knows Pieter Abbeel as an important RL researcher and lecturer, and Yuanqing Lin is the head of Baidu's Institute of Deep Learning.
 
* Gems from Pieter Abbeel:
- Is there any way to learn RL from another algorithm?
- Is there any way we can learn one game and then use that knowledge to learn another game faster?
- He used to want to be a basketball player. (More like a fun fact.)
- On learning: Having a mentor is good.
 
* Gems from Yuanqing Lin:
- Lin heads Baidu's Institute of Deep Learning. When he was at NEC, he won the first ImageNet competition.
- Lin describes a fairly impressive experimental framework based on PaddlePaddle. Based on what he describes, Lin is building a framework which allows researchers to rerun an experiment using an ID. I wonder how scalable such a framework is.
- Lin was a physics student specializing in optics.
- On learning: use an open source framework first, but also learn the basic algorithms.
 
That's what I have. Enjoy!
Arthur Chan

Some Useful Links on Neural Machine Translation

Some good resources for NNMT

Tutorial:

A bit special: Tensor2Tensor uses a novel architecture instead of a pure RNN/CNN encoder/decoder. It gives a surprisingly large gain, so it is likely to become a trend in NNMT in the future.

Important papers:

  • Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation by Cho et al. (link) - Very innovative and smart paper by Kyunghyun Cho. It also introduces the GRU.
  • Sequence to Sequence Learning with Neural Networks by Ilya Sutskever (link) - By Google's researchers, and perhaps the first to show that an NMT system is comparable to the traditional pipeline.
  • Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (link)
  • Neural Machine Translation by Jointly Learning to Align and Translate by Dzmitry Bahdanau (link) - The paper which introduced attention.
  • Neural Machine Translation by Minh-Thang Luong (link)
  • Effective Approaches to Attention-based Neural Machine Translation by Minh-Thang Luong (link) - On how to improve the attention approach using local attention.
  • Massive Exploration of Neural Machine Translation Architectures by Britz et al (link)
  • Recurrent Convolutional Neural Networks for Discourse Compositionality by Kalchbrenner and Blunsom (link)

Important Blog Posts/Web page:

Others: (Unsorted, and seems less important)

Usage in Chatbot and Summarization (again unsorted, and again perhaps less important.....)

Why Doesn't AIDL Talk About "Consciousness" More?

Here is an answer to the question (rephrased from Xyed Abz): "Isn't consciousness the only algorithm we need to build to create an artificial general intelligence like humans or animals?"

My thought:

Xyed Abz: I like your question because it is not exactly one of those "How do you build an AGI, Muahaha?"-type fluffy topics. At least you thought that "consciousness" is important in building intelligent machines.

 
But then why doesn't AIDL talk about consciousness more? Part of the reason is that the English term "consciousness" is fairly ambiguous. There are at least three definitions. The first is "wakefulness", the state in which humans are awake. It is a bit like when you have just woken up but are not yet too aware of your surroundings. Then there is "attention", in which a certain group of stimuli from the world reaches your perception. And finally there is a kind of "cognitive access": out of all the things I attend to, such as typing with my fingers, feeling the keyboard, hearing the fan noise, or hearing a car running outside, I decide to let "writing" occupy my mind.

Just a side note: this categorization is not arbitrary, nor did I come up with it. This thinking can be traced to Christof Koch and his long-time collaborator, Francis Crick (the Nobel Prize winner for the discovery of DNA's structure). Stanislas Dehaene is another representative of this school of thought. I often use this school to explain consciousness because it is the one with more experimental backing.
 
So, to your question, we should first ask what you actually mean by consciousness. If you mean a kind of "cognitive access", then yes, I do think it is one of the keys to building an intelligent machine. You may think of each deep learning machine we build as just one type of "attention" we have created, with no central binding mechanism to control them. That's what Bengio called "cognition" in his HODL interview.

Will that be enough? Of course not. As I said, if you do build a binding mechanism, you are also supposed to build the perception mechanisms around it as well. At least that's what's going on with humans.
 
Now, all this sounds very nice, so don't we have a theory already? Nope. Even Koch's and Dehaene's ideas are more like hypotheses about the brain. How does this "cognitive access" mechanism actually work? No one knows. Koch believes a brain region called the claustrum carries out such a mechanism, yet many disagree with him. And of course, even if you find such a region, it will take humans a while to reverse-engineer it. So you might have heard of "cognitive architectures", which propose different mechanisms for how the brain works.

Does it sound complicated? Yes, it is, especially since we really don't know what we are talking about. People who are super assertive about the brain usually don't know what they are talking about. That's why I would rather go party/dance/sing karaoke. But today is Saturday, so why not?
 
Hope it is helpful!

Arthur

Quick Impression on deeplearning.ai's "Heroes of Deep Learning" with Prof. Yoshua Bengio

Quick Impression on deeplearning.ai's "Heroes of Deep Learning". This time it is the interview with Prof. Yoshua Bengio. As always, don't post any copyrighted material here at the forum!

* Out of the 'Canadian Mafia', Prof. Bengio is perhaps the least known of the three. Prof. Hinton and Prof. LeCun have their own courses, and as you know they work for Google and Facebook respectively. Prof. Bengio does work for MS, but his role is more that of a consultant.

* You may know him as one of the coauthors of the book "Deep Learning". But then again, who really understands that book, especially Part III?

* Whereas Prof. Hinton strikes me as an eccentric polymath, Prof. Bengio is more of a conventional scholar. He was influenced by Hinton during his early study of AI, which at the time was mostly expert-system based.

* That may explain why everyone seems to leave his interview out, even though I found it very interesting.

* He named several of his group's contributions, and most of them are fundamental results: Glorot and Bengio (2010) on what is now widely called Xavier initialization, attention in machine translation, his early work on neural network language models, and of course the GAN from Goodfellow. All are rather technical results. But once you think about these ideas, they are about understanding rather than trying to beat the current records.

* Then he said a few things about early deep learning research which surprised me. The first is on depth. As it turns out, the benefit of depth was not so clear in the early 2000s. That's why, when I graduated from my Master's (2003), I had never heard of the revival of neural networks.

* And then there is the doubt about using ReLU, which is a staple of present-day convnets. But the reason makes so much sense: ReLU is not smooth at every point of R (it has a kink at 0). So would that cause a problem? Anyone who knows some calculus would rationally have that doubt.

* His idea on learning deep learning is also quite on point: he believes you can learn DL in 5-6 months if you have the right training, i.e. a good computer science and math education. Then you can just pick up DL by taking courses and reading proceedings from ICML.

* Finally, there is his current research on the fusion of neural networks and neuroscience. I found this part fascinating. Is backprop really used in the brain as well?
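
On the ReLU doubt mentioned above, here is a minimal numpy sketch (not from the interview) of why the kink at 0 looks worrying on paper: the one-sided slopes disagree at x = 0 and agree everywhere else. The helper names `relu` and `one_sided_slopes` are hypothetical and exist only for this illustration.

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x), applied elementwise
    return np.maximum(0.0, x)

def one_sided_slopes(f, x, h=1e-6):
    # Finite-difference slopes approaching x from the left and from the right
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right

for x in (-1.0, 0.0, 1.0):
    left, right = one_sided_slopes(relu, x)
    # Away from 0 the two slopes match; at 0 they are roughly 0 and 1
    print(f"x={x:+.1f}  left slope={left:.1f}  right slope={right:.1f}")
```

As far as I know, practical implementations simply pick one value for the derivative at exactly 0, and hitting that single point exactly is rare with floating-point inputs, which may be why the doubt did not turn out to matter much.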

That's what I have. Hope you enjoy!

Quick Impression on deeplearning.ai (After Finishing Coursework)

Following experienced guys like Arvind Nagaraj and Gautam Karmakar, I just finished all the coursework for deeplearning.ai. I haven't finished all the videos yet, but it's a good time to write another "impression" post.

* It took me about 10 days of clock time to finish all the coursework. The actual work only took me around 5-6 hours. I guess my experience speaks for many veteran members at AIDL.
* Python numpy has its quirks. But if you know R or MATLAB/Octave, you are good to go.
* The assignments of Course 1 guide you through building an NN "from scratch". Those of Course 2 guide you through implementing several useful initialization/regularization/optimization algorithms. They are quite cute: you mostly just fill in the right code in Python numpy.
* I quoted "from scratch" because you actually don't need to write your own matrix routines. So this "from scratch" is quite different from what people do when they write an NN package "from scratch in C", where you probably need to write a bit of matrix manipulation code and derive a set of formulae for your codebase. So Ng's course gives you a taste of what these programs feel like. In that regard, perhaps the next best thing is Michael Nielsen's NNDL book.
* Course 3 is quiz-only, so it is by far the easiest to finish. Just like Arvind and Gautam, I think it is the most intriguing course in the series (so far), because it gives you a lot of big-picture advice on how to improve an ML system. Some of this advice is new to me.
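
To give a flavor of what "from scratch" means in that sense, here is a minimal sketch (not taken from the course assignments) of a forward pass for a one-hidden-layer network. numpy supplies the matrix routines, so only the network logic is written by hand. All names, layer sizes and the choice of tanh/sigmoid activations are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Forward pass; X has shape (n_features, n_examples)."""
    Z1 = W1 @ X + b1      # (n_hidden, n_examples), b1 broadcasts over examples
    A1 = np.tanh(Z1)      # hidden activation
    Z2 = W2 @ A1 + b2     # (1, n_examples)
    return sigmoid(Z2)    # output probabilities in (0, 1)

# Hypothetical sizes, chosen only for the example
rng = np.random.default_rng(0)
n_features, n_hidden, n_examples = 3, 4, 5
X = rng.standard_normal((n_features, n_examples))
W1 = rng.standard_normal((n_hidden, n_features)) * 0.01
b1 = np.zeros((n_hidden, 1))
W2 = rng.standard_normal((1, n_hidden)) * 0.01
b2 = np.zeros((1, 1))

A2 = forward(X, W1, b1, W2, b2)
print(A2.shape)  # (1, 5)
```

Writing the same thing in C would mean implementing the matrix product and the broadcasting yourself, which is the extra work the "from scratch" quotes point at.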

Anyway, that's what I have. Once I watch all the videos, I will also come up with a full review. Before that, why not check out our study group "Coursera deeplearning.ai"?

Thanks,
Arthur Chan​

https://www.facebook.com/groups/DeepLearningAISpecialization/

Certificate Or Not

Many members at Coursera deeplearning.ai ask whether a Coursera certificate is useful. So I want to sum up a couple of my thoughts here:

* The most important thing is whether you learn something in the process. And there are many ways to learn. Taking a course is good because usually the course preparer would give you a summary of the field you are interested in.

* So the purpose of certification is mostly motivation, so that you can *finish* a class. Note that it is tough to *finish* a class, e.g. Coursera statistics suggest that the completion rate is ~9-13%. This number might be smaller at Coursera because it doesn't cost you much to click the enroll button. But you've got to understand that finishing a class is no small feat, and certification is a way to help you do so. (Oh, because you paid $?)

* Some also ask whether a certificate is useful for a resume. It's hard to say. For now, there is a short supply of university-trained deep learning experts. If you have a lot of non-traditional experience from Coursera and Kaggle, you do get an edge. But as time goes on, when more learners have achieved a status similar to yours, your edge will fade. So if you think of certificates as part of your resume, be ready to keep on learning.

Arthur

Tips for Completing Course 1 of deeplearning.ai

For people who got stuck in Course 1, here are some tips:

  • Most assignments are straightforward, and you can finish each within 30 mins. The key is not to overthink it. If you find yourself wanting to derive the equations, you are not reading the question carefully.
  • When in doubt, the best tool to help you is the Python print statement. Checking the size and shape of a numpy matrix always gives you insight.
  • I know a lot of reviewers claim that the exercises are supposed to teach you neural networks "from scratch". So .... it depends on what you mean. Ng's assignments have bells and whistles built in for you. You are not really doing these out of nothing. If you wrote everything in C with no reference, then yeah, it would be much harder. But that's not Ng's exercise. Once again, this goes back to the point of the assignments being straightforward. No need to overthink them.
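
The print tip above can be sketched as follows. This is a minimal illustration, not code from the assignment, and the array names are hypothetical. It shows a classic numpy quirk: a rank-1 array and an explicit column vector behave differently, and printing the shapes makes that visible right away.

```python
import numpy as np

a = np.arange(3.0)       # rank-1 array
col = a.reshape(3, 1)    # explicit column vector

print(a.shape)                    # (3,)
print(col.shape)                  # (3, 1)
print((a + col).shape)            # (3, 3): broadcasting, probably not what you wanted
print(np.dot(col.T, col).shape)   # (1, 1): a proper inner product
```

If a shape is not what the assignment text says it should be, the bug is usually on the line just before the print.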

Hope this helps!

Arthur Chan

Quick Impression on deeplearning.ai Heroes of Deep Learning - Geoffrey Hinton

So I was going through deeplearning.ai. You know we started a new FB group on it? We haven't made it public yet, but yes, we are very excited.
 
Now one thing you might notice about the class is that there are optional lectures in which Andrew Ng interviews luminaries of deep learning. Those lectures, in my view, are very different from the course lectures. Most of the topics mentioned are research topics, and beginners would find them very perplexing. So I think these lectures deserve separate sets of notes. I still call it a "quick impression" because usually I would do around 1-2 layers of literature search before I'd say I grok a video.
 
* Sorry I couldn't post the video because it is copyrighted by Coursera, but it should be very easy for you to find it. Of course, respect our forum rules and don't post the video here.
 
* This is a very interesting 40-min interview with Prof. Geoffrey Hinton. Perhaps it should also be seen as optional material after you finish his class NNML on Coursera.

* The interview is at research level. That means you would understand more if you took NNML or read part of Part III of the book "Deep Learning".

* There is some material you may have heard from Prof. Hinton before, including how he became an NN/brain researcher, how he came up with backprop, and why he was not the first one to come up with it.

* There is also some which is new to me, like why his and Rumelhart's paper was so influential. Oh, it has to do with his first experiment on marriage relationships (Lecture 2 of NNML).
 
* The role of Prof. Ng in the interview is quite interesting. Andrew is also a giant in deep learning, but Prof. Hinton is more the founder of the field. So you can see that Prof. Ng was trying to understand several of Prof. Hinton's thoughts, such as 1) Does back-propagation appear in the brain? 2) The idea of capsules, which are a distributed representation of a feature vector and allow a kind of what Hinton called "agreement". 3) Unsupervised learning such as VAE.
 
* On Prof. Hinton's favorite ideas, and not to my surprise:
1) Boltzmann machines, 2) stacking RBMs into an SBN, 3) variational methods. I frankly don't fully understand Pt. 3. But then L10 to L14 of NNML are all about Pt. 1 and 2. Unfortunately, not everyone loves to talk about Boltzmann machines: they are not as hot as GANs, and are perceived as not useful at all. But if you want to understand the origin of deep learning, and one way to pre-train your DNN, you should take NNML.
 
* Prof. Hinton's advice on research is also very entertaining: he suggests you don't always read up on the literature first, which according to him is good for creative researchers.

* The part I like most is Prof. Hinton's view of why computer science departments are not catching up on teaching deep learning. As always, his words are penetrating. He said, "And there's a huge sea change going on, basically because our relationship to computers has changed. Instead of programming them, we now show them, and they figure it out."

* Indeed, when I first started out at work, thinking as an MLer was not regarded as cool; programming was cool. But things are changing. And we at AIDL are embracing the change.
 
Enjoy!
 
Arthur Chan

Quick Impression on deeplearning.ai

(Also see my full review of Course 1 and Course 2 here.)

Fellows, as you all know by now, Prof. Andrew Ng has started a new Coursera Specialization on Deep Learning. So many of you came to me today and asked for my take on the class. As a rule, I usually don't comment on a class unless I know something about it. (Search for my "Learning Deep Learning - Top 5 Lists" for more details.) But I'd like to make an exception for the Good Professor's class.

 
So here is my quick take after browsing through the specialization curriculum:
 
* Only Courses 1 to 3 are published now. They are short classes, more like 2-4 weeks each. It feels like the Data Science Specialization, so it feels good for beginners. Assuming that Courses 4 and 5 are longer, at 4 weeks each, we are talking about 17 weeks of study.

* Unlike Ng's standard ML class, Python is the default language. That's good in my view because close to 80-90% of practitioners are using Python-based frameworks.

* Courses 1-3 have around 3 weeks of curriculum that overlaps with Lectures 2-3 of "Intro to Machine Learning". Course 1's goal seems to be implementing an NN from scratch. Course 2 is on regularization. Course 3 is on different methodologies of deep learning, and it's short, only 2 weeks long.
 
* Course 4 and 5 are about CNN and RNN.
 
* So my general impression here is that it is more of a comprehensive class, comparable with Hugo Larochelle's lectures as well as Hinton's lectures. Yet the latter two classes are known to be more difficult. Hinton's class in particular is known to confuse even PhDs. So that shows one of the values of this new DL class: it is a great transition from "Intro to ML" to more difficult classes such as Hinton's.

* But how does it compare with other similar courses such as Udacity's DL nanodegree then? I am not sure yet, but the price seems to be more reasonable if you go the Coursera route. Assuming we are talking about 5 months of study, you are paying $245.

* I also found that many existing beginner classes advocate too much for running scripts, but avoid linking more fundamental concepts such as bias/variance with DL, or going deep into describing models such as convnets and RNNs. cs231n did a good job on convnets, and cs224n teaches you RNNs. But they seem to be more difficult than Ng's or Udacity's class. So again, Ng's class sounds like a great transition class.

* My current take: 1) I am going to take the class myself. 2) It's very likely this new deeplearning.ai class will change my class recommendations on the Top-5 lists.
 
Hope this is helpful for all of you.
 
Arthur Chan