All posts by grandjanitor

Some Thoughts on Hours

Working hours are one of the taboo topics in the tech industry. I can say a couple of things, hopefully not fluffy:

  • Most hours are self-reported, so from a data perspective the numbers are really unclean. Funny story: since I was 23 I have worked on weekends regularly, so in my past jobs there were moments when I noted down the actual hours of colleagues who claimed to work 60+ hours. What really happened is that they only worked 35-40. Most of them were stunned when I gave them the measurement. A few of them refused to talk with me afterwards. (Oh, I worked for some of them too.)
  • Then there is the question of what working long hours (60+ hours) actually means. Practically, you should wonder why that's the case. Why can't one just issue a Unix command to solve a problem? Or, if you know what you are doing, why would writing a 2000-word note take more than 8 hours? Why does it take such a long time to solve your weekly issues? If we talk about coding, it also doesn't make sense: once you have the breakdown of a coding problem, you just have to solve the pieces iteratively in small chunks. Usually each chunk doesn't take more than 2 hours.
  • So here is a realistic portrait of the respectable people I have worked with who seem to work long hours. What do they actually do?
    1. They do some work every day, even on holidays/vacations/weekends.
    2. They respond to you even at hours such as 1 or 2 a.m.
    3. They look agitated when things go wrong in their projects.
  • Now once you really analyze these behaviors, none of them proves that the person works N hours. What it really means is that they stay up all the time. For the agitation part, it also makes more sense to say, "Oh, this guy probably has an anger issue, but at least he cares."
  • Sadly, there are also many people who really do work more than 40 hours, but they are also the least effective people I have ever known.
  • I should mention that there is a more positive side of long hours: learning. My guess is that this is what the job description really means - you spend all your spare moments learning. You might code daily, but if you don't learn, then your speed won't improve at all. So this extra cost of learning is always worthwhile to pay. And that's why we always encourage members to learn.
  • Before I go: I actually follow the scheduling method from "Learning How to Learn", i.e. I take frequent breaks after 45-60 minutes of intense work. And my view of productivity is to continuously learn, because new skills usually improve your workflow. Some of my past employers had huge issues with my approach, so you should understand my view is biased.
  • I would also add that there are individuals who can really work 80 hours and actually code. Usually they are either obliged by culture, influenced by drugs, or shaped by their very special genes.

Hope this helps,

Arthur

My Third Quick Impression on HODL - Interviews with Pieter Abbeel and Yuanqing Lin

My Third Quick Impression on Heroes of Deep Learning (HODL), from the deeplearning.ai course. This time it is on the interviews with Pieter Abbeel and Yuanqing Lin.
 
* This is my 3rd write-up on HODL. Unlike the previous two (Hinton and Bengio), I will summarize two interviews, Pieter Abbeel and Yuanqing Lin, in one post because both interviews are short (<15 mins).
 
* Both researchers are comparatively less known than stars such as Hinton, Bengio, LeCun, and Ng. But everyone knows Pieter Abbeel as an important RL researcher and lecturer, and Yuanqing Lin is the head of Baidu's Institute of Deep Learning.
 
* Gems from Pieter Abbeel:
- Is there any way to learn RL from another algorithm?
- Is there any way we can learn a game and then use that knowledge to learn another game faster?
- He used to want to be a basketball player. (More like a fun fact.)
- On learning: Having a mentor is good.
 
* Gems from Yuanqing Lin
- Lin is the director of Baidu's Institute of Deep Learning; when he was at NEC, he won the first ImageNet competition.
- Lin describes a fairly impressive experimental framework based on PaddlePaddle. Based on what he describes, Lin is building a framework that allows researchers to rerun an experiment using an ID. I wonder how scalable such a framework is.
- Lin was a physics student who specialized in optics.
- On learning: use an open-source framework first, but also learn the basic algorithms.
 
That's what I have. Enjoy!
Arthur Chan

Some Useful Links on Neural Machine Translation

Some good resources for NMT

Tutorial:

A bit special: Tensor2Tensor uses a novel attention-based architecture (the Transformer) instead of a pure RNN/CNN encoder/decoder. It gives a surprisingly large gain, so it's likely to become a trend in NMT in the future.
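
As a rough illustration of the attention idea behind that architecture (and behind the Bahdanau and Luong papers listed below), here is a minimal numpy sketch of scaled dot-product attention. It is only a sketch under my own assumptions; the array names and sizes are made up for the example and are not taken from Tensor2Tensor.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q: (num_queries, d_k), K: (num_keys, d_k), V: (num_keys, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity between queries and keys
    weights = softmax(scores, axis=-1)   # attention weights, each row sums to 1
    return weights @ V                   # weighted sum of the values

# toy example: 2 decoder queries attending over 4 encoder states
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 16))
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 16)
```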

Important papers:

  • Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation by Cho et al. (link) - A very innovative and smart paper by Kyunghyun Cho. It also introduces the GRU.
  • Sequence to Sequence Learning with Neural Networks by Sutskever et al. (link) - By Google researchers; perhaps it shows for the first time that an NMT system is comparable to the traditional pipeline.
  • Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (link)
  • Neural Machine Translation by Jointly Learning to Align and Translate by Bahdanau et al. (link) - The paper which introduces attention.
  • Neural Machine Translation by Minh-Thang Luong (link)
  • Effective Approaches to Attention-based Neural Machine Translation by Minh-Thang Luong (link) - On how to improve the attention approach using local attention.
  • Massive Exploration of Neural Machine Translation Architectures by Britz et al. (link)
  • Recurrent Convolutional Neural Networks for Discourse Compositionality by Kalchbrenner and Blunsom (link)

Important Blog Posts/Web page:

Others: (unsorted, and seemingly less important)

Usage in Chatbots and Summarization (again unsorted, and again perhaps less important...)

Why Doesn't AIDL Talk About "Consciousness" More?

Here is an answer to the question (rephrased from Xyed Abz): "Isn't consciousness the only algorithm we need to build to create an artificial general intelligence like humans or animals?"

My thought:

Xyed Abz: I like your question because it is not exactly one of those "How do you build an AGI, muahaha?"-type fluffy topics. At least you thought about why "consciousness" is important in building an intelligent machine.

 
But then why doesn't AIDL talk about consciousness more? Part of the reason is that the English term "consciousness" is fairly ambiguous. There are at least three definitions. First is "wakefulness", the state in which humans are awake - a bit like when you have just woken up but are not yet too aware of your surroundings. Then there is "attention", in which certain groups of stimuli from the world arrive at your perception. And finally there is a kind of "cognitive access": out of all the things you attend to - I am typing with my fingers, I feel the keyboard, I hear the fan noise, I hear cars running outside - I decide to allow "writing" to occupy my mind.
 
Just a side note: these categorizations are not arbitrary, nor did I come up with them. This thinking can be traced to Christof Koch and his long-time collaborator Francis Crick (a Nobel Prize winner for the discovery of the structure of DNA). Stanislas Dehaene is another representative of this school of thought. I often use this school of thought to explain consciousness because it has more backing from experiments.
 
So to your question: we should first ask what you actually mean by consciousness. If you mean the kind of "cognitive access" above, then yes, I do think it is one of the keys to building an intelligent machine. You may think of all the deep learning machines we build as just one type of "attention" we have created, with no central binding mechanism to control them. That's what Bengio called "cognition" in his HODL interview.
 
Will that be enough? Of course not. As I said, if you do build a binding mechanism, you are also supposed to build the perception mechanisms that go around it. At least that's what's going on with humans.
 
Now, all this sounds very nice, so don't we have a theory already? Nope; even Koch's and Dehaene's ideas are more like hypotheses about the brain. How does this "cognitive access" mechanism actually work? No one knows. Koch believes a brain region called the claustrum carries out such a mechanism, yet many disagree with him. And of course, even if you find such a region, it will take humans a while to reverse engineer it. So you might also have heard of "cognitive architectures", which suggest different mechanisms for how the brain works.
 
Does it sound complicated? Yes, it is, especially because we really don't know what we are talking about. People who are super assertive about the brain usually don't know what they are talking about. That's why I'd rather go party/dance/sing karaoke. But today is Saturday, so why not?
 
Hope it is helpful!

Arthur

Quick Impression on deeplearning.ai's "Heroes of Deep Learning" with Prof. Yoshua Bengio

Quick Impression on deeplearning.ai's "Heroes of Deep Learning". This time it is the interview with Prof. Yoshua Bengio. As always, don't post any copyrighted material here at the forum!

* Out of the 'Canadian Mafia', Prof. Bengio is perhaps the least known of the three. Prof. Hinton and Prof. LeCun have their own courses, and as you know they work for Google and Facebook respectively. While Prof. Bengio does work for Microsoft, his role is more that of a consultant.

* You may know him as one of the coauthors of the book "Deep Learning". But then again, who really understands that book, especially Part III?

* Whereas Prof. Hinton strikes me as an eccentric polymath, Prof. Bengio is more of a conventional scholar. He was influenced by Hinton in his early study of AI, which at the time was mostly expert-system based.

* That perhaps explains why everyone seems to leave his interview out - but I found it very interesting.

* He named several of his group's contributions; most of what he named were fundamental results: Glorot and Bengio 2010 on what is now widely called Xavier initialization, attention in machine translation, his early work on language modeling using neural networks, and of course the GAN from Goodfellow. All are technical results, but once you think about these ideas, they are about understanding rather than trying to beat the current records.
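
For readers who have not seen it, here is a minimal numpy sketch of the Glorot/Xavier uniform initialization mentioned above, as described in Glorot and Bengio 2010. The layer sizes are made up for illustration.

```python
import numpy as np

def xavier_uniform(n_in, n_out, rng=None):
    # Glorot & Bengio (2010): draw weights uniformly from
    # [-sqrt(6/(n_in+n_out)), +sqrt(6/(n_in+n_out))] so that the variance
    # of activations and gradients stays roughly constant across layers.
    if rng is None:
        rng = np.random.default_rng()
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_out, n_in))

W = xavier_uniform(784, 256)  # e.g. a 784 -> 256 fully connected layer
print(W.shape, W.min(), W.max())
```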

* Then he said a few things about early deep learning research which surprised me. First is on depth: as it turns out, the benefit of depth was not as clear in the early 2000s. That's why when I finished my Master's (2003), I had never heard of the revival of neural networks.

* And then there is the early doubt about using ReLU, which is nowadays a staple of convnets. But the reason makes so much sense - ReLU is not smooth at all points of R, so wouldn't that cause a problem? Anyone who knows some calculus would rationally doubt it.
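
To make that concern concrete: ReLU(x) = max(0, x) is not differentiable at x = 0, but in practice one simply picks a value from the subgradient there (0 is a common convention) and training works fine. A tiny sketch of that convention, with made-up inputs:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # ReLU is non-differentiable at x == 0; here we follow the common
    # convention of taking the "derivative" there to be 0 (any value
    # in [0, 1] is a valid subgradient).
    return (x > 0).astype(float)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x), relu_grad(x))  # [0. 0. 3.] [0. 0. 1.]
```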

* His advice on learning deep learning is also quite on point - he believes you can learn DL in 5-6 months if you have the right training, i.e. a good computer science and math education. Then you can just pick up DL by taking courses and reading proceedings from ICML.

* Finally, there is his current research on the fusion of neural networks and neuroscience. I found this part fascinating. Is backprop really used in the brain as well?

That's what I have. Hope you enjoy!

Quick Impression on deeplearning.ai (After Finishing Coursework)

Following experienced guys like Arvind Nagaraj and Gautam Karmakar, I just finished all the coursework for deeplearning.ai. I haven't finished all the videos yet, but it's a good idea to write another "impression" post.

* It took me about 10 days of clock time to finish all the coursework. The actual work only took me around 5-6 hours. I guess my experience speaks for many veteran members at AIDL.
* python numpy has its quirks. But if you know R or Matlab/Octave, you are good to go.
* The assignments of Course 1 guide you through building an NN "from scratch". Course 2 guides you through implementing several useful initialization/regularization/optimization algorithms. They are quite cute - you mostly just fill in the right code in python numpy.
* I put "from scratch" in quotes because you don't actually need to write your own matrix routines. So this "from scratch" is quite different from writing an NN package "from scratch in C", where you probably need to write a bit of matrix-manipulation code and derive a set of formulae for your codebase. Ng's course gives you a taste of what such programs feel like (see the sketch after this list). In that regard, perhaps the next best thing is Michael Nielsen's NNDL book.
* Course 3 is quiz-only, so it is by far the easiest to finish. Just like Arvind and Gautam, I think it is the most intriguing course within the series (so far), because it gives you a lot of big-picture advice on how to improve an ML system. Some of this advice was new to me.
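
To give a flavor of what those fill-in-the-blank numpy assignments feel like, here is a minimal sketch of one forward and backward pass for a tiny 2-layer network on made-up data. The variable names, sizes, and conventions are my own illustration, not the actual course code.

```python
import numpy as np

rng = np.random.default_rng(0)

# made-up data: 5 examples with 3 features, binary labels
X = rng.normal(size=(3, 5))            # features as columns
Y = rng.integers(0, 2, size=(1, 5))

# tiny 2-layer network: 3 -> 4 -> 1
W1, b1 = rng.normal(size=(4, 3)) * 0.01, np.zeros((4, 1))
W2, b2 = rng.normal(size=(1, 4)) * 0.01, np.zeros((1, 1))

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# forward pass
Z1 = W1 @ X + b1
A1 = np.tanh(Z1)
Z2 = W2 @ A1 + b2
A2 = sigmoid(Z2)                       # predicted probabilities

# cross-entropy cost
m = X.shape[1]
cost = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

# backward pass
dZ2 = A2 - Y
dW2 = dZ2 @ A1.T / m
db2 = dZ2.sum(axis=1, keepdims=True) / m
dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)     # tanh'(Z1) = 1 - A1^2
dW1 = dZ1 @ X.T / m
db1 = dZ1.sum(axis=1, keepdims=True) / m

# one gradient-descent step
lr = 0.1
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
print("cost:", cost)
```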

Anyway, that's what I have. Once I watch all the videos, I will also come up with a full review. Before that, why not go check out our study group "Coursera deeplearning.ai"?

Thanks,
Arthur Chan

https://www.facebook.com/groups/DeepLearningAISpecialization/

Certificate Or Not

Many members at Coursera deeplearning.ai ask whether a Coursera certificate is useful, so I want to sum up a couple of my thoughts here:

* The most important thing is whether you learn something in the process. And there are many ways to learn. Taking a course is good because the course preparer usually gives you a summary of the field you are interested in.

* So the purpose of certification is mostly motivation, so that you can *finish* a class. Note that it is tough to *finish* a class; e.g., statistics suggest that MOOC completion rates are ~9-13%, and the number might be even smaller at Coursera because it doesn't cost you much to click the enroll button. But you've got to understand that finishing a class is no small business, and certification is a way to help you do so. (Oh, because you paid money?)

* Some also ask whether a certificate is useful for a resume. It's hard to say. For now, there is a short supply of university-trained deep learning experts, so if you have a lot of non-traditional experience from Coursera and Kaggle, you do get an edge. But as time goes on and more learners achieve a status similar to yours, your edge will fade. So if you think of certificates as part of your resume, be ready to keep on learning.

Arthur

Tips for Completing Course 1 of deeplearning.ai

For people who get stuck in Course 1, here are some tips:

  • Most assignments are straightforward, and you can finish each within 30 mins. The key is not to overthink them. If you find yourself deriving the equations from scratch, you are not reading the question carefully.
  • When in doubt, the best tool to help you is the python print statement. Checking the size and shape of a numpy matrix always gives you insight (see the sketch after this list).
  • I know a lot of reviewers claim that the exercises are supposed to teach you neural networks "from scratch". Well... it depends on what you mean. Ng's assignments have bells and whistles built for you, so you are not really building these out of nothing. If you were writing everything in C with no reference, yeah, then it would be much harder. But that's not Ng's exercise. Once again, this goes back to the point that the assignments are straightforward. No need to overthink them.
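
As a trivial example of that kind of print-based sanity checking (the shapes here are made up, not from the actual assignment):

```python
import numpy as np

W = np.random.randn(4, 3)   # weight matrix for a 3 -> 4 layer
X = np.random.randn(3, 5)   # 5 training examples as columns
b = np.zeros((4, 1))

print("W:", W.shape, "X:", X.shape, "b:", b.shape)
Z = W @ X + b               # b broadcasts across the 5 columns
print("Z:", Z.shape)        # expect (4, 5); anything else means a bug
```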

Hope this helps!

Arthur Chan

Quick Impression on deeplearning.ai Heroes of Deep Learning - Geoffrey Hinton

So I was going through deeplearning.ai. You know we started a new FB group on it? We haven't made it public yet, but yes, we are very excited.
 
Now, one thing you might notice about the class is that there are optional lectures in which Andrew Ng interviews luminaries of deep learning. Those lectures, in my view, are very different from the course lectures. Most of the topics mentioned are research-oriented, and beginners would find them very perplexing. So I think these lectures deserve separate sets of notes. I still call this a "quick impression" because usually I do around 1-2 layers of literature search before I'd say I grok a video.
 
* Sorry, I can't post the video because it is copyrighted by Coursera, but it should be very easy for you to find. Of course, respect our forum rules and don't post the video here.
 
* This is a very interesting 40-min interview with Prof. Geoffrey Hinton. Perhaps it should also be seen as optional material after you finish his class NNML on Coursera.
 
* The interview is at research level, so you would understand more if you have taken NNML or read part of Part III of "Deep Learning".
 
* There is some material you may have heard from Prof. Hinton before, including how he became a NN/brain researcher, how he came up with backprop, and why he was not the first to come up with it.
 
* There is also some material which was new to me, like why his and Rumelhart's paper was so influential. Oh, it has to do with his first experience with marriage relationships (Lecture 2 of NNML).
 
* The role of Prof. Ng in the interview is quite interesting. Andrew is also a giant in deep learning, but Prof. Hinton is more the founder of the field. So you can see that Prof. Ng was trying to understand several of Prof. Hinton's thoughts, such as 1) Does back-propagation appear in the brain? 2) The idea of capsules, which are distributed representations of feature vectors that allow a kind of what Hinton calls "agreement". 3) Unsupervised learning such as VAEs.
 
* On Prof. Hinton's favorite ideas, and not to my surprise:
1) Boltzmann machines, 2) stacking RBMs into an SBN, 3) variational methods. I frankly don't fully understand point 3, but then L10 to L14 of NNML are all about points 1 and 2. Unfortunately, not everyone loves to talk about Boltzmann machines - they are not as hot as GANs, and are perceived as not useful at all. But if you want to understand the origin of deep learning, and one way to pre-train your DNN, you should go take NNML.
 
* Prof. Hinton's advice on research is also very entertaining - he suggests you don't always read the literature first - which, according to him, is good for creative researchers.
 
* The part I like most is Prof. Hinton's view of why computer science departments are not catching up on teaching deep learning. As always, his words are penetrating. He said, "And there's a huge sea change going on, basically because our relationship to computers has changed. Instead of programming them, we now show them, and they figure it out."
 
* Indeed, when I first started out at work, thinking as an MLer was not regarded as cool - programming was cool. But things are changing, and we at AIDL are embracing the change.
 
Enjoy!
 
Arthur Chan