Some good resources for NMT
- NMT tutorial written by Thang Luong - my impression is that it is a shorter tutorial with a step-by-step procedure. The slightly disappointing part is that it doesn't quite record exactly how the benchmarking experiments were run and evaluated. Of course, that's fairly trivial to fix, but it did take me a bit of time.
- The original Tensorflow seq2seq tutorial - more of a big-gun setup, and the first experiment I played with. Here we are talking about the WMT15 set.
- tf-seq2seq (blog post: here)
- Graham Neubig's tutorial
- NeuralMonkey (Tensorflow-based)
- Prof. Philip Koehn's new chapter on NMT
- A bit special: Tensor2Tensor uses a novel architecture instead of a pure RNN/CNN encoder/decoder. It gives a surprisingly large gain, so it's likely to become a trend in NMT in the future.
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation by Cho et al. (link) - A very innovative and smart paper by Kyunghyun Cho. It also introduces the GRU.
- Sequence to Sequence Learning with Neural Networks by Sutskever et al. (link) - By Google researchers; perhaps the first time an NMT system was shown to be comparable to the traditional SMT pipeline.
- Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (link)
- Neural Machine Translation by Jointly Learning to Align and Translate by Dzmitry Bahdanau (link) - The paper which introduced attention.
- Neural Machine Translation by Minh-Thang Luong (link)
- Effective Approaches to Attention-based Neural Machine Translation by Minh-Thang Luong (link) - On how to improve the attention approach with local attention.
- Massive Exploration of Neural Machine Translation Architectures by Britz et al. (link)
- Recurrent Convolutional Neural Networks for Discourse Compositionality by Kalchbrenner and Blunsom (link)
Important Blog Posts/Web Pages:
- Attention and Augmented Recurrent Neural Networks: Only partially relevant to attention-based RNNs, but Olah's writing is always worth reading.
- Stanford NMT research page: Related to Luong, See and Manning's work on NMT. Very entertaining for catching up on recent techniques. Tutorials/code/models are available.
Others (unsorted, and seemingly less important):
- JayPark's Github https://github.com/JayParks/tf-seq2seq
Usage in Chatbots and Summarization (again unsorted, and again perhaps less important...)
Here is an answer to the question (rephrased from Xyed Abz): "Isn't consciousness the only algorithm we need to build to create an artificial general intelligence like humans or animals?"
Xyed Abz: I like your question because it is not exactly one of those "How do you build an AGI, Muahaha?"-type fluffy topics. At least you have thought about why "consciousness" is important in building an intelligent machine.
A list of Deep Learning for NLP resources - unsorted.
- https://github.com/keon/awesome-nlp#implementation *
Quick Impression on deeplearning.ai's "Heroes of Deep Learning". This time it is the interview with Prof. Yoshua Bengio. As always, don't post any copyrighted material here at the forum!
* Out of the 'Canadian Mafia', Prof. Bengio is perhaps the least known of the three. Prof. Hinton and Prof. LeCun have their own courses, and as you know they work for Google and Facebook respectively. Prof. Bengio does work for Microsoft, but his role is more that of a consultant.
* You may know him as one of the coauthors of the book "Deep Learning". But then again, who really understands that book, especially Part III?
* Whereas Prof. Hinton strikes me as an eccentric polymath, Prof. Bengio is more of a conventional scholar. He was influenced by Hinton in his early study of AI, which at the time was mostly expert-system based.
* That perhaps explains why everyone seems to skip his interview, which I found very interesting.
* He named several of his group's contributions, and most of them are fundamental results: Glorot and Bengio 2010 on what is now widely called Xavier initialization, attention in machine translation, his early work on neural network language models, and of course the GAN from Goodfellow. All of them are technical results, but once you think about these ideas, they are really about understanding, rather than trying to beat the current records. (There is a small sketch of Xavier initialization at the end of this list.)
* Then he said a few things about early deep learning research which surprised me. First is on depth: as it turns out, the benefit of depth was not that clear in the early 2000s. That's why when I graduated with my Master's (2003), I had never heard of the revival of neural networks.
* And then there was the doubt about using ReLU, which is the current-day staple of convnets. The reason makes a lot of sense: ReLU is not smooth at every point of R, so wouldn't that cause a problem? Anyone who knows some calculus would rationally have that doubt. (In practice it doesn't: the derivative at 0 is simply picked by convention, and training works fine.)
* His advice on learning deep learning is also quite on point: he believes you can learn DL in 5-6 months if you have the right background, i.e. a good computer science and math education. Then you can just pick up DL by taking courses and reading proceedings from ICML.
* Finally, there is his current research on the fusion of neural networks and neuroscience. I found this part fascinating. Is backprop really what the brain uses as well?
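Since Xavier (Glorot) initialization came up above, here is a minimal numpy sketch of the idea; it shows the common uniform variant of the scaling, and the function name and layer sizes are just illustrative, not from the paper.

```python
import numpy as np

def xavier_init(n_in, n_out, seed=0):
    """Glorot/Xavier initialization: scale weights by the layer's fan-in and
    fan-out so activation variance stays roughly constant across layers."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (n_in + n_out))  # uniform variant of the Glorot scaling
    return rng.uniform(-limit, limit, size=(n_out, n_in))

# Hypothetical layer sizes, purely for illustration
W1 = xavier_init(784, 128)
W2 = xavier_init(128, 10)
print(W1.std(), W2.std())  # both stay small and of comparable magnitude
```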
That's what I have. Hope you enjoy!
Following experienced guys like Arvind Nagaraj and Gautam Karmakar, I just finished all the course work for deeplearning.ai. I haven't finished all the videos yet, but it's a good idea to write another "impression" post.
* It took me about 10 days of clock time to finish all the course work. The actual work only took around 5-6 hours. I guess my experience speaks for many veteran members at AIDL.
* python numpy has its quirks. But if you know R or Matlab/Octave, you are good to go.
* The assignments of Course 1 guide you through building an NN "from scratch". Course 2 guides you through implementing several useful initialization/regularization/optimization algorithms. They are quite cute - you mostly just fill in the right code in python numpy.
* I quoted "from scratch" because you don't actually need to write your own matrix routines. So this "from scratch" is quite different from writing an NN package "from scratch in C", where you would probably need to write a bit of matrix-manipulation code and derive a set of formulae for your codebase. Still, Ng's course gives you a taste of how these programs feel. In that regard, perhaps the next best thing is Michael Nielsen's NNDL book. (A tiny numpy sketch of what this fill-in-the-blank style looks like is at the end of this post.)
* Course 3 is quiz-only, so it is by far the easiest to finish. Just like Arvind and Gautam, I think it is the most intriguing course of the series (so far), because it gives you a lot of big-picture advice on how to improve an ML system. Some of that advice was new to me.
Anyway, that's what I have. Once I watch all the videos, I will also come up with a full review. Before that, why not check out our study group "Coursera deeplearning.ai"?
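To give a flavor of what "filling in the right code in python numpy" means, here is a minimal sketch of a one-hidden-layer forward pass and loss. The function names, shapes, and constants are my own illustrative choices, not the actual assignment code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """One hidden layer with tanh and a sigmoid output - the kind of
    few-line block the assignments ask you to fill in."""
    Z1 = W1 @ X + b1      # (n_hidden, n_examples)
    A1 = np.tanh(Z1)
    Z2 = W2 @ A1 + b2     # (1, n_examples)
    return sigmoid(Z2)

def cross_entropy(A2, Y):
    m = Y.shape[1]
    return -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m

# Toy example: 2 features, 4 hidden units, 5 examples
X = np.random.randn(2, 5)
Y = (np.random.rand(1, 5) > 0.5).astype(float)
W1, b1 = np.random.randn(4, 2) * 0.01, np.zeros((4, 1))
W2, b2 = np.random.randn(1, 4) * 0.01, np.zeros((1, 1))
print(cross_entropy(forward(X, W1, b1, W2, b2), Y))
```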
Many members at Coursera deeplearning.ai ask whether a Coursera certificate is something useful. So I want to sum up a couple of my thoughts here:
* The most important thing is whether you learn something in the process, and there are many ways to learn. Taking a course is good because the course preparer usually gives you a summary of the field you are interested in.
* So the purpose of certification is mostly motivation, so that you can *finish* a class. Note that it is tough to *finish* a class: statistics suggest completion rates of roughly 9-13%, and the number might be even smaller at Coursera because it costs so little to click the enroll button. You have to understand that finishing a class is no small feat, and certification is a way to help you do so. (Oh, also because you paid money?)
* Some also ask whether a certificate is useful for a resume. It's hard to say. For now there is a short supply of university-trained deep learning experts, so if you have a lot of non-traditional experience from Coursera and Kaggle, you do get an edge. But as time goes on and more learners achieve a status similar to yours, that edge will fade. So if you think of certificates as part of your resume, be ready to keep on learning.
For people who get stuck in Course 1, here are some tips:
- Most assignments are straightforward, and you can finish each within 30 minutes. The key is not to overthink them. If you find yourself deriving the equations on your own, you are not reading the question carefully.
- When in doubt, the best tool to help you is the python print statement. Checking the size and shape of a numpy array always gives you insights (see the small sketch after these tips).
- I know a lot of reviewers claim that the exercises teach you neural networks "from scratch". Well, it depends on what you mean. Ng's assignments have the bells and whistles built for you, so you are not really starting from nothing. If you wrote everything in C with no reference, then yes, it would be much harder. But that's not Ng's exercise. Once again, this goes back to the point that the assignments are straightforward. No need to overthink them.
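As a concrete example of the print-and-check-shapes tip, here is a tiny sketch; the variable names and shapes are made up for illustration, not taken from the actual assignment.

```python
import numpy as np

# Hypothetical intermediate values in a forward pass
W = np.random.randn(4, 3)   # weights: (n_hidden, n_features)
X = np.random.randn(3, 7)   # data:    (n_features, n_examples)
b = np.zeros((4, 1))        # bias: broadcasts over the example dimension

Z = W @ X + b
print("W:", W.shape, "X:", X.shape, "Z:", Z.shape)
# W: (4, 3) X: (3, 7) Z: (4, 7)
# If Z is not (n_hidden, n_examples), something upstream is transposed.
```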
Hope this helps!
Fellows, as you all know by now, Prof. Andrew Ng has started a new Coursera Specialization on Deep Learning. Many of you came to me today and asked for my take on the class. As a rule, I usually don't comment on a class unless I know something about it. (Search for my "Learning Deep Learning - Top 5 Lists" for more details.) But I'd like to make an exception for the Good Professor's class.