Thoughts From Your Humble Curators
The biggest AI/DL news last week was definitely Andrew Ng's departure from Baidu, so naturally it is the top story of this issue.
Other than Ng's departure, last week was filled with news on interesting research and source code:
- OpenAI's research on multiple agents led to the emergence of a simple language,
- FAIR's Kaiming He proposed Mask R-CNN, which shatters previous records,
- Distill, a new on-line journal for deep learning,
- Google's SyntaxNet upgrade,
- Google's new skip-thought model.
"Deep Learning" by Ian Goodfellow et al.
I (Arthur) have had some leisure time lately to browse "Deep Learning" by Goodfellow et al. for the first time. Since it is known as the bible of deep learning, I decided to write a short afterthought post. My thoughts are in point form and not too structured.
- If you want to learn the zen of deep learning, "Deep Learning" is the book. In a nutshell, "Deep Learning" is an introductory-style textbook covering nearly every contemporary field in deep learning. It has a thorough chapter on backprop, and perhaps the best introductory material on SGD, computational graphs and ConvNets. So the book is very suitable for those who want to further their knowledge after going through 4-5 introductory DL classes.
- Chapter 2 is supposed to go through the basic math, but it is unlikely to cover everything the book requires. PRML Chapter 6 seems to be a good preliminary before you start reading the book. If you don't feel comfortable with matrix calculus, perhaps you want to read "Matrix Algebra" by Abadir as well.
- There are three parts to the book. Part 1 is all about the basics: math, basic ML, backprop, SGD and such. Part 2 is about how DL is used in real-life applications. Part 3 is about research topics such as E.M. and graphical models in deep learning, or generative models. All three parts deserve your time. The math and general ML in Part 1 may be better replaced by a more technical text such as PRML, but the rest of the material goes deeper than the popular DL classes. You will also find relevant citations easily.
- I enjoyed Parts 1 and 2 a lot, mostly because they go deeper and are filled with interesting details. What about Part 3? While I don't quite grok all the math, Part 3 is strangely inspiring. For example, I noticed a comparison of graphical models and NNs. There is also a discussion of how E.M. is used in latent variable models. Of course, there is an extensive survey of generative models. It covers difficult models such as the deep Boltzmann machine, the spike-and-slab RBM and many variations. Reading Part 3 makes me want to learn classical machine learning techniques, such as mixture models and graphical models, better.
- So I would say you will enjoy Part 3 if you are 1) a DL researcher in unsupervised learning and generative models, 2) someone who wants to squeeze out the last bit of performance through pre-training, or 3) someone who wants to compare NNs with other deep methods such as mixture models or graphical models.
Anyway, that's what I have for now. Maybe I will summarize it in a blog post later on, but enjoy these random thoughts for now.
Original version from my (Arthur's) blog post.