Author: grandjanitor

Resources for GPT-N

Official papers/blogs

GPT-1: (blog post) paper: Improving Language Understanding by Generative Pre-Training
GPT-2: (blog post) paper: Language Models are Unsupervised Multitask Learners
GPT-3: (API and blog post) paper: Language Models are Few-Shot Learners
An important paper on the bigger context of GPT-N, i.e. how CE loss scales with model size: “Scaling Laws for Neural Language Models”
Image-GPT: (blog post) paper: Generative Pretraining from Pixels
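
To make the question behind the scaling-law paper concrete: a power law between loss and model size is a straight line in log-log space, so its exponent can be read off with a linear fit. Here is a minimal sketch in Python. The data points are synthetic, and `fit_power_law`, `TRUE_ALPHA` and `TRUE_SCALE` are made-up names and values for illustration, not numbers from the paper.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = scale * size ** (-alpha) by least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)  # (alpha, scale)


# Synthetic (model size, loss) points generated from a known power law.
TRUE_ALPHA = 0.08   # hypothetical exponent, for illustration only
TRUE_SCALE = 12.0   # hypothetical prefactor, for illustration only
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [TRUE_SCALE * n ** (-TRUE_ALPHA) for n in sizes]

alpha, scale = fit_power_law(sizes, losses)
print(f"alpha={alpha:.4f}, scale={scale:.4f}")
```

With real training runs the points would be noisy and the fit only approximate, but the recipe is the same.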

While I have been aware of the technology since GPT-1, I started writing this blog post at the time of GPT-3. So I will focus on GPT-3 and Image-GPT, which are arguably the iterations that fascinate the public most. But I will also circle back to GPT-1/2 for technical reference.

Resources on GPT-3

Top 10 demo curated by Suzana Ilić.
Awesome GPT-3
Excellent discussion on GPT-3, plus extensive experiments on poetry/essay generation, from Gwern Branwen
Another excellent discussion on GPT-3, focusing on GPT-3’s addition capability
Another one.
Some notes on the original paper:
- It works on various NLP tasks: “For example, GPT-3 achieves 81.5 F1 on CoQA in the zero-shot setting, 84.0 F1 on CoQA in the one-shot setting, 85.0 F1 in the few-shot setting. Similarly, GPT-3 achieves 64.3% accuracy on TriviaQA in the zero-shot setting, 68.0% in the one-shot setting, and 71.2% in the few-shot setting, the last of which is state-of-the-art relative to fine-tuned models operating in the same closed-book setting.”
- Surprisingly, GPT-3 excels at several few-shot learning tasks: (p.5) “GPT-3 also displays one-shot and few-shot proficiency at tasks designed to test rapid adaption or on-the-fly reasoning, which include unscrambling words, performing arithmetic, and using novel words in a sentence after seeing them defined only once. We also show that in the few-shot setting, GPT-3 can generate synthetic news articles which human evaluators have difficulty distinguishing from human-generated articles.”
- Unsurprisingly, GPT-3 struggles on several few-shot learning tasks: (p.5 same paragraph as above) “At the same time, we also find some tasks on which few-shot performance struggles, even at the scale of GPT-3. This includes natural language inference tasks like the ANLI dataset, and some reading comprehension datasets like RACE or QuAC.”
- But the data contamination issue is real: “We also undertake a systematic study of “data contamination” – a growing problem when training high capacity models on datasets such as Common Crawl, which can potentially include content from test datasets simply because such content often exists on the web. In this paper we develop systematic tools to measure data contamination and quantify its distorting effects”
Performance on individual tasks:
- Few-shot learning allows GPT-3 to do NMT; (p.14) it beats the best unsupervised SOTA.
- With few-shot learning, it beats the SOTA on PIQA.
- [TBD]
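
The zero-shot, one-shot and few-shot settings quoted above differ only in how many worked examples are placed in the prompt before the actual query; no weights are updated. Here is a rough sketch of that idea in Python. The `build_prompt` helper and the "Q:/A:" layout are my own illustrative assumptions, not the exact prompt format used in the paper.

```python
def build_prompt(task_description, examples, query):
    """Assemble a prompt: task description, K worked examples, then the query."""
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"Q: {source}")
        lines.append(f"A: {target}")
        lines.append("")
    lines.append(f"Q: {query}")
    lines.append("A:")  # the model is expected to continue from here
    return "\n".join(lines)


# Hypothetical demonstrations for an English-to-French task.
demos = [("sea otter", "loutre de mer"), ("cheese", "fromage")]
task = "Translate English to French."

zero_shot = build_prompt(task, [], "peppermint")        # K = 0
one_shot = build_prompt(task, demos[:1], "peppermint")  # K = 1
few_shot = build_prompt(task, demos, "peppermint")      # K = 2

print(few_shot)
```

The resulting string would then be sent to the language model as-is; the only thing that changes between the three settings is the number of demonstrations.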

Training:

Interesting links:

https://transformer.huggingface.co/
Speech-based GPT-2 for speaker recognition

Resources on Kaldi

Introduction:

Kaldi is one of the three active open-source ASR projects based on the hybrid approach. It has perhaps the best feature set, but it is also seen as the more advanced toolkit to use.

I like the toolkit because it works. Also, ASR developers are colorful people (sometimes too colorful), and I enjoy reading their source code.

(Yes, you need to read source code to understand kaldi.)

General Resources:

awesome-kaldi. – Well-deserved to be called “awesome”. Tons of useful links.
this page.

Basic Tutorials – the structure of kaldi, running from egs/ etc

A CMU professor once told me that knowing how to use hybrid ASR toolkits like htk, kaldi and sphinx is really for bright kids. He is not wrong. All three toolkits require you to understand ASR well enough to wield them effectively. Here are a bunch of resources which can help you.

HTK Book – We are talking about kaldi, so why bring up HTK? Well, kaldi was a response to htk. Both were written as unix command-line tools. Compared with kaldi, htk was developed as a company codebase (Entropic), so the code is thought of as more refined, but harder to change. Looking at both toolkits now (2020), I still find that the HTK tutorial is easier to follow.
The original kaldi tutorial – it uses RM, so if you don’t have RM, nah, this is not going to help you run end-to-end. But it will still teach you the basics of the resources.
The original ICASSP 2011 lecture.
Eleanor Chodroff’s tutorial – Rare wordy explanation of the toolkits. With some decent notes on what #senones really means.
Qianhui Wan’s run-through of the stages in a kaldi training – Good high-level run-through of kaldi’s scripts.

When you need to hack kaldi……

Changing the source code of kaldi, or of open source speech recognizers in general, is not the worst thing that can happen to a hacker. For the most part, you can derive most information by reading the source code. Some modules are terse, e.g. nnet3. Say you want to add a new computation command: you will have to go through several classes to make it work. In the same vein, you don’t really see any description of how an individual command works. Think of it as assembly code relative to C: you will need to work it through yourself.
The good news is …… it’s possible. As always, you just need some coffee and a comfortable chair.
What if you want to read some documentation then? Then go with https://kaldi-asr.org/doc/index.html. There you can get a high-level understanding of some of the algorithms.
I have never worked with Dr. Povey. But I often think his code and descriptions are terse, i.e. he certainly knows what he is doing, and many critics simply miss his points, but you need to be experienced in ASR to understand some of his “moves”.
Also see the next section on specific questions you may ask about Kaldi.

Questions you will ask when using Kaldi

Many data structures in kaldi are not “created with human readability as the first priority”. (I chuckled when I read this phrase in the Kaldi doc. 🙂 ) But then users often convinced Povey to come up with a terse yet readable description.
Tree-related: What does the decision tree look like? Check out copytree. How does the decision tree work in kaldi? Oh, you had better learn what Event Maps are. The link also brings you to what the internals of decision tree building look like. A more high-level description can be found here.
Transition ID: What is a transition ID? Two important answers: it is a 5-tuple including the identity of the transition, the source, the forward and self-loop transID, and the phone. It is also the ID which gives the minimal description of a compiled decoding graph. See here.
Lattice-related: here. Also, lattice-copy is your friend.
nnet3-related: How does neural network computation work? How is an NN compiled? What optimization is used on the NN? Actually, what are the optimizations? If you feel confused about these questions, check out all the nnet3 links from the “nnet3 setup” page.
Example generation for NN training: That… if you never “read between the pipes”, you will never understand. Various Chinese hackers have analyzed the chain though. You can easily look it up on Google.

Acknowledgement

You always want to thank Dan Povey and the kaldi team for their great work. Hybrid approach is not going away soon because of them.

Update Logs:

(Sep 7, 2020) Add notes on interesting topics such as tree, transition ID and neural network computation.
(Before Sep 7, 2020) Wrote the backbone of the note.

Resources for Quantum Computation/Machine Learning

Post author By grandjanitor
Post date October 3, 2019

Books:

Quantum Computation and Quantum Information, the so-called “Mike and Ike”.

Quantum Computing since Democritus by Scott Aaronson.

Course:

EdX Quantum Computing

Quantum Information Science Part I, Part II, Part III

Daniel Gottesman’s Lectures.

Tools:

PyQuil

Visualizing quantum circuits and the Bloch sphere: quirk.

Websites/Blogs:

Scott Aaronson’s Shtetl-Optimized

Quantum Computing Factsheet.

Videos:

David Deutsch’s video lectures.

Michael Nielsen’s video lectures.

AIDL Weekly Past Links

Here is the link for all past AIDL Weekly issues.

Indeed, we stopped our weekly publication of AIDL Weekly early this year. Many of you asked why. Nothing much: Waikit and I are just busy with our lives, and writing a weekly newsletter is tough.

But never say never, we might come back in the future. So stay tuned.

Arthur

AIDL Weekly Issue 86 – CES 2019, Edward Grefenstette and Common Voice

Post author By grandjanitor
Post date July 25, 2019

Issue 86 January 14th 2019

Editorial

Thoughts From Your Curators

This week on AIDL:

AI news from CES 2019
Facebook’s poaching of Edward Grefenstette from DeepMind
We also talk about the significance of the Common Voice Initiative by Mozilla

Join our community for real-time discussions here: Expertify

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 193,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Artificial Intelligence and Deep Learning Weekly

News

Edward Grefenstette is now with FAIR

How much does a top AI researcher make? We did not know until OpenAI made the salaries of several top researchers, such as Ilya Sutskever, public. And we learned that a top-level researcher can make a million a year even at a non-profit. It does explain why there is so much buzz when we learn that a researcher like Edward Grefenstette is changing employers.

What does Dr. Grefenstette do? He was first a researcher in mathematical logic, but then in 2014 he co-authored a landmark paper with Nal Kalchbrenner (now at Google Brain) and Phil Blunsom (now at Oxford), which suggested that you can use a convolutional neural network to model sentences. For some context, before the paper most researchers believed that the best sentence model was perhaps the BLSTM. With CNNs in the mix, a lot of interesting possibilities open up for research and development: a CNN parallelizes much better than a BLSTM.
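
To see why a convolution over a sentence parallelizes well, here is a toy sketch in plain Python: one filter slides over the word vectors, and every window is computed without looking at the result of any other window, unlike a recurrent model that must walk the sentence step by step. This is only an illustration with made-up 2-dimensional "embeddings" and a simple max-over-time pooling; it is not the architecture from the paper, and `conv1d_sentence` and `max_over_time` are hypothetical names.

```python
def conv1d_sentence(embeddings, kernel, bias=0.0):
    """Slide one filter over a sentence; each window is independent of the others."""
    width = len(kernel)
    outputs = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias
        for word_vec, kernel_vec in zip(window, kernel):
            total += sum(w * k for w, k in zip(word_vec, kernel_vec))
        outputs.append(max(0.0, total))  # ReLU non-linearity
    return outputs


def max_over_time(feature_map):
    """Collapse a variable-length feature map into a single sentence feature."""
    return max(feature_map)


# Four "words", each a made-up 2-dimensional embedding.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter spanning two consecutive words.
kernel = [[1.0, -1.0], [0.5, 0.5]]

feature_map = conv1d_sentence(sentence, kernel)
sentence_feature = max_over_time(feature_map)
print(feature_map, sentence_feature)
```

Since the loop iterations over `start` do not depend on each other, they could all be run at the same time on parallel hardware.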

forbes.com

Pulse of AI Last Week

Intel and Alibaba partner on 3-D athlete tracking

CrowdAnalytix $40M

DARPA tries to use AI to find hidden patterns in events

CES

Baidu:

AMD:

Radeon

Google:

Google and NXP

And SDC news from CES:

Safety First

Blog Posts

Auto Keras and AutoML by Adrian Rosebrock.

Adrian Rosebrock investigated the Keras equivalent of AutoML. Other than his usual meticulous step-by-step procedure, he found that the automatically tuned network is not as good as his hand-defined structure.

pyimagesearch.com

Quote From Prof. Hinton

The future depends on some graduate student who is deeply suspicious of everything I have said. – Geoffrey Hinton

facebook.com

Open Source

Mozilla’s Common Voice

Many people have asked the question: “Where is the ImageNet for speech recognition?” The closest thing we know of is Voxforge.org, which attempts to collect voice data from volunteers who speak different languages. The open secret in voice data collection is that the data has to match the recording conditions of your domain. So if your domain is telephone speech, collecting speech from a desktop microphone will give you a very poor model.

Once you have the data, you also have to transcribe it. That is the equivalent of labeling in a general machine learning task, but speech is a sequence of sounds, so you also need to annotate the word order correctly. You might also want to come up with deeper annotations such as the accents or mispronunciations of the speakers. All these variations are the reasons why transcription is expensive.

Since collecting speech is hard and transcription is expensive, very few companies want to share their data, even when they are very happy to open source their recognizers.

All this should make you realize that Mozilla’s Common Voice effort is very important. What they are trying to do is to build a large-scale speech database, so that researchers around the world can create models.

mozilla.org

Other News

About Us

Join our community for real-time discussions here: Expertify

AIDL Weekly #85 – AGI is Nowhere Near – According to Prof. Hinton

Post author By grandjanitor
Post date July 25, 2019

Artificial Intelligence and Deep Learning Weekly – Issue 85

Issue 85 January 7th 2019

Editorial

Thoughts From Your Curators

We were on vacation during Christmas and we are back this week. We bring you several interesting links:

How does Prof. Hinton think of AGI?
The MIT Course on deep learning, reinforcement learning and SDC,
2018 in review.

Join our community for real-time discussions here: Expertify

News

Pulse of AI (Since Last Issue)

Funding:

Investment:

MS and BMW invest in Graphcore

Blog Posts

People’s Anger Towards SDC is Real

Especially towards Waymo’s SDC in Arizona.

nytimes.com

AGI is Nowhere Near, According to Demis Hassabis and Geoffrey Hinton

In ML reporting, we often see VentureBeat outdo itself from time to time. And this piece, though it appears as a blog post, deserves your attention.

There are two parts to the piece. The first is how Hassabis thinks about AGI. As you know, DeepMind created a super-human Go engine with deep reinforcement learning. And with AlphaZero, researchers also found that they can configure the engine to learn multiple board games, such as Shogi, to a super-human level. This should make you ask: can we use the same technology to create AGI as well?

Hassabis says no:

Despite DeepMind’s impressive achievements, Hassabis cautions that they by no means suggest AGI is around the corner — far from it. Unlike the AI systems of today, he says, people draw on intrinsic knowledge about the world to perform prediction and planning. Compared to even novices at Go, chess, and shogi, AlphaGo and AlphaZero are at a bit of an information disadvantage.

Now let’s look at Prof. Hinton’s view, which is even more interesting. Say you create a robotic agent and put it into society. Can’t you just run a reinforcement learning algorithm to train the robot to be human? In other words, create an AGI?

Not that easy: you have a scalability issue, because most real-life reward signals are weak. As Prof. Hinton said,

“Every so often you get a scalar signal that tells you that you did good, and it’s not very often, and there’s not very much information, and you’d like to train the system with millions of parameters or trillions of parameters just based on this very wimpy signal,”

So the kicker here is how you solve such a scalability issue in reinforcement learning. Prof. Hinton thinks that a hierarchy of goals could be the answer:

“By creating subgoals, and paying off people to achieve these subgoals, you can magnify these wimpy signals by creating many more wimpy signals,” he added.
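
Here is a toy way to picture the "wimpy signal" problem and the subgoal idea in Python. It is only my own illustration of the argument, not anything from the article: `reward_signals`, the path length of 100 steps, and the 0.1 subgoal payoff are all made-up assumptions.

```python
def reward_signals(path_length, subgoal_every=None):
    """Return the per-step rewards an agent sees while walking a path to the goal."""
    rewards = []
    for step in range(1, path_length + 1):
        if step == path_length:
            rewards.append(1.0)  # the rare scalar that says "you did good"
        elif subgoal_every and step % subgoal_every == 0:
            rewards.append(0.1)  # small payoff for reaching a subgoal
        else:
            rewards.append(0.0)  # most of the time there is no feedback at all
    return rewards


def count_nonzero(rewards):
    return sum(1 for r in rewards if r != 0.0)


sparse = reward_signals(100)                    # only the final goal pays
shaped = reward_signals(100, subgoal_every=10)  # a subgoal every 10 steps

print(count_nonzero(sparse), count_nonzero(shaped))
```

Each subgoal reward is still small, but the agent now gets feedback ten times per episode instead of once, which is the "many more wimpy signals" in the quote.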

In any case, this article has a lot of gems, so it deserves a closer look. We also think both Hassabis’ and Prof. Hinton’s views are based on their extensive experience with existing deep learning. To us, their arguments are more convincing than those of, say, pure futurists who blindly believe in the coming of the singularity.

venturebeat.com

2018 in Review.

Here are a couple of posts which review AI development in 2018.

Not exactly review, but this is a good read about convolutional neural networks.

MIT Courses on Deep Learning, Reinforcement Learning and Autonomous Vehicles

The very popular MIT course on SDC has now been expanded into three classes, which also cover the fundamentals of deep learning and reinforcement learning.

mit.edu

Strang on deep learning activation functions.

We saw Prof. Strang’s essay on deep learning activation functions, and of course we are excited about whatever he writes. But more interestingly, Prof. Strang is also writing a new book called “Linear Algebra and Learning from Data”, which is going to be published soon. (!) It is interesting because there has always been a void in how to learn the mathematics of machine learning. That includes topics such as parameter estimation, or more basic techniques in matrix calculus.

You may now order the book, and the publisher says they will confirm order in January.

siam.org

lexfridman.com

AIDL Members frequently asked whether there are interesting podcasts. The answer is yes. One of our recommendations is Prof. Lex Fridman’s “AI Podcast”. All guests he invited are prominent researchers or developers who have contributed to AI. So what they said should interest you.

lexfridman.com

Grow your expertise with these 5 out-of-field reads

AIDL member and data scientist Briana Brownell came up with 5 interesting reads which are supposed to be out-of-field. Yet, once you look at the list closely, all the topics are deeply relevant to what MLEs and data scientists work on.

The one we like most? Perhaps Homo Deus by Yuval Noah Harari. We are reading his “Sapiens”, which is equally thought-provoking.

towardsdatascience.com

Notable AI Blog Posts

Google:

Other News

NYT wrote an obituary for Prof. Karen Sparck Jones

SDC

Alexa :

Others:
Lighthouse is shutting down. Also, the story of Babylon Health.

About Us

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 191,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify

AIDL Weekly #84 – Udacity Reorg, AI Index 2018 Report

Post author By grandjanitor
Post date July 25, 2019

Issue 84 December 17th 2018

Editorial

Thoughts From Your Curators

This week we link you to the latest AI Index report, discuss the impact of reorganization of Udacity and analyze a blog post from OpenAI.

Join our community for real-time discussions here: Expertify

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 186,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

News

Pulse of AI Last Week

Udacity Reorg

When you think about it, the democratization of AI also coincided with the democratization of on-line learning. Remember Coursera’s Machine Learning by Andrew Ng? That was when many of us started to learn about machine learning, but it was also the first time many of us learned an important skill through an on-line curriculum.

Yet these days both Coursera and Udacity seem to have trouble coming up with a sustainable business model. Coursera hired a new CEO back in July last year, and his focus is on searching for a viable business model. Udacity’s route is just as winding and tortuous. They were thinking of filing for an IPO earlier this year, yet they went through a round of layoffs. And it seems they will focus more on corporate training than on the current model.

What does it mean to us AI/DL learners? For starters, it might mean that the beloved subsidies for underprivileged students will be gone from on-line learning sites. Or it could mean that the development of new materials will slow down.

We hope our prediction is wrong. Hopefully, both Udacity and Coursera can come up with business models that are profitable and still cater to learners.

xconomy.com

Blog Posts

Reading the AI Index 2018 Report

The new AI Index report for 2018 was published just last week. We are excited because you may think of it as the one report which captures the state of AI development every year. Looking through the report, you will find sections on the growth of AI as a field and on its public perception. You may also learn the trend of the state of the art in several fields such as computer vision and machine translation.

The 2017 report was criticized for focusing too much on North American development, but this year you can see the report covers many more activities around the world. E.g. we learn that Tsinghua University has the highest increase in AI course enrollment.

Notably missing from the report is automatic speech recognition (ASR). That perhaps has to do with the difficulty of finding one golden benchmark for ASR. As you may know, ASR performance varies tremendously across different noise conditions.

Anyway, we recommend that you read the report in detail.

aiindex.org

How to scale AI Training without voodoo?

In the last few years, we have seen several works in deep learning training which use a large batch size. The advantage of the approach is that you can easily spread the computation of a large batch across several machines.

But here is the problem: how do you decide what batch size to use, or whether a task is suitable for a large batch size at all? OpenAI’s work seems to give at least one answer. They found that a simple quantity, the gradient noise scale, correlates with the optimal batch size of a training task.

Reading the paper, there is a lot to unpack. E.g. the authors favor a simplified version of the metric, which only requires the trace of the per-example gradient covariance matrix and the squared norm of the gradient. They found that this measure correlates with the optimal batch size.

All in all, this is an interesting finding because you now have some guidance for tuning one important parameter, the batch size, in your training. Perhaps the question in practice is: can you measure the gradient noise scale on just a sample of your training set, and then use it for the larger set? And more importantly, do we have similar guidance for tuning other parameters? Those are interesting questions to ask, and if we can solve them, maybe we can truly call neural network training more a science than an art.
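
To get a feel for the quantity, here is a naive sketch in plain Python. As I read the paper, the simplified noise scale is the trace of the per-example gradient covariance divided by the squared norm of the mean gradient. The paper uses a more careful estimator; the plug-in version below, the name `simple_noise_scale` and the toy gradients are my own assumptions for illustration.

```python
def simple_noise_scale(per_example_grads):
    """Naive plug-in estimate of trace(covariance) / |mean gradient|^2."""
    count = len(per_example_grads)
    dim = len(per_example_grads[0])
    # Mean gradient over the examples, used as a stand-in for the true gradient.
    mean = [sum(g[d] for g in per_example_grads) / count for d in range(dim)]
    # Trace of the covariance matrix = sum of the per-coordinate variances.
    trace_cov = sum(
        sum((g[d] - mean[d]) ** 2 for g in per_example_grads) / (count - 1)
        for d in range(dim)
    )
    grad_sq_norm = sum(m * m for m in mean)
    if grad_sq_norm == 0.0:
        raise ValueError("mean gradient is zero; noise scale is undefined")
    return trace_cov / grad_sq_norm


# Toy per-example gradients for a model with two parameters.
noisy_grads = [[1.0, 0.0], [3.0, 0.0], [2.0, 2.0], [2.0, -2.0]]
agreeing_grads = [[2.0, 0.0], [2.0, 0.0], [2.0, 0.0]]

noisy_scale = simple_noise_scale(noisy_grads)
agreeing_scale = simple_noise_scale(agreeing_grads)
print(noisy_scale, agreeing_scale)
```

When the examples disagree a lot relative to the size of the average gradient, the value is large, which suggests that averaging over a bigger batch still buys you something. When all examples agree, the value is zero and a bigger batch adds little.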

openai.com

“Contributing and bringing machine learning to JavaScript ecosystem with TensorFlow.js” by Manraj Singh

There are many posts on AIDL which ask about how to use a deep learning framework. But not many of them are about the experience of contributing to a deep learning engine. Singh’s post is a notable exception.

medium.com

Notable AI Blog Posts

BAIR

Google:

Providing Gender-Specific Translations in Google Translate

Open Source

Papers With Code

Written by Zaur Fataliyev and suggested by one of our AIDL moderators, Zubair Ahmed, this github repository lists papers together with their codebases, and it spans 2013 to 2018. It’s a great resource for everyone who wants not just to read deep learning papers, but also to study the underlying implementations.

github.com

About Us

Join our community for real-time discussions here: Expertify

AIDL Weekly #83 – NeurIPS 2018

Post author By grandjanitor
Post date July 25, 2019

Issue 83 December 10th 2018

Editorial

Thoughts From Your Curators

This week we round up NeurIPS 2018 news for you. We also bring you posts on careers of AI.

Join our community for real-time discussions here: Expertify

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 184,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

News

Pulse of AI Last Week

FAIR at Five – What Do We Think?

How should you think about FAIR’s fifth anniversary? It’s obvious that FAIR has achieved a lot in AI: its image processing expertise is world class, and so is its neural machine translation. PyTorch is wildly popular (but it takes some commercials to catch up with rival TensorFlow).

But are there any weaknesses in their efforts? There are some obvious ones. For example, Facebook is not showing any progress in speech recognition (as we discussed in Issue #80). Five years is more than enough to build a decent speech recognizer if there is a long-term vision.

facebook.com

Blog Posts

Notes on Andrew Ng’s Machine Learning Q&A Webinar

Here are the notes on Andrew Ng’s ML Q&A Webinar, prepared by AIDL member Dylan Ler.

facebook.com

Video

Long Term Future of AI – An Interview with Stuart Russell by Lex Fridman

Here is an interview with Stuart Russell who, as you know, wrote the famous textbook Artificial Intelligence: A Modern Approach. Lex Fridman talked with him about the future of AI.

facebook.com

Member’s Question

Starting a Career in AI

A member of AIDL asked: “How do I start my career in artificial intelligence?”

Answered by Arthur:

“As I promised, here are a few thoughts on the topic. I will focus on the more commercial applications of ML. Of course you can be a professor, but you should know that’s a moonshot.

I’d like to organize them as four parts: should you?, learning, starting out, and first 2-5 years.

should you?

I like Race Vanderdecken’s comment the most. Being an MLE is not for everybody. If you go through a general CS education, chances are you would be trained as a competent programmer. But being an MLE means you need to sit through model training and analyze results. Your instinct for finishing something fast, as learned from competitive programming, is useless. MLE work favors slow and deliberate thinking, which is not the norm in our fast-paced world.

learning

ML is a topic you should, and can, learn in great detail before you practice it. So if you are in school, take as many ML classes as possible. You should spend at least 50% of your time learning the practice of ML. So, train a classifier? Think of ways to improve its classification rate? Improve its inference speed? You should think about these issues every time you work on a new project. And your worth has a lot to do with the experience you accumulate.

starting out

So how do you actually get hired as an MLE then? You go and seek a position. The idea is similar to any job-seeking process: present your resume to your potential employers, then pitch yourself to them. There are other routes: someone might recommend you. You might have done an internship at the company before, so they like you. It can be a placement program. But in any case, you have got to build up your skillset and present it well in a resume.

What do people look for in candidates? First off, it’s your project portfolio. Suppose you want to work for a computer vision company: you really want to have some compelling projects on image processing. So if you tell me you trained on MNIST, I would think, “Okay, this guy went through the basics”. But if you told me you trained on the whole of ImageNet at home, then I would think, “ah, that’s not easy”.

Then it is your general knowledge of ML. In an interview, senior engineers will usually probe for holes in your understanding, and in ML there are many misconceptions. E.g. many people will give you silly and unsubstantiated explanations of what deep learning is, like “it uses big data” or “DL is just deeper than ML”. Those answers are hand-waving and don’t quite explain what deep learning is.

Another thing I do in interviews: I just go ahead and look at the projects quoted in a candidate’s resume, and ask detailed questions on each of them. Very quickly, you realize whether someone is worth their salt.

first 2-5 years

If you are successful and get hired, you will start to go through the daily chores of being an MLE. So what does an MLE do? For the most part you try to make a living through machine learning. The key metric here: is something you made being used? What that entails is that you want to create an ML product, and refine it to the point that a company can sell it. There are many things to unpack here. Just creating something in ML is hard, and usually the prototype’s performance would be too bad for production. Or if something is good enough for production, the company might just decide they don’t want to sell it.

So whether you can start out has everything to do with hard work plus a lot of luck. My suggestion is to start with small projects within a company, then build up your reputation. Make sure you are employed, because if you want to get better in ML, you have to keep educating yourself, and that costs time and $.

after 5 years

I also have advice for people who have stayed in the business for around 5 years. But this comment is getting long, so let’s leave it for next time?”

NeurIPS2018

Round-up of NeurIPS 2018

NeurIPS 2018 took place this week, with its name change and with some reporters shut out of the conference. We heard a couple of stories/reviews from our members. Here is a round-up of the news:

Google’s NeurIPS 2007 paper “The Trade-Offs of Large Scale Learning” by Léon Bottou (then at NEC Labs, now at Facebook AI Research) and Olivier Bousquet (Google AI, Zürich) got the “Test of Time Award”
Presence of Facebook
Reviews so far: from Sicara, from Synced Review, from TowardsDataScience. There is also a more industry-oriented review from VentureBeat. Searching “NeurIPS 2018” on YouTube should give you many videos from the conference too.
And we learnt that Vancouver will be the new host for NeurIPS 2019

Other News

What We Read Last Week

The Alibaba Voice Assistant, and MIT Tech Review thinks it’s better than Google’s.
China’s ByteDance
Riding on Waymo One Experience on the first commercial SDC service.
Rumor: Does Tumblr’s later adult content filter work? Some said otherwise.
Facebook is dust without AI, according to Prof. LeCun

About Us

Join our community for real-time discussions here: Expertify

AIDL Weekly #82 – So You Want to be an AI Researcher?

Post author By grandjanitor
Post date July 25, 2019

Issue 82 December 3rd 2018

Editorial

Thoughts From Your Curators

This week we round up all Amazon’s announcements at Re:invent. We also bring you a trending article from Google’s researcher Vincent Vanhoucke.

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 183,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify

News

Pulse of AI Last Two Weeks

Amazon at Re:invent Roundup

These releases give us the feeling that Amazon is not just playing to its strengths, such as cloud computing; it is also flexing its muscles in deep/machine learning.

Blog Posts

PRML download

News on the great ML Bible PRML – Now you can download it from Microsoft’s website. The exercises are hard, but if you can finish just some of them, you will understand machine learning more deeply.

microsoft.com

So You Want to Be a Research Scientist? By Vincent Vanhoucke

We have followed the career of Vincent Vanhoucke for a while. Back when deep learning was not trendy, he was already sharing interesting comments on deep learning research on his Google Plus account. As you know, he has also developed a deep learning class on Udacity.

This time he wrote an article on how to be a research scientist. His number one piece of advice: “1. Research is about ill-posed questions with multiple (or no) answers”. Well said, Dr. Vanhoucke. In fact, you may say the whole PhD education is about training students to deal with ill-posed questions. All of his advice is deep, and we suggest you go through it point by point.

What we appreciate the most is that, while Dr. Vanhoucke is a researcher, he also appreciates what developers need:

At the same time, sitting in the seat right next to you, your engineer peers are actually building things that will endure, solving well-defined problems, and exercising the same level of creativity and mastery over their subject matter.

Is there anything we can say about being engineers in deep learning? So why don’t we just talk about “Advice No.1”?

“1. Development is about creatively combining well-researched solutions to solve real-life problems”

Oh, and maybe “No. 2”:

“2. Sometimes you need to do research too”

medium.com

Video

Interview with Yoshua Bengio by Lex Fridman

Here is another interview by Lex. This time his guest is one of the “Canadian Mafia” in deep learning, Prof. Yoshua Bengio.

lexfridman.com

Super-Curator on AI

We covered the sale of “Edmond de Belamy, from La Famille de Belamy”, an AI-generated artwork, in Issue 79. How would an art curator see the current trend of AI in art? Artnet has an interview with Carolyn Christov-Bakargiev, a curator and museum director, who is going to receive CCS Bard’s prestigious award for Curatorial Excellence next year.

We call ourselves curators, but we are no artists. So let’s leave it to our readers: do you think AI-generated art is “Artificial Stupidity” too, as Christov-Bakargiev said?

artnet.com

Other News

About Us

Join our community for real-time discussions here: Expertify

AIDL Weekly #81 – “Math is language that uses God” – Vladimir Vapnik

Author: grandjanitor
Post date: July 25, 2019

Issue 81 November 19th 2018

Editorial

Thoughts From Your Humble Curators

This week we bring you the interview of Prof. Vladimir Vapnik, creator of SVM, by Prof. Lex Fridman, the lecturer of the MIT SDC and AGI classes. We also analyze the recent statement by Dr. Ilya Sutskever, in which he said that short-term AGI is possible.

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 181,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

News

Echo is Gaining Ground…..

We learned last week that Amazon is now hiring 10,000 workers for its Amazon Echo. That is double the 5,000 from previous years. We also learned that all this investment is starting to pay off, e.g. Microsoft is now selling Echo online and offline. At least for now, Amazon is the winner in the battle of speech assistants.

Google Health Absorbs DeepMind’s Health Business

The new Google Health group now absorbs DeepMind’s health business. It makes you wonder about a few things:

Is Google Health part of Verily, an Alphabet subsidiary, or are they separate?
What is left of DeepMind’s business now other than ……. game playing?
What does that mean for the future of DeepMind?

One thing is for sure: Google is taking a tighter grip on DeepMind now, potentially because of the bad publicity caused by the DeepMind-Royal Free event (see Issue 20). And Verily, from what we have seen, seems to be more cautious about patient privacy.

cnbc.com

Pulse of AI Last Week

Blog Posts

AI For Everyone

deeplearning.ai just announced a new class which caters to non-technical people. So if you are thinking of preparing material for the non-technical staff at your company, this could be a good resource. Coming in 2019.

facebook.com

Short-Term AGI is a Serious Possibility, said OpenAI Co-Founder Ilya Sutskever

If you only started in AI in the last few years, one thing you might have heard of is the possibility of AGI. These ideas must feel remote to you because they are so different from ideas such as supervised learning or reinforcement learning.

We feel the same way. AI practitioners these days are quite a different group from AGI advocates. So when OpenAI co-founder Ilya Sutskever says short-term AGI is a real possibility, you should probably take notice.

So why do people believe AGI is possible in the first place? Setting aside those who simply believe in AGI blindly, most AGI advocates base their belief on projections from technological progress; Moore’s Law is one of the most often quoted. Here, Sutskever’s arguments are mostly based on the recent progress of deep learning. One of his arguments is that deep learning has “repeatedly and rapidly broken through ‘insurmountable’ barriers”, and he enumerates several important advances in deep learning such as ImageNet, GANs and AlphaGo.

Will his prediction come true, though? You may ask. Of course, just because we have achieved something that seemed impossible a few years ago does not mean we can always do so again. Speaking in deep learning terms: can we come up with a network architecture which integrates multiple senses together? And if we can think of one, do we have enough computational power to train it?

Perhaps this raises a related question: how do you judge a prediction about AI? In our view, since these are predictions, you cannot verify whether they are true now. Perhaps a better way to judge them is to look at the feasibility of their underlying theory. Most controversies about AI predictions, such as the emergence of the more popular “Singularity”, come down to how well we can predict progress in other technologies.

We don’t pretend to have the answer. We suggest you think independently and come up with your own view.

medium.com

“Math is language that uses God” – Vladimir Vapnik

Here is an interesting interview of Prof. Vladimir Vapnik by MIT Prof. Lex Fridman. We have several thoughts just on this two-minute segment, and you may listen to the whole podcast here.

The first is what Vapnik really means. It seems to us that when Vapnik said “Math is a language that uses God”, “God” carries a sense of “deterministic”, close to the meaning when Einstein said to Niels Bohr: “God doesn’t play dice”. So the saying is more like “Math is a language which is deterministic”. Later, he wondered how reality can be described by the simple language of math, which gives us the sense that he believes reality can be described by math.

Perhaps the more important question is “Should reality in machine learning be described by math?” If we see reality as data we can observe and measure, then we should be more reserved about blindly pursuing a simple equation as its description. For example, when we use regression to describe a set of data, what we are looking for is the best description according to some criterion. Say we are choosing the optimal number of mixtures in a Gaussian mixture model (GMM): we can use the likelihood difference, BIC, etc. to decide. We wonder whether, other than actual performance metrics, there is anything that can help us decide among these criteria.
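
To make the BIC criterion mentioned above concrete, here is a minimal sketch of how one might compare GMMs with different numbers of mixtures. It assumes full-covariance components and that the training log-likelihood of each candidate model has already been computed elsewhere. The function names (gmm_num_params, bic, pick_num_components) are our own hypothetical helpers for illustration, not part of any library.

```python
import math


def gmm_num_params(n_components: int, n_dims: int) -> int:
    """Count the free parameters of a full-covariance GMM (an assumption here)."""
    weights = n_components - 1  # mixture weights sum to 1, so one is not free
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2  # symmetric matrices
    return weights + means + covariances


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def pick_num_components(log_likelihoods: dict, n_dims: int, n_samples: int) -> int:
    """Return the candidate number of mixtures with the lowest BIC.

    log_likelihoods maps each candidate number of mixtures to the total
    training log-likelihood of a GMM already fitted with that many mixtures.
    """
    return min(
        log_likelihoods,
        key=lambda k: bic(log_likelihoods[k], gmm_num_params(k, n_dims), n_samples),
    )
```

The point of the sketch is that the likelihood alone always favors more mixtures, while BIC adds a penalty that grows with the parameter count. The choice of penalty is itself one of the criteria being questioned above.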

All in all, this is a very interesting interview. We also asked Prof. Fridman how Prof. Vapnik thinks about the rise of deep learning. Prof. Fridman promised to post another clip on AIDL. So stay tuned.

lexfridman.com

AI Blogs Last Week

Google:

BAIR:

AdaSearch

Floydhub:

On Colorization

Other Interesting News

Uber AI Residency 2019
Should AI make important decisions for humans? - Pew Research
ResNet training in 3.7 mins, but also check out fast.ai’s 18-min setup.
Baby vs Grandma – Another trolley problem which stirs AIDL members.
Autonomous Weapons. What Are the Facts?

About Us

Join our community for real-time discussions here: Expertify
