
Resources for GPT-N

Official papers/blogs

While I have been aware of the technology since GPT-1, I started writing this blog post at the time of GPT-3.  So I will focus on GPT-3 and Image GPT, which are arguably the iterations that fascinate the public the most, but I will also refer back to GPT-1/2 for technical reference.

Resources on GPT-3

  • Top 10 demos curated by Suzana Ilić.
  • Awesome GPT-3 
  • Excellent discussion on GPT-3+ extensive experiments on poetry/essay generation from Gwern Branwen
  • Another excellent discussion on GPT-3, focusing on GPT-3’s addition capability 
  • Another one.
  • Some notes on the original paper:
    • It works on various NLP tasks: “For example, GPT-3 achieves 81.5 F1 on CoQA in the zero-shot setting, 84.0 F1 on CoQA in the one-shot setting, 85.0 F1 in the few-shot setting. Similarly, GPT-3 achieves 64.3% accuracy on TriviaQA in the zero-shot setting, 68.0% in the one-shot setting, and 71.2% in the few-shot setting, the last of which is state-of-the-art relative to fine-tuned models operating in the same closed-book setting.”
    • Surprisingly, GPT-3 excels at several few-shot learning tasks: (p.5) “GPT-3 also displays one-shot and few-shot proficiency at tasks designed to test rapid adaptation or on-the-fly reasoning, which include unscrambling words, performing arithmetic, and using novel words in a sentence after seeing them defined only once. We also show that in the few-shot setting, GPT-3 can generate synthetic news articles which human evaluators have difficulty distinguishing from human-generated articles.” (A minimal prompt-format sketch follows this list.)
    • Unsurprisingly, GPT-3 struggles on several few-shot learning tasks: (p.5, same paragraph as above) “At the same time, we also find some tasks on which few-shot performance struggles, even at the scale of GPT-3. This includes natural language inference tasks like the ANLI dataset, and some reading comprehension datasets like RACE or QuAC.”
    • But the data contamination issue is real: “We also undertake a systematic study of “data contamination” – a growing problem when training high capacity models on datasets such as Common Crawl, which can potentially include content from test datasets simply because such content often exists on the web. In this paper we develop systematic tools to measure data contamination and quantify its distorting effects.”
  • Performance on individual tasks:
    • Few-shot learning allows GPT-3 to do machine translation (p.14); it beats the best unsupervised SOTA.
    • With few-shot learning, it also beats the SOTA on PIQA.
    • [TBD]
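To make the zero/one/few-shot distinction above concrete, here is a minimal sketch of how a few-shot prompt is assembled: the “shots” are simply worked examples concatenated into the prompt, and no weights are updated.  The query_model() call is a placeholder rather than a real API, and the exact prompt format (the Q:/A: markers, the blank-line separators) is my own assumption, not the paper’s.

    # Minimal sketch of few-shot prompting: K worked examples ("shots") are
    # concatenated into the prompt and the model is asked to continue the
    # pattern.  No gradient updates are involved.
    def build_few_shot_prompt(task_description, examples, query):
        lines = [task_description, ""]
        for x, y in examples:                  # the K shots
            lines.append(f"Q: {x}")
            lines.append(f"A: {y}")
            lines.append("")
        lines.append(f"Q: {query}")
        lines.append("A:")                     # the model fills this in
        return "\n".join(lines)

    prompt = build_few_shot_prompt(
        "Unscramble the letters to form an English word.",
        [("pcirne", "prince"), ("taslbe", "tables")],   # K = 2 shots
        "olhcos",
    )
    print(prompt)
    # completion = query_model(prompt)         # hypothetical completion call

With examples=[] the same function gives you the zero-shot prompt: just the task description and the query.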

Training:

Interesting links:

 


Resources on Kaldi

Introduction:

Kaldi is one of the three active open-source ASR projects based on the hybrid approach.  It has perhaps the richest feature set, but it is also seen as the more advanced toolkit to pick up.

I like the toolkit because it works.  Also, ASR developers are colorful people (sometimes too colorful), and I enjoy reading their source code.

(Yes, you need to read the source code to understand Kaldi.)

General Resources:

  • awesome-kaldi – it well deserves to be called “awesome”.  Tons of useful links.
  • this page.

Basic Tutorials – the structure of kaldi, running from egs/ etc

A CMU professor once told me that knowing how to use hybrid ASR toolkits like HTK, Kaldi and Sphinx is really for bright kids.  He is not wrong.   All three toolkits require you to understand ASR well enough to wield them effectively.   Here are a bunch of resources which can help you.

  • HTK Book – We are talking about Kaldi, so why bring up HTK?  Well, Kaldi was a response to HTK.   Both were written as Unix command-line tools.   Compared with Kaldi, HTK was developed as a company codebase (Entropic), so its code is thought of as more refined, but harder to change.  Looking at both toolkits now (2020), I still find the HTK tutorial easier to follow.
  • The original kaldi tutorial – it uses RM, so if you don’t have RM, this is not going to help you run end-to-end.  But it will still teach you the basics of the resources.
  • The original ICASSP 2011 lecture.
  • Eleanor Chodroff’s tutorial – a rare, wordy explanation of the toolkit, with some decent notes on what #senones (the number of senones) really means.
  • Qianhui Wan’s runthrough of stages in a kaldi training – a good high-level run-through of Kaldi’s scripts.  (A minimal recipe-running sketch follows this list.)
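As a concrete (and hedged) illustration of the workflow those tutorials describe, here is roughly what running a recipe looks like.  The recipe path (egs/mini_librispeech/s5) is just an example, and the --stage flag only works in recipes whose run.sh defines a stage variable and sources utils/parse_options.sh, so check the top of the script you actually use.

    # Hedged sketch of the usual "run a recipe" loop.  Paths and the --stage
    # flag are assumptions; adapt them to your own checkout and recipe.
    import subprocess

    RECIPE = "kaldi/egs/mini_librispeech/s5"   # example recipe directory

    # Run the whole pipeline from scratch (data prep, features, training, decoding).
    subprocess.run("./run.sh", cwd=RECIPE, shell=True, check=True)

    # After a failure, resume from a later stage instead of redoing everything
    # (only if run.sh exposes a stage variable through utils/parse_options.sh).
    subprocess.run("./run.sh --stage 6", cwd=RECIPE, shell=True, check=True)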

More advanced topics:

  • First, a survival note.  For the most part, working with Kaldi means working with Unix and sometimes diving deep into its C++/C-level code.  You will get crushed if you expect a “TensorFlow-style” of problem solving.
  • HBKA – WFSTs are one of the cores of a Kaldi-based ASR system, but they are also rather hard to grok.  These days, HBKA is seen as the Bible for learning WFSTs.   The key algorithms are determinization and minimization; they are really weighted variants of the classic FSA algorithms (in the case of minimization, you roughly push the weights and then run the unweighted version).  So to understand what you are doing, you also want the basics of some classic finite-state algorithms, and an automata theory book is very useful too (I use Hopcroft and Ullman).  A toy determinize/minimize example follows this list.
  • If you want to dig deeper, several papers which contain the detailed algorithms (and proofs) of determinization and minimization are here and here.   If HBKA is the Bible, these papers might be the Word. 🙂
  • Other, more wordy tutorials on WFSTs: Vassil Panayotov’s and Josh Meyer’s.
  • By the way, talking about the internals of WFSTs is seen as an “advanced” topic these days.   Most people are using TF/PyTorch, so revolutionary technologies such as WFSTs are slowly being forgotten.
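To get a feel for determinization and minimization before tackling the weighted case, here is a toy sketch using pywrapfst, the Python wrapper that ships with OpenFst (the FST library Kaldi builds on).  One caveat: class names vary across OpenFst versions; in older releases the mutable FST class is fst.Fst() rather than fst.VectorFst().

    # Toy determinize/minimize example with OpenFst's Python wrapper.
    import pywrapfst as fst

    f = fst.VectorFst()                        # tropical semiring by default
    s0, s1, s2 = (f.add_state() for _ in range(3))
    f.set_start(s0)
    f.set_final(s2)
    one = fst.Weight.one(f.weight_type())

    # Two arcs leaving s0 with the same label make the acceptor non-deterministic.
    f.add_arc(s0, fst.Arc(1, 1, one, s1))
    f.add_arc(s0, fst.Arc(1, 1, one, s2))
    f.add_arc(s1, fst.Arc(2, 2, one, s2))

    det = fst.determinize(f)                   # returns a new, deterministic FST
    det.minimize()                             # minimizes in place
    print(det.num_states(), "states after determinization + minimization")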

When you need to hack kaldi……

  • Changing the source code of Kaldi, or of open-source speech recognizers in general, is not the worst thing that can happen to a hacker.  For the most part, you can derive the information you need by reading the source code.   Some modules are terse, e.g. nnet3: say you want to add a new computation command, then you have to go through several classes to make it work.   In the same vein, you don’t really see any description of how individual commands work.   Think of it as assembly code relative to C; you will need to work it out yourself.
  • The good news is …… it’s possible.  As always, you just need some coffee and a comfortable chair.
  • What if you want to read some documentation?  Then go with https://kaldi-asr.org/doc/index.html.   There you can get a high-level understanding of some of the algorithms.
  • I have never worked with Dr. Povey, but I often find his code and descriptions terse.  He certainly knows what he is doing, and many critics simply miss his points, but you need to be experienced in ASR to understand some of his “moves”.
  • Also see the next section on specific questions you may ask about Kaldi.

Questions you will ask when using Kaldi

  • Many data structures in kaldi are not “created with human readability as the first priority“.  (I chuckled when I read this phrase in the Kaldi docs. 🙂 )   But users have often convinced Povey to come up with terse yet readable descriptions.
  • Tree-related: What does the decision tree look like?  Check out copy-tree (see the inspection sketch after this list).   How do decision trees work in Kaldi?  You had better learn what Event Maps are.  The link also shows you what the internals of decision-tree building look like.   A more high-level description can be found here.
  • Transition ID:  What is a transition ID?  Two important answers: it indexes roughly a 5-tuple of the phone, the source HMM state, the forward and self-loop pdf-ids, and the transition itself.  It is also the ID that appears on the input labels of a compiled decoding graph.  See here. 
  • Lattice-related: here.  Also, lattice-copy is your friend (see the sketch after this list).
  • nnet3-related: How does the neural network computation work?  How is a network compiled?  What optimizations are applied to the computation?  If you feel confused about these questions, check out all the nnet3 links from the “nnet3 setup” page.
  • Example generation for NN training:  that one you will never understand unless you “read between the pipes”.   Various Chinese hackers have analyzed it, though; you can easily look it up on Google.
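To make a few of the bullets above concrete, here is a small sketch of the inspection commands I mean.  The paths (exp/tri3b, data/lang, exp/chain/tdnn1a) are placeholders from a typical recipe, not something you will have verbatim, and the Kaldi binaries must be on your PATH (e.g. after sourcing the recipe’s path.sh).

    # Sketch: peeking inside a trained Kaldi system.  Substitute your own paths.
    import subprocess

    def run(cmd):
        print("$ " + cmd)
        subprocess.run(cmd, shell=True, check=True)

    # Dump the phonetic decision tree in text form.
    run("copy-tree --binary=false exp/tri3b/tree - | head -n 40")

    # List transition-ids with their phone / HMM-state / pdf / transition info.
    run("show-transitions data/lang/phones.txt exp/tri3b/final.mdl | head -n 60")

    # Print the first decoded lattices in human-readable (text) form.
    run('lattice-copy "ark:gunzip -c exp/tri3b/decode_test/lat.1.gz |" ark,t:- | head -n 40')

    # Summarize an nnet3 model: components, contexts, parameter counts.
    run("nnet3-am-info exp/chain/tdnn1a/final.mdl | head -n 40")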

Acknowledgement

You always want to thank Dan Povey and the Kaldi team for their great work.   The hybrid approach is not going away any time soon because of them.

Update Logs:

(Sep 7, 2020) Add notes on interesting topics such as tree, transition ID and neural network computation.
(Before Sep 7, 2020) Wrote the backbone of the note.

 


Resources for Quantum Computation/Machine Learning

Books:

Quantum Computation and Quantum Information, the so-called “Mike and Ike”.

Quantum Computing since Democritus by Scott Aaronson.

Course:

EdX Quantum Computing 

Quantum Information Science Part I, Part II, Part III

Daniel Gottesman’s Lectures. 

Tools:

PyQuil

Visualizing quantum circuits and the Bloch sphere: Quirk.
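As a quick sanity check that pyQuil is installed, here is a minimal sketch that builds a Bell-state program.  It only constructs and prints the Quil program; actually executing it needs a QVM or QPU endpoint, and the run API differs between pyQuil versions, so treat the commented-out lines as an assumption.

    # Minimal pyQuil sketch: build (but do not execute) a Bell-state program.
    from pyquil import Program
    from pyquil.gates import H, CNOT

    p = Program(H(0), CNOT(0, 1))   # |00> -> (|00> + |11>) / sqrt(2)
    print(p)

    # To actually run it you would need a QVM, roughly (API varies by version):
    #   from pyquil import get_qc
    #   qc = get_qc("2q-qvm")
    #   result = qc.run(qc.compile(p))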

Websites/Blogs:

Scott Aaronson’s Shtetl-Optimized

Quantum Computing Factsheet.

Videos:

David Deutsch’s video lectures. 

Michael Nielsen’s video lectures.


AIDL Weekly Past Links

Here is the link to all past issues of AIDL Weekly.

Indeed, we stopped the weekly publication of AIDL Weekly early this year.   Many of you asked why – nothing much, really.  Waikit and I are just busy with our lives, and writing a weekly newsletter is tough.

But never say never, we might come back in the future.  So stay tuned.

Arthur

 


AIDL Weekly Issue 86 – CES 2019, Edward Grefenstette and Common Voice

Editorial

Thoughts From Your Curators

This week on AIDL:

  • AI news from CES 2019
  • Facebook’s poaching of Edward Grefenstette from DeepMind
  • We also talk about the significance of the Common Voice Initiative by Mozilla

Join our community for real-time discussions here: Expertify



News



CES

Baidu:

AMD:

Google:

And SDC news from CES:


Blog Posts


Open Source

Other News

Other Interesting Stories


About Us

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 193,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify



AIDL Weekly #85 – AGI is Nowhere Near – According to Prof. Hinton

Artificial Intelligence and Deep Learning Weekly – Issue 85

Editorial

Thoughts From Your Curators

We were on vacation during Christmas and we are back this week. We bring you several interesting links:

  • What does Prof. Hinton think of AGI?
  • The MIT course on deep learning, reinforcement learning and SDC
  • 2018 in review.

Join our community for real-time discussions here: Expertify


This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 193,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.


News

Blog Posts



2018 in Review.

Here are a couple of posts which review AI development in 2018.

Not exactly a review, but this is a good read about convolutional neural networks.







Notable AI Blog Posts

Google:


About Us

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 191,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify



AIDL Weekly #84 – Udacity Reorg, AI Index 2018 Report

Editorial

Thoughts From Your Curators

This week we link you to the latest AI Index report, discuss the impact of Udacity’s reorganization, and analyze a blog post from OpenAI.

Join our community for real-time discussions here: Expertify



News

Pulse of AI Last Week



Blog Posts




Open Source

About Us

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 186,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify



AIDL Weekly #83 – NeurIPS 2018

Editorial

Thoughts From Your Curators

This week we round up NeurIPS 2018 news for you. We also bring you posts on careers in AI.

Join our community for real-time discussions here: Expertify



News

Pulse of AI Last Week



Blog Posts

Video

Member’s Question

Starting a Career in AI

A member of AIDL asked: “How do I start my career in artificial intelligence?”

Answered by Arthur:

“As I promised, here are a few thoughts on the topic. I will focus on the more commercial applications of ML. Of course you can become a professor, but you should know that’s a moonshot.

I’d like to organize this into four parts: should you?, learning, starting out, and the first 2-5 years.

should you?

I like Race Vanderdecken’s comment the most. Being an MLE is not for everybody. If you go through a general CS education, chances are you were trained to be a competent programmer. But being an MLE means you need to sit through model training and analyze results. The instinct of finishing something fast, learned from competitive programming, is useless here. MLE work favors slow and deliberate thinking, which is not the norm in our fast-paced world.

learning

ML is a topic you should, and can, learn in great detail before you practice it. So if you are in school, take as many ML classes as possible. You should also spend at least 50% of your time learning the practice of ML. Say you train a classifier: can you think of ways to improve its classification rate? Its inference speed? You should think about these issues every time you work on a new project, and your worth has a lot to do with the experience you accumulate this way.

starting out

So how do you actually get hired as an MLE? You go and seek a job. The idea is similar to any job-seeking process: present your resume to potential employers, then pitch yourself to them. There are other routes: someone might recommend you, you might have done an internship at the company so they already like you, or it could be a placement program. But in any case, you have to build up your skillset and present it well in a resume.

What do people look for in candidates? First off, your project portfolio. Suppose you want to work for a computer vision company; you really want to have some compelling projects on image processing. So if you tell me you trained on MNIST, I would think, “Okay, this guy went through the basics.” But if you tell me you trained on the whole ImageNet at home, then I would think, “Ah, that’s not easy.”

Then it is your general knowledge of ML. In an interview, senior engineers will usually probe for holes in your understanding, and in ML there are many misconceptions. For example, many people will give you silly and unsubstantiated characterizations of what deep learning is, like “it uses big data” or “DL is just deeper than ML”. Those answers are hand-waving and don’t quite explain what deep learning is.

Another thing I do in interviews: I go through the projects listed in a candidate’s resume and ask detailed questions about each of them. Very quickly, you realize whether someone is worth their salt.

first 2-5 years

If you are successful and get hired, you will start to go through the daily chores of being an MLE. So what does an MLE do? For the most part, you try to make a living through machine learning. The key metric here: is something you made actually being used? That entails creating an ML product and refining it to the point where a company can sell it. There is a lot to unpack here, because just creating something in ML is hard, and usually the prototype’s performance is too poor for production. Or, even if something is good enough for production, the company might decide they don’t want to sell it.

So whether you can get started has everything to do with hard work plus a lot of luck. My suggestion is to start with small projects within a company and build up your reputation. Make sure you stay employed, because if you want to get better at ML, you have to keep educating yourself, and that costs time and money.

after 5 years

I also have advice for people who have stayed in the business for around 5 years, but this comment is getting long, so let’s leave that for next time.”


NeurIPS2018

Round-up of NeurIPS 2018

NeurIPS 2018 happened this week, with its name change and some reporters shut out from the conference. We heard a couple of stories/reviews from our members. Here is a round-up of the news:


Other News

What We Read Last Week


About Us

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 184,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify



AIDL Weekly #82 – So You Want to be an AI Researcher?

Editorial

Thoughts From Your Curators

This week we round up all of Amazon’s announcements at re:Invent. We also bring you a trending article from Google researcher Vincent Vanhoucke.



Join our community for real-time discussions here: Expertify


News


Amazon at Re:invent Roundup

These releases give us the feeling that Amazon is not just playing to its strengths, such as cloud computing; it is also flexing its muscles in deep/machine learning.


Blog Posts


Video


About Us

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 183,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify


 


AIDL Weekly #81 – “Math is language that uses God” – Vladimir Vapnik

Editorial

Thoughts From Your Humble Curators

This week we bring you the interview of Prof. Vladimir Vapnik, creator of the SVM, by Lex Fridman, who teaches the MIT SDC and AGI classes. We also analyze the recent statement by Dr. Ilya Sutskever in which he said short-term AGI is possible.


This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 181,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify


News

Echo is Gaining Ground…..

We learned last week that Amazon is now hiring 10,000 workers for Amazon Echo, double the 5,000 of previous years. We also learned that all this investment is starting to pay off, e.g. Microsoft is now selling the Echo online and offline. At least for now, we all know that Amazon is the winner in the battle of the speech assistants.




Blog Posts




AI Blogs Last Week

Google:

BAIR:

Floydhub:



Other Interesting News


About Us

This newsletter is published by Waikit Lau and Arthur Chan. We also run Facebook’s most active A.I. group with 183,000+ members and host an occasional “office hour” on YouTube. To help defray our publishing costs, you may donate via link. Or you can donate by sending Eth to this address: 0xEB44F762c58Da2200957b5cc2C04473F609eAA65.

Join our community for real-time discussions here: Expertify
