> cd ~/src/github
> git clone https://github.com/bitcoin/bitcoin.git
> cd bitcoin
> git checkout v0.14.2
> ./autogen.sh
> ./configure --without-gui --disable-tests --disable-wallet
> make -j 4
I have been taking a break from deep learning, and I am quite into graphical models (GM) lately. That is why I am gathering resources for understanding various concepts of GM.
Here are some useful courses one can use. They are not sorted or categorized; it's just useful for me to look through them later.
Courses:
Note that, except for Koller's class, not all of the following classes have videos available.
Books:
As always, the AIDL admins routinely look at whether certain posts should stay on our forum. Our criteria have three pillars: relevance, non-commercial, and accuracy. (Q13 of the AIDL FAQ)
This time I look at "The Post-Quantum Mechanics of Conscious Artificial Intelligence". The video was brought up by an AIDL member, and he recommended we start from the 40-minute mark.
So I listened through the video as recommended.
Indeed, the post is non-commercial for sure. And yes, it mentions AGI and Roger Penrose, so it is relevant to AIDL. But is it accurate? I'm afraid my lack of a physics background trips me up here, so I would judge that "I cannot decide" on the topic. Occasionally new science comes in a form no one understands yet, and calling something inaccurate without knowing is not appropriate.
As a result this post stays. But please keep on reading.
That said, I don't mind giving a *strong* response to the video, for the following three reasons:
1. According to Wikipedia, most of Dr. Jack Sarfatti's theories and work are not *peer-reviewed*. He left academia in 1975, most of his work is speculative, and much of it is self-published(!). There is no experimental proof of what he says. He was asked several times in the video about his ideas, and he just said, "You will know that it's real." That's a sign that he doesn't really have evidence.
2. Then there is the idea of "Post-Quantum Mechanics". What is it? The information we can get is really scanty: I could only find one group which seems to be dedicated to such study, as in here. Since I can't quite decide whether the study is valid, I would say "I can't judge." But I also couldn't find any other group which actively supports such a theory, so maybe we should call it at best "an interesting hypothesis". Sarfatti also builds his argument on the existence of a "Post-Quantum Computer". What is that? Again, I cannot quite find the answer online.
You should also be aware that current quantum computers have limited capability. D-Wave's quantum computing is based on quantum annealing, and many dispute whether it is true quantum computing. In any case, both "conventional" quantum computing and quantum annealing have nothing to do with a "Post-Quantum Computer". Again, that should make you very suspicious.
3a. Can all these interesting theories be the mechanism of the brain or AGI? In the video, Sarfatti mentioned the brain/AGI four times. He makes two points, which I will counter in turn. The first is that if you believe Penrose's theory that neurons are related to quantum entanglement, then his own theory based on post-quantum mechanics would be huge. But once you listen to serious computational neuroscientists, you find they are very cautious about whether quantum theory is the basis of neuronal exchange of information. There is plenty of experimental evidence that neurons operate by electrical and chemical signals, but these operate at a much bigger scale than quantum mechanics. Why Penrose would suggest otherwise has made many learned people scratch their heads.
3b. Then there is the part about the Turing machine. Sarfatti believes that because the "post-quantum computer" is so powerful, it must be the mechanism used by the brain. What's wrong with such an argument? First: no one knows what a "post-quantum computer" is, as I just mentioned in point 2. But even if it were powerful, that doesn't mean the brain has to follow such a mechanism. The same can be said of our current quantum computing technologies.
Finally, Sarfatti himself believes that it is a "leap of faith" to believe that consciousness is a wave. I admire his passion for speculating about the world of science and human intelligence. Yet I also learned from reading Gardner's "Fads and Fallacies" that many pseudoscientists have charismatic personalities.
So Members, Caveat Emptor.
Arthur
AIDL member Bob Akili asked (rephrased):
What is the Difference between Deep Learning and Machine Learning?
Usually I don't write a full blog post to answer a member's question. But what "deep" means is such a fundamental concept in deep learning, and there are so many well-meaning but incorrect answers floating around, that I think it is a great idea to answer the question clearly and hopefully dispel some of the misconceptions as well. Here is a cleaned-up and expanded version of my comment to the thread.
First of all, deep learning is just a subset of machine learning techniques. You may have heard from many "Deep Learning Consultant" types that "deep learning is completely different from machine learning". But when we talk about "deep learning" these days, we are really talking about neural networks which have more than one layer. Since a neural network is just one type of ML technique, it doesn't make any sense to call DL "different" from ML. It might work for marketing purposes, but the thought is clearly misleading.
So now we know that deep learning is a kind of machine learning, but we still can't quite answer why it is special. Let's be more specific: deep learning is a kind of representation learning. What is representation learning? Representation learning is the opposite of another school of thought/practice: feature engineering. In feature engineering, humans are supposed to hand-craft features to make machines work better. If you have done Kaggle before, this should be obvious to you: sometimes you just want to manipulate the raw inputs and create new features to represent your data.
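To make feature engineering concrete, here is a minimal sketch (the columns and the derived ratio are made up for illustration, not taken from any particular competition): a human decides that a derived quantity is informative and adds it by hand.

import pandas as pd

# Hypothetical raw columns; a human decides that spend-per-visit
# is an informative feature and crafts it manually.
df = pd.DataFrame({
    "total_spend": [120.0, 50.0, 300.0],
    "num_visits": [4, 1, 10],
})
df["spend_per_visit"] = df["total_spend"] / df["num_visits"]
print(df)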
Yet in some domains which involve high-dimensional data, such as images, speech or text, hand-crafting features was found to be very difficult. For example, using HOG-type approaches to do computer vision usually takes a PhD student 4-5 years. So here we come back to representation learning: can a computer automatically learn good features?
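For contrast, this is roughly what a hand-crafted image feature looks like: a HOG descriptor computed with scikit-image. The image here is random noise and the parameters are just common defaults, so treat this as an illustrative sketch rather than a real vision pipeline.

import numpy as np
from skimage.feature import hog

# A random 64x64 grayscale "image" stands in for real data.
image = np.random.rand(64, 64)

# Hand-crafted descriptor: histogram of oriented gradients (HOG),
# designed by humans rather than learned from data.
features = hog(
    image,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
)
print(features.shape)  # a fixed-length feature vector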
Now we come to the part about why deep learning is "deep": usually we call a method "deep" when we are optimizing a nested function in the method. For example, if you express such a function as a graph, you would find that it has multiple layers. The term "deep" really describes this "nestedness". That should explain why we typically call any artificial neural network (ANN) with more than one hidden layer "deep", or, as the general saying goes, "deep learning is just a neural network which has more layers".
(Another appropriate term is "hierarchical". See footnotes [3] and [4] for more detail.)
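As a toy illustration of this "nestedness" (my own sketch, not from any textbook), a network with two hidden layers is literally a nested function f3(f2(f1(x))), where each f_i is an affine map followed by a nonlinearity:

import numpy as np

def layer(x, W, b):
    """One f_i: an affine map followed by a ReLU nonlinearity."""
    return np.maximum(0.0, W @ x + b)

# Toy weights for a network with two hidden layers (sizes are arbitrary).
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 8)), np.zeros(8)
W3, b3 = rng.standard_normal((3, 8)), np.zeros(3)

x = rng.standard_normal(4)

# The "deep" part: the output is the nested composition f3(f2(f1(x))).
output = W3 @ layer(layer(x, W1, b1), W2, b2) + b3  # last layer is linear
print(output)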
This is also the moment when Karpathy, in cs231n, shows you a multi-layer CNN in which features are automatically learned, from the simplest to more complex ones, so that the last layer can differentiate them using just a linear classifier. There is a "deep" structure that learns the right features for that last layer. Note that the key term here is "automatic": all these Gabor-filter-like features are not hand-made. Rather, they are results of back-propagation [3].
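Here is a minimal PyTorch-style sketch of that structure (my own, not Karpathy's actual cs231n code): a stack of convolutional layers whose filters are learned by back-propagation, with a plain linear classifier as the last layer.

import torch
import torch.nn as nn

# A tiny ConvNet sketch: the conv layers learn features automatically
# via back-propagation, and the last layer is a plain linear classifier.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # low-level, Gabor-like filters
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # more complex features
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                    # linear classifier on learned features
)

dummy_batch = torch.randn(4, 3, 32, 32)           # e.g. CIFAR-sized images
print(model(dummy_batch).shape)                   # torch.Size([4, 10])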
Actually, there are plenty: deep Boltzmann machines, deep belief networks, deep Gaussian processes. They are still discussed in unsupervised learning with neural networks, but I always found that knowledge of graphical models is more important for understanding them.
Yes and no. It depends on who you talk to. If you talk with ANN researchers/practitioners, they would just tell you "deep learning is just a neural network which has more than one hidden layer". Indeed, if you think from their perspective, the term "deep learning" could just be a short form. Yet, as we just said, you can also call other methods "deep", so the adjective is not totally void of meaning. But many people would also tell you that because "deep learning" has become such a marketing term, it can now mean many different things. I will say more in the next section.
The term "deep learning" has also been around for decades. Check out Prof. Schmidhuber's thread for more details.
I said it with much authority and I know some of you guys would just jump in and argue:
“What about word2vec? It is nothing deep at all, but people still call it Deep learning!!!” “What about all wide architectures such as “wide-deep learning“?” “Arthur, You are Making a HORRIBLE MISTAKE!”
Indeed, the term "deep learning" is being abused these days. More learned people, on the other hand, are usually careful about calling certain techniques "deep learning". For example, in the cs224d 2015/2016 lectures, Dr. Richard Socher was quite cautious about calling word2vec "deep". His supervisor, Prof. Chris Manning, who is an authority in NLP, is known to dispute whether deep learning is always useful in NLP, and whether some recent advances in NLP are really due to deep learning [1][2].
I think these cautions make sense. Part of it is that calling everything "deep learning" just blurs what really should be credited for certain technical improvements. The other part is that we shouldn't see deep learning as the only type of ML worth studying. There are many ML techniques, and some of them are more interesting and practical than deep learning in practice. For example, deep learning is not known to work well in small-data scenarios. Would I just yell at my boss and say, "Because I can't use deep learning, I can't solve this problem!"? No, I would just test out random forests, support vector machines, GMMs and all the other nifty methods I have learned over the years.
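As an illustration of what "just test out the nifty methods" might look like, here is a hedged scikit-learn sketch on a small toy dataset (the dataset and model choices are mine for illustration; a GMM-based classifier would need a bit more plumbing, so only random forest and SVM are shown):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# A small dataset where deep learning has no particular advantage.
X, y = load_iris(return_X_y=True)

models = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf", gamma="scale"),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean 5-fold accuracy = {scores.mean():.3f}")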
So now we come to the arena of misconceptions. I am going to discuss two claims which many people keep drumming about deep learning, but neither of them is the right answer to the question "What is the difference between deep learning and machine learning?"
The first one you have probably heard all the time: "Deep learning is about ML methods which use a lot of data." Or people would tell you, "Oh, deep learning just uses a lot of data, right?" This sounds about right; deep learning these days does use a lot of data. So what's wrong with the statement?
Here is the answer: while deep learning does use a lot of data, before deep learning other techniques used tons of data too. For example, speech recognition before deep learning, i.e. HMM+GMM, could use up to 10k hours of speech. The same goes for SMT. And you can do SVM+HOG on ImageNet. More data is always better for those techniques as well. So if you say "deep learning uses more data", you forget that the older techniques can also use more data.
What you can claim is that "deep learning is a more effective way to utilize data". That is very true, because both GMMs and SVMs run into scalability issues: GMMs scale badly when the amount of data reaches around 10k hours, and SVMs (with an RBF kernel in particular) are super tough/slow to use when you have around 1 million data points.
The second claim is that deep learning is about using GPUs and special hardware. This claim is different from the previous "data requirement" claim, but we can debunk it in a similar manner. Why is it wrong? Again, before deep learning, people were already using GPUs to do machine learning. For example, you can use a GPU to speed up GMM training. Before deep learning was hot, you needed a cluster of machines to train acoustic models or language models for speech recognition, and you needed tons of RAM to train a language model for SMT. So calling GPUs/data centers/RAM/ASICs/FPGAs a differentiator of deep learning is just misleading.
You can say, though, that "deep learning has changed the computational model from a distributed network model to a more single-machine-centric paradigm (in which each machine has one GPU), although later approaches also try to combine CPU and GPU processing together".
Indeed, you should always treat what you read online with a grain of salt. Being critical is a good thing, and having your own opinion is good. But you should also avoid equivocating on an issue: sometimes things have only one side, yet you insist there are two equally valid answers. If you do so, you are perhaps making a logical error in your thinking. And a lot of people who make claims such as "deep learning is learning which uses more data and a lot of GPUs" are probably making such thinking errors.
That said, I would suggest you read several good sources to judge my answer:
In any case, I hope this article helps you. I thank Bob for asking the question. Armaghan Rumi Naik has debunked many misconceptions in the original thread; his understanding of machine learning is clearly above mine, and he was able to point out mistakes from other commenters. The thread is worth your reading time.
[1] See “Last Words: Computational Linguistics and Deep Learning”
[2] Generally, whether DL is useful in NLP is a widely disputed topic. Take a look at Yoav Goldberg's view on some recent GAN results on language generation. AIDL Weekly #18 also gave an exposé on the issue.
[3] Perhaps another useful term is "hierarchical". In the case of ConvNets the term is right on. As Eric Heitzman comments at AIDL:
"(deep structures) They are *not* necessarily recursive, but they *are* necessarily hierarchical, since layers always form a hierarchical structure." After Eric's comment, I think both "deep" and "hierarchical" are fair terms to describe methods in "deep learning". (Of course, "hierarchical learning" is a much poorer marketing term.)
[4] In an earlier draft, I used the term "recursive" to describe "deep", which, as Eric Heitzman at AIDL points out, is not entirely appropriate. "Recursive" gives people the feeling that the function is self-recursive, i.e. $latex f(f( \ldots f(f(*))))$, but the actual functions are more "nested", like $latex f_1(f_2( \ldots f_{n-1}(f_n(*))))$. As a result, I removed the term "recursive" and just call the function a "nested function".
Of course, you should be aware that my description is not too mathematically rigorous either. (I guess it is a fair wordy description, though.)
History:
20170709 at 6: fix some typos.
20170711: fix more typos.
20170711 at 7:05 p.m.: I got a feedback from Eric Heitzman who points out that the term “recursive” can be deceiving. Thus I wrote footnote [4].
If you like this post, subscribe to the Grand Janitor Blog's RSS feed. You can also find me (Arthur) on Twitter, LinkedIn, Plus, and Clarity.fm. Together with Waikit Lau, I maintain the Deep Learning Facebook forum. Also check out my awesome employer: Voci.
Here’s an awesome post from Prof. Tao:
What are some useful, but little-known, features of the tools used in professional mathematics?
As AIDL grew, once in a while people would talk about how blockchain would affect AI or deep learning. Currently it is still a long shot, but blockchain by itself is a very interesting technology and it deserves our notice.
Here are some resources you may use to learn about blockchain. Unlike the "Top 5 List" for AIDL, I don't really understand this technology too well. But unlike the "List of Neuroscience MOOCs", Greg Dubela did give me a lot of recommendations on what you should learn. This post is thus also used as a resource post in "Blockchain and Crypto".
Introductory Videos:
MOOC
Blockchain is still a new development, so it's harder to find a MOOC which teaches you the whole thing in its entirety. We found a couple of exceptions:
Books
Technical Specifications:
Visualizing Blockchain
Forum:
Different Cryptos: (under construction)
Learning blockchain these days usually means knowing the characteristics of different coins. Here is a list of interesting ones.
Bibliography:
To be reviewed:
This is an impression post on Coursera's "Computational Neuroscience" by Rao and Fairhall (cross-posted in both AIDL and CNAGI):
Hope you enjoy this “impression”!
Arthur
Some misadventures on MacOS X:
Some gist about fasttext:
Some other nice resources one can follow:
http://sebastianruder.com/word-embeddings-1/index.html#continuousbagofwordscbow
http://textminingonline.com/fasttext-for-fast-sentiment-analysis
AFAIK:
These are the three I plan to explore, and this comparison seems to be a fairly comprehensive survey of the simulation software space.