A Talk by Yann LeCun
The talk I attended today was hosted by the CMU Robotics Institute and given by Yann LeCun, Director of AI Research at Facebook and Professor of Computer Science at New York University. I was quite excited about the talk because of the speaker's experience and the popularity of the topic to be discussed, deep learning.
Professor LeCun allocated considerable time at the beginning of his talk to explaining convolutional neural networks. He then briefly explained deep learning and its relation to neural networks. As he put it, deep learning means learning hierarchical representations; in the context of neural networks, it means having more than one stage of nonlinear transformation in the network. He used an example from human vision to illustrate a deep neural network: the hierarchical structure of the visual cortex used for simple object recognition. Supervised convolutional neural networks are used for recognizing multiple objects, and one application area for such supervised networks is driving cars. Professor LeCun mentioned a recent project by CMU and Nvidia on the use of neural nets for driving cars.
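To make the "more than one stage of nonlinear transformation" definition concrete, here is a minimal sketch (my own illustration, not code from the talk) of a forward pass through two stacked nonlinear stages; the weights and sizes are made up for the example:

```python
import numpy as np

def relu(x):
    # Elementwise nonlinearity applied between stages
    return np.maximum(0.0, x)

def deep_forward(x, weights):
    # Each weight matrix is one stage of nonlinear transformation;
    # "deep" simply means more than one such stage is stacked.
    h = x
    for W in weights:
        h = relu(W @ h)
    return h

rng = np.random.default_rng(0)
x = rng.normal(size=4)
# Two stages (4 -> 5 -> 3), so by the talk's definition this network is "deep".
weights = [rng.normal(size=(5, 4)), rng.normal(size=(3, 5))]
out = deep_forward(x, weights)
```

Each stage re-represents the output of the previous one, which is the sense in which the representation is hierarchical.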
In the second part of the talk, he discussed memory-augmented networks (differentiable computers). The rationale behind augmenting networks with memory is that recurrent nets cannot remember things for very long. Differentiable memory is like a soft RAM circuit or a soft hash table, and it stores key-value pairs. An example application is building a machine that answers questions.
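The "soft hash table" idea can be sketched in a few lines (a toy illustration under my own assumptions, not the architecture from the talk): instead of looking up one key exactly, the memory returns a softmax-weighted mixture of all stored values, which makes the lookup differentiable.

```python
import numpy as np

def soft_read(query, keys, values, beta=1.0):
    # Soft hash-table lookup: weight every stored value by the softmax
    # similarity of its key to the query. Higher beta -> sharper lookup.
    scores = beta * (keys @ query)
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w /= w.sum()
    return w @ values  # convex combination of stored values

keys = np.eye(3)                         # three stored keys
values = np.array([[1.0], [2.0], [3.0]])  # their associated values
q = np.array([0.0, 0.0, 1.0])            # query matching the third key
out = soft_read(q, keys, values, beta=10.0)  # close to 3.0
```

Because every step is a smooth function of the query and the memory contents, gradients can flow through the read operation, which is what "differentiable" buys you.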
The next section was about obstacles to progress in AI. To name a few: machines need to understand/learn how the world works, and machines have no common sense. In fact, throughout the talk he stressed the importance of common sense for learning. Humans and animals have this skill and improve it by observing their environment. He defined common sense as inference, or filling in the blanks, like inferring physical laws.
Then he briefly described the architecture of an AI system. It mainly consists of three components: the agent itself, an objective function, and the world. The objective function takes the state of the agent as input. The agent takes percepts and observations from the world as input and returns outputs/actions to the world. He also elaborated on the inner architecture of an agent. Basically, an agent has the following components: a world simulator, an actor, and a critic (which makes predictions about the future). In the next section, he discussed a challenge in this architecture: how to simulate the real world. This is a challenge because the real world is unpredictable. It could be overcome with unsupervised learning, which is a proper method for dealing with uncertainty. He then continued with an explanation of one unsupervised learning technique: energy-based unsupervised learning.
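The actor/critic/world-simulator loop can be sketched as a toy program (all three components and the dynamics here are stand-ins I made up for illustration; the real components would be learned networks):

```python
# Minimal sketch of the agent architecture described above: the actor
# proposes actions, the world simulator predicts their outcomes, and the
# critic scores those predicted future states.

def simulate(state, action):
    # World-simulator stand-in: predicts the next state (toy 1-D dynamics).
    return state + action

def critic(state):
    # Scores a (predicted) state; here the agent "wants" the state 10.0.
    return -abs(state - 10.0)

def actor(state, candidate_actions):
    # Pick the action whose simulated outcome the critic rates highest.
    return max(candidate_actions, key=lambda a: critic(simulate(state, a)))

state = 0.0
for _ in range(5):
    a = actor(state, [-1.0, 0.0, 1.0, 2.0, 3.0])
    state = simulate(state, a)  # act in the (here, perfectly simulated) world
```

The challenge LeCun raised lives in `simulate`: in this toy the simulator is exact, but a real world model must predict an unpredictable environment, which is where unsupervised learning comes in.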
The next part was about adversarial training. He mostly demonstrated sample applications, including initial results of projects his labs are working on. One application area is generating new images from a sample set of images: for example, using face images of several men and women, with and without glasses, to generate several plausible images of women's faces, a technique he called face algebra. Another application domain is video prediction. As he stated, the aim there is to answer the question of whether we can train machines to predict the future using unsupervised learning.
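The "face algebra" idea is that arithmetic on the latent codes a generator consumes composes attributes of the generated images. A toy sketch with made-up 2-D codes (axis 0 standing for "glasses", axis 1 for "gender"; a real system would learn these directions):

```python
import numpy as np

# Hypothetical latent codes; in a real model these would be the vectors
# a trained generator maps to face images.
man_with_glasses = np.array([1.0, -1.0])
man_no_glasses   = np.array([0.0, -1.0])
woman_no_glasses = np.array([0.0,  1.0])

# man_with_glasses - man + woman  ->  code for a woman with glasses
result = man_with_glasses - man_no_glasses + woman_no_glasses
```

Feeding `result` to the generator would then produce an image of a woman with glasses, which is the effect demonstrated in the talk.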
Overall, the talk was exciting. However, I could not get much out of it, most probably because I do not have enough background knowledge about neural networks and deep learning. Although I took data mining and machine learning courses and implemented neural networks, I do not have extensive research experience in this domain. Still, it was good to learn about the current state of the art and his work on this trendy topic.