AI that can handle different tasks without human tuning is closer to the way humans think
Marcus, a professor of psychology and cognitive science at New York University, has recently been at odds with DeepMind, the artificial intelligence company. After questioning on Twitter the Rubik's cube robotic hand built by OpenAI, the US research organization focused on general artificial intelligence, he has now raised six major questions about the evolved version of AlphaStar, the StarCraft II agent launched by DeepMind. This time, his doubts concern not the agent's performance in the game itself but something at a higher level: its significance for future research on general intelligence.
The coolest results in recent years have come from deep reinforcement learning
According to OpenAI, its Rubik's cube robot does not use a specialized algorithm to solve one specific task (an approach that requires reprogramming whenever the task changes). Instead, a learning method is used to train the robot so that it acquires human-like problem-solving ability. But Marcus thinks this description of the results is misleading. A more appropriate description, he argues, would be "manipulating a Rubik's cube with reinforcement learning" or "progress in manipulating objects with dexterous robotic hands."
"Marcus puts too much emphasis on 'manipulating a Rubik's cube with reinforcement learning'. In fact, both OpenAI's Rubik's cube robot and the evolved version of AlphaStar, the StarCraft II agent released by DeepMind, use deep reinforcement learning. Deep reinforcement learning is currently recognized as the existing technology most likely to achieve general artificial intelligence," said Hao Jianye, an associate professor in the School of Software, College of Intelligence and Computing, Tianjin University. He explained that machine learning currently has three major branches: supervised learning, unsupervised learning, and reinforcement learning. Deep learning is currently the most mainstream technique within supervised learning. Deep reinforcement learning is a fusion of deep learning and reinforcement learning, integrating deep neural networks into the reinforcement learning framework.
"In recent years, deep reinforcement learning has developed rapidly and has shown great potential in handling complex, multifaceted decision-making problems. At present, it is mainly applied in games and competitions," Hao Jianye said. Starting in 2016, Google's AlphaGo defeated the world's top Go players Lee Sedol and Ke Jie, a milestone in the field of artificial intelligence. The core of AlphaGo is a deep reinforcement learning algorithm that lets the computer keep improving its playing strength through self-play. Since then, OpenAI's agents have defeated top professional players at Dota 2, and Libratus, the Texas Hold'em AI developed by a CMU team, has easily defeated top players.
In addition, DeepMind uses deep reinforcement learning to optimize the energy consumption of data centers. Google uses deep reinforcement learning to automate the architecture search for deep neural networks and offers the AutoML service, which brings machine learning as a service to a much wider audience. In China, too, there are many applications of deep reinforcement learning. Domestic teams at companies such as Alibaba, Tencent, and Baidu apply it to decision-making in practical problems such as search, recommendation, marketing, dispatching, and path planning.
The technology most likely to achieve general artificial intelligence
Much of the credit for artificial intelligence reaching its current level belongs to deep learning algorithms. Deep learning uses multi-layer neural networks to learn from massive amounts of data in order to make predictions, which makes artificial intelligence systems increasingly capable. Security monitoring, autonomous driving, voice recognition, Baidu Maps, and other applications in use today all apply deep learning in fields such as computer vision, speech recognition, and natural language understanding.
Reinforcement learning is also a hot technology in the current machine learning field. Unlike supervised learning, which trains models on known labels, reinforcement learning can learn autonomously, much as a human does, without the computer being given explicit instructions. After enough learning, a reinforcement learning system can predict the correct result. "The basic idea of reinforcement learning is to learn which behavior maximizes the expected return in different environments and different states," Hao Jianye explained. The new version of the AlphaStar agent uses the self-play technique of reinforcement learning, and its learning process requires no data annotation; it is driven instead by the reward function. When the agent earns a reward score or wins a game, it receives positive feedback, and it adjusts its behavior according to how the match went. This is like a baby learning to walk, adjusting its movements according to the results.
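The reward-driven trial and error described above can be illustrated with a minimal sketch. It uses tabular Q-learning, a standard textbook algorithm, on a made-up five-cell corridor; it is not AlphaStar's training method, and the environment, the function names, and every parameter value are assumptions chosen only for illustration.

```python
import random

N_STATES = 5       # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)   # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: the only reward (1.0) comes from reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values from rewards alone; no labeled data is used."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so every episode terminates
            # Explore with probability epsilon, or when both values are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, action)
            # Q-learning update: move the estimate toward reward + discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for every non-goal cell."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]


q_table = train()
print(greedy_policy(q_table))
```

Nothing tells the agent that "right" is correct. It only sees the reward at the goal, and the update rule propagates that signal back to earlier cells until stepping right is preferred everywhere.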
At present, general artificial intelligence is defined by two characteristics: one is end-to-end learning, and the other is task adaptation, meaning the system can handle different tasks without humans stepping in to tune it. Deep reinforcement learning combines the perceptual ability of deep learning with the decision-making ability of reinforcement learning and exercises control directly from the input information, which makes it an artificial intelligence technology closer to the way humans think. Reinforcement learning learns from rewards through trial and error in the normal course of interacting with the world, which closely resembles a natural learning process. For example, a one-handed Rubik's cube robot may first need deep learning's image recognition to see the cube, and then a reinforcement learning model that lets the robotic hand learn autonomously through continuous trial and error. Reinforcement learning can make do with weaker training signals, such as rewards rather than labeled answers. The advantage is that such signals are more plentiful and are not limited by the skill of a supervisor. Deep reinforcement learning is another step toward building autonomous systems with a higher-level understanding of the world. This is why it is currently recognized as the existing technology most likely to achieve general artificial intelligence.
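As a rough sketch of what "control directly from the input information" can look like, the snippet below passes a raw observation through a small neural network that outputs one estimated value per action, then picks the highest-valued action. The class name, layer sizes, and observation are hypothetical, and the weights are random and untrained; in a real deep reinforcement learning system they would be adjusted from rewards.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(layer, inputs, activate):
    """Apply one layer; tanh is used as the hidden-layer nonlinearity."""
    weights, biases = layer
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(math.tanh(z) if activate else z)
    return outputs


class TinyQNetwork:
    """Observation in, one estimated value per action out (untrained)."""

    def __init__(self, n_obs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.hidden = make_layer(n_obs, n_hidden, rng)
        self.output = make_layer(n_hidden, n_actions, rng)

    def q_values(self, observation):
        features = forward(self.hidden, observation, activate=True)  # perception step
        return forward(self.output, features, activate=False)        # action scores

    def act(self, observation):
        """Decision step: choose the action with the highest estimated value."""
        q = self.q_values(observation)
        return max(range(len(q)), key=q.__getitem__)


net = TinyQNetwork(n_obs=4, n_hidden=8, n_actions=3)
obs = [0.1, -0.2, 0.3, 0.0]  # made-up sensor reading
print(net.q_values(obs), net.act(obs))
```

The point of the sketch is the shape of the pipeline: there is no hand-written rule between sensing and acting, only learned parameters.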
Future general artificial intelligence will depend on advances in brain science
"Although deep reinforcement learning is said to be the technology most likely to achieve general artificial intelligence, that does not mean it will certainly do so. We are still far from true general artificial intelligence," Hao Jianye said. When deep learning and reinforcement learning are combined, enumerating every real-world situation is replaced by first recognizing the situation and then enumerating over a limited set of patterns, which reduces the computational burden, but the data required is far greater than for other machine learning algorithms. If the setting is extended to multi-agent deep reinforcement learning, the required data and computing power grow exponentially. At present, no platform can provide the massive data, covering all kinds of complex situations, that reinforcement learning requires. In many real-world fields, this data requirement simply cannot be met.
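One commonly cited contributor to this growth is the joint action space: if each of n agents chooses among k actions, there are k^n combinations to consider. The article does not spell out where the exponential cost comes from, so treating the joint action space as the cause is an assumption here, and the numbers below are arbitrary.

```python
from itertools import product


def joint_action_count(actions_per_agent, n_agents):
    """Number of joint actions when every agent picks one action independently."""
    return actions_per_agent ** n_agents


def enumerate_joint_actions(actions, n_agents):
    """Explicitly list every joint action; only feasible for tiny cases."""
    return list(product(actions, repeat=n_agents))


# Hypothetical setting: 10 actions per agent.
for n in (1, 2, 4, 8):
    print(n, "agents ->", joint_action_count(10, n), "joint actions")
```

Each added agent multiplies the number of combinations by the number of actions per agent, so doubling the number of agents squares the count rather than doubling it.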
For example, reinforcement learning requires a great deal of trial and error. If a one-handed Rubik's cube robot were put to work in a real cooking scenario, it might scatter the ingredients everywhere, pour a whole bag of salt into the pot, or even start a fire. Trial-and-error learning therefore cannot be carried out in such real-world scenarios.
In addition, deep reinforcement learning is among the hardest techniques in machine learning to tune successfully. Its success stories are actually few, but each one causes a sensation when it is released. Moreover, it is a framework in which even the random seed can greatly affect learning: for the same model, 10 training runs might fail 7 times and succeed 3 times. Another point is that deep reinforcement learning overfits extremely easily to the environment the agent is currently interacting with, so when the environment changes even slightly, an agent that previously performed well is likely to make elementary mistakes.
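Seed sensitivity of this kind is usually checked by repeating the same training run under several random seeds and counting how many runs succeed. The sketch below does this for a deliberately simple stand-in, a purely greedy agent on a made-up two-armed bandit, rather than for a deep reinforcement learning model; the win probabilities, function names, and success criterion are all assumptions.

```python
import random

ARM_WIN_PROB = (0.4, 0.6)  # hypothetical bandit; arm 1 is the better choice


def train_greedy(seed, pulls=200):
    """Train a purely greedy agent and return the arm it ends up preferring."""
    rng = random.Random(seed)
    counts = [0, 0]
    values = [0.0, 0.0]
    for _ in range(pulls):
        if values[0] == values[1]:
            arm = rng.randrange(2)  # break ties at random
        else:
            arm = 0 if values[0] > values[1] else 1
        reward = 1.0 if rng.random() < ARM_WIN_PROB[arm] else 0.0
        counts[arm] += 1
        # Incremental average of the rewards observed for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return 0 if values[0] > values[1] else 1


def success_rate(seeds):
    """Fraction of seeds for which training settles on the better arm."""
    outcomes = [train_greedy(seed) == 1 for seed in seeds]
    return sum(outcomes) / len(outcomes), outcomes


rate, outcomes = success_rate(range(10))
print(f"{sum(outcomes)} of {len(outcomes)} seeds found the better arm")
```

Because the agent never explores, an early lucky reward on the worse arm can lock it in for the rest of the run, so identical code can succeed or fail depending only on the seed. Reporting a success rate over many seeds, rather than a single run, makes that variability visible.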
"When humans come to understand things, they usually use data to make causal inferences and judgments and so arrive at corresponding solutions. Current artificial intelligence systems, however, cannot perform this kind of causal inference," Hao Jianye said. The future development of general artificial intelligence may also need to rely on progress in brain science. At present, our understanding of the human brain is still at a very early stage. How the brain perceives things, how it solves problems, and how it thinks are all still unclear. Artificial intelligence therefore still has a long way to go before it reaches a general artificial intelligence that can truly simulate human intelligent thinking.