Google’s artificial intelligence group, DeepMind, has unveiled a new version of AlphaGo – a computer program that plays the board game Go. Google revealed that, while playing against itself, AlphaGo mastered the game in only 70 days. It discovered strategies that masters of the game have been refining for over a dozen centuries, as well as strategies that people had not known before.
In John Badham’s “War Games” from 1983, one of the leading roles is played by Joshua – a computer commanding American ballistic missile launches. It attempts to win an imaginary nuclear war against the Soviet Union. It plays a delusional game in which the Russian attack is just a simulation, but the American missiles and the will to strike are real. It almost succeeds in firing the rockets; however, at the end of the movie, as usual in Hollywood productions, David Lightman, a brilliant teenager, makes Joshua play tic-tac-toe against itself.
Joshua plays out all possible games, discovers that tic-tac-toe cannot be won, and concludes that there is no winning scenario for a global nuclear conflict either. This, of course, saves the world from a potential threat.
Until recently, the film still belonged to the science fiction category, because computers, even if they were able to play (and win) various games, could not learn a winning strategy themselves. Until now, strategy was based on human experience and programmed in advance. Until now…
The latest issue of the magazine “Nature” brings big news. A team of researchers led by David Silver from London-based DeepMind (owned by Google) has created AlphaGo Zero – an artificial intelligence program that can learn from scratch, without resorting to knowledge acquired and developed by humans.
Electronic copy of the brain
The same researchers created the previous version of AlphaGo, which in March 2016 defeated the Go grandmaster Lee Sedol. After the duel, the Korea Baduk Association, in recognition of this achievement, awarded the program an honorary ninth dan in Go.
The new AlphaGo Zero played against its predecessor – the champion of March 2016 – and beat it decisively, 100:0.
This is a huge qualitative leap. The first version, AlphaGo Fan, required 176 TPUs (Tensor Processing Units, Google-developed chips for efficient neural network computation); the second, AlphaGo Lee (named after Lee Sedol), needed 48 TPUs. The subsequent AlphaGo Master and AlphaGo Zero ran on only 4 TPUs while playing better and faster.
No doubt, the AlphaGo Zero algorithm is more than ten times more efficient than the AlphaGo Lee algorithm.
However, performance is not the most important thing here. The most significant thing is that AlphaGo developed a way to learn the game and create strategies on its own.
The latest program, like the whole AlphaGo series, uses so-called neural networks – logical-mathematical structures that mimic the behavior of neurons in the brains of humans and animals.
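To give a flavor of what such a structure looks like, here is a minimal textbook-style sketch – not DeepMind’s code, and with all weights and names invented for illustration – of three artificial “neurons” wired by hand into a tiny network that computes the XOR function:

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum passed through a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid squashes output to (0, 1)

def tiny_network(inputs):
    """Two hidden neurons feeding one output neuron."""
    h1 = neuron(inputs, [20.0, 20.0], -10.0)    # behaves roughly like OR
    h2 = neuron(inputs, [-20.0, -20.0], 30.0)   # behaves roughly like NAND
    return neuron([h1, h2], [20.0, 20.0], -30.0)  # AND of the two -> XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(tiny_network([a, b])))  # prints the XOR truth table
```

In real systems like AlphaGo, the weights are not set by hand as above – they are adjusted automatically during training, across millions of neurons rather than three.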
The neural network is one of the most promising and most widely used techniques in artificial intelligence, which is why it became a key focus for Google. The company has not only released software libraries free of charge to the world but has also developed specialized processors that are exceptionally efficient at neural network computations (these are the TPUs) and offers their computing power on the open market.
So, what is the characteristic feature of neural networks? I would say the fact that no one really knows exactly how they work.
You cannot just sit down and write a program that will run optimally right away. First, the network has to be designed for a specific application; then follows a tedious training process, which resembles the learning process of a living brain.
Real Go masters left far behind
Previous versions of AlphaGo were taught how to play by people. At first, the computers played against living players; later they analyzed the records of games played by Go masters, and then they played against themselves to improve their skills.
AlphaGo Zero began, as its name implies, from zero – it knew only the rules of the game and played against itself. Actually, it played against its exact copy. After each match, the neural network data of the winning program was copied into the memory of the losing program, and the next game began.
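The loop described above can be sketched schematically. The toy below is purely illustrative: the “game” is a stand-in in which the stronger copy simply wins, and a single `skill` number plays the role of the network’s data; AlphaGo Zero’s actual training (self-play guided by Monte Carlo tree search, updating a deep network) is far more elaborate:

```python
import random

def self_play(generations=1000, seed=0):
    """Toy self-play: two identical copies diverge slightly, play, and the
    winner's parameters are copied into the loser after every game."""
    rng = random.Random(seed)
    champion = 1.0  # both copies start from the same "skill"
    for _ in range(generations):
        # Each copy explores slightly different play (random perturbation)...
        a = champion + rng.uniform(-0.1, 0.1)
        b = champion + rng.uniform(-0.1, 0.1)
        # ...they play each other; in this toy the stronger copy wins, and
        # its parameters overwrite the loser's for the next round.
        champion = max(a, b)
    return champion

print(self_play())  # skill climbs steadily across the generations
```

The point of the sketch is the mechanism: no human knowledge ever enters the loop, yet repeatedly keeping the winner pushes the program’s strength upward generation after generation.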
The whole process took 70 days, but after a dozen or so days the computer already played very well (at a human level). After 40 days it even started “inventing” the strategies described by the masters of the game.
It is important to point out that people have been studying and developing strategies for this game for centuries, yet the computer discovered them after a few weeks of calculations! Moreover, after further “training”, AlphaGo Zero developed new strategies that the masters of this game had never seen.
Although DeepMind’s scientists remain silent on whether the program has already faced a living grandmaster, we have to remember that last year it mercilessly defeated a program that itself holds an honorary dan rank and beat the legendary Lee Sedol.
Google experiments with artificial intelligence
The development of artificial intelligence to play Go is not art for art’s sake. This game is a particularly useful training ground for researchers looking for the best machine-learning methods for solving complex problems.
The game itself is so complex that it cannot be solved by purely calculating all possible arrangements of stones on the board and choosing the combinations that lead to a win (such a method works for tic-tac-toe or checkers, for example).
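For a game as small as tic-tac-toe, that “pure calculation” is easy to demonstrate – a short minimax search visits every possible game and confirms what Joshua learns in “War Games”: with perfect play, nobody wins. (A sketch for illustration, not AlphaGo’s method.)

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board, as index triples.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def minimax(board, player):
    """Exhaustive search: +1 if X wins with perfect play, -1 if O wins, 0 draw."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "." not in board:
        return 0  # board full, no winner: draw
    scores = []
    for i, cell in enumerate(board):
        if cell == ".":
            nxt = board[:i] + player + board[i + 1:]
            scores.append(minimax(nxt, "O" if player == "X" else "X"))
    return max(scores) if player == "X" else min(scores)

print(minimax("." * 9, "X"))  # 0: perfect play from both sides ends in a draw
```

Go, with roughly 10^170 legal positions, is astronomically beyond any such exhaustive enumeration – which is exactly why a different approach was needed.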
Of course, the computer may consider some moves in advance, but it must show some “intuition” about which of the considered moves can bring the greatest benefit.
This was one of the reasons why it was believed that computers would never play Go well. However, neural networks made it possible to “program” that “intuition.” Since the computer turned out to be so good at teaching itself, its creators look forward to the future – perhaps similar programs will solve problems that people cannot figure out.
The first attempts at practical application are already underway. Google’s computers, for example, are currently screening medical records from London hospitals, attempting to find early symptoms of cancer in X-ray images. Researchers hope that the computer will learn to recognize recurring patterns of cancer development and will detect the disease before it progresses too far unrecognized.
The IoT, intelligent machines and deep learning are new concepts for humanity, and we are only at the beginning. It is quite a scary vision to have to cope with machines such as phones that decide whom we may talk to, a fridge that chooses what we can have for dinner, or elevators that refuse to take us because it is healthier to use the stairs… David Silver, head of the DeepMind project, says the goal was not to design a system that would scare people, but one that would help us solve problems that humanity cannot cope with. And let’s keep it that way, because it would not be pleasant if artificial intelligence began to dictate how we should live.