Computer Chess: Past to Present

Alejandro López-Ortiz

alopez-o@neumann.uwaterloo.ca

Introduction

In this time of ``intelligent'' cameras, electronic games and coffee-makers it seems strange that at some point many computer scientists thought that computers would never be able to perform nontrivial tasks.

Even today, many people are still unaware of what computers can and cannot do. It is sometimes amusing to see a scientist's face light up surprise when shown some computer tools that have been available for over a decade (like computer graphics, symbolic mathematics, etc), and their disillusionment when told that computers still can't efficiently perform some apparently simple tasks (simple for a computer) like finding the optimal route to be followed by a courier delivering packages all over the city.

The history of computer chess is plagued by similar under-and-overstatements put forward by laypeople and experts alike. Though chess is an ideal problem for computers to attack. Its aims are clearly defined (to play good chess) and its advancement can be easily measured (player ranking).

Computer chess may give a down to earth perspective on what is and is not currently possible with a computer and how much effort it may take to achieve an specific goal in computer development.

History of Computer Chess

In 1950, the first computer chess program was written by Alan Turing, a British researcher who pioneered the field of digital computers. At the time, Turing had to settle with a simulation of the execution of his program with pencil and paper. Turing's program was a terrible player, but it served well its main purpose: it showed that computers can play chess. In the same year, Claude Shannon plotted a plan of action for computers to eventually be programmed to play good chess. Given the exceedingly low speed of computers in the 1950 it was not clear whether computers would ever be able to beat humans even in games as simple as checkers.

In 1958, a chess program beat a human player for the first time. The human player, a secretary of the team of programmers who had never played chess before, was taught how to play just an hour before the game, and was beaten by the chess program. As unremarkable as this feat may seem today, it served to show that knowledge could be embedded into a chess program (about an hour training worth of knowledge, to be precise).

After this victory, some of the first programmers of chess playing computers predicted that before the sixties were over a computer chess program would be the chess world champion. At the end of the sixties, Spassky was the world chess champion and computer programs played at the strength level of an average high school chess player. This encouraged some to claim that computers will never be able to perform any intelligent task, not to mention obtaining the world chess championship!

However, the prediction was made anew in the seventies, on this occasion involving a bet between David Levy, a British chess master (chess players are classified by the International Federation of Chess according to their performance. Titles are then granted appropriately, among them we have International Master (IM), Grand Master (GM), and World Champion. Mr. Levy is an IM.), and John McCarthy, a distinguished researcher in Artificial Intelligence (the field dedicated to the study and development of computers that could perform tasks deemed to involve intelligent reasoning). The bet, of more modest intentions, claimed that by 1978, a computer would be able to beat Mr Levy in a chess match.

In 1978, the best chess program at the time (CHESS 4.7) was beaten by D. Levy in a best-of-five match played in Toronto with a score of 3 wins for the humans, a draw, and a win for CHESS 4.7. Not only Mr. Levy had beaten the computer, he also pocketed, according to the terms of the bet, a sum exceeding a thousand pounds.

If the purpose of the bet was to make researchers think twice before predicting a victory, then this bet barely served it: In spite of the failed 1958-1968 and 1968-1978 predictions, computer chess experts forecasted again that a computer would obtain the world chess championship during the next ten years!

Their logic was that after all, there is a long path from average high school level to exacting one game to an International Chess Master ---the game lost by Mr. Levy. Furthermore, by this time almost nobody doubted that computers may be able to perform some intelligent tasks, though perhaps by different means and ways than those used by humans.

But yet again, 1988 came around and still the chess champion was human. The next year, the Deep Thought (DT) chess computer easily beat IM Levy. DT was (and still is) the strongest chess computer ever built. DT fares among the top one hundred players in the world. Predictions come, predictions go. DT's developers claimed that the world title was now a mere three years away from the Deep Thought computer.

An exhibition match played in 1989 between the world champion Garry Kasparov and DT resulted in the latter being crushed by Kasparov. From the current status of computer chess it is safe to say that late 1995 will be the earliest time for which another match between the human world chess champion and Deep Thought may be arranged and possibly won by DT, if not even later.

As a colleague of the author says ---half ironically, half seriously--- ``At times it seems that all what we have achieved in 40 years of computer chess research is to drop the prediction time from `sometime in the next decade' to `in the next three years'.'' Are these three year predictions founded?

Computer Chess: What, How and Why.

What is computer chess?

A computer can play chess, simply by generating moves at random, checking if they are valid according to the rules of chess, and playing any one of them. Of course, such a computer would be a fairly bad player.

This approach has been tried, just out of scientific curiosity, and the chessprogramms resulting of this scheme were usually beaten in less than five moves. (In chess jargon, a move is a pair of ``plays'' the move of the player with the white chessmen together with the black reply. A ply or half a move is either the white or black play/reply).

Once it has been established that a computer can play chess the question is if it is possible for a computer to play good chess. In principle, it is possible to design a computer program that considers all possible consequences of its moves until the very end of the game is reached and then choose the optimal strategy. The optimal strategy will be one that, regardless of which moves are selected by the opponent, it is always possible to find a winning sequence of moves.

As an example we have that in tic-tac-toe the optimal strategy always attains a tie or a victory. Regardless of the action taken by the opponent a good tic-tac-toe player is able to choose a drawing or winning response. In other words, when such hypothetical computer is considering its move, it would try out every possible legal move, and for each of these would consider all the possible replies that the opponent may play, and then all possible replies that the computer may play at this time, and so on. After this analysis is completed the computer chooses the sequence of moves that forces a win regardless of what the opponent plays. Such computer would play ``perfect chess''.

This approach is not practically feasible, since the number of possibilities that need to be explored is astronomically large ---approximately 10EE44 (the number of particles in the universe is estimated to be 10EE23).

The quest for the optimal move among the astronomical number of possibilities is approximated by means of restricting the number of possibilities or moves to be considered ---the size of the search space.

How do computers play chess?

In 1950 Claude Shannon proposed two basic strategies for restricting the number of possibilities to be considered. Both of these strategies, with minor modifications, are still used to date by competent and incompetent chessprograms alike. The so-called ``Shannon type A strategy'' considers all possible moves up to certain number of plies at which point some criteria is applied to choose the best possible of the moves considered based upon the positions attained at the end of the search. The idea is that, since it is not possible to try all the possible responses and counter-responses until the game is finished, the computer restricts the number of plies or depth of the search to a number that allows it to produce a reply in a reasonable amount of time.

It is rather interesting to note that human chess players too, look ahead only a limited number of plies until some satisfactory position is achieved.

Of course the computer needs some method to evaluate the positions reached after the sequence of plies is considered and to select the best sequence among them. For the evaluation of a position, an evaluation function assigns to each position a number according to characteristics which are known to give an advantage to the player ---like material or positional advantage, development, etc. If after a sequence of moves the computer has more chessmen in the board and a strong attack, the position is assigned a large positive integer, whereas, if after said sequence the situation is exactly the opposite, the negation of the number previously mentioned is assigned.

The Theory of Games, originally developed by Morgensten and Von Neumann, gives us a way of selecting the best sequence among all the moves tried and evaluated. Von Neumann and Morgensten realized that a game can be seen as a tree of positions and possible moves, namely a tree rooted in the current position with each branch representing each of the possible legal moves, and leading to new nodes in the tree which are the resulting positions after each move; where again new branches are added and so on (See figure 1).

           Ke4, Na1, kh8			
         /   /  |     \   \				
            White to move   
     /     /    |       \     \				
   /      /     |        \      \				
 Ke5    Ke3    Kd4 ..... Nc2    Nb3			
 /|\    /|\    /|\       /|\    /|\				
/ | \  / | \  / | \     / | \  / | \			
  B l a c k       t o      m o v e

At any given node, the player to move chooses that move ---a branch of the tree--- which maximizes the gain of that player (or equivalently, minimize its opponent's gain). A system derived from this method is called a ``minimax strategy''.

Whether or not a move maximizes a player's gain can be determined by applying the evaluation function to the resulting position. In this situation, the player who moved in the last level of the tree chooses those branches that, according to the evaluation function, are the best reply for each of the positions at the penultimum level. This score is now associated to each of the positions in the penultimate level of the tree. At this stage, the other player chooses his/her move according to the newly obtained scores for all positions in this level of the tree.

Eventually, a score will be assigned to the the positions in the level just below the starting position of the tree. The computer then simply selects that move leading to the position with the best evaluation score.

Even the fastest computers to date cannot go beyond a depth of ten ply on the average game tree. This is due to the exponential increase in the number of moves to be considered with each subsequent level considered. A search using a type A strategy to a depth of ten ply requires the study of half a quadrillion moves on the average. (At the beginning or near the end of the game or in other specialized positions, computers do reach a 20 ply depth or more.)

In principle, keeping the tree in memory would be an ideal to strive for, and as the game proceeds the search would deepen, with unused branches being disposed of. In practice, the tree is so large that it is not possible to store it in memory, and the computer is forced to recalculate the whole tree at each move.

If at the starting position the computer is looking ahead ten plies, the last move to be considered is the fifth move. After the computer and its opponent make a move, the computer will now lookahead up to move six, and so on.

The second strategy, due too to Shannon, is the so-called type B strategy, which limits the number of moves that are considered at each level. A move generator determines a set of reasonably good replies for each given position. A typical move generator chooses approximately eight possible moves for each position. Furthermore, the number of levels of the tree is not predetermined. In general, a minimum depth of plies are examined and furthermore, if the position is found to be unstable --say a position in the middle of a sequence of captures, or a check on the king-- the position is studied further until a desired level of quiescence is achieved. The reason for this can be easily understood; assume that player A's queen captures player B's rook at the last level of Shannon type B strategy. When the resulting position is examined by the computer, it is selected for it's material advantage, despite that, in the next move, the queen may be captured by, say, a pawn, resulting in a net loss for player A.

Interestingly, a deep search is not effective if only a few moves are selected at every position. For this reason, programs running in high speed computers tend to be of class A, whereas programs running on microprocessor architectures use strategy B.

Some common techniques to improve both strategies have been developed. These techniques have allowed computers with the same basic speed to play significantly better through time.

Among them we have:

Alpha-Beta cuts. This idea relies on the basic premise that once a strong reply has been found by your opponent in a given situation, the computer may stop considering ways of improving this situation and focus on moves that avoid such a position altogether.
Opening Books. Centuries of human experience in chess have resulted in a set of initial moves (about the first 15 to 20) which have been shown to be optimal strategies for both players. This set of moves are stored in databases and made available to the chessprogram.
Endgame Database. Similarly to openings, a wide collection of ending positions has been accumulated and strategies for each are known. In contrast to opening books, computers have added great deal of knowledge to endgame databases.
Special Purpose Hardware. Once the main characteristics of computer chessprograms were identified, it became possible to build specific purpose hardware for the parts of the game playing which are particularly time consuming. Typically, evaluation functions and move generators are encoded into chips.
Killer Moves. If a move has been previously found to be a very good/bad move, most likely the current evaluation will still find the present move to be a very good/bad move. In this case, the killer moves are tried first by the Alpha-Beta heuristic, with an expected reduction in the number of positions to be considered.
Transposition Tables. Often, a sequence of moves can be made in any possible order and the final position is still the same. To avoid the superfluous computations resulting from reaching the same position in different nodes of the tree, a list of moves is kept in memory. When an equivalent node is reached, the computer realizes through the use of the tables that the node has been previously studied in the tree of possible moves.
Variable Depth. Additional criteria for modifying the depth of search in either strategy A or B can be used. Generalizations of the concept of quiescence, as stability in the score of the evaluation function or aggressiveness of the position, may extend the search further.

Why should computers play chess?

As it is noted in the introduction, chess is a model problem to be solved by a computer. The aim is well defined, and performance can be easily quantified; it is clear that an exhaustive search or brute force approach is not feasible, and chess seems to have enough structure to be solved using ``intelligent'' techniques. It is natural to expect that advances in computer chess will be applied to different areas of Artificial Intelligence. Chess is the case study chosen for many computer problems.

For these reasons chess has been branded the ``drosophila'' of AI. Recall that at the beginning of the century, geneticists were experimenting with mutations and alterations of the fruit fly drosophila; laypeople questioned why geneticists did not experiment, say, instead with live stock. The answer at the time was that genetics required to learn and experiment with a simply well behaved organism, which the drosophila is. A century later geneticists are experimenting with livestock and more useful varieties of life.

Pattern matching, knowledge representation, heuristics, positional estimates, problem solving, and time constrains. Almost any real life problem, when attempted to be solved by a computer, will borrow at least one of these techniques.

Above all, the most important question sought to be answered by computer chess developments is: can computers think?

For long periods of time it was thought that high level chess playing must involve higher thinking. In other words, developing a good chessprogram was equated to emulating high level human processes as reasoning, logical derivation, and ordered recollection. It is now known that computers play chess in a manner that greatly differs from those of humans.

Researchers with a pragmatic approach claim that regardless of the strategy used by the computer, if the computer plays good chess then it is thinking. Another school of thought claims that thinking involves a particular set of procedures, which, if not employed, amount for nothing more than a brute search.

It is by no means clear which position is closer to the truth about computer reasoning, hopefully advances in computer chess will help find the correct interpretation.

Conclusion

By late 1994, the new version of DT will be ready. This version should be able to beat most human players, save the top ten or fifteen.

Given the previous history of computer chess, by 1998 Deep Thought should be playing at the level of the top three players of the world. If by that time the original developers are not tired of computer chess, then sometime in the next decade DT should beat a Kasparov-strength player, which most likely will not be the chess champion then (hey! with time humans develop and play better too!). If no major developments occur in the human side of chess, a computer should beat the world champion (whoever she may be) by the year 2022, 70 years after ``sometime in the next decade'' was predicted for the first time.

Computer Chess at Waterloo: A Final Note

During these 40 years of computer chess development, Waterloo has made significant contributions to the world of computer chess.

Back in the early days of Waterloo (1965-1975), a revolutionary idea crossed the mind of UW computer administrators: What if we give unlimited computer access to our students?

Well, now we know the answer. Students promptly developed a Fortran interpreter, made significant improvements to the Honeywell command interface/operating system, and among other things, wrote the Ribbit program, who conquered the 1st and 2nd places in the North American Computer Championship and a tie for 2nd in the first World Computer Championship in 1973.

From the team of undergrads who authored Ribbit, Ron Hansen went on to write his master thesis on computer chess, and eventually Prof. Van Emden became an expert in computer chess and endgames. Lionel Moser's MSc thesis in parallel alpha-beta search together with Jonathan Schaeffer efforts are, to date, the last word in parallel computer games.

A special mention goes for Jonathan Schaeffer, whose Sun Phonex chess program tied for first in the 1986 World Chess Championship. Schaeffer, who completed his master's and PhD in Waterloo, and is now at the University of Alberta, spend the last few years working on a checkers program called Chinook.

This checkers program broke a 40 year streak of Marion Tinsley, the current world champion, in which he lost scantly seven games. Chinook won a couple of games with Mr Tinsley, but lost the overall match.

Schaeffer is, to date, a leading figure of computer games and computer chess.

Computer chess in Waterloo now lives through the yearly Othelo tournament organized by the Computer Science Club of the Faculty of Mathematics. There are very strong Othelo programs, as well as Othelo human players at Waterloo.

Comments and/or corrections are appreciated.

alopez-o@neumann.uwaterloo.ca

Alex Lopez-Ortiz
Department of Computer Science
University of Waterloo
Waterloo, Ontario
Canada