Maven (Scrabble)
Maven is the best known artificial intelligence Scrabble player, created by Brian Sheppard. It has been used in official licensed Hasbro Scrabble games and in the downloadable Funkitron Scrabble.

Game Phases

Maven's game play is divided into three phases: the "mid-game" phase, the "pre-endgame" phase, and the "endgame" phase.

The "mid-game" phase lasts from the beginning of the game until there are nine or fewer tiles left in the bag. The program uses a rapid algorithm to find all possible plays from the given rack, and then a part of the program called the "kibitzer" uses simple heuristics to sort them into rough order of quality. The most promising moves are then evaluated by "simming", in which the program simulates the random drawing of tiles, plays forward a set number of plays, and compares the point spreads of the moves' outcomes. By simulating thousands of random drawings, the program can give a very accurate quantitative evaluation of the different plays.
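The "simming" loop described above can be sketched as follows. This is an illustrative placeholder, not Maven's actual interface: `sim_evaluate` and the `simulate_playout` callback are assumed names, and a real engine would draw tiles and generate replies inside the playout.

```python
def sim_evaluate(candidates, simulate_playout, trials=1000):
    """Rank candidate moves by average simulated point spread.

    simulate_playout(move) is assumed to simulate one random tile draw,
    play a fixed number of turns ahead, and return the resulting point
    spread (our score minus the opponent's).
    """
    averages = {}
    for move in candidates:
        total = 0.0
        for _ in range(trials):
            total += simulate_playout(move)
        averages[move] = total / trials
    # Best average spread first.
    return sorted(candidates, key=lambda m: averages[m], reverse=True)
```

With enough trials, the noise from individual random draws averages out, which is why thousands of drawings give an accurate quantitative ranking.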

The "pre-endgame" phase works in almost the same way as the "mid-game" phase, except that it is designed to yield a good endgame situation.

The "endgame" phase takes over as soon as there are no tiles left in the bag. In two-player games, this means that the players can now deduce the exact tiles on each other's racks from the initial letter distribution. Maven uses the B* search algorithm to analyse the game tree during the endgame phase.

Move Generation

Maven has used several algorithms for move generation, but the one that has stuck is the DAWG (directed acyclic word graph) algorithm. The GADDAG algorithm is faster, but a DAWG for North American English is only 0.5 MB, compared to about 2.5 MB for a GADDAG. That makes a significant difference for downloadable games, whereas the speed advantage is not important. (Note that unimportant does not mean that the difference is small, merely that users cannot tell the difference. The GADDAG is perhaps twice as fast, but both algorithms are fast enough.)
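As a rough illustration of why a DAWG is so compact, here is a minimal sketch that builds a trie and then merges structurally identical subtrees. The two-step build and all names here are assumptions for illustration, not Maven's implementation.

```python
def build_trie(words):
    """Plain trie of nested dicts; "$" marks end of word."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = {}
    return root

def minimize(node, registry):
    """Merge structurally identical subtrees, turning the trie into a DAWG.

    Minimized nodes are immutable tuples of (letter, child) pairs; the
    registry maps each distinct subtree shape to one shared instance.
    """
    items = tuple(sorted((ch, minimize(child, registry))
                         for ch, child in node.items()))
    return registry.setdefault(items, items)

def contains(dawg, word):
    """Check membership by walking edges, ending on the "$" marker."""
    node = dawg
    for ch in word + "$":
        for edge, child in node:
            if edge == ch:
                node = child
                break
        else:
            return False
    return True
```

Suffix sharing is what shrinks the structure: words like CARES and BARES end up sharing a single ARES chain, so the merged graph has far fewer nodes than the trie.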

Rack Evaluation

The first (1986) version of Maven used a set of about 100 patterns to value racks. Every single tile had a value (27 patterns). Each duplicate had a value (22 patterns). There were patterns for triplicates and quads for letters that have enough representation in the bag. Finally, the QU combination was a pattern.

Soon after the first version, Maven acquired rack evaluation terms for vowel/consonant balance and Q/U distribution. Vowel/consonant balance was a table lookup based on the count of vowels and consonants left in the rack. Q/U distribution varied the values of Q and U using a table lookup indexed by how many of each remained in the bag.

Shortly thereafter, Maven acquired a tile duplication evaluator. The idea was to vary a rack's value depending on the chance of drawing duplicates. For example, A is generally better than I as a tile, but if there are 7 A's and only 2 I's left in the bag, then it may be better to keep the I.
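The pattern terms described above might be sketched like this. Every numeric value here is invented for illustration and does not reflect Maven's tuned weights.

```python
from collections import Counter

# Hypothetical per-tile values, duplicate penalties, and QU bonus, in
# the spirit of the 1986 pattern set (all numbers are made up).
TILE_VALUE = {"A": 1.0, "E": 2.0, "I": 0.5, "Q": -6.0, "U": -1.0, "S": 5.5}
DUPLICATE_PENALTY = {"A": -1.5, "E": -0.5, "I": -3.0}
QU_BONUS = 7.0

def rack_leave_value(leave):
    """Value a leave as a sum of pattern terms: singles, duplicates, QU."""
    counts = Counter(leave)
    # One term per tile held.
    value = sum(TILE_VALUE.get(t, 0.0) * n for t, n in counts.items())
    # One penalty term per extra copy of a duplicated tile.
    for t, n in counts.items():
        if n >= 2:
            value += DUPLICATE_PENALTY.get(t, -2.0) * (n - 1)
    # The QU combination is its own pattern.
    if counts["Q"] and counts["U"]:
        value += QU_BONUS
    return value
```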

Parameter fitting was accomplished by tuning the values to predict the total of future scores. Nowadays this would be called temporal difference learning.
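Tuning values to predict the total of future scores amounts to fitting a linear model. A minimal stochastic-gradient sketch follows; the feature encoding and training data are invented, and this is not Maven's actual fitting code.

```python
def fit(samples, n_features, lr=0.01, epochs=100):
    """Fit linear weights so that w . x predicts the observed target.

    samples: list of (feature_vector, observed_future_score) pairs,
    where the feature vector might count which rack patterns fire.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, target in samples:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = target - pred
            # Move each weight a small step toward reducing the error.
            for i in range(n_features):
                w[i] += lr * err * x[i]
    return w
```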

This rack evaluation design was original to Maven. It was very successful in competing with the human champions of the day.

The design was later extended by other researchers. Mark Watkins championed what he called "tile synergy patterns". These are combinations like ADES which form the basis of many high-scoring words. This is a natural extension of the design, and it significantly improves play. Maven's pattern set gradually expanded from the base set of 100 to well over 400.

Maven has since switched to a completely different architecture, proposed by John O'Laughlin and implemented in Quackle. This is the "exhaustive" architecture, where the program has a different rack evaluation parameter for each of the 3 million possible combinations of 0 to 7 tiles. With the advances in computer power over the last decade, it has become possible to tune such large parameter sets.

The downside of the exhaustive approach is that Maven lost the ability to vary evaluations as a function of the tiles remaining in the bag, because the exhaustive rack evaluator has no terms that relate a rack's value to the possible draws from the bag.

Maven's version of exhaustive rack evaluation restores that ability. In Maven, each rack has its own linear evaluator, in which the value of that rack varies as a function of the chance of drawing a duplicate, the chance of drawing a vowel, and the chances of drawing Q and U. This system has 5 parameters per rack, for about 15 million parameters in all.
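The per-rack linear evaluator can be written down directly. The parameter values and probability inputs below are illustrative only; only the five-parameters-per-rack shape comes from the text.

```python
def leave_value(params, p_duplicate, p_vowel, p_q, p_u):
    """Value one leave from its five stored parameters.

    params: (base, w_dup, w_vowel, w_q, w_u) tuned for this specific
    rack; the probabilities describe the current bag contents.
    """
    base, w_dup, w_vowel, w_q, w_u = params
    return (base
            + w_dup * p_duplicate    # chance of drawing a duplicate
            + w_vowel * p_vowel      # chance of drawing a vowel
            + w_q * p_q              # chance of drawing the Q
            + w_u * p_u)             # chance of drawing a U
```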

Simulation

The great human champion Ron Tiekert had studied Scrabble by playing out individual positions dozens of times, and tabulating results. He suggested that with Maven's speed, it should be possible to automate that process in overnight runs. Brian Sheppard named this process "simulation", though it goes by the name "rollout" in backgammon, and "playout" in Go.

The process was to select N candidate moves using the score+rack heuristic, then play out those moves hundreds or thousands of times to see which candidate performs best. The depth of the playout can be varied to suit the purpose: play two or four moves ahead to get a good idea of point differential, or play to the end of the game to measure winning chances.

By the mid-1990s, computers had become fast enough that Maven used simulation to choose moves in competitive games under tournament time controls. Algorithmic improvements were important to scaling simulation for this purpose. The most important innovation was to vary the number of trials given to candidates so that more successful candidates receive more effort. It was also helpful to control the racks so that all candidate moves were sampled against the same, unbiased distribution.
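The idea of giving more effort to more successful candidates can be sketched as a successive-halving loop: survivors of each round get twice the trials of the previous round. The specific halving-and-doubling policy here is an assumption for illustration, not Maven's actual allocation rule.

```python
def successive_halving(candidates, playout, base_trials=50):
    """Return the candidate with the best average playout result,
    spending progressively more trials on the surviving candidates."""
    scores = {m: 0.0 for m in candidates}
    counts = {m: 0 for m in candidates}
    pool = list(candidates)
    trials = base_trials
    while len(pool) > 1:
        for move in pool:
            for _ in range(trials):
                scores[move] += playout(move)
                counts[move] += 1
        # Keep the better-scoring half; double the effort for them.
        pool.sort(key=lambda m: scores[m] / counts[m], reverse=True)
        pool = pool[: max(1, len(pool) // 2)]
        trials *= 2
    return pool[0]
```

The weaker candidates are discarded after a cheap first look, so most of the simulation budget goes to distinguishing the genuine contenders.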

Analysis of games played by Maven's simulation engine suggests that Maven has surpassed the skill level of human champions.

Endgame

Strong play in Scrabble endgames is much harder than it looks. In theory, the endgame is a game of perfect information, so the alpha-beta pruning algorithm should work. But in practice alpha-beta works badly on Scrabble.

The problem with alpha-beta is that some Scrabble endgames require 14 moves to play out, and it is not possible to search that deeply. This is not merely a theoretical possibility. When one player is "stuck" with a tile, it is impossible for him to play out all of his tiles. In that situation the optimal strategy for both sides is usually to play one tile on each turn.

Maven uses a different approach. The B* search algorithm is a selective-depth, progressive-widening algorithm that is guaranteed to find optimal solutions to two-player games when one can compute upper and lower bounds on the value of each position.

It turns out that it is possible to estimate upper and lower bounds on endgame positions. These bounds are correct (that is, the true value lies within the interval) for the overwhelming majority of positions. Since B* is reasonably robust in the presence of a small percentage of error in the bounds, Maven can solve endgames that other approaches cannot.

A further refinement makes Maven's endgame solutions asymptotically optimal even in the presence of errors. When the B* search terminates with a proof that one move is best, and there is still time remaining, then Maven widens its estimates by 1 point and searches again. These re-searches are usually very quick, because the tree from the previous search is still largely valid. Repeated use of this policy will progressively identify errors, starting with the smallest (and presumably most likely) errors.
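The widening policy can be sketched as a simple driver loop. Here `b_star_search` and `time_left` are placeholders standing in for the real search and clock, not Maven's actual functions.

```python
def solve_endgame(position, b_star_search, time_left):
    """Re-search with progressively wider bounds while time remains.

    b_star_search(position, widen) is assumed to run B* with all bound
    estimates widened by `widen` points, returning the proven best move
    or None if no proof was found at that widening.
    """
    widen = 0
    best = None
    while time_left():
        result = b_star_search(position, widen)
        if result is None:
            break
        best = result
        widen += 1   # re-check with looser bounds, catching small errors first
    return best
```

Because each re-search reuses most of the previous tree, the extra passes are cheap relative to the first solution.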

Exhaustive Pre-Endgame

When only 1 or 2 tiles remain in the bag ("PEG-1" or "PEG-2"), it is possible to perform exhaustive searches of the state space.

The case of a PEG-1 is important, because almost one half of all games pass through that state. Maven can play out such states exhaustively in almost all cases. That is, for all legal moves Maven can play out the resulting endgames (up to 8 for each legal move), and calculate which side will win the game in each case. Because there are some situations (e.g., two blanks, stuck-with-Q) that require extra effort, the calculation is performed progressively. That is, Maven expands its analysis first where the decision is close and relevant.
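The PEG-1 calculation described above can be sketched as follows. The `legal_moves` and `solve_endgame` callbacks are assumed placeholders for the real move generator and endgame solver.

```python
def peg1_best_move(position, legal_moves, unseen_tiles, solve_endgame):
    """With one tile in the bag, each legal move leads to at most one
    endgame per possible drawn tile; solve them all and pick the move
    with the best win rate.

    solve_endgame(position, move, tile) is assumed to return True if we
    win the endgame that results when `move` is played and `tile` is
    the one left in the bag.
    """
    best_move, best_rate = None, -1.0
    for move in legal_moves(position):
        wins = sum(1 for tile in unseen_tiles
                   if solve_endgame(position, move, tile))
        rate = wins / len(unseen_tiles)
        if rate > best_rate:
            best_move, best_rate = move, rate
    return best_move
```

A real implementation would order this work progressively, as the text describes, spending extra effort only where the decision is close.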

In a PEG-2 it is normally not possible to exhaustively examine all move sequences, so Maven goes as far as it can in the time available.

One feature of these low-tile situations is that it is very hard to safely prune the list of legal moves. For example, the optimal play is ranked behind more than 50 other moves by the score+rack heuristic more than 1% of the time.

This policy does not produce theoretically perfect play, because it is impossible to know the true initial distribution of unseen tiles. Assuming a uniform distribution does well, and it is possible to calculate inferences about unseen tiles that marginally improve on that assumption.

Another limitation is that Maven does not address the "hidden information" aspect of such situations. In theory there are situations where players maximize expectation by randomly choosing moves according to a probability distribution, whereas Maven chooses a pure strategy at each node.

Competitive results

8-2 Matchups Tournament, December 1986. Tied for first place. Opponents were very strong, including several past or future champions.

5-0 Cape Cod Fun Weekend. Opponents ranged from strong to just below championship caliber.

7-3 Matchups Team Tournament. Maven played second board on a computer team. Playing on second board lowered the caliber of opposition a bit.

0-2 match in 1996 against Adam Logan, National Champion. Match held at AAAI conference. Maven's first games that used a simulation strategy. The ratio of computer power was not quite right, though, and the implementation was buggy.

9-5 match in 1997 against Adam Logan, National Champion. Rematch at AAAI conference. This was the first match featuring a reasonably well implemented simulation engine against a human champion.

6-3 match in 1998 against Joel Sherman (World Champion) and Matt Graham (World Runner-up), sponsored by the New York Times. Maven was not using a simulation strategy in this match, but it got good tiles.

30-6 Toronto tournament, 2006. The first 14 games were against players who could easily win championships, including several National or World Championship winners. Maven went 9-5 in those games. The remaining 22 games were against a range of experts, with Maven going 21-1.

Overall: 65-21, including 32-17 against championship caliber opposition.
The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.