Optimal Algorithm for Winning Hangman

In the game Hangman, is it the case that a greedy letter-frequency algorithm is equivalent to a best-chance-of-winning algorithm?

Is there ever a case where it's worth risking a life, rather than preserving your remaining lives, for the sake of a better chance of guessing the correct answer?

Further clarification of the problem:

  • The selected word to be guessed has been taken from a known dictionary.
  • You are given N lives, and thus have to maximise the probability of guessing all the letters in the word without making N mistakes (i.e. you may make an unlimited number of correct guesses).
  • Each word in the dictionary has equal probability, for the sake of this exercise (i.e. the word is chosen uniformly at random).
  • A harder exercise would be to come up with a strategy against a malicious, omniscient word-chooser (I am not asking that here).
  • Motivation: This question is inspired by the interesting discussion at http://www.datagenetics.com/blog/april12012/index.html

    They suggest an algorithm for solving the word game "Hangman" optimally.

    Their strategy can be summarised thus (edited for clarification):

  • We can assume the word is drawn from a particular dictionary
  • We know the number of letters in the word
  • Eliminate all words in the dictionary that do not have the correct number of letters.
  • Choose the not-yet-guessed letter which occurs in the largest number of words in the remaining subset of the dictionary.
  • If this letter occurs, we know its location(s).
  • If this letter does not occur, we know it does not occur in the word.
  • Eliminate all words in the dictionary subset that do not fit exactly this correct pattern, and repeat.
  • If there are 2 (or more) letters appearing equally often, the algorithm may perform a deeper analysis of the positions to determine which one is preferred (if that is reasonable?)
  • At each stage, we are guessing the letter (not previously guessed) which occurs in the largest number of remaining possible words.
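
    A minimal sketch of one such greedy step, assuming a list of candidate words already filtered to the right length and a set of previously guessed letters (the function names are mine, not the article's):

    from collections import Counter

    def greedy_guess(candidates, guessed):
        """Pick the not-yet-guessed letter occurring in the most candidate words."""
        counts = Counter()
        for word in candidates:
            counts.update(set(word) - guessed)  # each word counts once per letter
        return counts.most_common(1)[0][0]

    def filter_candidates(candidates, guess, pattern):
        """Keep words consistent with the revealed pattern ('_' = still hidden)."""
        return [w for w in candidates
                if all((p == '_' and c != guess) or (p != '_' and c == p)
                       for c, p in zip(w, pattern))]

    For the four-word dictionary analysed below, greedy_guess(["abc", "abd", "aef", "egh"], set()) returns 'a', exactly the first guess the counterexample examines.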

    There is some motivation to like this algorithm - each guess is the one least likely to cost a life.

    But it strikes me that this isn't necessarily the best solution: if we're trying to guess the word (within a certain number of lives), the most frequently occurring letter is not necessarily the most useful letter for distinguishing between the remaining candidate words.

    I'm not sure, though, as it does seem opportune to avoid losing a life wherever possible. Will it ever be the case that optimal strategy permits us to sacrifice a life for a better opportunity to win?

    Question: is it the case that this greedy algorithm is equivalent to a best-chance-of-winning algorithm, or not? And can you prove it?

    An example dictionary+game would be ideal to show a disproof.


    Assume the following dictionary: ABC ABD AEF EGH. (I'll capitalize unguessed letters.)
    Assume you have only 1 life (makes the proof so much easier...).

    The algorithm proposed above:

    Starting with A, you lose (1/4) or are left with aBC aBD aEF (3/4).
    Now guess B and lose (1/3) or are left with abC abD (2/3).
    Now guess C or D and you lose (1/2) or win (1/2).
    Probability to win: 3/4 * 2/3 * 1/2 = 1/4.

    Now try something else:

    Starting with E, you lose (1/2) or are left with AeF eGH (1/2).
    Now you know what to guess and win.
    Probability to win: 1/2.

    Clearly the proposed algorithm is not optimal.
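
    These numbers are easy to confirm by brute force. Below is a small sketch of my own (not from the answer) that computes the best achievable win probability by exhaustive search:

    WORDS = ("ABC", "ABD", "AEF", "EGH")

    def best_win_prob(candidates, lives, guessed=frozenset()):
        """Max probability of revealing the whole word before `lives` wrong guesses."""
        if lives <= 0:
            return 0.0
        best = 0.0
        for g in sorted(set("".join(candidates)) - guessed):
            groups = {}  # partition candidates by the pattern this guess would reveal
            for w in candidates:
                key = tuple(i for i, c in enumerate(w) if c == g)
                groups.setdefault(key, []).append(w)
            total = 0.0
            for key, grp in groups.items():
                p = len(grp) / len(candidates)
                if not key:          # letter absent: costs a life
                    total += p * best_win_prob(tuple(grp), lives - 1, guessed | {g})
                elif len(grp) == 1:  # a unique candidate remains: spell it out for free
                    total += p
                else:
                    total += p * best_win_prob(tuple(grp), lives, guessed | {g})
            best = max(best, total)
        return best

    print(best_win_prob(WORDS, lives=1))  # 0.5, achieved by guessing E first

    With one life the search confirms 1/2 as the optimum, matching the E-first line above; a strategy forced to open with A can do no better than 1/4.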


    There are some critical assumptions you have to make as to what a game of "Hangman" is.

  • Do you just have one word you need to guess, or do you need to guess a phrase?
  • Are some words more likely than others?
  • One thing to remember is that if you pick a correct letter, you do not lose a guess.

    I will provide a solution for the one-word-&-equally-likely case. The two-word case can be generalized by creating a new dictionary equal to the Cartesian product of your current dictionary with itself, and so on. The case where some words are more likely than others can be generalized with a bit of probability theory.
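
    For instance, a minimal sketch of that product dictionary, under my own assumption of two independently drawn words separated by a space:

    from itertools import product

    words = ["cat", "dog", "rat"]
    # Treat each two-word phrase as a single "word" drawn from the product dictionary.
    phrase_dictionary = [a + " " + b for a, b in product(words, repeat=2)]
    print(len(phrase_dictionary))  # 9 equally likely phrases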

    Before we define our algorithm, we define the concept of a reduction. If we were to guess letters L1,L2,L3,...,LN all at ONCE, then we would reduce the dictionary to a smaller dictionary: some words would be eliminated, and additionally some letters may also be eliminated. For example, if we had the dictionary {dog, cat, rat} and we guessed a, we would eliminate {d,o,g} if the guess was true, or eliminate {c,r,t} (along with a itself) if it was false.
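
    A sketch of that reduction for the {dog, cat, rat} example (the function name is mine):

    def reduce_dictionary(words, guess):
        """Split a dictionary by whether the guessed letter occurs at all."""
        hits = [w for w in words if guess in w]
        misses = [w for w in words if guess not in w]
        return hits, misses

    hits, misses = reduce_dictionary(["dog", "cat", "rat"], "a")
    print(hits)    # ['cat', 'rat'] -> d, o, g are eliminated if the guess is true
    print(misses)  # ['dog']        -> c, r, t (and a) are eliminated if it is false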

    The optimal algorithm is as follows:

  • Consider the game tree
  • Search it for a strategy that, along every branch, completes the word before running out of lives
  • If no such strategy exists, the game cannot be won with certainty; if one exists, that strategy is your answer
  • Of course that is how you solve any game, and for the most part it is intractable due to the exponential size requirements. You cannot get optimal unless you perfectly replicate this, and I seriously doubt that a strategy which doesn't "look" two or more moves ahead can hope to replicate this. You can however attempt to approximate the optimal strategy as follows.

    Repeat the following at each step:

  • Consider each letter: choose the letter which will maximize the expected dictionary reduction per expected penalty: that is, pick the letter L which will maximize ( #words with L / #words without L + #words without L / #words with L ) / ( #words without L / #words total ). Note that this may be infinite if all the words have a certain letter, in which case go ahead and guess it, since there is no penalty.
  • Make your guess, get an updated board state
  • Eliminate all words invalidated by the new board
  • Of course if your dictionary has more than 2^[max number of guesses] entries, the "Hangman" game is mostly impossible in the equal-probability world (unless the dictionary is highly constrained), so you have to work in the unequal-probability world. In this world, rather than maximizing the amount of elimination you do, you maximize the "expected surprisal" (also called entropy). You associate each word with a prior probability (e.g. let's say there is a 0.00001 chance of the word being 'putrescent' and a 0.002 chance of the word being 'hangman'). The surprisal of a word is the negative log of its probability, measured in bits. An answer to a guess will either yield no letters, a single letter, or more than one letter (many possibilities). Thus:

  • For each possible guess, consider the effect that guess would have
  • For each possible outcome of the guess, consider the probability of that outcome. For example, if you guessed 'A' for a 3-letter word, you'd have to consider each possible outcome in the set {A__, _A_, __A, AA_, A_A, _AA, AAA}. For each outcome, calculate the probability using Bayes's rule, and also the new possible dictionaries (e.g. in one case, you'd have a dictionary of _A_:{cAt, bAt, rAt, ...} and A__:{Art, Ark, Arm, ...}, etc.). Each of these new dictionaries also has a likelihood ratio, of the form size(postOutcomeDictionary)/size(preOutcomeDictionary); the negative log of this ratio is the amount of information (in bits) which the outcome conveys to you.
  • Thus you want to maximize the ratio of the expected amount of information (in bits) you gain to the expected cost (the cost penalty is 1 if you fail and 0 if you don't). For each guess, and for each outcome of the guess, the bits of information gained are bitsGainedFromOutcome[i] = -log(size(postOutcomeDictionary)/size(preOutcomeDictionary)). We take the weighted sum of these: sum{i}( prob(outcome[i])*bitsGainedFromOutcome[i] ), then divide by the probability we are wrong: prob(outcome=='___').
  • We pick the letter with the maximum sum{i}( prob(outcome[i])*bitsGainedFromOutcome[i] )/prob(outcome=='___'); in case this is infinity, there is nothing to lose, and we automatically pick it.
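
    In the equal-probability case, each outcome's probability is just its share of the current dictionary, so the scoring rule above can be sketched as follows (the names are mine):

    import math
    from collections import defaultdict

    def information_score(words, guess):
        """Expected bits gained per expected life lost for one candidate letter.

        Returns infinity when every word contains the letter (no possible penalty).
        """
        outcomes = defaultdict(list)
        for w in words:
            # An outcome is the set of positions at which the guess would be revealed.
            outcomes[tuple(i for i, c in enumerate(w) if c == guess)].append(w)
        n = len(words)
        expected_bits = sum(
            len(grp) / n * -math.log2(len(grp) / n) for grp in outcomes.values())
        p_miss = len(outcomes.get((), [])) / n  # probability of losing a life
        return expected_bits / p_miss if p_miss else float("inf")

    def pick_letter(words, guessed):
        candidates = set("".join(words)) - guessed
        return max(candidates, key=lambda g: information_score(words, g))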

  • So to answer your question:

    > In the game Hangman, is it the case that a greedy letter-frequency algorithm is equivalent to a best-chance-of-winning algorithm?

    Clearly not: if the dictionary was

    cATs
    bATs
    rATs
    vATs
    sATe
    mole
    vole
    role
    

    your algorithm would guess a or t, which have a 5/8 chance of reducing your dictionary to 5/8 size for free, and a 3/8 chance of reducing your dictionary to 3/8 size for a cost of 1. You want to choose letters which reveal the most information. In this case, you should guess S, because it has a 4/8 chance of reducing your dictionary to 4/8 size for free, a 1/8 chance of reducing your dictionary to 1/8 size for free, and a 3/8 chance of reducing your dictionary to 3/8 size for a cost of 1. This is strictly better.
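
    Plugging this dictionary into the information_score sketch above reproduces that ranking:

    WORDS = ["cats", "bats", "rats", "vats", "sate", "mole", "vole", "role"]
    for g in "ats":
        print(g, round(information_score(WORDS, g), 3))
    # a 2.545   t 2.545   s 3.748 -> 's' gains the most expected bits per expected life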

    edit: I wanted to use an English dictionary example (above) to demonstrate how this is not artificial, and assumed that people could extrapolate from the example without being hung up on the non-strict equality. However, here is an unambiguously clear counterexample: You have 2000 words. 1000 words contain the letter A in the first position. The other 1000 words contain a unique combination of Bs embedded elsewhere. For example, ?B???, ??B??, ???B?, ????B, ?BB??, ?B?B?, etc. The ?s represent randomly-chosen characters. There are no As among the ?s, except for one word (one of whose ?s is an 'A'), so that the frequency of As is strictly greater than the frequency of Bs. The proposed algorithm would guess A, which would result in {50%: choices_halved, 50%: choices_halved & lose_one_life}, whereas this algorithm would dictate the choice B, which results in {50%: YOU_WIN, 50%: choices_halved & lose_one_life}. Percentages have been rounded very slightly. (And no, a word with double letters does not contribute twice to the 'frequency', but even if it did under an insane definition, you could trivially modify this example by making the words begin with AAA...A.)

    (regarding comments: It is unreasonable to complain about strict equality in this example, e.g. "999/2000", since you can make the probabilities arbitrarily close to each other.)
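
    A concrete sketch of the 2000-word construction (the word length, filler alphabet, and seed are my choices; ten ?-slots give 2^10 - 1 >= 1000 distinct B-patterns):

    import itertools
    import random

    random.seed(0)
    FILLER = "cdefghijklmnopqrstuvwxyz"  # the random ?s never contain 'a' or 'b'

    def filler():
        return random.choice(FILLER)

    # 1000 words with 'a' fixed in the first position.
    a_words = ["a" + "".join(filler() for _ in range(10)) for _ in range(1000)]

    # 1000 words of length 11, each carrying a unique non-empty pattern of 'b's.
    all_patterns = itertools.chain.from_iterable(
        itertools.combinations(range(1, 11), k) for k in range(1, 11))
    b_words = []
    for pattern in itertools.islice(all_patterns, 1000):
        chars = [filler() for _ in range(11)]
        for i in pattern:
            chars[i] = "b"
        b_words.append("".join(chars))
    b_words[0] = "a" + b_words[0][1:]  # the single extra 'a', so freq(a) > freq(b)

    dictionary = a_words + b_words
    print(sum("a" in w for w in dictionary), sum("b" in w for w in dictionary))
    # 1001 1000 -> the frequency-greedy algorithm guesses 'a'

    # ...but each revealed b-pattern pins down one word, so guessing 'b' wins outright
    # whenever the hidden word is a b-word (probability ~1/2):
    print(len({tuple(i for i, c in enumerate(w) if c == "b") for w in b_words}))  # 1000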

    (Which points out an amusing side-note: if the dictionary is large enough to make hangman impossible sometimes, a strategy ought to write off words that it does not expect to be able to guess. For example, if it only has 2 moves left, it ought to make the highest-probability assumption it can, eliminating subtrees that carry more than 2 moves' worth of surprisal.)


    I have written a script that solves hangman optimally [github].

    My basic strategy is this:

  • For a pattern such as ..e.. with tried letters: e,s,t
  • Check against words of only n letters (in this case 5)
  • Create a list of possible words
  • Create a regex from the supplied info
  • In this case it would be [^est][^est]e[^est][^est]
  • Parse your word list for words that match this regex
  • Cycle through each word, counting up the number of times every letter appears
  • Your optimal next guess is the most likely letter
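
    A minimal sketch of those steps (assuming a lowercase word list; the names are mine, not necessarily the script's):

    import re
    from collections import Counter

    def next_guess(wordlist, pattern, tried):
        """pattern like '..e..' where '.' is unknown; tried like {'e', 's', 't'}."""
        unknown = "[^" + "".join(sorted(tried)) + "]"  # e.g. [^est]
        regex = re.compile(
            "".join(unknown if c == "." else c for c in pattern) + "$")
        possible = [w for w in wordlist
                    if len(w) == len(pattern) and regex.match(w)]
        counts = Counter()
        for w in possible:
            counts.update(c for c in w if c not in tried)  # raw letter frequency
        return counts.most_common(1)[0][0]

    print(next_guess(["chewy", "opera", "query"], "..e..", {"e", "s", "t"}))
    # 'r' or 'y': each occurs in two of the three matching words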