How to find all input words in a given dictionary?

This is a follow-up to this question..

If I have a string text and a set of other strings I can use the Aho-Corasick algorithm to find the strings of the set in text .

Now I have a dictionary (set of strings) instead of text . I can organize the dictionary as a trie or hash table (or even BST). Can I apply the Aho-Corasick algorithm to find all strings of the set in the dictionary ?


You can apply modified algorithm.

Suppose each node in tree have 2 types of edges

1)Edge "may be", if you are at prefix, and get some letter, so new prefix can still be prefix from some word in the dictionary.

Example: Dictionary aaa and aaabc, if you are at aaa and receive a letter b, you move to aaab.

2)Edge "nope", if you are at prefix, and get some letter, so new prefix isn't in dictionary, you say that word isnt in dictionary and proceed to next word.

Example: Dictionary aaa and aaabc, if you are at aaa and receive a letter c, you can say that word isnt in dictionary and proceed to next word.

To build tree you'll need O(total dictionary length) time and O(length) to check each word, so this will lead O(input) algorithm.


The point of a dictionary is that it facilitates searching by the datastructure employed.

For example, with a hashtable, you would check for each member of set in the hashtable by using a hash lookup. There is no need to use a substring-search.

链接地址: http://www.djcxy.com/p/18138.html

上一篇: 从单词集中获取最可能的颜色

下一篇: 如何查找给定字典中的所有输入单词?