String matching with an implicit representation of a suffix tree
From Data Structures and Algorithm Analysis in Java, Weiss:
Weiss writes:
My question: given the input string (eg 'banana') and the implicit representation of the suffix tree, what would a good algorithm for substring search look like? The algorithms I've seen assume a different representation of the tree. I'd like to do substring search without converting to a different tree representation.
I've never seen this representation before. It's more common to represent the labels on the edges as pairs of integers delineating a range of characters from the original string, which would let you more easily determine what the characters on the edges are (you can just look back at the original string at those characters on an as-needed basis to see whether they match the substring you're looking at).
I'm fairly certain that this compressed representation is not good at matching substrings. In order to follow an edge, you need to know what characters are on that edge, but you can't tell what those characters are unless you scan over the characters of the original string to find something that could conceivably match. You could consider descending down into the subtree to find a suffix there and use that to reconstruct the characters, but this requires extra time and breaks the time bounds you'd expect of a suffix tree.
My best guess here is that the author is mistaken about how to represent a suffix tree in a small amount of space.
链接地址: http://www.djcxy.com/p/40114.html上一篇: 从后缀树生成后缀
下一篇: 字符串与后缀树的隐式表示匹配