suffix tree construction
i am going to implement suffix tree for given string, i think it should delcared like this
struct suffix
{
char letter;
suffix * left,*right;
};
suffix *insert(suffix *node,char *s){
}
//here i am going to construct tree with all occurances of substrings and characters but dont know how use left and right part,is this tree sorted and arranged by strict ordering of characters like binary search tree?or?please help me ,i dont want to use some code on online,i need to implement it myselft,so please give me some hints,some little code
The wikipedia is a start.
However actually doing it is to understand the suffix tree construction algorithm, the Weiner or the Ukkonen algorithm.
The Ukkonen algorithm is simpler: http://europa.zbh.uni-hamburg.de/pubs/pdf/GieKur1997.pdf
Also this link is less academic and more practical: http://www.cise.ufl.edu/~sahni/dsaaj/enrich/c16/suffix.htm
Try to start understanding the algorithm.
After that good luck, this is one of the toughest data structures. The only thing worst is the compressed and optimized versions of the suffix tree.
But all great programmers like big challenges.
Have a look at the Wikipedia description:
Note, first of all, that a suffix tree is not a binary tree so your implementation outline is fundamentally flawed.
Next, it's not enough to store a single character per node / branch; a suffix tree branch represents a substring which can be longer than a single character. It's also usual to store just the start and end indices of the substring within the string, and not the string itself; otherwise the suffix tree would consume a lot of unnecessary memory.
Lastly, don't use pointers here. They buy you nothing and only cause trouble. Use something like a boost::container::vector<suffix>
instead (I'd suggest a std::vector<suffix>
, but unfortunately standard library containers cannot deal with incomplete types).
上一篇: 后缀树和最长的重复子串问题
下一篇: 后缀树的建设