Algorithm's name

I have a set S of "small" trees S[i] which I need to find their positions for inside a bigger which are used as patterns to find matching subtrees in a bigger tree T . I know S before I start constructing T (which is a parse tree), so I am thinking about employing a cutting-plane method to match the nodes as I go (as the parser generates the CST).

The trees in S are not the same ASTs as T - think about XPath vs. XML - S holds the tree representation of the XPaths, while T is the actual AST of the source code - I need maps between i and a vector of matching nodes of T .

However I'm not sure about the names of the algorithms I would use.

Basically I know what I want to do, it feels like a "divide et impera for trees" with a stack where I hold possible candidates to match, at each shift of the LALR parser I duplicate the top of the stack and eliminate candidates i from S[i] which won't match anyway, and after a reduction I pop from the stack. At the beginning all members from S are possible candidates.

Please note: this is just about the AST, the ASG is another story...

Addendum

Here be a parse tree T .

T  - 分析树

The parsing function will be aware of a list of what I call "treepaths", in a canonical form, also represented as trees, stored in S . But they won't look like the parsetree, they have their own language to be represented in, similar to XPath.

Example of a treepath to get all functions which have as return value a variable:

function[body[return[expr[@type="variable"]]]]]
  • So what should I look for in the existing literature?
  • Any other advices?
  • Are there already languages which can query meta-annotated trees like this? An open-source C (not C++) library would be ideal.

  • 1) Your S trees as XPath correspond to some T trees. Why not construct the T trees in advance, and then pattern match them?

    2) If you want to match a pattern against a structure, you can imagine compiling the pattern to some kind of state machine, that transitions when given pieces of the tree being matched against. If the state machine ever enters an accept state, you've found a match. If you have more than one pattern, each one can treated as a state machine and you can run them "in parallel" (by simulation). To make this efficient, compute the cross product of all the state machines; now there's only one, and only one transition occurs per input. This idea I call "pattern products" and you see something like in a variety of efficient matchers. One close to what you want to do is the Rete Algorithm, which keeps track of which "patterns" are live as the data fed to it changes.


    It might be worth looking into JXPath: http://commons.apache.org/jxpath/ I'm not sure what language you're targeting, but it might be worth the shot.

    Anyway, my first impulse if I had to try and implement something like this would be to find a way to "serialize" both trees, and reduce the problem to one of simple string matching.

    链接地址: http://www.djcxy.com/p/49996.html

    上一篇: 如何读取/解析C中的输入? FAQ

    下一篇: 算法的名称