Convert BNF grammar to pyparsing

How can I describe a grammar using regex (or would pyparsing be better?) for the script language given below in Backus–Naur Form:

<root>   :=     <tree> | <leaves>
<tree>   :=     <group> [* <group>] 
<group>  :=     "{" <leaves> "}" | <leaf>;
<leaves> :=     {<leaf>;} <leaf>
<leaf>   :=     <name> = <expression>{;}

<name>          := <string_without_spaces_and_tabs>
<expression>    := <string_without_spaces_and_tabs>

Example of the script:

{
 stage = 3;
 some.param1 = [10, 20];
} *
{
 stage = 4;
 param3 = [100,150,200,250,300]
} *
 endparam = [0, 1]

I use Python's re.compile and want to split everything into groups, something like this:

[ [ 'stage',       '3'],
  [ 'some.param1', '[10, 20]'] ],

[ ['stage',  '4'],
  ['param3', '[100,150,200,250,300]'] ],

[ ['endparam', '[0, 1]'] ]

Updated: I've found that pyparsing is a much better solution than regex.


Pyparsing lets you simplify constructs like

leaves :: {leaf} leaf

to just

OneOrMore(leaf)
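The simplification is more than cosmetic. As a minimal sketch (the stand-in `leaf` below is just a word of letters, an assumption made to keep the example short), a literal translation of `{leaf} leaf` fails in pyparsing, because repetition is greedy and does not backtrack: `ZeroOrMore(leaf)` consumes every leaf and leaves nothing for the trailing `leaf`.

```python
from pyparsing import OneOrMore, ParseException, Word, ZeroOrMore, alphas

# Stand-in leaf for illustration only: a single word of letters.
leaf = Word(alphas)

# Literal translation of "{leaf} leaf".
literal = ZeroOrMore(leaf) + leaf
# The simplified form.
simplified = OneOrMore(leaf)

try:
    literal.parseString("a b c")
    literal_ok = True
except ParseException:
    # ZeroOrMore already consumed "a b c", so the final leaf finds no input.
    literal_ok = False

tokens = simplified.parseString("a b c").asList()
print(literal_ok, tokens)
```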

So one form of your BNF in pyparsing will look something like:

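from pyparsing import *

# punctuation is matched but suppressed from the results
LBRACE,RBRACE,EQ,SEMI = map(Suppress, "{}=;")
name = Word(printables, excludeChars="{}=;")
# quotedString goes first: '|' takes the first alternative that matches,
# and Word(printables, ...) would otherwise consume the opening quote
expr = quotedString | Word(printables, excludeChars="{}=;")

# Group keeps each leaf, and each braced group, as its own sublist
leaf = Group(name + EQ + expr + SEMI)
group = Group(LBRACE + ZeroOrMore(leaf) + RBRACE) | leaf
tree = OneOrMore(group)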

I added quotedString as an alternative expr, in case you want an expression that includes one of the excluded chars. Wrapping leaf and group in Group preserves the bracing structure in the results.
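Here is a small sketch of the quotedString alternative in action. The input line `title = "a;b = {c}";` is made up for illustration. Note that `|` tries its alternatives left to right and keeps the first one that matches, so this sketch lists quotedString before the Word; a Word built from printables would otherwise start matching at the opening quote.

```python
from pyparsing import Group, Suppress, Word, printables, quotedString

EQ, SEMI = map(Suppress, "=;")
name = Word(printables, excludeChars="{}=;")
# quotedString is tried first so a quoted value is taken as a whole.
expr = quotedString | Word(printables, excludeChars="{}=;")
leaf = Group(name + EQ + expr + SEMI)

# Hypothetical input: the quoted value contains ';', '=', braces and spaces.
result = leaf.parseString('title = "a;b = {c}";')
print(result.asList())
```

quotedString keeps the surrounding quote characters in the returned token; pyparsing's removeQuotes parse action can strip them if that is not wanted.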

Unfortunately, your sample doesn't quite conform to this BNF:

  • spaces in [10, 20] and [0, 1] make them invalid exprs

  • some leaves do not have a terminating ;

  • lone * characters between the groups: their meaning is unclear, and the grammar above has no rule for them

This sample does parse successfully with the above parser:

sample = """
{
 stage = 3;
 some.param1 = [10,20];
}
{
 stage = 4;
 param3 = [100,150,200,250,300];
}
 endparam = [0,1];
"""

parsed = tree.parseString(sample)
parsed.pprint()

Giving:

[[['stage', '3'], ['some.param1', '[10,20]']],
 [['stage', '4'], ['param3', '[100,150,200,250,300]']],
 ['endparam', '[0,1]']]
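If changing the sample is not an option, the grammar can be loosened instead. The following is only a sketch under several assumptions that are not in the original BNF: an expression is either a bracketed list that may contain spaces (matched with a Regex) or a single word, the trailing ; is optional, and * is treated purely as a separator that may appear between any two groups. A bare leaf is also wrapped in its own Group so that it comes out as a one-leaf group, as in the output the question asks for.

```python
from pyparsing import Group, Optional, Regex, Suppress, Word, ZeroOrMore, printables

LBRACE, RBRACE, EQ, SEMI, STAR = map(Suppress, "{}=;*")
name = Word(printables, excludeChars="{}=;*")
# Assumption: a bracketed list may contain spaces; anything else is one word.
expr = Regex(r"\[[^\]]*\]") | Word(printables, excludeChars="{}=;*")

# Assumption: the terminating ';' is optional.
leaf = Group(name + EQ + expr + Optional(SEMI))
# A bare leaf becomes a group holding a single leaf.
group = Group(LBRACE + ZeroOrMore(leaf) + RBRACE) | Group(leaf)
# Assumption: '*' only separates groups and carries no other meaning.
tree = group + ZeroOrMore(STAR + group)

sample = """
{
 stage = 3;
 some.param1 = [10, 20];
} *
{
 stage = 4;
 param3 = [100,150,200,250,300]
} *
 endparam = [0, 1]
"""

# parseAll=True makes the parse fail if any input is left unconsumed.
parsed = tree.parseString(sample, parseAll=True)
parsed.pprint()
```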
    