BNF grammar for Python-style structures

I am trying to write a simple grammar for parsing Python-like structures. This is what I could come up with for a list/set:

list : '[' atom ( ',' atom)* ']'
set : '(' atom ( ',' atom)* ']'
atom : 'a'..'z' | 'A'..'Z' | '[' list ']' | '(' set ')'

Note that this is in ANTLR. I would like to know whether it is correct, and whether there are any resources that would help me out.

I did look at Python's grammar (http://docs.python.org/reference/grammar.html) but couldn't quite figure out how it handles a list of lists, a set of lists, a list of sets, and so on.

Any help would be appreciated.

Best answer

"couldn't quite figure out it was handling list of lists or set of lists or list of sets etc."

It doesn't distinguish lists from sets or anything else; every bracketed form is just another alternative of atom:

atom: ('(' [yield_expr|testlist_comp] ')' |
       '[' [listmaker] ']' |
       '{' [dictorsetmaker] '}' |
       '`' testlist1 '`' |
       NAME | NUMBER | STRING+)
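One way to see this from Python itself is to parse a nested literal and look at the resulting tree. The snippet below is only a sketch: it assumes a current CPython 3 interpreter, whose grammar differs in detail from the rule quoted above (backticks are gone, for instance), and the sample literal is made up for illustration. The nesting behaviour it demonstrates is the same.

```python
import ast

# Parse a single expression that mixes list and set displays.
# The literal is an arbitrary example, not taken from the question.
tree = ast.parse("[{1, 2}, [3, {4}], 'x']", mode="eval")
outer = tree.body

# The outer display is a list; its elements are whatever expressions
# appeared between the commas, including further displays.
kinds = [type(element).__name__ for element in outer.elts]
print(type(outer).__name__, kinds)
```

The outer node is an `ast.List`, and its first two elements are an `ast.Set` and another `ast.List`, with no special "list of sets" rule involved anywhere.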

Recursion of the sort you're describing works because listmaker, dictorsetmaker, etc. can ultimately contain atom again. For example, following listmaker down to atom:

listmaker: test ( list_for | (',' test)* [','] )
test: or_test ['if' or_test 'else' test] | lambdef
or_test: and_test ('or' and_test)*
and_test: not_test ('and' not_test)*
not_test: 'not' not_test | comparison
comparison: expr (comp_op expr)*
expr: xor_expr ('|' xor_expr)*
xor_expr: and_expr ('^' and_expr)*
and_expr: shift_expr ('&' shift_expr)*
shift_expr: arith_expr (('<<'|'>>') arith_expr)*
arith_expr: term (('+'|'-') term)*
term: factor (('*'|'/'|'%'|'//') factor)*
factor: ('+'|'-'|'~') factor | power
power: atom trailer* ['**' factor]

There are a lot of intermediate rules; that's because they need to establish precedence for a bunch of mathematical operators: each rule is defined in terms of the next, tighter-binding one. Then there is list_for, which allows for the extra syntax of a list comprehension.
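To make the precedence point concrete, here is a minimal sketch of how three of those layers (arith_expr, term, factor) map onto a hand-written recursive-descent evaluator. It is not how CPython is implemented; the function names are made up for illustration, only integers and `+ - * /` are supported, and parentheses are folded into `parse_factor` instead of going through power and atom.

```python
import re

def tokenize(text):
    # Integers and single-character operators; anything else
    # (including whitespace) is silently skipped in this sketch.
    return re.findall(r"\d+|[-+*/()]", text)

def parse_arith_expr(tokens, pos):
    # arith_expr: term (('+'|'-') term)*
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        value = value + right if op == "+" else value - right
    return value, pos

def parse_term(tokens, pos):
    # term: factor (('*'|'/') factor)*
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_factor(tokens, pos + 1)
        value = value * right if op == "*" else value / right
    return value, pos

def parse_factor(tokens, pos):
    # factor: ('+'|'-') factor | NUMBER | '(' arith_expr ')'
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok in ("+", "-"):
        value, pos = parse_factor(tokens, pos + 1)
        return (value if tok == "+" else -value), pos
    if tok == "(":
        value, pos = parse_arith_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError("unexpected token %r" % tok)

def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_arith_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return value

print(evaluate("1 + 2 * 3"))    # 7
print(evaluate("(1 + 2) * 3"))  # 9
```

Because `parse_arith_expr` can only see whole terms, `2 * 3` is grouped before the addition happens. Adding another operator level means adding another rule in the chain, which is why the real grammar has so many of them.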

A much simpler example might look like:

atom: ('[' [list_or_set] ']' | '{' [list_or_set] '}' | NAME | NUMBER | STRING+)
list_or_set: atom (',' atom)* [',']

Or if you want the distinction between lists and sets to be made at this level:

atom: list | set | NAME | NUMBER | STRING+
list: '[' atom (',' atom)* [','] ']'
set: '{' atom (',' atom)* [','] '}'
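As a rough check that this last grammar handles arbitrary nesting, it can be transcribed into a small recursive-descent parser. The sketch below is in Python, not ANTLR, and makes several assumptions: the helper names and the tagged-tuple output format are invented for illustration, STRING is left out for brevity, and (as in the grammar) an empty `[]` or `{}` is rejected because at least one atom is required.

```python
import re

TOKEN_RE = re.compile(
    r"\s*(?:(?P<NUMBER>\d+)|(?P<NAME>[A-Za-z_]\w*)|(?P<PUNCT>[\[\]{},]))"
)

# Opening bracket -> (closing bracket, tag used in the result tree).
CLOSERS = {"[": ("]", "list"), "{": ("}", "set")}

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character near offset %d" % pos)
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens

def parse_atom(tokens, pos):
    # atom: list | set | NAME | NUMBER
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    kind, value = tokens[pos]
    if kind == "NUMBER":
        return int(value), pos + 1
    if kind == "NAME":
        return ("name", value), pos + 1
    if value in CLOSERS:
        closer, tag = CLOSERS[value]
        return parse_items(tokens, pos + 1, closer, tag)
    raise SyntaxError("unexpected token %r" % value)

def parse_items(tokens, pos, closer, tag):
    # Shared body of list and set: atom (',' atom)* [','] closer
    item, pos = parse_atom(tokens, pos)
    items = [item]
    while pos < len(tokens) and tokens[pos][1] == ",":
        pos += 1
        if pos < len(tokens) and tokens[pos][1] == closer:
            break  # optional trailing comma
        item, pos = parse_atom(tokens, pos)
        items.append(item)
    if pos >= len(tokens) or tokens[pos][1] != closer:
        raise SyntaxError("expected %r" % closer)
    return (tag, items), pos + 1

def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_atom(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree

print(parse("[a, {1, 2}, [b, {c,}]]"))
```

The recursion lives entirely in `parse_items` calling `parse_atom`, which mirrors list and set containing atom. Each bracket pair is consumed exactly once, inside the list or set rule; the atom alternative only names the rule and does not wrap it in brackets a second time.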
