""

I have a grammar that looks like this:

<type> ::= <base_type> <optional_array_size>
<optional_array_size> ::= "[" <INTEGER_LITERAL> "]" | ""

<base_type> ::= <integer_type> | <real_type> | <function_type>

<function_type> ::= "(" <params> ")" "->" <type>

<params> ::= <type> <params_tail> | ""
<params_tail> ::= "," <type> <params_tail> | ""

This lets me define types like Integer[42], Real, or (Integer, Real) -> Integer. That is all well and good, but I want my functions to be first-class citizens. With the grammar above I can't have an array of functions, because the array size just turns the return type into an array: (Integer, Real) -> Integer [42] is not an array of 42 functions but a function that returns an array of 42 integers.

I am considering adding optional parentheses around function types, so that I could write ((Integer, Real) -> Integer)[42]. But that creates another problem (note: I'm using a top-down recursive descent parser, so my grammar has to be LL(1)):

<function_type> ::= "(" <function_type_tail>

<function_type_tail> ::= <params> ")" "->" <type>
                       | "(" <params> ")" "->" <type> ")"

The problem is that first(params) contains "(", because function types can be passed as function parameters. ((Integer) -> Real, Real) -> Integer was valid before I modified the grammar, but it no longer is. How can I modify my grammar to get what I want?

parsing grammar language-design higher-order-functions

1 Answer

That's definitely a challenge.

It's much easier to make an LR grammar for that language, although it's still a bit of a challenge. To start with, it's necessary to remove the ambiguity which results from:

<type> ::= <base_type> <optional_array_size>
<base_type> ::= <function_type>
<function_type> ::= "(" <params> ")" "->" <type>

The ambiguity, as I'm sure you know, results from not knowing whether the [42] in ()->Integer[42] is part of the top-level <type> or the enclosed <function_type>. To remove the ambiguity, we need to be explicit about what construct can take an array size. (Here, I've added the desired production which allows <type> to be parenthesized):

<type>        ::= <atomic_type> <optional_array_size>
              |   <function_type>
<optional_array_size> ::= ""
              |   "[" <INTEGER_LITERAL> "]"
<atomic_type> ::= <integer_type>
              |   <real_type>
              |   "(" <type> ")"
<function_type> ::= "(" <opt_params> ")" "->" <type>
<opt_params>  ::= ""
              |   <params>
<params>      ::= <type>
              |   <params> "," <type>

Unfortunately, that grammar is LR(2), not LR(1). The problem occurs with:

( Integer   ) [ 42 ]
( Integer   ) -> Integer
          ^
          |
          +----------------- Lookahead

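At the lookahead point, the parser still doesn't know if it is looking at a (redundantly) parenthesized type or at the parameter list in a function type. It won't know that until it sees the following symbol (which might be the end of input, in addition to the two options above). In both cases, it needs to reduce Integer to <atomic_type> and then to <type>. But then, in the first case it can just shift the close parenthesis, while in the second case it needs to continue reducing, first to <params> and then to <opt_params>.

That is a shift-reduce conflict which one more token of lookahead would resolve. To avoid it without the extra lookahead, the grammar can stop reducing a lone parenthesized <type> to a parameter list: "(" <type> ")" is handled directly in <function_type>, and <opt_params> only covers the empty list and lists of two or more types: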

<type>        ::= <atomic_type> <optional_array_size>
              |   <function_type>
<atomic_type> ::= <integer_type>
              |   <real_type>
              |   "(" <type> ")"
<function_type> ::= "(" <opt_params> ")" "->" <type>
              |   "(" <type> ")" "->" <type>
<opt_params>  ::= ""
              |   <params2>
<params2>     ::= <type> "," <type>
              |   <params2> "," <type>

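A recursive descent parser needs an LL(1) grammar, though, and that grammar is still left-recursive. It's straightforward to remove left-recursion and then left-factor the grammar: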

# Not LL(1)
<type>          ::= <atomic_type> <opt_size>
                |   <function_type>
<opt_size>      ::= ""
                |   "[" INTEGER_LITERAL "]"
<atomic_type>   ::= <integer_type>
                |   <real_type>
                |   "(" <type> ")"
<function_type> ::= "(" <fop>
<fop>           ::= <opt_params> ")" "->" <type>
                |   <type> ")" "->" <type>
<opt_params>    ::= ""
                |   <params2>
<params2>       ::= <type> "," <type> <params_tail>
<params_tail>   ::= "," <type> <params_tail>
                |   ""

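But that's not sufficient, because <function_type> and <atomic_type> can both start with "(" <type>. And there's a similar problem between the productions for the parameter list. To get rid of these issues, we need yet another technique: expand non-terminals in place in order to get the conflicts into the same non-terminal so that we can left-factor them. As with this example, that often comes at the cost of some duplication.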
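The overlap in the grammar marked "# Not LL(1)" can be checked mechanically. The following is only a sketch: the dictionaries are hand transcriptions of that grammar and of the final left-factored grammar in this answer, the names (NOT_LL1, LEFT_FACTORED, first_first_conflicts, and so on) are made up for illustration, and integer_type, real_type and INTEGER_LITERAL are assumed to be terminals. It only looks for FIRST/FIRST conflicts; a complete LL(1) test would also compare FIRST against FOLLOW for alternatives that can be empty.

```python
from itertools import combinations

EPS = "<empty>"  # marker meaning "this sequence can derive the empty string"

# Hand transcription of the "# Not LL(1)" grammar. Keys are non-terminals;
# every other symbol (including integer_type and real_type, which the
# grammar leaves undefined) is treated as a terminal. That is an assumption.
NOT_LL1 = {
    "type": [["atomic_type", "opt_size"], ["function_type"]],
    "opt_size": [[], ["[", "INTEGER_LITERAL", "]"]],
    "atomic_type": [["integer_type"], ["real_type"], ["(", "type", ")"]],
    "function_type": [["(", "fop"]],
    "fop": [["opt_params", ")", "->", "type"], ["type", ")", "->", "type"]],
    "opt_params": [[], ["params2"]],
    "params2": [["type", ",", "type", "params_tail"]],
    "params_tail": [[",", "type", "params_tail"], []],
}

# Hand transcription of the final left-factored grammar.
LEFT_FACTORED = {
    "type": [["integer_type", "opt_size"], ["real_type", "opt_size"], ["(", "fop"]],
    "fop": [["type", "ftype"], [")", "->", "type"]],
    "ftype": [[")", "fcp"], [",", "type", "params2", ")", "->", "type"]],
    "fcp": [["opt_size"], ["->", "type"]],
    "opt_size": [[], ["[", "INTEGER_LITERAL", "]"]],
    "params2": [[], [",", "type", "params2"]],
}


def first_of_seq(seq, first):
    """FIRST set of a symbol sequence; includes EPS if all of it can be empty."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in first else {sym}  # terminal: itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def first_sets(grammar):
    """Compute FIRST for every non-terminal by iterating to a fixed point."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_seq(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def first_first_conflicts(grammar):
    """Map each non-terminal to the tokens shared by two of its alternatives."""
    first = first_sets(grammar)
    conflicts = {}
    for nt, alternatives in grammar.items():
        for a, b in combinations(alternatives, 2):
            overlap = (first_of_seq(a, first) & first_of_seq(b, first)) - {EPS}
            if overlap:
                conflicts.setdefault(nt, set()).update(overlap)
    return conflicts


for name, grammar in (("not LL(1)", NOT_LL1), ("left-factored", LEFT_FACTORED)):
    found = first_first_conflicts(grammar)
    print(name, {nt: sorted(tokens) for nt, tokens in found.items()})
```

For the first grammar this reports overlaps in type (on the open parenthesis) and in fop, which are the two problems described above; for the left-factored grammar it reports none.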

By expanding <atomic_type>, <function_type> and <opt_params>, we get:

<type>     ::= <integer_type> <opt_size>
           |   <real_type> <opt_size>
           |   "(" <type> ")" <opt_size>
           |   "(" ")" "->" <type>
           |   "(" <type> ")" "->" <type>
           |   "(" <type> "," <type> <params2> ")" "->" <type>
<opt_size> ::= "" 
           |   "[" INTEGER_LITERAL "]"
<params2>  ::= ""
           |   "," <type> <params2>

And then we can left-factor to produce:

<type>     ::= <integer_type> <opt_size>
           |   <real_type> <opt_size>
           |   "(" <fop>
<fop>      ::= <type> <ftype>
           |   ")" "->" <type>
<ftype>    ::= ")" <fcp>
           |   "," <type> <params2> ")" "->" <type>
<fcp>      ::= <opt_size>
           |   "->" <type>
<opt_size> ::= ""
           |   "[" INTEGER_LITERAL "]"
<params2>  ::= ""
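           |   "," <type> <params2>

which is LL(1). I'll leave it as an exercise to reattach all the appropriate actions to these productions.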

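As a rough illustration of how the final left-factored grammar (the one with <fop>, <ftype> and <fcp>) maps onto a recursive descent parser, here is a minimal sketch. It is not the asker's parser: the Parser class, the tokenizer, and the tuple-shaped results ("fn", params, result) and ("array", element, size) are all invented for the example, and it assumes that <integer_type> and <real_type> are the keywords Integer and Real used in the question's examples. Each method corresponds to one non-terminal and decides with a single token of lookahead; <params2> is written as a loop rather than as a recursive method.

```python
import re

# One token: an arrow, a punctuation mark, an integer literal, or a name.
TOKEN_RE = re.compile(r"\s*(->|[()\[\],]|\d+|[A-Za-z_]\w*)")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    tokens.append("$")  # end-of-input marker
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def parse(self):
        result = self.type_()
        self.expect("$")
        return result

    # <type> ::= <integer_type> <opt_size> | <real_type> <opt_size> | "(" <fop>
    def type_(self):
        token = self.peek()
        if token in ("Integer", "Real"):  # assumed spelling of the base types
            self.pos += 1
            return self.opt_size(token)
        self.expect("(")
        return self.fop()

    # <fop> ::= <type> <ftype> | ")" "->" <type>
    def fop(self):
        if self.peek() == ")":
            self.pos += 1
            self.expect("->")
            return ("fn", [], self.type_())
        return self.ftype(self.type_())

    # <ftype> ::= ")" <fcp> | "," <type> <params2> ")" "->" <type>
    def ftype(self, first):
        if self.peek() == ")":
            self.pos += 1
            return self.fcp(first)
        self.expect(",")
        params = [first, self.type_()]
        while self.peek() == ",":  # <params2>, written as a loop
            self.pos += 1
            params.append(self.type_())
        self.expect(")")
        self.expect("->")
        return ("fn", params, self.type_())

    # <fcp> ::= <opt_size> | "->" <type>
    def fcp(self, inner):
        if self.peek() == "->":
            self.pos += 1
            return ("fn", [inner], self.type_())
        return self.opt_size(inner)  # it was just a parenthesized type

    # <opt_size> ::= "" | "[" INTEGER_LITERAL "]"
    def opt_size(self, base):
        if self.peek() != "[":
            return base
        self.pos += 1
        size = self.peek()
        if not size.isdigit():
            raise SyntaxError(f"expected an integer literal, found {size!r}")
        self.pos += 1
        self.expect("]")
        return ("array", base, int(size))


print(Parser("() -> Integer[42]").parse())
print(Parser("(() -> Integer)[42]").parse())
```

The two calls at the end show the distinction the question asks for: the first input is parsed as a function returning an array, the second as an array of functions.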