ANTLR3: No viable alternative at character

Question (votes: 1, answers: 1)

I have this ANTLR3 grammar:

grammar wft;

@header {
    package com.mycompany.wftdiff.parser;

    import com.mycompany.wftdiff.model.*;
}
@lexer::header {
    package com.mycompany.wftdiff.parser;
}
@members {
    private final WftFile wftFile = new WftFile();

    public WftFile getParsingResult() {
        return wftFile;
    }
}
wftFile:
    {
        System.out.println("Heyo!");
    }
    (CommentLine | assignment | NewLine)*
    itemTypeDefinition
    EOF
    ;

/**
 * ItemTypeDefinition
 * DEFINE ITEM_TYPE
 * END ITEM_TYPE
 */
itemTypeDefinition:
    'DEFINE ITEM_TYPE' NewLine
    (KeyName|TransStmt|BaseStmt|NewLine)+
        WhiteSpace* 'DEFINE ITEM_ATTRIBUTE' NewLine
        (KeyName|TransStmt|BaseStmt)*
        WhiteSpace* 'END ITEM_ATTRIBUTE' NewLine
    'END ITEM_TYPE'
    ;

/**
 * KeyName
 * KEY NAME VARCHAR2(8)
 */
KeyName: WhiteSpace* KeyNameStart .* {$channel = HIDDEN;} NewLine;
fragment KeyNameStart: 'KEY NAME VARCHAR2(';

/**
 * TransStmt
 * TRANS DISPLAY_NAME VARCHAR2(80)
 */
TransStmt: WhiteSpace* TransStmtStart .* {$channel = HIDDEN;} NewLine;
fragment TransStmtStart: 'TRANS';

/**
 * BaseStmt
 BASE PROTECT_LEVEL NUMBER
 */
BaseStmt: WhiteSpace* BaseStmtStart .* {$channel = HIDDEN;} NewLine;
fragment BaseStmtStart: 'BASE';

/**
 * Assignment
 */
assignment returns [Assignment assignment]:
    {
        System.out.println("Assignment found!");
    }
    target=AssignmentTarget
    WhiteSpace '=' WhiteSpace
    value=String {
        assignment = new Assignment(target.getText(), value.getText());
        wftFile.addAssignment(new Assignment(target.getText(), value.getText()));
    }
    NewLine;

AssignmentTarget: A (A|D|'_')*;
String: '"' ~'"'* '"'
;

/**
 * Comment
 */
CommentLine: CommentStart .* {$channel = HIDDEN;} NewLine;
fragment CommentStart: '#';

// Lexer rules

fragment D: '0'..'9';
fragment A: 'A'..'Z'
    | 'a'..'z';
StringLength: D+;
NewLine   : '\r' '\n' | '\n' | '\r';
WhiteSpace: ' ';

Then I generate a parser for it with

java -cp "D:\wftdiff\lib\antlr-3.5.2\antlr-3.5.2-complete.jar" org.antlr.Tool -o src/com/mycompany/wftdiff/parser/ grammar-src/wft.g

...and call it like this:

val lexer = wftLexer(ANTLRFileStream(fileName))
val parser = wftParser(CommonTokenStream(lexer))
parser.wftFile()
System.out.println("Test")

fileName points to a text file with the following contents:

# Oracle Workflow Process Definition
# $Header$

VERSION_MAJOR = "2"
VERSION_MINOR = "6"
LANGUAGE = "GERMAN"

ACCESS_LEVEL = "100"

DEFINE ITEM_TYPE
  KEY NAME VARCHAR2(8)
  TRANS DISPLAY_NAME VARCHAR2(80)
  TRANS DESCRIPTION VARCHAR2(240)
  BASE PROTECT_LEVEL NUMBER
  BASE CUSTOM_LEVEL NUMBER
  BASE WF_SELECTOR VARCHAR2(240)
  BASE READ_ROLE REFERENCES ROLE
  BASE WRITE_ROLE REFERENCES ROLE
  BASE EXECUTE_ROLE REFERENCES ROLE
  BASE PERSISTENCE_TYPE VARCHAR2(8)
  BASE PERSISTENCE_DAYS NUMBER

  DEFINE ITEM_ATTRIBUTE
    KEY NAME VARCHAR2(30)
    TRANS DISPLAY_NAME VARCHAR2(80)
    TRANS DESCRIPTION VARCHAR2(240)
    BASE PROTECT_LEVEL NUMBER
    BASE CUSTOM_LEVEL NUMBER
    BASE TYPE VARCHAR2(8)
    BASE FORMAT VARCHAR2(240)
    BASE VALUE_TYPE VARCHAR2(8)
    BASE DEFAULT VARCHAR2(4000)
  END ITEM_ATTRIBUTE
END ITEM_TYPE

I get the following output:

Heyo!
Assignment found!
Assignment found!
Assignment found!
Assignment found!
test-data/partialSample01.wft line 25:2 no viable alternative at character 'D'
test-data/partialSample01.wft line 35:2 no viable alternative at character 'E'
Test

How should I change my grammar to get rid of the no viable alternative at character 'D' error?

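Note that I don't need to parse this part of the file (I'm not interested in this particular information; what I care about comes later in the file).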

Update 1: I tried to ignore the whole block as suggested here (using skip()), but it didn't help.

New grammar file:

grammar wft;

@header {
    package com.mycompany.wftdiff.parser;

    import com.mycompany.wftdiff.model.*;
}
@lexer::header {
    package com.mycompany.wftdiff.parser;
}
@members {
    private final WftFile wftFile = new WftFile();

    public WftFile getParsingResult() {
        return wftFile;
    }
}
wftFile:
    {
        System.out.println("Heyo!");
    }
    (CommentLine | assignment | NewLine)*
    itemTypeDefinition
    EOF
    ;

/**
 * ItemTypeDefinition
 * DEFINE ITEM_TYPE
 * END ITEM_TYPE
 */
itemTypeDefinition:
    'DEFINE ITEM_TYPE' NewLine
    (KeyName|TransStmt|BaseStmt|NewLine)+
        WhiteSpace*
        NewLine
        DefineItemAttribute
        WhiteSpace*
    'END ITEM_TYPE'
    ;

DefineItemAttribute: 'DEFINE ITEM_ATTRIBUTE' .* 'END ITEM_ATTRIBUTE' {skip();};

/**
 * KeyName
 * KEY NAME VARCHAR2(8)
 */
KeyName: WhiteSpace* KeyNameStart .* {$channel = HIDDEN;} NewLine;
fragment KeyNameStart: 'KEY NAME VARCHAR2(';

/**
 * TransStmt
 * TRANS DISPLAY_NAME VARCHAR2(80)
 */
TransStmt: WhiteSpace* TransStmtStart .* {$channel = HIDDEN;} NewLine;
fragment TransStmtStart: 'TRANS';

/**
 * BaseStmt
 BASE PROTECT_LEVEL NUMBER
 */
BaseStmt: WhiteSpace* BaseStmtStart .* {$channel = HIDDEN;} NewLine;
fragment BaseStmtStart: 'BASE';

/**
 * Assignment
 */
assignment returns [Assignment assignment]:
    {
        System.out.println("Assignment found!");
    }
    target=AssignmentTarget
    WhiteSpace '=' WhiteSpace
    value=String {
        assignment = new Assignment(target.getText(), value.getText());
        wftFile.addAssignment(new Assignment(target.getText(), value.getText()));
    }
    NewLine;

AssignmentTarget: A (A|D|'_')*;
String: '"' ~'"'* '"'
;

/**
 * Comment
 */
CommentLine: CommentStart .* {$channel = HIDDEN;} NewLine;
fragment CommentStart: '#';

// Lexer rules

fragment D: '0'..'9';
fragment A: 'A'..'Z'
    | 'a'..'z';
StringLength: D+;
NewLine   : '\r' '\n' | '\n' | '\r';
WhiteSpace: ' ';

Parsing result:

Heyo!
Assignment found!
Assignment found!
Assignment found!
Assignment found!
test-data/partialSample01.wft line 25:2 no viable alternative at character 'D'
test-data/partialSample01.wft line 36:0 missing DefineItemAttribute at 'END ITEM_TYPE'
Test

Bounty terms

I will award the bounty to the person who accomplishes the following heroic deed:

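  1. Create a parser that is able to recognize all parts of this file that are marked as relevant, i.e.
     1.1. everything between the BEGIN ACTIVITY and END ACTIVITY tags,
     1.2. everything between BEGIN ACTIVITY_TRANSITION and END ACTIVITY_TRANSITION,
     1.3. everything between the BEGIN PROCESS_ACTIVITY and END PROCESS_ACTIVITY tags.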

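By "recognize everything" I mean that there must be ANTLR 3 code that lets me insert Java statements to process the data extracted from the file, as in the assignment rule of the original post. You don't need to write any Java code there, but it must be possible for me to add that code later.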

All parts not marked as relevant can be ignored by the parser (similar to the comments in the original grammar).

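  2. Your grammar must be compatible with ANTLR 3, Java 8, and Windows 7.
  3. You may remove code from the original version (such as here) so that you don't get compiler errors.
  4. It must be possible to generate the parser with java -cp "D:\wftdiff\lib\antlr-3.5.2\antlr-3.5.2-complete.jar" org.antlr.Tool -o src/com/mycompany/wftdiff/parser/ grammar-src/wft.g or, if you use any special settings, you need to specify them in your answer. The point is that I need to be able to reproduce your results.
  5. When I feed the sample file to the parser, it must process it without complaints (no ANTLR error messages printed, no crashes, no technical exceptions such as NullPointerException thrown).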
java parsing antlr grammar antlr3
1 answer
0 votes

Here is the grammar. It recognizes all the parts, and you can add Java actions wherever you like.

Compiled with JDK 1.8 and ANTLR 3.5.2, and tested against the supplied sample input.

grammar wft;

@header {
    package com.mycompany.wftdiff.parser;
}

@lexer::header {
    package com.mycompany.wftdiff.parser;
}

@members {
}

wftFile :   (COMMENT|assignment|definition|flow)*
    ;

assignment
    :   ID EQ STRING
    ;

definition
    :   'DEFINE' ID
        (COMMENT | (dclass ID type) | definition)* 
        'END' ID
    ;


dclass  :   'KEY' | 'BASE' | 'TRANS'
    ;

type    :   tnum | tvarchar | tref | tdate
    ;

tnum    :   'NUMBER'
    ;

tvarchar:   'VARCHAR2' '(' INT ')'
    ;

tref    :   'REFERENCES' ID
    ;

tdate   :   'DATE'
    ;

flow    :   'BEGIN' ID (STRING)+
        (COMMENT|assignment|flow)+
        'END' ID
    ;

EQ  :   '='
    ;

ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;


NL  :   '\r'? '\n' {$channel=HIDDEN;}
    ;

COMMENT
    :   '#' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    ;

WS  :   ( ' '
        | '\t'
        ) {$channel=HIDDEN;}
    ;

STRING
    :  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    |   UNICODE_ESC
    |   OCTAL_ESC
    ;

fragment
OCTAL_ESC
    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7')
    ;

fragment
UNICODE_ESC
    :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
    ;

INT :   '0'..'9'+
    ;
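
As a rough illustration of where such Java actions could go (only a sketch, not part of the tested grammar above): the labels target, value and kind and the println calls are assumptions added for this example. Replacing the assignment and flow rules with the versions below should print each assignment and the type of each BEGIN block as it is parsed.

```
assignment
    :   target=ID EQ value=STRING
        {
            // The STRING text still includes the surrounding double quotes.
            System.out.println($target.text + " = " + $value.text);
        }
    ;

flow
    :   'BEGIN' kind=ID (STRING)+
        {
            // kind holds the block type, e.g. ACTIVITY or PROCESS_ACTIVITY.
            System.out.println("BEGIN " + $kind.text);
        }
        (COMMENT|assignment|flow)+
        'END' ID
    ;
```

To store the values in the question's WftFile instead of printing them, the model import in @header and the wftFile field in @members from the original grammar would have to be restored first.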