Lex and Yacc problem with comments
I am trying to locate the root cause of an issue. I have the following line that needs to be parsed:
sample format "string";
Here sample and format need to be tokenized, and whatever is inside the inverted commas needs to be passed to the parser file.
There is a catch, however: if I have a Perl-style comment character (#) inside the string, I get an error.
In lexer.l, I have the following:
stringIdentifier [^"]+
<STRING_S>{stringIdentifier} {
strncpy(yylval.str, yytext,1023);
yylval.str[1023] = '\0';
return IDENTIFIER;
}
<*>"//".* {
}
<*>"#".* {
}
<INITIAL>{s}{a}{m}{p}{l}{e} {
BEGIN(SAMPLE_S);
return SAMPLE;
}
<SAMPLE_S>{f}{o}{r}{m}{a}{t} {
return FORMAT;
}
<SAMPLE_S>"\"" {
BEGIN(STRING_S);
return INVERTED_COMMA;
}
<STRING_S>"\"" {
BEGIN(INITIAL);
return INVERTED_COMMA;
}
In Parser.y, I have the following rule:
pass : SAMPLE FORMAT INVERTED_COMMA IDENTIFIER INVERTED_COMMA
{
};
When I give sample format "abc;" it works. However, when I add a comment character # inside the string, it fails. Could you please help with this?
The answer lies in the way you have used start conditions. A quick read of the lex/flex manual explains how they operate.
The <*> prefix means that the following pattern is applied in every state. This includes inside a string, which is indicated by the STRING_S state. To stop the comment patterns from operating inside a string, you need to exclude the STRING_S state from <*>. You can do this by listing all the other applicable states, which in your example are <INITIAL,SAMPLE_S>. The comment rules then become:
<INITIAL,SAMPLE_S>"//".* {
}
<INITIAL,SAMPLE_S>"#".* {
}
And that's it. It now works! (I have tested it, BTW.)
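For reference, here is a minimal standalone sketch of the same idea, assuming flex. The scanner picks the longest match, so when a # sits at the point where matching starts inside STRING_S, a <*>"#".* rule runs to the end of the line and swallows the closing quote; restricting the comment rules to the other states avoids that. Several things here are my own assumptions and not from the question: the token codes are hypothetical stand-ins for the values yacc would generate, the keywords are written as plain lowercase literals in place of the {s}{a}{m}... macros, the start conditions are declared exclusive with %x, and the whitespace rule and main() exist only so the file can be built on its own.

```lex
%{
/* Sketch only: hypothetical token codes standing in for those from Parser.y. */
#include <stdio.h>
enum { SAMPLE = 258, FORMAT, INVERTED_COMMA, IDENTIFIER };
%}

%option noyywrap nounput noinput
%x SAMPLE_S STRING_S

%%
<INITIAL,SAMPLE_S>"//".*        { /* comment, skipped outside strings only */ }
<INITIAL,SAMPLE_S>"#".*         { /* comment, skipped outside strings only */ }
<INITIAL>"sample"               { BEGIN(SAMPLE_S); return SAMPLE; }
<SAMPLE_S>"format"              { return FORMAT; }
<SAMPLE_S>"\""                  { BEGIN(STRING_S); return INVERTED_COMMA; }
<STRING_S>[^"]+                 { return IDENTIFIER; /* may contain # or // */ }
<STRING_S>"\""                  { BEGIN(INITIAL); return INVERTED_COMMA; }
<INITIAL,SAMPLE_S>[ \t\r\n;]+   { /* skip whitespace and the trailing ';' */ }
%%

/* Print each token code with its matched text until end of input. */
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("%d %s\n", tok, yytext);
    return 0;
}
```

Saved under a hypothetical name such as demo.l, it can be built with flex demo.l followed by cc lex.yy.c -o demo. Feeding it sample format "#abc"; should then report the string body #abc as a single IDENTIFIER token between the two INVERTED_COMMA tokens, instead of treating it as a comment.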