Using a regular expression to parse XML and get the value between tags

Question  Votes: 0  Answers: 2

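I have a regular expression that I use to get the data between a pair of tags, for example <CLASSCOD>70</CLASSCOD>. The regular expression I am using is (?<=<CLASSCOD>)(?:[^<]|<(?!/CLASSCOD))*. It works in most cases, but when I have a value like <CLASSCOD>N</CLASSCOD>, it reports no match.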

The whole data string looks like this:

<DATE>0601</DATE>
<YEAR>11</YEAR>
<AGENCY>Department of the Interior</AGENCY>
<OFFICE>Bureau of Indian Affairs</OFFICE>
<LOCATION>BIA - DAPM</LOCATION>
<ZIP>85004</ZIP>
<CLASSCOD>N</CLASSCOD>
<OFFADD>Contracting Office - Western Region 2600 N. Central Avenue, 4th Floor Phoenix AZ 85004</OFFADD>
<SUBJECT>Boiler Replacement</SUBJECT>
<SOLNBR>A11PS00463</SOLNBR>
<RESPDATE>061711</RESPDATE>
<ARCHDATE>05312012</ARCHDATE>
<CONTACT>Geraldine M. Williams Purchasing Agent 6023794087 [email protected];<a href="mailto:[email protected]">Point of Contact above, or if none listed, contact the IDEAS EC HELP DESK for assistance</a>
</CONTACT>
<LINK><URL>https://www.fbo.gov/spg/DOI/BIA/RestonVA/A11PS00463/listing.html<LINKDESC>Link To Document</LINK>
<EMAIL></EMAIL>
<EMAIL>
  [email protected]
  <EMAILDESC>
    Point of Contact above, or if none listed, contact the IDEAS EC HELP DESK for assistance
  </EMAILDESC>
</EMAIL>
<SETASIDE>Total Small Business</SETASIDE>
<POPCOUNTRY>USA</POPCOUNTRY>
<POPZIP>85634</POPZIP>
<POPADDRESS>BIE Tohono O'odham High School, Sells, AZ</POPADDRESS>

Any suggestions?

Thanks

c# regex vb.net
2 Answers
2
votes

Something simpler should work:

<CLASSCOD>(.+?)</CLASSCOD>

Example:

Match match = Regex.Match(input, @"<CLASSCOD>(.+?)</CLASSCOD>");
if (match.Success) {
    string value = match.Groups[1].Value;
    Console.WriteLine(value);
}
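
One caveat with this pattern: `.+?` needs at least one character and, by default, `.` does not match a newline. An empty element such as <EMAIL></EMAIL> or a multi-line one such as <CONTACT> in the sample data would therefore not match. The following is only a minimal sketch of a more tolerant variant, assuming a hypothetical helper named `TagValue.Get` (it is not part of the original answer); it swaps in `.*?` and `RegexOptions.Singleline`:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagValue
{
    // Hypothetical helper (illustration only): returns the text between
    // <tag> and </tag>, or null when the tag pair is not found.
    public static string Get(string input, string tag)
    {
        // Escape the tag name so it is treated literally inside the pattern.
        string name = Regex.Escape(tag);

        // .*? also accepts an empty value; Singleline lets '.' match newlines.
        Match match = Regex.Match(
            input,
            "<" + name + ">(.*?)</" + name + ">",
            RegexOptions.Singleline);

        return match.Success ? match.Groups[1].Value : null;
    }
}
```

With the sample data in `input`, `TagValue.Get(input, "CLASSCOD")` should return the string N. Because the match is lazy, only the first occurrence of the tag is returned.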

1
vote

If you want to extract the values between the tags, you can use the following regex:

<([^>]+)>([^<]*)</\1>

In this case there is no need for lookahead or lookbehind operators.
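
To illustrate how the backreference pattern behaves: group 1 captures the tag name, `\1` requires the closing tag to repeat that name, and group 2 holds the value. Below is a rough sketch, assuming a hypothetical helper named `TagPairs.Extract` (not from the original answer), that collects every simple element as a name/value pair. Since `[^<]*` cannot cross another tag, elements that contain nested markup, such as <CONTACT> in the sample data, are not returned as a whole; only their simple inner elements can match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagPairs
{
    // Hypothetical helper (illustration only): lists (tag name, value)
    // pairs for elements whose content contains no further tags.
    public static List<KeyValuePair<string, string>> Extract(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();

        // Group 1 = tag name, \1 = same name in the closing tag, group 2 = value.
        foreach (Match m in Regex.Matches(input, @"<([^>]+)>([^<]*)</\1>"))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups[1].Value,
                m.Groups[2].Value));
        }

        return pairs;
    }
}
```

Each returned pair can then be filtered by key, for example to pick out the CLASSCOD entry.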
