.NET regex: strings that do not contain a full stop in the last list item

Question

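I'm trying to use a .NET regular expression to identify strings in XML data that do not contain a full stop before the last tag. I don't have much experience with regular expressions, and I'm not sure what I need to change, or why, to get the result I'm looking for.

Every line in the data ends with a carriage return and a line feed.

Example of good XML data: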

<randlist prefix="unorder">
    <item>abc</item>
    <item>abc</item>
    <item>abc</item>
</randlist>

Example of bad XML data, which the regexp should match (full stop before the last </item>):

<randlist prefix="unorder">
    <item>abc</item>
    <item>abc</item>
    <item>abc.</item>
</randlist>

The regex pattern I tried, which does not work on the bad XML data (not tested against the good XML data):

^<randlist \w*=[\S\s]*\.*[^.]<\/item>[\n]*<\/randlist>$

Result using http://regexstorm.net/tester:

0 matches

Result using https://regex101.com/:

0 matches

Because of the full stop and start-of-string conditions, this question is imo different from the following one:

Regex for string not ending with given suffix

regex101's explanation of the pattern:

/^<randlist \w*=[\S\s]*\.*[^.]<\/item>[\n]*<\/randlist>$/gm
^ asserts position at start of a line
<randlist  matches the characters <randlist  literally (case sensitive)
\w* matches any word character (equal to [a-zA-Z0-9_])
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
= matches the character = literally (case sensitive)
Match a single character present in the list below [\S\s]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\S matches any non-whitespace character (equal to [^\r\n\t\f\v ])
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\.* matches the character . literally (case sensitive)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Match a single character not present in the list below [^.]
. matches the character . literally (case sensitive)
< matches the character < literally (case sensitive)
\/ matches the character / literally (case sensitive)
item> matches the characters item> literally (case sensitive)
Match a single character present in the list below [\n]*
< matches the character < literally (case sensitive)
\/ matches the character / literally (case sensitive)
randlist> matches the characters randlist> literally (case sensitive)
$ asserts position at the end of a line
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
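
The explanation above can be checked with a small experiment. The sketch below uses Python's `re` module as a stand-in for .NET, on the assumption that both engines treat these particular constructs the same way; `make_list` and the sample names are hypothetical helpers, not part of the original data. It suggests two reasons for the 0 matches: `[^.]<\/item>` requires a character other than a full stop before the closing tag, so the pattern selects lists whose last item has no full stop, and `[\n]*` cannot consume the carriage return in a CRLF line ending.

```python
import re

# The pattern from the question, unchanged, with the multiline flag (the "m" above).
TRIED = re.compile(
    r"^<randlist \w*=[\S\s]*\.*[^.]<\/item>[\n]*<\/randlist>$",
    re.MULTILINE,
)


def make_list(last_item, newline):
    """Build the sample list; only the last item and the line ending vary."""
    lines = [
        '<randlist prefix="unorder">',
        "    <item>abc</item>",
        "    <item>abc</item>",
        "    <item>" + last_item + "</item>",
        "</randlist>",
    ]
    return newline.join(lines)


good_lf = make_list("abc", "\n")       # good data, LF line endings
bad_lf = make_list("abc.", "\n")       # bad data, LF line endings
good_crlf = make_list("abc", "\r\n")   # good data, CRLF line endings

# [^.] before </item> accepts "c", so the good list matches.
print(bool(TRIED.search(good_lf)))     # True
# The last item ends in ".", which [^.] rejects, so the bad list does not match.
print(bool(TRIED.search(bad_lf)))      # False
# [\n]* cannot skip the "\r" between </item> and </randlist>.
print(bool(TRIED.search(good_crlf)))   # False
```

In other words, the pattern as written describes the good data rather than the bad data, and only when the lines end in a bare line feed.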

Answer 1

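@Silvanas is absolutely right. You should not use regex for this problem; use some form of XML parser to read the data and look for the lines with a "." instead. However, if for some horrible reason you must use regex, and if the data is structured exactly like the examples, then a regex solution is: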

^\s+<item>[^<]*?(?<=\.)<\/item>$
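
A minimal sketch of how this pattern behaves, again using Python's `re` as a stand-in for .NET (an assumption; in .NET the equivalent would need `RegexOptions.Multiline` so that `^` and `$` apply per line). In both engines `$` matches before `\n` but not before `\r`, so on data with CRLF line endings the pattern needs `\r?` before `$`. `ANSWER_CRLF`, `LAST_ONLY`, and `make_list` are hypothetical names introduced for illustration; `LAST_ONLY` is one possible way to restrict the check to the last item, not something the answer itself proposes.

```python
import re

# The pattern from the answer, unchanged, in multiline mode.
ANSWER = re.compile(r"^\s+<item>[^<]*?(?<=\.)<\/item>$", re.MULTILINE)
# Same pattern, tolerating a carriage return before the end of the line.
ANSWER_CRLF = re.compile(r"^\s+<item>[^<]*?(?<=\.)<\/item>\r?$", re.MULTILINE)
# Hypothetical variant: only an item directly followed by </randlist>.
LAST_ONLY = re.compile(r"<item>[^<]*\.<\/item>(?=\s*<\/randlist>)")


def make_list(items, newline):
    """Build a randlist sample from item texts with the given line ending."""
    lines = ['<randlist prefix="unorder">']
    lines += ["    <item>" + text + "</item>" for text in items]
    lines.append("</randlist>")
    return newline.join(lines)


bad_lf = make_list(["abc", "abc", "abc."], "\n")
bad_crlf = make_list(["abc", "abc", "abc."], "\r\n")
middle_crlf = make_list(["abc", "abc.", "abc"], "\r\n")  # full stop not in last item

print(bool(ANSWER.search(bad_lf)))            # True
print(bool(ANSWER.search(bad_crlf)))          # False: "$" does not match before "\r"
print(bool(ANSWER_CRLF.search(bad_crlf)))     # True
print(bool(ANSWER_CRLF.search(middle_crlf)))  # True: any item with a full stop is flagged
print(bool(LAST_ONLY.search(bad_crlf)))       # True
print(bool(LAST_ONLY.search(middle_crlf)))    # False: the last item has no full stop
```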

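If that regex produces any matches, your XML contains bad data. But again, if the whitespace is off, if there is other content on the line, if the tags aren't <item>..</item>, and so on, this regex will fail as well. Once more: unless you can absolutely guarantee that everything apart from the "." is well-formed XML, it is better not to use regex for this problem.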
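
For comparison, the parser-based approach recommended above could look roughly like the following. This is only a sketch in Python using the standard `xml.etree.ElementTree` module; in .NET an XML parser from the framework would play the same role. The function name `lists_with_full_stop_in_last_item` is hypothetical, and the code assumes that `randlist` elements hold their `item` elements as direct children, as in the samples.

```python
import xml.etree.ElementTree as ET


def lists_with_full_stop_in_last_item(xml_text):
    """Return every <randlist> element whose last <item> text ends with '.'."""
    root = ET.fromstring(xml_text)
    flagged = []
    # iter() also yields the root itself when it is a <randlist>.
    for randlist in root.iter("randlist"):
        items = randlist.findall("item")
        if not items:
            continue  # an empty list has no last item to check
        last_text = (items[-1].text or "").rstrip()
        if last_text.endswith("."):
            flagged.append(randlist)
    return flagged


good = (
    '<randlist prefix="unorder">\r\n'
    "    <item>abc</item>\r\n"
    "    <item>abc</item>\r\n"
    "    <item>abc</item>\r\n"
    "</randlist>"
)
bad = good.replace("<item>abc</item>\r\n</randlist>", "<item>abc.</item>\r\n</randlist>")

print(len(lists_with_full_stop_in_last_item(good)))  # 0
print(len(lists_with_full_stop_in_last_item(bad)))   # 1
```

Because the parser works on elements rather than on lines, indentation, line endings, and extra content on a line do not affect the result.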
