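I'm using the regular expression below to capture all the digits/letters after an underscore, but I only need the second occurrence, i.e. "00500", as shown below:
Regular expression: (?<=_)[a-zA-Z0-9]+
String: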
"-rw-rw-rw- 1 rats rats 31K Sep 17 13:33 /opt/data/automation_sent/20180918/labc/0/20180918_00500.itx"
I'm doing this in C#. I thought the value would be in the second group, Groups[1], but it isn't; it only captures the string "_sent":
string temp2 = "";
Regex getValueAfterUnderscore = new Regex(@"(?<=_)[a-zA-Z0-9]+");
Match match2 = getValueAfterUnderscore.Match(line);
if (match2.Success)
{
temp2 = match2.Groups[1].Value;
Console.WriteLine(temp2);
}
Any ideas? Thanks!
You can use the following code to capture the text after the second underscore:
var line = "-rw-rw-rw- 1 rats rats 31K Sep 17 13:33 /opt/data/automation_sent/20180918/labc/0/20180918_00500.itx";
string temp2 = "";
Regex getValueAfterUnderscore = new Regex(@"_.+_([a-zA-Z0-9]+)");
Match match2 = getValueAfterUnderscore.Match(line);
if (match2.Success)
{
temp2 = match2.Groups[1].Value;
Console.WriteLine(temp2);
}
Output:
00500
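Perhaps you are confusing "groups" with "matches". A group is a parenthesized part of a single match, while each occurrence of the whole pattern in the string is a separate match, so you should look through the matches of the regular expression. Here is how to list all matches of a regular expression in a given string: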
string str = "-rw-rw-rw- 1 rats rats 31K Sep 17 13:33 /opt/data/automation_sent/20180918/labc/0/20180918_00500.itx";
MatchCollection matches = Regex.Matches(str, @"(?<=_)[a-zA-Z0-9]+");
foreach (Match curMatch in matches)
Console.WriteLine(curMatch.Value);
For your specific case, verify that there are at least 2 matches and retrieve the value of matches[1] (which is the second match).
if (matches.Count >= 2)
Console.WriteLine($"Your result: {matches[1].Value}");
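The difference between groups and matches can be wrapped in a small helper. This is only a sketch: the class name UnderscoreMatches and the methods TryGetNth and FirstMatchHasGroupOne are hypothetical names made up for illustration, and it reuses the lookbehind pattern from the question unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class UnderscoreMatches
{
    // Same pattern as in the question: letters/digits preceded by an underscore.
    private const string Pattern = @"(?<=_)[a-zA-Z0-9]+";

    // Tries to fetch the n-th (1-based) match; returns false when there are
    // fewer than n matches or n is not positive.
    public static bool TryGetNth(string input, int n, out string value)
    {
        MatchCollection matches = Regex.Matches(input, Pattern);
        if (n >= 1 && matches.Count >= n)
        {
            value = matches[n - 1].Value;
            return true;
        }
        value = string.Empty;
        return false;
    }

    // The pattern contains no capturing parentheses, so only Groups[0]
    // (the whole match) exists; Groups[1] is an unsuccessful, empty group.
    public static bool FirstMatchHasGroupOne(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Groups[1].Success;
    }
}
```

With the line from the question, TryGetNth(line, 1, out var first) yields "sent" and TryGetNth(line, 2, out var second) yields "00500", while FirstMatchHasGroupOne(line) is false, which is why reading Groups[1] does not give the second occurrence.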
var input = "-rw-rw-rw- 1 rats rats 31K Sep 17 13:33 /opt/data/automation_sent/20180918/labc/0/20180918_00500.itx";
Regex regex = new Regex(@"(?<Identifier1>\d+)_(?<Identifier2>\d+)");
var results = regex.Matches(input);
foreach (Match match in results)
{
Console.WriteLine(match.Groups["Identifier1"].Value);
Console.WriteLine(match.Groups["Identifier2"].Value); // second occurrence
}
If all your strings look like {SOME_STRING}_{YOUR_NUMBER}.itx, then you can use this solution (no regular expression needed):
var arr = str.Split(new[] {"_", ".itx"}, StringSplitOptions.RemoveEmptyEntries);
var result = arr[arr.Length - 1];
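A variant of the same non-regex idea takes the text between the last underscore and the last dot, so it does not depend on the extension being ".itx". This is a sketch under the assumption that the wanted value always sits between those two characters; the names SuffixExtractor and TryGetAfterLastUnderscore are hypothetical.

```csharp
using System;

// Hypothetical helper, shown only to illustrate the index-based approach.
public static class SuffixExtractor
{
    // Returns the text between the last '_' and the last '.' of the input.
    // Fails when either character is missing or the dot precedes the underscore.
    public static bool TryGetAfterLastUnderscore(string input, out string value)
    {
        int underscore = input.LastIndexOf('_');
        int dot = input.LastIndexOf('.');
        if (underscore >= 0 && dot > underscore)
        {
            value = input.Substring(underscore + 1, dot - underscore - 1);
            return true;
        }
        value = string.Empty;
        return false;
    }
}
```

For the line in the question this gives "00500". Unlike indexing into the result of Split, it reports failure instead of returning an unrelated fragment when the line has no underscore.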