Suppose I have the following regex:
-(\d+)-
and I want to use C# to replace group 1, (\d+), with AA to obtain:
-AA-
Right now I am replacing it with:
var text = "example-123-example";
var pattern = @"-(\d+)-";
var replaced = Regex.Replace(text, pattern, "-AA-");
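But I don't really like this, because if I change the pattern to match _(\d+)_ instead, I also have to change the replacement string to _AA_, which violates the DRY principle.
I'm looking for something like:
Keep the matched text exactly as it is, but change group 1 to this text, group 2 to another text, and so on.
Edit:
That was just an example. I'm looking for a generic way of doing what I described above.
It should work for:
anything(\d+)more_text
and any pattern you can imagine.
All I want to do is replace only the groups and keep the rest of the match.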
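A good idea could be to wrap everything in groups, whether or not you need to identify them. That way you can refer to them in your replacement string. For example: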
var pattern = @"(-)(\d+)(-)";
var replaced = Regex.Replace(text, pattern, "$1AA$3");
or using a MatchEvaluator:
var replaced = Regex.Replace(text, pattern, m => m.Groups[1].Value + "AA" + m.Groups[3].Value);
Another way, slightly messy, could be to use a lookbehind/lookahead:
(?<=-)(\d+)(?=-)
You can do this using lookahead and lookbehind, so that only the digits are part of the match:
var pattern = @"(?<=-)\d+(?=-)";
var replaced = Regex.Replace(text, pattern, "AA");
I also had a need for this, and I created the following extension method for it:
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexExtensions
{
    public static string ReplaceGroup(
        this Regex regex, string input, string groupName, string replacement)
    {
        return regex.Replace(
            input,
            m =>
            {
                var group = m.Groups[groupName];
                var sb = new StringBuilder();
                // Offset within m.Value up to which text has already been copied.
                var previousCaptureEnd = 0;
                foreach (var capture in group.Captures.Cast<Capture>())
                {
                    var currentCaptureEnd =
                        capture.Index + capture.Length - m.Index;
                    // Length of the untouched text between the previous capture and this one.
                    var currentCaptureLength =
                        capture.Index - m.Index - previousCaptureEnd;
                    sb.Append(
                        m.Value.Substring(
                            previousCaptureEnd, currentCaptureLength));
                    sb.Append(replacement);
                    previousCaptureEnd = currentCaptureEnd;
                }
                // Copy whatever follows the last capture.
                sb.Append(m.Value.Substring(previousCaptureEnd));
                return sb.ToString();
            });
    }
}
Usage:
var input = @"[assembly: AssemblyFileVersion(""2.0.3.0"")][assembly: AssemblyFileVersion(""2.0.3.0"")]";
var regex = new Regex(@"AssemblyFileVersion\(""(?<version>(\d+\.?){4})""\)");
var result = regex.ReplaceGroup(input, "version", "1.2.3");
Result:
[assembly: AssemblyFileVersion("1.2.3")][assembly: AssemblyFileVersion("1.2.3")]
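If you don't want to change your pattern, you can use the Index and Length properties of the matched group. Note that this replaces the group in the first match only.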
var text = "example-123-example";
var pattern = @"-(\d+)-";
var regex = new Regex(pattern);
var match = regex.Match(text);
var firstPart = text.Substring(0,match.Groups[1].Index);
var secondPart = text.Substring(match.Groups[1].Index + match.Groups[1].Length);
var fullReplace = firstPart + "AA" + secondPart;
Here is another nice, clean option that does not require changing your pattern.
var text = "example-123-example";
var pattern = @"-(\d+)-";
var replaced = Regex.Replace(text, pattern, (_match) =>
{
    Group group = _match.Groups[1];
    string replace = "AA";
    return String.Format("{0}{1}{2}", _match.Value.Substring(0, group.Index - _match.Index), replace, _match.Value.Substring(group.Index - _match.Index + group.Length));
});
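For the "group 1 to this text, group 2 to another text" case from the question, the same evaluator idea can be extended to several groups at once. What follows is only a minimal sketch: GroupReplaceSketch and ReplaceGroups are hypothetical names (not part of .NET), and it assumes the listed groups do not overlap or nest and that they all lie inside the match (no groups inside lookarounds).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class GroupReplaceSketch
{
    // Hypothetical helper: replaces each listed group number with its text
    // and leaves the rest of every match untouched.
    // Assumes the groups do not overlap and lie inside the match.
    public static string ReplaceGroups(
        string input, string pattern, IReadOnlyDictionary<int, string> replacements)
    {
        return Regex.Replace(input, pattern, m =>
        {
            var sb = new StringBuilder();
            var pos = m.Index; // absolute position in input copied so far

            // Only groups that took part in this match, in left-to-right order.
            var groups = replacements
                .Select(kv => (Group: m.Groups[kv.Key], Text: kv.Value))
                .Where(x => x.Group.Success)
                .OrderBy(x => x.Group.Index);

            foreach (var (group, text) in groups)
            {
                sb.Append(input, pos, group.Index - pos); // text before the group
                sb.Append(text);                          // the replacement
                pos = group.Index + group.Length;
            }

            // Remainder of the match after the last replaced group.
            sb.Append(input, pos, m.Index + m.Length - pos);
            return sb.ToString();
        });
    }
}
```

Usage, under the same assumptions:
GroupReplaceSketch.ReplaceGroups("example-123-abc-example", @"-(\d+)-([a-z]+)-", new Dictionary<int, string> { { 1, "AA" }, { 2, "BB" } })
should give example-AA-BB-example, and switching the separators in the pattern to underscores needs no change to the replacements.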
Replacement code:
var text = "example-123-example";
var pattern = @"-(\d+)-";
var replaced = new Regex(pattern).ReplaceGroupValue(text, 1, "AA");
Extension class:
using System;
using System.Diagnostics.Contracts;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexExtensions
{
    [Pure]
    public static string ReplaceGroupValue(this Regex source, string input, string groupName, string destinationValue)
    {
        return ReplaceGroupValue(
            source,
            input,
            m => m.Groups[groupName],
            p => destinationValue);
    }

    [Pure]
    public static string ReplaceGroupValue(this Regex source, string input, int groupIdx, string destinationValue)
    {
        return ReplaceGroupValue(
            source,
            input,
            m => m.Groups[groupIdx],
            p => destinationValue);
    }

    [Pure]
    public static string ReplaceGroupValue(this Regex source, string input, string groupName, Func<string, string> destinationValueSelector)
    {
        return ReplaceGroupValue(
            source,
            input,
            m => m.Groups[groupName],
            destinationValueSelector);
    }

    [Pure]
    public static string ReplaceGroupValue(this Regex source, string input, int groupIdx, Func<string, string> destinationValueSelector)
    {
        return ReplaceGroupValue(
            source,
            input,
            m => m.Groups[groupIdx],
            destinationValueSelector);
    }

    [Pure]
    private static string ReplaceGroupValue(
        Regex source,
        string input,
        Func<Match, Group> groupSelector,
        Func<string, string> destinationValueSelector)
    {
        var matchResult = source.Matches(input);
        if (matchResult.Count <= 0)
        {
            return input;
        }

        var text = input;
        // Work from the last group to the first so earlier indexes stay valid.
        foreach (var group in matchResult.OfType<Match>().Select(groupSelector).OrderByDescending(p => p.Index))
        {
            var begin = group.Index > 0 ? text.Substring(0, group.Index) : string.Empty;
            var end = group.Index + group.Length < text.Length
                ? text.Substring(group.Index + group.Length)
                : string.Empty;
            var destinationValue = destinationValueSelector.Invoke(group.Value);
            text = $"{begin}{destinationValue}{end}";
        }

        return text;
    }
}
Here is a version similar to Daniel's, but one that replaces multiple matches:
public static string ReplaceGroup(string input, string pattern, RegexOptions options, string groupName, string replacement)
{
    // Regex.Replace visits each match exactly once, so a replacement that
    // itself matches the pattern cannot cause an endless loop.
    return Regex.Replace(input, pattern, match =>
    {
        var group = match.Groups[groupName];
        if (!group.Success)
            return match.Value;
        // Split the match into the text before and after the group
        var startIndex = group.Index - match.Index;
        var length = group.Length;
        var original = match.Value;
        var prior = original.Substring(0, startIndex);
        var trailing = original.Substring(startIndex + length);
        return prior + replacement + trailing;
    }, options);
}
The code below shows how to do a separate replacement for each group.
new_bib = Regex.Replace(new_bib, @"(?s)(\\bibitem\[[^\]]+\]\{" + pat4 + @"\})[\s\n\v]*([\\\{\}a-zA-Z\.\s\,\;\\\#\\\$\\\%\\\&\*\@\\\!\\\^+\-\\\=\\\~\\\:\\\" + dblqt + @"\\\;\\\`\\\']{20,70})", delegate(Match mts)
{
var fg = mts.Groups[0].Value.ToString();
var fs = mts.Groups[1].Value.ToString();
var fss = mts.Groups[2].Value.ToString();
fss = Regex.Replace(fss, @"[\\\{\}\\\#\\\$\\\%\\\&\*\@\\\!\\\^+\-\\\=\\\~\\\:\\\" + dblqt + @"\\\;\\\`\\\']+", "");
return "<augroup>" + fss + "</augroup>" + fs;
}, RegexOptions.IgnoreCase);
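Sorry, this needs another answer for 2024.
How about another, very high-performance option: use Span to prevent allocations and avoid Substring, StringBuilder, and other expensive stuff. Here is the code: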
var outputString = MyRegex.Replace(inputString, m =>
{
    var grp = m.Groups[1];
    return string.Concat(
        m.ValueSpan.Slice(0, grp.Index - m.Index), // prior part
        PUT_REPLACEMENT_HERE, // replacement
        m.ValueSpan.Slice(grp.Index + grp.Length - m.Index) // trailing part
    );
});
P.S. If you are using .NET 8, use [GeneratedRegex] to pre-build the regex at compile time.
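To make the [GeneratedRegex] suggestion concrete, here is a rough sketch that combines it with the Span-based evaluator. It assumes .NET 7 or later (where, as far as I know, both the attribute and Match.ValueSpan are available); GroupReplacer, DashedNumber, and ReplaceFirstGroup are hypothetical names chosen for illustration, and the pattern is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static partial class GroupReplacer
{
    // The source generator supplies the implementation at compile time.
    // The pattern is the example from the question; swap in your own.
    [GeneratedRegex(@"-(\d+)-")]
    private static partial Regex DashedNumber();

    // Hypothetical helper: replaces only group 1 of each match.
    public static string ReplaceFirstGroup(string input, string replacement)
    {
        return DashedNumber().Replace(input, m =>
        {
            var grp = m.Groups[1];
            var start = grp.Index - m.Index; // group offset inside the match
            return string.Concat(
                m.ValueSpan.Slice(0, start),          // prior part
                replacement.AsSpan(),                 // replacement
                m.ValueSpan.Slice(start + grp.Length) // trailing part
            );
        });
    }
}
```

With this sketch, GroupReplacer.ReplaceFirstGroup("example-123-example", "AA") should return example-AA-example. The only allocation per match is the string that string.Concat returns.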