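How can I extract multiple strings from a multi-line text file using the following rules? The search strings are "String server", "pac" and "String method". Within each enclosing "{}", each of them appears at most once and may not appear at all. When a search string matches, extract the value inside the "" without the "()". The value of "String server" or "pac" appears only once in the output (it is not duplicated), and it is printed before the value of "String method". For example, given the sample text file "in":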
public AResponse retrieveA(ARequest req){
String server = "AAA";
String method = "retrieveA()";
log.info(method,
server,
req);
return res;
}
public BResponse retrieveB(BRequest req){
String method = "retrieveB()";
BBB pac = new BBB();
log.info(method,
pac,
req);
return res;
}
public CResponse retrieveC(CRequest req) {
String server = "CCC";
log.info(server,
req);
return res;
}
public DResponse retrieveD(DRequest req) {
String method = "retrieveD()";
log.info(method,req);
return res;
}
public EResponse retrieveE(ERequest req){
EEE pac = new EEE();
String method = "retrieveE()";
String server = "EEE";
log.info(method,
server,
pac,
req);
return res;
}
Expected output:
AAA retrieveA
BBB retrieveB
CCC
retrieveD
EEE retrieveE
I tried this with GNU Awk 5.0.1:
awk '{
if ($0 ~ /String method/ || ($0 ~ /String server/) )
{
str=$0;
sub("String", "", str);
sub(")", "", str);
sub("=", "", str);
gsub(/\(/, "", str);
gsub(/"/, "", str);
gsub(/;/, "", str);
if (str ~ /method/)
{
method = str;
gsub(/[[:blank:]]/, "", method);
gsub(/method/, "", method);
arr[i][1] = method
count++
} else if (str ~ /server/)
{
server = str;
gsub(/[[:blank:]]/, "", server);
gsub(/server/, "", server);
arr[i][0] = server
count++
}
}
if (count > 1 || $0 ~ /log./) {
count = 0
i++
}
}
END {
for (i in arr) {
printf "%s %s\n", arr[i][0], arr[i][1];
}
}' in
This awk solution should work for you:
awk -v OFS='\t' -F= '
/\{[[:blank:]]*$/ {++n}
NF==2 && /String | pac/ {
gsub(/^[[:blank:]]*("|new +)|[()";]+$/, "", $2)
if ($1 ~ / (server|pac)/)
col1[n] = $2
else if ($1 ~ / method/)
col2[n] = $2
}
END {
for (i=1; i<=n; ++i)
print col1[i], col2[i]
}' file
Output (the two columns are tab-separated because of OFS='\t'):
AAA retrieveA
BBB retrieveB
CCC
retrieveD
EEE retrieveE
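To see why a single gsub is enough to clean every value, here is a minimal sketch that runs the same regex on three right-hand sides taken from the sample file (the text after "=" on each line). The clean_rhs helper is a hypothetical name introduced only for this illustration; it is not part of the solution above.

```shell
# clean_rhs is a hypothetical helper for illustration only.
# It applies the regex from the solution to one right-hand side:
#   ^[[:blank:]]*("|new +)   strips leading blanks plus an opening quote or "new "
#   [()";]+$                 strips the trailing run of ( ) " ; characters
clean_rhs() {
    printf '%s\n' "$1" | awk '{
        gsub(/^[[:blank:]]*("|new +)|[()";]+$/, "")
        print
    }'
}

clean_rhs ' "AAA";'          # AAA
clean_rhs ' "retrieveA()";'  # retrieveA
clean_rhs ' new BBB();'      # BBB
```

Because the two alternatives are anchored at opposite ends of the string, one gsub call removes both the prefix and the suffix, which is why the solution needs no separate handling for quoted strings and "new Xxx()" expressions.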