R: regex to capture all instances after a given character

Question

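Given the string ab cd ; ef gh ij, how can I remove all spaces after the first space that follows the ;, so that the result is ab cd ; efghij? I tried using \K but couldn't get it to work completely.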

test <- 'ab cd  ; ef  gh ij'
gsub('(?<=; )[^ ]+\\K +', '', test, perl = TRUE)
# "ab cd  ; efgh ij"
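
The attempt above removes only the first run of spaces because the lookbehind (?<=; ) pins the match to the position directly after the semicolon, so there is only one place where it can succeed. One possible way to keep the \K idea, sketched here as an assumption and not taken from the posted answers, is to add a \G alternative so that each new match may also resume where the previous one ended. The name `pattern` is illustrative only.

```r
test <- 'ab cd  ; ef  gh ij'

# Alternative 1 ("; ") starts the chain right after the semicolon.
# Alternative 2 (\G guarded by (?!^)) continues from the end of the
# previous match, but never from the very start of the string.
pattern <- '(?:; |\\G(?!^))[^ ]*\\K +'

result <- gsub(pattern, '', test, perl = TRUE)
print(result)
## [1] "ab cd  ; efghij"
```

Like the original attempt, this sketch assumes a single space directly after the semicolon and that only plain spaces need to be removed.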

Answer 1

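1) gsubfn. Using gsubfn from the gsubfn package, here is a one-liner that needs only a simple regular expression. It passes the capture group to the given function (written in formula notation) and replaces the match with the function's output.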

library(gsubfn)

gsubfn("; (.*)", ~ paste(";", gsub(" ", "", x)), test)
## [1] "ab cd  ; efghij"
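
If installing gsubfn is not an option, the same "match once, then transform the match" idea can be approximated in base R with regexpr and regmatches<-. This is only a sketch of an equivalent, not part of the original answer, and the variable names `m` and `result` are illustrative.

```r
test <- 'ab cd  ; ef  gh ij'

# Locate everything that follows "; " (the lookbehind keeps "; " out of the match).
m <- regexpr("(?<=; ).*", test, perl = TRUE)

# Replace the matched substring with a copy that has its spaces removed.
result <- test
regmatches(result, m) <- gsub(" ", "", regmatches(result, m), fixed = TRUE)
print(result)
## [1] "ab cd  ; efghij"
```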

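2) gsub. This uses a pattern that matches a space which is not immediately preceded by a semicolon and is not followed, anywhere in the rest of the string, by a semicolon and a space.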

gsub("(?<!;) (?!.*; )", "", test, perl = TRUE)
## [1] "ab cd  ; efghij"
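
To see which spaces the two lookarounds in (2) select, it can help to list the match positions with gregexpr. The following is a small illustrative check and not part of the original answer; `hits` and `all_spaces` are made-up names.

```r
test <- 'ab cd  ; ef  gh ij'
pattern <- "(?<!;) (?!.*; )"

# Positions of the spaces that the pattern matches (these get removed).
hits <- as.integer(gregexpr(pattern, test, perl = TRUE)[[1]])
print(hits)
## [1] 12 13 16

# Positions of all spaces, for comparison; the remainder are kept.
all_spaces <- as.integer(gregexpr(" ", test, fixed = TRUE)[[1]])
kept <- setdiff(all_spaces, hits)
print(kept)
## [1] 3 6 7 9
```

Positions 3, 6 and 7 are kept because a "; " still follows them, and position 9 is kept because it comes directly after the semicolon.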

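3) regexpr / substring. This finds the position of the semicolon, uses substring to split the string into two parts at that position, strips the spaces from the second part with gsub, and finally pastes the parts back together.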

ix <- regexpr(";", test)
paste(substring(test, 1, ix), gsub(" ", "", substring(test, ix + 2)))
## [1] "ab cd  ; efghij"
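
The steps in (3) also work on a character vector, but they misbehave for elements that contain no semicolon, since regexpr returns -1 there. A minimal sketch of a guarded wrapper follows. The function name `strip_after_semicolon` is hypothetical, and the code assumes, as (3) does, that exactly one space follows the semicolon.

```r
strip_after_semicolon <- function(x) {
  # Position of the first semicolon in each element, or -1 if there is none.
  ix <- as.integer(regexpr(";", x, fixed = TRUE))
  has_sep <- ix > 0

  out <- x
  # Only rewrite elements that contain a semicolon.
  out[has_sep] <- paste(
    substring(x[has_sep], 1, ix[has_sep]),
    gsub(" ", "", substring(x[has_sep], ix[has_sep] + 2), fixed = TRUE)
  )
  out
}

strip_after_semicolon(c("ab cd  ; ef  gh ij", "no separator here"))
## [1] "ab cd  ; efghij"   "no separator here"
```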

4) read.table. This is similar to (3) but uses read.table to split the input into two fields.

with(read.table(text = test, sep = ";", as.is = TRUE), paste0(V1, "; ", gsub(" ", "", V2)))
## [1] "ab cd  ; efghij"

Answer 2

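I'm sure there is a regex solution (and I hope someone posts one), but here is a non-regex solution that relies on the semicolon being used consistently as the separator. You can adapt it if there are multiple delimiters. Hope it helps!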

> # Split the string on the semi-colon (assumes semi-colon is consistent)
> split <- strsplit(c("ab cd  ; ef  gh ij", "abcd e f ; gh ij k"), ";")
> 
> # Extract elements separately
> pre_semicolon <- sapply(split, `[`, 1)
> post_semicolon <- sapply(split, `[`, 2)
> 
> # Remove all spaces from everything after the semi-colon
> post_semicolon <- gsub("[[:space:]]", "", post_semicolon)
> 
> # Paste them back together with a semi-colon and a space
> paste(pre_semicolon, post_semicolon, sep = "; ")
[1] "ab cd  ; efghij"  "abcd e f ; ghijk"
