How do I parse a string using a wildcard?

Question

How can I parse/extract information from a string in R using a wildcard?

head(df)
set           type
a             [OutofArea]:[type:"928"]:[idnum:"27"]
a             [WithinRange]:[type:"029":[...
a             [OutofArea]:[type:"928"]:[...
a             [OutofArea]:[type:"274"]:[...
a             [OutofArea]:[type:"210"]:[...
a             [OutofArea]:[type:"199"]"[...

I only need the numbers, so just 928, 029, and so on. In this case the number is the wildcard: anything after type:" and before the next ".

Answer 1

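We can use str_extract to extract the digits that follow the type:" string. The (?<=type:") part is a lookbehind, so the prefix must be present but is not included in the match.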

library(stringr)
library(dplyr)
df %>%
   mutate(new = str_extract(type, '(?<=type:")\\d+'))
#  set                                   type new
#1   a  [OutofArea]:[type:"928"]:[idnum:"27"] 928
#2   a [WithinRange]:[type:"029":[idnum:"27"] 029

Data

df <- structure(list(set = c("a", "a"), type = c("[OutofArea]:[type:\"928\"]:[idnum:\"27\"]", 
"[WithinRange]:[type:\"029\":[idnum:\"27\"]")), class = "data.frame", row.names = c(NA, 
-2L))
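
For readers who prefer not to load stringr, the same lookbehind can be sketched in base R. This is only an illustration, not part of the original answer: the vector name type and the NA handling for non-matching rows are assumptions made for the example.

```r
# Hypothetical sample vector mirroring the two rows of the data above
type <- c('[OutofArea]:[type:"928"]:[idnum:"27"]',
          '[WithinRange]:[type:"029":[idnum:"27"]')

# perl = TRUE is needed because lookbehind is a PCRE feature
m <- regexpr('(?<=type:")\\d+', type, perl = TRUE)

# regmatches() silently drops elements without a match, so fill an
# NA vector first to keep the result aligned with the input rows
new <- rep(NA_character_, length(type))
new[m > 0] <- regmatches(type, m)
print(new)
```

Pre-filling with NA keeps the result the same length as the input, which matters if it is assigned back as a data frame column.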

Answer 2

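Assuming the reproducible data shown in the Note at the end, we can use read.table with sep set to the double quote and then select the second field. This returns the numbers as numeric (so 029 becomes 29), but if you want them as character, add colClasses = "character" to the read.table arguments. No packages or regular expressions are used.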

read.table(text = df$type, sep = '"', quote = '', fill = TRUE)[[2]]
## [1] 928  29 928 274 210 199
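
As a small sketch of the colClasses variant mentioned above, the following keeps the leading zero by reading every field as character. The two-element vector type is a hypothetical stand-in for df$type, used so the snippet runs on its own.

```r
# Hypothetical stand-in for df$type (two complete rows)
type <- c('[OutofArea]:[type:"928"]:[idnum:"27"]',
          '[WithinRange]:[type:"029":[idnum:"27"]')

# Split each line on the double quote; quote = '' stops read.table
# from treating " as a quoting character, and colClasses keeps
# every field as a string so "029" is not converted to 29
fields <- read.table(text = type, sep = '"', quote = '', fill = TRUE,
                     colClasses = "character")

# The second field is the text between the first pair of quotes
codes <- fields[[2]]
print(codes)
```

This approach relies on the wanted value always sitting between the first and second double quote of each row.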

Note

Lines <- 'set           type
a             [OutofArea]:[type:"928"]:[idnum:"27"]
a             [WithinRange]:[type:"029":[...
a             [OutofArea]:[type:"928"]:[...
a             [OutofArea]:[type:"274"]:[...
a             [OutofArea]:[type:"210"]:[...
a             [OutofArea]:[type:"199"]"[...'
df <- read.table(text = Lines, header = TRUE, as.is = TRUE)

Answer 3

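We can use sub with a capture group: the pattern matches the whole string, and the replacement \\1 keeps only the digits captured between type:" and the closing ".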

sub('.*type:"(\\d+)".*', '\\1', df$type)
#[1] "928" "029" "928" "274" "210" "199"
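
One thing to keep in mind with this approach is that sub returns the input unchanged when the pattern does not match. A minimal sketch of guarding against that follows; the second element of type is a hypothetical row without a type field, added only to show the behavior.

```r
# First element is from the question; the second is a hypothetical
# row with no type:"..." field at all
type <- c('[OutofArea]:[type:"928"]:[idnum:"27"]',
          '[OutofArea]:[idnum:"27"]')

pattern <- '.*type:"(\\d+)".*'

# Only substitute where the pattern matches; otherwise return NA
# instead of the untouched original string
codes <- ifelse(grepl(pattern, type),
                sub(pattern, '\\1', type),
                NA_character_)
print(codes)
```

Without the grepl check, the second element would come back as the full original string rather than a missing value.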

Data

df <- structure(list(set = c("a", "a", "a", "a", "a", "a"), 
     type = c("[OutofArea]:[type:\"928\"]:[idnum:\"27\"]", 
"[WithinRange]:[type:\"029\":", "[OutofArea]:[type:\"928\"]:", 
"[OutofArea]:[type:\"274\"]:", "[OutofArea]:[type:\"210\"]:", 
"[OutofArea]:[type:\"199\"]\"")), class = "data.frame", row.names = c(NA,-6L))
