Split numbers from letters in mixed strings and put them into columns using regex in R

Question  Votes: 0  Answers: 2

I have a series of basketball player stat lines, as in the example below:

stats <- c("40pt 2rb 1as 2st 2to 4trey 11-20fg 14-14ft",
           "7pt 5rb 1as 2st 1bl 3to 3-5fg 1-4ft",
           "0pt 1rb 1as 0-2fg")

Ideally, I would like to convert these strings into a table format:

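Here is the key for each column:

  • pt = points
  • rb = rebounds
  • as = assists
  • st = steals
  • bl = blocks
  • to = turnovers
  • trey = 3-pointers
  • fg = field goals (made-attempted)
  • ft = free throws (made-attempted)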
r regex dplyr
2 Answers
1
vote

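We split each string at the boundary between digits and letters (and on whitespace) to create a list ('lst'), loop through the list, and convert each element to a data.frame whose column names come from the alternating split values. We then rbind the list elements with rbindlist, split the elements containing '-' into multiple columns with cSplit, and convert the NA values to 0.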

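library(data.table)
library(splitstackshape)
# Split between a digit and the letter that follows it, and on whitespace
lst <- strsplit(stats, "(?<=[0-9])(?=[a-z])|\\s+", perl = TRUE)
# Odd positions hold the values, even positions hold the stat names
lst1 <- lapply(lst, function(x) 
         as.data.frame.list(setNames(x[c(TRUE, FALSE)], x[c(FALSE, TRUE)])))
# Stack the rows, then split 'fg' and 'ft' on '-' into two columns each
res <- cSplit(rbindlist(lst1, fill = TRUE), c('fg', 'ft'), '-')
# Coerce every column to numeric and replace NA with 0
for(nm in seq_along(res)){
    set(res, i = NULL, j = nm, value = as.numeric(as.character(res[[nm]])))
    set(res, i = which(is.na(res[[nm]])), j = nm, value = 0)
}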

res
#    pt rb as st to trey bl fg_1 fg_2 ft_1 ft_2
#1: 40  2  1  2  2    4  0   11   20   14   14
#2:  7  5  1  2  3    0  1    3    5    1    4
#3:  0  1  1  0  0    0  0    0    2    0    0
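
To make the intermediate step in this answer easier to follow, here is a minimal base R sketch of what the split and the alternating-index naming produce for a single stat line. The variable names (`x`, `tokens`, `values`, `keys`, `named`) are illustrative and not part of the original answer.

```r
# One stat line taken from the question
x <- "0pt 1rb 1as 0-2fg"

# Same pattern as in the answer: split where a digit is followed by a
# letter (zero-width), or on runs of whitespace
tokens <- strsplit(x, "(?<=[0-9])(?=[a-z])|\\s+", perl = TRUE)[[1]]
print(tokens)
# "0" "pt" "1" "rb" "1" "as" "0-2" "fg"

# Odd positions are values, even positions are stat names
values <- tokens[c(TRUE, FALSE)]
keys   <- tokens[c(FALSE, TRUE)]
named  <- setNames(values, keys)
print(named)
```

Note that "0-2" stays in one piece at this stage, because the hyphen is followed by a digit rather than a letter; that is why the answer needs the later cSplit step on 'fg' and 'ft'.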

0
votes

Using dcast from the reshape2 package:

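library(reshape2)
# Rewrite e.g. "11-20fg" as "11fg_m 20fg_a" (presumably made / attempted)
m=gsub("(\\d+)-(\\d+)(\\w+)","\\1\\3_m \\2\\3_a",stats)
# Put each stat on its own line and separate the value from the name
n=gsub("(\\d+)(\\S*)","\\1 \\2",gsub("\\s","\n",m))
# Read as a two-column table and tag each row with its input string's index
o=cbind(read.table(text=n),group=rep(1:length(n),lengths(strsplit(n,"\n"))))
dcast(o,group~V2,value.var="V1")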
  group as bl fg_a fg_m ft_a ft_m pt rb st to trey
1     1  1 NA   20   11   14   14 40  2  2  2    4
2     2  1  1    5    3    4    1  7  5  2  3   NA
3     3  1 NA    2    0   NA   NA  0  1 NA NA   NA
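
As a rough illustration of what the two gsub calls above do, the sketch below applies them to one stat line and reads the result into a long table. The column names `value` and `stat` are assumptions chosen for readability; the original answer keeps the default `V1` and `V2`.

```r
# One stat line taken from the question
x <- "0pt 1rb 1as 0-2fg"

# Step 1: expand "0-2fg" into two separate tokens, "0fg_m" and "2fg_a"
m <- gsub("(\\d+)-(\\d+)(\\w+)", "\\1\\3_m \\2\\3_a", x)
print(m)

# Step 2: one token per line, with a space between the number and the name
n <- gsub("(\\d+)(\\S*)", "\\1 \\2", gsub("\\s", "\n", m))
cat(n, "\n")

# Step 3: read the text as a two-column (value, stat) table
long <- read.table(text = n, col.names = c("value", "stat"))
print(long)
```

The long table has one row per stat, which is the shape dcast expects before spreading the stat names into columns.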

Using base R:

> m=gsub("(\\d+)-(\\d+)(\\w+)","\\1\\3_m \\2\\3_a",stats)
> n=gsub("(\\d+)(\\S*)","\\1 \\2",gsub("\\s","\n",m))
> o=lapply(n,function(x)rev(read.table(text=x)))
> p=Reduce(function(x,y)merge(x,y,by="V2",all=T),o)
> read.table(text=do.call(paste,data.frame(t(p))),h=T)
  as fg_a fg_m ft_a ft_m pt rb st to trey bl
1  1   20   11   14   14 40  2  2  2    4 NA
2  1    5    3    4    1  7  5  2  3   NA  1
3  1    2    0   NA   NA  0  1 NA NA   NA NA
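
The result above leaves NA wherever a stat is absent from a line. If zeros are preferred, as in the first answer, one possible way is to overwrite the NA cells afterwards. The sketch below uses a small hand-built fragment of the output (only the `pt` and `bl` columns) rather than the full table.

```r
# Fragment of the result: 'bl' is NA where a line has no block stat
res <- data.frame(pt = c(40, 7, 0), bl = c(NA, 1, NA))

# Replace every NA cell in the data frame with 0
res[is.na(res)] <- 0
print(res)
```

Whether 0 is the right fill value depends on the data: here a missing stat plausibly means none were recorded, but that is an assumption about the source.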