I have a data frame in which one column contains a string of numbers. Each row holds a list of numbers of varying length. For example:
Month Count
Jan "[1.2445, 23.888883, 16.11208347]"
Feb "[2.6473, 400.6256]"
March "[6723.1838282, 187.1212, 90.111, 75.1342899]"
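The goal is to convert the lists to numeric format and sum each row. The result should look something like this (decimals rounded for convenience):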
Month Count
Jan 41.245
Feb 403.2729
March 7075.550
I have code that works for a single row, but I can't generalize it to run row by row over the whole data frame.
sum(as.numeric(strsplit(substr(Data$Count, 2, nchar(Data$Count) - 1), ',')[[1]]))
You were almost there. In base R, you can use lapply together with strsplit and gsub (instead of substr):
unlist(lapply(strsplit(gsub("\\[|\\]", "", df$Count), ","), \(x) sum(as.numeric(x))))
# [1] 41.24547 403.27290 7075.55032
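The one-liner in the question only handles one row because `[[1]]` keeps just the first element of the list that strsplit returns. To get the data frame shown in the question rather than a bare vector, one option is to assign the per-row sums back to the column. The following is a minimal sketch, not the only way to do it: vapply is swapped in for unlist(lapply(...)) so the result is guaranteed to be one number per row, `function(x)` replaces `\(x)` so it also runs on R versions before 4.1, and the name `parts` is purely illustrative.

```r
# Same sample data as in the question
df <- read.table(text = 'Month Count
Jan "[1.2445, 23.888883, 16.11208347]"
Feb "[2.6473, 400.6256]"
March "[6723.1838282, 187.1212, 90.111, 75.1342899]"',
                 header = TRUE, stringsAsFactors = FALSE)

# Strip the brackets and split every row on commas (returns a list,
# one character vector per row); `parts` is an illustrative name
parts <- strsplit(gsub("\\[|\\]", "", df$Count), ",")

# Sum each list element; as.numeric() tolerates the leading spaces
df$Count <- vapply(parts, function(x) sum(as.numeric(x)), numeric(1))
df
```

After this, `df$Count` is a numeric column, so the usual rounding or formatting functions can be applied to it directly.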
Data:
df <- read.table(text = 'Month Count
Jan "[1.2445, 23.888883, 16.11208347]"
Feb "[2.6473, 400.6256]"
March "[6723.1838282, 187.1212, 90.111, 75.1342899]"', header = TRUE)
# Same sample data as in the question, read into `Data`
Data <- read.table(text = 'Month Count
Jan "[1.2445, 23.888883, 16.11208347]"
Feb "[2.6473, 400.6256]"
March "[6723.1838282, 187.1212, 90.111, 75.1342899]"',
                   header = TRUE, stringsAsFactors = FALSE)
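library(dplyr, warn.conflicts = FALSE)
Data %>%
  # Drop the brackets and split each string into a character vector
  mutate(Count = strsplit(gsub("\\[|\\]", "", Count), ", ")) %>%
  # Expand the list column to one row per number
  tidyr::unnest_longer(Count) %>%
  # Sum per month; the .by argument needs dplyr >= 1.1.0
  summarise(Count = sum(as.numeric(Count)), .by = Month) %>%
  as.data.frame()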
#> Month Count
#> 1 Jan 41.24547
#> 2 Feb 403.27290
#> 3 March 7075.55032
Created on 2024-04-29 with reprex v2.0.2