How to create a dataset from a string based on a pattern in R

Question  Votes: 0  Answers: 3

I have a character vector like this:

ST <- c("7.5 (wk0)  / 7.4 (wk4)  / 10.3 (wk8)  / 12.7 (wk12)",
        "140.0 (wk0)  / 161.3 (wk4)  / 142.5 (wk8)")

I want to produce the following dataset:

STvalue  WEEK
7.5        0
7.4        4
10.3       8
12.7       12
140.0      0
161.3      4
142.5      8
r string dplyr dataset
3 Answers
2
votes

All in base R:

strsplit(ST, "/") |>
  unlist() |>
  trimws() |>
  gsub(pattern = "[()wk]", replacement = "") |>
  strsplit(" ") |>
  do.call(what = rbind) |>
  as.data.frame() |>
  setNames(c("STvalue", "week")) |>
  type.convert(as.is = TRUE)  # as.is = TRUE avoids the "'as.is' should be specified" warning
#   STvalue week
# 1     7.5    0
# 2     7.4    4
# 3    10.3    8
# 4    12.7   12
# 5   140.0    0
# 6   161.3    4
# 7   142.5    8
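
To make it easier to see what each stage of the pipeline contributes, here is a rough step-by-step sketch of the same idea with named intermediate objects. The names pieces, cleaned, mat and res are only illustrative and do not appear in the answer above.

```r
# Same input as in the question
ST <- c("7.5 (wk0)  / 7.4 (wk4)  / 10.3 (wk8)  / 12.7 (wk12)",
        "140.0 (wk0)  / 161.3 (wk4)  / 142.5 (wk8)")

# Step 1: split each string on "/", flatten to one vector, trim whitespace
pieces <- trimws(unlist(strsplit(ST, "/")))

# Step 2: delete the characters "(", ")", "w" and "k",
# so that "7.5 (wk0)" becomes "7.5 0"
cleaned <- gsub("[()wk]", "", pieces)

# Step 3: split on the single remaining space and stack the pairs into a matrix
mat <- do.call(rbind, strsplit(cleaned, " "))

# Step 4: name the columns and convert the character columns to numbers
res <- type.convert(setNames(as.data.frame(mat), c("STvalue", "week")),
                    as.is = TRUE)
str(res)
```

Because the character class in step 2 removes every w and k, this approach assumes the values themselves never contain those letters.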

Or with the tidyverse:

library(tidyverse)
ST |> 
  str_split("/", simplify = FALSE) |>
  unlist() |>
  trimws() |>
  as_tibble() |>
  separate(value, into = c("STvalue", "week"), sep = " ") |>
  mutate(across(everything(), parse_number))
# # A tibble: 7 × 2
#   STvalue  week
#     <dbl> <dbl>
# 1     7.5     0
# 2     7.4     4
# 3    10.3     8
# 4    12.7    12
# 5   140       0
# 6   161.      4
# 7   142.      8

1
votes

With tidyr, you can shorten this a little using separate_longer_delim, separate_wider_delim and extract_numeric:

library(tidyr)
library(dplyr)

separate_longer_delim(data.frame(col = ST), col, delim = "/") %>%
  separate_wider_delim(col, delim = "(", names = c("STvalue", "Week")) %>%
  mutate(across(everything(), extract_numeric))
# or, since extract_numeric() is deprecated in tidyr:
# mutate(across(everything(), readr::parse_number))

Output:

  STvalue  Week
    <dbl> <dbl>
1     7.5     0
2     7.4     4
3    10.3     8
4    12.7    12
5   140       0
6   161.      4
7   142.      8

0
votes

Replace each of w, k, ( and ) with a space and each / with a newline, then read the result in with read.table. No packages are used.

ST |>
  chartr(old = "wk()/", new = "    \n", x = _) |>
  read.table(text = _, col.names = c("STvalue", "WEEK"))

giving

  STvalue WEEK
1     7.5    0
2     7.4    4
3    10.3    8
4    12.7   12
5   140.0    0
6   161.3    4
7   142.5    8
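
As a small illustration of why the translation step is enough, the sketch below applies chartr to a single shortened string (x is a made-up example, not the full ST vector) and inspects the intermediate text. It avoids the `_` placeholder, so it should also run on R versions older than 4.2.

```r
# A shortened, hypothetical input with just two value/week pairs
x <- "7.5 (wk0)  / 7.4 (wk4)"

# chartr() maps characters one-to-one: w, k, ( and ) each become a space,
# and / becomes a newline, so the string length does not change
translated <- chartr(old = "wk()/", new = "    \n", x)
cat(translated, "\n")

# read.table() treats runs of whitespace as one separator and
# each newline as a row break
df <- read.table(text = translated, col.names = c("STvalue", "WEEK"))
print(df)
```

Since chartr replaces characters rather than patterns, the extra spaces it leaves behind are harmless: read.table's default separator collapses any amount of whitespace.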