I have a data frame like the one below:
# Load necessary libraries
library(dplyr)
library(tidyr)  # separate_longer_delim() comes from tidyr (>= 1.3.0)
# Create the data frame
test <- data.frame(
  address = c("123 Elm St", "456 Oak St", "789 Pine St"),
  job = c("Data Scientist", "Software ~A~ Engineer", "Project Manager"),
  location = c("New York", "San Francisco ~A~ Bay Area ~A~ Near Golden Gate ~A~", "Los Angeles"),
  stringsAsFactors = FALSE)
# Attempt that fails: within a row, the columns split into different numbers of pieces
test %>%
  separate_longer_delim(c(address, job, location), delim = "~A~")
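I want to split the rows on "~A~", but this fails because "~A~" occurs a different number of times in each column. So I would like to even out the number of "~A~" occurrences; the last value could simply be repeated at the end (or something similar). Does anyone have an idea?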
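Perhaps do it in multiple steps and filter as needed, for example where there are blank locations (caused by the ~A~ at the end of one of the strings).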
library(tidyverse)
# Create the data frame
test <- data.frame(
  address = c("123 Elm St", "456 Oak St", "789 Pine St"),
  job = c("Data Scientist", "Software ~A~ Engineer", "Project Manager"),
  location = c("New York", "San Francisco ~A~ Bay Area ~A~ Near Golden Gate ~A~", "Los Angeles"),
  stringsAsFactors = FALSE)
test |>
  separate_longer_delim(job, delim = "~A~") |>
  separate_longer_delim(location, delim = "~A~") |>
  mutate(across(everything(), str_squish))
#>        address             job         location
#> 1   123 Elm St  Data Scientist         New York
#> 2   456 Oak St        Software    San Francisco
#> 3   456 Oak St        Software         Bay Area
#> 4   456 Oak St        Software Near Golden Gate
#> 5   456 Oak St        Software
#> 6   456 Oak St        Engineer    San Francisco
#> 7   456 Oak St        Engineer         Bay Area
#> 8   456 Oak St        Engineer Near Golden Gate
#> 9   456 Oak St        Engineer
#> 10 789 Pine St Project Manager      Los Angeles
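The "filter as needed" step could look something like the sketch below. It assumes the blank `location` values (rows 5 and 9 in the output) are unwanted, which may not hold for every use case; the name `result` is only for illustration.

```r
library(dplyr)
library(tidyr)
library(stringr)

test <- data.frame(
  address = c("123 Elm St", "456 Oak St", "789 Pine St"),
  job = c("Data Scientist", "Software ~A~ Engineer", "Project Manager"),
  location = c("New York", "San Francisco ~A~ Bay Area ~A~ Near Golden Gate ~A~", "Los Angeles"),
  stringsAsFactors = FALSE)

result <- test |>
  separate_longer_delim(job, delim = "~A~") |>
  separate_longer_delim(location, delim = "~A~") |>
  mutate(across(everything(), str_squish)) |>
  # Assumption: empty locations come only from the trailing "~A~" and can be dropped
  filter(location != "")

nrow(result)
```

With the sample data this leaves 8 rows: every job/location combination for each address, without the empty locations.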
Created on 2024-04-20 with reprex v2.1.0
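If the goal is instead what the question describes (keep the pieces aligned and repeat the last value where a column has fewer pieces), one possible sketch is to split into list-columns, pad them to a common length, and unnest. This is an alternative to the cross-product approach above, not part of it; `pad_last` and `padded` are hypothetical names, and the code assumes only `job` and `location` contain the delimiter.

```r
library(dplyr)
library(tidyr)
library(stringr)

test <- data.frame(
  address = c("123 Elm St", "456 Oak St", "789 Pine St"),
  job = c("Data Scientist", "Software ~A~ Engineer", "Project Manager"),
  location = c("New York", "San Francisco ~A~ Bay Area ~A~ Near Golden Gate ~A~", "Los Angeles"),
  stringsAsFactors = FALSE)

# Hypothetical helper: extend x to length n by repeating its last element
pad_last <- function(x, n) c(x, rep(x[length(x)], n - length(x)))

padded <- test |>
  mutate(
    # Drop a trailing delimiter, then split each string into a character vector
    across(c(job, location),
           \(col) str_split(str_remove(col, "\\s*~A~\\s*$"), "\\s*~A~\\s*")),
    # Number of pieces each row needs
    n = pmax(lengths(job), lengths(location)),
    job = Map(pad_last, job, n),
    location = Map(pad_last, location, n)
  ) |>
  select(-n) |>
  # Both list-columns now have equal lengths per row, so they unnest in parallel
  unnest(c(job, location))

padded
```

For the sample data this gives 5 rows: "456 Oak St" expands to three rows, with "Engineer" repeated against "Bay Area" and "Near Golden Gate".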