tidyr::separate() throws an unexpected error


I know I'm not the only one who has run into trouble with separate(), but an hour of searching StackOverflow has not turned up an answer for the error I'm getting.

Data:

structure(list(geoid = c("41001", "41001", "41001", "41001", 
"41001", "41001", "41001", "41001", "41001", "41001", "41001", 
"41001", "41001", "41001", "41001", "41001", "41001", "41001", 
"41001", "41001", "41001", "41001", "41001", "41001", "41001", 
"41001", "41001", "41001", "41001", "41001", "41001", "41001", 
"41001", "41001", "41001", "41001", "41061", "41061", "41061", 
"41061", "41061", "41061", "41061", "41061", "41061", "41061", 
"41061", "41061", "41061", "41061", "41061", "41061", "41061", 
"41061", "41061", "41061", "41061", "41061", "41061", "41061", 
"41061", "41061", "41061", "41061", "41061", "41061", "41061", 
"41061", "41061", "41061", "41061", "41061"), name = c("Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Baker County, Oregon", 
"Baker County, Oregon", "Baker County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon", "Union County, Oregon", 
"Union County, Oregon", "Union County, Oregon"), value = c(426, 
496, 480, 412, 397, 453, 466, 504, 396, 424, 452, 570, 641, 651, 
564, 417, 272, 204, 396, 450, 457, 340, 327, 366, 438, 458, 362, 
390, 447, 620, 690, 667, 538, 435, 259, 259, 801, 840, 875, 962, 
1060, 821, 778, 836, 743, 642, 638, 731, 880, 871, 718, 457, 
339, 303, 701, 830, 808, 920, 1052, 810, 731, 814, 676, 660, 
636, 818, 1006, 865, 712, 558, 373, 570), agegroup = c("0 to 4", 
"5 to 9", "10 to 14", "15 to 19", "20 to 24", "25 to 29", "30 to 34", 
"35 to 39", "40 to 44", "45 to 49", "50 to 54", "55 to 59", "60 to 64", 
"65 to 69", "70 to 74", "75 to 79", "80 to 84", "85 years and", 
"0 to 4", "5 to 9", "10 to 14", "15 to 19", "20 to 24", "25 to 29", 
"30 to 34", "35 to 39", "40 to 44", "45 to 49", "50 to 54", "55 to 59", 
"60 to 64", "65 to 69", "70 to 74", "75 to 79", "80 to 84", "85 years and", 
"0 to 4", "5 to 9", "10 to 14", "15 to 19", "20 to 24", "25 to 29", 
"30 to 34", "35 to 39", "40 to 44", "45 to 49", "50 to 54", "55 to 59", 
"60 to 64", "65 to 69", "70 to 74", "75 to 79", "80 to 84", "85 years and", 
"0 to 4", "5 to 9", "10 to 14", "15 to 19", "20 to 24", "25 to 29", 
"30 to 34", "35 to 39", "40 to 44", "45 to 49", "50 to 54", "55 to 59", 
"60 to 64", "65 to 69", "70 to 74", "75 to 79", "80 to 84", "85 years and"
), sex = c("Male", "Male", "Male", "Male", "Male", "Male", "Male", 
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male", 
"Male", "Male", "Male", "Female", "Female", "Female", "Female", 
"Female", "Female", "Female", "Female", "Female", "Female", "Female", 
"Female", "Female", "Female", "Female", "Female", "Female", "Female", 
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male", 
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male", 
"Male", "Male", "Female", "Female", "Female", "Female", "Female", 
"Female", "Female", "Female", "Female", "Female", "Female", "Female", 
"Female", "Female", "Female", "Female", "Female", "Female")), row.names = c(NA, 
-72L), class = c("tbl_df", "tbl", "data.frame"))

Code that works (it produces the data above):

get_estimates(
  geography = "county",
  product = "characteristics",
  breakdown = c("AGEGROUP", "SEX"),
  breakdown_labels = TRUE,
  state = "OR",
  county = str_to_title(county_list)
) |>
  clean_names() |>
  filter(sex %in% c("Male", "Female")) |>
  filter(str_detect(agegroup, "^Age")) |>
  mutate(agegroup = str_replace(agegroup, "^\\w+\\s+(.*)\\s+\\w+", '\\1'))

The next step throws the error:

|>
  mutate(
    agegroup = case_when(agegroup == "85 years and" ~ "85+ years",
                         TRUE ~ agegroup),
    name = separate(col = name, 
                    into = c("name"), 
                    sep = "\\s")
  )

Error:

Error in `mutate()`:
! Problem while computing `name = separate(col = name, into = c("name"), sep = "\\s")`.
Caused by error in `UseMethod()`:
! no applicable method for 'separate' applied to an object of class "character"
Backtrace:
 1. dplyr::mutate(...)
 6. tidyr::separate(col = name, into = c("name"), sep = "\\s")

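Is there something obvious I'm not seeing? I'm using separate() on a character column! I'm only trying to capture the first word of that column and discard the rest. Any and all insight would be very welcome.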

Edit

This solved my immediate problem:

 |>
  mutate(
    agegroup = case_when(agegroup == "85 years and" ~ "85+ years",
                         TRUE ~ agegroup), 
    name = word(name, 1, -3)
  ) 

But I would still very much like to know why separate() doesn't work.

Tags: r, dplyr, tidyr

1 answer

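The problem is that you are calling separate() inside mutate(). separate() is a data-frame verb: it expects the data frame as its first argument, so it belongs in the pipeline as its own step. Inside mutate() it is handed only the character vector name, so method dispatch sees an object of class "character", which is exactly what the error message reports.

separate() does let you reuse the existing column name in into. I used three names because splitting on a space yields three pieces, and naming all of them avoids any warning. If you supply only one name, e.g. into = c("a"), you still get the output you want, but with a warning that additional pieces were discarded. You can of course also drop the columns you don't need afterwards with select(-c(b, c)).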
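To see the dispatch failure in isolation, here is a minimal sketch on a two-row toy tibble. The names toy, res and split3 are made up for illustration and are not part of the question's data.

```r
library(tidyr)

# Toy data standing in for the question's tibble (hypothetical)
toy <- tibble::tibble(name = c("Baker County, Oregon", "Union County, Oregon"))

# Passing the column itself: separate() has no method for character vectors,
# so this fails; the error message is captured instead of stopping the script
res <- tryCatch(
  separate(toy$name, into = "name", sep = " "),
  error = function(e) conditionMessage(e)
)
print(res)

# Passing the data frame: dispatch finds the data.frame method
split3 <- separate(toy, name, into = c("a", "b", "c"), sep = " ")
print(split3)
```

The first call mirrors what happens inside mutate(), where only the column vector is available; the second is the form used in the full pipeline.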


df |> 
  mutate(agegroup = case_when(agegroup == "85 years and" ~ "85+ years",
                              TRUE ~ agegroup)) |> 
  separate(name, into=c("a", "b", "c"), sep=" ")
#> # A tibble: 72 × 7
#>    geoid a     b       c      value agegroup sex  
#>    <chr> <chr> <chr>   <chr>  <dbl> <chr>    <chr>
#>  1 41001 Baker County, Oregon   426 0 to 4   Male 
#>  2 41001 Baker County, Oregon   496 5 to 9   Male 
#>  3 41001 Baker County, Oregon   480 10 to 14 Male 
#>  4 41001 Baker County, Oregon   412 15 to 19 Male 
#>  5 41001 Baker County, Oregon   397 20 to 24 Male 
#>  6 41001 Baker County, Oregon   453 25 to 29 Male 
#>  7 41001 Baker County, Oregon   466 30 to 34 Male 
#>  8 41001 Baker County, Oregon   504 35 to 39 Male 
#>  9 41001 Baker County, Oregon   396 40 to 44 Male 
#> 10 41001 Baker County, Oregon   424 45 to 49 Male 
#> # … with 62 more rows
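
If only the first word is wanted, two variations may be worth considering. This is a sketch on a hypothetical two-row tibble named toy, not the full data from the question: separate() with extra = "drop" discards the extra pieces without a warning, and stringr::word() is a vectorised function that works inside mutate().

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical stand-in for the question's data
toy <- tibble::tibble(
  geoid = c("41001", "41061"),
  name  = c("Baker County, Oregon", "Union County, Oregon")
)

# Option 1: separate() as its own pipeline step; extra = "drop" silently
# discards everything after the first piece
first_word_sep <- toy |>
  separate(name, into = "name", sep = " ", extra = "drop")

# Option 2: stay inside mutate() with a vectorised string function
first_word_mut <- toy |>
  mutate(name = word(name, 1))

print(first_word_sep)
print(first_word_mut)
```

Both options keep only the first word, so a multi-word county such as "Hood River County, Oregon" would be cut to "Hood". If such names can occur, removing the suffix by pattern, for example str_remove(name, " County.*$"), may be the safer choice.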