dplyr::filter: "No tidyselect variables were registered"

Question. Votes: 3, Answers: 2

I am trying to filter specific rows of my tibble with the dplyr::filter() function.

Here is part of my tibble, from head(raw.tb):

A tibble: 738 x 4
      geno   ind     X     Y
     <chr> <chr> <int> <int>
 1 san1w16    A1   467   383
 2 san1w16    A1   465   378
 3 san1w16    A1   464   378
 4 san1w16    A1   464   377
 5 san1w16    A1   464   376
 6 san1w16    A1   464   375
 7 san1w16    A1   463   375
 8 san1w16    A1   463   374
 9 san1w16    A1   463   373
10 san1w16    A1   463   372
# ... with 728 more rows

When I run: raw.tb %>% dplyr::filter(ind == contains("A"))

I get: Error in filter_impl(.data, quo) : Evaluation error: No tidyselect variables were registered

In my tibble, unique(raw.tb$ind) is:

    [1] "A1"  "A10" "A11" "A12" "A2"  "A3"  "A4"  "A5"  "A6"  "A7"  "A8"  "A9"  "B1" 
[14] "B10" "B11" "B12" "B2"  "B3"  "B4"  "B5"  "B6"  "B7"  "B8"  "B9"  "C1"  "C10"
[27] "C11" "C12" "C2"  "C3"  "C4"  "C5"  "C6"  "C7"  "C8"  "C9"  "D1"  "D10" "D11"
[40] "D12" "D2"  "D3"  "D4"  "D5"  "D6"  "D7"  "D8"  "D9"  "E1"  "E10" "E11" "E12"
[53] "E2"  "E3"  "E4"  "E5"  "E6"  "E7"  "E8"  "E9"  "F1"  "F10" "F11" "F12" "F2" 
[66] "F3"  "F4"  "F5"  "F6"  "F7"  "F8"  "F9"  "G1"  "G10" "G11" "G2"  "G3"  "G4" 
[79] "G5"  "G6"  "G7"  "G8"  "G9"  "H1"  "H10" "H11"

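I would like to use tidyverse idioms to extract only the rows where raw.tb$ind starts with "A".

(I know how to do this in base R, but my goal is to use the tidyverse.)

Any feedback is much appreciated.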

r regex dplyr tidyverse tidyselect
2 Answers
6
votes

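filter expects a logical vector with which to filter rows. contains is a select helper (?select_helpers): it selects columns of a dataset by matching a pattern against their names, so it cannot be used to test the values in a column. To filter rows, we can use grepl from base R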

raw.tb %>%
   dplyr::filter(grepl("A", ind)) 

or str_detect from stringr, one of the tidyverse packages

raw.tb %>%
  dplyr::filter(stringr::str_detect(ind, "A"))
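
To make the difference between the two kinds of helper concrete, here is a minimal sketch. The tibble toy is a hypothetical stand-in for raw.tb with only a few rows, not the asker's real data. It shows contains() used inside select(), where it matches column names, next to the logical vector that filter() actually needs.

```r
library(dplyr)

# Hypothetical stand-in for raw.tb (assumption: a few rows are enough here)
toy <- tibble(
  geno = "san1w16",
  ind  = c("A1", "A10", "B1", "C1"),
  X    = c(467L, 465L, 464L, 464L),
  Y    = c(383L, 378L, 378L, 377L)
)

# contains() matches column *names*, so it belongs inside select()
cols_with_n <- toy %>% dplyr::select(contains("n"))

# filter() needs one TRUE/FALSE per row; grepl() returns exactly that
keep <- grepl("A", toy$ind)

# Same test written inside filter()
rows_with_a <- toy %>% dplyr::filter(grepl("A", ind))
```

Here cols_with_n keeps the geno and ind columns, because those are the names containing "n", while rows_with_a keeps the rows whose ind value contains "A".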

0
votes

Just writing out akrun's comment; @akrun, feel free to take over this answer if you like.

Create some data,

# structure as printed by dput(raw.tb)
raw.tb <- structure(list(geno = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = "san1w16", class = "factor"), ind = structure(c(1L, 
1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 1L), .Label = c("A1", "B1", "C1", 
"D1", "E1"), class = "factor"), X = c(467L, 465L, 464L, 464L, 
464L, 464L, 463L, 463L, 463L, 463L), Y = c(383L, 378L, 378L, 
377L, 376L, 375L, 375L, 374L, 373L, 372L)), .Names = c("geno", 
"ind", "X", "Y"), row.names = c("1", "2", "3", "4", "5", "6", 
"7", "8", "9", "10"), class = c("tbl_df", "tbl", "data.frame"
))

The data,

raw.tb
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san1w16     C1   464   376
#>  6 san1w16     C1   464   375
#>  7 san1w16     D1   463   375
#>  8 san1w16     D1   463   374
#>  9 san1w16     E1   463   373
#> 10 san1w16     A1   463   372

Method #1

raw.tb %>% dplyr::filter(stringr::str_detect(ind, "A"))
#> # A tibble: 3 x 4
#>      geno    ind     X     Y
#>    <fctr> <fctr> <int> <int>
#> 1 san1w16     A1   467   383
#> 2 san1w16     A1   465   378
#> 3 san1w16     A1   463   372

Method #2

raw.tb %>% dplyr::filter(grepl("A", ind))
#> # A tibble: 3 x 4
#>      geno    ind     X     Y
#>    <fctr> <fctr> <int> <int>
#> 1 san1w16     A1   467   383
#> 2 san1w16     A1   465   378
#> 3 san1w16     A1   463   372
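
Both methods match "A" anywhere in the value, whereas the question asks for values that start with "A". On the data shown the two give the same rows, but a sketch with an anchored pattern may be safer. The tibble toy and the value "BA2" are hypothetical, added only to show the difference. Note that ind is a factor here, as in the dput() output: grepl() coerces factors to character, while startsWith() needs an explicit as.character().

```r
library(dplyr)

# Hypothetical data; "BA2" is an invented value that contains "A"
# without starting with it
toy <- tibble(
  ind = factor(c("A1", "A1", "B1", "BA2", "A1")),
  X   = c(467L, 465L, 464L, 464L, 463L)
)

# Unanchored pattern: also keeps "BA2"
contains_a <- toy %>% dplyr::filter(grepl("A", ind))

# Anchored pattern: "^" ties the match to the start of the string
starts_regex <- toy %>% dplyr::filter(grepl("^A", ind))

# Base R alternative without a regex; startsWith() rejects factors,
# so convert to character first
starts_base <- toy %>% dplyr::filter(startsWith(as.character(ind), "A"))
```

With this toy data, contains_a has four rows and the two anchored variants have three each.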