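I have the following example dataset. I want to filter the dataset `df` based on the values in the `o3.cpt` column, keeping only the rows whose `o3.cpt` value contains any element of the `joint` vector. Is there a way to do this using the `filter` and `str_detect` functions together?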
library(tidyverse)
library(tibble)
rename <- dplyr::rename
select <- dplyr::select
joint <- c(27130, 27132, 27134, 27137, 27138, 27445, 27447, 27486, 27487)
df <-
  data.frame(
    oid = seq(1, 7),
    o3.cpt = c("27130I", "33333", "27134RI", "11111", "27138", "44444", "66666")
  )
# I would like to filter the dataset `df` based on the value of `o3.cpt`.
# I want to keep the rows whose `o3.cpt` value contains any element of the `joint` vector.
# Done correctly, this keeps only rows 1, 3, and 5.
# The string after each five-digit number can vary (not just 'I' or 'RI').
You can use `gsub` to strip all non-digit characters and then filter with `%in%`:
> subset(df, gsub("\\D", "", o3.cpt) %in% joint)
  oid  o3.cpt
1   1  27130I
3   3 27134RI
5   5   27138
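Since the question asks specifically about `filter` and `str_detect`, here is a minimal sketch of that route. It collapses `joint` into a single regular-expression alternation and anchors it at the start of the string. The anchoring is an assumption based on the question's remark that the varying text comes after the five-digit number. The names `pattern` and `result` are only illustrative.

```r
library(dplyr)
library(stringr)

joint <- c(27130, 27132, 27134, 27137, 27138, 27445, 27447, 27486, 27487)
df <- data.frame(
  oid = seq(1, 7),
  o3.cpt = c("27130I", "33333", "27134RI", "11111", "27138", "44444", "66666")
)

# Build one pattern of the form "^(27130|27132|...)".
# Assumption: the code always sits at the start of o3.cpt.
pattern <- paste0("^(", paste(joint, collapse = "|"), ")")

# Keep the rows whose o3.cpt starts with any code in `joint`.
result <- df %>%
  filter(str_detect(o3.cpt, pattern))
```

On the sample data this keeps the rows with `oid` 1, 3, and 5, the same rows as the `gsub` approach. The two are not equivalent in general: the `gsub` version requires the digits in the whole string to equal a code exactly, whereas this pattern only requires the string to begin with one, so a value with further digits in its suffix would pass here but not there.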