Comparing values in character vectors in R


I have two data frames:

categories

structure(list(id = c("1", "2", "3", "4", "5"), `topographic-name` = c("Cholenholz", 
"Lisen", "Lochboden", "Löchli", "Lochweid"), category_string = c("c(\"flurname\", \",\", \"wald\")", 
"c(\"bauernhof,\", \"ort\")", "c(\"bauernhof,\", \"flurname\")", 
"c(\"bauernhof,\", \"flurname\")", "c(\"alp,\", \"flurname\")"
), category = c("9", "4", "9", "9", "9"), `year-from` = c(NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_)), row.names = c(NA, 
5L), class = "data.frame")

names_df

structure(list(id = c("1", "2", "3", "4", "5"), `topographic-name` = c("Cholenholz", 
"Lisen", "Lochboden", "Löchli", "Lochweid"), category_string = c("c(\"flurname\", \",\", \"wald\")", 
"c(\"bauernhof,\", \"ort\")", "c(\"bauernhof,\", \"flurname\")", 
"c(\"bauernhof,\", \"flurname\")", "c(\"alp,\", \"flurname\")"
), category = c("9", "4", "9", "9", "9"), `year-from` = c(NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_)), row.names = c(NA, 
5L), class = "data.frame")

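Now I want to compare categories$category_values with names_df$category_string. For each row that matches, the lowest matching categories$category (possible values are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12) should be written to names_df$category_code, and the corresponding radius should be written to a new field such as names_df$radius.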

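The comparison should also be case-insensitive and should accept partial matches (like %ILIKE% in SQL). For example, if categories$category_values is "gewässer", it should also match when names_df$category is "Fliessgewässer".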
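To make the matching requirement concrete, here is a minimal sketch of a case-insensitive substring test in base R. The `pattern` and `candidates` values are hypothetical and only illustrate the idea; `fixed = TRUE` is an assumption that the category values should be treated as literal text rather than regular expressions.

```r
# Hypothetical example values, not taken from the question's data
pattern <- "gewässer"
candidates <- c("Fliessgewässer", "Stehendes Gewässer", "Wald")

# Lower-case both sides, then search for the pattern as a literal substring
is_match <- grepl(tolower(pattern), tolower(candidates), fixed = TRUE)
print(is_match)
```

An alternative would be `grepl(pattern, candidates, ignore.case = TRUE)`, which treats the pattern as a regular expression, so any special characters in the category values would need escaping.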

This is my code so far, but it does not do the job correctly:

# list that collects the matched category codes for each row of names_df
matched_codes <- vector("list", nrow(names_df))

# loop through each row in names_df
for (i in seq_along(names_df$category_string)) {
  # case-insensitive, literal substring match of every category value against this row
  hits <- sapply(categories$category_values, function(x) grepl(tolower(x), tolower(names_df$category_string[[i]]), fixed = TRUE))
  match_rows <- categories[hits, ]
  
  # extract the category codes from the matched rows and add them to the list
  matched_codes[[i]] <- match_rows$category_code
  
  # concatenate the matched category codes into a string and write to names_df$category_code
  names_df$category_code[i] <- paste0(sort(unlist(matched_codes[[i]])), collapse = ", ")
}
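One possible way to get the lowest matching category and its radius, instead of a concatenated list of codes, is sketched below. The `categories` table here is hypothetical: the columns `category_values`, `category` and `radius` are named in the question, but their contents are not shown, so the values are made up for illustration. The helper `best_match` is also an invented name, and the small `names_df` is a stand-in with only the column needed for matching.

```r
# Hypothetical lookup table; values and radii are assumptions for illustration
categories <- data.frame(
  category_values = c("wald", "ort", "flurname", "gewässer"),
  category        = c(3L, 4L, 9L, 2L),
  radius          = c(500, 1000, 200, 300),
  stringsAsFactors = FALSE
)

# Stand-in for names_df with only the column used for matching
names_df <- data.frame(
  category_string = c('c("flurname", ",", "wald")',
                      'c("bauernhof,", "ort")',
                      'c("alp,", "Fliessgewässer")',
                      'c("unbekannt")'),
  stringsAsFactors = FALSE
)

# Return the row index in `categories` of the matching entry with the
# lowest category, or NA when no category value occurs in the string.
best_match <- function(s) {
  hit <- vapply(tolower(categories$category_values),
                function(p) grepl(p, tolower(s), fixed = TRUE),
                logical(1), USE.NAMES = FALSE)
  if (!any(hit)) return(NA_integer_)
  idx <- which(hit)
  idx[which.min(categories$category[idx])]
}

best <- vapply(names_df$category_string, best_match, integer(1),
               USE.NAMES = FALSE)

# Indexing with NA yields NA, so unmatched rows stay empty
names_df$category_code <- categories$category[best]
names_df$radius <- categories$radius[best]
print(names_df)
```

Plain substring matching is permissive: a short value such as "ort" would also match inside a longer, unrelated word, so it may be worth checking whether that behaviour is acceptable for the real data.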

Thanks for your help.

r string dataframe data-manipulation