How to get the closest word to a reference word in purrr

I have a list as follows:

words <- list(c("\n", "\n", "oesophagus graded  and fine\n", 
"\n", "\n", "\n", "stomach and  antrum  altough with some rfa response rfa\n", 
"\n", "mucosa washed a lot\n", "\n", "treated with halo rfa ultra \n", 
"\n", "total of 100 times\n", "\n", "duodenum looks ok"))

For each term from one list that appears in this text, I want to extract the closest term from a second list.

My expected output:

antrum:rfa

My first list (the events):

EventList<-c("rfa", "apc", "dilat", "emr", "clip", "grasp", "probe", "iodine", 
"acetic", "nac", "peg", "botox")

My second list (the locations to find) is:

tofind<-"ascending|descending|sigmoid|rectum|transverse|caecum|splenic|ileum|rectosigmoid|ileocaecal|hepatic|colon|terminal|terminal ileum|ileoanal|prepouch|pouch|stomach|antrum|duodenum|oesophagus|goj|ogj|cardia|anastomosis"

The code I am using is:

EventList %>%
        map(
          ~words %>%
            str_which(paste0('^.*', .x)) %>%
            map_chr(
              ~words[1:.x] %>%
                str_c(collapse = ' ') %>%

                str_extract_all(regex(tofind, ignore_case = TRUE)) %>%
                map_if(is_empty, ~ NA_character_) %>%
                flatten_chr()%>%
                `[[`(1) %>%

                .[length(.)]
            ) %>%
            paste0(':', .x)
        ) %>%
        unlist() %>%
        str_subset('.+:')

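This gives me the event (in this case rfa), but instead of assigning it to antrum, it assigns it to oesophagus.

So it returns the first term from tofind that it finds in the text, not the term closest to the event.

I suspect the lines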

`[[`(1) %>%

 .[length(.)]

are the culprit, but I don't know how to change them so that I get the closest term rather than the first one.
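The suspicion can be checked in isolation. The snippet below is a minimal sketch in base R (so it needs no packages); the shortened pattern `tofind_small` and the hand-collapsed `text` are illustrative assumptions, not part of the original code. It shows that taking element 1 of the matches yields the first location in the text, while taking the last element yields the one nearest the end of the text, which is where the event sits.

```r
# Text up to and including the first line that mentions "rfa"
# (collapsed by hand here, for illustration only).
text <- "oesophagus graded and fine stomach and antrum altough with some rfa response rfa"

# Shortened version of `tofind`, for illustration only.
tofind_small <- "stomach|antrum|duodenum|oesophagus"

# All location matches, in order of appearance in the text.
hits <- regmatches(text, gregexpr(tofind_small, text, ignore.case = TRUE))[[1]]

first_hit <- hits[[1]]           # what `[[`(1) selects
last_hit  <- hits[length(hits)]  # the match nearest the end of the text

print(first_hit)  # "oesophagus"
print(last_hit)   # "antrum"
```

Once `[[`(1) has reduced the matches to a single string, the following `.[length(.)]` has nothing left to choose from, so it returns that same first match.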

Tag: r

1 Answer

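The following gives you, for each line that matches an element of EventList, the last tofind match in the text up to and including that line: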

library(purrr)
library(stringr)
library(dplyr)  # provides last()

map(EventList, 
    function(event) {
      indices <- map(words, str_which, pattern = event)
      map(indices, function(i) 
        map2_chr(words, i, ~ .x[seq_len(.y)] %>% 
               str_c(collapse = ' ') %>% 
               str_extract_all(regex(tofind, ignore_case = TRUE), simplify = TRUE) %>% 
               last()) %>%
          map_if(is_empty, ~ NA_character_)
        ) %>% 
        unlist() %>% 
        paste0(':', event)
    })  %>%
  unlist() %>%
  str_subset('.+:')

# [1] "antrum:rfa"     "oesophagus:rfa"
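For comparison, the same idea can be sketched in base R without purrr or stringr. This is only an illustration under stated assumptions: the helper name `nearest_preceding` is hypothetical, the text is held in a plain character vector called `lines`, the pattern is a shortened stand-in for `tofind`, and duplicate pairs are collapsed with `unique()`, which the code above does not do. "Closest" is taken to mean the last location mentioned in the text up to and including the line that contains the event.

```r
# Hypothetical helper: for every line mentioning `event`, return the last
# `pattern` match found in the text up to and including that line.
nearest_preceding <- function(lines, event, pattern) {
  hit_lines <- grep(event, lines, fixed = TRUE)  # indices of lines with the event
  vapply(hit_lines, function(i) {
    text <- paste(lines[seq_len(i)], collapse = " ")
    hits <- regmatches(text, gregexpr(pattern, text, ignore.case = TRUE))[[1]]
    if (length(hits) == 0) NA_character_ else hits[length(hits)]
  }, character(1))
}

lines <- c("\n", "\n", "oesophagus graded  and fine\n",
           "\n", "\n", "\n", "stomach and  antrum  altough with some rfa response rfa\n",
           "\n", "mucosa washed a lot\n", "\n", "treated with halo rfa ultra \n",
           "\n", "total of 100 times\n", "\n", "duodenum looks ok")
events  <- c("rfa", "apc")                        # subset of EventList
pattern <- "stomach|antrum|duodenum|oesophagus"   # shortened stand-in for tofind

# Build "location:event" pairs, dropping events that never occur
# and collapsing repeated pairs.
pairs <- unlist(lapply(events, function(ev) {
  loc <- nearest_preceding(lines, ev, pattern)
  loc <- unique(loc[!is.na(loc)])
  if (length(loc) == 0) character(0) else paste0(loc, ":", ev)
}))

print(pairs)  # "antrum:rfa"
```

Note that `grep(event, ...)` matches substrings, so a short event such as "rfa" would also match inside a longer word; adding word boundaries to the pattern would be one way to tighten this.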