New column using paste0 in R

Question

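I'm looking for a function that lets me add a new column holding the ID that corresponds to each string. That is:

I have a list of words with their IDs: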

car = 9112
red = 9512
employee = 6117
sky = 2324

words<- c("car", "sky", "red", "employee", "domestic")
match<- c("car", "red", "domestic", "employee", "sky")

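The comparison is made against data read from an Excel file: if a value there equals one of the words in my vector, the word is replaced with its ID; otherwise the original word is kept.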

library(stringr) # provides str_replace_all

x10 <- c(words) # string

words.corpus <-  c(L4$`match`) #  pattern
idwords.corpus <- c(L4$`ID`) # replace
words.corpus <- paste0("\\A",idwords.corpus, "\\z|\\A", words.corpus,"\\z")

vect.corpus <- idwords.corpus
names(vect.corpus) <- words.corpus

data15 <- str_replace_all(x10, vect.corpus)

Result:

data15:

" 9112", "2324", "9512", "6117", "employee"

What I'm looking for is to add a new column with the ID, rather than replacing the word with the ID:

words     ID
car       9112
red       9512
employee  6117
sky       2324
domestic  domestic

Answer 1

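I would use data.table for a fast lookup based on the fixed word values. It isn't 100% clear what you are asking for, but it sounds like you want to replace the word with its ID when there is a match, and leave the word as it is when there isn't. This code does that: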

library("data.table")

# associate your ids with fixed word matches in a keyed lookup table
ids <- data.table(
  word = c("car", "red", "employee", "sky"),
  ID = c(9112, 9512, 6117, 2324)
)
setkey(ids, word)

# this is what you would read in
data <- data.table(
  word = c("car", "sky", "red", "employee", "domestic", "sky")
)
setkey(data, word)

data <- ids[data]
# replace NAs from no match with word
data[, ID := ifelse(is.na(ID), word, ID)]

data
##        word       ID
## 1:      car     9112
## 2: domestic domestic
## 3: employee     6117
## 4:      red     9512
## 5:      sky     2324
## 6:      sky     2324

Here "domestic" has no match, so it stays as the word in the ID column. I also repeated "sky" to show that this works for every instance of a word.

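If you want to preserve the original order, you can create an index variable before the merge and then reorder the output by that index variable.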
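A minimal sketch of that index-variable idea, assuming the same ids and data tables as above. The column name idx and the object name result are hypothetical, chosen only for illustration. This variant joins with on = instead of setkey, so the input is not re-sorted in the first place; the explicit reorder is kept to show the general pattern.

```r
library(data.table)

# lookup table of words and their IDs (same values as above)
ids <- data.table(
  word = c("car", "red", "employee", "sky"),
  ID = c(9112, 9512, 6117, 2324)
)

# the words as they would be read in, in their original order
data <- data.table(
  word = c("car", "sky", "red", "employee", "domestic", "sky")
)

# record each row's original position before joining (idx is a hypothetical name)
data[, idx := .I]

# join on word without setting a key on either table
result <- ids[data, on = "word"]

# where no ID matched, fall back to the word itself; ID becomes character
result[, ID := ifelse(is.na(ID), word, as.character(ID))]

# restore the original order, then drop the helper column
setorder(result, idx)
result[, idx := NULL]

result
```

The rows of result should come back in the input order (car, sky, red, employee, domestic, sky), with "domestic" carried over into the ID column because it has no match.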
