I have a .txt file that contains multiple data frames. It looks similar to the following example:
$stim
rt 82289.8878539, 82294.8309221, 82299.3357436, 82304.1822179
category 1, 2, 1, 1
orient 263, 313, 266, 253
$stim
rt 82289.887000, 82294.8309333, 82299.3357444, 82304.1822179
category 1, 2, 2, 2
orient 263, 310, 360, 250
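Each data frame in this .txt file corresponds to a file name, and I have stored those names in a list. What I want to do is replace each $stim with its file name. I built a for loop to do this, as follows: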
library(stringr)
text <- readLines("filepath")
wrong_words <- ("$stim")
new_words <- (filenames)
for (i in seq_along(wrong_words)) {
text <- str_replace_all(text, wrong_words[i], new_words[i])
}
text
writeLines(text, con="filepath")
However, when I run this loop nothing changes, and I get exactly the same .txt file as before. What am I doing wrong?
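What you want here is grep, looping over the matches. stringi::stri_replace_all_regex and similar functions instead replace by dictionary, i.e. every "$stim" would be replaced with the same new word. Two things go wrong in your loop: the unescaped $ is a regex anchor for the end of the string, so the pattern "$stim" never matches, and because wrong_words has length 1 the loop runs only once and uses only new_words[1]. We can wrap the grep approach in a function that still offers the dictionary feature. To avoid feature creep, readLines/writeLines are left out.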
> batch_replace <- \(text, wrong_words, new_words) {
+   len0 <- length(wrong_words)
+   len1 <- length(new_words)
+   ## positions of the lines matching any of the patterns
+   len2 <- length(pos <- grep(paste(wrong_words, collapse='|'), text))
+   if (len0 != 1L && len0 != len1) {
+     stop(sprintf("Counts of wrong_words (%s) must be 1 or match with new_words (%s).",
+                  len0, len1))
+   }
+   if (len2 == 0L) {
+     message('No matches found.')
+     return(text)
+   }
+   else if (len0 == 1L) {
+     ## single pattern: matching lines receive the new words in order
+     return(replace(text, pos, new_words))
+   }
+   else if (len0 == len2) {
+     ## dictionary: the i-th matching line receives the i-th new word
+     text[pos] <- new_words
+     return(text)
+   } else {
+     stop(sprintf("Counts of found words (%s) and new_words (%s) must match.",
+                  len2, len1))
+   }
+ }
> text <- readLines('foo.txt')
> wrong_words <- c("\\$stim")
> new_words <- c("## WORD1", "## WORD2")
> batch_replace(text, wrong_words, new_words)
[1] "## WORD1"
[2] "rt 82289.8878539, 82294.8309221, 82299.3357436, 82304.1822179"
[3] "category 1, 2, 1, 1"
[4] "orient 263, 313, 266, 253"
[5] ""
[6] "## WORD2"
[7] "rt 82289.887000, 82294.8309333, 82299.3357444, 82304.1822179"
[8] "category 1, 2, 2, 2"
[9] "orient 263, 310, 360, 250"
[10] ""
You can also supply a dictionary.
> batch_replace(text, c("\\$stim", "\\$stim"), c("## WORD1", "## WORD2"))
> readLines('foo.txt') |>
+ batch_replace(c("\\$stim"), c("## WORD1", "## WORD2")) |>
+ writeLines('foo2.txt')
>
> readLines('foo2.txt') ## check
[1] "## WORD1"
[2] "rt 82289.8878539, 82294.8309221, 82299.3357436, 82304.1822179"
[3] "category 1, 2, 1, 1"
[4] "orient 263, 313, 266, 253"
[5] ""
[6] "## WORD2"
[7] "rt 82289.887000, 82294.8309333, 82299.3357444, 82304.1822179"
[8] "category 1, 2, 2, 2"
[9] "orient 263, 310, 360, 250"
[10] ""
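For comparison, here is a minimal base R sketch of the same idea without a wrapper function. It shows why the unescaped pattern finds nothing, and uses fixed = TRUE as an alternative to escaping the dollar sign. The inline text vector and the file names are made up for illustration and stand in for the result of readLines and the asker's list of names.

```r
# Inline stand-in for the lines of the .txt file (assumed layout).
text <- c("$stim", "rt 1.5, 2.5", "category 1, 2",
          "$stim", "rt 3.5, 4.5", "category 2, 2")

# Hypothetical file names, one per "$stim" header line.
filenames <- c("subject01.csv", "subject02.csv")

# As a regex, "$" anchors the end of the string, so "$stim" matches no line.
n_regex <- sum(grepl("$stim", text))

# With fixed = TRUE the pattern is taken literally, no escaping needed.
pos <- grep("$stim", text, fixed = TRUE)

# Guard against a count mismatch before assigning by position.
stopifnot(length(pos) == length(filenames))

# Each header line is replaced by the file name in the same position.
text[pos] <- filenames
print(text)
```

Like batch_replace, this replaces the whole matching line, which is fine here because each header line consists of $stim only.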