How to remove chunks of text with different lengths from different texts in a character vector?

Question

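I have a character vector containing 231 documents (231 rows by one column). Each document begins with a large block of text that I want to remove from all 231 documents. The problem is that the length of this block differs from document to document.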

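Let's take an example in which every text starts with a block like this: "Text that I wish to remove".

I tried the following options, without success: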

x <- c("Text that I wish to remove because I don't like it. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.", 
  "Text that I wish to remove. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.", 
  "Text that I wish to remove and I will remove it because some great data analyst will help me solve it. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.", 
  "Text that I wish to remove and who know whether I manage to make it work, it could be and it could not be. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.")

If the blocks to remove were all the same length, I would simply do the following, as someone suggested in a previous post:

strings <- substring(x, 60)

However, since the length differs from text to text, I am now stuck.

Ideally, I would like to obtain:

[1] "I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out."
[2] "I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out."
[3] "I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out."
[4] "I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out."

Can anyone help me?

Thanks a lot!

Answer 1

You can use the following code:

  gsub("^.+\\. ", "", x)

[1] "I hope that stackoverflow will sort it out."
[2] "I hope that stackoverflow will sort it out."
[3] "I hope that stackoverflow will sort it out."
[4] "I hope that stackoverflow will sort it out."
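
Note that `.+` is greedy, so the pattern above matches up to the last ". " and leaves only the final sentence, which is shorter than the output requested in the question. A minimal sketch of a non-greedy variant follows. It assumes the block to remove always ends at the first period followed by a space, which holds for the example data but may not hold for the real documents; the name `trimmed` is only illustrative.

```r
# Example data copied from the question.
x <- c("Text that I wish to remove because I don't like it. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.",
  "Text that I wish to remove. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.",
  "Text that I wish to remove and I will remove it because some great data analyst will help me solve it. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.",
  "Text that I wish to remove and who know whether I manage to make it work, it could be and it could not be. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.")

# ".*?" is lazy: it stops at the FIRST ". " instead of the last one.
# sub() replaces only the first match in each element.
trimmed <- sub("^.*?\\. ", "", x, perl = TRUE)
print(trimmed)
# All four elements become:
# "I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out."
```

If the leading block can itself contain several sentences, a more specific marker for where it ends would be needed in place of `\\. `.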

Answer 2

Split on ". ", then take the last sentence:

sapply(strsplit(x, ". ", fixed = TRUE), tail, n = 1)
# [1] "I hope that stackoverflow will sort it out."
# [2] "I hope that stackoverflow will sort it out."
# [3] "I hope that stackoverflow will sort it out."
# [4] "I hope that stackoverflow will sort it out."
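
The same splitting idea can be adapted to keep everything after the first sentence rather than only the last one. This is a sketch under the assumption that the unwanted block is exactly the first sentence (up to the first ". "); `parts` and `kept` are illustrative names, not part of the original answer.

```r
# Example data copied from the question.
x <- c("Text that I wish to remove because I don't like it. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.",
  "Text that I wish to remove. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.",
  "Text that I wish to remove and I will remove it because some great data analyst will help me solve it. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.",
  "Text that I wish to remove and who know whether I manage to make it work, it could be and it could not be. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.")

# Split each document into sentences on the literal string ". ".
parts <- strsplit(x, ". ", fixed = TRUE)

# Drop the first sentence and glue the remaining ones back together.
# An element with no ". " at all would come back as an empty string.
kept <- vapply(parts, function(p) paste(p[-1], collapse = ". "), character(1))
print(kept)
# All four elements become:
# "I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out."
```

Using `vapply` with `character(1)` instead of `sapply` guarantees a character vector of the same length as `x`.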
