I have the following data in a file:
Message-ID: <123.juii@jkk>
Date: Wed, 9 Mar 2002 16:12:51 -0800 (CST)
From: [email protected]
To: [email protected], [email protected], [email protected],
[email protected], [email protected]
Subject: Sales details
Please find attached the latest sales information
let me know what you can do.
Thanks,
jLian
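I want to extract only the body of the email, so I tried extracting the lines that do not contain a ":" character. I could not find any other way. But this results in: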
[email protected], [email protected]
Please find attached the latest sales information
let me know what you can do.
Thanks,
jLian
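Only the lines from the second one onward are message content; the first line is the wrapped continuation of the To: header.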
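library("stringr")
rawData <- file("mail1", "r")
while (TRUE) {
  line <- readLines(rawData, n = 1)  # read one line at a time
  if (length(line) == 0) {           # end of file reached
    break
  }
  if (!str_detect(line, ":")) {      # print only lines without a colon
    print(line)
  }
}
close(rawData)                       # release the file connection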
See if this works:
Data:
mail<-
'Message-ID: <123.juii@jkk>
Date: Wed, 9 Mar 2002 16:12:51 -0800 (CST)
From: [email protected]
To: [email protected], [email protected], [email protected],
[email protected], [email protected]
Subject: Sales details

Please find attached the latest sales information
let me know what you can do.
Thanks,
jLian'
Code:
cat(
sub(".*Subject:.*?\n\n","",mail)
)
Result:
#Please find attached the latest sales information
#let me know what you can do.
#Thanks,
#jLian
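If the mail is read line by line from a file, as in the question, the same idea can be sketched without a regular expression: in a standard mail message the headers end at the first empty line, so everything after it is the body. This is only a minimal sketch under that assumption. `extract_body` is a hypothetical helper name, and the addresses and the temporary file are placeholders for illustration.

```r
# Hypothetical helper: return the body lines of one mail, assuming the
# headers are separated from the body by the first empty line.
extract_body <- function(lines) {
  first_blank <- which(!nzchar(lines))[1]
  if (is.na(first_blank)) {
    return(character(0))  # no separator found, so no body can be identified
  }
  lines[-seq_len(first_blank)]  # drop the headers and the separator line
}

# Demo on a temporary file; the addresses are placeholders.
path <- tempfile()
writeLines(c(
  "Message-ID: <123.juii@jkk>",
  "From: sender@example.com",
  "To: a@example.com, b@example.com,",
  "    c@example.com",
  "Subject: Sales details",
  "",
  "Please find attached the latest sales information",
  "let me know what you can do.",
  "Thanks,",
  "jLian"
), path)

body <- extract_body(readLines(path))
print(body)
```

Unlike filtering on ":", this keeps body lines that happen to contain a colon and drops wrapped header lines such as the continuation of To:.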
To use my solution efficiently, store each mail as a multi-line string element of a list.
listOfMails <- list(mail, mail, mail)  # as many as you have
fun1 <- function(m) sub(".*Subject:.*?\n\n", "", m)
onlyContent <- lapply(listOfMails, fun1)
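One way to build such a list from files on disk is to read each file and join its lines with "\n". The following is a minimal sketch, assuming each mail sits in its own file. The `read_mail` helper and the two temporary files are hypothetical stand-ins for your own data.

```r
# Hypothetical helper: read one file into a single multi-line string.
read_mail <- function(path) paste(readLines(path), collapse = "\n")

# Two temporary files stand in for real mail files.
paths <- c(tempfile(), tempfile())
writeLines(c("Subject: First", "", "Body of the first mail"), paths[1])
writeLines(c("Subject: Second", "", "Body of the second mail", "Bye"), paths[2])

listOfMails <- lapply(paths, read_mail)

# Same substitution as fun1: drop everything up to the blank line after Subject:
onlyContent <- lapply(listOfMails, function(m) sub(".*Subject:.*?\n\n", "", m))
print(onlyContent)
```

Each element of `onlyContent` is again a single string, so `cat()` or `strsplit(x, "\n")` can be used to view or split the body.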