Interpreting newline (\n) characters when reading from a file with fread in R

Question (votes: 1, answers: 1)

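I can't get fread from the data.table package to handle newlines (\n) as expected. They come out as a literal "\n" rather than a line break (head shows "\\n" instead of "\n"). Based on the post below, I understood that fread should be able to handle this: fread and a quoted multi-line column value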

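I tried quoting ("string") the value column, with the same result. Am I missing a simple solution or parameter? Should the characters be escaped somehow? Here is an example that illustrates the problem along with my implementation: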

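[Edit:] Some clarification, so you don't have to read the code. The contents of strings.txt are shown in the code comments under # strings.txt. The file is a tab-delimited text file with four columns, three rows, and a header row. The first entry in the file, strMsg1, is the same as strAsIntended. However, fread adds a backslash to \n when reading from the file, which turns the newline into a literal \n. How can I avoid this? I just need to be able to encode new lines in my strings. Hopefully this is understandable.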

[Edit2:] The result is shown in the image.

library(data.table)
library(gWidgets2)

# strings.txt
# scope order   key value
# test_gui  1   strMsg1 Text with new line characters:\n1) The first point and the\n2) second point should be on separate lines\n\nThen perhaps some text below, separated by an empty line.
# test_gui  2   strMsg2 Some text does not contain new line characters.
# test_gui  3   strMsg3 Expand window to see text and button widgets

strAsIntended <- "Text with new line characters:\n1) The first point and the\n2) second point should be on separate lines\n\nThen perhaps some text below, separated by an empty line."
filePath <- "C:\\path\\to\\strings.txt"

# Read file.
dt <- fread(file = filePath, sep = "\t", encoding = "UTF-8")
head(dt) # \n has become \\n

# Set key column.
setkey(dt, key = "key")

# Get strings for the specific function.
dt <- dt[dt$scope == "test_gui", ]

# Get strings.
strText <- dt["strMsg1"]$value
strButton <- dt["strMsg2"]$value
strWinTitle <- dt["strMsg3"]$value

# Construct gui.
w <- gwindow(title = strWinTitle)
g <- ggroup(horizontal = FALSE, container = w, expand = TRUE, fill = "both")
gtext(text = strText, container = g)
gtext(text = strAsIntended, container = g)
gbutton(text = strButton, container = g)

Running on:

R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

r fread
1 Answer
0 votes

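You are misunderstanding what fread does. Your input file contains a backslash followed by n, and that is exactly what the string returned by fread contains. However, when you print a string that contains a backslash, the backslash is shown doubled. (If you don't want that, use cat() to print.) Your strAsIntended variable does not contain a backslash; it contains a newline character, which is displayed as \n when printed.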
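To make the difference concrete, here is a small base R sketch. The two strings are made up for illustration: from_file stands in for what fread returns, and intended stands in for strAsIntended.

```r
# A string as it would arrive from the file: a backslash followed by the letter n.
from_file <- "first\\nsecond"
# A string containing a real newline character.
intended <- "first\nsecond"

nchar(from_file)  # 13: the backslash and the n are two separate characters
nchar(intended)   # 12: the newline is a single character

print(from_file)  # displays "first\\nsecond" (print escapes the backslash)
print(intended)   # displays "first\nsecond" (print escapes the newline)

cat(from_file, "\n")  # displays first\nsecond on a single line
cat(intended, "\n")   # displays first and second on separate lines
```

The doubled backslash is only how print() displays the string; nchar() and cat() show what is actually stored.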

If you want the \n sequences in the input file converted to newline characters, use gsub or another substitution function. For example:

dt[, value := gsub("\\n", "\n", value, fixed = TRUE)]
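For anyone who wants to try the substitution end to end, here is a minimal sketch. It assumes a small two-column, tab-delimited file written to a temporary path; the file contents and the column names key and value are invented for illustration, and the := update is just one way to apply gsub to a data.table column.

```r
library(data.table)

# Write a tiny tab-delimited file whose value field contains a literal
# backslash followed by n (hypothetical content, not the original strings.txt).
tmp <- tempfile(fileext = ".txt")
writeLines(c("key\tvalue",
             "strMsg1\tLine one\\nLine two",
             "strMsg2\tNo newline here"), tmp)

dt <- fread(file = tmp, sep = "\t")
before <- nchar(dt$value[1])  # counts backslash and n as two characters

# Replace the two-character sequence backslash + n with a real newline.
# fixed = TRUE makes gsub treat the pattern as plain text, not a regex.
dt[, value := gsub("\\n", "\n", value, fixed = TRUE)]
after <- nchar(dt$value[1])   # one character shorter after the replacement

cat(dt$value[1], "\n")        # now prints on two lines
unlink(tmp)
```

After the substitution, the string can be passed to gtext() or cat() and the line break is rendered as intended.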