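After running a questionnaire, I received the answers. One of the questions was: How often do you use these languages at work? The answers are formatted as follows: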
"A - Spanish 60 \r\nB - Both of them 10 \r\n C - English 30"
"B - Both of them 50 \r\n C - English 50"
"A - Spanish 30 \r\nC - English 70"
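As you can see, each answer is made up of up to three parts, each prefixed with A, B, or C (Spanish, Both of them, or English). However, the three parts do not always all appear, and what I want to obtain is the following table: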
Spanish | Both of them | English
60 | 10 | 30
0 | 50 | 50
30 | 0 | 70
I split the answers with strsplit(x, "\r\n"), but I don't know how to continue from there.
Let me share my take on how to achieve this:
# Pseudocode
# 1. Init: create a result matrix filled with zeros,
#    with one row per response and
#    3 columns for Spanish, Both of them, and English.
# 2. For each response, split it with strsplit(response, "\r\n") and trim extra spaces.
#    2.1. For each line, split it into parts using strsplit(line, " "), and
#         extract the option (A, B, or C) and the value.
#         2.1.1. Based on the option, update the corresponding cell in the result matrix.
# 3. Print the result matrix.
Here is the sample code:
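# Example input: the three answers shown in the question
responses <- c(
  "A - Spanish 60 \r\nB - Both of them 10 \r\n C - English 30",
  "B - Both of them 50 \r\n C - English 50",
  "A - Spanish 30 \r\nC - English 70"
)

# init: one row per response, zero-filled so that missing options stay 0
result <- matrix(0, nrow = length(responses), ncol = 3)
colnames(result) <- c("Spanish", "Both of them", "English")

for (i in seq_along(responses)) {
  response <- responses[i]
  lines <- strsplit(response, "\r\n")[[1]]
  for (line in lines) {
    line <- gsub("^\\s+|\\s+$", "", line)  # Remove leading/trailing spaces
    parts <- strsplit(line, " ")[[1]]
    option <- parts[1]                         # "A", "B", or "C"
    value <- as.numeric(parts[length(parts)])  # the number is the last token
    if (option == "A") {
      result[i, "Spanish"] <- value
    } else if (option == "B") {
      result[i, "Both of them"] <- value
    } else if (option == "C") {
      result[i, "English"] <- value
    }
  }
}

print(result)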
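
If the nested loop feels heavy, the same idea can probably be written more compactly in base R. The following is only a sketch under a few assumptions: every part starts with the letter A, B, or C and ends with a number, and the input is the three example answers from the question. The helper name parse_response is hypothetical and only used for illustration.

```r
# Assumed input: the three example answers from the question
responses <- c(
  "A - Spanish 60 \r\nB - Both of them 10 \r\n C - English 30",
  "B - Both of them 50 \r\n C - English 50",
  "A - Spanish 30 \r\nC - English 70"
)

# parse_response is a hypothetical helper: it turns one answer string
# into a named numeric vector c(A = ..., B = ..., C = ...)
parse_response <- function(response) {
  out <- c(A = 0, B = 0, C = 0)  # options that are missing stay 0
  # Split on CR LF and trim surrounding whitespace from each part
  lines <- trimws(strsplit(response, "\r\n", fixed = TRUE)[[1]])
  option <- substr(lines, 1, 1)                          # leading letter
  value <- as.numeric(sub(".*\\s(\\d+)$", "\\1", lines)) # trailing number
  keep <- option %in% names(out)                         # skip unexpected lines
  out[option[keep]] <- value[keep]
  out
}

# vapply returns a 3 x n matrix, so transpose to get one row per response
result <- t(vapply(responses, parse_response, numeric(3), USE.NAMES = FALSE))
colnames(result) <- c("Spanish", "Both of them", "English")
print(result)
```

Keeping the per-answer parsing in one small function makes it easy to test a single string on its own, and the zero-initialised vector takes care of answers where an option is absent.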