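I am trying to manipulate columns that come from "select all that apply" questions, so the length of each entry varies by row/respondent. Every response option except one ("Keeping Track of Time/Order") is followed by parentheses containing a unique identifier for that option (see the code below). To illustrate, I have two questions: one about the strengths of a learning tool and one about its challenges.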
df <- data.frame(
  ID = 1:6,
  response_strength = c("Language (L) Attention (A)", "Movement Control (MC)", "Language (L) Getting Along with Others (G) Attention (A) Memory (M)", "Memory (M) Complex Thinking (C) Spatial Thinking (S)", "Memory (M) Spatial Thinking (S)", "Language (L) Attention (A)"),
  response_challenge = c("Movement Control (MC)", "Language (L) Attention (A)", "Complex Thinking (C)", "Attention (A)", "Getting Along with Others (G) Keeping Track of Time/Order", "Keeping Track of Time/Order Movement Control (MC)")
)
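My goal is to convert this to long format and produce an output table showing the percentage of respondents who selected each response option, like the one below. (Note: the following code was created for illustration only, so the counts and percentages are not accurate.)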
df2 <- data.frame(survey_question = c("response_strength", "response_strength", "response_strength", "response_strength", "response_strength", "response_strength", "response_strength", "response_challenge", "response_challenge", "response_challenge", "response_challenge", "response_challenge", "response_challenge", "response_challenge"), response = c("Movement Control (MC)", "Language (L)", "Attention (A)", "Getting Along with Others (G)", "Complex Thinking (C)", "Spatial Thinking (S)","Keeping Track of Time/Order", "Movement Control (MC)", "Language (L)", "Attention (A)", "Getting Along with Others (G)", "Complex Thinking (C)", "Spatial Thinking (S)", "Keeping Track of Time/Order"), n = c(1, 2, 4, 5, 3, 1, 2, 1, 2, 4, 5, 3, 1, 2), percent = c(.33, .67, 1.0, .33, .67, 1.0, .33, .67, 1.0, .33, .67, 1.0, .33, .67))
Output:
      survey_question                      response n percent
1   response_strength         Movement Control (MC) 1    0.33
2   response_strength                  Language (L) 2    0.67
3   response_strength                 Attention (A) 4    1.00
4   response_strength Getting Along with Others (G) 5    0.33
5   response_strength          Complex Thinking (C) 3    0.67
6   response_strength          Spatial Thinking (S) 1    1.00
7   response_strength   Keeping Track of Time/Order 2    0.33
8  response_challenge         Movement Control (MC) 1    0.67
9  response_challenge                  Language (L) 2    1.00
10 response_challenge                 Attention (A) 4    0.33
11 response_challenge Getting Along with Others (G) 5    0.67
12 response_challenge          Complex Thinking (C) 3    1.00
13 response_challenge          Spatial Thinking (S) 1    0.33
14 response_challenge   Keeping Track of Time/Order 2    0.67
I'm stuck on the best way forward. Any help would be greatly appreciated!
I would do it like this:
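library(tidyverse)  # provides pivot_longer(), unnest_longer(), mutate(), str_split(), str_remove()
df |>
  pivot_longer(-ID, names_to = "question", values_to = "response", names_transform = list(question = \(x) str_remove(x, "response_"))) |>
  # split on a space that directly follows ")" or "Order", so each option becomes its own element
  mutate(response = str_split(response, "(?<=\\)|Order) ")) |>
  unnest_longer(response) |>
  # n = number of options chosen per respondent and question; pct = each option's share of them
  mutate(n = n(), pct = n / sum(n), .by = c(ID, question))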
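If the target is the per-option table from the question (how many respondents selected each option, and what share of all respondents that is), one possible follow-up is to count rows per question and option after unnesting, then divide by the number of respondents. The following is only a sketch under some assumptions: the object names `n_respondents` and `summary_tbl` are made up for illustration, the question names keep their `response_` prefix to match the desired output, and every respondent is assumed to have been shown both questions.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Example data from the question
df <- data.frame(
  ID = 1:6,
  response_strength = c(
    "Language (L) Attention (A)",
    "Movement Control (MC)",
    "Language (L) Getting Along with Others (G) Attention (A) Memory (M)",
    "Memory (M) Complex Thinking (C) Spatial Thinking (S)",
    "Memory (M) Spatial Thinking (S)",
    "Language (L) Attention (A)"
  ),
  response_challenge = c(
    "Movement Control (MC)",
    "Language (L) Attention (A)",
    "Complex Thinking (C)",
    "Attention (A)",
    "Getting Along with Others (G) Keeping Track of Time/Order",
    "Keeping Track of Time/Order Movement Control (MC)"
  )
)

# Assumption: the denominator is the total number of respondents
n_respondents <- n_distinct(df$ID)

summary_tbl <- df |>
  # one row per respondent and question
  pivot_longer(-ID, names_to = "survey_question", values_to = "response") |>
  # split on a space that directly follows ")" or "Order"
  mutate(response = str_split(response, "(?<=\\)|Order) ")) |>
  # one row per respondent, question, and selected option
  unnest_longer(response) |>
  # number of respondents who selected each option, per question
  count(survey_question, response) |>
  mutate(percent = round(n / n_respondents, 2))

print(summary_tbl)
```

Options that nobody selected for a given question do not appear in this result. If they should show up with a count of zero, something like `tidyr::complete(survey_question, response, fill = list(n = 0))` before computing `percent` may help.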