How to order the levels of a variable based on the values in one variable and the values in another variable?

Question. Votes: 1. Answers: 1

I have a data frame that looks like this, which I am preparing for ggplot:

txt <- "v1 v2 v3
'Strongly agree' 83.1 var1
'Agree' 14.9 var1
'Disagree' 1.5 var1
'Strongly disagree' 0.6 var1
'Strongly agree' 11.8 var2
'Agree' 36.5 var2
'Disagree' 17.7 var2
'Strongly disagree' 43.8 var2
'Strongly agree' 19.6 var3
'Agree' 12 var3
'Disagree' 31.6 var3
'Strongly disagree' 36.8 var3"

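# stringsAsFactors = TRUE keeps v1 and v3 as factors (the default became FALSE in R 4.0.0)
mydata <- read.table(textConnection(txt), sep = " ", header = TRUE, stringsAsFactors = TRUE)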

My question is: how can I order the levels in mydata$v3 based on the values in mydata$v2 and the values in mydata$v1?

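An example: if I want to order the levels in mydata$v3 by the highest value in mydata$v2 within "Strongly agree" in mydata$v1, the order I would get is var1, var3, var2, because the values in mydata$v2 are 83.1, 19.6, 11.8.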

Another example: if I want to order the levels in mydata$v3 by the sum of the values in mydata$v2 within the levels "Strongly agree" and "Agree" in mydata$v1, the order I would get is var1, var2, var3, because the sums are (83.1 + 14.9) = 98, (11.8 + 36.5) = 48.3, (19.6 + 12) = 31.6.

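I don't know how to solve this myself. I also deal with a lot of data frames like this one, so the code has to go into a function.

EDIT:

In both examples, the result I want is the original data.frame with only the order of the levels of mydata$v3 changed.

So in example one I have: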

                  v1   v2   v3
1     Strongly agree 83.1 var1
2              Agree 14.9 var1
3           Disagree  1.5 var1
4  Strongly disagree  0.6 var1
5     Strongly agree 11.8 var2
6              Agree 36.5 var2
7           Disagree 17.7 var2
8  Strongly disagree 43.8 var2
9     Strongly agree 19.6 var3
10             Agree 12.0 var3
11          Disagree 31.6 var3
12 Strongly disagree 36.8 var3 

levels(mydata$v3)
[1] "var1" "var2" "var3"

But what I want to end up with is this:

                  v1   v2   v3
1     Strongly agree 83.1 var1
2              Agree 14.9 var1
3           Disagree  1.5 var1
4  Strongly disagree  0.6 var1
5     Strongly agree 11.8 var2
6              Agree 36.5 var2
7           Disagree 17.7 var2
8  Strongly disagree 43.8 var2
9     Strongly agree 19.6 var3
10             Agree 12.0 var3
11          Disagree 31.6 var3
12 Strongly disagree 36.8 var3 

levels(mydata$v3)
[1] "var1" "var3" "var2"

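In example two I have:

                  v1   v2   v3
1     Strongly agree 83.1 var1
2              Agree 14.9 var1
3           Disagree  1.5 var1
4  Strongly disagree  0.6 var1
5     Strongly agree 11.8 var2
6              Agree 36.5 var2
7           Disagree 17.7 var2
8  Strongly disagree 43.8 var2
9     Strongly agree 19.6 var3
10             Agree 12.0 var3
11          Disagree 31.6 var3
12 Strongly disagree 36.8 var3

levels(mydata$v3)
[1] "var1" "var2" "var3"

But want:

                  v1   v2   v3
1     Strongly agree 83.1 var1
2              Agree 14.9 var1
3           Disagree  1.5 var1
4  Strongly disagree  0.6 var1
5     Strongly agree 11.8 var2
6              Agree 36.5 var2
7           Disagree 17.7 var2
8  Strongly disagree 43.8 var2
9     Strongly agree 19.6 var3
10             Agree 12.0 var3
11          Disagree 31.6 var3
12 Strongly disagree 36.8 var3

levels(mydata$v3)
[1] "var1" "var2" "var3"

Note that in example two what I have and what I want are the same, but I have many data.frames where this will not be the case.

What I want is a more intricate version of

factor(mydata$v3, levels(mydata$v3)[EXAMPLE1: order by the value in v2 within one level of v1 / EXAMPLE2: order by the sum of the values in v2 within two levels of v1])

r sorting r-factor
1 Answer
0 votes

Here is a solution using aggregate:

f <- function(mydata, v1.val) {
  # Value or sum of v2 within the selected rows
  sums <- aggregate(v2 ~ v3, data=mydata[mydata$v1 %in% v1.val,], FUN=sum)

  # Decreasing order of the sum of v2 values, or the only v2 value, for each level of v3
  ord <- order(sums$v2, decreasing=TRUE)

  # Build a new factor with the proper levels and assign it to v3
  fac <- factor(mydata$v3, levels=sums$v3[ord])

  mydata$v3 <- fac
  return(mydata)
}

The data frame looks the same as above, but the factor levels are as requested:

> f(mydata, 'Strongly agree')$v3
 [1] var1 var1 var1 var1 var2 var2 var2 var2 var3 var3 var3 var3
Levels: var1 var3 var2
> f(mydata, c('Strongly agree', 'Agree'))$v3
 [1] var1 var1 var1 var1 var2 var2 var2 var2 var3 var3 var3 var3
Levels: var1 var2 var3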
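Since the question mentions many data frames, here is a minimal sketch of one way to apply the same reordering across a list of them. It is not the answer's method: it swaps in base R's reorder() instead of aggregate(), and the helper name reorder_v3 and the list frames are hypothetical names made up for illustration. It assumes every data frame has the columns v1, v2 and v3 as in the question.

```r
# Example data from the question; stringsAsFactors = TRUE makes v1 and v3 factors.
txt <- "v1 v2 v3
'Strongly agree' 83.1 var1
'Agree' 14.9 var1
'Disagree' 1.5 var1
'Strongly disagree' 0.6 var1
'Strongly agree' 11.8 var2
'Agree' 36.5 var2
'Disagree' 17.7 var2
'Strongly disagree' 43.8 var2
'Strongly agree' 19.6 var3
'Agree' 12 var3
'Disagree' 31.6 var3
'Strongly disagree' 36.8 var3"
mydata <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

# Hypothetical helper (an assumption, not part of the answer): reorder the
# levels of v3 by the summed v2 of the rows whose v1 is in v1.val,
# largest sum first. Only the level order changes, not the rows.
reorder_v3 <- function(df, v1.val) {
  # Rows outside the selection contribute 0 to the per-level sum.
  score <- ifelse(df$v1 %in% v1.val, df$v2, 0)
  # reorder() sorts levels by increasing score, so negate for decreasing order.
  fac <- reorder(df$v3, -score, FUN = sum)
  # Drop the "scores" attribute that reorder() attaches to its result.
  attr(fac, "scores") <- NULL
  df$v3 <- fac
  df
}

# Hypothetical list of data frames sharing the same layout.
frames <- list(a = mydata, b = mydata)

# Example 1 applied to every frame in the list.
res1 <- lapply(frames, reorder_v3, v1.val = "Strongly agree")
print(levels(res1$a$v3))

# Example 2 on a single frame.
res2 <- reorder_v3(mydata, c("Strongly agree", "Agree"))
print(levels(res2$v3))
```

One difference to be aware of: in this sketch a level of v3 with no rows in the selection gets a score of 0 and is kept, whereas the aggregate() approach leaves such a level out of the new factor, so its rows would become NA. Which behaviour is preferable depends on the data.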