Counting occurrences of values across a set of variables in R (per row), using weights

Question (votes: 1, answers: 1)

I have the following data frame, df8:

df8=data.frame(V1=c(10,20,10,20),V2=c(20,30,20,30),V3=c(20,10,20,10))

Here are the occurrence counts of the values in each row:

a<-apply(df8,MARGIN=1,table)

> a
[[1]]

10 20 
 1  2 

[[2]]

10 20 30 
 1  1  1 

[[3]]

10 20 
 1  2 

[[4]]

10 20 30 
 1  1  1 

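I have a vector V = (0.25, 0.25, 0.5), with one weight per column. For each row, I want to weight each occurrence by the weight of the column it appears in, i.e. sum the column weights for each distinct value in the row. I would like to get something like this:

[[1]]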

   10  20 
 0.25  0.5

[[2]]

   10   20  30 
 0.5 0.25 0.25 

[[3]]

   10  20 
 0.25  0.5

[[4]]

 10   20   30 
 0.5 0.25 0.25 
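
The weighted table described above can be sketched with base R's tapply, which sums the column weights V grouped by the values found in a row. This is only an illustration of that reading of the question; the helper name weighted_table is made up here. Note that when the weights are summed this way, the entry for 20 in rows 1 and 3 comes out as 0.25 + 0.5 = 0.75 rather than the 0.5 listed above.

```r
# Column weights and data taken from the question
V <- c(0.25, 0.25, 0.5)
df8 <- data.frame(V1 = c(10, 20, 10, 20),
                  V2 = c(20, 30, 20, 30),
                  V3 = c(20, 10, 20, 10))

# weighted_table is a hypothetical helper name: for one row x, sum the
# weights w of the columns in which each distinct value occurs
weighted_table <- function(x, w) tapply(w, x, sum)

# One weighted table per row; lapply always returns a list
w_tab <- lapply(seq_len(nrow(df8)),
                function(i) weighted_table(unlist(df8[i, ]), V))

w_tab[[2]]
```

For row 2 (20, 30, 10) this gives 0.5 for 10 and 0.25 each for 20 and 30, matching the second table above.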

Now, for each row, I want to select the item with the highest a*V value:

> df8
  V1 V2 V3 max_val
1 10 20 20   20
2 20 30 10   10
3 10 20 20   20
4 20 30 10   10
r dataframe vector apply
1 answer

Votes: 1

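One option is to apply the table function to each row to find how often each column's value occurs in that row. Then apply the factor defined in V to each column and find the column index with the maximum freq*V value. The row's value at that index is the desired value.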

# Multiplier for occurrences in each column
V = c(0.25,0.25,0.5)

# Data frame
df8=data.frame(V1=c(10,20,10,20),V2=c(20,30,20,30),V3=c(20,10,20,10))

# This function receives all column values of one row. It finds the frequency
# of each column's value within the row, multiplies the frequencies by V
# (column-wise), and returns the row value at the index where freq*V is largest.

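find_max_freq_val <- function(x){
  # Frequency of each distinct value in the row
  freq_df <- as.data.frame(table(x))
  # Frequency of each column's value, in column order
  freq_vec <- mapply(function(y) freq_df[freq_df$x == y, "Freq"], x)
  # Multiply the frequencies by V and return the item of x at the index of
  # max(freq*V); which.max() returns the first such index, so a tie still
  # yields a single value per row
  x[which.max(freq_vec * V)]
}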

# Call the function above to add a column with the desired value
df8$new_val <- apply(df8, 1, find_max_freq_val)

df8
#  V1 V2 V3 new_val
#1 10 20 20      20
#2 20 30 10      10
#3 10 20 20      20
#4 20 30 10      10
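
As an alternative reading of the question, one can sum the column weights per distinct value and pick the value with the largest sum. The sketch below assumes that interpretation; pick_weighted_max is an illustrative name that is not part of the answer above, and the column name max_val mirrors the question.

```r
# Column weights and data taken from the question
V <- c(0.25, 0.25, 0.5)
df8 <- data.frame(V1 = c(10, 20, 10, 20),
                  V2 = c(20, 30, 20, 30),
                  V3 = c(20, 10, 20, 10))

# pick_weighted_max is a hypothetical helper: sum the weights w per distinct
# value of the row x, then return the value with the largest sum
pick_weighted_max <- function(x, w) {
  totals <- tapply(w, x, sum)                     # weight sum per distinct value
  # On a tie, which.max() picks the first, i.e. the smallest, value
  as.numeric(names(totals)[which.max(totals)])
}

# Apply row-wise over the three value columns only
df8$max_val <- apply(df8[, c("V1", "V2", "V3")], 1, pick_weighted_max, w = V)
df8$max_val
```

On this data the result agrees with new_val above (20, 10, 20, 10). The two approaches are not equivalent in general, though: for a row such as c(10, 10, 20) with weights c(0.1, 0.3, 0.5), freq*V peaks at the second column (value 10), while the summed weights favor 20 (0.5 versus 0.4).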