Transforming a data frame using column indices and a loop

Question (votes: 1, answers: 2)

I am trying to write a function, using indices, that takes key-value pairs of columns and stacks them.

Here is my data:

mydata<-structure(list(groupA = c("Rugby for Chipmunks", "Rugby for Chipmunks", "Rugby for Chipmunks", "Chafing Explained"), First = c(5, 3.57142857142857, 5, 4.5), groupB = c("Pylons for Priests", "Eating Creosote", "Eating Creosote", "Eating Creosote"), Second = c(4, 4, 3.16666666666667, 2.1666667), groupC = c("Wow for YOU!", "Advanced Cats for Bears", "Blue Paint Only", "Mockingbirds"), Third = c(5, 3, NaN, 4), groupD = c("How to Sell Pigeons", "How to Sell Pigeons", "How to Sell Pigoens", "Larger Boulders"), Fourth = c(4.3, 3, 4.1, 3.4), groupE = c("Making Money with Pears", "Making Money with Pears", "Why Walnuts?", "Responding to Idiots Part II"), Fifth = c(5, 3, 5, 4.16666666666667)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"))

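I want to use indices because future tasks will use data frames with different column names and widths. My approach uses a function to determine whether a column index is odd or even, then extracts pairs of columns until it reaches the last odd-numbered column. Note that the function has to respect the odd-even index order required for each extraction (group name and corresponding score):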

odd <- function(x) x%%2 != 0
outfile<-list()

moveit<-function(df){
  for (i in 1:dim(df)[2])    # define number of loops
    if (  i==dim(df)[2]-1  )  {break} # stop at last odd-numbered column
    if ( odd(i)==FALSE) {next}  # skip when i is not an odd numbered index 
  print(i)
  outfile[[i+1]]<-df[ ,c(i,i+1)]
}

result<-moveit(mydata)
str(result)

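You can see that the result is only the last key-value pair. Why? How can I adjust the function so that it extracts all key-value pairs into one data frame?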

r loops
2 Answers

1 vote

We can use gl to create a numeric index, split the dataset into a list of data.frames, rename the columns of each list element with map, and bind them by row:

library(dplyr)
library(purrr)
split.default(mydata, as.integer(gl(ncol(mydata), 2, ncol(mydata)))) %>% 
      map_dfr(~ .x %>% 
                  rename_all(~ c('group', 'value')))

The above can also be done without any packages:

lst1 <-  split.default(mydata, as.integer(gl(ncol(mydata), 2, ncol(mydata)))) 
do.call(rbind, lapply(lst1, setNames, c("group", "value")))
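
To see how the grouping index works, here is a minimal sketch on a small made-up data frame (toy and its column names are assumptions for illustration, not the question's data). gl(n, 2, n) yields a factor in which each level is repeated twice, so every consecutive pair of columns shares one index value.

```r
# Hypothetical 4-column data frame: two group/score pairs
toy <- data.frame(gA = c("a", "b"), s1 = c(1, 2),
                  gB = c("c", "d"), s2 = c(3, 4),
                  stringsAsFactors = FALSE)

# One index value per column; each value covers one pair of columns
idx <- as.integer(gl(ncol(toy), 2, ncol(toy)))
idx

# split.default() splits the data frame column-wise by that index
pairs <- split.default(toy, idx)

# Give every piece the same column names, then stack the pieces row-wise
stacked <- do.call(rbind, lapply(pairs, setNames, c("group", "value")))
stacked
```

Because the index is built from ncol() alone, the same lines should apply to a data frame with any even number of columns, whatever the columns are called.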

In the OP's code, the 'outfile' list is initialized with length 0. Instead, it can be

odd <- function(x) x%%2 != 0
outfile <- vector('list', ncol(mydata))

moveit<-function(df){
  for (i in seq_along(df)) {   
    if(odd(i)){  
      outfile[[i]]<-df[ ,c(i,i+1)]
    }
 }
 Filter(Negate(is.null), outfile)
}

result <- moveit(mydata)

Also, the main problem is that 'outfile' is not returned at the end of the function:

odd <- function(x) x%%2 != 0
outfile<-list()
moveit<-function(df){
  for (i in 1:dim(df)[2]) {   # define number of loops
    # stop once i is past the last odd-numbered column
    # (breaking at i == ncol - 1 would skip the final pair)
    if (  i > dim(df)[2]-1  )  {break}
    if ( odd(i)==FALSE) {next}  # skip when i is not an odd-numbered index
    print(i)
    outfile[[i+1]]<-df[ ,c(i,i+1)]  # pairs land at even positions; odd ones stay NULL
  }
  outfile  # return the list explicitly
}

result<-moveit(mydata)

Note: no packages are used here either.
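
Why the question's function returned only the last pair can be shown with a minimal sketch (the function names no_braces and with_braces are hypothetical). Without braces, a for loop repeats only the single statement that follows it; everything after that statement runs once, after the loop has finished.

```r
no_braces <- function(n) {
  out <- integer(0)
  for (i in 1:n)
    if (i == n) break      # only this statement is the loop body
  out <- c(out, i)         # runs once, after the loop, with the final i
  out
}

with_braces <- function(n) {
  out <- integer(0)
  for (i in 1:n) {
    out <- c(out, i)       # runs on every iteration
  }
  out
}

no_braces(4)
with_braces(4)
```

In the question's code the last expression evaluated is the assignment outfile[[i+1]] <- df[ ,c(i,i+1)], and the value of that assignment (the final pair) becomes the function's return value.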


2 votes

1) reshape: reshape can do this, assuming you want to reshape the data frame to long form. No packages are used.

nc <- ncol(mydata)
ig <- seq(1, nc, 2)  # indexes of key columns
reshape(as.data.frame(mydata), dir = "long", 
  varying = list(ig, -ig), v.names = c("key", "value"))

giving:

    time                          key    value id
1.1    1          Rugby for Chipmunks 5.000000  1
2.1    1          Rugby for Chipmunks 3.571429  2
3.1    1          Rugby for Chipmunks 5.000000  3
4.1    1            Chafing Explained 4.500000  4
1.2    2           Pylons for Priests 4.000000  1
2.2    2              Eating Creosote 4.000000  2
3.2    2              Eating Creosote 3.166667  3
4.2    2              Eating Creosote 2.166667  4
1.3    3                 Wow for YOU! 5.000000  1
2.3    3      Advanced Cats for Bears 3.000000  2
3.3    3              Blue Paint Only      NaN  3
4.3    3                 Mockingbirds 4.000000  4
1.4    4          How to Sell Pigeons 4.300000  1
2.4    4          How to Sell Pigeons 3.000000  2
3.4    4          How to Sell Pigoens 4.100000  3
4.4    4              Larger Boulders 3.400000  4
1.5    5      Making Money with Pears 5.000000  1
2.5    5      Making Money with Pears 3.000000  2
3.5    5                 Why Walnuts? 5.000000  3
4.5    5 Responding to Idiots Part II 4.166667  4

2) pivot_longer: Alternatively, this can be done with pivot_longer.

library(dplyr)
library(tidyr)

v.names <- c("key", "value")
mydata %>%
  setNames(outer(v.names, 1:(ncol(.)/2), paste)) %>%
  mutate(id = 1:n()) %>%
  pivot_longer(cols = -id, names_to = c(".value", "no"), names_sep = " ") %>%
  arrange(no, id)

Note that this is similar to the use of pivot_longer here: Pivot by group for unequal data size

Revised

Here is your code, modified.

moveit <- function(df) {
  outfile <- list()
  for(i in seq_along(df)) if (odd(i)) outfile[[(i+1)/2]] <- df[c(i, i+1)]
  outfile
}
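
To go from the list of pairs to a single stacked data frame, which is what the question asks for, one possible sketch follows. It repeats odd and moveit so that it runs on its own; the data frame toy and the output column names group and value are assumptions for illustration.

```r
odd <- function(x) x %% 2 != 0

moveit <- function(df) {
  outfile <- list()
  # pair k (columns 2k-1 and 2k) is stored at list position k
  for (i in seq_along(df)) if (odd(i)) outfile[[(i + 1) / 2]] <- df[c(i, i + 1)]
  outfile
}

# Hypothetical 4-column data frame: two group/score pairs
toy <- data.frame(gA = c("a", "b"), s1 = c(1, 2),
                  gB = c("c", "d"), s2 = c(3, 4),
                  stringsAsFactors = FALSE)

pairs <- moveit(toy)

# rbind() needs matching column names, so rename each pair before stacking
stacked <- do.call(rbind, lapply(pairs, setNames, c("group", "value")))
stacked
```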