How to copy attributes from one data frame to another, or reassign attributes to a newly reshaped data frame, in R

Question

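After reshaping my data, I want to reassign the attributes that were dropped. The same approach should also work for copying attributes from one data frame to another, or for restoring attributes after a mutate or similar operation has removed them.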

 library(reshape2)

 df <- data.frame(id = c(1,2,3,4,5), 
                  time = c(11, 22,33,44,55),
                  c  = c(1,2,3,5,5),
                  d = c(4,2,5,4,NA))

attr(df$id,"label")<- "label"
attr(df$time,"label")<- "label2"
attr(df$c,"label")<- "something here"
attr(df$d,"label")<- "count of something"
str(df)

'data.frame':   5 obs. of  4 variables:
 $ id  : num  1 2 3 4 5
  ..- attr(*, "label")= chr "label"
 $ time: num  11 22 33 44 55
  ..- attr(*, "label")= chr "label2"
 $ c   : num  1 2 3 5 5
  ..- attr(*, "label")= chr "something here"
 $ d   : num  4 2 5 4 NA
  ..- attr(*, "label")= chr "count of something"

Casting to wide format:

dfwide<- recast(df,id~variable +time, 
            id.var = c("id","time"))

This gives the usual message about dropped attributes:

   Warning message:
     attributes are not identical across measure variables; they will be dropped 

 str(dfwide)
'data.frame':   5 obs. of  11 variables:
 $ id  : num  1 2 3 4 5
 $ c_11: num  1 NA NA NA NA
 $ c_22: num  NA 2 NA NA NA
 $ c_33: num  NA NA 3 NA NA
 $ c_44: num  NA NA NA 5 NA
 $ c_55: num  NA NA NA NA 5
 $ d_11: num  4 NA NA NA NA
 $ d_22: num  NA 2 NA NA NA
 $ d_33: num  NA NA 5 NA NA
 $ d_44: num  NA NA NA 4 NA
 $ d_55: num  NA NA NA NA NA

Using mostattributes I can copy attributes between data frames, but with many column names I cannot work out how to map them efficiently other than assigning them one by one.

 mostattributes(dfwide$c_11)<-attributes(df$c)
 mostattributes(dfwide$c_22)<-attributes(df$c)
 > str(dfwide)
 'data.frame':  5 obs. of  11 variables:
  $ id  : num  1 2 3 4 5
  $ c_11: num  1 NA NA NA NA
  ..- attr(*, "label")= chr "something here"
  $ c_22: num  NA 2 NA NA NA
  ..- attr(*, "label")= chr "something here"
  $ c_33: num  NA NA 3 NA NA

I tried to automate it but failed (all the c columns should have the same label, and all the d columns should have the same label):

library(tibble)  # enframe()
library(dplyr)   # slice(), pull(), %>%

# extract the column names to map between
dlist<-enframe(names(df))%>%
   slice(-1,-2)%>%
   pull(., value)
 dlist

 dlistw<-enframe(names(dfwide))%>%
  slice(-1)%>%
  pull(., value)
 dlistw

#function
mostatt<- function(var1, var2) {
  mostattributes(dfwide[[var1]])<<-attributes(df[[var2]])
}

mapply(mostatt,dlistw,dlist)
str(dfwide)

'data.frame':   5 obs. of  11 variables:
 $ id  : num  1 2 3 4 5
 $ c_11: num  1 NA NA NA NA
  ..- attr(*, "label")= chr "something here"
 $ c_22: num  NA 2 NA NA NA
  ..- attr(*, "label")= chr "count of something"
 $ c_33: num  NA NA 3 NA NA
  ..- attr(*, "label")= chr "something here"
 $ c_44: num  NA NA NA 5 NA
  ..- attr(*, "label")= chr "count of something"
 $ c_55: num  NA NA NA NA 5
  ..- attr(*, "label")= chr "something here"
 $ d_11: num  4 NA NA NA NA
  ..- attr(*, "label")= chr "count of something"
 $ d_22: num  NA 2 NA NA NA
  ..- attr(*, "label")= chr "something here"
 $ d_33: num  NA NA 5 NA NA
  ..- attr(*, "label")= chr "count of something"
 $ d_44: num  NA NA NA 4 NA
  ..- attr(*, "label")= chr "something here"
 $ d_55: num  NA NA NA NA NA
  ..- attr(*, "label")= chr "count of something"
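The alternating labels in this output are consistent with how mapply pairs its arguments: the shorter vector is recycled element by element, so the source names cycle c, d, c, d along the wide column names. A minimal base R sketch of that pairing, using shortened name vectors chosen only for illustration:

```r
# Illustrative name vectors (shortened stand-ins for dlistw and dlist).
wide_names <- c("c_11", "c_22", "c_33", "d_11", "d_22", "d_33")
src_names  <- c("c", "d")

# mapply() walks both vectors in parallel and recycles the shorter one,
# so the second argument alternates between "c" and "d".
pairs <- mapply(function(w, s) paste(w, "<-", s), wide_names, src_names)
print(unname(pairs))
```

Every second wide column is therefore paired with the wrong source column, regardless of its prefix.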

I thought tidyselect's starts_with might be worth a try, but I am not sure how to incorporate it. Any suggestions would be appreciated. Thanks!

r attr purrr reshape2 tidyselect
1 Answer

Here is one option:

for(i in (setdiff(colnames(df), "id"))){
  for(x in colnames(dfwide)[(grepl(i, colnames(dfwide)))])
      mostattributes(dfwide[[x]]) <- attributes(df[[i]])
}
mostattributes(dfwide$id) <- attributes(df$id) 

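Because the letter d also appears in id, the pattern for d matches the id column too, so I have to rewrite the attributes of id at the end. It is simpler if you rename d to e: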

df <- data.frame(id = c(1,2,3,4,5), 
                 time = c(11, 22,33,44,55),
                 c  = c(1,2,3,5,5),
                 e = c(4,2,5,4,NA))


attr(df$id,"label")<- "label"
attr(df$time,"label")<- "label2"
attr(df$c,"label")<- "something here"
attr(df$e,"label")<- "count of something"
str(df)

dfwide<- recast(df,id~variable +time, 
                id.var = c("id","time"))

for(i in (colnames(df))){
  for(x in colnames(dfwide)[(grepl(i, colnames(dfwide)))])
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
}
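A possible variation that avoids the substring problem without renaming columns is to derive each source column name from the wide column name instead of pattern-matching on it. The sketch below is base R only and makes some assumptions: the wide names have the form `<source>_<digits>` as produced above, the wide data frame is built by hand as a stand-in for the recast() result, and `copy_attrs_by_prefix` is a hypothetical helper name, not part of any package.

```r
# Source data frame with labelled columns (a shortened version of the example).
df <- data.frame(id = 1:3, time = c(11, 22, 33),
                 c = c(1, 2, 3), d = c(4, 2, NA))
attr(df$id, "label") <- "label"
attr(df$c, "label")  <- "something here"
attr(df$d, "label")  <- "count of something"

# Hand-built stand-in for the wide result, so the sketch needs no packages.
dfwide <- data.frame(id = 1:3,
                     c_11 = c(1, NA, NA), c_22 = c(NA, 2, NA),
                     d_11 = c(4, NA, NA), d_22 = c(NA, 2, NA))

# Hypothetical helper: strip a trailing "_<digits>" suffix to find the source
# column, then copy its attributes with mostattributes().
copy_attrs_by_prefix <- function(wide, src) {
  for (nm in names(wide)) {
    src_nm <- sub("_[0-9]+$", "", nm)   # "d_11" -> "d", "id" stays "id"
    if (src_nm %in% names(src)) {
      mostattributes(wide[[nm]]) <- attributes(src[[src_nm]])
    }
  }
  wide
}

dfwide <- copy_attrs_by_prefix(dfwide, df)
str(dfwide)
```

Since each wide column resolves to exactly one source name, id is never confused with d and no final correction step is needed. If the source column names themselves can end in an underscore followed by digits, the suffix pattern would need adjusting.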