Converting an irregular nested list to a data frame


I have a nested list as follows:

    mylist <- list(
      list(
        id = 1234,
        attributes = list(
             list(
               typeId = 11,
               type = 'Main',
               date = '2018-01-01', 
               attributes= list(
                 list(
                   team = 'team1',
                   values = list(
                     value1 = 1, 
                     value2 = 999)),
                 list(
                   team = 'team2',
                   values = list(
                     value1 = 2, 
                     value2 = 888))
                 )
               ),
             list(
               typeId = 12,
               type = 'Extra',
               date = '2018-01-02', 
               attributes= list(
                 list(
                   team = 'team1',
                   values = list(
                     value1 = 3, 
                     value2 = 1234)),
                 list(
                   team = 'team2',
                   values = list(
                     value1 = 4, 
                     value2 = 9876))
               )
             )
          )
        )
      )

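I would like to convert this into a data frame in which each leaf entry appears in one row together with all of its parent entries. The resulting data frame would look like this: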

    id type_id  type       date  team value1 value2
1 1234      11  Main 2018-01-01 team1      1    999
2 1234      11  Main 2018-01-01 team2      2    888
3 1234      12 Extra 2018-01-02 team1      3   1234
4 1234      12 Extra 2018-01-02 team2      4   9876

I do not always know the names in the list, so I need a generic way to do this without specifying column names.

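Edit

My original question has been answered, but in response to Parfait's comment that "a simpler solution may be possible if you post the original JSON and your R import code", here are both.

I use this R code to fetch the raw JSON from a URL: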

library(magrittr)  # provides the %>% pipe

httr::GET(
    feed_url,
    httr::authenticate(username, password)
  ) %>%
    httr::content()

At the URL, the JSON looks like this:

[{"id":[1234],"attributes":[{"typeId":[11],"type":["Main"],"date":["2018-01-01"],"attributes":[{"team":["team1"],"values":{"value1":[1],"value2":[999]}},{"team":["team2"],"values":{"value1":[2],"value2":[888]}}]},{"typeId":[12],"type":["Extra"],"date":["2018-01-02"],"attributes":[{"team":["team1"],"values":{"value1":[3],"value2":[1234]}},{"team":["team2"],"values":{"value1":[4],"value2":[9876]}}]}]}]
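To experiment without access to the feed, the JSON above can be parsed straight from a string. The sketch below assumes the jsonlite package is installed; `json_text` and `parsed` are names chosen for illustration only. With `simplifyVector = FALSE` the nesting is kept as plain lists, so the result has the same shape as `mylist`, except that each scalar sits inside a length-one list.

```r
# Assumption: the jsonlite package is installed.
library(jsonlite)

# The JSON text from the question, pasted as a literal string.
json_text <- '[{"id":[1234],"attributes":[{"typeId":[11],"type":["Main"],"date":["2018-01-01"],"attributes":[{"team":["team1"],"values":{"value1":[1],"value2":[999]}},{"team":["team2"],"values":{"value1":[2],"value2":[888]}}]},{"typeId":[12],"type":["Extra"],"date":["2018-01-02"],"attributes":[{"team":["team1"],"values":{"value1":[3],"value2":[1234]}},{"team":["team2"],"values":{"value1":[4],"value2":[9876]}}]}]}]'

# simplifyVector = FALSE keeps every JSON array and object as an R list.
parsed <- fromJSON(json_text, simplifyVector = FALSE)

# Single-element arrays such as [1234] become length-one lists,
# so one extra [[1]] is needed to reach the value itself.
parsed[[1]]$id[[1]]
parsed[[1]]$attributes[[2]]$type[[1]]
```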

r list dataframe
1 Answer

Here is a function that does this:

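## Requires dplyr (%>%, select, slice, bind_rows, n) and tibble (add_column).
library(dplyr)
library(tibble)

flattenList <- function(input) {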

    output <- NULL

    ## Check which elements of the current list are also lists.
    isList <- vapply(input, is.list, logical(1))

    ## Any non-list elements are added to the output data frame.
    if (any(!isList)) {

        ## Determine the number of rows in the output.
        maxRows <- max(sapply(input[!isList], length))

        output <-
            ## Initialise the output data frame with a dummy variable.
            data.frame(dummy = rep(NA, maxRows)) %>%

            ## Append the new columns.
            add_column(!!! input[!isList]) %>%

            ## Delete the dummy variable.
            select(- dummy)
    }

    ## If some elements of the current list are also lists, we apply the function again.
    if (any(isList)) {

        ## Apply the function to every sub-list, then bind the new output as rows.
        newOutput <- lapply(input[isList], flattenList) %>% bind_rows()

        ## Check if the current output is NULL.
        if (is.null(output)) {

            output <- newOutput

        } else {

            ## If the current output has fewer rows than the new output, we recycle it.
            if (nrow(output) < nrow(newOutput)) {
                output <- slice(output, rep(1:n(), times = nrow(newOutput) / n()))
            }


            ## Append the columns of the new output.
            output <- add_column(output, !!! newOutput)
        }
    }

    return(output)
}

> flattenList(mylist)
    id typeId  type       date  team value1 value2
1 1234     11  Main 2018-01-01 team1      1    999
2 1234     11  Main 2018-01-01 team2      2    888
3 1234     12 Extra 2018-01-02 team1      3   1234
4 1234     12 Extra 2018-01-02 team2      4   9876
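For readers who want to avoid the tidyverse dependency, the same recursion can be sketched in base R. This is not the answer's function but a simplified variant: `flatten_base` and the sample list `x` are hypothetical names, and the sketch assumes that non-list fields are scalars and that sibling sub-lists all produce the same columns (base `rbind` fails otherwise, whereas `bind_rows` fills gaps with `NA`).

```r
# Hypothetical helper: flatten a nested list into a base-R data frame.
# Assumes scalar leaf values and identical fields across sibling sub-lists.
flatten_base <- function(x) {
  is_sub <- vapply(x, is.list, logical(1))

  # Scalar fields at this level form a one-row data frame (NULL if none).
  own <- if (any(!is_sub)) {
    as.data.frame(x[!is_sub], stringsAsFactors = FALSE)
  } else NULL

  if (!any(is_sub)) return(own)

  # Flatten each sub-list and stack the results as rows.
  kids <- do.call(rbind, lapply(unname(x[is_sub]), flatten_base))

  if (is.null(own)) return(kids)

  # Repeat the parent row once per child row, then bind the columns.
  cbind(own[rep(1, nrow(kids)), , drop = FALSE], kids, row.names = NULL)
}

# A small list with the same kind of nesting as the question's data.
x <- list(
  list(
    id = 1,
    items = list(
      list(k = "a", v = list(n = 10)),
      list(k = "b", v = list(n = 20))
    )
  )
)

res <- flatten_base(x)
print(res)
```

Each parent row is repeated once per child row, which mirrors the `slice()` recycling step in the function above.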
