Changing the structure of a csv file in R


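I am currently wrangling data from a csv file. One column, "track", holds multiple values for a single row: a list of data points. I would like to write a loop that splits these values so that they are stored in two new columns, lat and lon, instead of track.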

Here is an example of the data I want to change.

This is what it looks like now:

 structure(list(startTime = 1673652576000, endTime = 1673652605000, 
    track = "Any[Dict{String, Any}(\"lat\" => 47.38113329424818, \"lon\" => 8.488552397519989), Dict{String, Any}(\"lat\" => 47.38115806843576, \"lon\" => 8.488606700331793), Dict{String, Any}(\"lat\" => 47.38117942217475, \"lon\" => 8.488750238611077), Dict{String, Any}(\"lat\" => 47.381191813557656, \"lon\" => 8.488831542342021), Dict{String, Any}(\"lat\" => 47.38119496242752, \"lon\" => 8.48889035827453), Dict{String, Any}(\"lat\" => 47.381129867348655, \"lon\" => 8.488894804346009), Dict{String, Any}(\"lat\" => 47.38110967635605, \"lon\" => 8.488808354980387), Dict{String, Any}(\"lat\" => 47.381089123104935, \"lon\" => 8.48873929104632), Dict{String, Any}(\"lat\" => 47.381083177776155, \"lon\" => 8.488731425037434), Dict{String, Any}(\"lat\" => 47.38105863930372, \"lon\" => 8.48866144565764), Dict{String, Any}(\"lat\" => 47.381069428672866, \"lon\" => 8.488639684187174), Dict{String, Any}(\"lat\" => 47.38105328464012, \"lon\" => 8.48861185403874), Dict{String, Any}(\"lat\" => 47.38101967503413, \"lon\" => 8.488536802144624)]", 
    distance = 64, transitMode = "walk", oldTransitMode = ""), row.names = 1L, class = "data.frame")

This is what I would like it to look like:

Can anyone help?

r csv data-wrangling
1 answer

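The track entries are written in some language I don't recognise, but the general approach is to strip all of the "wrapper" code and then split each entry into its individual values. I would do this with a for loop over each row of the data frame. Others might use a different approach.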

For example:

thedata <- structure(list(startTime = 1673652576000, endTime = 1673652605000, 
               track = "Any[Dict{String, Any}(\"lat\" => 47.38113329424818, \"lon\" => 8.488552397519989), Dict{String, Any}(\"lat\" => 47.38115806843576, \"lon\" => 8.488606700331793), Dict{String, Any}(\"lat\" => 47.38117942217475, \"lon\" => 8.488750238611077), Dict{String, Any}(\"lat\" => 47.381191813557656, \"lon\" => 8.488831542342021), Dict{String, Any}(\"lat\" => 47.38119496242752, \"lon\" => 8.48889035827453), Dict{String, Any}(\"lat\" => 47.381129867348655, \"lon\" => 8.488894804346009), Dict{String, Any}(\"lat\" => 47.38110967635605, \"lon\" => 8.488808354980387), Dict{String, Any}(\"lat\" => 47.381089123104935, \"lon\" => 8.48873929104632), Dict{String, Any}(\"lat\" => 47.381083177776155, \"lon\" => 8.488731425037434), Dict{String, Any}(\"lat\" => 47.38105863930372, \"lon\" => 8.48866144565764), Dict{String, Any}(\"lat\" => 47.381069428672866, \"lon\" => 8.488639684187174), Dict{String, Any}(\"lat\" => 47.38105328464012, \"lon\" => 8.48861185403874), Dict{String, Any}(\"lat\" => 47.38101967503413, \"lon\" => 8.488536802144624)]", 
               distance = 64, transitMode = "walk", oldTransitMode = ""), row.names = 1L, class = "data.frame")

# Note: the `_` pipe placeholder used below needs R 4.2.0 or later.
result <- data.frame()
for (i in seq_len(nrow(thedata))) {
  # Strip the wrapper text, then split what is left on commas
  track <- thedata$track[i] |>
    gsub("Dict\\{String, Any}\\(", "", x = _) |>
    gsub("Any\\[", "", x = _) |>
    gsub("]", "", x = _) |>
    gsub("\\)", "", x = _) |>
    strsplit(",")
  # Drop the "lat"/"lon" keys so only the numbers remain
  track <- track[[1]] |>
    gsub('"lat" => ', "", x = _) |>
    gsub('"lon" => ', "", x = _) |>
    as.numeric()
  # Values alternate lat, lon, lat, lon, ...: odd positions are lat, even are lon
  j <- 2*seq_len(length(track)/2)
  lats <- track[j-1]
  lons <- track[j]
  # One output row per point; the other columns are recycled
  entry <- data.frame(
    startTime = thedata$startTime[i],
    endTime =   thedata$endTime[i],
    lat = lats,
    lon = lons,
    distance = thedata$distance[i],
    transitMode = thedata$transitMode[i],
    oldTransitMode = thedata$oldTransitMode[i])
  result <- rbind(result, entry)
}
result
#>       startTime      endTime      lat      lon distance transitMode
#> 1  1.673653e+12 1.673653e+12 47.38113 8.488552       64        walk
#> 2  1.673653e+12 1.673653e+12 47.38116 8.488607       64        walk
#> 3  1.673653e+12 1.673653e+12 47.38118 8.488750       64        walk
#> 4  1.673653e+12 1.673653e+12 47.38119 8.488832       64        walk
#> 5  1.673653e+12 1.673653e+12 47.38119 8.488890       64        walk
#> 6  1.673653e+12 1.673653e+12 47.38113 8.488895       64        walk
#> 7  1.673653e+12 1.673653e+12 47.38111 8.488808       64        walk
#> 8  1.673653e+12 1.673653e+12 47.38109 8.488739       64        walk
#> 9  1.673653e+12 1.673653e+12 47.38108 8.488731       64        walk
#> 10 1.673653e+12 1.673653e+12 47.38106 8.488661       64        walk
#> 11 1.673653e+12 1.673653e+12 47.38107 8.488640       64        walk
#> 12 1.673653e+12 1.673653e+12 47.38105 8.488612       64        walk
#> 13 1.673653e+12 1.673653e+12 47.38102 8.488537       64        walk
#>    oldTransitMode
#> 1                
#> 2                
#> 3                
#> 4                
#> 5                
#> 6                
#> 7                
#> 8                
#> 9                
#> 10               
#> 11               
#> 12               
#> 13

Created on 2024-04-11 with reprex v2.1.0

You should check every step against your real data; its structure may not be exactly the same.
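
As one way of building that check into the code, here is a minimal sketch of a different technique: instead of stripping the wrapper text, it pulls out the numbers that follow "lat" => and "lon" => with a regular expression and stops if the two counts differ. The helper name extract_coords and the toy data frame (with made-up coordinates) are assumptions for illustration and are not part of the answer above. Note that the coordinate columns end up at the right-hand side rather than in the position track occupied.

```r
# Hypothetical helper: extract lat/lon values from one track string.
# Assumes every point is written as "lat" => <number>, "lon" => <number>.
extract_coords <- function(track_string) {
  grab <- function(key) {
    prefix <- paste0('"', key, '" => ')
    # Match the key followed by a run of number-like characters
    pattern <- paste0(prefix, "[-+0-9.eE]+")
    m <- regmatches(track_string, gregexpr(pattern, track_string))[[1]]
    as.numeric(sub(prefix, "", m, fixed = TRUE))
  }
  lat <- grab("lat")
  lon <- grab("lon")
  # Fail loudly if the string is not shaped the way we assumed
  stopifnot(length(lat) > 0, length(lat) == length(lon))
  data.frame(lat = lat, lon = lon)
}

# Toy data with made-up coordinates, laid out like the data in the question
toy <- data.frame(
  startTime = c(1, 2),
  track = c(
    'Any[Dict{String, Any}("lat" => 47.1, "lon" => 8.1), Dict{String, Any}("lat" => 47.2, "lon" => 8.2)]',
    'Any[Dict{String, Any}("lat" => 47.3, "lon" => 8.3)]'
  ),
  transitMode = c("walk", "bike")
)

coords <- lapply(toy$track, extract_coords)
n_points <- vapply(coords, nrow, integer(1))

# Repeat each original row once per point, drop track, attach the coordinates
keep <- setdiff(names(toy), "track")
long <- cbind(
  toy[rep(seq_len(nrow(toy)), n_points), keep, drop = FALSE],
  do.call(rbind, coords)
)
rownames(long) <- NULL
long
```

Because the rows are assembled once at the end rather than with rbind inside a loop, this may also scale better on large files, but that has not been measured here.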
