Converting a data frame to a table in R


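I used read_excel() from the readxl package to read the Excel file shown in the image below into R. Is there a way to convert the data into a table that counts the number of locations per company, as shown in the image below? Thanks.

Edit: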

library(datapasta)
data <- tibble::tribble(
     ~Company, ~Document, ~Location.1, ~Location.2, ~Location.3, ~`Location.4.(City.Specified)`,
     "NVIDIA",      "N1",         "x",          NA,          NA,                             NA,
     "NVIDIA",      "N2",          NA,          NA,         "x",                             NA,
     "NVIDIA",      "N3",          NA,          NA,         "x",                             NA,
     "NVIDIA",      "N4",         "x",          NA,         "x",                             NA,
     "NVIDIA",      "N5",          NA,          NA,          NA,                "Palo Alto, CA",
     "Google",      "G1",          NA,         "x",          NA,                             NA,
     "Google",      "G2",          NA,         "x",          NA,                             NA,
  "Microsoft",      "M1",         "x",          NA,          NA,               "Washington, DC",
      "Tesla",      "T1",          NA,          NA,         "x",                             NA,
      "Tesla",      "T2",         "x",         "x",         "x",                "Princeton, NJ",
      "Tesla",      "T3",          NA,         "x",          NA,                   "Dallas, TX"
  )
head(data)

1 answer

One way to solve this is to first summarise Locations 1 to 3 per company, then join the result with Location 4.

library(dplyr)

data_sum <- data |> 
  group_by(Company) |> 
  summarise(
    # count the rows in each company where the column is not NA
    Location.1 = sum(!is.na(Location.1)),
    Location.2 = sum(!is.na(Location.2)), 
    Location.3 = sum(!is.na(Location.3))
  )
> data_sum
# A tibble: 4 × 4
  Company   Location.1 Location.2 Location.3
  <chr>          <int>      <int>      <int>
1 Google             0          2          0
2 Microsoft          1          0          0
3 NVIDIA             2          0          3
4 Tesla              1          2          2
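If there are many location columns, the three near-identical lines inside summarise() can probably be collapsed with dplyr's across(). The following is only a sketch: `demo` is a hypothetical, cut-down stand-in for the question's data (same column names, fewer rows), so the snippet runs on its own.

```r
library(dplyr)

# Hypothetical stand-in for the question's data: same column names, fewer rows
demo <- tibble::tribble(
  ~Company, ~Document, ~Location.1, ~Location.2, ~Location.3,
  "NVIDIA",      "N1",         "x",          NA,          NA,
  "NVIDIA",      "N2",          NA,          NA,         "x",
  "Google",      "G1",          NA,         "x",          NA
)

demo_sum <- demo |>
  group_by(Company) |>
  # apply the same "count non-NA values" rule to every column in the range
  summarise(across(Location.1:Location.3, ~ sum(!is.na(.x))))

demo_sum
```

The range Location.1:Location.3 relies on the columns being adjacent; starts_with("Location") would be an alternative, but it would also pick up the Location.4 city column, which may not be what you want.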
# keep only the rows that name a city, then attach the per-company counts
data |> select(Company, `Location.4.(City.Specified)`) |> 
  filter(!is.na(`Location.4.(City.Specified)`)) |> 
  left_join(data_sum, by = "Company") |> 
  relocate(`Location.4.(City.Specified)`, .after = `Location.3`)
# A tibble: 4 × 5
  Company   Location.1 Location.2 Location.3 `Location.4.(City.Specified)`
  <chr>          <int>      <int>      <int> <chr>                        
1 NVIDIA             2          0          3 Palo Alto, CA                
2 Microsoft          1          0          0 Washington, DC               
3 Tesla              1          2          2 Princeton, NJ                
4 Tesla              1          2          2 Dallas, TX     
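Note that the result above has no row for Google, because Google has no Location 4 entry and is dropped by the filter, while Tesla appears twice. If one row per company is wanted instead, a possible variation is to summarise Location 4 in the same summarise() call. This is a sketch under assumptions: `demo` is a hypothetical, cut-down stand-in for the question's data, and the output column names `Location.4.count` and `Location.4.cities` are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the question's data: same column names, fewer rows
demo <- tibble::tribble(
  ~Company, ~Location.1, ~`Location.4.(City.Specified)`,
  "Google",         "x",                             NA,
   "Tesla",         "x",                "Princeton, NJ",
   "Tesla",          NA,                   "Dallas, TX"
)

result <- demo |>
  group_by(Company) |>
  summarise(
    Location.1 = sum(!is.na(Location.1)),
    # number of rows that specify a city (illustrative column name)
    Location.4.count = sum(!is.na(`Location.4.(City.Specified)`)),
    # the cities themselves, joined into one string (illustrative column name)
    Location.4.cities = paste(na.omit(`Location.4.(City.Specified)`), collapse = "; ")
  )

result
```

Companies without any city get a count of 0 and an empty string, so every company keeps exactly one row. Whether that matches the table in the question's image cannot be confirmed from the text alone.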