Need help using the Tidyverse in R

Question

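I have used R before, but never the tidyverse. I am trying to filter a dataset by several variables at once. How do I do that? Here is an example of the question I am trying to answer: "73 players in the 'player_data' dataset have 2010 listed as their draft year. How many of those players had a career outcome in each of the 6 categories?"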

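I tried looking at the tidyverse wiki and other resources on Google, but I still don't understand how to stack filters with the tidyverse. My school covered data visualization in R, but not really this kind of data manipulation. Thanks in advance!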

Answer 1

Congratulations on starting to learn R!

Since you didn't provide a dataset, I'll use the penguins dataset from the palmerpenguins package.

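That is probably better for you anyway, because it forces you to get experience writing the code and adapting it to your own dataset, rather than just copy-pasting mine. 😄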

# (install if necessary, and) load libraries
pacman::p_load(palmerpenguins, tidyverse)

df <- penguins |>
  filter(year == 2008) # there is no data for 2010 in this dataset :-(

# okay, let's count the number of penguins for 2008:
count(df) # 114

# of those, how many had a bill length of at least 40mm?
df |>
  filter(bill_length_mm >= 40) |>
  count() # 83

# and how many had a body mass of at least 4000g?
df |>
  filter(body_mass_g >= 4000) |>
  count() # 66

# and then, how many had both?
df |>
  filter(bill_length_mm >= 40, body_mass_g >= 4000) |>
  count() # 64

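With filter(), you can list the conditions separated by commas, as shown above, or chain them together with & (AND, i.e. both conditions must be true) or | (OR, i.e. at least one condition must be true).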
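To make the difference between the comma, & and | concrete, here is a minimal sketch on a tiny made-up table. The data frame `players` and its columns (`draft_year`, `games`, `outcome`) are hypothetical stand-ins, not the real 'player_data' columns, so you will need to swap in your own names.

```r
library(dplyr)

# Hypothetical data: names and values are invented for illustration only
players <- tibble(
  name       = c("A", "B", "C", "D", "E"),
  draft_year = c(2010, 2010, 2010, 2011, 2010),
  games      = c(500, 20, 300, 450, 0),
  outcome    = c("Starter", "Roster", "Starter", "Starter", "Out of the League")
)

# Comma-separated conditions and `&` both mean AND: every condition must hold
both_comma <- players |> filter(draft_year == 2010, games >= 100)
both_amp   <- players |> filter(draft_year == 2010 & games >= 100)

# `|` means OR: at least one condition must hold
either <- players |> filter(draft_year == 2011 | games >= 400)

# One possible reading of "how many in each category":
# filter to the draft year first, then tally rows per category
per_outcome <- players |>
  filter(draft_year == 2010) |>
  count(outcome)
```

Here `both_comma` and `both_amp` contain the same rows, while `either` also keeps players who satisfy only one of the two conditions. Whether count() per category is what the assignment wants depends on how the question is meant, so treat that last step as an assumption.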
