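I have a dataset of Airbnb listings. For each host_id and each month, I want to count how many of the host's listings are entire homes and how many are shared homes. I assume I need two additional columns that hold these counts on every row (tot_EH and tot_SH).
I have posted an image below showing what the dataset looks like and the desired output (some irrelevant columns removed). For now I have used only one host_id, but in reality there are many different ones.
The new columns are marked in red and filled in with the desired output. I can't figure out how to proceed. Any help is much appreciated!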
Got help from a colleague, and this worked:
library(dplyr)

df <- df %>%
  group_by(host_id, last_scraped) %>% # group data by host and month
  mutate(
    count_listings_in_data = length(unique(id)), # for each host/month combination, count the unique listing IDs
    count_shared_homes = length(unique(id[which(room_type_NV == "Shared home")])), # unique listing IDs whose room type is "Shared home"
    count_entire_homes = length(unique(id[which(room_type_NV == "Entire home")])) # unique listing IDs whose room type is "Entire home"
  ) %>%
  ungroup() # drop the grouping so later operations are not applied per group
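
To check the approach on something small, here is a minimal sketch on made-up data. The `listings` data frame and its values are hypothetical, the output columns use the `tot_EH` and `tot_SH` names from the question, and it assumes (as the code above does) that `last_scraped` takes one value per monthly scrape. `n_distinct()` is dplyr's shorthand for `length(unique(...))`.

```r
library(dplyr)

# Hypothetical toy data: host 10 appears in two monthly scrapes, host 20 in one.
listings <- data.frame(
  id           = c(1, 2, 3, 1, 2, 4),
  host_id      = c(10, 10, 10, 10, 10, 20),
  last_scraped = as.Date(c("2019-01-15", "2019-01-15", "2019-01-15",
                           "2019-02-15", "2019-02-15", "2019-01-15")),
  room_type_NV = c("Entire home", "Shared home", "Entire home",
                   "Entire home", "Shared home", "Shared home")
)

result <- listings %>%
  group_by(host_id, last_scraped) %>%  # one group per host and scrape date
  mutate(
    # which() drops NA comparisons, so rows with a missing room type are not counted
    tot_EH = n_distinct(id[which(room_type_NV == "Entire home")]),
    tot_SH = n_distinct(id[which(room_type_NV == "Shared home")])
  ) %>%
  ungroup()

print(result)
```

Because `mutate()` keeps every row, each listing row carries the counts for its host and month: the three January rows for host 10 all get `tot_EH = 2` and `tot_SH = 1`. If one row per host and month is preferred instead, replacing `mutate()` with `summarise()` would collapse each group to a single row.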