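I have been struggling with what I thought was a simple task.
I have a dataset with two columns holding a start date and an end date. I would like to extract all the months between the start date and the end date and list them together in a new column of the data frame. The next step would be to create dummy variables for each month listed in that column.
My input data look like this: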
Lon Lat Year Start_date End_date
70.25 40.25 2000 10/01/2009 04/30/2010
70.75 40.25 2000 05/01/2010 08/30/2010
71.00 40.25 2000 07/07/2010 11/30/2010
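For reproducibility, the sample above can be read into a data frame roughly as follows. This is only a sketch: the object name `df` is an assumption, and the dates are kept as character strings in month/day/year order.

```r
# Read the sample rows from inline text; `df` is an assumed name
df <- read.table(text = "
Lon Lat Year Start_date End_date
70.25 40.25 2000 10/01/2009 04/30/2010
70.75 40.25 2000 05/01/2010 08/30/2010
71.00 40.25 2000 07/07/2010 11/30/2010
", header = TRUE, stringsAsFactors = FALSE)

# The date columns stay as character until they are parsed explicitly
str(df)
```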
This is what I would like to obtain:
Lon Lat Year Start_date End_date Sequence
70.25 40.25 2000 10/01/2009 04/30/2010 10,11,12,1,2,3,4
70.75 40.25 2000 05/01/2010 08/30/2010 5,6,7,8
71.00 40.25 2000 07/07/2010 11/30/2010 7,8,9,10,11
The last column contains a list of all the months (as numbers) between the start date and the end date.
Here is my tentative code:
library(chron)  # seq.dates() comes from the chron package
sequence <- Map(seq.dates, start_date, end_date,
                by = "months", format = "%m/%d/%y")
The code works fine and gives me a list with all the months from the start date to the end date, which is what I was aiming for. However, I cannot work with the list afterwards: I have not found a good way to extract its values into a new column of the data frame while keeping the structure (the levels). I have tried almost everything suggested on Stack Overflow for extracting values from a list, and nothing works.
So, I want to start over and change perspective.
Is there another way to rewrite the code above so that it produces a new column attached to my data, or a vector, AND NOT A LIST? Any help would be immensely appreciated. Thanks!
One possibility involving dplyr and lubridate is:
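library(dplyr)
library(lubridate)
df %>%
  rowwise() %>%
  mutate(Sequence = list(as.integer(month(  # seq() by month wraps 12 -> 1; 10:4 would count down
    seq(floor_date(mdy(Start_date), "month"), mdy(End_date), by = "month")))))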
Lon Lat Year Start_date End_date Sequence
<dbl> <dbl> <int> <chr> <chr> <list>
1 70.2 40.2 2000 10/01/2009 04/30/2010 <int [7]>
2 70.8 40.2 2000 05/01/2010 08/30/2010 <int [4]>
3 71 40.2 2000 07/07/2010 11/30/2010 <int [5]>
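If a plain character column is preferred over a list-column, the months can be collapsed into one comma-separated string per row, and the same per-row month vectors can feed the dummy variables mentioned in the question. The following is a minimal base-R sketch, not a definitive implementation. It assumes the dates are strings in month/day/year order; `months_between()` and the `month_1` to `month_12` column names are made up for illustration and do not come from any package.

```r
# Hypothetical input mirroring the sample rows in the question
df <- data.frame(
  Lon = c(70.25, 70.75, 71.00),
  Lat = c(40.25, 40.25, 40.25),
  Year = c(2000L, 2000L, 2000L),
  Start_date = c("10/01/2009", "05/01/2010", "07/07/2010"),
  End_date = c("04/30/2010", "08/30/2010", "11/30/2010"),
  stringsAsFactors = FALSE
)

# months_between() is an illustrative helper, not a library function
months_between <- function(start, end) {
  s <- as.Date(start, format = "%m/%d/%Y")
  e <- as.Date(end, format = "%m/%d/%Y")
  # Start from the first day of the start month so the day of month never matters
  first <- as.Date(format(s, "%Y-%m-01"))
  as.integer(format(seq(first, e, by = "month"), "%m"))
}

# One integer vector of month numbers per row
month_list <- unname(Map(months_between, df$Start_date, df$End_date))

# One comma-separated string per row: a character vector, not a list
df$Sequence <- vapply(month_list, paste, character(1), collapse = ",")

# One 0/1 dummy column per calendar month
dummies <- t(vapply(month_list, function(m) as.integer(1:12 %in% m), integer(12)))
colnames(dummies) <- paste0("month_", 1:12)
df <- cbind(df, dummies)

df$Sequence
# "10,11,12,1,2,3,4" "5,6,7,8" "7,8,9,10,11"
```

Keeping `month_list` around as an intermediate means the string column and the dummy columns are derived from the same vectors, so they cannot drift apart.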