Pandas: counting rainy vs. not rainy days for each month, starting from hourly data

Question  Votes: 0  Answers: 2

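I have a large dataset (here is a link to a subset: https://drive.google.com/open?id=1o7dEsRUYZYZ2-L9pd_WFnIX1n10hSA-f) with a tstamp index (e.g. 2010-01-01 00:00:00) and the rainfall in mm. Measurements were taken every 5 minutes over several years: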

                      mm
tstamp                  
2010-01-01 00:00:00  0.0
2010-01-01 00:05:00  0.0
2010-01-01 00:10:00  0.0
2010-01-01 00:15:00  0.0
2010-01-01 00:20:00  0.0
........

What I would like to get is the number of rainy days for each month of each year. Ideally, a DataFrame like the following:

tstamp    rainy  not rainy
2010-01   11     20
2010-02   20     8
......
2012-10   15     16
2012-11   30     0

What I have managed to obtain is a nested dictionary of the form d = {year: {month: {'rainy': 10, 'not_rainy': 20}, ...}, ...}, built with this small snippet:

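from collections import defaultdict

# Assumes df has a DatetimeIndex (tstamp) and a single column 'mm'.
d = defaultdict(lambda: defaultdict(dict))

for year in df.index.year.unique():
    for month in df.index.month.unique():
        try:
            # Partial-string indexing selects one calendar month,
            # then the 5-minute readings are summed per day.
            a = df.loc['{}-{:02d}'.format(year, month)].resample('D').sum()
        except KeyError:
            # No data for this year/month combination.
            continue
        if a.empty:
            continue

        # Count days with and without rain as plain integers.
        d[year][month]['rainy'] = int((a['mm'] != 0).sum())
        d[year][month]['not_rainy'] = int((a['mm'] == 0).sum())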

But I suspect I am missing a simpler, more direct solution. Any suggestions?

python pandas resampling
2 Answers
0
votes

One approach is to do two groupby operations, first per day and then per month:

daily = df['mm'].gt(0).groupby(df.index.normalize()).any()
monthly = (daily.groupby(daily.index.to_period('M'))
                .value_counts()
                .unstack()
          )
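To see what the two steps produce, here is a small self-contained sketch on synthetic data. The sample values, the helper name `count_rainy_days`, and the column names `rainy` / `not_rainy` are assumptions made for illustration; only the two groupby calls are taken from the answer above.

```python
import pandas as pd


def count_rainy_days(df):
    """Return a frame indexed by month with 'rainy' and 'not_rainy' day counts."""
    # Step 1: a day is rainy if any reading on that day is above zero.
    daily = df['mm'].gt(0).groupby(df.index.normalize()).any()
    # Step 2: count True/False days per calendar month.
    monthly = (daily.groupby(daily.index.to_period('M'))
                    .value_counts()
                    .unstack(fill_value=0))
    # Make sure both columns exist even if one outcome never occurs.
    monthly = monthly.reindex(columns=[True, False], fill_value=0)
    monthly.columns = ['rainy', 'not_rainy']
    return monthly


# Synthetic 5-minute data for January and February 2010 (illustrative only).
idx = pd.date_range('2010-01-01', '2010-02-28 23:55', freq='5min', name='tstamp')
df = pd.DataFrame({'mm': 0.0}, index=idx)
df.loc['2010-01-03 10:00', 'mm'] = 0.4
df.loc['2010-01-03 10:05', 'mm'] = 0.2
df.loc['2010-01-17 22:30', 'mm'] = 1.0
df.loc['2010-02-09 06:15', 'mm'] = 0.6

result = count_rainy_days(df)
print(result)
```

The PeriodIndex produced by `to_period('M')` prints as 2010-01, 2010-02, which matches the layout asked for in the question.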

0
votes

You can do the following. In the monthly totals I do not see any month without rain:

import pandas as pd

df = pd.read_csv('rain.csv')
df['tstamp'] = pd.to_datetime(df['tstamp'])
df['month'] = df['tstamp'].dt.month
df['year'] = df['tstamp'].dt.year
# Sum only the rainfall column, leaving tstamp out of the aggregation.
df = df.groupby(by=['year', 'month'], as_index=False)['mm'].sum()
print(df)

Output:

    year  month     mm
0   2010      1    1.0
1   2010      2   15.4
2   2010      3   21.8
3   2010      4    9.6
4   2010      5  118.4
5   2010      6   82.8
6   2010      7   96.0
7   2010      8  161.6
8   2010      9  109.2
9   2010     10   51.2
10  2010     11   52.4
11  2010     12   39.6
12  2011      1    5.6
13  2011      2    0.8
14  2011      3   13.4
15  2011      4    1.8
16  2011      5   97.6
17  2011      6  167.8
18  2011      7  128.8
19  2011      8   67.6
20  2011      9  155.8
21  2011     10   71.6
22  2011     11    0.4
23  2011     12   29.4
24  2012      1   17.6
25  2012      2    2.2
26  2012      3   13.0
27  2012      4   55.8
28  2012      5   36.8
29  2012      6  108.4
30  2012      7  182.4
31  2012      8  191.8
32  2012      9   89.0
33  2012     10   93.6
34  2012     11  161.2
35  2012     12   26.4
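The monthly totals above give the amount of rain, not the number of rainy days the question asks for. One possible way to adapt the year/month grouping is to sum per day first, flag each day as rainy or not, and only then aggregate per month. This is a minimal sketch on synthetic data; the sample values and the names `daily_total`, `days`, and `summary` are assumptions for illustration.

```python
import pandas as pd

# Synthetic 5-minute readings for two months (illustrative only).
idx = pd.date_range('2010-01-01', '2010-02-28 23:55', freq='5min', name='tstamp')
df = pd.DataFrame({'mm': 0.0}, index=idx)
df.loc['2010-01-05 08:00', 'mm'] = 0.8
df.loc['2010-01-05 08:05', 'mm'] = 0.2
df.loc['2010-02-10 14:00', 'mm'] = 2.4
df.loc['2010-02-11 03:30', 'mm'] = 0.6
df.loc['2010-02-20 19:45', 'mm'] = 1.2

# Total rainfall per calendar day. Days without any readings get a
# total of 0 and are therefore counted as not rainy.
daily_total = df['mm'].resample('D').sum()

# One indicator column per outcome, so a plain sum counts the days.
days = pd.DataFrame({
    'rainy': daily_total.gt(0).astype(int),
    'not_rainy': daily_total.eq(0).astype(int),
})

# Group the daily flags by (year, month) and count each kind of day.
keys = [days.index.year.rename('year'), days.index.month.rename('month')]
summary = days.groupby(keys).sum().reset_index()
print(summary)
```

Because the indicators are summed, `rainy + not_rainy` equals the number of calendar days in each month, which is a quick sanity check on the result.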