Pandas Groupby: get values from the previous group


Given the following:

import pandas as pd
import numpy as np
df = pd.DataFrame({'a':['a','a','b','b','c','c'],'b':[1,1,np.nan,np.nan,1,1]})

df
   a    b
0  a  1.0
1  a  1.0
2  b  NaN
3  b  NaN
4  c  1.0
5  c  1.0

I need to create a new column ('c') by grouping on 'a' and shifting the values of 'b' down from the previous group, like this:

   a    b    c
0  a  1.0  NaN
1  a  1.0  NaN
2  b  NaN  1.0
3  b  NaN  1.0
4  c  1.0  NaN
5  c  1.0  NaN

I tried this, but it only forward fills within each group, so nothing happens because there is nothing to fill from inside each group:

df.groupby('a')['b'].ffill()
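
A quick way to confirm that behaviour is to compare the group-wise forward fill with the original column. This is only a minimal sketch on the sample frame above; the names `filled` and `unchanged` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['a', 'a', 'b', 'b', 'c', 'c'],
                   'b': [1, 1, np.nan, np.nan, 1, 1]})

# Forward fill restricted to each group: group 'b' holds only NaN,
# so there is no earlier value inside the group to propagate
filled = df.groupby('a')['b'].ffill()

# Series.equals treats NaN in the same positions as equal
unchanged = filled.equals(df['b'])
```

Here `unchanged` is True, which matches the observation that the group-wise fill cannot pull values across group boundaries.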

1 Answer

Here is a solution that uses GroupBy.first and Series.shift to map the first value of each previous group:

df['c'] = df['a'].map(df.groupby('a')['b'].first().shift())
print (df)
   a    b    c
0  a  1.0  NaN
1  a  1.0  NaN
2  b  NaN  1.0
3  b  NaN  1.0
4  c  1.0  NaN
5  c  1.0  NaN
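
One caveat that may be worth checking: groupby sorts the group keys by default, so "previous group" in the line above means previous in sorted key order, and GroupBy.first returns the first non-null value of each group. If the groups should instead follow their order of appearance, passing sort=False is one option. The sketch below uses hypothetical data (keys 'z', 'y', 'x' and the names `prev_sorted`, `prev_seen` are not from the question) to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical data: groups appear in the order 'z', 'y', 'x'
df = pd.DataFrame({'a': ['z', 'z', 'y', 'y', 'x', 'x'],
                   'b': [1, 1, np.nan, np.nan, 3, 3]})

# Default: keys are sorted ('x', 'y', 'z'), so 'y' receives the value of 'x'
prev_sorted = df.groupby('a')['b'].first().shift()
c_sorted = df['a'].map(prev_sorted)

# sort=False: keys keep their order of first appearance ('z', 'y', 'x'),
# so 'y' receives the value of 'z'
prev_seen = df.groupby('a', sort=False)['b'].first().shift()
c_seen = df['a'].map(prev_seen)
```

With this data the rows of group 'y' get 3.0 in `c_sorted` and 1.0 in `c_seen`. For the frame in the question both variants agree, because its keys are already in sorted order.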

Another idea, for mapping row by row within each group, is to create a helper column with GroupBy.cumcount and use a left join:

df = pd.DataFrame({'a':['a','a','b','b','c','c'],'b':[1,2,np.nan,np.nan,1,1]})

# map each group key to the key of the group that follows it
d = dict(zip(df['a'], df['a'].shift(-1)))

# g is the position of each row within its group
df1 = df.assign(g = df.groupby('a').cumcount())
df = df1.merge(df1.assign(a = df['a'].map(d)).rename(columns={'b':'c'}),
               on=['a','g'], how='left').drop('g', axis=1)
print (df)
   a    b    c
0  a  1.0  NaN
1  a  2.0  NaN
2  b  NaN  1.0
3  b  NaN  2.0
4  c  1.0  NaN
5  c  1.0  NaN
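
The same join can be written in smaller steps, which may make it easier to follow. This is only a sketch of one possible restructuring, assuming each group occupies a contiguous block of rows; the names `keys`, `next_key`, `left`, `right` and `out` are introduced here and are not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['a', 'a', 'b', 'b', 'c', 'c'],
                   'b': [1, 2, np.nan, np.nan, 1, 1]})

# Map each group key to the key of the group that follows it
keys = df['a'].unique()                      # order of first appearance
next_key = dict(zip(keys[:-1], keys[1:]))    # {'a': 'b', 'b': 'c'}

# Position of each row within its group, used as the second join key
left = df.assign(g=df.groupby('a').cumcount())

# Relabel every row with the key of the next group, so that its 'b'
# value arrives there as 'c'; rows of the last group have no successor
right = (left.assign(a=left['a'].map(next_key))
             .rename(columns={'b': 'c'})
             .dropna(subset=['a']))

out = left.merge(right, on=['a', 'g'], how='left').drop(columns='g')
```

Because the join is on both the key and the within-group position, a group that has more rows than the one before it gets NaN in 'c' for the extra rows.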