Collecting common groups on a non-index column across two DataFrames


Here are two DataFrames, grouped the way I want:

import numpy as np
import pandas as pd

last5s = pd.Timestamp.now().replace(microsecond=0) - pd.Timedelta('5s')
dates = pd.date_range(last5s, periods = 5, freq='s')

N=10
data1 = np.random.randint(0,10,N)
data2 = np.random.randint(0,10,N)

df1 = pd.DataFrame({'timestamp': np.random.choice(dates, size=N), 'A': data1})
df2 = pd.DataFrame({'timestamp': np.random.choice(dates, size=N), 'B': data2})

print(df1)
print(df2)
print()

g1 = df1.groupby(pd.Grouper(key='timestamp', freq='1s'))
print("g1:")
for time, group in g1:
    print('time:', time)
    print(group)
    print()
    
print()
g2 = df2.groupby(pd.Grouper(key='timestamp', freq='1s'))
print('g2:')
for time, group in g2:
    print('time:', time)
    print(group)
    print()

Output (for example):

            timestamp  A
0 2024-03-01 10:05:26  7
1 2024-03-01 10:05:25  8
2 2024-03-01 10:05:28  1
3 2024-03-01 10:05:24  2
4 2024-03-01 10:05:28  5
5 2024-03-01 10:05:27  4
6 2024-03-01 10:05:24  6
7 2024-03-01 10:05:26  3
8 2024-03-01 10:05:26  8
9 2024-03-01 10:05:28  8
            timestamp  B
0 2024-03-01 10:05:25  1
1 2024-03-01 10:05:26  6
2 2024-03-01 10:05:25  5
3 2024-03-01 10:05:28  7
4 2024-03-01 10:05:27  7
5 2024-03-01 10:05:28  1
6 2024-03-01 10:05:28  4
7 2024-03-01 10:05:25  0
8 2024-03-01 10:05:24  6
9 2024-03-01 10:05:24  5

g1:
time: 2024-03-01 10:05:24
            timestamp  A
3 2024-03-01 10:05:24  2
6 2024-03-01 10:05:24  6

time: 2024-03-01 10:05:25
            timestamp  A
1 2024-03-01 10:05:25  8

time: 2024-03-01 10:05:26
            timestamp  A
0 2024-03-01 10:05:26  7
7 2024-03-01 10:05:26  3
8 2024-03-01 10:05:26  8

time: 2024-03-01 10:05:27
            timestamp  A
5 2024-03-01 10:05:27  4

time: 2024-03-01 10:05:28
            timestamp  A
2 2024-03-01 10:05:28  1
4 2024-03-01 10:05:28  5
9 2024-03-01 10:05:28  8


g2:
time: 2024-03-01 10:05:24
            timestamp  B
8 2024-03-01 10:05:24  6
9 2024-03-01 10:05:24  5

time: 2024-03-01 10:05:25
            timestamp  B
0 2024-03-01 10:05:25  1
2 2024-03-01 10:05:25  5
7 2024-03-01 10:05:25  0

time: 2024-03-01 10:05:26
            timestamp  B
1 2024-03-01 10:05:26  6

time: 2024-03-01 10:05:27
            timestamp  B
4 2024-03-01 10:05:27  7

time: 2024-03-01 10:05:28
            timestamp  B
3 2024-03-01 10:05:28  7
5 2024-03-01 10:05:28  1
6 2024-03-01 10:05:28  4

How can I "join" these groups so that I can iterate over them together? For example, I would like to be able to write:

for time, group1, group2 in somehow_joined(g1, g2):
    ...  # do stuff with group1 and group2 in this common time group
python pandas dataframe group-by
1 Answer

I hope I have understood your question correctly. You can use itertools.groupby:

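from itertools import groupby

g1 = df1.groupby(pd.Grouper(key="timestamp", freq="1s"))
g2 = df2.groupby(pd.Grouper(key="timestamp", freq="1s"))

# Pool the (time, group) pairs from both groupbys and sort them by time;
# itertools.groupby only merges adjacent items, so the sort is required.
pairs = sorted([*g1, *g2], key=lambda k: k[0])

for t, g in groupby(pairs, key=lambda k: k[0]):
    print(t)
    print("-" * 80)
    # g yields every (time, group) pair that shares the timestamp t
    for _, group in g:
        print(group)
    print()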

Prints (for example):

2024-03-01 00:14:25
--------------------------------------------------------------------------------
            timestamp  A
7 2024-03-01 00:14:25  0
9 2024-03-01 00:14:25  7
            timestamp  B
1 2024-03-01 00:14:25  0
3 2024-03-01 00:14:25  4
7 2024-03-01 00:14:25  1

2024-03-01 00:14:26
--------------------------------------------------------------------------------
            timestamp  A
5 2024-03-01 00:14:26  5
            timestamp  B
2 2024-03-01 00:14:26  4
5 2024-03-01 00:14:26  0
6 2024-03-01 00:14:26  9

2024-03-01 00:14:27
--------------------------------------------------------------------------------
            timestamp  A
0 2024-03-01 00:14:27  4
4 2024-03-01 00:14:27  6
            timestamp  B
4 2024-03-01 00:14:27  4
8 2024-03-01 00:14:27  6

2024-03-01 00:14:28
--------------------------------------------------------------------------------
            timestamp  A
1 2024-03-01 00:14:28  5
8 2024-03-01 00:14:28  8
            timestamp  B
0 2024-03-01 00:14:28  0

2024-03-01 00:14:29
--------------------------------------------------------------------------------
            timestamp  A
2 2024-03-01 00:14:29  5
3 2024-03-01 00:14:29  4
6 2024-03-01 00:14:29  1
            timestamp  B
9 2024-03-01 00:14:29  6
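
If you want exactly the loop shape from the question (time, group1, group2), one possible sketch is a small generator that looks up each group of the first groupby in a dict built from the second. The name joined_groups and the tiny sample frames are made up for illustration, and skipping empty time bins is my assumption about what "common" should mean here:

```python
import pandas as pd


def joined_groups(g1, g2):
    """Yield (time, group1, group2) for each key with a non-empty group in both groupbys."""
    # Index the second groupby by key, ignoring empty time bins.
    groups2 = {key: grp for key, grp in g2 if not grp.empty}
    for key, grp1 in g1:
        if grp1.empty:
            continue
        grp2 = groups2.get(key)
        if grp2 is not None:
            yield key, grp1, grp2


# Small deterministic frames (hypothetical data, not the question's random sample).
base = pd.Timestamp("2024-03-01 10:05:24")
df1 = pd.DataFrame({
    "timestamp": [base, base, base + pd.Timedelta("2s")],
    "A": [1, 2, 3],
})
df2 = pd.DataFrame({
    "timestamp": [base, base + pd.Timedelta("1s"), base + pd.Timedelta("2s")],
    "B": [4, 5, 6],
})

g1 = df1.groupby(pd.Grouper(key="timestamp", freq="1s"))
g2 = df2.groupby(pd.Grouper(key="timestamp", freq="1s"))

result = list(joined_groups(g1, g2))
for time, group1, group2 in result:
    print("time:", time)
    print(group1)
    print(group2)
    print()
```

This behaves like an inner join on the group key: a second that has rows in only one of the frames is skipped. If you need those seconds too, you could loop over the union of keys instead and pass an empty frame for the missing side.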