I have two DataFrames that I am trying to join in a particular way to produce a third DataFrame.
I am doing this in Python.
df1:
Date A B C
27/04/2023 00:00 55.66371155 0.321940005 0.145006001
28/04/2023 00:00 57.14315796 0.327948987 0.147119999
29/04/2023 00:00 57.3841629 0.322167009 0.146118999
30/04/2023 00:00 57.12593079 0.320318013 0.145880997
df2:
Date change Ticker
27/04/2023 00:00 0.1 A
28/04/2023 00:00 -0.1 A
29/04/2023 00:00 0.2 A
30/04/2023 00:00 -0.2 A
27/04/2023 00:00 0.3 B
28/04/2023 00:00 -0.3 B
29/04/2023 00:00 0.4 B
30/04/2023 00:00 -0.4 B
27/04/2023 00:00 0.5 C
28/04/2023 00:00 -0.5 C
29/04/2023 00:00 0.6 C
30/04/2023 00:00 -0.6 C
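I want to join the two DataFrames so that each value in df1 is attached to the df2 row that has the same Date and whose Ticker equals the df1 column name.
The result should look like this: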
Date change Ticker Px
27/04/2023 00:00 0.1 A 55.66371155
28/04/2023 00:00 -0.1 A 57.14315796
29/04/2023 00:00 0.2 A 57.3841629
30/04/2023 00:00 -0.2 A 57.12593079
27/04/2023 00:00 0.3 B 0.321940005
28/04/2023 00:00 -0.3 B 0.327948987
29/04/2023 00:00 0.4 B 0.322167009
30/04/2023 00:00 -0.4 B 0.320318013
27/04/2023 00:00 0.5 C 0.145006001
28/04/2023 00:00 -0.5 C 0.147119999
29/04/2023 00:00 0.6 C 0.146118999
30/04/2023 00:00 -0.6 C 0.145880997
I have tried using .iloc[], but I could not get it to work.
Try the following:
import pandas as pd
import numpy as np
# Sample data for df1
data1 = {
    'Date': ['27/04/2023 00:00', '28/04/2023 00:00', '29/04/2023 00:00', '30/04/2023 00:00'],
    'A': [55.66371155, 57.14315796, 57.3841629, 57.12593079],
    'B': [0.321940005, 0.327948987, 0.322167009, 0.320318013],
    'C': [0.145006001, 0.147119999, 0.146118999, 0.145880997]
}
df1 = pd.DataFrame(data1)
# Parse 'Date' as day-first datetimes; an explicit format avoids month/day ambiguity
df1['Date'] = pd.to_datetime(df1['Date'], format='%d/%m/%Y %H:%M')
# Sample data for df2
data2 = {
    'Date': ['27/04/2023 00:00', '28/04/2023 00:00', '29/04/2023 00:00', '30/04/2023 00:00'] * 3,
    'change': [0.1, -0.1, 0.2, -0.2, 0.3, -0.3, 0.4, -0.4, 0.5, -0.5, 0.6, -0.6],
    'Ticker': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C']
}
df2 = pd.DataFrame(data2)
# Parse 'Date' the same way as in df1 so the two columns compare equal
df2['Date'] = pd.to_datetime(df2['Date'], format='%d/%m/%Y %H:%M')
# Use NumPy integer indexing to look up each (Date, Ticker) pair of df2 in df1.
# Assumes every pair exists in df1; get_indexer returns -1 for a missing label.
prices = df1.set_index('Date')
row_idx = prices.index.get_indexer(df2['Date'])      # Row position of each date
col_idx = prices.columns.get_indexer(df2['Ticker'])  # Column position of each ticker
df2['Px'] = prices.to_numpy()[row_idx, col_idx]
print(df2)
Explanation:
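One way to read the requested join is as a reshape followed by a merge: df1 is turned from wide to long form so that each row holds one (Date, Ticker, Px) triple, and that long table is then merged onto df2 by Date and Ticker. The sketch below is an illustration of that idea, not necessarily the approach used above. It assumes small stand-in frames built from the first two dates and the tickers A and B of the question's data, and the name long_px is hypothetical.

```python
import pandas as pd

# Stand-in frames with the same layout as df1 and df2 (first two dates, tickers A and B)
df1 = pd.DataFrame({
    "Date": ["27/04/2023 00:00", "28/04/2023 00:00"],
    "A": [55.66371155, 57.14315796],
    "B": [0.321940005, 0.327948987],
})
df2 = pd.DataFrame({
    "Date": ["27/04/2023 00:00", "28/04/2023 00:00"] * 2,
    "change": [0.1, -0.1, 0.3, -0.3],
    "Ticker": ["A", "A", "B", "B"],
})

# Parse both Date columns the same way so they compare equal in the merge
for frame in (df1, df2):
    frame["Date"] = pd.to_datetime(frame["Date"], format="%d/%m/%Y %H:%M")

# Wide -> long: one row per (Date, Ticker) with the price in 'Px'
long_px = df1.melt(id_vars="Date", var_name="Ticker", value_name="Px")

# Left merge keeps every df2 row in its original order; unmatched pairs get NaN
result = df2.merge(long_px, on=["Date", "Ticker"], how="left")
print(result)
```

Compared with positional indexing, the merge matches on labels, so a Date or Ticker that is absent from df1 shows up as NaN in Px instead of silently picking the wrong cell.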