I have two tables. The first is a reference table:
| Name | Target | Bonus |
|------|--------:|------:|
| Joe | 40 | 46 |
| Phil | 38 | 42 |
| Dean | 65 | 70 |
The Python code that generates this table is:
import pandas as pd

# Data for the table
data = {
    'Name': ['Joe', 'Phil', 'Dean'],
    'Target': [40, 38, 65],
    'Bonus': [46, 42, 70]
}
# Creating the DataFrame
ref = pd.DataFrame(data)
My second table looks like this:
| week | Metrics | Joe | Dean |
|------------|---------|----:|-----:|
| 11/6/2023 | Target | 40 | 65 |
| 11/6/2023 | Bonus | 46 | 70 |
| 11/6/2023 | Score | 33 | 71 |
| 11/13/2023 | Target | 40 | NaN |
| 11/13/2023 | Bonus | 46 | NaN |
| 11/13/2023 | Score | 45 | NaN |
| 11/20/2023 | Target | 40 | 65 |
| 11/20/2023 | Bonus | 46 | 70 |
| 11/20/2023 | Score | 35 | 68 |
| 11/27/2023 | Target | NaN | 65 |
| 11/27/2023 | Bonus | NaN | 70 |
| 11/27/2023 | Score | NaN | 44 |
| 12/4/2023 | Target | 40 | 65 |
| 12/4/2023 | Bonus | 46 | 70 |
| 12/4/2023 | Score | 42 | 66 |
The Python code that generates this table is:
# Data for the new table
data = {
    'week': ['11/6/2023', '11/6/2023', '11/6/2023', '11/13/2023', '11/13/2023', '11/13/2023',
             '11/20/2023', '11/20/2023', '11/20/2023', '11/27/2023', '11/27/2023', '11/27/2023',
             '12/4/2023', '12/4/2023', '12/4/2023'],
    'Metrics': ['Target', 'Bonus', 'Score', 'Target', 'Bonus', 'Score',
                'Target', 'Bonus', 'Score', 'Target', 'Bonus', 'Score',
                'Target', 'Bonus', 'Score'],
    'Joe': [40, 46, 33, 40, 46, 45, 40, 46, 35, None, None, None, 40, 46, 42],
    'Dean': [65, 70, 71, None, None, None, 65, 70, 68, 65, 70, 44, 65, 70, 66]
}
# Creating the DataFrame
df = pd.DataFrame(data)
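As you can see, Dean has one week (11/13/2023) where the Target, Bonus, and Score cells are blank. Joe has the same gap in a later week (11/27/2023). In these specific cases where the cells are NaN, I want to fill them using the following rules: a missing Target or Bonus takes that person's value from the reference table, and a missing Score is set to that person's Target.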
My desired output table looks like this:
| week | Metrics | Joe | Dean |
|------------|---------|----:|-----:|
| 11/6/2023 | Target | 40 | 65 |
| 11/6/2023 | Bonus | 46 | 70 |
| 11/6/2023 | Score | 33 | 71 |
| 11/13/2023 | Target | 40 | 65 |
| 11/13/2023 | Bonus | 46 | 70 |
| 11/13/2023 | Score | 45 | 65 |
| 11/20/2023 | Target | 40 | 65 |
| 11/20/2023 | Bonus | 46 | 70 |
| 11/20/2023 | Score | 35 | 68 |
| 11/27/2023 | Target | 40 | 65 |
| 11/27/2023 | Bonus | 46 | 70 |
| 11/27/2023 | Score | 40 | 44 |
| 12/4/2023 | Target | 40 | 65 |
| 12/4/2023 | Bonus | 46 | 70 |
| 12/4/2023 | Score | 42 | 66 |
I've referred to the second DataFrame as df2 so that it cannot be confused with the reference table, which stays ref as in your code:
# Work on a copy of the weekly table under its new name
df2 = df.copy()
# Iterate over each row in df2
for i, row in df2.iterrows():
    # For each person
    for person in ['Joe', 'Dean']:
        # If the value is NaN
        if pd.isnull(row[person]):
            # If the metric is 'Score', use the 'Target' value
            if row['Metrics'] == 'Score':
                value = ref.loc[ref['Name'] == person, 'Target'].values[0]
            # Otherwise, check if the metric exists in ref and use its value
            elif row['Metrics'] in ref.columns:
                value = ref.loc[ref['Name'] == person, row['Metrics']].values[0]
            else:
                continue  # Skip if the metric is not in ref and is not 'Score'
            # Replace the NaN value in df2
            df2.at[i, person] = value
This should work for your purposes.
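If the weekly table grows, the row-by-row loop above can get slow. Below is a minimal sketch of a vectorized alternative that applies the same fill rules (Target and Bonus from the reference table, Score falling back to Target). It rebuilds both tables from the question's data so it runs on its own. The function name `fill_from_reference` and its parameters are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Reference table and weekly table, rebuilt from the question's data.
ref = pd.DataFrame({
    'Name': ['Joe', 'Phil', 'Dean'],
    'Target': [40, 38, 65],
    'Bonus': [46, 42, 70],
})
weeks = ['11/6/2023', '11/13/2023', '11/20/2023', '11/27/2023', '12/4/2023']
df2 = pd.DataFrame({
    'week': [w for w in weeks for _ in range(3)],
    'Metrics': ['Target', 'Bonus', 'Score'] * 5,
    'Joe': [40, 46, 33, 40, 46, 45, 40, 46, 35, None, None, None, 40, 46, 42],
    'Dean': [65, 70, 71, None, None, None, 65, 70, 68, 65, 70, 44, 65, 70, 66],
})


def fill_from_reference(weekly, reference, people):
    """Return a copy of `weekly` with NaNs filled from `reference` (hypothetical helper)."""
    lookup = reference.set_index('Name')
    # A missing Score falls back to Target; every other metric maps to itself.
    source_col = weekly['Metrics'].where(weekly['Metrics'] != 'Score', 'Target')
    filled = weekly.copy()
    for person in people:
        # One fallback value per row; metrics absent from the reference map to NaN,
        # so those cells are left untouched, as in the loop version.
        fallback = source_col.map(lookup.loc[person])
        filled[person] = filled[person].fillna(fallback)
    return filled


result = fill_from_reference(df2, ref, ['Joe', 'Dean'])
print(result)
```

The function returns a new DataFrame rather than modifying `df2` in place, so the original table with its NaN cells stays available for comparison. The person columns remain floating point, because the NaN values forced that dtype when the table was created.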