Fill NaN cells in a DataFrame from a reference table based on specific row and column values

Question

I have two tables. The first is a reference table:

| Name | Target  | Bonus |
|------|--------:|------:|
| Joe  |      40 |    46 |
| Phil |      38 |    42 |
| Dean |      65 |    70 |

The Python code that generates this table is:

import pandas as pd

# Data for the table
data = {
    'Name': ['Joe', 'Phil', 'Dean'],
    'Target': [40, 38, 65],
    'Bonus': [46, 42, 70]
}

# Creating the DataFrame
ref = pd.DataFrame(data)

My second table looks like this:

| week       | Metrics | Joe | Dean |
|------------|---------|----:|-----:|
| 11/6/2023  | Target  |  40 |   65 |
| 11/6/2023  | Bonus   |  46 |   70 |
| 11/6/2023  | Score   |  33 |   71 |
| 11/13/2023 | Target  |  40 |  NaN |
| 11/13/2023 | Bonus   |  46 |  NaN |
| 11/13/2023 | Score   |  45 |  NaN |
| 11/20/2023 | Target  |  40 |   65 |
| 11/20/2023 | Bonus   |  46 |   70 |
| 11/20/2023 | Score   |  35 |   68 |
| 11/27/2023 | Target  | NaN |   65 |
| 11/27/2023 | Bonus   | NaN |   70 |
| 11/27/2023 | Score   | NaN |   44 |
| 12/4/2023  | Target  |  40 |   65 |
| 12/4/2023  | Bonus   |  46 |   70 |
| 12/4/2023  | Score   |  42 |   66 |

The Python code that generates this table is:

# Data for the new table
data = {
    'week': ['11/6/2023', '11/6/2023', '11/6/2023', '11/13/2023', '11/13/2023', '11/13/2023',
             '11/20/2023', '11/20/2023', '11/20/2023', '11/27/2023', '11/27/2023', '11/27/2023',
             '12/4/2023', '12/4/2023', '12/4/2023'],
    'Metrics': ['Target', 'Bonus', 'Score', 'Target', 'Bonus', 'Score',
                'Target', 'Bonus', 'Score', 'Target', 'Bonus', 'Score',
                'Target', 'Bonus', 'Score'],
    'Joe': [40, 46, 33, 40, 46, 45, 40, 46, 35, None, None, None, 40, 46, 42],
    'Dean': [65, 70, 71, None, None, None, 65, 70, 68, 65, 70, 44, 65, 70, 66]
}

# Creating the DataFrame
df = pd.DataFrame(data)

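As you can see, Dean has one week (11/13/2023) in which the Target, Bonus, and Score cells are blank, and Joe has the same in a later week (11/27/2023). Wherever a cell is NaN, I want to fill it using the following rules:

Take each person's Target and Bonus values from the first (reference) table and fill the corresponding NaN cells.
Set the Score cell equal to that person's Target value.

My desired output table would look like this: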

| week       | Metrics | Joe | Dean |
|------------|---------|----:|-----:|
| 11/6/2023  | Target  |  40 |   65 |
| 11/6/2023  | Bonus   |  46 |   70 |
| 11/6/2023  | Score   |  33 |   71 |
| 11/13/2023 | Target  |  40 |   65 |
| 11/13/2023 | Bonus   |  46 |   70 |
| 11/13/2023 | Score   |  45 |   65 |
| 11/20/2023 | Target  |  40 |   65 |
| 11/20/2023 | Bonus   |  46 |   70 |
| 11/20/2023 | Score   |  35 |   68 |
| 11/27/2023 | Target  |  40 |   65 |
| 11/27/2023 | Bonus   |  46 |   70 |
| 11/27/2023 | Score   |  40 |   44 |
| 12/4/2023  | Target  |  40 |   65 |
| 12/4/2023  | Bonus   |  46 |   70 |
| 12/4/2023  | Score   |  42 |   66 |

Answer 1

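I renamed the second DataFrame to df2, since the two cannot share a name. In the code below, df is the reference table (the one with the Name column) and df2 is the weekly table that contains the NaN cells: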

# Iterate over each row in df2
for i, row in df2.iterrows():
    # For each person
    for person in ['Joe', 'Dean']:
        # If the value is NaN
        if pd.isnull(row[person]):
            # If the metric is 'Score', use the 'Target' value
            if row['Metrics'] == 'Score':
                value = df.loc[df['Name'] == person, 'Target'].values[0]
            # Otherwise, check if the metric exists in df and use its value
            elif row['Metrics'] in df.columns:
                value = df.loc[df['Name'] == person, row['Metrics']].values[0]
            else:
                continue  # Skip if the metric is not in df and is not 'Score'
            # Replace the NaN value in df2
            df2.at[i, person] = value

This should work for your purposes.
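
For larger tables, the row-by-row loop could probably be replaced by a vectorized fill. The following is only a sketch under a few assumptions: the reference table is called ref as in the question, the weekly table is called df2 as in this answer, the sample data is shortened to three weeks, and the names lookup, people, fill, and filled are introduced here purely for illustration. It turns the reference table into a per-metric lookup, adds a Score row equal to Target, and passes the aligned result to fillna so that only NaN cells change.

```python
import pandas as pd

# Reference table from the question
ref = pd.DataFrame({
    'Name': ['Joe', 'Phil', 'Dean'],
    'Target': [40, 38, 65],
    'Bonus': [46, 42, 70],
})

# Weekly table, shortened to one complete week and the two weeks with NaN
df2 = pd.DataFrame({
    'week': ['11/6/2023'] * 3 + ['11/13/2023'] * 3 + ['11/27/2023'] * 3,
    'Metrics': ['Target', 'Bonus', 'Score'] * 3,
    'Joe': [40, 46, 33, 40, 46, 45, None, None, None],
    'Dean': [65, 70, 71, None, None, None, 65, 70, 44],
})

# Lookup with one row per metric and one column per person
lookup = ref.set_index('Name').T
# Rule 2: a missing Score falls back to that person's Target
lookup.loc['Score'] = lookup.loc['Target']

# Only fill columns for people who appear in both tables
people = [col for col in df2.columns if col in lookup.columns]

# One lookup row per row of df2, matched on the Metrics value
fill = lookup.reindex(df2['Metrics'])[people]
fill.index = df2.index  # align row labels with df2 so fillna matches cell by cell

# fillna only touches NaN cells; existing values are left unchanged
filled = df2.copy()
filled[people] = df2[people].fillna(fill)

print(filled)
```

Because fillna aligns on both index and columns, existing scores such as Joe's 33 in the first week are left alone. A metric that is missing from the lookup would simply stay NaN, since reindex returns NaN for unknown labels rather than raising.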
