Pandas: How can I "merge" the contents of two rows with alternating NaNs without adding extra columns?


I have a dataset in which some columns contain either data or NaN:

import numpy as np
import pandas as pd

# pd.np was removed in pandas 2.0, so NaN comes from NumPy directly.
rows_dict = {'category': {305: 'Seasonings, Condiments, Toppings & Sauces',
                          536: 'Seasonings, Condiments, Toppings & Sauces',
                          627: 'Commercial Snacks'},
             'histamine_level': {305: np.nan, 536: np.nan,
                                 627: np.nan},
             'food_name': {305: 'Peppermint', 536: 'Peppermint',
                           627: 'Peppermint flavored candy'},
             'oxalate_level': {305: 'Low', 536: np.nan, 627: np.nan},
             'salicylate_level': {305: np.nan, 536: 'Very High',
                                  627: 'High'}}
pd.DataFrame(rows_dict)

[image: PreDataFrame]

So I am trying to "merge" the rows that show this pattern. To do that, I wrote a function that tries to take advantage of how the or operator behaves:

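def merge_2_rows(df, left_index, right_index):
    row_dict = {}
    columns_list = df.columns
    for column_name in columns_list:
        # Intended: take the left value, or the right one if the left is missing.
        row_dict[column_name] = (df.loc[left_index, column_name]
                                 or df.loc[right_index, column_name])
    # Drop the two source rows, then append the combined row.
    match_series = df.index.isin([left_index, right_index])
    df = df[~match_series]
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
    new_row = pd.DataFrame([row_dict], columns=columns_list)
    df = pd.concat([df, new_row], ignore_index=True)

    return df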

But when I run merge_2_rows(df=a_copy_of_the_above_df, left_index=305, right_index=536), I get:

[image: output DataFrame]

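If the first index holds NaN, the or expression stops there and never checks the second index, because NaN is truthy in Python. So this does not work. I have looked at pd.merge, and there may be a Series function that does this, but I could not find it. How can I merge the contents of two rows with alternating NaNs without adding extra columns?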
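A minimal sketch of why or fails here, using two standalone values rather than the DataFrame (the names left, right, picked and fixed are only illustrative):

```python
import numpy as np
import pandas as pd

left, right = np.nan, "Very High"

# NaN is a non-zero float, so Python treats it as truthy.
nan_is_truthy = bool(left)

# `or` returns its first truthy operand, so the NaN is returned
# and `right` is never evaluated.
picked = left or right

# An explicit null check selects the value that is actually present.
fixed = right if pd.isna(left) else left

print(nan_is_truthy, picked, fixed)
```

The same thing happens for every cell where the left row is NaN, which is why the merged row keeps the NaN instead of the right row's value.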

python pandas
1 Answer
0 votes

Based on a hint from @Yuca, this change to merge_2_rows is simpler and actually works:
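The answer's own snippet is not reproduced here. As a hedged sketch of what such a change could look like, assuming the hint pointed toward a null-aware fill (an assumption, not the answerer's confirmed code), Series.combine_first can replace the or loop. Unlike the original function, this version keeps the merged row in the left row's position instead of appending it at the end.

```python
import numpy as np
import pandas as pd


def merge_2_rows(df, left_index, right_index):
    """Fill NaN cells of the left row from the right row, then drop the right row."""
    out = df.copy()  # leave the caller's DataFrame untouched
    # combine_first keeps non-null values from the left row and
    # falls back to the right row wherever the left one is NaN.
    out.loc[left_index] = out.loc[left_index].combine_first(out.loc[right_index])
    return out.drop(index=right_index).reset_index(drop=True)


# Small example frame modeled on the question's data (trimmed for brevity).
df = pd.DataFrame({
    'food_name': {305: 'Peppermint', 536: 'Peppermint',
                  627: 'Peppermint flavored candy'},
    'histamine_level': {305: np.nan, 536: np.nan, 627: np.nan},
    'oxalate_level': {305: 'Low', 536: np.nan, 627: np.nan},
    'salicylate_level': {305: np.nan, 536: 'Very High', 627: 'High'},
})

result = merge_2_rows(df, left_index=305, right_index=536)
print(result)
```

Cells that are NaN in both rows, such as histamine_level here, stay NaN, and no extra columns are created.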
