How can I use pandas.melt() while keeping NaN values?

Question

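I'm cleaning up a messy DataFrame in which some of the information I need appears in the column names. That information should be melted into a single new column.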

index    name       animal    fruit    veg
--------------------------------------------------
0        cow        animal    NaN      NaN
1        apple      NaN       fruit    NaN
2        carrot     NaN       NaN      veg
3        dog        animal    NaN      NaN
4        horse      animal    NaN      NaN
5        car        NaN       NaN      NaN
6        pear       NaN       fruit    NaN
7        pepper     NaN       NaN      veg
8        cucumber   NaN       NaN      veg
9        house      NaN       NaN      NaN

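I've tried pandas.melt(), but it returns many rows containing "wrong" NaN values as well as duplicates.

Some rows should show NaN, but only the ones that don't fit any of the categories named in the column headers, so I can't simply use dropna().

I also can't be sure that dropping duplicates won't remove important data.

Here is the code I used: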

import pandas as pd

pd.melt(df, id_vars=['index', 'name'],
        value_vars=['animal', 'fruit', 'veg'],
        var_name='type')

The result I need should look like this:

index    name       type
--------------------------------------------------
0        cow        animal
1        apple      fruit
2        carrot     veg
3        dog        animal
4        horse      animal
5        car        NaN
6        pear       fruit
7        pepper     veg
8        cucumber   veg
9        house      NaN

Answer 1

You can do it like this (assuming index is the DataFrame's index rather than a column), using df.ffill() with axis=1:

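# Forward-fill across the category columns, then take the last column:
# it holds the row's only non-null value, or NaN if every category is empty.
df['type'] = df[df.columns[1:]].ffill(axis=1).iloc[:, -1]
# Alternatively, select the category columns by name:
# df['type'] = df.loc[:, ['animal', 'fruit', 'veg']].ffill(axis=1).iloc[:, -1]
df_new = df[['name', 'type']]
print(df_new)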

           name    type
index                  
0           cow  animal
1         apple   fruit
2        carrot     veg
3           dog  animal
4         horse  animal
5           car     NaN
6          pear   fruit
7        pepper     veg
8      cucumber     veg
9         house     NaN
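
If you would rather stay with melt(), here is a minimal sketch of one way it might be done. It is not part of the original answer, and it assumes pandas 1.1 or later (for the ignore_index parameter), that index is the DataFrame's index, and that each row has at most one non-null category. The names long_df, matched and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question, with 'index' as the index.
df = pd.DataFrame({
    'name': ['cow', 'apple', 'carrot', 'dog', 'horse',
             'car', 'pear', 'pepper', 'cucumber', 'house'],
    'animal': ['animal', np.nan, np.nan, 'animal', 'animal',
               np.nan, np.nan, np.nan, np.nan, np.nan],
    'fruit': [np.nan, 'fruit', np.nan, np.nan, np.nan,
              np.nan, 'fruit', np.nan, np.nan, np.nan],
    'veg': [np.nan, np.nan, 'veg', np.nan, np.nan,
            np.nan, np.nan, 'veg', 'veg', np.nan],
})
df.index.name = 'index'

# Melt to long form; ignore_index=False keeps the original row labels,
# so every melted row still knows which source row it came from.
long_df = df.melt(id_vars='name', value_vars=['animal', 'fruit', 'veg'],
                  var_name='type', ignore_index=False)

# Keep only the cells that were actually filled in (assumption: at most
# one per source row), and take the category name for each of them.
matched = long_df.dropna(subset=['value'])['type']

# Align back on the original index; rows without a category get NaN.
result = df[['name']].assign(type=matched)
print(result)
```

The melted frame has three rows per original row, which is where the "wrong" NaN values in the question come from. Dropping the empty cells and then aligning on the original index restores one row per name, so car and house keep a NaN type instead of disappearing. If a row could belong to more than one category, the alignment step would fail on duplicate index labels and you would need to decide how to combine them first.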
