Pandas lambda function that handles NaN

Question

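I am trying to write a lambda function in Pandas that checks whether Col1 is NaN and, if it is, uses the value from another column instead. I can't get the code below to run correctly.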

import pandas as pd
import numpy as np

df = pd.DataFrame({'Col1': [1, 2, 3, np.nan], 'Col2': [7, 8, 9, 10]})
# Fails: x['Col1'] is a scalar float, which has no .isnull() method
df2 = df.apply(lambda x: x['Col2'] if x['Col1'].isnull() else x['Col1'], axis=1)

Does anyone know how to write this with a lambda function, or am I asking more of lambda than it can do? If it can't be done this way, is there another solution?

Answer 1

You need pandas.isnull to check whether a scalar is NaN:

df = pd.DataFrame({'Col1': [1, 2, 3, np.nan],
                   'Col2': [8, 9, 7, 10]})

# pd.isnull works on scalars, unlike the Series method .isnull()
df2 = df.apply(lambda x: x['Col2'] if pd.isnull(x['Col1']) else x['Col1'], axis=1)

print (df)
   Col1  Col2
0   1.0     8
1   2.0     9
2   3.0     7
3   NaN    10

print (df2)
0     1.0
1     2.0
2     3.0
3    10.0
dtype: float64
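
To see why the original attempt fails, it can help to inspect what the lambda actually receives. The following is a minimal sketch, assuming the same example frame as above; the names row, value, has_method and is_missing are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [1, 2, 3, np.nan], 'Col2': [8, 9, 7, 10]})

# With axis=1, apply passes each row as a Series;
# indexing that row by column label yields a single scalar.
row = df.iloc[3]
value = row['Col1']

# A float scalar has no .isnull() method, so x['Col1'].isnull() raises.
has_method = hasattr(value, 'isnull')

# The top-level function pd.isnull accepts scalars as well as arrays.
is_missing = bool(pd.isnull(value))

print(type(value).__name__, has_method, is_missing)
```

The method form (.isnull()) belongs to Series and DataFrame objects, while the function form (pd.isnull) is the one that can be applied to a single value inside a lambda.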

But it is better to use Series.combine_first:

df['Col1'] = df['Col1'].combine_first(df['Col2'])

print (df)
   Col1  Col2
0   1.0     8
1   2.0     9
2   3.0     7
3  10.0    10
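
As a rough check that the vectorized approach matches the row-wise lambda, the sketch below compares the two on the example data. The third variant, built on Series.where, is not part of the answer above; it is included only as one more way to express the same idea, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [1, 2, 3, np.nan], 'Col2': [8, 9, 7, 10]})

# Row-wise version from the answer: one Python call per row.
via_apply = df.apply(
    lambda x: x['Col2'] if pd.isnull(x['Col1']) else x['Col1'], axis=1
)

# Vectorized version: take Col1, fall back to Col2 where Col1 is missing.
via_combine = df['Col1'].combine_first(df['Col2'])

# Alternative (an addition, not from the answer): keep Col1 where it is
# present, otherwise use the aligned value from Col2.
via_where = df['Col1'].where(df['Col1'].notna(), df['Col2'])

print(via_apply.tolist())
print(via_combine.tolist())
print(via_where.tolist())
```

On larger frames the vectorized forms usually avoid the per-row overhead of apply, which is the main reason to prefer them.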

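Another solution is Series.update. Note that it overwrites every Col1 value for which Col2 is non-missing, not only the NaN entries, as the output shows: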

df['Col1'].update(df['Col2'])
print (df)
   Col1  Col2
0   8.0     8
1   9.0     9
2   7.0     7
3  10.0    10
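
If the goal is to fill only the missing entries while still using Series.update, one option is to pass it just the Col2 values at the positions where Col1 is NaN. This is a sketch under that assumption, not something the answer proposes; it also works on an explicit copy and assigns the result back, since updating df['Col1'] in place may not propagate to df in pandas versions that use copy-on-write.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [1, 2, 3, np.nan], 'Col2': [8, 9, 7, 10]})

# Work on an explicit copy rather than on the chained expression df['Col1'].
col1 = df['Col1'].copy()

# Restrict the replacement values to rows where Col1 is missing;
# update aligns on the index, so all other rows are left untouched.
col1.update(df['Col2'][col1.isna()])

# Write the result back with a plain assignment.
df['Col1'] = col1

print(df)
```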

Answer 2

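Assuming you do have a second column, that is:

df = pd.DataFrame({'Col1': [1, 2, 3, np.nan], 'Col2': [1, 2, 3, 4]})

the correct solution to this problem is:

# Assign the result back; calling fillna(..., inplace=True) on df['Col1']
# is chained assignment and may not modify df in recent pandas versions
df['Col1'] = df['Col1'].fillna(df['Col2'])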

Answer 3

You need to use np.isnan():

#import numpy as np
df2=df.apply(lambda x: 2 if np.isnan(x['Col1']) else 1, axis=1)   

df2
Out[1307]: 
0    1
1    1
2    1
3    2
dtype: int64
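
One caveat with np.isnan is that it only accepts numeric input. The sketch below contrasts it with pd.isna on a few scalars; the sample values and the 'TypeError' placeholder string are chosen here purely for illustration.

```python
import numpy as np
import pandas as pd

values = [1.5, np.nan, None, 'text']

# np.isnan raises TypeError for inputs that are not numeric,
# so record a placeholder instead of letting the loop fail.
numpy_result = []
for v in values:
    try:
        numpy_result.append(bool(np.isnan(v)))
    except TypeError:
        numpy_result.append('TypeError')

# pd.isna treats None and NaN as missing and returns False for strings.
pandas_result = [bool(pd.isna(v)) for v in values]

print(numpy_result)
print(pandas_result)
```

For columns that may hold strings or None, pd.isna (or its alias pd.isnull) is therefore the safer check inside a lambda.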

Answer 4

In pandas 0.24.2, I use

df.apply(lambda x: x['col_name'] if x[col1] is np.nan else expressions_another, axis=1)

because pd.isnull() did not work for me.

At work I ran into the following behavior.

This produced no result:

df['prop'] = df.apply(lambda x: (x['buynumpday'] / x['cnumpday']) if pd.isnull(x['cnumpday']) else np.nan, axis=1)

This produced a result:

df['prop'] = df.apply(lambda x: (x['buynumpday'] / x['cnumpday']) if x['cnumpday'] is not np.nan else np.nan, axis=1)

I still don't know the underlying reason, but my rule of thumb so far is: for object values, use "is np.nan" or pd.isna(); for floats, use np.isnan() or pd.isna().
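
A possible explanation, offered as a sketch rather than a diagnosis of the code above: "is np.nan" tests object identity, not missingness, so it is only True for the very same np.nan object. NaN values pulled out of a DataFrame are not guaranteed to be that object. It may also be worth noting that the first snippet divides only when cnumpday is missing, which by itself would yield NaN in every row. The names below are illustrative.

```python
import numpy as np
import pandas as pd

# Three NaN values that are distinct Python objects.
a = np.nan              # the np.nan object itself
b = float('nan')        # a freshly created Python float
c = np.float64('nan')   # a NumPy scalar

# Identity check: True only for the np.nan object itself.
identity = [x is np.nan for x in (a, b, c)]

# pd.isna checks the value, so it recognizes all three as missing.
isna = [bool(pd.isna(x)) for x in (a, b, c)]

print(identity)
print(isna)
```

Because the identity test can silently return False for a genuine NaN, pd.isna is generally the more reliable condition inside a lambda.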
