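I have some float data from a conductivity probe that contains some NaNs. I want to convert the probe data into an indicator variable based on an empirical threshold, but I want the NaN values to stay NaN. Converting to an indicator seems straightforward; the problem is how to handle the NaNs. Below is an example with a threshold of 50.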
import numpy as np
import pandas as pd
x = [0, np.nan, 2, 3, 4, 51, 61, 71, 81, 91]
df = pd.DataFrame({"x":x})
df['indicator'] = (df.x <=50)*1
This yields:
x indicator
0 0.0 1
1 NaN 0
2 2.0 1
3 3.0 1
4 4.0 1
5 51.0 0
6 61.0 0
7 71.0 0
8 81.0 0
9 91.0 0
But I want the indicator for the NaN row to be NaN as well, like this:
x indicator
0 0.0 1
1 NaN NaN
2 2.0 1
3 3.0 1
4 4.0 1
5 51.0 0
6 61.0 0
7 71.0 0
8 81.0 0
9 91.0 0
Any help would be appreciated. Thanks.
You can try this:
import numpy as np
import pandas as pd
x = [0, np.nan, 2, 3, 4, 51, 61, 71, 81, 91]
df = pd.DataFrame({"x":x})
df['indicator'] = df.x*(df.x <=50)
Output:
x indicator
0 0.0 0.0
1 NaN NaN
2 2.0 2.0
3 3.0 3.0
4 4.0 4.0
5 51.0 0.0
6 61.0 0.0
7 71.0 0.0
8 81.0 0.0
9 91.0 0.0
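Note that the output above carries the original x values rather than 0/1. A minimal sketch of a variation on the same idea, assuming the data contains no infinities (since inf * 0 is also NaN): comparisons against NaN evaluate to False, while arithmetic with NaN yields NaN, so adding `df.x * 0` puts the missing rows back without changing the 0/1 values. The name `below` is only illustrative.

```python
import numpy as np
import pandas as pd

x = [0, np.nan, 2, 3, 4, 51, 61, 71, 81, 91]
df = pd.DataFrame({"x": x})

# Comparisons with NaN evaluate to False, which is why the NaN row became 0.
below = df.x <= 50

# Arithmetic with NaN yields NaN: 1 + 0 = 1, 0 + 0 = 0, 0 + NaN = NaN.
# Assumption: x has no +/-inf values, because inf * 0 would also give NaN.
df["indicator"] = below * 1 + df.x * 0
print(df)
```

The indicator column comes out as float, because NaN cannot be stored in a plain integer column.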
To get the exact output:
import numpy as np
import pandas as pd
x = [0, np.nan, 2, 3, 4, 51, 61, 71, 81, 91]
df = pd.DataFrame({"x":x})
df['indicator'] = np.where(df.x.isnull(), np.nan, df.x <= 50)
Output:
x indicator
0 0.0 1.0
1 NaN NaN
2 2.0 1.0
3 3.0 1.0
4 4.0 1.0
5 51.0 0.0
6 61.0 0.0
7 71.0 0.0
8 81.0 0.0
9 91.0 0.0
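The np.where result is float (1.0 / 0.0) because NaN forces a float column. If integer 1/0 with a missing marker is preferred, as in the desired output in the question, one possible sketch is pandas' nullable integer dtype. This assumes a pandas version that supports "Int64" (0.24 or later); missing rows then show as <NA> rather than NaN.

```python
import numpy as np
import pandas as pd

x = [0, np.nan, 2, 3, 4, 51, 61, 71, 81, 91]
df = pd.DataFrame({"x": x})

# Convert the boolean comparison to the nullable integer dtype (True -> 1, False -> 0).
flags = (df.x <= 50).astype("Int64")

# Blank out the rows where x itself is missing; they become <NA>.
df["indicator"] = flags.mask(df.x.isna())
print(df)
```

Whether <NA> is acceptable in place of NaN depends on what consumes the column downstream, so treat this as an option rather than a drop-in replacement.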
IIUC, the indicator should only be set in rows where x <= 50:
In [1829]: df['indicator'] = df[df.x <=50]*1
In [1830]: df
Out[1830]:
x indicator
0 0.0 0.0
1 NaN NaN
2 2.0 2.0
3 3.0 3.0
4 4.0 4.0
5 51.0 NaN
6 61.0 NaN
7 71.0 NaN
8 81.0 NaN
9 91.0 NaN
I figured I would try applying a lambda to the column :)
import numpy as np
import pandas as pd
x = [0, np.nan, 2, 3, 4, 51, 61, 71, 81, 91]
df = pd.DataFrame({"x":x})
indicator = lambda x: np.nan if (np.isnan(x)) else (x<=50)*1
df['indicator'] = df['x'].apply(indicator)
print(df)
Prints:
x indicator
0 0.0 1.0
1 NaN NaN
2 2.0 1.0
3 3.0 1.0
4 4.0 1.0
5 51.0 0.0
6 61.0 0.0
7 71.0 0.0
8 81.0 0.0
9 91.0 0.0
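For larger probe logs, `apply` with a Python lambda runs once per element, so a vectorized version is usually faster. A rough sketch of the same logic wrapped in a reusable helper; `threshold_indicator` is a hypothetical name, not a pandas function.

```python
import numpy as np
import pandas as pd


def threshold_indicator(values: pd.Series, threshold: float) -> pd.Series:
    """Return 1.0 where values <= threshold, 0.0 where above, NaN where missing."""
    # le() is the method form of <=; NaN compares as False here, so it becomes 0.0.
    flags = values.le(threshold).astype(float)
    # where() keeps entries whose condition is True and sets the rest to NaN.
    return flags.where(values.notna())


df = pd.DataFrame({"x": [0, np.nan, 2, 3, 4, 51, 61, 71, 81, 91]})
df["indicator"] = threshold_indicator(df["x"], 50)
print(df)
```

Passing the threshold as an argument makes it easy to reuse the helper if the empirical cutoff changes.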