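My question is: how can I use Pandas to simplify my table so that I get a single column containing the selected value (the three columns should become one)?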
Name Selection Active Inactive
A active 0 0.9
B active 1 0.8
C inactive 2 0.7
D inactive 3 0.6
E active 4 0.5
Something like:
IF Selection = 'active' THEN Active ELSE Inactive as Selected_Value
to get the following result:
Name Selected_Value
A 0
B 1
C 0.7
D 0.6
E 4
The code below should give you what you are looking for.
df.loc[df['Selection'] == 'active', 'Selected_Value'] = df['Active']
df.loc[df['Selection'] == 'inactive', 'Selected_Value'] = df['Inactive']
Or:
idx,cols = pd.factorize(df['Selection'].str.title())
df.assign(Selected_Value = df.reindex(cols,axis=1).to_numpy()[range(len(df)),idx])
Output:
Name Selection Active Inactive Selected_Value
0 A active 0 0.9 0.0
1 B active 1 0.8 1.0
2 C inactive 2 0.7 0.7
3 D inactive 3 0.6 0.6
4 E active 4 0.5 4.0
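The factorize-based one-liner above is compact, so here is a minimal sketch that unpacks it step by step. It assumes the column names from the question's table ('Active' and 'Inactive') and that every value in Selection, once title-cased, matches one of those column names. The intermediate names values, selected and result are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Selection': ['active', 'active', 'inactive', 'inactive', 'active'],
    'Active': [0, 1, 2, 3, 4],
    'Inactive': [0.9, 0.8, 0.7, 0.6, 0.5],
})

# Title-case the labels so they match the column names, then encode them:
# idx holds one integer code per row, cols holds the distinct labels.
idx, cols = pd.factorize(df['Selection'].str.title())

# Arrange the candidate columns in the same order as the codes.
values = df.reindex(cols, axis=1).to_numpy()

# For row i, take the entry in column idx[i].
selected = values[np.arange(len(df)), idx]

result = df.assign(Selected_Value=selected)
print(result[['Name', 'Selected_Value']])
```

Because the lookup is driven by the labels found in Selection, this version should extend to more than two candidate columns without extra conditions, as long as each label has a matching column.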
Using numpy.where():
import pandas as pd
import numpy as np

df = pd.DataFrame({'Name': ['A', 'B', 'C', 'D', 'E'],
                   'Selection': ['active', 'active', 'inactive', 'inactive', 'active'],
                   'Active': [0, 1, 2, 3, 4],
                   'Inactive': [0.9, 0.8, 0.7, 0.6, 0.5]})

df['Selected_Value'] = np.where(df['Selection'] == 'active',  # where Selection is 'active' ...
                                df['Active'],                 # ... take the value from the Active column
                                df['Inactive'])               # otherwise take the value from the Inactive column
print(df['Selected_Value'])
Output:
0 0.0
1 1.0
2 0.7
3 0.6
4 4.0
Name: Selected_Value, dtype: float64
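The question asks for a table with only Name and Selected_Value, so a possible final step is to drop the helper columns after the selection. The following is a minimal sketch under the assumption that the columns are named as in the question's table; result is an illustrative name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Selection': ['active', 'active', 'inactive', 'inactive', 'active'],
    'Active': [0, 1, 2, 3, 4],
    'Inactive': [0.9, 0.8, 0.7, 0.6, 0.5],
})

# Build the selected column, then keep only the two columns that were asked for.
result = df.assign(
    Selected_Value=np.where(df['Selection'] == 'active',
                            df['Active'],
                            df['Inactive'])
)[['Name', 'Selected_Value']]
print(result)
```

Note that the integer values from Active appear as floats (0.0, 1.0, 4.0) because the combined column has to hold the fractional values from Inactive as well.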