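I am trying to assign labels in one column based on the values of another column. The 'Percentage_delay' column holds values between 0 and 1. If the value in 'Percentage_delay' is greater than 0.75, the corresponding value in the 'Labels' column should be 'high'; if it is less than 0.75 but greater than 0.5, 'medium'; and if it is less than 0.5, 'low'.
I came up with this code: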
for i in number_delay_aiport['Percentage_delay']:
    if i >= 0 and i < 0.25:
        number_delay_aiport['Labels'] = 'low'
    if i >= 0.25 and i < 0.75:
        number_delay_aiport['Labels'] = 'medium'
    if i >= 0.75 and i <= 1:
        number_delay_aiport['Labels'] = 'high'
The output is wrong, because every row ends up with Labels == 'high'.
The same thing happens if I use a function with 'return'.
Can you tell me why this happens?
Change the for loop to an enumerated for loop, and use iloc to write each label at its row position:
import pandas as pd

d = {"Percentage_delay" : [0.64, 0.80, 0.55, 0.48, 0.65, 0.46, 0.87, 0.66, 0.77, 0.44]}
number_delay_airport = pd.DataFrame(d)
# to use iloc you first have to create the column
number_delay_airport['Labels'] = ''
# positional index of the 'Labels' column, needed for iloc[row, column]
label_col = number_delay_airport.columns.get_loc('Labels')
for j, i in enumerate(number_delay_airport['Percentage_delay']):
    print(i, j)
    # write through the DataFrame itself: chained ['Labels'].iloc[j] = ...
    # can end up modifying a temporary copy instead of the frame
    if i >= 0 and i < 0.25:
        number_delay_airport.iloc[j, label_col] = 'low'
    elif i >= 0.25 and i < 0.75:
        number_delay_airport.iloc[j, label_col] = 'medium'
    elif i >= 0.75 and i <= 1:
        number_delay_airport.iloc[j, label_col] = 'high'
print(number_delay_airport)
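As for why the original loop labels every row the same: assigning a single string to df['Labels'] sets that value for the whole column, not just for the row being visited, so whatever the last value in 'Percentage_delay' maps to overwrites everything before it. A minimal sketch of that behaviour, using a hypothetical three-row frame named df with made-up values:

```python
import pandas as pd

# Hypothetical data for illustration only; the last value falls in the 'high' range
df = pd.DataFrame({"Percentage_delay": [0.10, 0.50, 0.90]})

for value in df["Percentage_delay"]:
    if 0 <= value < 0.25:
        # A scalar assigned to a whole column is broadcast to every row
        df["Labels"] = "low"
    elif 0.25 <= value < 0.75:
        df["Labels"] = "medium"
    elif 0.75 <= value <= 1:
        df["Labels"] = "high"

# All rows now carry the label of the last value visited (0.90)
print(df["Labels"].tolist())  # ['high', 'high', 'high']
```

If the last value in your real data is 0.75 or above, this would explain seeing only 'high'.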
Even better, with the apply function you can do the following:
import pandas as pd

d = {"Percentage_delay" : [0.64, 0.80, 0.55, 0.48, 0.65, 0.46, 0.87, 0.66, 0.77, 0.44]}
number_delay_airport = pd.DataFrame(d)

def assign_label(i):
    # map a single delay fraction to its label
    if i >= 0 and i < 0.25:
        return 'low'
    if i >= 0.25 and i < 0.75:
        return 'medium'
    if i >= 0.75 and i <= 1:
        return 'high'

# apply calls assign_label once per value and collects the results into a new column
number_delay_airport['Labels'] = number_delay_airport['Percentage_delay'].apply(assign_label)
print(number_delay_airport)
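If the frame is large, a vectorized variant may be worth considering. The sketch below swaps apply for numpy.select, which is not part of the code above; it keeps the same thresholds, and the 'unknown' default for values outside 0 to 1 is an assumption added here (assign_label would return None in that case).

```python
import numpy as np
import pandas as pd

d = {"Percentage_delay": [0.64, 0.80, 0.55, 0.48, 0.65, 0.46, 0.87, 0.66, 0.77, 0.44]}
number_delay_airport = pd.DataFrame(d)

delay = number_delay_airport["Percentage_delay"]

# One boolean mask per label, evaluated on the whole column at once
conditions = [
    (delay >= 0) & (delay < 0.25),
    (delay >= 0.25) & (delay < 0.75),
    (delay >= 0.75) & (delay <= 1),
]
choices = ["low", "medium", "high"]

# 'unknown' is an assumed placeholder for values outside the 0..1 range
number_delay_airport["Labels"] = np.select(conditions, choices, default="unknown")
print(number_delay_airport)
```

For the sample data this should give the same labels as the apply version, since every value lies between 0 and 1.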