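I'm trying to use the following function, but I get ValueError: Cannot set a DataFrame with multiple columns to the single column.
I used this function with other columns (CO01NUM022AH and CO01NUM022CT) without any problem (the new columns R2... and R... were created). But for other columns (CO01EXP001VE and CO01EXP001RO), I only get the R2... column.
Any help or suggestion would be appreciated.
This is my code. I get the error at line 19: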
VarMeses = ('CO01EXP001VE', 'CO01EXP001RO')
def RangDef0 (row):
    if row[column] == 0:
        rango = 'y. Sin antiguedad'
    elif row[column] > 0:
        rango = baseburo[NombrecolM]
    elif (row[column] == -1) | (row[column] == -2) | (row[column] == -3):
        rango = 'z. Sin informacion'
    else:
        rango = 'y. Sin antiguedad'
    return rango
binsM = [1, 12, 24, 48, 72, 96, 120, 144, 168, 99999]
labM = ['a. 1-12 m','b. 13-24 m','c. 25-48 m','d. 49-72 m','e. 73-96 m','f. 97-120 m','g. 121-144 m','h. 145-168 m','i. >168 m']
for column in VarMeses:
    NombrecolM = 'R2'+ str(column)
    NombrecolM2 = 'R'+ str(column)
    baseburo[NombrecolM] = pd.cut(baseburo[column], bins = binsM, labels = labM, include_lowest = True)
    baseburo[NombrecolM2] = baseburo.apply(RangDef0, axis=1)
This is the dataframe:
Item    CO01NUM022AH  CO01NUM022CT  CO01EXP001VE  CO01EXP001RO  R2CO01NUM022AH  RCO01NUM022AH  R2CO01NUM022CT  RCO01NUM022CT       R2CO01EXP001VE  R2CO01EXP001RO
710059  0.0           0.0           206.0         239.0         b. 1            a. 0           b. 1            a. 0                i. >168 m       i. >168 m
710532  0.0           -1.0          -1.0          97.0          b. 1            a. 0           NaN             z. Sin informacion  NaN             f. 97-120 m
710895  0.0           -1.0          13.0          117.0         b. 1            a. 0           NaN             z. Sin informacion  b. 13-24 m      f. 97-120 m
711302  0.0           -1.0          -1.0          54.0          b. 1            a. 0           NaN             z. Sin informacion  NaN             d. 49-72 m
711361  0.0           -1.0          -1.0          225.0         b. 1            a. 0           NaN             z. Sin informacion  NaN             i. >168 m
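The error most likely comes from the line rango = baseburo[NombrecolM] inside RangDef0. That expression is evaluated outside the context of the current row: it returns the entire column (a Series), not the label for that row, so apply ends up building a DataFrame that cannot be assigned to a single column. In the sample shown, CO01NUM022AH and CO01NUM022CT never reach the > 0 branch, which would explain why they worked. You can compute the label from the row's own value instead: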
import pandas as pd

# Rebuild the sample data from the question
data = {
    'Item': [710059, 710532, 710895, 711302, 711361],
    'CO01NUM022AH': [0.0, 0.0, 0.0, 0.0, 0.0],
    'CO01NUM022CT': [0.0, -1.0, -1.0, -1.0, -1.0],
    'CO01EXP001VE': [206.0, -1.0, 13.0, -1.0, -1.0],
    'CO01EXP001RO': [239.0, 97.0, 117.0, 54.0, 225.0]
}
baseburo = pd.DataFrame(data)
print(baseburo)

VarMeses = ('CO01EXP001VE', 'CO01EXP001RO')
# With right=False below, the intervals are [0, 1), [1, 12), [12, 24), ...
binsM = [0, 1, 12, 24, 48, 72, 96, 120, 144, 168, 99999]
labM = ['y. Sin antiguedad', 'a. 1-12 m', 'b. 13-24 m', 'c. 25-48 m', 'd. 49-72 m', 'e. 73-96 m', 'f. 97-120 m', 'g. 121-144 m', 'h. 145-168 m', 'i. >168 m']

def RangDef0(row, column):
    # Always return a scalar label for this row, never a whole column
    if row[column] == 0:
        return 'y. Sin antiguedad'
    elif row[column] > 0:
        return pd.cut([row[column]], bins=binsM, labels=labM, right=False)[0]
    elif row[column] in [-1, -2, -3]:
        return 'z. Sin informacion'
    else:
        return 'y. Sin antiguedad'

for column in VarMeses:
    NombrecolM = 'R2' + column
    NombrecolM2 = 'R' + column
    baseburo[NombrecolM] = pd.cut(baseburo[column], bins=binsM, labels=labM, right=False)
    # Pass the column name explicitly instead of relying on the loop variable
    baseburo[NombrecolM2] = baseburo.apply(lambda row: RangDef0(row, column), axis=1)

print(baseburo)
This gives:
   R2CO01EXP001VE  RCO01EXP001VE       R2CO01EXP001RO  RCO01EXP001RO
0  i. >168 m       i. >168 m           i. >168 m       i. >168 m
1  NaN             z. Sin informacion  f. 97-120 m     f. 97-120 m
2  b. 13-24 m      b. 13-24 m          f. 97-120 m     f. 97-120 m
3  NaN             z. Sin informacion  d. 49-72 m      d. 49-72 m
4  NaN             z. Sin informacion  i. >168 m       i. >168 m
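To see why the original version fails, here is a minimal sketch on a toy frame. The names months, R2months, whole_column and row_value are made up for illustration and are not from the question; the bins are shortened too. It assumes a reasonably recent pandas, where a function passed to apply(axis=1) that returns a Series for the first row makes apply produce a DataFrame, while a function that returns scalars produces a Series.

```python
import pandas as pd

# Hypothetical toy data: "months" stands in for a column such as CO01EXP001VE.
df = pd.DataFrame({"months": [206.0, -1.0, 13.0]})
df["R2months"] = pd.cut(df["months"], bins=[1, 12, 24, 99999],
                        labels=["a", "b", "c"], include_lowest=True)

def whole_column(row):
    # Mirrors the question's pattern: returns the ENTIRE column (a Series).
    if row["months"] > 0:
        return df["R2months"]
    return "z. Sin informacion"

def row_value(row):
    # Returns only the label that belongs to this row (a scalar).
    if row["months"] > 0:
        return row["R2months"]
    return "z. Sin informacion"

bad = df.apply(whole_column, axis=1)   # expands into a DataFrame
good = df.apply(row_value, axis=1)     # one value per row, i.e. a Series

print(type(bad).__name__, bad.shape)
print(type(good).__name__, good.tolist())

try:
    df["Rmonths"] = bad                # a multi-column frame into one column
except ValueError as err:
    print("ValueError:", err)

df["Rmonths"] = good                   # works: one label per row
```

So the smallest change to the original function would probably be to replace baseburo[NombrecolM] with row[NombrecolM], since the R2 column has already been created by the time apply runs.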