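I'm trying to create a new column in my dataframe called "Method". The current dataframe is shown in the attached image:
I'm trying to use if / elif / else and regular expressions to create the new column, but when I run this code I only get the value from the else statement. Why doesn't this work, and how can I fix it?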
if 'posted' in df2.Full.astype(str) and '/ Outbound' in df2.TPrev.astype(str):
    df2['Method']='Classifieds Homepage Button'
elif 'ad posted' in df2.Full.astype(str) and 'thanks' in df2.TPrev.astype(str):
    df2['Method']='Header after Post'
elif 'ad posted' in df2.Full.astype(str) and '/myaccount/listing-classified Outbound' in df2.TPrev.astype(str):
    df2['Method']='My Listings Button'
elif 'ad posted' in df2.Full.astype(str) and '/s/' in df2.TPrev.astype(str):
    df2['Method']='SRP'
elif 'ad posted' in df2.Full.astype(str) and '/myaccount/listing-classified nan' in df2.TPrev.astype(str):
    df2['Method']='My Listings Button'
elif 'ad posted' in df2.Full.astype(str) and '/sell nan nan' in df2.TPrev and '/myaccount/listing-classified nan nan' in df2.Prev.astype(str):
    df2['Method']='My Listings Header'
elif 'ad posted' in df2.Full.astype(str) and '/listing/' in df2.TPrev.astype(str):
    df2['Method']='Detail Page Header'
elif 'ad posted' in df2.Full.astype(str) and '/search/' in df2.TPrev.astype(str):
    df2['Method']='SRP'
else:
    df2['Method']='Ignore'
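As people suggested in the comments, the problem is that when you assign a single value to a column, you overwrite every row of that column with that same value. On top of that, an expression like 'posted' in df2.Full.astype(str) tests the index labels of the Series rather than its values, so every condition is False and only the else branch ever runs. What you want to do is: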
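Instead of converting to str in every condition, convert the whole dataframe once (astype returns a new dataframe, so assign the result back):
df2 = df2.astype(str)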
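You need logic that is applied to each row of the dataframe to decide the value of the "Method" column. The simplest way is to wrap your logic in a function and call it through apply: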
def my_logic(row):
    # row is a single dataframe row; each field is a plain str after astype(str),
    # so `in` performs an ordinary substring test here
    if 'posted' in row.Full and '/ Outbound' in row.TPrev:
        return 'Classifieds Homepage Button'
    elif 'ad posted' in row.Full and 'thanks' in row.TPrev:
        return 'Header after Post'
    elif 'ad posted' in row.Full and '/myaccount/listing-classified Outbound' in row.TPrev:
        return 'My Listings Button'
    elif 'ad posted' in row.Full and '/s/' in row.TPrev:
        return 'SRP'
    elif 'ad posted' in row.Full and '/myaccount/listing-classified nan' in row.TPrev:
        return 'My Listings Button'
    elif 'ad posted' in row.Full and '/sell nan nan' in row.TPrev and '/myaccount/listing-classified nan nan' in row.Prev:
        return 'My Listings Header'
    elif 'ad posted' in row.Full and '/listing/' in row.TPrev:
        return 'Detail Page Header'
    elif 'ad posted' in row.Full and '/search/' in row.TPrev:
        return 'SRP'
    else:
        return 'Ignore'

# axis=1 passes one row at a time to my_logic
df2['Method'] = df2.apply(my_logic, axis=1)
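To see why the original conditions always fell through to else, it may help to check what `in` does on a Series. The snippet below is a minimal sketch on a made-up two-row frame (the sample values are assumptions, since the real df2 is only shown in the question's image):

```python
import pandas as pd

# Hypothetical sample data, not the asker's actual dataframe.
demo = pd.DataFrame({
    "Full": ["ad posted", "something else"],
    "TPrev": ["/s/ Outbound", "/home nan"],
})

# `in` on a Series looks at the index labels (0 and 1 here), not the values.
in_series = "ad posted" in demo.Full           # False: "ad posted" is not an index label
in_index = 0 in demo.Full                      # True: 0 is an index label

# Converting to a list tests the values, but only for exact matches.
in_values = "ad posted" in demo.Full.tolist()  # True

# Substring checks have to be done element-wise, for example with str.contains.
mask = demo.Full.str.contains("posted", regex=False)
```

Inside my_logic the same `in` works as intended because row.Full is an ordinary string there, not a Series.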
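The apply approach is the simplest conversion, but I think a more elegant solution is np.select: build a list of True / False conditions and a matching list of choices from your logic. An example for the first three conditions: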
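import numpy as np

# `in` on a Series tests the index, so use element-wise str.contains instead
full, tprev = df2.Full.astype(str), df2.TPrev.astype(str)
conditions = [
    full.str.contains('posted', regex=False) & tprev.str.contains('/ Outbound', regex=False),
    full.str.contains('ad posted', regex=False) & tprev.str.contains('thanks', regex=False),
    full.str.contains('ad posted', regex=False) & tprev.str.contains('/myaccount/listing-classified Outbound', regex=False)]
choices = ['Classifieds Homepage Button', 'Header after Post', 'My Listings Button']
df2['Method'] = np.select(conditions, choices, default='Ignore')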
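For completeness, here is a sketch of the np.select approach extended to all eight branches of the original chain. It assumes Full, TPrev, and Prev are string columns; the `has` helper, the `rules` list, and the four sample rows are made up for illustration, and the first label follows the spelling used in the question. np.select takes the first condition that is True for each row, so the order mirrors the if / elif order.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows, converted to strings as suggested above.
df2 = pd.DataFrame({
    "Full": ["ad posted", "ad posted", "ad posted", "page view"],
    "TPrev": ["/ Outbound", "/thanks nan", "/s/cars Outbound", "/s/cars Outbound"],
    "Prev": ["nan", "nan", "nan", "nan"],
}).astype(str)


def has(col, text):
    """Hypothetical helper: element-wise literal substring test on a column."""
    return df2[col].str.contains(text, regex=False)


ad_posted = has("Full", "ad posted")

# (condition, choice) pairs in the same order as the if/elif chain;
# np.select uses the first condition that is True for each row.
rules = [
    (has("Full", "posted") & has("TPrev", "/ Outbound"), "Classifieds Homepage Button"),
    (ad_posted & has("TPrev", "thanks"), "Header after Post"),
    (ad_posted & has("TPrev", "/myaccount/listing-classified Outbound"), "My Listings Button"),
    (ad_posted & has("TPrev", "/s/"), "SRP"),
    (ad_posted & has("TPrev", "/myaccount/listing-classified nan"), "My Listings Button"),
    (ad_posted & has("TPrev", "/sell nan nan")
     & has("Prev", "/myaccount/listing-classified nan nan"), "My Listings Header"),
    (ad_posted & has("TPrev", "/listing/"), "Detail Page Header"),
    (ad_posted & has("TPrev", "/search/"), "SRP"),
]
conditions = [cond for cond, _ in rules]
choices = [choice for _, choice in rules]

# Rows that match no condition get the default value.
df2["Method"] = np.select(conditions, choices, default="Ignore")
```

Keeping each condition next to its label in one list makes it harder for the two lists to drift out of sync when a rule is added or reordered.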