Using if, elif, else on a DataFrame to create a new column

Question  Votes: 0  Answers: 1

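I am trying to create a new column called "Method" in my DataFrame. The current DataFrame was shown in an attached image (not reproduced here).

I am trying to use if / elif / else and regular expressions to create the new column, but when I run this code I only get the value from the else statement. Why doesn't this work, and how can I fix it?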

if 'posted' in df2.Full.astype(str) and '/ Outbound' in df2.TPrev.astype(str):
    df2['Method']='Classifieds Homepage Button'
elif 'ad posted' in df2.Full.astype(str) and 'thanks' in df2.TPrev.astype(str):
    df2['Method']='Header after Post'
elif 'ad posted' in df2.Full.astype(str) and '/myaccount/listing-classified Outbound' in df2.TPrev.astype(str):
    df2['Method']='My Listings Button'    
elif 'ad posted' in df2.Full.astype(str) and '/s/' in df2.TPrev.astype(str):
    df2['Method']='SRP'  
elif 'ad posted' in df2.Full.astype(str) and '/myaccount/listing-classified nan' in df2.TPrev.astype(str):
    df2['Method']='My Listings Button'
elif 'ad posted' in df2.Full.astype(str) and '/sell nan nan' in df2.TPrev and '/myaccount/listing-classified nan nan' in df2.Prev.astype(str):
    df2['Method']='My Listings Header'
elif 'ad posted' in df2.Full.astype(str) and '/listing/' in df2.TPrev.astype(str):
    df2['Method']='Detail Page Header'
elif 'ad posted' in df2.Full.astype(str) and '/search/' in df2.TPrev.astype(str):
    df2['Method']='SRP'
else:
    df2['Method']='Ignore'
python regex pandas if-statement series
1 Answer
Votes: 0

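As people suggested in the comments, the problem is that assigning a single value to a column overwrites every row of that column with that value. On top of that, an expression such as 'posted' in df2.Full tests the index labels of the Series, not the text in each row, so every condition is False and only the else branch runs. What you want to do is: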

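  1. Instead of converting to str on every line, convert the whole DataFrame once (astype returns a new DataFrame, so assign the result back):

    df2 = df2.astype(str)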

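  2. You need logic that is applied to each row of the DataFrame to determine the value of the "Method" column. The simplest way is to put your logic in a function and call it through apply: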

def my_logic(row):
   if 'posted' in row.Full and '/ Outbound' in row.TPrev:
      return 'Classifieds Homepage Button'
   elif 'ad posted' in row.Full and 'thanks' in row.TPrev:
      return 'Header after Post'
   elif 'ad posted' in row.Full and '/myaccount/listing-classified Outbound' in row.TPrev:
      return 'My Listings Button'
   elif 'ad posted' in row.Full and '/s/' in row.TPrev:
      return 'SRP'
   elif 'ad posted' in row.Full and '/myaccount/listing-classified nan' in row.TPrev:
      return 'My Listings Button'
   elif 'ad posted' in row.Full and '/sell nan nan' in row.TPrev and '/myaccount/listing-classified nan nan' in row.Prev:
      return 'My Listings Header'
   elif 'ad posted' in row.Full and '/listing/' in row.TPrev:
      return 'Detail Page Header'
   elif 'ad posted' in row.Full and '/search/' in row.TPrev:
      return 'SRP'
   else:
      return 'Ignore'

df2['Method'] = df2.apply(my_logic, axis=1)
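A small sketch with made-up data may help show why the original chain always reaches else and why a row-wise function behaves differently. Only the column names Full and TPrev come from the question; the sample rows and the helper name classify are assumptions for illustration.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question
df = pd.DataFrame({
    "Full": ["ad posted", "page view"],
    "TPrev": ["/thanks nan", "/s/ cars"],
})

# `in` on a Series tests the index labels (0 and 1 here), not the values
print("ad posted" in df.Full)   # False
print(0 in df.Full)             # True

# Assigning a scalar fills every row of the column with the same value
df["Method"] = "Ignore"

# Inside a row-wise function each field is a plain str,
# so `in` is an ordinary substring test
def classify(row):
    if "ad posted" in row.Full and "thanks" in row.TPrev:
        return "Header after Post"
    return "Ignore"

df["Method"] = df.apply(classify, axis=1)
print(df.Method.tolist())       # ['Header after Post', 'Ignore']
```

The if statement in the question is evaluated once for the whole DataFrame, while apply with axis=1 evaluates the function once per row.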

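That is the most direct conversion, but I think a more elegant solution is np.select: from your logic, build a list of boolean conditions and a matching list of choices. An example with the first three conditions: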

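import numpy as np

# Convert once; .str.contains tests each row, whereas `in` on a Series checks the index
full, tprev = df2.Full.astype(str), df2.TPrev.astype(str)
conditions = [
   full.str.contains('posted', regex=False) & tprev.str.contains('/ Outbound', regex=False),
   full.str.contains('ad posted', regex=False) & tprev.str.contains('thanks', regex=False),
   full.str.contains('ad posted', regex=False) & tprev.str.contains('/myaccount/listing-classified Outbound', regex=False)]
choices = ['Classifieds Homepage Button', 'Header after Post', 'My Listings Button']
df2['Method'] = np.select(conditions, choices, default='Ignore')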
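For completeness, here is a sketch of how all eight branches of the original chain might be written with np.select. The sample rows and the helper has are made up for illustration, and the substrings are treated as literal text (regex=False), which is an assumption about what the question intends. np.select picks the first condition that is True for each row, so the list order mirrors the if / elif order.

```python
import numpy as np
import pandas as pd

# Made-up rows; the column names Full, TPrev and Prev follow the question
df = pd.DataFrame({
    "Full": ["ad posted", "ad posted", "ad posted", "page view", "ad posted"],
    "TPrev": ["/ Outbound", "/thanks nan", "/s/ cars nan", "/s/ cars nan", "/listing/123 nan"],
    "Prev": ["nan", "nan", "nan", "nan", "nan"],
})

def has(col, text):
    """Hypothetical helper: row-wise literal substring test, missing values count as no match."""
    return df[col].astype(str).str.contains(text, regex=False, na=False)

posted = has("Full", "ad posted")
conditions = [
    has("Full", "posted") & has("TPrev", "/ Outbound"),
    posted & has("TPrev", "thanks"),
    posted & has("TPrev", "/myaccount/listing-classified Outbound"),
    posted & has("TPrev", "/s/"),
    posted & has("TPrev", "/myaccount/listing-classified nan"),
    posted & has("TPrev", "/sell nan nan") & has("Prev", "/myaccount/listing-classified nan nan"),
    posted & has("TPrev", "/listing/"),
    posted & has("TPrev", "/search/"),
]
choices = [
    "Classifieds Homepage Button", "Header after Post", "My Listings Button", "SRP",
    "My Listings Button", "My Listings Header", "Detail Page Header", "SRP",
]

# The first True condition in each row wins; rows with no match get the default
df["Method"] = np.select(conditions, choices, default="Ignore")
print(df.Method.tolist())
```

Compared with apply, this builds each condition for the whole column at once, which tends to be faster on large frames.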