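I have a cannabis dataset with an "Effects" column, and I'm trying to add a binary "nice_buds" column that flags strains which do not have certain effects. Here is the code: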
nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]
for row in sample["Effects"]:
    if "Sleepy" not in row and "Hungry" not in row and "Giggly" not in row and "Tingly" not in row and "Aroused" not in row and "Talkative" not in row:
        nice_buds.append(1)
    else:
        nice_buds.append(0)
sample["nice_buds"] = nice_buds
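Right now the undesired_effects list isn't used for anything, and the code works fine in that it gives me the output I want.
My question is whether there is a more "Pythonic" or "DRY" way to solve this...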
You can use all() with a generator expression to simplify the if statement:
nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]
for row in sample["Effects"]:
    if all(effect not in row for effect in undesired_effects):
        nice_buds.append(1)
    else:
        nice_buds.append(0)
sample["nice_buds"] = nice_buds
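The same all() check can also be folded into a list comprehension, which removes the explicit append calls. This is only a minimal sketch: the helper name is_nice and the example strings are made up for illustration, and a plain list stands in for sample["Effects"].

```python
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]


def is_nice(effects):
    """Return 1 if none of the undesired effects occurs in the string, else 0."""
    # is_nice is a hypothetical helper name, not part of the original code.
    return int(all(effect not in effects for effect in undesired_effects))


# Hypothetical stand-in for sample["Effects"]
effects_column = ["Happy,Relaxed,Euphoric", "Sleepy,Happy", "Creative,Energetic,Tingly"]

nice_buds = [is_nice(effects) for effects in effects_column]
print(nice_buds)  # [1, 0, 0]
```

With a real DataFrame the last step would presumably become sample["nice_buds"] = [is_nice(e) for e in sample["Effects"]].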
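Or use any() and check whether an undesired effect is present (note that the 0 and 1 branches swap compared with the all() version):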
nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]
for row in sample["Effects"]:
    # any() is True when at least one undesired effect is present, so that row gets 0
    if any(effect in row for effect in undesired_effects):
        nice_buds.append(0)
    else:
        nice_buds.append(1)
sample["nice_buds"] = nice_buds
sample
Alternatively, use np.where together with str.contains, which avoids the explicit loop:
import numpy as np

undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]
# Join the effects into one regex alternation: "Sleepy|Hungry|..."
sample['nice_buds'] = np.where(sample['Effects'].str.contains('|'.join(undesired_effects)), 0, 1)
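To check that the vectorized version agrees with the loop, here is a small self-contained sketch. The DataFrame contents are invented for illustration, and it assumes the Effects column holds comma-separated strings; the na=False argument is an addition that treats missing values as "no match" and is not needed if the column has no NaN.

```python
import numpy as np
import pandas as pd

undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]

# Hypothetical data standing in for the real dataset
sample = pd.DataFrame(
    {"Effects": ["Happy,Relaxed", "Sleepy,Happy", "Giggly,Talkative", "Uplifted,Focused"]}
)

# True where at least one undesired effect appears in the string
mask = sample["Effects"].str.contains("|".join(undesired_effects), na=False)

via_where = np.where(mask, 0, 1)       # the np.where approach
via_astype = (~mask).astype(int)       # same result without NumPy
via_loop = [
    int(all(effect not in row for effect in undesired_effects))
    for row in sample["Effects"]
]

sample["nice_buds"] = via_where
print(sample)
```

All three variants should yield 1, 0, 0, 1 for this data. Since the mask is already boolean, (~mask).astype(int) is a reasonable option when NumPy is not otherwise needed.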