I want to create a column df['score'] that holds the number of values each cell has in common with a list.
Input:
correct_list = ['cats','dogs']
answer
0 cats, dogs, pigs
1 cats, dogs
2 dogs, pigs
3 cats
4 pigs
def animal_count(dataframe):
    count = 0
    for term in df['answer']:
        if term in symptom_list:
            df['score'] = count + 1
animal_count(df)
Desired output:
correct_list = ['cats','dogs']
answer score
0 cats, dogs, pigs 2
1 cats, dogs 2
2 dogs, pigs 1
3 cats 1
4 pigs 0
Any ideas? Thanks!
Another solution, using Series.str.count:
df['score'] = df['answer'].str.count('|'.join(correct_list))
[out]
answer score
0 cats, dogs, pigs 2
1 cats, dogs 2
2 dogs, pigs 1
3 cats 1
4 pigs 0
As @PrinceFrancis pointed out, if catsdogs should not be counted as 2, you can change the regex pattern to suit:
df = pd.DataFrame({'answer': ['cats, dogs, pigs', 'cats, dogs', 'dogs, pigs', 'cats', 'pigs', 'catsdogs']})
pat = '|'.join([fr'\b{x}\b' for x in correct_list])
df['score'] = df['answer'].str.count(pat)
[out]
answer score
0 cats, dogs, pigs 2
1 cats, dogs 2
2 dogs, pigs 1
3 cats 1
4 pigs 0
5 catsdogs 0
You can do the following:
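correct_list = ['cats','dogs']
# split each answer on commas (this assumes no spaces after the commas)
df['score'] = df['answer'].str.split(',')
# count how many entries of correct_list appear in each split answer
df['score'] = df['score'].apply(lambda x: sum(el in x for el in correct_list))
df
It will give you the following result: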
answer score
0 cats,dogs,pigs 2
1 cats,dogs 2
2 dogs,pigs 1
3 cats 1
4 pigs 0
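Note that the split-based approach above relies on the answers having no spaces after the commas, as in the output shown. With the spaced format from the question ('cats, dogs'), a plain split(',') leaves items such as ' dogs' that no longer match. A minimal sketch of one way around this, assuming that spaced input; the helper name count_correct is hypothetical and not part of any answer here:

```python
import pandas as pd

correct_list = ['cats', 'dogs']
df = pd.DataFrame({'answer': ['cats, dogs, pigs', 'cats, dogs', 'dogs, pigs', 'cats', 'pigs']})

# Plain split(',') keeps the leading spaces, so ' dogs' is not found in correct_list.
naive = df['answer'].str.split(',').apply(lambda x: sum(el in x for el in correct_list))

def count_correct(cell, wanted=correct_list):
    # Hypothetical helper: split on commas and strip surrounding whitespace.
    items = [item.strip() for item in cell.split(',')]
    # Count the wanted terms that occur among the cleaned items.
    return sum(term in items for term in wanted)

df['score'] = df['answer'].apply(count_correct)
print(naive.tolist())
print(df)
```

Stripping each item keeps the membership test exact, so 'catsdogs' would still not be counted as a match.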
import pandas as pd
correct_list = ['cats', 'dogs']
answer = ['cats,dogs,pigs','cats,dogs','dogs,pigs','cats','pigs']
answer = [ans.split(',') for ans in answer]
score = [0] * len(answer)
df = pd.DataFrame({'answer':answer,'score':score})
print(df,'\n')
df.score = df.answer.apply(lambda cell: len(set(cell) & set(correct_list)))
print(df)
"""
answer score
0 [cats, dogs, pigs] 0
1 [cats, dogs] 0
2 [dogs, pigs] 0
3 [cats] 0
4 [pigs] 0
answer score
0 [cats, dogs, pigs] 2
1 [cats, dogs] 2
2 [dogs, pigs] 1
3 [cats] 1
4 [pigs] 0
"""
I suggest:
def my_func(x):
    # split on commas, strip whitespace, and count the items found in correct_list
    return sum([1 for y in x.split(',') if y.strip() in correct_list])
df['score'] = df['answer'].apply(my_func)
Result:
answer score
0 cats, dogs, pigs 2
1 cats, dogs 2
2 dogs, pigs 1
3 cats 1
4 pigs 0
correct_list = ['cats','dogs']
df = pd.DataFrame(['cats, dogs, pigs', 'cats, dogs', 'dogs, pigs', 'cats', 'pigs'], columns=['answer'])
df['score'] = df.answer.str.split(', ').apply(lambda x: sum([1 for a in x if a in correct_list]))
You can measure the set intersection between the two lists.
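A minimal sketch of the set-based idea, applied directly to comma-separated strings. It also contrasts the result with a Series.str.count pattern: a set intersection counts each matching term at most once, while str.count counts every occurrence. The sample data with a repeated term is made up for illustration, and wrapping the terms in re.escape is an added precaution that none of the answers here use:

```python
import re
import pandas as pd

correct_list = ['cats', 'dogs']
# Illustrative data (not from the question): the first answer repeats a term.
df = pd.DataFrame({'answer': ['cats, cats, dogs', 'catsdogs', 'pigs']})

# Word-boundary pattern; re.escape guards against regex metacharacters in the terms.
pat = '|'.join(rf'\b{re.escape(term)}\b' for term in correct_list)
df['count_score'] = df['answer'].str.count(pat)

# Set intersection after splitting on commas and stripping each item.
correct_set = set(correct_list)
df['set_score'] = df['answer'].apply(
    lambda cell: len({item.strip() for item in cell.split(',')} & correct_set)
)
print(df)
```

Which behaviour is wanted depends on whether a repeated answer should score more than once; for the data in the question, both approaches should give the same scores.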