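I have a data frame of tokens, shown below. I want to match the tokens against the keys of a dictionary and get the matching keys and their values.
Data frame: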
A B
1 ['i','like','apples', 'banana' ,'lot','however','do','not','eat','them','but' , 'sandwich' , 'also' , 'good']
2 ['avengers','series','something','like','most','annabelle','movies' , 'cannot' ,'watch' , 'night' , 'time']
3 ['virat kohli','batsmen','world','like','most','federer','nadal' ,'tennis']
I have a dictionary like this:
key value
apple fruit
banana fruit
grapes fruit
sandwich junkfood
noodles junkfood
avengers action
deadpool action
annabelle horror
virat kohli cricket
federer tennis
nadal tennis
timo ball table tennis
I want to match all the tokens in each row against the dictionary keys and get the matched keys and values, as shown below.
Output:
A B C
1 ['fruit' , 'junkfood'] ['apple' , 'banana' , 'sandwich']
2 ['action' , 'horror'] ['avengers' , 'annabelle']
3 ['cricket' , 'tennis'] ['virat kohli' ,'nadal' , 'federer']
You can use pandas.Series.apply (df['B'] is a Series) together with a list comprehension:
# assuming 'df' is your data frame and 'dct_' is your dictionary
# C: the tokens in each row that are dictionary keys
df['C'] = df['B'].apply(lambda lst: [item for item in lst if item in dct_])
# D: the dictionary values for those matched tokens
df['D'] = df['B'].apply(lambda lst: [dct_[item] for item in lst if item in dct_])
A B C D
0 1 [i, like, apples, banana,...] [banana, sandwich] [fruit, junkfood]
1 2 [avengers, series, something,...] [avengers, annabelle] [action, horror]
2 3 [virat kohli, batsmen,...] [virat kohli, federer, nadal] [cricket, tennis, tennis]
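Two details in the output above differ from what the question asked for: 'apples' is not matched because the lookup is an exact string comparison against the key 'apple', and 'tennis' appears twice in row 3 because both 'federer' and 'nadal' map to it. The following is a minimal sketch in plain Python of the same per-row logic with the values de-duplicated. The helper name `match_tokens` and the shortened sample rows are assumptions for illustration only, and it does not attempt plural handling.

```python
# Hypothetical sample data mirroring the question; 'dct_' follows the answer's naming.
dct_ = {
    "apple": "fruit",
    "banana": "fruit",
    "sandwich": "junkfood",
    "avengers": "action",
    "annabelle": "horror",
    "virat kohli": "cricket",
    "federer": "tennis",
    "nadal": "tennis",
}

# Shortened token lists standing in for column B of the data frame.
rows = [
    ["i", "like", "apples", "banana", "lot", "sandwich"],
    ["avengers", "series", "annabelle", "movies"],
    ["virat kohli", "batsmen", "federer", "nadal", "tennis"],
]


def match_tokens(tokens, mapping):
    """Return (matched keys, de-duplicated values) for one token list."""
    # Exact-match lookup: 'apples' will not match the key 'apple'.
    keys = [tok for tok in tokens if tok in mapping]
    # dict.fromkeys keeps first-seen order while dropping repeated values.
    values = list(dict.fromkeys(mapping[k] for k in keys))
    return keys, values


results = [match_tokens(tokens, dct_) for tokens in rows]
for keys, values in results:
    print(keys, values)
```

With a data frame, the same helper could be applied per row, for example `df['B'].apply(lambda lst: match_tokens(lst, dct_)[1])` for the value column, assuming column B holds real Python lists rather than their string representations.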