How do I get the row index when using .apply() in Python?

Question

I have a DataFrame whose rows contain lists, like this:

In [11]: import pandas as pd

In [12]: str1 = 'The weight of a apple'
         str2 = 'Apple MacBook release date news and rumors'

         list1 = ['DET', 'NOUN', 'ADP', 'DET', 'NOUN']
         list2 = ['PROPN', 'NOUN', 'NOUN', 'NOUN', 'CCONJ', 'PROPN']

         df = pd.DataFrame(
             {
                 'col1': [str1, str2],
                 'col2': [list1, list2]        
             }
         )

         df

Out[12]: 
                                         col1                                        col2  
0                       The weight of a apple                 [DET, NOUN, ADP, DET, NOUN]
1  Apple MacBook release date news and rumors     [PROPN, NOUN, NOUN, NOUN, CCONJ, PROPN]

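I am using a user-defined function to check whether the keyword 'apple' occurs in col1 and, by calling .apply() in pandas, to get its position. I then want to fetch the item at that position from the list in col2.

However, I don't know how to get the index of the current row while .apply() is running the user-defined function.

This is what I am trying to do.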

In [14]: # Find occurrence of 'apple' keyword
         def find_apple(text):
           keyword = 'apple'
           words = text.lower().split(' ')

           if keyword in words:    
             word_index = words.index(keyword)
             value = df.col2[curr_row_index][word_index]
             print(value)
           else:
             print('None')    

         # Function call using .apply() 
         df['col3'] = df['col1'].apply(find_apple)

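I would like to know how to get the value of curr_row_index, so that I can look up the matching row of the DataFrame while iterating.

I have tried df.index and row.name to no avail. Perhaps someone can explain what I am doing wrong.

P.S. I am new here and this is my first question, so apologies in advance for any missing information.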

Answer 1

If you only want the col2 values for rows where col1 contains 'apple', you can do it without a custom function by using Series.where.

df['col2'].where(df['col1'].str.lower().str.contains('apple'))
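
For reference, here is a small self-contained sketch of what that expression returns. The third row and the names mask and result are made up for illustration and are not part of the question's data. Series.where keeps the values where the mask is True and fills the rest with NaN.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'col1': ['The weight of a apple',
                 'Apple MacBook release date news and rumors',
                 'Banana bread recipe'],  # hypothetical extra row without the keyword
        'col2': [['DET', 'NOUN', 'ADP', 'DET', 'NOUN'],
                 ['PROPN', 'NOUN', 'NOUN', 'NOUN', 'CCONJ', 'PROPN'],
                 ['NOUN', 'NOUN', 'NOUN']],
    }
)

# Boolean Series: True where col1 contains 'apple' (case-insensitive substring match)
mask = df['col1'].str.lower().str.contains('apple')

# Keep col2 where the mask is True; the other rows become NaN
result = df['col2'].where(mask)
print(result)
```

Calling result.dropna() afterwards would leave only the matching rows.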

If you want to make sure 'apple' is a word on its own, rather than a substring of a larger word such as 'pineapple', you can do this:

df['col2'].where(df['col1'].str.lower().str.split().apply(lambda lst: 'apple' in lst))
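
A minimal sketch of the difference between the substring check and the whole-word check, lowercasing first so that 'Apple' also matches. The sample strings and the names substring_mask and word_mask are hypothetical.

```python
import pandas as pd

s = pd.Series(['Pineapple juice', 'An apple a day'])  # hypothetical sample text

# Substring match: 'pineapple' also counts as containing 'apple'
substring_mask = s.str.lower().str.contains('apple')

# Whole-word match: split on whitespace, then test list membership
word_mask = s.str.lower().str.split().apply(lambda words: 'apple' in words)

print(substring_mask.tolist())  # [True, True]
print(word_mask.tolist())       # [False, True]
```

Note that splitting on whitespace leaves punctuation attached, so a token such as 'apple,' would not match in this sketch.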

Answer 2

Refactor the function to operate on a row, then pass axis=1 when calling apply:

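def f(row):
    # row.name holds the index label of the current row, if you need it
    words = row.col1.lower().split()
    value = None
    # test against the split words so 'pineapple' cannot trigger a ValueError in .index()
    if 'apple' in words:
        # the keyword's position in col1 selects the matching item in col2
        value = row.col2[words.index('apple')]
    return value

df['col3'] = df.apply(f, axis=1)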

With the example DataFrame:

In [34]: print(df.to_string())
                                         col1                                     col2   col3
0                       The weight of a apple              [DET, NOUN, ADP, DET, NOUN]   NOUN
1  Apple MacBook release date news and rumors  [PROPN, NOUN, NOUN, NOUN, CCONJ, PROPN]  PROPN
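
To tie this back to curr_row_index in the question: with axis=1 each call receives the whole row as a Series, and row.name is that row's index label. With df['col1'].apply(...), the function only receives the cell value, so there is no index to read. The sketch below assumes non-default index labels ('a', 'b') purely to make the index visible; those labels and the keyword parameter are illustrative additions, not part of the question.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'col1': ['The weight of a apple',
                 'Apple MacBook release date news and rumors'],
        'col2': [['DET', 'NOUN', 'ADP', 'DET', 'NOUN'],
                 ['PROPN', 'NOUN', 'NOUN', 'NOUN', 'CCONJ', 'PROPN']],
    },
    index=['a', 'b'],  # hypothetical labels, chosen to make the index visible
)

def find_apple(row, keyword='apple'):
    """Return the col2 item at the keyword's position in col1, or None."""
    words = row['col1'].lower().split()
    if keyword not in words:
        return None
    curr_row_index = row.name  # index label of the row being processed
    # Look the list up through the index, as the question intended
    return df.at[curr_row_index, 'col2'][words.index(keyword)]

df['col3'] = df.apply(find_apple, axis=1)
print(df['col3'].tolist())  # ['NOUN', 'PROPN']
```

Going through df.at here only mirrors the question's code; row['col2'] gives the same list directly, which is why the answer above does not need the index at all.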

