Pandas: filter rows based on values in a Series of dictionaries

Question

Some columns in my DataFrame hold dictionaries as their values, as in this DataFrame:

import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
                   'Age': [25, 30, 35],
                   'Location': [
                       {'City': 'Seattle', 'State': 'WA'},
                       {'City': 'New York', 'State': 'NY'},
                       {'City': 'Albany', 'State': 'NY'}
                   ]
                   })

df
    Name    Age Location
0   Alice   25  {'City': 'Seattle', 'State': 'WA'}
1   Bob     30  {'City': 'New York', 'State': 'NY'}
2   Aritra  35  {'City': 'Albany', 'State': 'NY'}

How can I filter the DataFrame based on a value inside that dictionary?

When I only want a single value, I can do this:

df['Location'][0]['State'] 
'WA'

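The problem is that a row index is required between the column name and the dictionary key. So something like

df[df['Location']['State'] == 'NY']

to select everyone from New York does not work.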

Is there a way to say "any index" in that position, or does this have to be done some other way?

The desired output is

    Name    Age Location
1   Bob     30  {'City': 'New York', 'State': 'NY'}
2   Aritra  35  {'City': 'Albany', 'State': 'NY'}

Answer 1

Use str.get to access the dictionary key, combined with boolean indexing:

out = df[df['Location'].str.get('State').eq('NY')]

Alternatively, use a list comprehension:

out = df[[d.get('State')=='NY' for d in df['Location']]]

Output:

     Name  Age                             Location
1     Bob   30  {'City': 'New York', 'State': 'NY'}
2  Aritra   35    {'City': 'Albany', 'State': 'NY'}
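For reference, here is a self-contained sketch that runs both approaches on the question's data and checks that they agree. The third variant is not part of the answer above: `state_of` is a hypothetical helper, added on the assumption that some rows might hold something other than a dict (for example None), in which case the plain list comprehension would raise an AttributeError.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Aritra'],
    'Age': [25, 30, 35],
    'Location': [
        {'City': 'Seattle', 'State': 'WA'},
        {'City': 'New York', 'State': 'NY'},
        {'City': 'Albany', 'State': 'NY'},
    ],
})

# Approach 1: .str.get looks up the key in each dict of the column
mask_str = df['Location'].str.get('State').eq('NY')

# Approach 2: list comprehension producing a plain list of booleans
mask_list = [d.get('State') == 'NY' for d in df['Location']]

# Approach 3 (assumption, not from the answer): guard against
# non-dict entries such as None before calling .get
def state_of(loc):
    """Hypothetical helper: return the 'State' value, or None if loc is not a dict."""
    return loc.get('State') if isinstance(loc, dict) else None

mask_apply = df['Location'].apply(state_of).eq('NY')

# Any of the three masks can be used for boolean indexing
out = df[mask_str]
print(out)
```

All three masks select the same rows here; the guarded version only matters if the column is not guaranteed to contain a dict in every row.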
