Extracting a column from a pandas DataFrame


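How can I extract a column, as a pandas DataFrame, from the following pandas DataFrame whose cells contain dictionaries? (I need all the values of 'name', together with the index.)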

nutrients_df
Out[63]: 
                                          nutrients
0     [{'code': '203cp1252', 'name': 'Proteincp1252'...
1     [{'code': '203cp1252', 'name': 'Proteincp1252'...
2     [{'code': '203cp1252', 'name': 'Proteincp1252'...
3     [{'code': '203cp1252', 'name': 'Proteincp1252'...
4     [{'code': '203cp1252', 'name': 'Proteincp1252'...
5     [{'code': '203cp1252', 'name': 'Proteincp1252'...
6     [{'code': '203cp1252', 'name': 'Proteincp1252'...

`nutrients_df` is built as a pandas DataFrame from a JSON database, as follows:

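nutrients = []  # the original initialised nutrient_name, leaving nutrients undefined
for index, row in data_df.iterrows():
    nutrients.append(row['nutrients'])
# Build the DataFrame once, after the loop has collected every row
nutrients_df = pd.DataFrame({'nutrients': nutrients})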
python python-3.x pandas dictionary
1 Answer

I'm not sure what data types your df.nutrients series contains. Below are some examples of how to extract name from dictionary-like objects.

import pandas as pd
from ast import literal_eval

# If your columns are genuine dictionaries
df = pd.DataFrame([[{'code': '203cp1252', 'name': 'Proteincp1252'}],
                   [{'code': '203cp1252', 'name': 'Proteincp1253'}],
                   [{'code': '203cp1252', 'name': 'Proteincp1254'}]],
                  columns=['nutrients'])

df['name'] = df['nutrients'].apply(lambda x: x['name'])

# If your column is a string
df = pd.DataFrame([["{'code': '203cp1252', 'name': 'Proteincp1252'}"],
                   ["{'code': '203cp1252', 'name': 'Proteincp1253'}"],
                   ["{'code': '203cp1252', 'name': 'Proteincp1254'}"]],
                  columns=['nutrients'])

df['name'] = df['nutrients'].apply(lambda x: literal_eval(x)['name'])
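
The output in the question shows each cell starting with `[{`, which suggests that every row holds a list of dictionaries rather than a single dictionary. Here is a minimal sketch for that case. It assumes the cells are real Python lists (not strings), and the sample data is hypothetical: the second nutrient entry is invented for illustration. It first checks the cell type, then uses `Series.explode` (pandas 0.25 or later) to get one row per dictionary while keeping the original index.

```python
import pandas as pd

# Hypothetical sample shaped like the question's output:
# each cell is a list of dictionaries.
nutrients_df = pd.DataFrame({
    'nutrients': [
        [{'code': '203cp1252', 'name': 'Proteincp1252'},
         {'code': '204', 'name': 'Fat'}],            # invented second entry
        [{'code': '203cp1252', 'name': 'Proteincp1252'}],
    ]
})

# Check which Python type the cells actually hold before choosing an approach.
cell_types = nutrients_df['nutrients'].map(type).value_counts()
print(cell_types)

# One row per dictionary; explode() repeats the original index for each
# list item, and dropna() discards rows that came from empty lists.
exploded = nutrients_df['nutrients'].explode().dropna()

# Pull 'name' out of each dictionary and keep the result as a DataFrame.
names_df = exploded.map(lambda d: d['name']).to_frame('name')
print(names_df)
```

Because the index is repeated, `names_df.loc[0]` returns every name that belonged to row 0 of the original frame. If the cells turn out to be strings instead, applying `literal_eval` to the column first, as in the second example above, should bring them into this shape.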