Python Pandas - Iterating over unique column values

Question  Votes: 0  Answers: 1

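I am trying to iterate over a list of unique column values to build a dictionary with three different keys, each holding a list of dictionaries (one per row). This is my current code: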

import pandas as pd

dataDict = {}
metrics = frontendFrame['METRIC'].unique()

for metric in metrics:
    dataDict[metric] = frontendFrame[frontendFrame['METRIC'] == metric].to_dict('records')

print(dataDict)

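This works for small amounts of data, but as the data grows it can take almost a second (!!!!).

I tried groupby in pandas, which was even slower, and also map, but I don't want to get the results back as a list. How can I iterate over this and build the structure I want faster? I'm using Python 3.6.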

Update:

Input:

    DATETIME             METRIC  ANOMALY           VALUE
0   2018-02-27 17:30:32  SCORE      2.0                    -1.0
1   2018-02-27 17:30:32  VALUE      NaN                     0.0
2   2018-02-27 17:30:32  INDEX      NaN  6.6613381477499995E-16
3   2018-02-27 17:31:30  SCORE      2.0                    -1.0
4   2018-02-27 17:31:30  VALUE      NaN                     0.0
5   2018-02-27 17:31:30  INDEX      NaN  6.6613381477499995E-16
6   2018-02-27 17:32:30  SCORE      2.0                    -1.0
7   2018-02-27 17:32:30  VALUE      NaN                     0.0
8   2018-02-27 17:32:30  INDEX      NaN  6.6613381477499995E-16

Output:

{
  "INDEX": [
    {
      "DATETIME": 1519759710000,
      "METRIC": "INDEX",
      "ANOMALY": null,
      "VALUE": "6.6613381477499995E-16"
    },
    {
      "DATETIME": 1519759770000,
      "METRIC": "INDEX",
      "ANOMALY": null,
      "VALUE": "6.6613381477499995E-16"
    }
  ],
  "SCORE": [
    {
      "DATETIME": 1519759710000,
      "METRIC": "SCORE",
      "ANOMALY": 2,
      "VALUE": "-1.0"
    },
    {
      "DATETIME": 1519759770000,
      "METRIC": "SCORE",
      "ANOMALY": 2,
      "VALUE": "-1.0"
    }
  ],
  "VALUE": [
    {
      "DATETIME": 1519759710000,
      "METRIC": "VALUE",
      "ANOMALY": null,
      "VALUE": "0.0"
    },
    {
      "DATETIME": 1519759770000,
      "METRIC": "VALUE",
      "ANOMALY": null,
      "VALUE": "0.0"
    }
  ]
}
python python-3.x pandas loops dataframe
1 Answer
1 vote

One possible solution:

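from collections import defaultdict

# Variant 1: a dict comprehension used only for its side effect of filling `a`
a = defaultdict(list)
_ = {x['METRIC']: a[x['METRIC']].append(x) for x in frontendFrame.to_dict('records')}
a = dict(a)

# Variant 2: the same grouping written as a plain loop (easier to read)
a = defaultdict(list)
for x in frontendFrame.to_dict('records'):
    a[x['METRIC']].append(x)
a = dict(a)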
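
To try the single-pass grouping end to end, here is a minimal sketch. The sample frame is an assumption that mirrors the first six rows of the question's input, and the helper name `group_records` is hypothetical, not part of pandas. The idea is to convert the frame to records once and bucket each record by its METRIC value, rather than filtering the frame once per unique value.

```python
from collections import defaultdict

import pandas as pd

# Assumed sample data, modelled on the first six rows of the question's input.
frontendFrame = pd.DataFrame({
    "DATETIME": pd.to_datetime(
        ["2018-02-27 17:30:32"] * 3 + ["2018-02-27 17:31:30"] * 3
    ),
    "METRIC": ["SCORE", "VALUE", "INDEX"] * 2,
    "ANOMALY": [2.0, None, None] * 2,
    "VALUE": [-1.0, 0.0, 6.6613381477499995e-16] * 2,
})


def group_records(frame, key):
    """Hypothetical helper: convert the frame to records once, then bucket by `key`."""
    grouped = defaultdict(list)
    for record in frame.to_dict("records"):
        grouped[record[key]].append(record)
    return dict(grouped)  # plain dict, so missing keys raise KeyError again


dataDict = group_records(frontendFrame, "METRIC")
print(sorted(dataDict))        # the three metric names
print(len(dataDict["SCORE"]))  # number of SCORE rows in the sample
```

Whether this is actually faster than the per-metric filter depends on the size and shape of the data, so it is worth timing both on a realistic frame.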

Slower:

dataDict = frontendFrame.groupby('METRIC').apply(lambda x: x.to_dict('records')).to_dict()
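
If groupby is still preferred for readability, one alternative is to iterate over the groups directly instead of going through apply. This is a sketch on an assumed four-row frame (the name `frame` and its contents are illustrative only), and no claim is made here about its speed relative to the other approaches; it should be measured on the real data.

```python
import pandas as pd

# Assumed toy data for illustration only.
frame = pd.DataFrame({
    "METRIC": ["SCORE", "VALUE", "INDEX", "SCORE"],
    "VALUE": [-1.0, 0.0, 6.6613381477499995e-16, -1.0],
})

# Iterating a GroupBy object yields (key, sub-frame) pairs,
# so each sub-frame can be converted to records directly.
by_metric = {
    metric: group.to_dict("records")
    for metric, group in frame.groupby("METRIC")
}
print(sorted(by_metric))
```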