Extracting a dictionary into pandas


Given this dictionary:

dictionary = {
    "base_name" : ["abc", "abc_2"],
    "conditional" : {
        "base_name" : "abc_cond",
        "conditions" : [{"func" : "==", "ref" : 1}],
        "res" : -0.044
    },
    "outlier" : 0.232,
    "out_name" : "var",
    "transformation" : [
        {"func" : "==", "ref" : -1, "res" : -0.323},
        {"func" : "<=", "ref" : 23, "res" : -0.123},
        {"func" : ">", "ref" : -5, "res" : -0.433},
        {"func" : "else", "res" : -0.663},
    ]
}

What is the best way to extract these values into a pandas DataFrame?

python pandas dataframe dictionary
1 Answer

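Something like this should do it. I'm not 100% sure about "outlier", though. Is that key always called that? Or do you need to work out what the other key is, which in this case happens to be outlier?

Is there a list of dictionaries to process, or just this one?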

import pandas as pd

dictionary = {
    "base_name" : ["abc", "abc_2"],
    "conditional" : {
        "base_name" : "abc_cond",
        "conditions" : [{"func" : "==", "ref" : 1}],
        "res" : -0.044
    },
    "outlier" : 0.232,
    "out_name" : "var",
    "transformation" : [
        {"func" : "==", "ref" : -1, "res" : -0.323},
        {"func" : "<=", "ref" : 23, "res" : -0.123},
        {"func" : ">", "ref" : -5, "res" : -0.433},
        {"func" : "else", "res" : -0.663},
    ]
}

data = []

# Root level
row = {
    "name": dictionary["out_name"],
    "org_name": dictionary["base_name"],
    "func": "outlier",
    "ref": "",
    "res": dictionary["outlier"]
}

data.append(row)

# Conditional
row = {
    "name": dictionary["out_name"],
    "org_name": dictionary["conditional"]["base_name"],
    "func": dictionary["conditional"]["conditions"][0]["func"],
    "ref": dictionary["conditional"]["conditions"][0]["ref"],
    "res": dictionary["conditional"]["res"]
}

data.append(row)

# Each transformation
for transformation in dictionary["transformation"]:
    row = {
        "name": dictionary["out_name"],
        "org_name": dictionary["base_name"],
        "func": transformation["func"],
        "ref": transformation.get("ref", ""),
        "res": transformation["res"]
    }
    
    data.append(row)

df = pd.DataFrame(data=data)

print(df)

Output:

  name      org_name     func ref    res
0  var  [abc, abc_2]  outlier      0.232
1  var      abc_cond       ==   1 -0.044
2  var  [abc, abc_2]       ==  -1 -0.323
3  var  [abc, abc_2]       <=  23 -0.123
4  var  [abc, abc_2]        >  -5 -0.433
5  var  [abc, abc_2]     else     -0.663
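
If there does turn out to be a list of dictionaries, one option is to wrap the row-building steps above in a function and call it once per dictionary. The following is only a sketch under assumptions: the function name dict_to_rows, the shortened first dictionary and the second sample dictionary are made up for illustration, and every dictionary is assumed to have the same keys as the one in the question. Unlike the code above, it emits one row per entry in "conditions" rather than reading only the first.

```python
import pandas as pd


def dict_to_rows(d):
    """Flatten one dictionary of the shape shown above into a list of row dicts."""
    # Root level: the outlier value.
    rows = [{
        "name": d["out_name"],
        "org_name": d["base_name"],
        "func": "outlier",
        "ref": "",
        "res": d["outlier"],
    }]

    # Conditional: one row per condition, all sharing the same res.
    cond = d["conditional"]
    for c in cond["conditions"]:
        rows.append({
            "name": d["out_name"],
            "org_name": cond["base_name"],
            "func": c["func"],
            "ref": c.get("ref", ""),
            "res": cond["res"],
        })

    # Transformations: "ref" may be absent (for example in the "else" entry).
    for t in d["transformation"]:
        rows.append({
            "name": d["out_name"],
            "org_name": d["base_name"],
            "func": t["func"],
            "ref": t.get("ref", ""),
            "res": t["res"],
        })
    return rows


# Hypothetical input list: a shortened copy of the dictionary from the
# question plus an invented second entry.
dictionaries = [
    {
        "base_name": ["abc", "abc_2"],
        "conditional": {
            "base_name": "abc_cond",
            "conditions": [{"func": "==", "ref": 1}],
            "res": -0.044,
        },
        "outlier": 0.232,
        "out_name": "var",
        "transformation": [
            {"func": "==", "ref": -1, "res": -0.323},
            {"func": "else", "res": -0.663},
        ],
    },
    {
        "base_name": ["xyz"],
        "conditional": {
            "base_name": "xyz_cond",
            "conditions": [{"func": ">", "ref": 0}, {"func": "<", "ref": 10}],
            "res": 0.1,
        },
        "outlier": 0.5,
        "out_name": "var_2",
        "transformation": [{"func": "else", "res": 0.0}],
    },
]

# Collect the rows of every dictionary into a single frame.
df_all = pd.DataFrame([row for d in dictionaries for row in dict_to_rows(d)])

print(df_all)
```

Building one flat list of rows and constructing the DataFrame once at the end avoids concatenating many small frames.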

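Your attempt only picks up the transformation part of the dictionary. You also need to fetch the other parts of the dictionary explicitly.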

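Also, set the ref value to an empty string before unpacking the transformation dict. Otherwise the entry without a ref becomes NaN, and in an otherwise numeric column that implicitly converts all the ints to floats.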

import pandas as pd

dictionary = {
    "base_name" : ["abc", "abc_2"],
    "conditional" : {
        "base_name" : "abc_cond",
        "conditions" : [{"func" : "==", "ref" : 1}],
        "res" : -0.044
    },
    "outlier" : 0.232,
    "out_name" : "var",
    "transformation" : [
        {"func" : "==", "ref" : -1, "res" : -0.323},
        {"func" : "<=", "ref" : 23, "res" : -0.123},
        {"func" : ">", "ref" : -5, "res" : -0.433},
        {"func" : "else", "res" : -0.663},
    ]
}

data = []

# Root level
data.append({
    "name": dictionary["out_name"],
    "org_name": dictionary["base_name"],
    "func": "outlier",
    "ref": "",
    "res": dictionary["outlier"]
})

# Conditional
data.append({
    "name": dictionary["out_name"],
    "org_name": dictionary["conditional"]["base_name"],
    "func": dictionary["conditional"]["conditions"][0]["func"],
    "ref": dictionary["conditional"]["conditions"][0]["ref"],
    "res": dictionary["conditional"]["res"]
})

# Transformations
for transformation in dictionary["transformation"]:
    data.append({
        "name": dictionary["out_name"],
        "org_name": dictionary["base_name"],
        # If ref is missing in one of the dicts, pandas implicitly
        # transforms all the ints to floats.
        "ref": "",
        **transformation,
    })

df = pd.DataFrame(data=data)

print(df)

Output:

  name      org_name     func ref    res
0  var  [abc, abc_2]  outlier      0.232
1  var      abc_cond       ==   1 -0.044
2  var  [abc, abc_2]       ==  -1 -0.323
3  var  [abc, abc_2]       <=  23 -0.123
4  var  [abc, abc_2]        >  -5 -0.433
5  var  [abc, abc_2]     else     -0.663
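
The implicit int-to-float conversion mentioned above can be seen in isolation with a small sketch. The rows list here is a hypothetical subset of the transformation entries, chosen so that the column would otherwise hold only integers.

```python
import pandas as pd

# Hypothetical rows for illustration: the last one has no "ref" key.
rows = [
    {"func": "==", "ref": -1, "res": -0.323},
    {"func": "<=", "ref": 23, "res": -0.123},
    {"func": "else", "res": -0.663},
]

# Missing key -> NaN, and NaN forces the integer column to float64.
missing = pd.DataFrame(rows)

# Supplying a default before unpacking keeps the ints as they are,
# at the cost of an object-dtype column that mixes ints and a string.
filled = pd.DataFrame([{"ref": "", **r} for r in rows])

print(missing["ref"].dtype, missing["ref"].tolist())
print(filled["ref"].dtype, filled["ref"].tolist())
```

Because the default comes before **r in the dict literal, any ref present in the row overrides the empty string.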