Pandas DataFrame Styler keeps overwriting previous styles in the DataFrame


I have a Pandas DataFrame with several columns.

# Example data frame
import pandas as pd

df = pd.DataFrame({'A':[1,15,10,47,35],
    'B':["Mac","Mac","Mac","Mac","Mac"],
    'C':["Dog","Dog","Cat","Dog","Tiger"],
    'D':["CDN", "USD", "CDN", "Pe", "Dr"]
})

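I want to color each element in columns "B", "C" and "D" according to its relative frequency within its column. For example, the relative frequency of "CDN" in column "D" is 2/5 = 0.4.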

These are my color criteria based on relative frequency:

Relative frequency                  Color
>= 0.90                             green
< 0.90 and >= 0.30                  yellow
< 0.30                              red

Since the relative frequency of "CDN" in column "D" is 0.4, that cell would be given a yellow background color.
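The criteria above can be sketched in a few lines. This is only an illustration under the thresholds stated in the question: `color_for` is a hypothetical helper name, and `value_counts(normalize=True)` is one way to obtain the relative frequencies, not necessarily what the asker's `determine_Freq` does.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 15, 10, 47, 35],
    "B": ["Mac", "Mac", "Mac", "Mac", "Mac"],
    "C": ["Dog", "Dog", "Cat", "Dog", "Tiger"],
    "D": ["CDN", "USD", "CDN", "Pe", "Dr"],
})


def color_for(freq):
    """Map a relative frequency to a color name (hypothetical helper)."""
    if freq >= 0.90:
        return "green"
    if freq >= 0.30:
        return "yellow"
    return "red"


# Relative frequency of every distinct value, computed per column
rel_freq = {col: df[col].value_counts(normalize=True) for col in ["B", "C", "D"]}

# Look up each cell's frequency in its column, then turn it into a color name
colors = {col: df[col].map(rel_freq[col]).map(color_for) for col in rel_freq}
```

Here `colors["D"]` holds one color name per row of column "D", so both "CDN" cells map to yellow.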

I know how to find the relative frequency of each element in a column and how to color an element.

My problem is that the styling of one column keeps being overwritten by the styling of another. Here is my code:

RemvColOfInterest = ['B', 'C', 'D'] # These are the columns whose elements we want to color

lstcollectionOverallRelFreqs = ['some relative frequencies'] # You don't have to worry about this

colIndexList = [] # This is the index of each of the columns in RemvColOfInterest
s = 0

while (s < len(RemvColOfInterest)):
    colIndexList.append(s)
    s = s + 1
    
tempdf = copy.copy(df)
    
for g, h in zip( RemvColOfInterest, colIndexList ): 
    df = tempdf.style.applymap(highlight_cell, lstFreq = lstcollectionOverallRelFreqs, colIndex = h, subset = pd.IndexSlice[:, [g]])

# If I output my df to an excel file:
df.to_excel("My file path", index = False)

def highlight_cell(value, lstFreq, colIndex):
    Freq = determine_Freq(lstFreq[colIndex]) # All you need to know is that this is the function that finds the relative frequency associated with the element/cell
            
    threshold1 = 0.90
    threshold2 = 0.30
    
    if (Freq >= threshold1):
        return 'background-color: green;'
    elif ((Freq < threshold1) and (Freq >= threshold2)): 
        return 'background-color: yellow;'
    else:
        return 'background-color: red;'

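In the Excel file, only the elements in column "D" have a background color. Columns "B" and "C" just have the usual white background. This leads me to believe that the styles of columns "B" and "C" are being overwritten by the style of column "D". How do I prevent this from happening?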

I believe this is the problematic line (because it sits inside the for loop, the styling of df is replaced by a new style on every iteration):

df = tempdf.style.applymap(highlight_cell, lstFreq = lstcollectionOverallRelFreqs, colIndex = h, subset = pd.IndexSlice[:, [g]])

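The thing is, I only consider one column at a time when applying the style (via the subset argument). So why do the styles of different columns overwrite each other? If I do this instead: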

df[g] = tempdf.style.applymap(highlight_cell, lstFreq = lstcollectionOverallRelFreqs, colIndex = h, subset = pd.IndexSlice[:, [g]])

then every cell in columns "B", "C" and "D" contains pandas.io.formats.style.Styler object at 0x00000... Any pointers or suggestions?

The output Excel file for the example should look like this:

python pandas background-color
1 Answer

Try this (you need to have xlsxwriter installed). For more about conditional formatting, see the XlsxWriter documentation.

import pandas as pd

df = pd.DataFrame(
    {
        "A": [1, 15, 10, 47, 35],
        "B": ["Mac", "Mac", "Mac", "Mac", "Mac"],
        "C": ["Dog", "Dog", "Cat", "Dog", "Tiger"],
        "D": ["CDN", "USD", "CDN", "Pe", "Dr"],
    }
)

writer = pd.ExcelWriter("out.xlsx", engine="xlsxwriter")
df.to_excel(writer, sheet_name="Sheet1", index=False)

workbook = writer.book
worksheet = writer.sheets["Sheet1"]

format1 = workbook.add_format({"bg_color": "#98fb98", "font_color": "#111111"})
format2 = workbook.add_format({"bg_color": "#ffff31", "font_color": "#111111"})
format3 = workbook.add_format({"bg_color": "#fe2712", "font_color": "#111111"})

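# The DataFrame column names B, C, D happen to match the Excel column letters here
for c in ["B", "C", "D"]:
    # Relative frequency of each distinct value in column c
    vals = df[c].value_counts() / len(df)
    # Data rows start at Excel row 2 (row 1 holds the header)
    r = f"{c}2:{c}{len(df) + 1}"
    for i, v in vals.items():
        # Thresholds follow the question: >= 0.90 green, >= 0.30 yellow, otherwise red
        fmt = format1 if v >= 0.9 else (format2 if v >= 0.3 else format3)
        worksheet.conditional_format(
            r, {"type": "cell", "criteria": "==", "value": f'"{i}"', "format": fmt}
        )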

writer.close()

This creates out.xlsx (screenshot from LibreOffice):


Edit: openpyxl version:

import pandas as pd

df = pd.DataFrame(
    {
        "A": [1, 15, 10, 47, 35],
        "B": ["Mac", "Mac", "Mac", "Mac", "Mac"],
        "C": ["Dog", "Dog", "Cat", "Dog", "Tiger"],
        "D": ["CDN", "USD", "CDN", "Pe", "Dr"],
    }
)

format1 = "background-color: #98fb98; color: #111111"
format2 = "background-color: #ffff31; color: #111111"
format3 = "background-color: #fe2712; color: #111111"


def fn(x):
    if x.name == "A":
        return [""] * len(x)

    vals = x.value_counts() / len(x)
    return [
        format1 if v >= 0.9 else (format2 if v >= 0.3 else format3)
        for v in map(vals.get, x)
    ]


with pd.ExcelWriter("out.xlsx", engine="openpyxl") as writer:
    df.style.apply(fn).to_excel(writer, index=False, sheet_name="Sheet1")
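
On the overwriting in the original loop: every access to `tempdf.style` builds a new Styler, so the assignment in each iteration throws away the styles from the previous one and only column "D" survives. Below is a minimal sketch of keeping a single Styler and adding one column's styles per iteration. `color_by_frequency` and `style_columns` are hypothetical helper names, and the thresholds are the ones from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 15, 10, 47, 35],
    "B": ["Mac", "Mac", "Mac", "Mac", "Mac"],
    "C": ["Dog", "Dog", "Cat", "Dog", "Tiger"],
    "D": ["CDN", "USD", "CDN", "Pe", "Dr"],
})


def color_by_frequency(column):
    """Return one CSS string per cell, based on the value's relative frequency."""
    freqs = column.map(column.value_counts(normalize=True))
    styles = []
    for freq in freqs:
        if freq >= 0.90:
            styles.append("background-color: green;")
        elif freq >= 0.30:
            styles.append("background-color: yellow;")
        else:
            styles.append("background-color: red;")
    return styles


def style_columns(frame, columns):
    """Accumulate styles for several columns on ONE Styler (hypothetical helper)."""
    styler = frame.style  # create the Styler once, outside the loop
    for col in columns:
        # Styler.apply returns the same Styler, so earlier columns keep their styles
        styler = styler.apply(color_by_frequency, subset=[col])
    return styler
```

With these definitions, `style_columns(df, ["B", "C", "D"]).to_excel("out.xlsx", index=False)` should write all three colored columns, assuming an Excel engine such as openpyxl is installed. Assigning a Styler to `df[g]` cannot work, because a Styler is a wrapper around the whole DataFrame rather than a column of values.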