Openpyxl: change the color of words from a list, not the whole cell


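I want to change the color of the parts of a cell's text that match words in prev_list to red. A sentence can contain several prev_list words. My code currently changes the color of only the last matched word. Please help.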

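Example: I have a dog, a cat and a fish

Desired: I have a dog, a cat and a fish (with "dog", "cat" and "fish" in red font)

Current: I have a dog, a cat and a fish (only "fish" is in red font)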

from openpyxl import Workbook
from openpyxl.cell.text import InlineFont
from openpyxl.cell.rich_text import TextBlock, CellRichText
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
import re


prev_words = pd.read_excel("prev_file.xlsx", usecols=['word'])
prev_list = prev_words['word'].tolist()

wb = Workbook()
ws = wb.active
df = pd.read_excel("new_file.xlsx")

for r in dataframe_to_rows(df, index=False, header=True):
    ws.append(r)

red = InlineFont(color='00FF0000')
black = InlineFont(color='00000000')

for row in ws.iter_rows(min_row=0):
    for cell in row:
        cell_value = str(cell.value)
        rich_text = CellRichText()

        contains_prev_word = False
        for word in prev_list:
            if word in cell_value:
                contains_prev_word = True
                break

        if not contains_prev_word:
            rich_text.append(TextBlock(black, cell_value))
        else:
            for match in re.finditer(word, cell_value):
                start_idx = match.start()
                end_idx = start_idx + len(word)
                rich_text.append(TextBlock(black, cell_value[:start_idx]))
                rich_text.append(TextBlock(red, cell_value[start_idx:end_idx]))
                rich_text.append(TextBlock(black, cell_value[end_idx:]))
        cell.value = rich_text

wb.save("resultresult2.xlsx")
wb.close()
1 Answer

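I'm guessing the requirement is something like this: a row in "new_file.xlsx" contains a sentence, and the words in that sentence that also appear in the list should be highlighted in red.

As an example, take a handful of words extracted from "prev_file.xlsx" into "prev_list".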

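This method builds a list of lists describing where the red highlighting needs to appear.
The list looks like this;
[[10, 3], [55, 3], [2, 4], [62, 4], [32, 3], [80, 3]]

The first element of each inner list is the index at which the red highlight starts, and the second element is the length of the word to highlight.
The list is then sorted so that the text blocks can be applied to the cell text in order, which gives a list like this;
[[2, 4], [10, 3], [32, 3], [55, 3], [62, 4], [80, 3]]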
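The collect-and-sort step can be tried on its own, without Excel. This is a minimal sketch; the sentence and word list are made-up sample data (assumptions for illustration), not the contents of the files above.

```python
import re

# Hypothetical sample data, not taken from prev_file.xlsx / new_file.xlsx
sentence = "I have a dog, a cat and a fish"
words = ["fish", "dog", "cat"]

positions = []
for word in words:
    # re.escape() makes the word match literally, even if it contains regex metacharacters
    for match in re.finditer(re.escape(word), sentence):
        positions.append([match.start(), len(word)])

# Lists compare element by element, so sorted() orders by start index first
ordered = sorted(positions)

print(positions)  # [[26, 4], [9, 3], [16, 3]]
print(ordered)    # [[9, 3], [16, 3], [26, 4]]
```

Because `prev_list` is processed word by word, the unsorted positions follow the order of the words rather than the order of the text, which is why the sort is needed before slicing.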

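Then it is just a matter of looping over this list and applying each entry as a red text block in the cell text, leaving the text in between black.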

from openpyxl import Workbook
from openpyxl.cell.text import InlineFont
from openpyxl.cell.rich_text import TextBlock, CellRichText
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
import re


prev_words = pd.read_excel("prev_file.xlsx", usecols=['word'])
prev_list = prev_words['word'].tolist()

wb = Workbook()
ws = wb.active
df = pd.read_excel("new_file.xlsx")

for r in dataframe_to_rows(df, index=False, header=True):
    ws.append(r)

red = InlineFont(color='00FF0000')
black = InlineFont(color='00000000')

for row in ws.iter_rows():
    for cell in row:
        cell_value = str(cell.value)
        rich_text = CellRichText()

        wd_positions_list = []
        ### Create list of lists for RED highlight positions
        for word in prev_list:
            if word in cell_value:
                for match in re.finditer(re.escape(word), cell_value):  # escape so the word is matched literally
                    wd_positions_list.append([match.start(), len(word)])

        if wd_positions_list:  # If the list has values process them
            wd_positions_list = sorted(wd_positions_list)  # Sort the list by the first element in each list of lists

            blk_start_idx = 0  # Black text start index
            for word_position in wd_positions_list:
                start_idx = word_position[0]
                word_length = word_position[1]

                end_idx = start_idx + word_length

                rich_text.append(TextBlock(black, cell_value[blk_start_idx:start_idx]))
                rich_text.append(TextBlock(red, cell_value[start_idx:end_idx]))
                blk_start_idx = end_idx

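            # Append the text after the last highlighted word, otherwise it would be dropped
            rich_text.append(TextBlock(black, cell_value[blk_start_idx:]))
            cell.value = rich_text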

wb.save("result2.xlsx")
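
One case the position list does not cover is overlapping words (for example "cat" and "catfish" both being in the list), where the same characters would be emitted twice. A possible alternative, sketched here under that assumption, is to scan the text once with a single alternation pattern. `split_segments` is a hypothetical helper name, not part of openpyxl, and it only uses the standard `re` module.

```python
import re

def split_segments(text, words):
    """Return (chunk, is_match) pairs that together reproduce `text`."""
    # Ignore empty or non-string entries (e.g. NaN from an empty spreadsheet cell)
    words = [w for w in words if isinstance(w, str) and w]
    if not words:
        return [(text, False)] if text else []
    # Longest words first so "catfish" wins over "cat" at the same position
    ordered = sorted(words, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(w) for w in ordered))
    segments = []
    pos = 0  # start of the text not yet emitted
    for match in pattern.finditer(text):
        if match.start() > pos:
            segments.append((text[pos:match.start()], False))
        segments.append((match.group(), True))
        pos = match.end()
    if pos < len(text):
        segments.append((text[pos:], False))  # trailing unmatched text
    return segments

demo = split_segments("I have a dog, a cat and a fish.", ["fish", "dog", "cat"])
```

Each pair can then be appended to the `CellRichText` as `TextBlock(red if is_match else black, chunk)`, which keeps the matching logic testable independently of the workbook.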



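**Note** A few comments on your code;
Openpyxl rows start at 1, so the 'min_row' value in `ws.iter_rows(min_row=0)` is incorrect, and it is also unnecessary. Openpyxl automatically corrects it to 1, but since 1 is the default minimum anyway, there is no need to specify it. This line does the same thing;
for row in ws.iter_rows():

The `wb.close()` command only has an effect if you opened the XLSX file in read-only or write-only mode.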