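I want to change the color of the parts of a sentence that match words in prev_list to red. A single sentence can contain more than one word from prev_list. My code currently only changes the color of the last matching word. Please help.
Example: I have a dog, a cat and a fish
Desired: I have a dog (red font), a cat and a fish
Current: I have a dog, a cat and a fish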
from openpyxl import Workbook
from openpyxl.cell.text import InlineFont
from openpyxl.cell.rich_text import TextBlock, CellRichText
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
import re
prev_words = pd.read_excel("prev_file.xlsx", usecols=['word'])
prev_list = prev_words['word'].tolist()
wb = Workbook()
ws = wb.active
df = pd.read_excel("new_file.xlsx")
for r in dataframe_to_rows(df, index=False, header=True):
    ws.append(r)
red = InlineFont(color='00FF0000')
black = InlineFont(color='00000000')
for row in ws.iter_rows(min_row=0):
    for cell in row:
        cell_value = str(cell.value)
        rich_text = CellRichText()
        contains_prev_word = False
        for word in prev_list:
            if word in cell_value:
                contains_prev_word = True
                break
        if not contains_prev_word:
            rich_text.append(TextBlock(black, cell_value))
        else:
            for match in re.finditer(word, cell_value):
                start_idx = match.start()
                end_idx = start_idx + len(word)
                rich_text.append(TextBlock(black, cell_value[:start_idx]))
                rich_text.append(TextBlock(red, cell_value[start_idx:end_idx]))
                rich_text.append(TextBlock(black, cell_value[end_idx:]))
        cell.value = rich_text
wb.save("resultresult2.xlsx")
wb.close()
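I'm guessing the requirement is something like this:
a row in "new_file.xlsx" contains words that should be highlighted in red within the sentence, as shown below.
As an example, the words extracted from "prev_file.xlsx" into "prev_list" are;
This method builds a list of lists describing where the red highlighting needs to appear.
The list looks like this;
[[10, 3], [55, 3], [2, 4], [62, 4], [32, 3], [80, 3]]
The first element of each inner list is the index where the red highlight starts, and the second element is the length of the word to highlight.
The list is then sorted so that the text blocks can be applied to the cell text in order, which gives;
[[2, 4], [10, 3], [32, 3], [55, 3], [62, 4], [80, 3]]
After that, it is just a matter of looping over this list and applying each entry as a red text block in the cell text, leaving the text in between black.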
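The steps above can be sketched in plain Python, without openpyxl, to make the position list and the black/red split easier to check. The helper names `find_spans` and `split_segments` are hypothetical and exist only for this illustration; the sketch assumes every entry in the word list is a string and that matches do not overlap.

```python
import re


def find_spans(text, words):
    """Return sorted [start, length] pairs for every occurrence of each word."""
    spans = []
    for word in words:
        if not word:
            continue  # an empty pattern would match at every position
        # re.escape makes regex metacharacters in the word match literally
        for match in re.finditer(re.escape(word), text):
            spans.append([match.start(), len(word)])
    return sorted(spans)  # lists sort by first element, then by second


def split_segments(text, spans):
    """Split text into (colour, substring) pieces; assumes spans do not overlap."""
    segments = []
    pos = 0  # start of the next black stretch
    for start, length in spans:
        if start > pos:
            segments.append(("black", text[pos:start]))
        segments.append(("red", text[start:start + length]))
        pos = start + length
    if pos < len(text):
        segments.append(("black", text[pos:]))  # text after the last match
    return segments


text = "I have a dog, a cat and a fish"
spans = find_spans(text, ["cat", "dog"])
print(spans)  # [[9, 3], [16, 3]]
for colour, piece in split_segments(text, spans):
    print(colour, repr(piece))
```

Each `("red", ...)` or `("black", ...)` pair corresponds to one `TextBlock` appended to the cell's `CellRichText`.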
from openpyxl import Workbook
from openpyxl.cell.text import InlineFont
from openpyxl.cell.rich_text import TextBlock, CellRichText
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
import re
prev_words = pd.read_excel("prev_file.xlsx", usecols=['word'])
prev_list = prev_words['word'].tolist()
wb = Workbook()
ws = wb.active
df = pd.read_excel("new_file.xlsx")
for r in dataframe_to_rows(df, index=False, header=True):
    ws.append(r)
red = InlineFont(color='00FF0000')
black = InlineFont(color='00000000')
for row in ws.iter_rows():
    for cell in row:
        cell_value = str(cell.value)
        rich_text = CellRichText()
        wd_positions_list = []
        ### Create list of lists for RED highlight positions
        for word in prev_list:
            if word in cell_value:
                # re.escape so regex metacharacters in a word are matched literally
                for match in re.finditer(re.escape(word), cell_value):
                    wd_positions_list.append([match.start(), len(word)])
        if wd_positions_list:  # If the list has values process them
            wd_positions_list = sorted(wd_positions_list)  # Sort the list by the first element in each list of lists
            blk_start_idx = 0  # Black text start index
            for word_position in wd_positions_list:
                start_idx = word_position[0]
                word_length = word_position[1]
                end_idx = start_idx + word_length
                rich_text.append(TextBlock(black, cell_value[blk_start_idx:start_idx]))
                rich_text.append(TextBlock(red, cell_value[start_idx:end_idx]))
                blk_start_idx = end_idx
            # Keep the text that follows the last highlighted word
            rich_text.append(TextBlock(black, cell_value[blk_start_idx:]))
            # Only cells with at least one match are replaced with rich text
            cell.value = rich_text
wb.save("result2.xlsx")
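One caveat with collecting positions word by word: if two list entries overlap in the text (say "cat" and "catfish"), the same characters can end up in two blocks. A possible variation, not part of the approach above, is to combine all words into a single regular expression so that `finditer` returns non-overlapping matches already in order. The helper names `build_pattern` and `split_by_pattern` are hypothetical, and the sketch assumes all words are strings.

```python
import re


def build_pattern(words):
    """Compile one alternation pattern, longest words first."""
    # Longest first so "catfish" wins over "cat" at the same position.
    ordered = sorted({w for w in words if w}, key=len, reverse=True)
    if not ordered:
        return re.compile(r"(?!)")  # a pattern that never matches
    return re.compile("|".join(re.escape(w) for w in ordered))


def split_by_pattern(text, pattern):
    """Return (is_match, substring) pieces covering the whole text."""
    segments = []
    pos = 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            segments.append((False, text[pos:match.start()]))
        segments.append((True, match.group()))
        pos = match.end()
    if pos < len(text):
        segments.append((False, text[pos:]))
    return segments


pattern = build_pattern(["cat", "catfish", "dog"])
print(split_by_pattern("a catfish and a dog", pattern))
# [(False, 'a '), (True, 'catfish'), (False, ' and a '), (True, 'dog')]
```

In the worksheet loop, each `True` piece would become a red `TextBlock` and each `False` piece a black one.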