Building HTML from a Word table with merged cells in Python


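I want to rebuild in HTML a table loaded from a Word document, but merged cells are a major problem. The best I have managed so far is to return each cell's value without repeating merged cells, but I am stuck there and do not know how to continue.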

from docx import Document

def iter_unique_cells(row):
    prior_tc = None
    for cell in row.cells:
        this_tc = cell._tc
        if this_tc is prior_tc:
            continue
        prior_tc = this_tc
        yield cell


document = Document("document.docx")
for table in document.tables:
    for row in table.rows:
        for cell in iter_unique_cells(row):
            for paragraph in cell.paragraphs:
                print(paragraph.text)
python-3.x python-docx
1 Answer

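I would rewrite the iter_unique_cells function so that it also returns whether the current cell is merged. You can then carry that information into the HTML by adding colspan="2" to the <td></td> element, which should merge the cells (horizontally); a cell that spans more than two columns needs the actual number of columns as its colspan value. To build the HTML, I would declare a string outside all the loops, append each element's opening tag at the start of its iteration, and append the closing tag at the end.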

from docx import Document

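def iter_unique_cells(row):
    # Yield (cell, is_merged) once per distinct cell; row.cells repeats a
    # horizontally merged cell for every grid column it spans.
    prior_tc = None
    for cell in row.cells:
        this_tc = cell._tc
        if this_tc is prior_tc:
            continue
        prior_tc = this_tc
        # Assumption: _tc.grid_span (private API) is > 1 for a horizontal merge.
        yield cell, this_tc.grid_span > 1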

document = Document("document.docx")
html = ""
for table in document.tables:
    html += "<table>"
    for row in table.rows:
        html += "<tr>"
        for cell, is_merged in iter_unique_cells(row):
            html += "<td colspan='2'>" if is_merged else "<td>"
            for paragraph in cell.paragraphs:
                html += f"<p>{paragraph.text}</p>"
            html += "</td>"
        html += "</tr>"
    html += "</table>"
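
A fixed colspan of 2 only covers cells merged across exactly two columns. As a rough sketch of a more general approach, the span can be counted from how many times in a row the same cell appears consecutively, which is the repetition the question's iter_unique_cells filters out. The FakeCell class and the row_to_html helper below are hypothetical stand-ins so the example runs without a .docx file; with python-docx the grouping would presumably be done on the cell's underlying _tc element, as in the question's code, but that adaptation is not tested here. The text is also passed through html.escape so characters such as & or < do not break the markup.

```python
from dataclasses import dataclass
from html import escape
from itertools import groupby


@dataclass(eq=False)
class FakeCell:
    """Hypothetical stand-in for a python-docx cell; it only carries text."""
    text: str


def row_to_html(cells):
    """Render one row, collapsing consecutive repeats of the same cell
    object into a single <td> whose colspan is the number of repeats."""
    parts = ["<tr>"]
    # groupby with key=id groups consecutive entries that are the same object.
    for _, run in groupby(cells, key=id):
        run = list(run)
        span = len(run)
        attr = f' colspan="{span}"' if span > 1 else ""
        parts.append(f"<td{attr}>{escape(run[0].text)}</td>")
    parts.append("</tr>")
    return "".join(parts)


merged = FakeCell("A & B")   # appears twice below, like a 2-column merge
single = FakeCell("C")
print(row_to_html([merged, merged, single]))
# prints <tr><td colspan="2">A &amp; B</td><td>C</td></tr>
```

This only handles horizontal merges; vertical merges would need a rowspan attribute and tracking across rows, which is outside this sketch.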