Building HTML from a Word table with merged cells in Python

Question

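I want to build an HTML table from a table loaded from a Word document, but merged cells are a big problem. The best I have managed so far is to return each cell's value without repeating merged cells. I am stuck there and don't know how to continue.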

from docx import Document

def iter_unique_cells(row):
    prior_tc = None
    for cell in row.cells:
        this_tc = cell._tc
        if this_tc is prior_tc:
            continue
        prior_tc = this_tc
        yield cell


document = Document("document.docx")
for table in document.tables:
    for row in table.rows:
        for cell in iter_unique_cells(row):
            for paragraph in cell.paragraphs:
                print(paragraph.text)

Answer 1

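I would rewrite the iter_unique_cells function so that it also returns whether the current cell is merged. You can then carry this information into the HTML by adding colspan="2" to the <td></td> element, which should merge the cells horizontally. To build the HTML, I would declare a string outside all of the loops, then append each element's opening tag at the start of its iteration and its closing tag at the end.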

from html import escape

from docx import Document

def iter_unique_cells(row):
    ...  # modify to return cell, is_merged

document = Document("document.docx")
html = ""  # accumulates the markup for every table in the document
for table in document.tables:
    html += "<table>"
    for row in table.rows:
        html += "<tr>"
        for cell, is_merged in iter_unique_cells(row):
            # colspan='2' only fits a cell merged across exactly two columns
            html += "<td colspan='2'>" if is_merged else "<td>"
            for paragraph in cell.paragraphs:
                # escape the text so characters such as < and & stay valid HTML
                html += f"<p>{escape(paragraph.text)}</p>"
            html += "</td>"
        html += "</tr>"
    html += "</table>"
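The answer leaves the body of iter_unique_cells open. One possible way to fill it in, offered as a sketch rather than as the answerer's own method, is to count how many times the same cell is repeated in a row and use that count as the colspan instead of a fixed 2. The names iter_cells_with_span and table_to_html are hypothetical helpers introduced here. The sketch relies on the same assumption as the question's code, namely that row.cells repeats the same underlying _tc element once per column a merged cell covers. It handles horizontal merges only; vertically merged cells (rowspan) are not detected.

```python
from html import escape


def iter_cells_with_span(row):
    """Yield (cell, colspan) for each distinct cell in a row.

    Assumption: row.cells repeats the same _tc element once for every
    grid column that a horizontally merged cell covers.
    """
    prior_cell = None
    span = 0
    for cell in row.cells:
        if prior_cell is not None and cell._tc is prior_cell._tc:
            span += 1  # same merged cell seen again: widen the span
            continue
        if prior_cell is not None:
            yield prior_cell, span  # emit the cell we just finished counting
        prior_cell, span = cell, 1
    if prior_cell is not None:
        yield prior_cell, span  # emit the last cell of the row


def table_to_html(table):
    """Build an HTML string for one table (hypothetical helper)."""
    parts = ["<table>"]
    for row in table.rows:
        parts.append("<tr>")
        for cell, colspan in iter_cells_with_span(row):
            parts.append(f'<td colspan="{colspan}">' if colspan > 1 else "<td>")
            for paragraph in cell.paragraphs:
                # escape the text so < and & do not break the markup
                parts.append(f"<p>{escape(paragraph.text)}</p>")
            parts.append("</td>")
        parts.append("</tr>")
    parts.append("</table>")
    return "".join(parts)
```

With python-docx, this could be called as table_to_html(table) for each table in Document("document.docx").tables. Collecting the pieces in a list and joining once avoids repeated string concatenation, but the output has the same shape as the loop in the answer.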
