Extract all section numbers from a Word document using Python


I have a Word document with multiple sections. Here is a sample:

Document History
        Table of contents
        1. Introduction
        ...
        2. Purpose
        ...
        3. Scope
        ...
        4. Detailed levels of document
        ...
        4.1 Next level
            ...
        4.2 sub levels
            ...
            4.2.1 
            ...
            4.2.2
            ...
                4.2.2.1
                ...

I want to extract the section numbers at every level. The output should look like this:

1 2 3 4 4.1 4.2 4.2.1 4.2.2 4.2.2.1

import tkinter as tk
from tkinter import filedialog
import docx

root = tk.Tk()
root.withdraw()

file_path = filedialog.askopenfilename(filetypes=[("Word Documents", "*.docx")])

doc = docx.Document(file_path)

# Something is wrong here: this extracts only the headings, but I do not want
# the headings; I just need the section numbers alone, for all levels.
for paragraph in doc.paragraphs:
    if paragraph.style.name == 'Heading 1':
        print(paragraph.text)
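
One possible direction, offered as a sketch rather than a confirmed fix: the loop above only looks at 'Heading 1', so deeper levels are never visited, and when Word generates the numbering automatically the number is usually not part of `paragraph.text` at all. The code below assumes the numbered sections use the built-in styles 'Heading 1', 'Heading 2', and so on, and that unnumbered titles such as "Table of contents" use some other style; under that assumption the numbers can be rebuilt by counting headings per level. The names `section_numbers`, `typed_section_number`, and `section_numbers_from_docx` are hypothetical helpers, not part of python-docx.

```python
import re

# Assumption: numbered sections use Word's built-in "Heading N" styles.
HEADING_STYLE = re.compile(r"^Heading (\d+)$")

# Matches a leading number such as "1.", "4.2" or "4.2.2.1" typed into the text.
TYPED_NUMBER = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?(?:\s|$)")


def section_numbers(style_names):
    """Rebuild section numbers ("1", "4.2.1", ...) from a sequence of style names."""
    counters = []  # counters[i] is the current number at heading level i + 1
    numbers = []
    for name in style_names:
        match = HEADING_STYLE.match(name)
        if not match:
            continue  # not a heading paragraph
        level = int(match.group(1))
        del counters[level:]           # a new heading resets all deeper levels
        while len(counters) < level:   # pad if a level was skipped
            counters.append(0)
        counters[level - 1] += 1
        numbers.append(".".join(str(n) for n in counters))
    return numbers


def typed_section_number(text):
    """Return the number typed at the start of a heading's text, or None."""
    match = TYPED_NUMBER.match(text)
    return match.group(1) if match else None


def section_numbers_from_docx(path):
    """Apply section_numbers() to a .docx file (requires python-docx)."""
    import docx  # imported here so the helpers above work without python-docx

    doc = docx.Document(path)
    return section_numbers(p.style.name for p in doc.paragraphs)
```

With the file dialog from the question, `print(*section_numbers_from_docx(file_path))` would print the numbers on one line, separated by spaces. If the numbers were typed by hand into the heading text instead, `typed_section_number(paragraph.text)` may be enough on its own; note that it would also pick up any paragraph that merely starts with a number, so it is best applied to heading paragraphs only.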
python python-3.x python-docx