Multiplying the content of a Word file using Python and python-docx

Question

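I am building a Python application with tkinter and the python-docx library. It should read an existing Word document (including tables, paragraphs, and other typical Word elements) and reproduce its content within the same document a user-specified number of times. The purpose is to generate printable labels for deliveries.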

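Goal: the application should be able to:

1. Let the user select a .docx file.
2. Let the user specify how many times the content should be repeated.
3. Save a result that contains the original content duplicated that many times.

Current state: with the code below I do manage to duplicate content. However, only the first page seems to keep the original content, and the subsequent pages do not: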


import tkinter as tk
from tkinter import filedialog, messagebox
from docx import Document

class WordMultiplierApp:
    def __init__(self, root):
        self.root = root
        self.root.title('Word Content Multiplier')
        
        self.file_path = tk.StringVar()
        
        # UI components
        tk.Label(root, text="Select Word File:").pack(pady=10)
        tk.Button(root, text="Browse", command=self.browse_file).pack(pady=10)
        
        tk.Label(root, text="No. of Times to Multiply:").pack(pady=10)
        self.multiplier = tk.Entry(root)
        self.multiplier.pack(pady=10)
        
        tk.Button(root, text="Process", command=self.process_file).pack(pady=20)

    def browse_file(self):
        file = filedialog.askopenfilename(filetypes=[('Word Files', '*.docx')])
        if file:
            self.file_path.set(file)

    def process_file(self):
        if not self.file_path.get():
            messagebox.showerror("Error", "Please select a valid Word file.")
            return
        
        try:
            times = int(self.multiplier.get())
        except ValueError:
            messagebox.showerror("Error", "Enter a valid integer for multiplication.")
            return
        
        original_doc = Document(self.file_path.get())

        # Check if original doc is not empty
        if not original_doc.paragraphs:
            messagebox.showerror("Error", "The Word file seems empty or content cannot be read.")
            return

        # Multiply content of each paragraph
        paragraphs = list(original_doc.paragraphs)
        for para in paragraphs:
            original_text = para.text
            for _ in range(1, times):
                original_doc.add_paragraph(original_text)
                
        save_path = filedialog.asksaveasfilename(defaultextension=".docx", filetypes=[('Word Files', '*.docx')])
        if save_path:
            original_doc.save(save_path)
            messagebox.showinfo("Success", "File processed and saved successfully!")

if __name__ == "__main__":
    root = tk.Tk()
    app = WordMultiplierApp(root)
    root.mainloop()

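Challenge: how can I make sure the content is faithfully repeated the specified number of times, including tables, paragraphs, and other elements, with nothing left out?

Any insights or solutions would be greatly appreciated. Thank you in advance for your time and expertise.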

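Using the python-docx library, I tried to iterate over each paragraph of the original document and append its text the user-defined number of times. My expectation was simple: for every paragraph, table, or other Word element in the original document, the content would be duplicated exactly as many times as the user specified, while keeping the original formatting and structure.

When I ran it, however, I found an anomaly. The first page reproduces the content accurately, but the subsequent pages are inexplicably missing it. Since the goal is to generate printable labels, this inconsistency undermines the usefulness of the application.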

Edit, following @timroberts' comment:

Thanks for the feedback. I am currently using the PyPDF2 library in Python to work with PDFs. This is the code snippet I am using:

import PyPDF2

def multiply_pdf_content(input_pdf, output_pdf, times):
    with open(input_pdf, 'rb') as file:
        reader = PyPDF2.PdfFileReader(file)
        writer = PyPDF2.PdfFileWriter()

        for _ in range(times):
            for page_num in range(reader.numPages):
                page = reader.getPage(page_num)
                writer.addPage(page)

        with open(output_pdf, 'wb') as output_file:
            writer.write(output_file)

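Despite several attempts to multiply the content, I only get a single copy of it in the output PDF. Any guidance on what I might be doing wrong would be very helpful.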

Answer 1

I'm not sure what you were doing wrong, but when I modified your code to match the latest PyPDF2 API, it worked fine:

import PyPDF2

def multiply_pdf_content(input_pdf, output_pdf, times):
    with open(input_pdf, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        writer = PyPDF2.PdfWriter()

        for _ in range(times):
            for page in reader.pages:
                writer.add_page(page)

        with open(output_pdf, 'wb') as output_file:
            writer.write(output_file)

multiply_pdf_content('Video.pdf','xxx.pdf',5)
