Best way to handle images in headers/footers when replacing paragraph text? python-docx

Question

I have a tool that runs find/replace over a large number of documents. Here is the code:

        # Find the index of the find_text (case-insensitive)
        index = paragraph.text.lower().find(find.entryWord.lower())

        # Calculate the end of the substring to be replaced
        end_index = index + len(find.entryWord)

        # Create the modified paragraph
        modified_paragraph = (
            paragraph.text[:index]
            + paragraph.text[index:end_index]
            .lower()
            .replace(find.entryWord.lower(), replace.entryWord)
            + paragraph.text[end_index:]
        )

        paragraph.text = modified_paragraph

        font = style.font
        
        font.name = replace.font or 'Arial'
        font.size = Pt(int(replace.fontSize) if replace.fontSize else 10)
        font.bold = replace.bold or False
        font.italic = replace.italic or False
        
        if replace.color:
            font.color.rgb = RGBColor(replace.color.get('r'), replace.color.get('g'), replace.color.get('b'))

        paragraph.style = style

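This works well, except that when the paragraph I'm working on contains an image, the image is removed. That happens at the line where I assign the new text:

paragraph.text = modified_paragraph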

So my question is: is there a better way to handle images that sit in the same paragraph as the text I want to change?

Note: I'm aware of runs in python-docx, but they are extremely inconsistent in how they split up words, so I'd rather avoid them if I can.

Answer 1

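There's no easy answer to this one, and every option involves working at the run level and below. A few things are helpful to understand, though:

- Images are contained in `w:drawing` elements, and those appear as children of a run.
- You can iterate the "inner content" of a run to discover `Drawing` objects, which are proxies for `w:drawing` elements.
- When you assign to `Paragraph.text`, all existing run elements are removed from the paragraph. They are not destroyed immediately, however. They remain in memory until the last reference to them goes out of scope, at which point they are garbage-collected.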
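As a rough illustration of the first two points, here is a minimal sketch that finds the runs carrying an image by querying the underlying XML directly. The helper names `has_drawing` and `runs_with_drawings` are made up for this example and are not part of python-docx; the sketch assumes the image is stored as a `w:drawing` element, so legacy `w:pict` images would not be detected.

```python
def has_drawing(run):
    """Return True if the run's <w:r> element contains a <w:drawing> element."""
    # run._r is the lxml element behind the Run proxy. python-docx elements
    # resolve the "w:" namespace prefix themselves in xpath() calls.
    return bool(run._r.xpath(".//w:drawing"))


def runs_with_drawings(paragraph):
    """Return the runs of `paragraph` that carry at least one image (hypothetical helper)."""
    return [run for run in paragraph.runs if has_drawing(run)]
```

Calling `runs_with_drawings(paragraph)` before touching `paragraph.text` gives you references to the image-bearing runs, which is what keeps them alive after the text is replaced.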

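So one possible approach would be:

- detect the runs that contain a drawing
- replace the paragraph text
- re-add the runs that contain a drawing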

This will require some steps at the XML level, such as `paragraph._p.append(run._r)`, which may look familiar if you've seen that sort of thing in other python-docx questions and answers.
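The approach described above could be sketched roughly as follows. The function name `replace_text_keep_drawings` is hypothetical, and the sketch makes two simplifying assumptions: each image-bearing run contains only the image (if it also holds text, that old text comes back with it), and it is acceptable for the images to end up after the new text rather than in their original position.

```python
def replace_text_keep_drawings(paragraph, new_text):
    """Replace the paragraph text, then re-attach runs that contain images.

    Hypothetical helper, not part of python-docx. Returns the re-attached runs.
    """
    # 1. Detect runs containing a drawing and hold on to them, so they
    #    survive being removed from the paragraph in step 2.
    image_runs = [run for run in paragraph.runs if run._r.xpath(".//w:drawing")]

    # 2. Replace the paragraph text. This removes every existing run and
    #    leaves a single new run holding `new_text`.
    paragraph.text = new_text

    # 3. Re-add the image runs at the XML level, after the new text run.
    for run in image_runs:
        paragraph._p.append(run._r)

    return image_runs
```

In the question's code this would stand in for the `paragraph.text = modified_paragraph` line, for example `replace_text_keep_drawings(paragraph, modified_paragraph)`. Header and footer paragraphs can be passed in the same way as body paragraphs.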
