Converting a PDF to images using Python

Question (votes: 1, answers: 2)

I am converting PDF files to image files on an Ubuntu server. For this I have installed:

  1. python2.7
  2. poppler-utils
  3. pdf2image == 1.12.1

My code:

from pdf2image import convert_from_path, convert_from_bytes

images = convert_from_path("/home/user/pdf_file.pdf")

# OR

with open("/home/user/pdf_file.pdf", "rb") as pdf:  # binary mode: convert_from_bytes expects raw bytes
    images = convert_from_bytes(pdf.read())

Output

When I use the function "convert_from_path":

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

When I use the function "convert_from_bytes":

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 268, in convert_from_bytes
    paths_only=paths_only,
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

I have reinstalled all of these utilities, but I still get the same errors.

python image pdf typeerror converters
2 Answers
1
vote

If you want to convert a PDF to images, you can try the Python Ghostscript package:

pip install ghostscript

import ghostscript
import locale

def pdf2jpeg(pdf_input_path, jpeg_output_path):
    args = ["pdf2jpeg",  # program name placeholder; actual value doesn't matter
            "-dNOPAUSE",
            "-sDEVICE=jpeg",
            "-r144",
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]

    encoding = locale.getpreferredencoding()
    args = [a.encode(encoding) for a in args]

    ghostscript.Ghostscript(*args)

pdf2jpeg(
    "...Fixate/ActiveState/pdf/a.pdf",
    "...Fixate/ActiveState/pdf/a.jpeg",
)

0
votes

This also failed for me on Python 2, but it works on Python 3.

The same problem has been reported for another library: TypeError: 'threadsafe_iter' object is not an iterator

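As explained there, this is a Python 2 vs 3 issue caused by the next() function: in Python 2, next() looks for a method named next() on the object, while in Python 3 it looks for __next__(). If you rename __next__() to next() in the file /home/***/.local/lib/python2.7/site-packages/pdf2image/generators.py, it runs successfully on py2.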

By the way, I have opened a new issue with the pdf2image team: TypeError: ThreadSafeGenerator object is not an iterator #133


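Addendum: the pdf2image README says it is a Python (3.5+) module, so it is best to use it on py3. If you must use py2, try the __next__() -> next() change (but I am not sure whether it is safe).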
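
As a rough illustration of the Python 2 vs 3 difference described in this answer, here is a minimal sketch of a thread-safe generator wrapper that defines both method names, so that next() works on either interpreter. The class name CompatThreadSafeGenerator is hypothetical and this is not pdf2image's actual source; it only shows the general pattern under that assumption.

```python
import threading


class CompatThreadSafeGenerator(object):
    """Hypothetical wrapper (not pdf2image's real code) that serializes
    access to an underlying iterator and works on Python 2 and 3."""

    def __init__(self, gen):
        self.gen = gen
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3's built-in next() calls this method.
        with self.lock:
            return next(self.gen)

    # Python 2's built-in next() looks for a method named "next",
    # so expose the same implementation under that name as well.
    next = __next__


names = CompatThreadSafeGenerator(iter(["page-1", "page-2"]))
first = next(names)   # goes through __next__ on py3, next on py2
second = names.next() # the alias can also be called directly
```

Defining the alias rather than renaming the method keeps the class usable on Python 3, which is the version the README targets.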
