Pytesseract does not keep temp files on macOS


When running

    import cv2
    import numpy as np
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = '/usr/local/bin/tesseract'
    img = cv2.imread('some.png')

    h, w, c = img.shape
    boxes = pytesseract.image_to_boxes(img)

I get the following stack trace:

File "/Users/thomaskilian/Documents/pytess.py", line 9, in <module>
  boxes = pytesseract.image_to_boxes(img)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pytesseract/pytesseract.py", line 491, in image_to_boxes
  }[output_type]()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pytesseract/pytesseract.py", line 490, in <lambda>
  Output.STRING: lambda: run_and_get_output(*args),
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pytesseract/pytesseract.py", line 290, in run_and_get_output
  with open(filename, 'rb') as output_file:

builtins.FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/m3/26h8sdk11p7731577hpllh900000gn/T/tess_wayjir39.box'

I traced it in pytesseract.py to

    run_tesseract(**kwargs)
    filename = f"{kwargs['output_filename_base']}{extsep}{extension}"
    with open(filename, 'rb') as output_file:

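After the run_tesseract call, the output is located in the temp var folder. But right at the open call, the file is gone. It looks like the temporary file is a bit too temporary. Is there any way around this?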

macos python-tesseract
1 Answer

It looks like the problem may be that Pytesseract's temporary file is removed before open() can access it.
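One way to test that hypothesis is to run Tesseract yourself into a directory you own and look at what it actually wrote. The following is a minimal sketch and not part of Pytesseract: box_command and diagnose are hypothetical helper names, and the batch.nochop makebox arguments are assumed to be the usual command-line way to request a .box file.

```python
import os
import subprocess
import tempfile


def box_command(tesseract_cmd, image_path, output_base):
    """Build the command line that asks Tesseract for a .box file."""
    return [tesseract_cmd, image_path, output_base, 'batch.nochop', 'makebox']


def diagnose(tesseract_cmd, image_path):
    """Run the command in a directory we control and report what was written."""
    with tempfile.TemporaryDirectory() as tmpdir:
        output_base = os.path.join(tmpdir, 'probe')
        proc = subprocess.run(
            box_command(tesseract_cmd, image_path, output_base),
            capture_output=True,  # requires Python 3.7+
            text=True,
        )
        # Inspect the directory before the with block removes it.
        return {
            'returncode': proc.returncode,
            'stderr': proc.stderr.strip(),
            'files': sorted(os.listdir(tmpdir)),
            'box_exists': os.path.exists(output_base + '.box'),
        }
```

Calling diagnose('/usr/local/bin/tesseract', 'some.png') with the paths from the question would show whether a probe.box file appears. If it does not, the file was never written in the first place, and the stderr text is the place to look.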

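One thing to try is to choose the output path yourself instead of letting Pytesseract generate one. Note that output_filename_base is not a parameter of image_to_boxes(). As the excerpt in the question shows, it is a keyword argument of the lower-level run_tesseract() function, so the sketch below calls that function directly. It assumes that run_tesseract() accepts the keyword arguments used here and that the batch.nochop makebox config produces the .box file, as it does on the Tesseract command line:

    import os
    import tempfile

    import cv2
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = '/usr/local/bin/tesseract'
    img = cv2.imread('some.png')

    h, w, c = img.shape
    with tempfile.TemporaryDirectory() as tmpdir:
        # Tesseract appends '.box' to this base name itself.
        out_base = os.path.join(tmpdir, 'out')
        # run_tesseract takes a file path, not an image array.
        pytesseract.pytesseract.run_tesseract(
            input_filename='some.png', output_filename_base=out_base,
            extension='box', lang=None, config='batch.nochop makebox')
        with open(out_base + '.box', 'rb') as output_file:
            boxes = output_file.read()
    print(boxes.decode())

TemporaryDirectory from the tempfile module creates a directory that you control. The with statement removes it, together with the .box file, only after the block exits, so the file is read before any cleanup happens. If open() still raises FileNotFoundError here, Tesseract never wrote the file, which points at the Tesseract call rather than at temp-file cleanup.

read() returns a bytes object. Calling decode() on it gives the box data as text.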
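Once the box data is available as text, it usually needs to be parsed before it can be used with the image height taken from img.shape. The sketch below assumes the standard Tesseract box format, one line per symbol in the form "char left bottom right top page" with the origin at the bottom-left corner. Box, parse_boxes and to_top_left are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_boxes(text: str) -> List[Box]:
    """Parse box-file text into Box records, skipping malformed lines."""
    boxes = []
    for line in text.splitlines():
        # Split from the right so the five numeric fields are always last.
        parts = line.rsplit(' ', 5)
        if len(parts) != 6:
            continue
        char, *numbers = parts
        boxes.append(Box(char, *map(int, numbers)))
    return boxes


def to_top_left(box: Box, image_height: int) -> Tuple[int, int, int, int]:
    """Convert to (x1, y1, x2, y2) with a top-left origin, as OpenCV uses."""
    return (box.left, image_height - box.top,
            box.right, image_height - box.bottom)
```

With h from the question's code, to_top_left(box, h) gives corner coordinates that could be passed to a drawing call such as cv2.rectangle.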
