I want to run the sample code from the README of https://github.com/deepdoctection/deepdoctection in Colab:
import deepdoctection as dd
from IPython.core.display import HTML
from matplotlib import pyplot as plt
analyzer = dd.get_dd_analyzer() # instantiate the built-in analyzer similar to the Hugging Face space demo
df = analyzer.analyze(path = "/path/to/your/doc.pdf") # setting up pipeline
df.reset_state() # Trigger some initialization
doc = iter(df)
page = next(doc)
image = page.viz()
plt.figure(figsize = (25,17))
plt.axis('off')
plt.imshow(image)
But I get:
/usr/local/lib/python3.10/dist-packages/deepdoctection/utils/pdf_utils.py in _input_to_cli_str(input_file_name, output_file_name, dpi, size)
160 command = "pdftocairo"
161 else:
--> 162 raise PopplerNotFound("Poppler not found. Please install or add to your PATH.")
163
164 if platform.system() == "Windows":
PopplerNotFound: Poppler not found. Please install or add to your PATH.
I have already tried the options from this question and a few others, but they did not change anything.
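Happy to answer your question! :D
The error you are seeing means that the deepdoctection library needs Poppler (a PDF rendering tool) to be installed on your system, but it cannot find it. To fix this in a Google Colab environment, install Poppler and make sure it is on your PATH: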
!apt-get install -y poppler-utils
import os
os.environ['PATH'] += ":/usr/bin/"
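import os
# Install Poppler (do this before importing deepdoctection)
!apt-get install -y poppler-utils
# Add Poppler to PATH (apt places the binaries in /usr/bin, which is usually on PATH already)
os.environ['PATH'] += ":/usr/bin/"
import deepdoctection as dd
from IPython.core.display import HTML
from matplotlib import pyplot as plt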
analyzer = dd.get_dd_analyzer() # instantiate the built-in analyzer similar to the Hugging Face space demo
# Use a sample PDF file for testing (replace with your actual path)
pdf_path = "/content/sample.pdf"
df = analyzer.analyze(path=pdf_path) # setting up pipeline
df.reset_state() # Trigger some initialization
doc = iter(df)
page = next(doc)
image = page.viz()
plt.figure(figsize=(25, 17))
plt.axis('off')
plt.imshow(image)
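This code installs Poppler, adds it to PATH, and then runs the sample code. If you still run into problems, make sure the PDF file path is correct and that the file is accessible from the Colab environment.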
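If the error persists, a quick check like the one below may help narrow it down. It is only a sketch: the helper name `diagnose` is made up for illustration, and it assumes that deepdoctection looks for either `pdftoppm` or `pdftocairo` on PATH (the traceback only shows `pdftocairo` explicitly; both ship with poppler-utils).

```python
import os
import shutil


def diagnose(pdf_path, tools=("pdftoppm", "pdftocairo")):
    """Return a list of detected problems; an empty list means none were found.

    `diagnose` is a hypothetical helper, not part of deepdoctection.
    """
    problems = []
    # shutil.which searches the directories listed in PATH, like a shell would.
    if not any(shutil.which(tool) for tool in tools):
        problems.append("no Poppler tool found on PATH: " + ", ".join(tools))
    # The PDF must exist inside the Colab runtime (for example, uploaded to /content).
    if not os.path.isfile(pdf_path):
        problems.append(f"PDF not found: {pdf_path}")
    return problems


for problem in diagnose("/content/sample.pdf"):
    print(problem)
```

If nothing is printed, both checks passed, and restarting the runtime before re-running the analyzer may be worth a try in case an earlier failed lookup is still in effect.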