How to preprocess black text on a cream-colored background for Tesseract using OpenCV?

Question

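I want to extract text from this image:

Specifically, the rows under "Kills". However, I can't seem to get accurate results.

I tried converting the image to grayscale and applying a threshold: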

import { createWorker, OEM, PSM } from "tesseract.js";
import cv from "@u4/opencv4nodejs";
import fs from "node:fs/promises";

const worker = await createWorker("eng", OEM.TESSERACT_LSTM_COMBINED);

await worker.setParameters({
  tessedit_char_whitelist: "0123456789",
  tessedit_pageseg_mode: PSM.SINGLE_BLOCK,
});

// imdecode expects an imread flag here, not a color-conversion code
const image = await cv.imdecodeAsync(
  await fs.readFile("input.png"),
  cv.IMREAD_GRAYSCALE
);

const threshHoldedImage =
  await image.thresholdAsync(
    150,
    255,
    cv.THRESH_BINARY
  );

// Encode the thresholded image back to PNG so tesseract.js can read it
const encodedImage = await cv.imencodeAsync(".png", threshHoldedImage);

const {
  data: { text: tierKillsText },
} = await worker.recognize(encodedImage, {
  rectangle: {
    top: 265,
    left: 552,
    width: 87,
    height: 138,
  },
});

console.log(tierKillsText);
// Received: 3228387
// Expected: 3328387

I also tried applying a Gaussian blur, without success:

const sigma = 0.75;
const blurred = threshHoldedImage.gaussianBlur(new cv.Size(0, 0), sigma);

Answer 1

Could you try this? It works for me.

import cv2
import numpy as np
from pytesseract import pytesseract

# Path to the input image; adjust this to your own file location
image_path = 'C:/Users/Chirantan.Gupta/Downloads/image.png'

# Read the image
image = cv2.imread(image_path)

# Make sure the image has been read properly
if image is None:
    raise ValueError(f"The image at path {image_path} could not be loaded. Please check the file path.")

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply Gaussian blur
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Use adaptive thresholding
thresh = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 11, 2)

# Dilate the text to make it more solid
kernel = np.ones((2, 2), np.uint8)
dilated = cv2.dilate(thresh, kernel, iterations=1)

# Run OCR on the result; define `config` for your use case first
# text = pytesseract.image_to_string(dilated, config=config)

# For demonstration purposes, save the processed image to inspect the preprocessing steps
cv2.imwrite('/mnt/data/processed_image.png', dilated)

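The question uses a fixed threshold of 150, while the code above uses adaptive thresholding. A third option that may be worth trying is Otsu's method, which derives a single global threshold from the image histogram. In OpenCV this is available by adding the `cv2.THRESH_OTSU` flag to `cv2.threshold`. The sketch below is only a plain-Python illustration of the idea so it can be run without any dependencies; the function name `otsu_threshold` and the sample pixel values are made up for the example, and whether this fixes the misread digit in the original image has not been tested.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold t that maximises between-class variance.

    Pixels with value <= t form the dark class, pixels > t the bright class.
    """
    # Build a 256-bin histogram of the grayscale values.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the dark class so far
    sum0 = 0  # sum of gray values in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # dark class is still empty
        w1 = total - w0
        if w1 == 0:
            break  # bright class is empty, no further split is possible
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = t
    return best_t


# Hypothetical sample: 200 dark "text" pixels on 800 cream "background" pixels.
pixels = [40] * 200 + [225] * 800
t = otsu_threshold(pixels)

# Binarise: text becomes 0 (black), background becomes 255 (white).
binary = [255 if p > t else 0 for p in pixels]
print("threshold:", t)
```

With real images the histogram is less cleanly separated than in this toy sample, so it is worth saving the binarised image and checking it by eye before passing it to Tesseract.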
