Tesseract.js OCR: how to correctly set the page segmentation mode (PSM, pageseg) to detect a single number in an image

Question

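I have been using Tesseract to read various numbers (up to 99,999.9) in the following format:

Example of an image where OCR fails:

It reads the number correctly about 80% of the time, but I need 95% accuracy.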

async function runOCR(url) {
    const worker = await Tesseract.createWorker('eng', 1, {
        tessedit_pageseg_mode: 13,
        config: '--psm 13'
    });

    (async () => {
        await worker.load();
        await worker.loadLanguage('eng');
        await worker.initialize('eng');    
        
        await worker.setParameters({
            tessedit_ocr_engine_mode: Tesseract.OEM_TESSERACT_ONLY,
            tessedit_char_whitelist: '0123456789,.',
            preserve_interword_spaces: '0',
            SINGLE_WORD: true,
            tessedit_pageseg_mode: Tesseract.SINGLE_WORD,
        });
        const {
            data: { text },
        } = await worker.recognize(url);
        doSomething(text);
        await worker.terminate();
    })();
}

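The main problem is that I don't know where to set the page segmentation mode (PSM, pageseg). The examples I have found are either outdated or written for other languages.

Here is the list of pageseg options I found in the C header file (https://github.com/tesseract-ocr/tesseract/blob/4.0.0/src/ccstruct/publictypes.h#L163):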

  PSM_OSD_ONLY,       ///< Orientation and script detection only.
  PSM_AUTO_OSD,       ///< Automatic page segmentation with orientation and
                      ///< script detection. (OSD)
  PSM_AUTO_ONLY,      ///< Automatic page segmentation, but no OSD, or OCR.
  PSM_AUTO,           ///< Fully automatic page segmentation, but no OSD.
  PSM_SINGLE_COLUMN,  ///< Assume a single column of text of variable sizes.
  PSM_SINGLE_BLOCK_VERT_TEXT,  ///< Assume a single uniform block of vertically
                               ///< aligned text.
  PSM_SINGLE_BLOCK,   ///< Assume a single uniform block of text. (Default.)
  PSM_SINGLE_LINE,    ///< Treat the image as a single text line.
  PSM_SINGLE_WORD,    ///< Treat the image as a single word.
  PSM_CIRCLE_WORD,    ///< Treat the image as a single word in a circle.
  PSM_SINGLE_CHAR,    ///< Treat the image as a single character.
  PSM_SPARSE_TEXT,    ///< Find as much text as possible in no particular order.
  PSM_SPARSE_TEXT_OSD,  ///< Sparse text with orientation and script det.
  PSM_RAW_LINE,       ///< Treat the image as a single text line, bypassing
                      ///< hacks that are Tesseract-specific.
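For reference, the enumerators in this list carry no explicit values, so they are numbered from 0 in declaration order. Under that reading, mode 13 is PSM_RAW_LINE rather than PSM_SINGLE_WORD. A minimal sketch of the mapping, where `PSM_NAMES` and `PSM_VALUES` are names made up for illustration and not part of any library:

```javascript
// Enumerator names in the order they appear in the list above.
// PSM_NAMES and PSM_VALUES are illustrative names, not a library API.
const PSM_NAMES = [
  'PSM_OSD_ONLY', 'PSM_AUTO_OSD', 'PSM_AUTO_ONLY', 'PSM_AUTO',
  'PSM_SINGLE_COLUMN', 'PSM_SINGLE_BLOCK_VERT_TEXT', 'PSM_SINGLE_BLOCK',
  'PSM_SINGLE_LINE', 'PSM_SINGLE_WORD', 'PSM_CIRCLE_WORD', 'PSM_SINGLE_CHAR',
  'PSM_SPARSE_TEXT', 'PSM_SPARSE_TEXT_OSD', 'PSM_RAW_LINE',
];

// Enumerators without explicit values count up from 0,
// so each name's array index is its mode number.
const PSM_VALUES = Object.fromEntries(
  PSM_NAMES.map((name, index) => [name, index])
);

console.log(PSM_VALUES.PSM_SINGLE_LINE); // 7
console.log(PSM_VALUES.PSM_SINGLE_WORD); // 8
console.log(PSM_VALUES.PSM_RAW_LINE);    // 13
```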

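How can I detect the numbers in the image more reliably, or how do I correctly set the page segmentation mode/config? (The configuration changes I have made seem to have no effect on my hit rate.)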

Answer 1

I see tessedit_pageseg_mode: 13 in createWorker, and then tessedit_pageseg_mode: Tesseract.SINGLE_WORD in worker.setParameters. You only need to set this parameter (the page segmentation mode) once, before calling the recognize function.

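To detect a single number in an image (such as the one you provided), you should use PSM_SINGLE_LINE or PSM_SINGLE_WORD, which seem to be intended for exactly this kind of input. In Tesseract.js these constants are exposed as Tesseract.PSM.SINGLE_LINE and Tesseract.PSM.SINGLE_WORD.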

async function runOCR(url) {
    const worker = await Tesseract.createWorker({
        logger: m => console.log(m)
    });

    await worker.load();
    await worker.loadLanguage('eng');
    await worker.initialize('eng');

    // Set only the necessary parameters, once, before calling recognize.
    // Tesseract.js groups its constants under Tesseract.OEM and Tesseract.PSM.
    await worker.setParameters({
        tessedit_ocr_engine_mode: Tesseract.OEM.TESSERACT_ONLY,
        tessedit_char_whitelist: '0123456789.,',
        tessedit_pageseg_mode: Tesseract.PSM.SINGLE_LINE // or Tesseract.PSM.SINGLE_WORD if line mode does not work well
    });

    // Now recognize the number in the image
    const { data: { text } } = await worker.recognize(url);
    doSomething(text);

    await worker.terminate();
}
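
If single-line mode alone does not reach the accuracy you need, one option is to check each result against the expected number format and fall back to single-word mode when the check fails. The following is only a sketch under some assumptions: the format is taken to be "up to 99,999.9 with an optional thousands comma and at most one decimal digit", the helper names `parseReading` and `recognizeWithFallback` are made up for illustration, and the mode strings '7' and '8' are assumed to match Tesseract.PSM.SINGLE_LINE and Tesseract.PSM.SINGLE_WORD.

```javascript
// Assumed format: 0 to 99,999.9, optional thousands comma, at most one decimal digit.
const NUMBER_FORMAT = /^(?:\d{1,2},\d{3}|\d{1,5})(?:\.\d)?$/;

// Hypothetical helper: returns the parsed number, or null if the text
// does not match the assumed format.
function parseReading(text) {
  const cleaned = text.trim();
  if (!NUMBER_FORMAT.test(cleaned)) return null;
  return Number(cleaned.replace(/,/g, ''));
}

// Hypothetical helper: tries each page segmentation mode in turn and returns
// the first reading that matches the format, or null if none does.
// `worker` is an already initialized Tesseract.js worker; the default modes
// '7' and '8' are assumed to be single line and single word.
async function recognizeWithFallback(worker, image, modes = ['7', '8']) {
  for (const mode of modes) {
    await worker.setParameters({ tessedit_pageseg_mode: mode });
    const { data: { text } } = await worker.recognize(image);
    const value = parseReading(text);
    if (value !== null) return value;
  }
  return null;
}
```

Inside runOCR, the recognize call could then become `const value = await recognizeWithFallback(worker, url);`, with a null result treated as a failed read rather than passed on to doSomething.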
