How can I change the contrast of an image using Wand?


I have the image below, which I use with Tesseract OCR:

enter image description here

My code for processing the image is:

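import io

import pytesseract
from bs4 import BeautifulSoup
from PIL import Image
from wand.image import Image as wi  # assumed alias; the original import is not shown


def gerarHocr(imageBlob):
    # Decode the PNG blob with Pillow and run Tesseract in hOCR mode
    image = Image.open(io.BytesIO(imageBlob))
    markup = pytesseract.image_to_pdf_or_hocr(image, lang='por', extension='hocr', config='--psm 6')
    soup = BeautifulSoup(markup, features='html.parser')

    # hOCR marks each recognized word with class "ocrx_word"
    spans = soup.find_all('span', {'class': 'ocrx_word'})

    listHoras = []
    ...
    return listHoras


# HOCR
# `image` is the Wand image of the page, opened earlier (not shown)
with image[450:6200, 840:3550] as cropped:
    imgPage = wi(image=cropped)
    imageBlob = imgPage.make_blob('png')
    horas = gerarHocr(imageBlob)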

However, my OCR sometimes gets confused and reads a 3 as 83, returning, for example, 07:44/14:183 instead of 07:44/14:13.

I think that if I remove the gray lines using Wand, I can improve the OCR's confidence. How can I do that?

Thanks,

python ocr tesseract python-tesseract wand
1 Answer

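I would use cv2 and/or numpy.array.

Convert light gray to white (>= so that a pixel of exactly 128 is not left gray):

img[img >= 128] = 255

Convert dark gray to black:

img[img < 128] = 0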
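The two assignments above can be tried on a small synthetic array before running them on a real scan. This is only a minimal sketch: the `binarize` helper, the `patch` values, and the choice of `>=` (so that a pixel exactly at the threshold also becomes white) are illustrative assumptions, not part of the original answer.

```python
import numpy as np


def binarize(img: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Return a copy where pixels >= threshold become 255 and all others 0.

    Hypothetical helper for illustration; it wraps the two masked assignments.
    """
    out = img.copy()
    light = img >= threshold   # boolean mask of "light" pixels
    out[light] = 255           # light gray -> white
    out[~light] = 0            # dark gray -> black
    return out


# Synthetic 3x4 grayscale patch (made-up values):
# 30 = dark text, 200 = light gray ruling line, 255 = white paper
patch = np.array(
    [[255, 30, 30, 255],
     [200, 200, 200, 200],
     [255, 30, 255, 255]],
    dtype=np.uint8,
)

cleaned = binarize(patch)
print(cleaned)
```

In this toy patch the middle row (the gray line) turns fully white while the dark pixels become pure black. On a real scan the right threshold depends on how dark the lines are compared with the text, so 128 is a starting point rather than a fixed rule.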

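import cv2

folder = '/home/user/images/'

# read it
img = cv2.imread(folder + 'old_img.png')
if img is None:  # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError(folder + 'old_img.png')

# convert to grayscale
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# reduce colors to pure white and pure black
img[img >= 128] = 255  # light gray -> white (>= so that 128 itself is covered)
img[img < 128] = 0     # dark gray -> black

# save it
cv2.imwrite(folder + 'new_img.png', img)

# display result
#cv2.imshow('window', img)
#cv2.waitKey(0) # press any key in window to close it
#cv2.destroyAllWindows()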

Result:

enter image description here
