OpenCV: detecting handwritten marks in checkboxes on a questionnaire

Question  Votes: 3  Answers: 1

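I am processing a large number of patient intake questionnaires. Below is a scanned example. I need to process the forms and store the results in a database, but I am having trouble detecting the handwritten marks: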

Patient intake questionnaire (scanned image): https://image.soinside.com/eyJ1cmwiOiAiaHR0cHM6Ly9pLnN0YWNrLmltZ3VyLmNvbS8xa0N4RC5qcGcifQ==

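The questionnaire contains several kinds of marks. Some checkboxes are filled in solid black. Others contain a tick or a cross. All of these mean the checkbox is selected. I need to use OpenCV (cv2) to identify which boxes are selected.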

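I have already tried optical character recognition, but the results were not really helpful. The marks come in too many shapes, so OCR recognizes them as different characters. I need to find out which boxes on the questionnaire are selected. cv2 can probably solve this, but I don't know how.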

# Expected input: An image of Questionnaire

# Expected output:
Have you seen other health care providers for your problems of dizziness 
and/or imbalance? [selected] Yes [unselected] No

Have you been through a program of Vestibular and Balance Rehabilitation 
Therapy? [selected] Yes [unselected] No

=============================
[unselected] vertigo
[unselected] falling
...
[selected] Drunk-like

=============================
[selected] Vertigo
[selected] Falling
[selected] Fatigue
[selected] Wooziness
[selected] Spinning
[unselected] Disconnected

My earlier attempt with the Python tesseract OCR package (pytesseract):

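from PIL import Image
import pytesseract

path = "page1.jpg"
img = Image.open(path)
# --psm 6 treats the page as a single uniform block of text;
# preserve_interword_spaces=1 keeps the column spacing in the output.
text = pytesseract.image_to_string(img, lang='eng', config='-c preserve_interword_spaces=1 --psm 6')
print(text)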

O Vertigo           O Falling              O Fatigue                 W Vertigo          YA Falling             y[ Fatigue
[ Wooziness     O Spinning         O Disconnected       A \Wooziness     Q Spinning         [ Disconnected
O Imbalance      B Drunk-like        O Swirling             O Imbalance      O Drunk-like       @ Swirling      ;
O Faint            [ Rocking        O Can’tfocus         M Faint           4 Rocking          O Can’t focus
O Lightheaded O Swaying -~ . -0 Unsteady       O Lightheaded O Swaying       N Unsteady
O “onaboat” O Swimming sensation                      Weonaboat” @ Swimming sensation
O Other:                                                        0 Other:

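My idea: if OCR recognizes the rectangular checkbox as the letter 'O' or the digit '0', the box should be treated as unselected. Otherwise it should be treated as selected. With that rule I could detect the handwritten marks from the OCR output. I will test it on some samples and check the accuracy, although I am not sure it will work. If it does, I will report back on this post later.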
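The rule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a verified solution: the names `UNSELECTED_GLYPHS` and `parse_ocr_line` are hypothetical, and the sketch assumes that items on a line are separated by at least two spaces (as in most of the OCR output above) and that the first token of each item is the glyph OCR produced for the checkbox.

```python
import re

# Glyphs that OCR tends to produce for an empty box, per the rule above.
UNSELECTED_GLYPHS = {"O", "0"}


def parse_ocr_line(line):
    """Split one OCR line into (label, is_selected) pairs."""
    results = []
    # Items are assumed to be separated by runs of two or more spaces.
    for cell in re.split(r"\s{2,}", line.strip()):
        parts = cell.split(None, 1)
        if len(parts) != 2:
            continue  # no glyph/label pair in this cell
        glyph, label = parts
        results.append((label, glyph not in UNSELECTED_GLYPHS))
    return results
```

Applied to the first OCR line above, this marks the three left-hand items as unselected and the three right-hand items (`W`, `YA`, `y[`) as selected. Lines where OCR collapsed the spacing, such as `O Lightheaded O Swaying`, would be merged into one item, and an empty box misread as `[` would be counted as selected, so the accuracy would still need to be checked on real samples.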

python opencv image-processing computer-vision omr
1 Answer
0 votes

I am wondering whether you succeeded in detecting the checkboxes with pytesseract. I have a similar case where I need to recognize checkboxes.
