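I want to capture, in real time, the score and timer of a game I play full-screen on my PC. The code below works, but only on separate crops: it recognizes the score "0" or the timer "1:13" when each is cropped on its own, but not "0 1:13 0" from the full crop, which is what I actually want. From the full image the code produces "45", which is odd (perhaps the line graphics in the image cause the misrecognition?). I want the result to be "0 1:13 0". Help!
How can I remove the graphics from the full image so that only the text remains? I think that is why it does not return "0 1:13 0", or is there another reason?
Code: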
import cv2
import numpy as np
import pytesseract
from PIL import ImageGrab
from PIL import Image, ImageOps
while True:
    cap1 = ImageGrab.grab(bbox=(740, 260, 1070, 300)).convert('L')
    cap = ImageOps.invert(cap1)
    text = pytesseract.image_to_string(cap, lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789:')
    text = text.strip()
    if len(text) > 0:
        print(text)
    if cv2.waitKey(1) == 27:
        break
cv2.destroyAllWindows()
Updated code (please look at it below and help with the error I get):
    img = cv2.imread(cap)
          ^^^^^^^^^^^^^^^
TypeError: Can't convert object to 'str' for 'filename'
#!/usr/bin/env python3.8
import cv2
import numpy as np
import pytesseract
from PIL import ImageGrab
while True:
    cap = ImageGrab.grab(bbox=(740, 260, 1070, 300))
    # Read Image and Crop Borders
    img = cv2.imread(cap)
    # Gray Color
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    binary = gray.copy()
    # Thresholding
    th1 = 20
    th2 = 200
    binary[binary < th1] = 0
    binary[binary > th2] = 0
    binary[binary >= th1] = 255
    # Resize as original image is small
    scale_ = 5  # of original size
    width = int(img.shape[1] * scale_)
    height = int(img.shape[0] * scale_)
    dim = (width, height)
    # resize image
    resized = cv2.resize(binary, dim, interpolation=cv2.INTER_CUBIC)
    # Filtering
    filtered = cv2.medianBlur(resized, 17)
    # Morphology
    # res = cv2.erode(resized, None)
    # IMG Dimensions
    h, w = filtered.shape[:2]
    # Cropping
    s1 = filtered[:, :w//3]
    s2 = filtered[:, 2*w//3:]
    t = filtered[:, w//3:2*w//3]
    for i in [s1, t, s2]:
        # OCR
        print(pytesseract.image_to_string(i, lang='eng', config='--psm 7'))
    # Visualization
    disp = cv2.cvtColor(filtered, cv2.COLOR_GRAY2BGR)
    disp[:, w//3] = (0, 0, 255)
    disp[:, 2*w//3] = (0, 0, 255)
    cv2.namedWindow("output", cv2.WINDOW_NORMAL)
    cv2.imshow("output", disp)
    if cv2.waitKey(1) == 27:
        break
cv2.destroyAllWindows()
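A note on the TypeError in the updated code: cv2.imread expects a file path string, while ImageGrab.grab returns a PIL image object, which is why that call fails. One possible fix, sketched here under the assumption that the grabbed image is in RGB or RGBA mode, is to convert the PIL image to a NumPy array and reorder the channels to the BGR layout OpenCV uses. The helper name pil_to_bgr is made up for illustration.

```python
import numpy as np

def pil_to_bgr(pil_img):
    """Convert a PIL-style RGB(A) image to a BGR array for OpenCV.

    Assumes the input has three or four channels (RGB or RGBA).
    """
    arr = np.asarray(pil_img)   # shape: (height, width, channels)
    rgb = arr[:, :, :3]         # drop an alpha channel if one is present
    # Reverse the channel order (RGB -> BGR) and make the memory contiguous,
    # since some OpenCV functions expect contiguous arrays.
    return np.ascontiguousarray(rgb[:, :, ::-1])

# In the loop from the question, this would replace the imread call:
#   cap = ImageGrab.grab(bbox=(740, 260, 1070, 300))
#   img = pil_to_bgr(cap)
```

The rest of the loop (cvtColor with COLOR_BGR2GRAY and so on) could then stay as it is, since img would be an ordinary BGR array.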
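You can apply classic preprocessing before OCR, as is done here, plus median filtering to remove salt-and-pepper noise, and then split the image into thirds so that each part is recognized separately:
Output: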
0
1:13
0
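If a single line such as "0 1:13 0" is wanted, as in the question, the three per-region strings can be cleaned and joined. The sketch below is only an illustration: join_readings is a hypothetical helper, and the sample strings imitate raw image_to_string results, which typically carry trailing newline or form-feed characters.

```python
def join_readings(readings):
    """Strip whitespace from each OCR string and join the non-empty ones."""
    cleaned = [r.strip() for r in readings]
    return " ".join(c for c in cleaned if c)

# Made-up sample values standing in for the three OCR results.
raw = ["0\n\x0c", "1:13\n\x0c", "0\n\x0c"]
line = join_readings(raw)
print(line)
```

Empty readings are skipped, so a region where nothing was recognized does not leave a stray space in the joined line.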
#!/usr/bin/env python3.8
import cv2
import numpy as np
import pytesseract
im_path = "./"
im_name = "2.jpg"
# Read image
img = cv2.imread(im_path + im_name)
# Gray Color
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
binary = gray.copy()
# Thresholding: keep gray values in [th1, th2] as white, everything else black
th1 = 20
th2 = 200
binary[binary < th1] = 0
binary[binary > th2] = 0
binary[binary >= th1] = 255
# Resize as original image is small
scale_ = 5  # upscale factor relative to the original size
width = int(img.shape[1] * scale_)
height = int(img.shape[0] * scale_)
dim = (width, height)
# resize image
resized = cv2.resize(binary, dim, interpolation=cv2.INTER_CUBIC)
# Filtering: median blur removes salt-and-pepper noise
filtered = cv2.medianBlur(resized, 17)
# Morphology
# res = cv2.erode(resized, None)
# IMG Dimensions
h, w = filtered.shape[:2]
# Cropping: left score, timer in the middle, right score
s1 = filtered[:, :w//3]
s2 = filtered[:, 2*w//3:]
t = filtered[:, w//3:2*w//3]
for i in [s1, t, s2]:
    # OCR (--psm 7 treats each crop as a single text line)
    print(pytesseract.image_to_string(i, lang='eng', config='--psm 7'))
# Visualization: draw red lines at the two split positions
disp = cv2.cvtColor(filtered, cv2.COLOR_GRAY2BGR)
disp[:, w//3] = (0, 0, 255)
disp[:, 2*w//3] = (0, 0, 255)
cv2.namedWindow("output", cv2.WINDOW_NORMAL)
cv2.imshow("output", disp)
cv2.waitKey(0)
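The three in-place assignments in the thresholding step amount to a band threshold: pixels whose gray value lies between th1 and th2 (inclusive) become white (255) and everything else becomes black (0). An equivalent formulation, given here as a sketch with a hypothetical band_threshold helper and a tiny made-up sample array, uses a single boolean mask:

```python
import numpy as np

def band_threshold(gray, low=20, high=200):
    """Return a uint8 image that is 255 where low <= gray <= high, else 0."""
    mask = (gray >= low) & (gray <= high)
    return np.where(mask, 255, 0).astype(np.uint8)

# Values chosen to sit on both sides of each threshold.
sample = np.array([[0, 19, 20], [120, 200, 201]], dtype=np.uint8)
print(band_threshold(sample))
```

The defaults mirror th1 = 20 and th2 = 200 from the code above; they were picked for that particular screenshot and would likely need tuning for other games or lighting in the HUD.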