How do I scan each character in an array of multi-character strings in Python?


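I'm working on my first Python project, for a device that reads a string from OCR and outputs it in braille. The braille device can only output 6 letters at a time. I'm trying to scan each character in an array of 6-character-long strings.

For simplicity, for now I just want to print "this is (insert character)" for each character in the array of strings. In practice, the output would run code that tells the first two motors to form the character in braille, then does the same for the remaining 5 characters with the other 10 motors, with a short delay between each 6-character string. How do I scan each 6-character string and loop over the rest of the strings in the array?

This is where I am so far: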

from PIL import Image
import pytesseract


img = Image.open('img file path')
text = pytesseract.image_to_string(img, lang='eng', config='--psm 6').split('\n')
oneLineStr = ' '.join(text)
# displays: The quick brown fox jumps over the lazy dog.
print(oneLineStr)

arr6elem = []
for idx in range(0, len(oneLineStr), 6):
    arr6elem.append(oneLineStr[idx:idx + 6])
# displays: ['The qu', 'ick br', 'own fo', 'x jump', 's over', ' the l', 'azy do', 'g.']
print(arr6elem)

# Don't know what to do from this point
# Want to scan each 6-element string in list and for each string, see which elements it consists of
# (capital/lower case characters, numbers, spaces, commas, apostrophes, periods, etc.)
# Then, print "this is a" for letter a, or "this is a colon" for :, etc.
# So that output looks like:
# ["'this is T', 'this is h', 'this is e', 'this is a space', 'this is q', 'this is u'", "'this is i', 'this is c'...]
python arrays string text split
1 Answer

A dictionary should do the trick:

punctuation = {
    ' ': 'a space',
    ',': 'a comma',
    "'": 'an apostrophe',
    '.': 'a period'
}

for word in arr6elem:
    for char in word:
        print('This is {}'.format(punctuation.get(char, char)))

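Once you have built the punctuation dictionary with all the entries you need, the loop looks up the corresponding value for each character, or falls back to the character itself when there is no entry.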

Output:
# This is T
# This is h
# This is e
# This is a space
# This is q
# This is u
# This is i
# This is c
# This is k
# This is a space
# This is b
# This is r
# This is o
# This is w
# This is n
# This is a space
# This is f
# ...
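
If you want the nested list from the question rather than printed lines, the same lookup can be wrapped in a couple of small functions. This is only a sketch: the names chunk_text, describe_chunk and describe_all are made up for illustration, the ':' entry is an extra example, and the delay parameter is an assumption standing in for the pause between 6-character groups that the question mentions. The actual motor control is not shown.

```python
import time

# Spoken names for characters that should not be printed literally.
# Extend this with whatever else the OCR can produce.
PUNCTUATION = {
    ' ': 'a space',
    ',': 'a comma',
    "'": 'an apostrophe',
    '.': 'a period',
    ':': 'a colon',
}


def chunk_text(text, size=6):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def describe_chunk(chunk):
    """Return one 'this is ...' string per character in the chunk."""
    return ['this is {}'.format(PUNCTUATION.get(ch, ch)) for ch in chunk]


def describe_all(text, size=6, delay=0.0):
    """Describe every chunk in order, pausing `delay` seconds after each one.

    The pause is a placeholder (assumption) for the time the device
    needs before the next group of characters is shown.
    """
    results = []
    for chunk in chunk_text(text, size):
        results.append(describe_chunk(chunk))
        time.sleep(delay)
    return results


if __name__ == '__main__':
    for group in describe_all('The quick brown fox'):
        print(group)
```

The last chunk can be shorter than 6 characters ('g.' in the question's example), so whatever drives the motors would need to decide what to do with the unused cells.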