How do I print a character only if it is a letter?

Question (votes: 0, answers: 2)

This follows on from my earlier post here:

How to return 1 if the last letter of a word is a vowel, otherwise return 0?

This is the code I am using:

import sys
import re

pattern = re.compile("^[a-z]+$")  # matches purely alphabetic words
starting_vowels = re.compile("(^[aeiouAEIOU])")  # matches starting vowels
ending_vowels = re.compile("[aeiouAEIOU]$")  # matches ending vowels
starting_vowel_match = 0
ending_vowel_match = 0

for line in sys.stdin:
    line = line.strip()  # removes leading and trailing whitespace
    words = line.lower().split()  # splits the line into words and converts to lowercase
    for word in words:
        if len(word) == 1:
            print(word[0], 1, *((1, 1) if word[0] in 'aeiou' else (0, 0))) # * unpacks startVowel 1 endVowel 1 if word[0] is a vowel
        else:
            print(word[0], 1, 1 if word[0] in 'aeiou' else 0, 0) 
            print(*(f'{letter} 1 0 0' for letter in word[1: -1]), sep='\n')
            print(word[-1], 1, 0, 1 if word[-1] in 'aeiou' else 0)

I want it to print only when the character is a letter. For a text file containing the string "It's a beautiful life", the output I want is:

i 1 1 0
t 1 0 0
s 1 0 0
a 1 1 1
b 1 0 0
e 1 0 0
a 1 0 0
u 1 0 0
t 1 0 0
i 1 0 0
f 1 0 0
u 1 0 0
l 1 0 0
l 1 0 0
i 1 0 0
f 1 0 0
e 1 0 1

What I am seeing now is:

i 1 1 0
' 1 0 0
t 1 0 0
s 1 0 0
a 1 1 1
b 1 0 0
e 1 0 0
a 1 0 0
u 1 0 0
t 1 0 0
i 1 0 0
f 1 0 0
u 1 0 0
l 1 0 0
l 1 0 0
i 1 0 0
f 1 0 0
e 1 0 1

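I would like to know how to remove the special characters from the output. I have tried a few things, including adding

        for letter in word:
            if pattern.match(letter):

to the "for letter in word" block, but it did not give me the output I want.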

python regex mapreduce
2 Answers
0
votes

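I am not sure why the original code imports re, since the compiled patterns are never used.

For words longer than one letter, every character in the [1:-1] slice has to be checked individually.

Something like this: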

import sys
from string import ascii_lowercase as LOWER

VOWELS = set('aeiouAEIOU')

def isvowel(c):
    # 1 if c is a vowel, else 0
    return int(c in VOWELS)

for line in sys.stdin:
    for word in line.strip().lower().split():
        if len(word) == 1:
            # a single letter is both the first and the last character
            v = isvowel(word)
            print(word, 1, v, v)
        else:
            print(word[0], 1, isvowel(word[0]), 0)
            for letter in word[1:-1]:
                if letter in LOWER:  # skip apostrophes, digits and other non-letters
                    print(f'{letter} 1 0 0')
            print(word[-1], '1 0', isvowel(word[-1]))

Output:

i 1 1 0
t 1 0 0
s 1 0 0
a 1 1 1
b 1 0 0
e 1 0 0
a 1 0 0
u 1 0 0
t 1 0 0
i 1 0 0
f 1 0 0
u 1 0 0
l 1 0 0
l 1 0 0
i 1 0 0
f 1 0 0
e 1 0 1
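The snippet above only filters the middle of each word, so a word that starts or ends with punctuation (for example "life!") would still print that character. One possible variation, sketched here under the assumption that only ASCII a-z count as letters, is to drop the non-letters first and then apply the first/last logic to what remains. The helper name letter_rows is hypothetical and not part of the original answer.

```python
from string import ascii_lowercase

VOWELS = set("aeiou")
LETTERS = set(ascii_lowercase)  # assumption: only ASCII a-z count as letters


def letter_rows(word):
    """Return 'letter 1 start_vowel end_vowel' rows for one word (hypothetical helper)."""
    # Drop everything that is not a-z before looking at positions.
    letters = [c for c in word.lower() if c in LETTERS]
    last = len(letters) - 1
    rows = []
    for i, c in enumerate(letters):
        start = int(i == 0 and c in VOWELS)   # 1 only for a vowel in first position
        end = int(i == last and c in VOWELS)  # 1 only for a vowel in last position
        rows.append(f"{c} 1 {start} {end}")
    return rows


if __name__ == "__main__":
    for word in "It's a beautiful life".split():
        for row in letter_rows(word):
            print(row)
```

Because the filtering happens before the positional checks, a word made up entirely of punctuation yields no rows at all. If non-ASCII letters should count too, str.isalpha() could replace the LETTERS membership test.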

-1
votes

So you want to split a string into words, and then split each word into letters. For each letter you want to print:

[letter] [starting_vowel_match] [letter_vowel_match] [ending_vowel_match]

This is how I would solve it:

import re

test = "It's a beautiful life"

for line in test.split("\n"):
    line = line.strip()  # removes leading and trailing whitespace
    words = line.lower().split()  # splits the line into words and converts to lowercase
    for word in words:
        for letter in re.sub(r'[^a-zA-Z0-9]', '', word):
            print(
                letter, 
                1 if word[0] in 'aeiou' else 0,
                1 if letter in 'aeiou' else 0, 
                1 if word[-1] in 'aeiou' else 0)

The result looks different from your sample output, but I would expect the first column to contain starting_vowel_match!

i 1 1 0
t 1 0 0
s 1 0 0
a 1 1 1
b 0 0 0
e 0 1 0
a 0 1 0
u 0 1 0
t 0 0 0
i 0 1 0
f 0 0 0
u 0 1 0
l 0 0 0
l 0 0 1
i 0 1 1
f 0 0 1
e 0 1 1
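Note that the character class [^a-zA-Z0-9] used in the code above keeps digits as well as letters. If only letters should survive, as the question asks, a narrower class may be closer to the intent. The following is a minimal sketch comparing the two; the names ALNUM_ONLY, LETTERS_ONLY and strip_with are made up for this illustration.

```python
import re

# Class from the answer above: removes punctuation but keeps digits.
ALNUM_ONLY = re.compile(r"[^a-zA-Z0-9]")
# Assumed stricter variant for input that is lowercased first: keeps only a-z.
LETTERS_ONLY = re.compile(r"[^a-z]")


def strip_with(pattern, word):
    """Lowercase the word and delete every character the pattern matches."""
    return pattern.sub("", word.lower())


if __name__ == "__main__":
    for sample in ("It's", "b2b's"):
        print(sample, strip_with(ALNUM_ONLY, sample), strip_with(LETTERS_ONLY, sample))
```

For ordinary words such as "It's" both patterns give the same result; they differ only when a word contains digits.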