How to print only if a character is a letter?

Question

This is a follow-up to my earlier post here:

This is the code I'm using:

import sys
import re

pattern = re.compile("^[a-z]+$")  # matches purely alphabetic words
starting_vowels = re.compile("(^[aeiouAEIOU])")  # matches starting vowels
ending_vowels = re.compile("[aeiouAEIOU]$")  # matches ending vowels
starting_vowel_match = 0
ending_vowel_match = 0

for line in sys.stdin:
    line = line.strip()  # removes leading and trailing whitespace
    words = line.lower().split()  # splits the line into words and converts to lowercase
    for word in words:
        if len(word) == 1:
            print(word[0], 1, *((1, 1) if word[0] in 'aeiou' else (0, 0))) # * unpacks startVowel 1 endVowel 1 if word[0] is a vowel
        else:
            print(word[0], 1, 1 if word[0] in 'aeiou' else 0, 0) 
            print(*(f'{letter} 1 0 0' for letter in word[1: -1]), sep='\n')
            print(word[-1], 1, 0, 1 if word[-1] in 'aeiou' else 0)

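I want it to print only when the character is a letter. For a text file containing the string "It's a beautiful life", the sample output I want is:

What I'm seeing right now is:

I'd like to know how to get rid of the special characters in the output. I've tried a few things, including adding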

        for letter in word:
            if pattern.match(letter):

inside the

for letter in word

block, but it did not produce the output I wanted.

Answer 1

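I'm not sure why the original code imports re, since it is never actually used.

When analysing words longer than one letter, every character in the [1:-1] slice needs to be considered individually.

Something like this: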

import sys
from string import ascii_lowercase as LOWER

VOWELS = set('aeiouAEIOU')

def isvowel(c):
    # 1 if c is a vowel, otherwise 0
    return int(c in VOWELS)

for line in sys.stdin:
    for word in line.strip().lower().split():
        if len(word) == 1:
            # a one-letter word is both its own first and last letter
            print(word, 1, isvowel(word), isvowel(word))
        else:
            print(word[0], 1, isvowel(word[0]), 0)
            for letter in word[1:-1]:
                # skip anything that is not a-z, such as an apostrophe
                if letter in LOWER:
                    print(f'{letter} 1 0 0')
            print(word[-1], '1 0', isvowel(word[-1]))

Output:
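The loop above only filters the middle of each word, so a word such as "life." would still print its trailing period. A minimal sketch of one way around that, assuming a hypothetical helper named letter_rows (it is not part of the answer's code): drop the non-letters first, then apply the same first/middle/last logic to what remains.

```python
from string import ascii_lowercase

VOWELS = set("aeiou")


def letter_rows(word):
    """Return 'letter count startVowel endVowel' rows for one word.

    Hypothetical helper for illustration; only a-z characters are kept.
    """
    letters = [c for c in word.lower() if c in ascii_lowercase]
    rows = []
    last = len(letters) - 1
    for i, c in enumerate(letters):
        start = int(i == 0 and c in VOWELS)    # flag only on the first letter
        end = int(i == last and c in VOWELS)   # flag only on the last letter
        rows.append(f"{c} 1 {start} {end}")
    return rows


for row in letter_rows("It's"):
    print(row)
```

For "It's" this prints the rows i 1 1 0, t 1 0 0 and s 1 0 0, with the apostrophe dropped. A one-letter word such as "a" is treated as both first and last letter and gives a 1 1 1.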

Answer 2

So you want to split a string into words, and then split each word into letters. For each letter you want to print:

[letter] [starting_vowel_match] [letter_vowel_match] [ending_vowel_match]

Here is how I would solve it:

import re

test = "It's a beautiful life"

for line in test.split("\n"):
    line = line.strip()  # removes leading and trailing whitespace
    words = line.lower().split()  # splits the line into words and converts to lowercase
    for word in words:
        # strip everything that is not a letter before looping over the word
        for letter in re.sub(r'[^a-zA-Z]', '', word):
            print(
                letter,
                1 if word[0] in 'aeiou' else 0,   # word starts with a vowel
                1 if letter in 'aeiou' else 0,    # this letter is a vowel
                1 if word[-1] in 'aeiou' else 0)  # word ends with a vowel

The result looks different from your sample output, but I would expect the first line to include starting_vowel_match!
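Two common ways to decide whether a character counts as a letter are a regular-expression character class and str.isalpha(). They differ on non-ASCII input: str.isalpha() accepts any Unicode letter, while a class like [a-zA-Z] accepts only unaccented English letters. A small sketch comparing the two (the sample string and both helper names are made up for illustration):

```python
import re

NON_LETTERS = re.compile(r"[^a-zA-Z]")


def ascii_letters_only(word):
    # Remove everything except unaccented English letters.
    return NON_LETTERS.sub("", word)


def unicode_letters_only(word):
    # Keep any character that Unicode classifies as a letter.
    return "".join(c for c in word if c.isalpha())


sample = "caf\u00e9's2"  # "café's2" with a precomposed é
print(ascii_letters_only(sample))    # cafs
print(unicode_letters_only(sample))  # cafés
```

Which one fits depends on whether accented letters should appear in the output; for plain English text like the example sentence, both give the same result.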
