Handling special characters (decoding) in a Popen generator

Question

Context

I have a generator that continuously yields each line of output from a given command (see the snippet below, taken from the code here).

import subprocess

def execute(cmd):
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True, universal_newlines=True)

    # Yield each stdout line as soon as it is produced, without its trailing newline.
    for stdoutLine in iter(popen.stdout.readline, ""):
        yield stdoutLine.rstrip('\r\n')

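Problem

The problem is that a line of standard output can contain special characters that cp1252 cannot handle. (See the error messages below, each from a different test run.)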

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 6210: character maps to <undefined>
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 3691: character maps to <undefined>
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 6228: character maps to <undefined>
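For reference, the same error can be reproduced without a subprocess. The sketch below assumes the failing decoder is cp1252, as stated above; the sample bytes are made up for illustration. Byte 0x8d has no mapping in Python's cp1252 codec, so a strict decode raises, while `errors="replace"` substitutes U+FFFD.

```python
# Made-up sample: three ASCII bytes followed by 0x8d, which cp1252 leaves undefined.
raw = b"caf\x8d"

message = None
try:
    raw.decode("cp1252")  # strict decoding is the default
except UnicodeDecodeError as exc:
    message = str(exc)    # mentions the 'charmap' codec and byte 0x8d

# A lenient decode keeps the readable part and marks the bad byte with U+FFFD.
replaced = raw.decode("cp1252", errors="replace")
```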

Question

How can I handle these special characters?

Answer 1

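The solution is quite simple: do not decode the standard output unless you need to.

My solution was to add a parameter to the execute function that determines whether the generator yields decoded strings or untouched bytes.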

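import subprocess

def execute(cmd, decode=False):
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True, universal_newlines=decode)

    # The end-of-stream sentinel must match the stream type: "" for text, b"" for bytes.
    sentinel = "" if decode else b""
    for stdoutLine in iter(popen.stdout.readline, sentinel):
        if decode:
            yield stdoutLine.rstrip('\r\n')
        else:
            yield stdoutLine.rstrip(b'\r\n')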

So when I know that the command I am executing returns only ASCII characters and I need a decoded string, I pass decode=True.
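If decoded text is needed even when the output may contain undecodable bytes, one possible alternative (not part of the answer above) is to pass an explicit `encoding` and a lenient `errors` handler to `Popen`, both of which it accepts since Python 3.6. The following is a minimal sketch under that assumption; the name `execute_text`, its `shell` parameter, and the child command in the demo are hypothetical and chosen only for illustration.

```python
import subprocess
import sys


def execute_text(cmd, encoding="utf-8", errors="replace", shell=True):
    """Yield decoded stdout lines; with errors="replace", bad bytes become U+FFFD."""
    popen = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        shell=shell,
        encoding=encoding,  # explicit, instead of the locale default
        errors=errors,      # "replace" substitutes U+FFFD rather than raising
    )
    with popen.stdout:
        for line in popen.stdout:
            yield line.rstrip("\r\n")
    popen.wait()


if __name__ == "__main__":
    # Hypothetical child command that emits a byte (0x8d) undefined in cp1252.
    child = "import sys; sys.stdout.buffer.write(b'ok\\x8d\\n')"
    for line in execute_text([sys.executable, "-c", child], encoding="cp1252", shell=False):
        print(ascii(line))
```

The trade-off is that replaced bytes are lost, so yielding raw bytes as in the answer remains the safer choice when the exact output matters.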
