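We are facing an issue where a Python script run via subprocess in an OpenShift environment does not capture print output after a warning message is displayed, whereas it works as expected locally.
Problem description:
We have a script (snippet.py) that uses openpyxl (version 3.1.2 in both the local and the OpenShift environment) to read an Excel file and print specific information.
The script is executed as a subprocess from another Python script (app.py).
Locally, the script runs perfectly and prints both the warning messages from openpyxl and the expected output.
In our OpenShift deployment, however, only the warnings are printed, and none of the print statements after the warnings are captured.
snippet.py: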
from openpyxl import load_workbook

# Load the workbook and select the 'Sheet C' sheet
wb = load_workbook(r'some_excel_file.xlsx')
sheet = wb['Sheet C']

# Find the 'A/C' task and its duration
for row in sheet.iter_rows(values_only=True):
    if row and 'A/C' in row:
        task_index = row.index('A/C')
        duration_index = task_index + 1
        # Assuming 'Duration' is next to 'Task'
        duration = row[duration_index]
        print(f'The duration for the A/C task is: {duration} days')
        break
    else:
        print('The A/C task was not found or the Duration column is missing.')
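If the "not found" message should be printed once rather than once per non-matching row, one option is to move the search into a function and report the result afterwards. The sketch below works on plain tuples instead of a worksheet so that it runs without an Excel file; find_duration and the sample rows are made up for illustration and are not part of the original script.

```python
def find_duration(rows, task='A/C'):
    """Return the cell to the right of `task`, or None if it is absent."""
    for row in rows:
        if row and task in row:
            duration_index = row.index(task) + 1
            # Guard against the task sitting in the last column
            if duration_index < len(row):
                return row[duration_index]
            return None
    return None


# Hypothetical rows, shaped like sheet.iter_rows(values_only=True) output
sample_rows = [
    ('Task', 'Duration'),
    ('Heating', 3),
    ('A/C', 5),
]

duration = find_duration(sample_rows)
if duration is None:
    print('The A/C task was not found or the Duration column is missing.')
else:
    print(f'The duration for the A/C task is: {duration} days')
```

With a real workbook, the rows argument would be sheet.iter_rows(values_only=True).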
app.py:
import subprocess

try:
    result = subprocess.run(
        ["python3", "snippet.py"],
        stdout=subprocess.PIPE,
        # To pipe errors/warnings into STDOUT as well
        stderr=subprocess.STDOUT,
        text=True,
        check=False
    )
except subprocess.CalledProcessError as e:
    print("ERROR")
print("stdout: " + result.stdout)
Local:
stdout:
C:\Users\User\Documents\project_x\.venv\Lib\site-packages\openpyxl\worksheet\_reader.py:329: UserWarning: Data Validation extension is not supported and will be removed
warn(msg)
C:\Users\User\Documents\project_x\.venv\Lib\site-packages\openpyxl\worksheet\_reader.py:329: UserWarning: Conditional Formatting extension is not supported and will be removed
warn(msg)
The A/C task was not found or the Duration column is missing.
[... 40 times the same line]
The duration for the A/C task is: =SheetB!Y44 days
OpenShift:
stdout:
C:\Users\User\Documents\project_x\.venv\Lib\site-packages\openpyxl\worksheet\_reader.py:329: UserWarning: Data Validation extension is not supported and will be removed
warn(msg)
C:\Users\User\Documents\project_x\.venv\Lib\site-packages\openpyxl\worksheet\_reader.py:329: UserWarning: Conditional Formatting extension is not supported and will be removed
warn(msg)
Already tried: PYTHONUNBUFFERED=1 (environment), flush=True (print), and bufsize=0 (subprocess).

Spec | Local | OpenShift |
---|---|---|
Python | 3.11.2 | 3.11.5 |
openpyxl | 3.1.2 | 3.1.2 |
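
For reference, the environment and print settings listed above could be applied along these lines. This is only a sketch: the inline child stands in for snippet.py (CHILD_CODE is made up for illustration), and sys.executable is used instead of python3 so that the child runs under the same interpreter as the parent. bufsize is left at its default here, since it only affects the pipe objects on the parent side, not how the child buffers its own output.

```python
import os
import subprocess
import sys

# Stand-in for snippet.py: one warning on stderr, one flushed print on stdout
CHILD_CODE = (
    "import warnings\n"
    "warnings.warn('demo warning')\n"
    "print('after the warning', flush=True)\n"
)

# Copy the parent environment and disable output buffering in the child
env = dict(os.environ, PYTHONUNBUFFERED="1")

result = subprocess.run(
    [sys.executable, "-c", CHILD_CODE],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # merge warnings into stdout
    text=True,
    env=env,
    check=False,
)
print("stdout: " + result.stdout)
```

If the line after the warning is still missing with these settings in place, buffering is unlikely to be the cause.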
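Make sure the working directory is set correctly. Use the script below for debugging; I have also included logging that can help with further debugging.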
import subprocess

try:
    result = subprocess.run(
        ["python3", "snippet.py"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
        cwd='/path/to/script/directory'  # Set the correct path
    )
except subprocess.CalledProcessError as e:
    print("ERROR")
print("stdout: " + result.stdout)
# stderr is merged into stdout above, so result.stderr is None here
print(f"stderr: {result.stderr}")
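
It may also help to print the child's exit status, because the captured output alone cannot show whether the child exited normally. On POSIX, subprocess reports a negative return code -N when the child was terminated by signal N. A rough sketch, where describe_exit is a made-up helper and the inline child only stands in for snippet.py:

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable note."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # POSIX only: -N means the child was terminated by signal N
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {-returncode} ({name})"
    return f"exited with error code {returncode}"


# Stand-in child: prints one line, then exits with status 3
result = subprocess.run(
    [sys.executable, "-c",
     "print('partial output', flush=True); raise SystemExit(3)"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
    check=False,
)
print("stdout: " + result.stdout)
print("exit status: " + describe_exit(result.returncode))
```

A non-zero or negative status together with truncated output would point at the child process itself rather than at how the output is captured.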
You can also add logging.
from openpyxl import load_workbook
import logging
logging.basicConfig(filename='snippet.log', level=logging.DEBUG)
# Load the workbook and select the 'Sheet C' sheet
logging.info("Loading workbook...")
wb = load_workbook(r'some_excel_file.xlsx')
logging.info("Workbook loaded.")
# Rest of your code...
Then just check snippet.log to see which step was reached.
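After further investigation, we were able to narrow the root cause down to openpyxl's load_workbook function rather than subprocess, as originally assumed. The problem occurs only when read_only is set to False. This may be because the Linux system running on our OpenShift deployment lacks some components that appear to be involved in processing/editing Excel files, especially when they contain formulas. Locally we do not run into the problem, because the machine runs Windows and also has MS Office components such as Excel installed.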
Causes the problem on OpenShift:
from openpyxl import load_workbook
wb = load_workbook(r'some_excel_file.xlsx', read_only=False)
No problem on OpenShift:
from openpyxl import load_workbook
wb = load_workbook(r'some_excel_file.xlsx', read_only=True)
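Other questions discuss the possibility of editing Excel files with Python on a Linux machine, for example this one: On Linux, using Python to edit an Excel file that contains formulas and then reading the resulting values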