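Could I ask for help with a file-handle problem in the Python package openpyxl, version 3.0.7? It does not happen when the load_workbook read_only parameter is set to False; it happens only when it is set to True. If you call load_workbook and close repeatedly on the same file, the problem eventually appears. I believe I have narrowed down the source code that opens the file handle. The problem is that the handle is never released. After opening and closing the same workbook several times, calling
shutil.move(source_file, target_file)
raises an exception. I will try to avoid this by opening and closing the file only once, but then I need to build a data structure to hold everything, because the workbook has 23 worksheets. Still, this looks like a bug. If I set read_only=False, performance is terrible: a run takes more than an hour.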
import shutil

import openpyxl  # openpyxl 3.0.7

# repeat open/close multiple times
wb_source = openpyxl.load_workbook(file_path, read_only=True)
ws_source = wb_source[worksheet_name]
for row in ws_source.rows:
    for cell in row:
        pass  # ...
wb_source.close()

shutil.move(file_path, file_path_archive)
Here is the exception:
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.7_3.7.2544.0_x64__qbz5n2kfra8p0\lib\shutil.py", line 566, in move
    os.rename(src, real_dst)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Python\\...file.xlsx' -> 'C:\\Python\\...file.xlsx'
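As a stopgap while the underlying cause is unclear, the move could be retried a few times. The sketch below is only an illustration: move_with_retry and its attempts and delay parameters are hypothetical names, and the gc.collect() call rests on the untested assumption that the leftover handle belongs to an object that has not been garbage-collected yet.

```python
import gc
import shutil
import time


def move_with_retry(src, dst, attempts=5, delay=0.5):
    """Call shutil.move, retrying a few times if a PermissionError is raised."""
    for attempt in range(1, attempts + 1):
        try:
            return shutil.move(src, dst)
        except PermissionError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            # Assumption: a collection pass may finalize objects that still hold the file.
            gc.collect()
            # Give the operating system a moment before trying again.
            time.sleep(delay)
```

If the handle really is leaked for the lifetime of the process, retrying will not help and the final PermissionError is re-raised unchanged.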
.\venv\Lib\site-packages\openpyxl\reader\excel.py
# Python stdlib imports
from zipfile import ZipFile, ZIP_DEFLATED, BadZipfile
from sys import exc_info
from io import BytesIO
import os.path
import warnings
# ...
if self.read_only:
    ws = ReadOnlyWorksheet(self.wb, sheet.name, rel.target, self.shared_strings)
    ws.sheet_state = sheet.state
    self.wb._sheets.append(ws)
    continue
else:
    fh = self.archive.open(rel.target)
    ws = self.wb.create_sheet(sheet.name)
    ws._rels = rels
    ws_parser = WorksheetReader(ws, fh, self.shared_strings, self.data_only)
    ws_parser.bind_all()
.\venv\Lib\site-packages\openpyxl\packaging\manifest.py
mimetypes = MimeTypes()
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.7_3.7.2544.0_x64__qbz5n2kfra8p0\Lib\mimetypes.py
def init(files=None):
    global suffix_map, types_map, encodings_map, common_types
    global inited, _db
    inited = True  # so that MimeTypes.__init__() doesn't call us again

    if files is None or _db is None:
        db = MimeTypes()
        if _winreg:
            db.read_windows_registry()

        if files is None:
            files = knownfiles
        else:
            files = knownfiles + list(files)
    else:
        db = _db

    for file in files:
        if os.path.isfile(file):
            db.read(file)  # <-------------------------------------- read file
    encodings_map = db.encodings_map
    suffix_map = db.suffix_map
    types_map = db.types_map[True]
    common_types = db.types_map[False]
    # Make the DB a global variable now that it is fully initialized
    _db = db

class MimeTypes:
    # ...

    def read(self, filename, strict=True):
        """
        Read a single mime.types-format file, specified by pathname.
        If strict is true, information will be added to
        list of standard types, else to the list of non-standard
        types.
        """
        with open(filename, encoding='utf-8') as fp:
            self.readfp(fp, strict)

    def readfp(self, fp, strict=True):
        """
        Read a single mime.types-format file.
        If strict is true, information will be added to
        list of standard types, else to the list of non-standard
        types.
        """
        while 1:
            line = fp.readline()
            if not line:
                break
            words = line.split()
            for i in range(len(words)):
                if words[i][0] == '#':
                    del words[i:]
                    break
            if not words:
                continue
            type, suffixes = words[0], words[1:]
            for suff in suffixes:
                self.add_type(type, '.' + suff, strict)
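Both versions 3.0.7 and 3.1.2 have this problem. I now open the file only once, read all the data, and close it at the end. Doing this just once does not eliminate the file-handle problem, but because I open the file only once, it is no longer a big issue. Performance also improved significantly when I did this.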
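The open-once approach can be sketched roughly as follows. This is an illustration under assumptions, not the exact code used here: read_all_sheets and its loader parameter are hypothetical names, and copying the file into a BytesIO buffer first is an extra step (an assumption, not something verified against this bug) meant to keep openpyxl from holding a handle on the file on disk. It costs memory equal to the size of the file.

```python
from io import BytesIO


def read_all_sheets(file_path, loader=None):
    """Open the workbook once and return {sheet title: list of row tuples}."""
    if loader is None:
        # Imported lazily; by default this is openpyxl's own load_workbook.
        from openpyxl import load_workbook as loader

    # Copy the file into memory; the on-disk handle is closed when the with-block ends.
    with open(file_path, "rb") as fh:
        buffer = BytesIO(fh.read())

    wb = loader(buffer, read_only=True)
    try:
        # Read every worksheet in a single pass, keeping only the cell values.
        return {
            ws.title: [tuple(row) for row in ws.iter_rows(values_only=True)]
            for ws in wb.worksheets
        }
    finally:
        wb.close()
```

With this, data = read_all_sheets(file_path) would run before shutil.move(file_path, file_path_archive), and all 23 worksheets would then be available as data[worksheet_name].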
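The reason the library has this problem is that an xlsx file is technically a zip file with multiple files inside it behind the scenes, so several files are opened from the archive.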
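The zip structure is easy to confirm with the standard library alone. A small sketch, where list_xlsx_parts is a hypothetical helper name:

```python
import zipfile


def list_xlsx_parts(file_path):
    """Return the names of the files stored inside an .xlsx archive."""
    # The with-block closes the archive, and its file handle, on exit.
    with zipfile.ZipFile(file_path) as archive:
        return archive.namelist()
```

For a typical workbook the result includes entries such as xl/workbook.xml and one XML file per worksheet under xl/worksheets/.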