Hello, I'm a Python beginner.
I am trying to use `pd.read_excel` to read some Excel files (.xlsx) from a folder.
```python
import numpy as np
import pandas as pd
import os
import torch.nn as nn
import matplotlib.pyplot as plt
def get_data(data_path):
    print(f"Reading data from {data_path}")
    data = pd.read_excel(data_path, engine='openpyxl')
    num_rows, num_columns = data.shape
    print("Original data:")
    print(data.head(100))  # print the first rows of the DataFrame (up to 100)
    print(f"Number of rows:{num_rows}")
    print(f"Number of columns:{num_columns}")
    print("Number of rows in data:", len(data))
    return data

data_path = r'E:\SVN_Pengfei\5.2020年工作\BaiduNetdiskWorkspace\1.ATM项目\4.技术文档\高速数据资料\北五环数据\雷达数据'
predict_values_name = ' Speed Value'
data=get_data(data_path)
```
But it always shows this error:
```
Reading data from E:\SVN_Pengfei\5.2020年工作\BaiduNetdiskWorkspace\1.ATM项目\4.技术文档\高速数据资料\北五环数据\雷达数据
Traceback (most recent call last):
  File "e:/SVN_Pengfei/5.2020年工作/BaiduNetdiskWorkspace/1.ATM项目/2.相关软件/模型、代码及说明文档/LSTM预测流量/test1.py", line 19, in <module>
    data=get_data(data_path)
  File "e:/SVN_Pengfei/5.2020年工作/BaiduNetdiskWorkspace/1.ATM项目/2.相关软件/模型、代码及说明文档/LSTM预测流量/test1.py", line 8, in get_data
    data = pd.read_excel(data_path, engine='openpyxl')
  File "D:\Python37\lib\site-packages\pandas\util\_decorators.py", line 296, in wrapper
    return func(*args, **kwargs)
  File "D:\Python37\lib\site-packages\pandas\io\excel\_base.py", line 304, in read_excel
    io = ExcelFile(io, engine=engine)
  File "D:\Python37\lib\site-packages\pandas\io\excel\_base.py", line 867, in __init__
    self._reader = self._engines[engine](self._io)
  File "D:\Python37\lib\site-packages\pandas\io\excel\_openpyxl.py", line 480, in __init__
    super().__init__(filepath_or_buffer)
  File "D:\Python37\lib\site-packages\pandas\io\excel\_base.py", line 353, in __init__
    self.book = self.load_workbook(filepath_or_buffer)
  File "D:\Python37\lib\site-packages\pandas\io\excel\_openpyxl.py", line 492, in load_workbook
    filepath_or_buffer, read_only=True, data_only=True, keep_links=False
  File "D:\Python37\lib\site-packages\openpyxl\reader\excel.py", line 345, in load_workbook
    data_only, keep_links, rich_text)
  File "D:\Python37\lib\site-packages\openpyxl\reader\excel.py", line 123, in __init__
    self.archive = _validate_archive(fn)
  File "D:\Python37\lib\site-packages\openpyxl\reader\excel.py", line 93, in _validate_archive
    raise InvalidFileException(msg)
openpyxl.utils.exceptions.InvalidFileException: openpyxl does not support  file format, please check you can open it with Excel first. Supported formats are: .xlsx,.xlsm,.xltx,.xltm
```
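If I change data_path to a specific file, for example `data_path = r'E:\SVN_鹏飞.2020年工作\百度网盘工作空间.ATM项目.技术文档\高速数据资料\北五环数据\雷达数据bcde.xlsx'`, it works fine.
But I don't want to do that: I want it to read the folder, and I will have another module that handles the multiple files.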
pandas version: 1.1.5; Python version: 3.7.0
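
The error message is consistent with `data_path` naming a folder rather than a workbook: a folder path has no file extension, so it cannot match any of the supported formats listed in the exception. A quick way to check what a path points to is sketched below. The helper name `describe_path` is hypothetical, and the suffix set is simply copied from the error message above.

```python
from pathlib import Path

# Extensions listed as supported in the InvalidFileException message.
SUPPORTED_SUFFIXES = {".xlsx", ".xlsm", ".xltx", ".xltm"}

def describe_path(path):
    """Classify `path` as a directory, an Excel workbook, or something else.

    Hypothetical helper for illustration; it only inspects the path and
    does not open the file.
    """
    p = Path(path)
    if p.is_dir():
        return "directory"
    if p.suffix.lower() in SUPPORTED_SUFFIXES:
        return "excel file"
    return "other"
```

Calling `describe_path(data_path)` before `pd.read_excel` would show whether the value passed in is a folder or a single workbook.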
Remove `engine='openpyxl'`:

`data = pd.read_excel(data_path)`
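
Note that `pd.read_excel` reads one workbook per call, so a folder has to be expanded into individual file paths first. The following is a minimal sketch of one way to do that, not a definitive implementation. The function names `list_excel_files` and `read_folder` are hypothetical, and the sketch assumes the workbooks sit directly in the folder (no subfolders) and that files starting with `~$` are Excel's temporary lock files and should be skipped.

```python
from pathlib import Path

import pandas as pd

def list_excel_files(folder):
    """Return the .xlsx files directly inside `folder`, sorted by name."""
    folder = Path(folder)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a folder")
    # Assumption: names starting with "~$" are Excel lock files, not data.
    return sorted(
        p for p in folder.glob("*.xlsx") if not p.name.startswith("~$")
    )

def read_folder(folder):
    """Read every workbook in `folder` into a dict keyed by file name."""
    return {
        p.name: pd.read_excel(p, engine="openpyxl")
        for p in list_excel_files(folder)
    }
```

With this, `frames = read_folder(data_path)` gives one DataFrame per file, which a separate module can then process; `pd.concat(frames.values(), ignore_index=True)` would stack them if all files share the same columns.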