I am trying to use numpy.savez to save several arrays so that I can load them faster later.
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state = 100, test_size = 0.2,
    stratify = iris.target
)
scaler = StandardScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
np.savez('data/iris_train_scaled', X = X_train_scaled, y = y_train)
np.savez('data/iris_test_scaled', X = X_test_scaled, y = y_test)
The code gives me the following error:
  File "c:\Users\Montrast\Desktop\c++\.vscode\1.py", line 32, in <module>
    np.savez('data/iris_train_scaled', X = X_train_scaled, y = y_train)
  File "C:\Users\Montrast\AppData\Local\Programs\Python\Python311\Lib\site-packages\numpy\lib\npyio.py", line 639, in savez
    _savez(file, args, kwds, False)
  File "C:\Users\Montrast\AppData\Local\Programs\Python\Python311\Lib\site-packages\numpy\lib\npyio.py", line 736, in _savez
    zipf = zipfile_factory(file, mode="w", compression=compression)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Montrast\AppData\Local\Programs\Python\Python311\Lib\site-packages\numpy\lib\npyio.py", line 103, in zipfile_factory
    return zipfile.ZipFile(file, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Montrast\AppData\Local\Programs\Python\Python311\Lib\zipfile.py", line 1283, in __init__
    self.fp = io.open(file, filemode)
              ^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'data/iris_train_scaled.npz'
As I understand it, I am creating a file named 'data/iris_train_scaled', so why do I get a FileNotFoundError?
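The FileNotFoundError means that the directory 'data' does not exist. np.savez appends the .npz extension and tries to open data/iris_train_scaled.npz for writing, but it does not create missing parent directories, so the open call fails. Note that a relative path like this one is resolved against the current working directory, which is not necessarily the folder that contains your script.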
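To see this behaviour in isolation, here is a minimal sketch that reproduces the error inside a temporary directory and then shows the save succeeding once the parent folder exists. The helper try_save and the file name demo are made up for illustration and are not part of your script.

```python
import tempfile
from pathlib import Path

import numpy as np


def try_save(target):
    """Try np.savez at `target`; return 'ok' or the name of the error raised."""
    try:
        np.savez(str(target), X=np.arange(3))
    except FileNotFoundError:
        return "FileNotFoundError"
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "demo"   # parent folder "data" is missing
    before = try_save(target)              # fails: savez does not create folders
    target.parent.mkdir()                  # create the missing folder
    after = try_save(target)               # now the save goes through
    created = (target.parent / "demo.npz").exists()  # .npz was appended

print(before, after, created)
```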
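Before saving a file into a directory, you need to make sure the directory exists. You can create it with Python's os module or with a Path object from the pathlib module.
Try this: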
import numpy as np
import os
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Create the 'data' directory if it doesn't exist
# (exist_ok=True makes this a no-op when the directory is already there)
os.makedirs('data', exist_ok=True)
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=100, test_size=0.2,
    stratify=iris.target
)
scaler = StandardScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
np.savez('data/iris_train_scaled', X=X_train_scaled, y=y_train)
np.savez('data/iris_test_scaled', X=X_test_scaled, y=y_test)
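If you prefer pathlib, the same idea could look roughly like the sketch below. The helper save_split and the small stand-in arrays are hypothetical and only there to keep the example self-contained; in your script you would pass X_train_scaled and y_train instead. It writes into a temporary directory so that it leaves nothing behind, and reads the file back with np.load to check the round trip.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_split(out_dir, name, X, y):
    """Create out_dir if needed, then save X and y to out_dir/<name>.npz."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    target = out_dir / f"{name}.npz"
    np.savez(target, X=X, y=y)
    return target


# Stand-in arrays (assumption); replace with X_train_scaled and y_train.
X_demo = np.arange(8, dtype=float).reshape(4, 2)
y_demo = np.array([0, 1, 0, 1])

with tempfile.TemporaryDirectory() as tmp:
    saved = save_split(Path(tmp) / "data", "iris_train_scaled", X_demo, y_demo)
    # np.load on an .npz file returns an archive indexed by the keyword names
    with np.load(saved) as archive:
        X_loaded = archive["X"]
        y_loaded = archive["y"]
```

In your own code you would call something like save_split('data', 'iris_train_scaled', X_train_scaled, y_train). If the script may be started from a different working directory, building the output path from Path(__file__).parent instead of a bare 'data' is one way to keep the location predictable.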