How to merge a dataset back together after splitting it and save it as a CSV
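import numpy as np
from sklearn.model_selection import train_test_split

# df is the existing DataFrame; features are every column except 'label',
# and the target is taken from the last column.
X = df.drop(['label'], axis=1).values
y = np.ravel(df.iloc[:, -1].values)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, test_size=0.2, random_state=0, stratify=y)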
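I want to merge the splits back into a single dataset and save it as a CSV file, without changing the feature names, the column order, or anything else. How can I do that?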
np.concatenate or pd.concat can help here, for example like this (note that this does not preserve the original row order):
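# Re-attach the labels as the last column (assumes 'label' is the last column of df).
train_full = np.hstack((X_train, y_train.reshape(-1, 1)))
test_full = np.hstack((X_test, y_test.reshape(-1, 1)))
# Stack train on top of test and restore the original column names.
df_full = pd.DataFrame(np.concatenate((train_full, test_full), axis=0), columns=df.columns)
# index=False keeps the row index from being written as an extra column.
df_full.to_csv('data.csv', index=False)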
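
If the original row order matters as well, one option is to split the DataFrame itself instead of the NumPy arrays, so each part keeps its original index and column names. The following is only a minimal sketch under assumptions: the toy DataFrame and its column names f1 and f2 are made up for illustration, and 'label' is assumed to be the last column, as in the question.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the real df; 'label' is the last column.
df = pd.DataFrame({
    "f1": range(10),
    "f2": [x * 0.5 for x in range(10)],
    "label": [0, 1] * 5,
})

# Splitting the DataFrame directly keeps column names, dtypes and the original index.
train_df, test_df = train_test_split(
    df, train_size=0.8, random_state=0, stratify=df["label"]
)

# The arrays needed for modelling can still be taken from each part.
X_train = train_df.drop(columns="label").values
y_train = train_df["label"].values

# Concatenate the parts and sort by the original index to restore the row order.
df_full = pd.concat([train_df, test_df]).sort_index()

# index=False avoids writing the index as an extra first column.
df_full.to_csv("data.csv", index=False)
```

Because the parts are never converted to a single NumPy array, mixed column dtypes are also left as they were, which np.hstack would not guarantee.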