High val_loss and MSE


I have a very large wave energy dataset that I'm using to practice neural networks, but my MSE and val_loss are extremely high. I tried using a correlation matrix for feature selection, doing a two-step train/validation/test split, and using 3 hidden layers with regularization. The MSE and val_loss still come out very high, around 12 trillion. Here is the link to my data.

# Imports
import pandas as pd
from sklearn.preprocessing import scale, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tensorflow import keras
from tensorflow.keras import layers

# Load the data
perth_49 = pd.read_csv(r'WEC_Perth_49.csv')
sydney_49 = pd.read_csv(r'WEC_sydney_49.csv')
perth_100 = pd.read_csv(r'WEC_perth_100.csv')
sydney_100 = pd.read_csv(r'WEC_sydney_100.csv')

# Combine the dataframes into a single dataset
merged_data = pd.concat([perth_49, sydney_49, perth_100, sydney_100])

# Define the target variable "total power output"
target_variable = 'Total_Power'

# Define the potential features
features = [f'X{i}' for i in range(1, 101)] + [f'Y{i}' for i in range(1, 101)] + [f'Power{i}' for i in range(1, 101)] + ['qW']

# Compute the correlation matrix
correlation_matrix = merged_data[features + [target_variable]].corr()

# Sort the correlations with the target variable in descending order
correlation_with_target = correlation_matrix[target_variable].sort_values(ascending=False)

print("Correlation with target variable:")
print(correlation_with_target)

# Choose the top N features with the highest correlation
top_features = correlation_with_target.head(5).index.tolist()
top_features = top_features[1:5]  # Exclude the target variable itself

# Select the relevant features from the dataset (copy to avoid SettingWithCopyWarning when scaling in place)
selected_data = merged_data[top_features + [target_variable]].copy()

# Replace NaN values with 0 in selected_data
#selected_data.fillna(0, inplace=True)

# Scale the features (normalizing)
selected_data[top_features] = scale(selected_data[top_features])

# Build the feature matrix and target vector
X = selected_data[top_features].values
y = selected_data[target_variable].values.reshape(-1, 1)

# Split the data into training, validation, and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)

# Print the sizes of each set
print("Training set size:", len(X_train))
print("Validation set size:", len(X_val))
print("Testing set size:", len(X_test))

# Scaling the features using MinMaxScaler
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)

# Creating a neural network with Keras
#model = keras.Sequential([
#    layers.Dense(128, activation='relu', input_shape=(len(top_features),)),
#    layers.Dense(64, activation='relu'),
#    layers.Dense(32, activation='relu'),
#    layers.Dense(16, activation='relu'),
#    layers.Dense(1)  # a single output (total power)
#])
##########
from tensorflow.keras import regularizers

model = keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(len(top_features),), kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(32, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(16, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(1)  # a single output (total power)
])

# Compiling the model
optimizer = keras.optimizers.Adam(learning_rate=0.00001)
model.compile(optimizer=optimizer, loss='mean_squared_error')

# Early stopping
#early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# Training the model
model.fit(X_train_scaled, y_train, epochs=4000, batch_size=32, validation_data=(X_val_scaled, y_val))

# Training the model with early stopping
#model.fit(X_train_scaled, y_train, epochs=4000, batch_size=32, validation_data=(X_val_scaled, y_val), callbacks=[early_stopping])


# Evaluating the model on the test set
loss = model.evaluate(X_test_scaled, y_test)
print("Test Loss:", loss)

# Making predictions
predictions = model.predict(X_test_scaled)

# Calculating mean squared error
mse = mean_squared_error(y_test, predictions)
print("Mean Squared Error:", mse)

python keras neural-network
1 Answer

My first thought is that with 4k epochs you may be running into both a learning rate that is too low and a network that is too deep. I'm running this on a desktop, so at my current speed I wouldn't get through all 4k epochs within 7 days, but at 60 epochs I'm at a loss of roughly 777 billion with a learning rate of 0.001 and only the first hidden layer. Yes, more hidden layers allow learning more complex functions, and a lower learning rate helps give smoother learning, but both of these require more training epochs to reach a given level of results. If your computer is faster than mine, it may be worth trying a smaller network and a higher learning rate and seeing what you can manage.
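As a rough sketch of that suggestion (reusing the top_features, X_train_scaled, y_train, X_val_scaled and y_val variables from the question; small_model is just an illustrative name, and the epoch count and layer sizes are starting points, not tuned values), a smaller network with a higher learning rate plus the early-stopping callback the question already has commented out could look like this:

# Smaller network: fewer hidden layers than the original 128/64/32/16 stack
small_model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(len(top_features),)),
    layers.Dense(32, activation='relu'),
    layers.Dense(1)  # a single output (total power)
])

# Higher learning rate than 0.00001, as suggested above
small_model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                    loss='mean_squared_error')

# Early stopping so training stops when val_loss plateaus instead of running a fixed 4000 epochs
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10,
                                               restore_best_weights=True)

small_model.fit(X_train_scaled, y_train, epochs=500, batch_size=32,
                validation_data=(X_val_scaled, y_val), callbacks=[early_stopping])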
