I am using optuna for hyperparameter tuning of ML models in Python. While defining the objective function for tuning a deep learning model, I tried to supply a list of options for trial.suggest_int to pick values from, for example:

'batch_size': trial.suggest_int('batch_size', [16, 32, 64, 128, 256])
The optuna documentation says trial.suggest_int should take the following format:

'some_param': trial.suggest_int('some_param', low, high, step)
My code looks like this:
def objective(trial):
    DL_param = {
        'learning_rate': trial.suggest_float('learning_rate', 1e-3, 1e-1),
        'optimizer': trial.suggest_categorical('optimizer', ["Adam", "RMSprop", "SGD"]),
        'h_units': trial.suggest_int('h_units', 50, 250, step=50),
        'alpha': trial.suggest_float('alpha', [0.001, 0.01, 0.1, 0.2, 0.3]),
        'batch_size': trial.suggest_int('batch_size', [16, 32, 64, 128, 256]),
    }
    DL_model = build_model(DL_param)
    DL_model.compile(optimizer=DL_param['optimizer'], loss='mean_squared_error')
    DL_model.fit(x_train, y_train, validation_split=0.3, shuffle=True,
                 batch_size=DL_param['batch_size'], epochs=30)
    y_pred_2 = DL_model.predict(x_test)
    return mse(y_test_2, y_pred_2, squared=True)
I am having trouble defining the lists for the parameters 'alpha' and 'batch_size'. Is there a way to do this? Something like trial.suggest_categorical, which picks a string from a given list, as in the line above:

'optimizer': trial.suggest_categorical('optimizer', ["Adam", "RMSprop", "SGD"])

Any suggestions are welcome. Thanks in advance.
It turns out you can use trial.suggest_categorical to achieve your goal:
import optuna
def objective(trial):
    # define two variables:
    A = trial.suggest_categorical('A', [1, 2, 3])
    B = trial.suggest_categorical('B', [5, 6])
    # minimize this toy objective:
    obj = A / B
    return obj
study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
GridSampler can also solve this problem:
import optuna
def objective(trial):
    # define two variables:
    A = trial.suggest_float('A', 0.001, 0.01)
    B = trial.suggest_int('B', 10, 70)
    # minimize this toy objective:
    obj = A / B
    return obj

def optimization():
    # define custom values to search on:
    search_space = {'A': [0.0015, 0.003, 0.0075], 'B': [11, 23]}
    sampler = optuna.samplers.GridSampler(search_space)
    study = optuna.create_study(study_name="Optimization over given values", sampler=sampler)
    study.optimize(objective, n_trials=6)

if __name__ == '__main__':
    optimization()
The output is:
[I 2023-02-03 10:21:01,912] A new study created in memory with name: Optimization over given values
[I 2023-02-03 10:21:01,914] Trial 0 finished with value: 0.0006818181818181818 and parameters: {'A': 0.0075, 'B': 11}. Best is trial 0 with value: 0.0006818181818181818.
[I 2023-02-03 10:21:01,916] Trial 1 finished with value: 0.00013043478260869567 and parameters: {'A': 0.003, 'B': 23}. Best is trial 1 with value: 0.00013043478260869567.
[I 2023-02-03 10:21:01,917] Trial 2 finished with value: 0.0003260869565217391 and parameters: {'A': 0.0075, 'B': 23}. Best is trial 1 with value: 0.00013043478260869567.
[I 2023-02-03 10:21:01,921] Trial 3 finished with value: 6.521739130434783e-05 and parameters: {'A': 0.0015, 'B': 23}. Best is trial 3 with value: 6.521739130434783e-05.
[I 2023-02-03 10:21:01,927] Trial 4 finished with value: 0.00013636363636363637 and parameters: {'A': 0.0015, 'B': 11}. Best is trial 3 with value: 6.521739130434783e-05.
[I 2023-02-03 10:21:01,951] Trial 5 finished with value: 0.00027272727272727274 and parameters: {'A': 0.003, 'B': 11}. Best is trial 3 with value: 6.521739130434783e-05.
Just to add a note to @mohammad-joshaghani's solution: when defining custom hyperparameter values via search_space and GridSampler, you must provide custom values for every hyperparameter that is tuned and defined in the objective() function. Otherwise, any attempt to run the optimization on an ML model is terminated with an error raised from optuna/samplers/_grid.py, line 186, in sample_independent, raise ValueError(message), along the lines of:

"ValueError: The parameter name, learning_rate, is not found in the given grid."
On the other hand, if one prefers to provide user-specified values for only a subset of the hyperparameters, @youjun-hu's solution is recommended instead.