How does TensorFlow's DNNRegressor decide how many steps to run?


I am running DNNRegressor from TensorFlow and I only ever get around 2,000-4,000 steps, depending on the input parameters. I am running DNNRegressor with embedded columns describing the FEN code of a chess game. My labels are the chess evaluations provided by https://www.kaggle.com/datasets/ronakbadhe/chess-evaluations.

Below are the input parameters I am testing, which result in relatively few steps:

import itertools
import pickle

import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow.compat.v1 as tf1
from sklearn.model_selection import train_test_split

embedding_dims = [64, 64*2, 64*4]
basic_list = [32, 32, 64, 64, 64, 128, 128, 264, 264]
hidden_units = [basic_list, [i*2 for i in basic_list], [i//2 for i in basic_list]]
batches = [32, 64, 128, 256, 512, 1024, 2048]

rates = list(np.arange(0.01, 0.2, 0.1))
optimizers = []
optimizersworded = []
for lr in rates:
    optimizers.append(tf.compat.v1.train.AdagradOptimizer(learning_rate=lr))
    optimizersworded.append('tf.compat.v1.train.AdagradOptimizer(learning_rate=' + str(lr) + ')')
    optimizers.append(tf.compat.v1.train.AdamOptimizer(learning_rate=lr))
    optimizersworded.append('tf.compat.v1.train.AdamOptimizer(learning_rate=' + str(lr) + ')')
    optimizers.append(tf.compat.v1.train.FtrlOptimizer(learning_rate=lr))
    optimizersworded.append('tf.compat.v1.train.FtrlOptimizer(learning_rate=' + str(lr) + ')')
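Note that `np.arange(0.01, 0.2, 0.1)` yields only two learning rates, so the loop above produces six optimizers in total (two rates × three optimizer types). A quick check, assuming NumPy's half-open interval semantics:

```python
import numpy as np

# np.arange uses a half-open interval [start, stop), so with step 0.1
# only 0.01 and 0.11 fall below 0.2.
rates = [round(float(r), 2) for r in np.arange(0.01, 0.2, 0.1)]
print(rates)  # -> [0.01, 0.11]
```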

Here is an example of the kind of evaluation I get:

{'average_loss': 350384.3, 'label/mean': 42.602047, 'loss': 89658030.0, 'prediction/mean': 42.507717, 'global_step': 1068}
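The `global_step` here looks consistent with a single pass over the training set: when no `steps` argument is passed to `train()`, the Estimator runs until the `input_fn`'s dataset raises end-of-input, and a `Dataset.from_tensor_slices(...).batch(batch)` pipeline without `.repeat()` yields exactly one epoch. A rough sanity check, assuming the 1,093,139 training rows and the batch size of 1024 from the grid below:

```python
import math

train_examples = 1093139   # training rows reported below
batch = 1024               # one of the batch sizes in the grid

# With no .repeat() on the dataset and no steps= passed to train(),
# training stops after one epoch: one step per batch.
steps_per_epoch = math.ceil(train_examples / batch)
print(steps_per_epoch)  # -> 1068, matching global_step above
```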

By the way, I know there are better ways to run this kind of data in TensorFlow. I am only using DNNRegressor because I ran into an error when defining my own model, as described in this thread: No loss found in Tensorflow model despite compiling. I do not expect the results to be comparable to what you could get by defining your own model and testing different combinations of layers and parameters.

Additional information:

Dataframe head (the target is the label parameter; the column names are not correct, but each column describes one square of the chess board. There is also a final column for whose turn it is):

   target  r  n  b  q  k b.1 n.1 r.1  p  ... P.7  R X.31  B  Q  K B.1 N.1 R.1   
0      50  r  n  b  q  k   b   X   r  p  ...   P  R    X  B  Q  K   B   N   R  \
1      10  r  n  b  q  k   b   X   r  p  ...   P  R    X  B  Q  K   B   N   R   
2      75  r  n  b  q  k   b   X   r  p  ...   P  R    X  B  Q  K   B   N   R   
3      52  r  n  b  q  k   b   X   r  p  ...   P  R    X  B  Q  K   B   N   R   
4      52  r  n  b  q  k   b   X   r  p  ...   P  R    X  B  Q  K   B   N   R   

  b.2  
0   w  
1   b  
2   w  
3   b  
4   w  
[5 rows x 66 columns]

Number of rows and columns in the dataset:

row number: 1366424
column number: 66

Number of rows after splitting into training and test datasets:

1093139 train examples
273285 test examples

Here is the code I use to preprocess the data and run the model with different parameters and hyperparameters (I am not using GridSearchCV because a) I want to test parameters, not just hyperparameters, and b) I could not get it to run without rewriting code: GridSearchCV requires you to fit the model, not just train it, and DNNRegressor has no fit option, although I have seen people get one from older versions of TensorFlow):

parameters = [embedding_dims, hidden_units, optimizers, batches]
parametersworded = [embedding_dims, hidden_units, optimizersworded, batches]

parameters_perm = list(itertools.product(*parameters))
parametersworded_perm = list(itertools.product(*parametersworded))
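For scale: with the list lengths above (3 embedding sizes, 3 hidden-unit lists, 6 optimizers, 7 batch sizes), this grid has 378 combinations to train. A quick check with stand-in lists of the same lengths (the values here are placeholders, only the lengths matter):

```python
import itertools

embedding_dims = [64, 128, 256]                 # 3 options
hidden_units = [[32], [64], [16]]               # 3 options
optimizers = list(range(6))                     # 2 learning rates x 3 optimizer types
batches = [32, 64, 128, 256, 512, 1024, 2048]   # 7 options

grid = list(itertools.product(embedding_dims, hidden_units, optimizers, batches))
print(len(grid))  # -> 378
```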

try:
    with open("evals", "rb") as fp:   # Unpickling
        evals = pickle.load(fp)
except (FileNotFoundError, EOFError):
    evals = {}

X = pd.read_csv('RegChessDataForTensorflow.csv', nrows=1366424) #,nrows=10000
X["target"] = X["target"].astype(int)
X = X.loc[:, ~X.columns.str.contains('^Unnamed')]

print(X.head())
print("row number:", X.shape[0])
print("column number:", X.shape[1])

train_ds, test_ds = train_test_split(X, test_size=0.2)
print(len(train_ds), 'train examples')
print(len(test_ds), 'test examples')

#tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.INFO)

features={}
labels=[]

def embedded_col_maker(mode,embedding_dims):

    X = train_ds if mode == tf.estimator.ModeKeys.TRAIN else test_ds

    global features
    global labels
    features={}
    labels=[i for i in X.iloc[:,0]]

    for colnum in range(1, len(X.columns)):
        features[str(X.columns[colnum])] = [str(i) for i in X.iloc[:, colnum]]

    embedding_cols=[]
    for colnum in range(1,len(X.columns)):
        vocab_col = tf1.feature_column.categorical_column_with_vocabulary_list(X.columns[colnum],vocabulary_list=['r', 'n', 'b','q','k','p','X','R','P','B','Q','K','N'],num_oov_buckets=0)
        embedding_cols.append(tf1.feature_column.embedding_column(vocab_col, embedding_dims))

    return embedding_cols

for i in range(len(parameters_perm)):

    print("parameters:", parametersworded_perm[i])
    print("finished", i, "of", len(parameters_perm))
    embedding_dims = int(parameters_perm[i][0])
    hidden_units = parameters_perm[i][1]
    optimizer = parameters_perm[i][2]
    batch = parameters_perm[i][3]

    try:

        if i == 0: embedding_cols = embedded_col_maker(tf.estimator.ModeKeys.TRAIN, embedding_dims)
        # Skip combinations that were already evaluated in a previous run.
        if parametersworded_perm[i] in evals.values(): continue

        model_dir = "/Users/lukastaylor/tfpy/RegModels/modelsearcb2/"+str(i)

        DNN = tf1.estimator.DNNRegressor(feature_columns = embedding_cols,
                                         hidden_units=hidden_units,
                                         #model_dir=model_dir,
                                         optimizer=optimizer)

        # Note: no .repeat() here, so the dataset is exhausted after one epoch.
        input_fn = lambda: tf1.data.Dataset.from_tensor_slices((features, labels)).batch(batch)

        DNN.train(input_fn)

        embedding_cols = embedded_col_maker(tf.estimator.ModeKeys.EVAL, embedding_dims)
        evaluation = DNN.evaluate(input_fn)
        print(parametersworded_perm[i],"evaluation:",evaluation)
        evals[evaluation['average_loss']] = parametersworded_perm[i]

    except Exception as e:

        print("ERROR:", e)
        evals[str(e)] = parametersworded_perm[i]

    with open("evals", "wb") as fp:
        pickle.dump(evals, fp)

print(evals)