Is there a simple way to extend an existing activation function? My custom softmax function returns: An operation has `None` for gradient

Question (votes: 2, answers: 1)

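I am trying to make softmax faster by applying it only to the top-k values of a vector.

To do this, I tried to implement a custom TensorFlow function and use it in a model: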

def softmax_top_k(logits, k=10):
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    softmax = tf.nn.softmax(values)
    logits_shape = tf.shape(logits)
    return_value = tf.sparse_to_dense(indices, logits_shape, softmax)
    return_value = tf.convert_to_tensor(return_value, dtype=logits.dtype, name=logits.name)
    return return_value

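I am using Fashion-MNIST to test whether this approach works:

import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split

fashion_mnist = keras.datasets.fashion_mnist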
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# normalize the data
train_images = train_images / 255.0
test_images = test_images / 255.0

# split the training data into train and validate arrays (will be used later)
train_images, train_images_validate, train_labels, train_labels_validate = train_test_split(
    train_images, train_labels, test_size=0.2, random_state=133742,
)

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=softmax_top_k)
])


model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

model.fit(
    train_images, train_labels,
    epochs=10,
    validation_data=(train_images_validate, train_labels_validate),
)

But an error occurs during execution:

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable).

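I found this question (How to make a custom activation function), which explains how to implement a completely custom activation function for TensorFlow. But since my function only uses and extends softmax, I thought the gradient should still be the same.

This is my first week coding with Python and TensorFlow, so I don't yet have a good overview of the internal implementation.

Is there an easier way to extend softmax into a new function than implementing it from scratch?

Thanks in advance!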

python tensorflow keras softmax activation-function
1 answer (votes: 0)

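Instead of using a sparse tensor to represent the tensor that is "all zeros except for the softmaxed top-k values", use tf.scatter_nd. The error in the question suggests that tf.sparse_to_dense had no gradient defined in that TensorFlow version, whereas tf.scatter_nd is differentiable with respect to the values it scatters, so the gradient can flow back through the softmax: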

import tensorflow as tf

def softmax_top_k(logits, k=10):
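    # Keep only the k largest logits per row, along with their column indices
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    # Normalize over the k kept values only
    softmax = tf.nn.softmax(values)
    logits_shape = tf.shape(logits)
    # Assuming that logits is 2D
    # Row index for each kept value, shape [batch, k]
    rows = tf.tile(tf.expand_dims(tf.range(logits_shape[0]), 1), [1, k])
    # (row, column) pairs, shape [batch, k, 2]
    scatter_idx = tf.stack([rows, indices], axis=-1)
    # Write the softmaxed values into a zero tensor shaped like logits
    return tf.scatter_nd(scatter_idx, softmax, logits_shape)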
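
To check that the scatter-based version really is differentiable, something like the following can be run. This is only a minimal sketch, assuming TensorFlow 2.x with eager execution (the question itself appears to use an older graph-mode version); the sample `logits`, the `weights` tensor and the toy `loss` are made up for illustration.

```python
import tensorflow as tf

def softmax_top_k(logits, k=10):
    # Same approach as above: softmax over the top-k values, scattered into zeros
    values, indices = tf.math.top_k(logits, k=k, sorted=False)
    softmax = tf.nn.softmax(values)
    logits_shape = tf.shape(logits)
    rows = tf.tile(tf.expand_dims(tf.range(logits_shape[0]), 1), [1, k])
    scatter_idx = tf.stack([rows, indices], axis=-1)
    return tf.scatter_nd(scatter_idx, softmax, logits_shape)

# Hypothetical 2x4 batch of logits, used only for this check
logits = tf.Variable([[1.0, 3.0, 2.0, 0.0],
                      [0.5, 0.1, 0.9, 0.2]])
# Arbitrary weights so that the toy loss depends on the individual outputs
weights = tf.constant([[1.0, 2.0, 3.0, 4.0],
                       [4.0, 3.0, 2.0, 1.0]])

with tf.GradientTape() as tape:
    probs = softmax_top_k(logits, k=2)
    loss = tf.reduce_sum(probs * weights)

# A None here would mean that some op in the chain has no gradient
grad = tape.gradient(loss, logits)
print(probs.numpy())
print(grad)
```

If `grad` comes back as a tensor rather than `None`, every op between the logits and the loss has a gradient defined, which is exactly what the error message in the question complains about.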

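Edit: Here is a slightly more complex version for tensors with an arbitrary number of dimensions. The code still requires the number of dimensions to be known at graph construction time, however.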

import tensorflow as tf

def softmax_top_k(logits, k=10):
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    softmax = tf.nn.softmax(values)
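    # Make nd indices
    logits_shape = tf.shape(logits)
    # One index range per leading dimension (the rank must be statically known)
    dims = [tf.range(logits_shape[i]) for i in range(logits_shape.shape.num_elements() - 1)]
    # Coordinate grids of shape [..., k]; the last grid is replaced by the top-k indices
    grid = tf.meshgrid(*dims, tf.range(k), indexing='ij')
    scatter_idx = tf.stack(grid[:-1] + [indices], axis=-1)
    return tf.scatter_nd(scatter_idx, softmax, logits_shape)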
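
One way to sanity-check the n-dimensional version is to compare its output against a plain NumPy reference. The following is a rough sketch, assuming NumPy 1.17 or newer; `softmax_top_k_reference` is a hypothetical helper name and the random test input is arbitrary, neither comes from the code above.

```python
import numpy as np

def softmax_top_k_reference(logits, k):
    """NumPy reference: softmax over the k largest entries along the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    # Indices of the k largest entries along the last axis (in no particular order)
    idx = np.argpartition(logits, -k, axis=-1)[..., -k:]
    top = np.take_along_axis(logits, idx, axis=-1)
    # Subtract the maximum for numerical stability before exponentiating
    top = top - top.max(axis=-1, keepdims=True)
    exp = np.exp(top)
    probs = exp / exp.sum(axis=-1, keepdims=True)
    # Scatter the probabilities back into a zero array shaped like the input
    out = np.zeros_like(logits)
    np.put_along_axis(out, idx, probs, axis=-1)
    return out

# Arbitrary 3D input to exercise more than two dimensions
x = np.random.default_rng(0).normal(size=(2, 3, 5))
y = softmax_top_k_reference(x, k=2)
print(y.shape)
print(y.sum(axis=-1))
```

Each slice along the last axis should sum to 1 and contain exactly k non-zero entries; the TensorFlow function should produce the same values for the same input, up to floating-point precision.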