Unsupervised encoding in Keras with a custom loss


I am trying to model time-varying covariance with an RNN in Keras, where I decompose the covariance of a signal Y into a time-varying weighted sum: C_Y^t = SUM_i^npriors (alpha_i^t * beta_i), where the beta_i are a fixed basis set and the alpha_i^t are the terms I want to infer.
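For concreteness, the decomposition gives one covariance matrix per time point as a weighted sum of fixed basis matrices. A minimal NumPy sketch (the dimension values and identity bases below are illustrative only, not taken from the post):

import numpy as np

T, npriors, nchans = 100, 2, 5                 # time points, basis functions, channels (illustrative)
beta = np.stack([np.eye(nchans)] * npriors)    # fixed basis set beta_i, shape (npriors, nchans, nchans)
alpha = np.random.rand(T, npriors)             # time-varying weights alpha_i^t, shape (T, npriors)

# C_Y^t = SUM_i alpha_i^t * beta_i, i.e. one covariance matrix per time point
C_Y = np.einsum('ti,ijk->tjk', alpha, beta)    # shape (T, nchans, nchans)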

As a cost function I am (currently) using the negative log-likelihood, where the likelihood is a zero-mean MVN with the inferred covariance C_Y^t (as above): likelihood = MVN(Y; 0, C_Y^t). Once this is implemented correctly, I will use the reparameterization trick together with a KL divergence.
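For reference, once the reparameterization trick is in place, the KL term against a standard-normal prior on alpha would be the usual closed-form Gaussian expression. A sketch of that term, assuming alpha_sigma is interpreted as a log-variance and the prior is N(0, I) (neither of which is fixed by the post):

import tensorflow as tf

def kl_to_standard_normal(alpha_mu, alpha_log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over the alpha dimension."""
    return -0.5 * tf.reduce_sum(
        1.0 + alpha_log_var - tf.square(alpha_mu) - tf.exp(alpha_log_var),
        axis=-1)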

I don't want to explicitly reconstruct the data as in a traditional autoencoder setup; I only want to infer the alpha terms that best fit the time-varying covariance dynamics. So when calling the model, the outputs should just be alpha_mu and alpha_sigma:

alpha_model_net = tf.keras.Model(inputs=[inputs_layer],
                                  outputs= [alpha_mu,alpha_sigma], 
                                  name='Alpha_MODEL')

However, I don't know what these alpha terms are a priori, so when calling alpha_model_net.fit(Y_observed, [alpha_mu_predict, alpha_sigma_predict]) it is hard to know what these [alpha_mu_predict, alpha_sigma_predict] terms should be in an unsupervised setting.

So my question comes in two parts:

  1. What should I feed in as alpha_predict if I don't know it?
  2. Should I actually be using the samples from the alpha distribution, i.e. the alpha_ast in the custom cost function of my attempted implementation shown here?

I have an attempted implementation. The key parts of my code can be seen below; a complete example with data simulation can be found on a Google Colab doc here.

Model

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_probability as tfp
tfd = tfp.distributions

mini_batch_length = 10  # feature length
nchans = 5              # number of features/channels of observed data, Y
nunits = 10             # number of GRU units
npriors = 2             # i.e. how many basis functions we have

inputs_layer = layers.Input(shape=(mini_batch_length, nchans), name='Y_input')

output, state = tf.compat.v1.keras.layers.CuDNNGRU(nunits,  # number of units
                                                   return_state=True,
                                                   return_sequences=True,
                                                   name='uni_INF_GRU')(inputs_layer)

alpha_mu = tf.keras.layers.Dense(npriors, activation='linear', name='alpha_mu')(output)
alpha_sigma = tf.keras.layers.Dense(npriors, activation='linear', name='alpha_sigma')(output)

# use reparameterization trick to push the sampling out as input
alpha_ast = layers.Lambda(sampling, name='alpha_ast')([alpha_mu, alpha_sigma])

# instantiate alpha MODEL network:
alpha_model_net = tf.keras.Model(inputs=[inputs_layer],
                                 outputs=[alpha_ast],
                                 name='Alpha_MODEL')

tf.keras.utils.plot_model(alpha_model_net,
                          to_file='vae_mlp_encoder.png',
                          show_shapes=True)
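The Lambda layer above calls a sampling function that is not shown in this excerpt. A minimal sketch of the standard reparameterization-trick version, assuming alpha_sigma is treated as a log-variance:

import tensorflow as tf

def sampling(args):
    """Reparameterization trick: draw alpha ~ N(mu, sigma^2) as mu + sigma * eps."""
    alpha_mu, alpha_log_var = args
    eps = tf.keras.backend.random_normal(shape=tf.shape(alpha_mu))
    return alpha_mu + tf.exp(0.5 * alpha_log_var) * eps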

Cost function

def vae_loss(Y_portioned, alpha_ast):
    """
    Our cost function is just the NLL.
    The likelihood is a multivariate normal with zero mean and time-varying covariance:
        P(Y|alpha^t) = MVN(Y; 0, C_Y^t) where C_Y^t = SUM_i^npriors (alpha_ast_i^t beta_i)
    Y is our observed data.
    alpha_ast_i^t are our samples from the inferred parameters (mu, sigma).
    beta_i are the basis functions (corresponding to covariance_matrix below)
    and (perhaps obviously) are not trainable.
    """
    # Alphas need to end up being of dimension (?, mini_batch_length, npriors, 1, 1),
    # and need to undergo softplus transformation:
    alpha_ext = tf.keras.backend.expand_dims(tf.keras.backend.expand_dims(
        tf.keras.activations.softplus(alpha_ast), axis=-1), axis=-1)

    # Covariance basis set
    # This needs to be of dim [npriors, sensors, sensors]:
    covariance_basis = np.tile(np.zeros((nchans, nchans)), (npriors, 1, 1)).astype('float32')
    covariance_basis[0, 0, 0] = 1
    covariance_basis[1, 1, 1] = 1

    # Covariance basis functions need to be of dimension [1, 1, npriors, sensors, sensors]
    covariance_ext = tf.reshape(covariance_basis, (1, 1, npriors, nchans, nchans))

    # Do the multiplicative sum over the npriors dimension:
    cov_arg = tf.reduce_sum(tf.multiply(alpha_ext, covariance_ext), 2)

    safety_add = 1e-6 * np.eye(nchans, nchans)
    cov_arg = cov_arg + safety_add

    mvn = tfd.MultivariateNormalFullCovariance(
        loc=np.zeros((mini_batch_length, nchans)).astype('float32'),
        covariance_matrix=cov_arg,
        allow_nan_stats=False)

    # Evaluate the -log(MVN) at the current batch of data. We add a tiny constant
    # to avoid any NaN or inf troubles
    loss = tf.reduce_sum(-tf.math.log(mvn.prob(Y_portioned) + 1e-9))

    return loss

Fitting the model

opt = tf.keras.optimizers.Adam(lr=0.001)

alpha_model_net.compile(optimizer=opt, loss=vae_loss)

history = alpha_model_net.fit(Y_portioned,  # Observed data.
                              Y_portioned,  # ???
                              verbose=1,
                              shuffle=True,
                              epochs=100,
                              batch_size=400)
Many thanks, and please let me know if I've missed any key details.

Using the TensorFlow 2.1.0 backend.

    python tensorflow machine-learning keras
1 Answer
Not sure this is the most sensible way to do it, but I solved this using the add_loss function.

I will update my original question with the full implementation.
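In rough terms, the pattern is to register the NLL as a model-internal loss so that fit() needs no targets. A sketch of the wiring against the code from the question (an illustration of the add_loss pattern, not the final implementation):

# Register the custom NLL via add_loss so that no targets are needed in fit():
alpha_model_net.add_loss(vae_loss(inputs_layer, alpha_ast))
alpha_model_net.compile(optimizer=tf.keras.optimizers.Adam(lr=0.001))

# The model can then be trained unsupervised, passing only the observed data:
history = alpha_model_net.fit(Y_portioned, epochs=100, batch_size=400, shuffle=True, verbose=1)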
