Unsupervised encoding in Keras with a custom loss


I am trying to model time-varying covariance with an RNN in Keras, where I decompose the covariance of a signal Y into a time-varying weighted sum: C_Y^t = SUM_i^npriors (alpha_i^t * beta_i), where the beta_i are a fixed basis set and the alpha_i^t are the terms I want to infer.
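For concreteness, the decomposition gives one covariance matrix per time point as a weighted sum of fixed basis matrices. A minimal NumPy sketch (the dimension values and identity bases below are illustrative only, not taken from the post):

import numpy as np

T, npriors, nchans = 100, 2, 5                 # time points, basis functions, channels (illustrative)
beta = np.stack([np.eye(nchans)] * npriors)    # fixed basis set beta_i, shape (npriors, nchans, nchans)
alpha = np.random.rand(T, npriors)             # time-varying weights alpha_i^t, shape (T, npriors)

# C_Y^t = SUM_i alpha_i^t * beta_i, i.e. one covariance matrix per time point
C_Y = np.einsum('ti,ijk->tjk', alpha, beta)    # shape (T, nchans, nchans)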

As a cost function I am (currently) using the negative log-likelihood, where the likelihood is a zero-mean MVN with the inferred covariance C_Y^t (as above): likelihood = MVN(Y; 0, C_Y^t). Once this is implemented correctly, I will use the reparameterization trick together with a KL divergence.
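For reference, once the reparameterization trick is in place, the KL term against a standard-normal prior on alpha would be the usual closed-form Gaussian expression. A sketch of that term, assuming alpha_sigma is interpreted as a log-variance and the prior is N(0, I) (neither of which is fixed by the post):

import tensorflow as tf

def kl_to_standard_normal(alpha_mu, alpha_log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over the alpha dimension."""
    return -0.5 * tf.reduce_sum(
        1.0 + alpha_log_var - tf.square(alpha_mu) - tf.exp(alpha_log_var),
        axis=-1)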

I don't want to explicitly reconstruct the data as in a traditional autoencoder setup; I only want to infer the alpha terms that best fit the time-varying covariance dynamics. So when calling the model, the outputs should just be alpha_mu and alpha_sigma:

alpha_model_net = tf.keras.Model(inputs=[inputs_layer],
                                  outputs= [alpha_mu,alpha_sigma], 
                                  name='Alpha_MODEL')

However, I don't know what these alpha terms are a priori, so when calling alpha_model_net.fit(Y_observed, [alpha_mu_predict, alpha_sigma_predict]) it is hard to know what these [alpha_mu_predict, alpha_sigma_predict] terms should be in an unsupervised setting.

So my question comes in two parts:

  1. What should I feed in as alpha_predict if I don't know it?
  2. Should I actually be using the samples from the alpha distribution, i.e. the alpha_ast in the custom cost function of my attempted implementation shown here?

I have an attempted implementation. The key parts of my code can be seen below; a complete example with data simulation can be found on a Google Colab doc here.

Model

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_probability as tfp
tfd = tfp.distributions

mini_batch_length = 10  # feature length
nchans = 5              # number of features/channels of observed data, Y
nunits = 10             # number of GRU units
npriors = 2             # i.e. how many basis functions we have

inputs_layer = layers.Input(shape=(mini_batch_length, nchans), name='Y_input')

output, state = tf.compat.v1.keras.layers.CuDNNGRU(nunits,  # number of units
                                                   return_state=True,
                                                   return_sequences=True,
                                                   name='uni_INF_GRU')(inputs_layer)

alpha_mu = tf.keras.layers.Dense(npriors, activation='linear', name='alpha_mu')(output)
alpha_sigma = tf.keras.layers.Dense(npriors, activation='linear', name='alpha_sigma')(output)

# use reparameterization trick to push the sampling out as input
alpha_ast = layers.Lambda(sampling, name='alpha_ast')([alpha_mu, alpha_sigma])

# instantiate alpha MODEL network:
alpha_model_net = tf.keras.Model(inputs=[inputs_layer],
                                 outputs=[alpha_ast],
                                 name='Alpha_MODEL')

tf.keras.utils.plot_model(alpha_model_net,
                          to_file='vae_mlp_encoder.png',
                          show_shapes=True)
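The Lambda layer above calls a sampling function that is not shown in this excerpt. A minimal sketch of the standard reparameterization-trick version, assuming alpha_sigma is treated as a log-variance:

import tensorflow as tf

def sampling(args):
    """Reparameterization trick: draw alpha ~ N(mu, sigma^2) as mu + sigma * eps."""
    alpha_mu, alpha_log_var = args
    eps = tf.keras.backend.random_normal(shape=tf.shape(alpha_mu))
    return alpha_mu + tf.exp(0.5 * alpha_log_var) * eps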

Cost function

def vae_loss(Y_portioned, alpha_ast):
    """
    Our cost function is just the NLL.
    The likelihood is a multivariate normal with zero mean and time-varying covariance:
        P(Y|alpha^t) = MVN(Y; 0, C_Y^t) where C_Y^t = SUM_i^npriors (alpha_ast_i^t beta_i)
    Y is our observed data.
    alpha_ast_i^t are our samples from the inferred parameters (mu, sigma).
    beta_i are the basis functions (corresponding to covariance_matrix below)
    and (perhaps obviously) are not trainable.
    """
    # Alphas need to end up being of dimension (?, mini_batch_length, npriors, 1, 1),
    # and need to undergo softplus transformation:
    alpha_ext = tf.keras.backend.expand_dims(tf.keras.backend.expand_dims(
        tf.keras.activations.softplus(alpha_ast), axis=-1), axis=-1)

    # Covariance basis set
    # This needs to be of dim [npriors, sensors, sensors]:
    covariance_basis = np.tile(np.zeros((nchans, nchans)), (npriors, 1, 1)).astype('float32')
    covariance_basis[0, 0, 0] = 1
    covariance_basis[1, 1, 1] = 1

    # Covariance basis functions need to be of dimension [1, 1, npriors, sensors, sensors]
    covariance_ext = tf.reshape(covariance_basis, (1, 1, npriors, nchans, nchans))

    # Do the multiplicative sum over the npriors dimension:
    cov_arg = tf.reduce_sum(tf.multiply(alpha_ext, covariance_ext), 2)

    safety_add = 1e-6 * np.eye(nchans, nchans)
    cov_arg = cov_arg + safety_add

    mvn = tfd.MultivariateNormalFullCovariance(
        loc=np.zeros((mini_batch_length, nchans)).astype('float32'),
        covariance_matrix=cov_arg,
        allow_nan_stats=False)

    # Evaluate the -log(MVN) at the current batch of data. We add a tiny constant
    # to avoid any NaN or inf troubles
    loss = tf.reduce_sum(-tf.math.log(mvn.prob(Y_portioned) + 1e-9))

    return loss

Fitting the model

opt = tf.keras.optimizers.Adam(lr=0.001)

alpha_model_net.compile(optimizer=opt, loss=vae_loss)

history = alpha_model_net.fit(Y_portioned,  # Observed data.
                              Y_portioned,  # ???
                              verbose=1,
                              shuffle=True,
                              epochs=100,
                              batch_size=400)
Many thanks, and please let me know if I've missed any key details.

Using the TensorFlow 2.1.0 backend.

    python tensorflow machine-learning keras
1 Answer
Not sure this is the most sensible way to do it, but I solved this using the add_loss function.

I will update my original question with the full implementation.
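In rough terms, the pattern is to register the NLL as a model-internal loss so that fit() needs no targets. A sketch of the wiring against the code from the question (an illustration of the add_loss pattern, not the final implementation):

# Register the custom NLL via add_loss so that no targets are needed in fit():
alpha_model_net.add_loss(vae_loss(inputs_layer, alpha_ast))
alpha_model_net.compile(optimizer=tf.keras.optimizers.Adam(lr=0.001))

# The model can then be trained unsupervised, passing only the observed data:
history = alpha_model_net.fit(Y_portioned, epochs=100, batch_size=400, shuffle=True, verbose=1)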
