Why should the generator's training labels in a GAN always be True?

votes: 0 · answers: 2

I am currently studying deep learning, especially GANs. I found a simple GAN example on the site below: https://medium.com/@devnag/generative-adversarial-networks-gans-in-50-lines-of-code-pytorch-e81b79659e3f

However, in the code, I don't understand why we always need to give the generator True labels, as shown below.

        for g_index in range(g_steps):
            # 2. Train G on D's response (but DO NOT train D on these labels)
            G.zero_grad()

            gen_input = Variable(gi_sampler(minibatch_size, g_input_size))
            g_fake_data = G(gen_input)
            dg_fake_decision = D(preprocess(g_fake_data.t()))
            g_error = criterion(dg_fake_decision, Variable(torch.ones(1)))  # we want to fool, so pretend it's all genuine

            g_error.backward()
            g_optimizer.step()  # Only optimizes G's parameters

Specifically, on this line:

            g_error = criterion(dg_fake_decision, Variable(torch.ones(1)))  # we want to fool, so pretend it's all genuine

The generator's input data is fake data (noise), so if we assign True labels to that input, I would think the generator ends up creating data that resembles fake data (i.e., data that does not look real). Is my understanding wrong? Sorry for the silly question, but please help if you know the answer. I will put the full code below.

    #!/usr/bin/env python

    # Generative Adversarial Networks (GAN) example in PyTorch.
    # See related blog post at https://medium.com/@devnag/generative-adversarial-networks-gans-in-50-lines-of-code-pytorch-e81b79659e3f#.sch4xgsa9
    import numpy as np
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torch.autograd import Variable

    # Data params
    data_mean = 4
    data_stddev = 1.25

    # Model params
    g_input_size = 1     # Random noise dimension coming into generator, per output vector
    g_hidden_size = 50   # Generator complexity
    g_output_size = 1    # size of generated output vector
    d_input_size = 100   # Minibatch size - cardinality of distributions
    d_hidden_size = 50   # Discriminator complexity
    d_output_size = 1    # Single dimension for 'real' vs. 'fake'
    minibatch_size = d_input_size

    d_learning_rate = 2e-4  # 2e-4
    g_learning_rate = 2e-4
    optim_betas = (0.9, 0.999)
    num_epochs = 30000
    print_interval = 200
    d_steps = 1  # 'k' steps in the original GAN paper. Can put the discriminator on higher training freq than generator
    g_steps = 1

    # ### Uncomment only one of these
    #(name, preprocess, d_input_func) = ("Raw data", lambda data: data, lambda x: x)
    (name, preprocess, d_input_func) = ("Data and variances", lambda data: decorate_with_diffs(data, 2.0), lambda x: x * 2)

    print("Using data [%s]" % (name))

    # ##### DATA: Target data and generator input data

    def get_distribution_sampler(mu, sigma):
        return lambda n: torch.Tensor(np.random.normal(mu, sigma, (1, n)))  # Gaussian

    def get_generator_input_sampler():
        return lambda m, n: torch.rand(m, n)  # Uniform-dist data into generator, _NOT_ Gaussian

    # ##### MODELS: Generator model and discriminator model

    class Generator(nn.Module):
        def __init__(self, input_size, hidden_size, output_size):
            super(Generator, self).__init__()
            self.map1 = nn.Linear(input_size, hidden_size)
            self.map2 = nn.Linear(hidden_size, hidden_size)
            self.map3 = nn.Linear(hidden_size, output_size)

        def forward(self, x):
            x = F.elu(self.map1(x))
            x = F.sigmoid(self.map2(x))
            return self.map3(x)

    class Discriminator(nn.Module):
        def __init__(self, input_size, hidden_size, output_size):
            super(Discriminator, self).__init__()
            self.map1 = nn.Linear(input_size, hidden_size)
            self.map2 = nn.Linear(hidden_size, hidden_size)
            self.map3 = nn.Linear(hidden_size, output_size)

        def forward(self, x):
            x = F.elu(self.map1(x))
            x = F.elu(self.map2(x))
            return F.sigmoid(self.map3(x))

    def extract(v):
        return v.data.storage().tolist()

    def stats(d):
        return [np.mean(d), np.std(d)]

    def decorate_with_diffs(data, exponent):
        mean = torch.mean(data.data, 1, keepdim=True)
        mean_broadcast = torch.mul(torch.ones(data.size()), mean.tolist()[0][0])
        diffs = torch.pow(data - Variable(mean_broadcast), exponent)
        return torch.cat([data, diffs], 1)

    d_sampler = get_distribution_sampler(data_mean, data_stddev)
    gi_sampler = get_generator_input_sampler()
    G = Generator(input_size=g_input_size, hidden_size=g_hidden_size, output_size=g_output_size)
    D = Discriminator(input_size=d_input_func(d_input_size), hidden_size=d_hidden_size, output_size=d_output_size)
    criterion = nn.BCELoss()  # Binary cross entropy: http://pytorch.org/docs/nn.html#bceloss
    d_optimizer = optim.Adam(D.parameters(), lr=d_learning_rate, betas=optim_betas)
    g_optimizer = optim.Adam(G.parameters(), lr=g_learning_rate, betas=optim_betas)

    for epoch in range(num_epochs):
        for d_index in range(d_steps):
            # 1. Train D on real+fake
            D.zero_grad()

            #  1A: Train D on real
            d_real_data = Variable(d_sampler(d_input_size))
            d_real_decision = D(preprocess(d_real_data))
            d_real_error = criterion(d_real_decision, Variable(torch.ones(1)))  # ones = true
            d_real_error.backward() # compute/store gradients, but don't change params

            #  1B: Train D on fake
            d_gen_input = Variable(gi_sampler(minibatch_size, g_input_size))
            d_fake_data = G(d_gen_input).detach()  # detach to avoid training G on these labels
            d_fake_decision = D(preprocess(d_fake_data.t()))
            d_fake_error = criterion(d_fake_decision, Variable(torch.zeros(1)))  # zeros = fake
            d_fake_error.backward()
            d_optimizer.step()     # Only optimizes D's parameters; changes based on stored gradients from backward()

        for g_index in range(g_steps):
            # 2. Train G on D's response (but DO NOT train D on these labels)
            G.zero_grad()

            gen_input = Variable(gi_sampler(minibatch_size, g_input_size))
            g_fake_data = G(gen_input)
            dg_fake_decision = D(preprocess(g_fake_data.t()))
            g_error = criterion(dg_fake_decision, Variable(torch.ones(1)))  # we want to fool, so pretend it's all genuine

            g_error.backward()
            g_optimizer.step()  # Only optimizes G's parameters

        if epoch % print_interval == 0:
            print("%s: D: %s/%s G: %s (Real: %s, Fake: %s) " % (epoch,
                                                                extract(d_real_error)[0],
                                                                extract(d_fake_error)[0],
                                                                extract(g_error)[0],
                                                                stats(extract(d_real_data)),
                                                                stats(extract(d_fake_data))))
machine-learning deep-learning pytorch
2 Answers

4 votes

In this part of the code you are training G to fool D: G generates fake data and asks D whether it thinks the data is real (hence the True labels), and D's gradients then propagate all the way back to G (this is possible because D's input is G's output), so that in the next iteration G will have learned to fool D better.

The inputs to G are not trainable; G only tries to transform them into realistic data (data similar to what d_sampler produces).
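A minimal sketch of what this answer describes, using hypothetical toy models rather than the models from the question: the generator loss is computed on D's output, so backward() sends gradients through D into G, but only G's parameters get updated, because only g_optimizer.step() is ever called in this phase.

    # Minimal sketch with hypothetical toy models, showing that gradients flow
    # from the BCE loss through D back into G, while only G is updated.
    import torch
    import torch.nn as nn

    G = nn.Linear(1, 1)                               # toy generator
    D = nn.Sequential(nn.Linear(1, 1), nn.Sigmoid())  # toy discriminator
    g_optimizer = torch.optim.Adam(G.parameters(), lr=1e-3)
    criterion = nn.BCELoss()

    noise = torch.rand(4, 1)      # generator input; requires_grad is False,
                                  # so the noise itself is never optimized
    fake = G(noise)               # G's output feeds D ...
    decision = D(fake)            # ... so the autograd graph connects D to G
    loss = criterion(decision, torch.ones(4, 1))  # "pretend it's all genuine"
    loss.backward()

    print(G.weight.grad)          # non-None: G received gradients through D
    print(D[0].weight.grad)       # D accumulates gradients too, but ...
    g_optimizer.step()            # ... only G's parameters change here; D's
                                  # optimizer is never stepped in this phase

(In the question's code, D.zero_grad() at the start of the next discriminator step clears whatever gradients accumulated on D during this phase.)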


0 votes

The labels for the generator's output are set to "True" during the generator's training phase (the second for loop in the code, over g_steps). This step is necessary because it produces a loss between D's decision and "True", and that loss is what updates the generator. If you instead kept the generator's labels at "0" ("False"), then even when the discriminator successfully judged what it sees to be fake by outputting "0", nothing would happen, because there would be almost no loss to backpropagate, as the small numeric check below shows.
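A small numeric check of this point (my own illustration, not from the original answer), using the same nn.BCELoss as the question's code: when D confidently calls a generated sample fake, a target of 0 yields an almost-zero loss and therefore almost no gradient for G, while a target of 1 yields a large loss that pushes G to change.

    # Hypothetical numbers: suppose D outputs 0.01 for a generated sample,
    # i.e. D is almost certain the sample is fake.
    import torch
    import torch.nn as nn

    criterion = nn.BCELoss()
    d_output = torch.tensor([0.01])

    # Target 0 ("fake"): loss = -log(1 - 0.01) ~= 0.01 -> no signal to improve G
    print(criterion(d_output, torch.zeros(1)))
    # Target 1 ("real"): loss = -log(0.01)     ~= 4.61 -> strong push to fool D
    print(criterion(d_output, torch.ones(1)))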
