Basic function approximation with bias in PyTorch

Question

In R, it is easy to approximate a basic function with a neural network:

library(nnet)
x <- sort(10*runif(50))
y <- sin(x)
nn <- nnet(x, y, size=4, maxit=10000, linout=TRUE, abstol=1.0e-8, reltol = 1.0e-9, Wts = seq(0, 1, by=1/12) )
plot(x, y)
x1 <- seq(0, 10, by=0.1)
lines(x1, predict(nn, data.frame(x=x1)), col="green")
predict( nn , data.frame(x=pi/2) )

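A simple neural network with a single hidden layer of just 4 neurons is enough to approximate the sine. (Based on the Stack Overflow question "Approximating function with Neural Network".)

But I cannot get the same thing in PyTorch.

In fact, the network that R creates contains not only the input, the four hidden neurons, and the output, but also two "bias" neurons: the first is connected to the hidden layer, the second to the output.

The network diagram was obtained with: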

library(devtools)
library(scales)
library(reshape)
source_url('https://gist.github.com/fawda123/7471137/raw/cd6e6a0b0bdb4e065c597e52165e5ac887f5fe95/nnet_plot_update.r')
plot.nnet(nn$wts,struct=nn$n, pos.col='#007700',neg.col='#FF7777')   ### this plots the graph
plot.nnet(nn$wts,struct=nn$n, pos.col='#007700',neg.col='#FF7777', wts.only=1)   ### this prints the weights

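Attempting the same in PyTorch produces a different network: the bias neurons are missing.

Below is my attempt to do in PyTorch what was done in R. The results are not satisfactory: the function is not approximated. The most obvious difference is the absence of the bias neurons.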

import torch
from torch.autograd import Variable

import random
import math

N, D_in, H, D_out = 1000, 1, 4, 1

l_x = []
l_y = []

for a in range(1000):
    r = random.random()*10
    l_x.append( [r] )
    l_y.append( [math.sin(r)] )


tx = torch.cuda.FloatTensor(l_x)
ty = torch.cuda.FloatTensor(l_y)

x = Variable(tx, requires_grad=False)
y = Variable(ty, requires_grad=False)

w1 = Variable(torch.randn(D_in, H ).type(torch.cuda.FloatTensor), requires_grad=True)
w2 = Variable(torch.randn(H, D_out).type(torch.cuda.FloatTensor), requires_grad=True)

learning_rate = 1e-5
for t in range(1000):
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    loss = (y_pred - y).pow(2).sum()
    if t<10 or t%100==1: print(t, loss.item())

    loss.backward()

    w1.data -= learning_rate * w1.grad.data
    w2.data -= learning_rate * w2.grad.data

    w1.grad.data.zero_()
    w2.grad.data.zero_()


t = [ [math.pi] ]
print( str(t) +" -> "+ str( (Variable(torch.cuda.FloatTensor( t ))).mm(w1).clamp(min=0).mm(w2).data ) )
t = [ [math.pi/2] ]
print( str(t) +" -> "+ str( (Variable(torch.cuda.FloatTensor( t ))).mm(w1).clamp(min=0).mm(w2).data ) )

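How can the network be made to approximate the given function (sine in this case), by inserting the "bias" neurons or other missing details?

Also: I have trouble understanding why R inserts the "bias". I found information suggesting that the bias may be akin to the "intercept in a regression model", but it is still unclear to me. Any information would be appreciated. Edit: a good explanation turned out to be in the Stack Overflow question "Role of Bias in Neural Networks".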
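
One way to see the intercept analogy is the following minimal sketch in plain Python. The weights and biases are made-up values chosen only for illustration, and the names net_no_bias and net_with_bias are hypothetical. It suggests that a one-input network of the form relu(x*w1)*w2 without bias, fed non-negative inputs, collapses to a straight line through the origin, so it cannot follow a sine; adding biases removes that constraint.

```python
def relu(v):
    return max(0.0, v)

# Hypothetical weights for a 1 -> 4 -> 1 network (illustration only).
w1 = [0.7, -1.2, 0.3, 2.0]
w2 = [1.5, 0.4, -2.0, 0.1]

def net_no_bias(x):
    # Same structure as x.mm(w1).clamp(min=0).mm(w2) for a scalar input.
    return sum(relu(x * a) * b for a, b in zip(w1, w2))

# For x >= 0, relu(x * a) == x * relu(a), so the output is slope * x.
slope = net_no_bias(1.0)
for x in [0.0, 0.5, 2.0, 7.5]:
    print(x, net_no_bias(x), slope * x)

# Hypothetical biases: one per hidden neuron plus one for the output.
b1 = [0.5, -0.5, 1.0, 0.0]
b2 = 0.25

def net_with_bias(x):
    return sum(relu(x * a + c) * b for a, c, b in zip(w1, b1, w2)) + b2

# With biases the output at x = 0 is no longer forced to be 0,
# and the kinks of the ReLUs can sit anywhere on the x axis.
print(net_with_bias(0.0))
```

In regression terms, the bias plays the role of the intercept: without it, every hidden unit is forced to pass through the origin.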

Edit:

An example that gets the result, though using the "fuller" framework ("not reinventing the wheel"), is below:

import torch
from torch.autograd import Variable
import torch.nn.functional as F

import math

N, D_in, H, D_out = 1000, 1, 4, 1

l_x = []
l_y = []

for a in range(1000):
    t = (a/1000.0)*10
    l_x.append( [t] )
    l_y.append( [math.sin(t)] )

x = Variable( torch.FloatTensor(l_x) )
y = Variable( torch.FloatTensor(l_y) )


class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.to_hidden = torch.nn.Linear(n_feature, n_hidden)
        self.to_output = torch.nn.Linear(n_hidden,  n_output)

    def forward(self, x):
        x = self.to_hidden(x)
        x = F.tanh(x)           # activation function
        x = self.to_output(x)
        return x


net = Net(n_feature = D_in, n_hidden = H, n_output = D_out)

learning_rate =  0.01 
optimizer = torch.optim.Adam( net.parameters() , lr=learning_rate )

for t in range(1000):
    y_pred = net(x) 

    loss = (y_pred - y).pow(2).sum()
    if t<10 or t%100==1: print(t, loss.item())

    loss.backward()
    optimizer.step()
    optimizer.zero_grad()


t = [ [math.pi] ]
print( str(t) +" -> "+ str( net( Variable(torch.FloatTensor( t )) ) ) )
t = [ [math.pi/2] ]
print( str(t) +" -> "+ str( net( Variable(torch.FloatTensor( t )) ) ) )

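Unfortunately, while this code works properly, it does not solve the problem of making the original, more "low-level" code work as expected (e.g. by introducing bias).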
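
To relate the framework version to the low-level one, here is a minimal check, assuming a recent PyTorch release. It suggests that torch.nn.Linear, with its default bias=True, computes the input times the transposed weight matrix plus a bias vector, which is the term missing from the hand-written x.mm(w1) code. The variable names layer and manual are introduced here for illustration.

```python
import torch

torch.manual_seed(0)

# bias=True is the default, so this layer owns a weight matrix and a bias vector.
layer = torch.nn.Linear(1, 4)

x = torch.tensor([[0.5], [2.0]])

# Hand-written equivalent of the layer: x @ W^T + b.
manual = x.mm(layer.weight.t()) + layer.bias

print(torch.allclose(layer(x), manual))
print(layer.weight.shape, layer.bias.shape)
```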

Answer 1

Following up on @jdhao's comment: here is a super simple PyTorch model that computes exactly what you want:

import torch
import torch.nn as nn

class LinearWithInputBias(nn.Linear):
    def __init__(self, in_features, out_features, out_bias=True, in_bias=True):
        nn.Linear.__init__(self, in_features, out_features, out_bias)
        if in_bias:
            in_bias = torch.zeros(1, out_features)
            # in_bias.normal_()  # if you want it to be randomly initialized
            self._out_bias = nn.Parameter(in_bias)

    def forward(self, x):
        out = nn.Linear.forward(self, x)
        try:
            out = out + self._out_bias
        except AttributeError:
            pass
        return out

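However, there is another bug in your code: from what I can see, you never train it, i.e. you never call an optimizer (such as torch.optim.SGD(mod.parameters())), and you discard the gradient information by calling grad.data.zero_().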
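
As a minimal sketch of how the low-level version might be trained with explicit biases and an optimizer (this is not the original poster's or the answerer's code): the names b1, b2 and forward are introduced here for illustration, tanh replaces the clamp(min=0) activation, and mean squared error with Adam at lr=0.01 is an assumption borrowed from the framework version in the question. It runs on CPU and uses plain tensors with requires_grad=True instead of Variable.

```python
import math
import torch

torch.manual_seed(0)

D_in, H, D_out = 1, 4, 1

# Training data: 1000 evenly spaced points on [0, 10] and their sines.
x = torch.linspace(0.0, 10.0, 1000).unsqueeze(1)  # shape (1000, 1)
y = torch.sin(x)

# Weights plus explicit bias vectors (the "bias neurons" of the R diagram).
w1 = torch.randn(D_in, H, requires_grad=True)
b1 = torch.zeros(H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)
b2 = torch.zeros(D_out, requires_grad=True)

def forward(inp):
    hidden = torch.tanh(inp.mm(w1) + b1)  # hidden layer with bias
    return hidden.mm(w2) + b2             # linear output with bias

optimizer = torch.optim.Adam([w1, b1, w2, b2], lr=0.01)

losses = []
for step in range(2000):
    loss = (forward(x) - y).pow(2).mean()
    optimizer.zero_grad()   # clear old gradients before computing new ones
    loss.backward()
    optimizer.step()        # apply the update using the fresh gradients
    losses.append(loss.item())

print("first loss:", losses[0], "last loss:", losses[-1])
with torch.no_grad():
    print("prediction at pi/2:", forward(torch.tensor([[math.pi / 2]])))
```

How closely the result tracks the sine depends on the random initialization and the number of steps; with only four hidden units the fit may remain rough on parts of the interval.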
