Why does the learning rate need to be so low?


I'm new to ML, so I'm trying to implement gradient descent for multivariate linear regression. The target function is simple, just z = x + y. But for some reason the learning rate needs to be 1e-16 for it to work. Any ideas? Here is the code:

import numpy as np

# Training data for z = x + y; note the feature values span 1 to 1e8.
cases_x = np.array([[1, 1], [2, 3], [5, 6], [1, 5], [2, 2], [100, 100],
                    [101, 102], [500, 500], [1000, 1000],
                    [100000000, 100000000]])
cases_y = np.array([2, 5, 11, 6, 4, 200, 203, 1000, 2000, 200000000])

cur_w = np.array([0.0, 0.0])
cur_b = 0.0

def f(x, w, b):
    # Linear model: w . x + b
    return np.dot(w, x) + b

def J(w, b):
    # Mean squared error cost: 1/(2m) * sum of squared residuals
    sum_ = 0
    for i in range(10):
        sum_ += pow(f(cases_x[i], w, b) - cases_y[i], 2)
    sum_ /= 20
    return sum_

def derivative_w(w, b):
    # Partial derivatives of J with respect to each weight
    d = np.array([])
    for i in range(2):
        cur_d = 0
        for j in range(10):
            cur_d += (f(cases_x[j], w, b) - cases_y[j]) * cases_x[j][i]
        cur_d /= 10
        d = np.append(d, cur_d)
    return d

def derivative_b(w, b):
    # Partial derivative of J with respect to the bias
    tmp_b = 0
    for i in range(10):
        tmp_b += f(cases_x[i], w, b) - cases_y[i]
    tmp_b /= 10
    return tmp_b

# Gradient descent; anything larger than lr = 1e-16 blows up
for i in range(100):
    tmp_w = derivative_w(cur_w, cur_b)
    tmp_b = derivative_b(cur_w, cur_b)
    cur_b = cur_b - (1e-16 * tmp_b)
    cur_w = cur_w - (1e-16 * tmp_w)

inpt = list(map(int, input().split()))
arr = np.array([inpt[0], inpt[1]])
print(int(f(arr, cur_w, cur_b)))
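
For reference, the very first gradient already shows the scale involved (a quick check using the functions above, not part of the original training loop): the 100000000 sample alone contributes about (2e8 * 1e8) / 10 = 2e15 to each weight derivative.

w0 = np.array([0.0, 0.0])
print(derivative_w(w0, 0.0))   # roughly [-2.0e15, -2.0e15]
print(derivative_b(w0, 0.0))   # roughly -2.0e7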
machine-learning gradient-descent
1 Answer

This is because of the scale of your input data; you need to normalize it before running gradient descent. The weight gradients grow with the square of the feature values, so the 10^8 sample pushes them to about 2×10^15, and on a quadratic cost like this gradient descent only stays stable while the learning rate is below roughly 10^-15 (two divided by the largest curvature of the cost). That is why 1e-16 converges and anything much larger diverges.
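
A minimal sketch of that fix, assuming z-score standardization of the features; the vectorized gradient and the predict helper are illustrative choices, not from the original post. After scaling, an ordinary learning rate such as 0.1 converges:

import numpy as np

cases_x = np.array([[1, 1], [2, 3], [5, 6], [1, 5], [2, 2], [100, 100],
                    [101, 102], [500, 500], [1000, 1000],
                    [100000000, 100000000]], dtype=float)
cases_y = np.array([2, 5, 11, 6, 4, 200, 203, 1000, 2000, 200000000],
                   dtype=float)

# Standardize each feature to zero mean and unit variance.
x_mean = cases_x.mean(axis=0)
x_std = cases_x.std(axis=0)
X = (cases_x - x_mean) / x_std

w = np.zeros(2)
b = 0.0
lr = 0.1  # an ordinary learning rate now works

for _ in range(1000):
    err = X @ w + b - cases_y               # residuals
    w -= lr * (X.T @ err) / len(cases_y)    # same gradient as the loops above, vectorized
    b -= lr * err.mean()

# New inputs must be scaled with the same statistics before predicting.
def predict(x):
    return np.dot(w, (x - x_mean) / x_std) + b

print(predict(np.array([3.0, 4.0])))  # close to 7

The weights and bias absorb the scaling, so predictions still come out in the original units; the only extra bookkeeping is applying the stored mean and standard deviation to every new input.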
