How to use linear regression to predict the total number of victims per month

Question

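I have a large dataset, shown in the image below, which also contains "Month" and "Year" columns. I am trying to use a linear regression model to predict the total number of victims per month, but I don't know how to obtain the total number of victims.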

from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(df_pre[[]],df_pre["Year"]) #don't know how to fit the data in here.

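Thanks for the help!

I tried fitting the victim's age against the month, but I got the wrong answer. I also tried creating a new DataFrame containing only the month and the total victims, but then the inputs to fit have different sizes.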

Answer 1

The idea behind fitting data to a model is:

reg.fit([all_inputs], [outputs])

In machine learning terms:
reg.fit([features], [target])
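The "different sizes" problem mentioned in the question usually comes down to the shapes passed to fit. As a minimal sketch (the arrays here are made-up illustration values, not the asker's data): the features should be 2-D with one row per sample, the target should be 1-D with one value per sample, and both must have the same number of rows.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Features: 2-D, one row per sample and one column per feature.
X = np.array([[1], [2], [3], [4]])   # shape (4, 1)
# Target: 1-D, one value per sample.
y = np.array([2, 4, 6, 8])           # shape (4,)

# The number of rows in X must equal the length of y.
reg = LinearRegression().fit(X, y)

# Passing a target with a different length raises a ValueError.
try:
    LinearRegression().fit(X, y[:3])  # 4 samples vs 3 targets
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(reg.predict(np.array([[5]])))
print(mismatch_error)
```

If fit complains about inconsistent numbers of samples, comparing X.shape[0] with len(y) is usually the quickest check.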

Since I can't preview your dataset properly, here is a simple example of how to use LinearRegression to fit data and make predictions.

Suppose we have a small dataset with x_1, x_2, and y, where x_1 and x_2 are the features (the model's inputs) and y is the target (what we want to predict).

Our dataset:

x_1 = [1, 1, 2, 2]
x_2 = [1, 2, 2, 3]
y = [6, 8, 9, 11]
data = [[1, 1, 6], [1, 2, 8], [2, 2, 9], [2, 3, 11]]
Each nested list is one row, so data has 4 rows and 3 columns.

Full code

# Import the packages and libraries
import numpy as np
from sklearn.linear_model import LinearRegression
import pandas as pd


# Convert our data into DataFrame
data = [[1, 1, 6], [1, 2, 8], [2, 2, 9], [2, 3, 11]] 
columns = ["x_1", "x_2", "y"] # columns of the dataframe
df = pd.DataFrame(data, columns=columns) # This will turn the data into a table like your data.

# Split the data to features and label
X_train = df.copy()

y_train = X_train["y"] # This is the target/ label/ output

del X_train["y"] # delete the label from the copied dataframe, so we are left with the features only.

# To answer your question of how to fit and predict with LinearRegression
model = LinearRegression() # Instantiate the class

model.fit(X_train, y_train) # Fit the input (features i.e X_train "x_1, x_2") and the output (target "y") to the model.

# Use the fitted model to predict y for a new input with x_1 = 3 and x_2 = 5.
# (Passing a plain NumPy array to a model fitted on a DataFrame works, but scikit-learn may warn about missing feature names.)
result = model.predict(np.array([[3, 5]]))

# result should be approximately [16.]
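To see what the model actually learned, the fitted parameters can be inspected directly. This is a minimal sketch that reuses the same four rows. It also passes the new input as a DataFrame with the training column names, which is one way to avoid the feature-name warning.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame(
    [[1, 1, 6], [1, 2, 8], [2, 2, 9], [2, 3, 11]],
    columns=["x_1", "x_2", "y"],
)
X_train = df[["x_1", "x_2"]]
y_train = df["y"]

model = LinearRegression().fit(X_train, y_train)

# One learned weight per feature, plus the intercept (bias) term.
weights = dict(zip(X_train.columns, model.coef_))
bias = model.intercept_
print(weights, bias)

# Predict with a DataFrame that has the same column names as the training data.
new_row = pd.DataFrame([[3, 5]], columns=["x_1", "x_2"])
prediction = model.predict(new_row)
print(prediction)
```

For this dataset the weights come out at roughly 1 for x_1 and 2 for x_2, with an intercept of roughly 3.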

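Note that the data was generated from this simple linear equation:

y = (1 * x_1) + (2 * x_2) + 3

If you substitute x_1 = 3 and x_2 = 5 into the equation, you get y = 16, which matches the prediction, so our model is working correctly.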
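Applied to the original question, the missing step is probably aggregating the raw rows into one total per month before fitting. The sketch below rests on assumptions, since the dataset itself is not visible: it assumes df_pre has one row per victim and columns named "Year" and "Month", and it uses made-up rows so that it runs. The column name total_victims is introduced here only for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for df_pre: one row per victim, with "Year" and "Month" columns.
# The values are made up purely so the sketch runs.
df_pre = pd.DataFrame({
    "Year":  [2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021],
    "Month": [1, 1, 2, 2, 2, 3, 1, 1, 2],
})

# Count rows per (Year, Month) to get the monthly total of victims.
monthly = (
    df_pre.groupby(["Year", "Month"])
    .size()
    .reset_index(name="total_victims")
)

# Features and target come from the same aggregated table, so their lengths match.
X = monthly[["Year", "Month"]]
y = monthly["total_victims"]

reg = LinearRegression().fit(X, y)

# Predict for a month of interest, using the same column names as in training.
query = pd.DataFrame({"Year": [2021], "Month": [3]})
predicted_total = reg.predict(query)[0]
print(monthly)
print(predicted_total)
```

If each row already carries a victim count rather than representing a single victim, summing that column per group instead of calling size() would be the analogous step. Taking both X and y from the aggregated table is what avoids the size mismatch described in the question.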
