I have a specific dataset with 3 different functions and 3 different variables:
x = 1
strLen_x = int(3**x + 0.25*(-3*(-1)**x + 3**x + 6)) # Function 1
y = 2
strLen_y = 2*y # Function 2
z = 3
strLen_z = 3*(z + 1 + (z//4)) # Function 3
# Expecting Function 4 in terms of 1, 2, and 3
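To make the three building blocks easier to experiment with, they can be wrapped as plain functions. This is only a sketch: the names `strlen_x`, `strlen_y`, and `strlen_z` are made up for illustration, and the bodies are copied from the expressions above.

```python
def strlen_x(x: int) -> int:
    """Function 1, copied from the snippet above."""
    return int(3**x + 0.25 * (-3 * (-1)**x + 3**x + 6))


def strlen_y(y: int) -> int:
    """Function 2, copied from the snippet above."""
    return 2 * y


def strlen_z(z: int) -> int:
    """Function 3, copied from the snippet above."""
    return 3 * (z + 1 + (z // 4))


# Same inputs as in the snippet above: x = 1, y = 2, z = 3
print(strlen_x(1), strlen_y(2), strlen_z(3))  # 6 4 12
```

Having them as functions makes it straightforward to tabulate each component next to the observed output when looking for the combined formula.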
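The final output, i.e. the strLen_... variable, is simply a variable that depends on the 3 different variables x, y, and z. I am trying to come up with a general equation in which strLen_xyz is expressed in terms of all 3 variables (x, y, z).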
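Here is a table of outputs for different values of x, y, z.
Note: y > 1, and all variables are positive integers. Here, Output is the final value to be obtained.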
This code performs a multivariate linear regression on your data. Everything works except the Seaborn plot of the fitted line:
# see https://stackoverflow.com/questions/75650950/generate-equation-for-one-variable-in-terms-of-others
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn import linear_model
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def mse(y_test, predictions):
    """Mean squared error between observed and predicted values."""
    return np.mean((y_test - predictions) ** 2)


if __name__ == "__main__":
    df = pd.read_csv('resources/strlen.csv')
    X = df[['x', 'y', 'z']].to_numpy()
    y = df[['len']].to_numpy()

    # Scaled copies, each with its own scaler so one fit does not overwrite the other.
    # Note: they are not used below; the model is fit on the raw values.
    X_scaled = StandardScaler().fit_transform(X)
    y_scaled = StandardScaler().fit_transform(y)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)
    reg = linear_model.LinearRegression()
    reg.fit(X_train, y_train)
    predictions = reg.predict(X_test)
    err = mse(y_test, predictions)
    print("Error : ", err)
    print("Intercept : ", reg.intercept_)
    print("Coefficients: ", reg.coef_)
    y_pred_line = reg.predict(X)

    cmap = plt.get_cmap('viridis')
    fig = plt.figure(figsize=(8, 6))
    sns.scatterplot(data=df, x='x', y='len', hue='y', style='z')
    sns.lineplot(data=df, x='x', y='len')

    # Evaluate the fitted model at 11 points between the column-wise minima and maxima of x, y, z
    x_line = np.linspace(X.min(axis=0), X.max(axis=0), num=11)
    y_line = reg.intercept_[0] + x_line @ reg.coef_.T
    y_plot = y_line.flatten()
    print('x_line: ', x_line, ' shape: ', x_line.shape)
    print('y_line: ', y_plot, ' shape: ', y_plot.shape)
    # m1 = plt.scatter(x_line[:, 0], y_plot, color=cmap(0.9), s=10)
    plt.show()
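One possible way to get a line for the plot, offered only as a sketch and not as the fix the answer intended: because the model has three inputs, a line over x is only defined once y and z are pinned to some value. The helper below (`line_over_x` is a hypothetical name) holds y and z at their medians, which is an arbitrary choice, and predicts along x. The demo data is made up and exactly linear; it is not the question's table.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def line_over_x(reg, X, num=11):
    """Predict along x while holding y and z at their median values.

    X is an (n, 3) array with columns x, y, z, the same layout as in the code above.
    """
    x_grid = np.linspace(X[:, 0].min(), X[:, 0].max(), num=num)
    grid = np.column_stack([
        x_grid,
        np.full(num, np.median(X[:, 1])),  # y held fixed (assumption: median)
        np.full(num, np.median(X[:, 2])),  # z held fixed (assumption: median)
    ])
    return x_grid, reg.predict(grid).ravel()


# Made-up, exactly linear data purely for illustration
rng = np.random.default_rng(0)
X_demo = rng.integers(1, 6, size=(30, 3)).astype(float)
y_demo = (2.0 * X_demo[:, 0] + 3.0 * X_demo[:, 1] - X_demo[:, 2] + 1.0).reshape(-1, 1)

reg_demo = LinearRegression().fit(X_demo, y_demo)
x_grid, y_grid = line_over_x(reg_demo, X_demo)
print(x_grid.shape, y_grid.shape)
```

With the real data, the returned arrays could be drawn over the scatter plot with `plt.plot(x_grid, y_grid)` before calling `plt.show()`.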