Dual beta in Python - multiple linear regression with a dummy variable in statsmodels

Question (votes: 2, answers: 1)

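I am trying to compute a dual beta in Python using a statsmodels regression. Unfortunately, I get an error message.

The regression equation for the dual beta is given here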

Dual Beta Formula
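
The linked formula image is not reproduced here. As an assumption about what it showed, the dual-beta regression is commonly written as:

```latex
(r_j - r_f)_t = \alpha_j^{+} D + \beta_j^{+} (r_m^{+} - r_f)_t D
              + \alpha_j^{-} (1 - D) + \beta_j^{-} (r_m^{-} - r_f)_t (1 - D) + \varepsilon_t
```

where $D$ is a dummy variable equal to 1 on up-market days and 0 otherwise, so $\beta_j^{+}$ and $\beta_j^{-}$ are the up-market and down-market betas.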

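For now I am ignoring the risk-free rate (rf), but once I add it the implementation should be similar. My code currently looks like this, where my 'spx.xlsx' file simply has two columns of returns, named 'SPXrets' and 'AAPLrets' (plus one column with dates):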

import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile

import statsmodels.api as sm
import statsmodels.formula.api as smf
import numpy as np


df = pd.read_excel('spx.xlsx')
print(df.columns)

mod = smf.ols(formula='AAPLrets ~ SPXrets', data=df)
res = mod.fit()
print(res.summary())

This raises the following error:

PatsyError: intercept term cannot interact with anything else
    AAPLrets ~ SPXrets:c(D) + SPXrets:(1-c(D))

Any help is appreciated - thanks a lot!

python regression linear-regression statsmodels beta
1 answer
0 votes

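Edit:

After my initial suggestion, the OP changed the title and the code snippet provided. My suggestion has been edited accordingly.

New suggestion: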

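I suspect the problem lies in your dataset. I suggest you tell us more about the data source, how you load the data, what it looks like (its structure), and the types of the columns (string, float, etc.).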
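
As a rough sketch of the kind of check meant here, the following builds a small hypothetical frame in which one returns column was read as text (the values and the stray 'n/a' entry are made up for illustration), inspects the column types, and coerces the column to floats before any regression is attempted.

```python
import pandas as pd

# Hypothetical data: 'AAPLrets' arrived as text, as can happen with Excel imports
df_check = pd.DataFrame({
    "SPXrets": [0.010, -0.004, 0.007, -0.012],
    "AAPLrets": ["0.015", "-0.006", "n/a", "-0.020"],
})

print(df_check.dtypes)  # a non-float dtype here signals a problem column
print(df_check.head())

# Coerce to numeric; entries that cannot be parsed become NaN
df_check["AAPLrets"] = pd.to_numeric(df_check["AAPLrets"], errors="coerce")

# Drop incomplete rows so the regression only sees numeric pairs
df_clean = df_check.dropna(subset=["SPXrets", "AAPLrets"])
print(df_clean.dtypes)
```

Posting the output of `df.dtypes` and `df.head()` for the real file would answer most of the questions above.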

What I can tell you right now is that your code snippet runs fine with some sample data like this:

Data:

               CONret  DAXret:c(D)  DAXret:(1-c(D))  AAPLrets  SPXrets  dummy
2017-01-08     109          107              122       101      100      0
2017-01-09     117          108              124       113      147      0
2017-01-10     142          108              130       107      103      1
2017-01-11     106          121              149       103      104      1
2017-01-12     124          149              143       112      126      0

Output:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:               AAPLrets   R-squared:                       0.095
Model:                            OLS   Adj. R-squared:                  0.004
Method:                 Least Squares   F-statistic:                     1.044
Date:                Thu, 14 Feb 2019   Prob (F-statistic):              0.331
Time:                        16:00:01   Log-Likelihood:                -48.388
No. Observations:                  12   AIC:                             100.8
Df Residuals:                      10   BIC:                             101.7
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     84.3198     31.143      2.708      0.022      14.929     153.711
SPXrets        0.2635      0.258      1.022      0.331      -0.311       0.838
==============================================================================
Omnibus:                        5.649   Durbin-Watson:                   1.882
Prob(Omnibus):                  0.059   Jarque-Bera (JB):                2.933
Skew:                           1.202   Prob(JB):                        0.231
Kurtosis:                       3.290   Cond. No.                         872.
==============================================================================

Here is the whole thing:

# imports
import statsmodels.formula.api as smf
import pandas as pd
import numpy as np
import statsmodels.api as sm

# sample data
np.random.seed(1)
rows = 12
listVars= ['CONret','DAXret:c(D)', 'DAXret:(1-c(D))', 'AAPLrets', 'SPXrets']
rng = pd.date_range('1/1/2017', periods=rows, freq='D')
df = pd.DataFrame(np.random.randint(100,150,size=(rows, len(listVars))), columns=listVars) 
df = df.set_index(rng)
df['dummy'] = np.random.randint(2, size=df.shape[0])

mod = smf.ols(formula='AAPLrets ~ SPXrets', data=df)
res = mod.fit()
print(res.summary())
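
On the error quoted in the question: it most likely comes from the term `SPXrets:(1-c(D))`. Inside a formula, `1` denotes the intercept and `-` removes a term, so the expression asks for an interaction with the intercept. Wrapping the arithmetic in `I(...)` makes it ordinary arithmetic. The sketch below assumes a 0/1 column named `D` that marks up-market days (defined here as `SPXrets > 0`, which is an assumption) and uses simulated returns, not real data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated returns with known up/down betas (illustrative values only)
rng = np.random.default_rng(0)
n = 500
spx = rng.normal(0.0, 0.01, n)
up = (spx > 0).astype(int)  # assumed dummy: 1 on up-market days
aapl = 0.001 + 1.5 * spx * up + 0.5 * spx * (1 - up) + rng.normal(0.0, 0.0005, n)
sim = pd.DataFrame({"AAPLrets": aapl, "SPXrets": spx, "D": up})

# I(...) protects the arithmetic so "1 - D" is not read as formula syntax
formula = "AAPLrets ~ D + SPXrets:D + SPXrets:I(1 - D)"
res_dual = smf.ols(formula=formula, data=sim).fit()
print(res_dual.params)
```

Including `D` on its own lets the intercept differ between up and down markets, in the spirit of the two alphas in a dual-beta specification; drop it if a single intercept is wanted.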

Another suggestion:


Personally, I would feel more comfortable without the formula interface.

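The snippet below lets you run a linear regression and choose whether to return the model summary or a dataframe with additional details such as the coefficients, p-values, and R-squared.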

# Imports
import pandas as pd
import numpy as np
import statsmodels.api as sm

# sample data
np.random.seed(1)
rows = 12
listVars= ['CONret','DAXret:c(D)', 'DAXret:(1-c(D))', 'AAPLrets', 'SPXrets']
rng = pd.date_range('1/1/2017', periods=rows, freq='D')
df = pd.DataFrame(np.random.randint(100,150,size=(rows, len(listVars))), columns=listVars) 
df = df.set_index(rng)
df['dummy'] = np.random.randint(2, size=df.shape[0])

def LinReg(df, y, x, const, results):

    betas = x.copy()

    # Model with or without a constant
    if const:
        x = sm.add_constant(df[x])
        model = sm.OLS(df[y], x).fit()
    else:
        model = sm.OLS(df[y], df[x]).fit()

    # Estimates of R2 and p
    res1 = {'Y': [y],
            'R2': [format(model.rsquared, '.4f')],
            'p': [model.pvalues.tolist()],
            'start': [df.index[0]], 
            'stop': [df.index[-1]],
            'obs' : [df.shape[0]],
            'X': [betas]}
    df_res1 = pd.DataFrame(data = res1)

    # Regression Coefficients
    theParams = model.params[0:]
    coefs = theParams.to_frame()
    df_coefs = pd.DataFrame(coefs.T)
    xNames = list(df_coefs)
    xValues = list(df_coefs.loc[0].values)
    xValues2 = [ '%.2f' % elem for elem in xValues ]
    res2 = {'Independent': [xNames],
            'beta': [xValues2]}
    df_res2 = pd.DataFrame(data = res2)

    # All results
    df_res = pd.concat([df_res1, df_res2], axis = 1)
    df_res = df_res.T
    df_res.columns = ['results']


    # Return either the statsmodels summary or the compact results frame
    if results == 'summary':
        return model.summary()
    else:
        return df_res

df_regression = LinReg(df = df, y = 'CONret', x = ['DAXret:c(D)', 'DAXret:(1-c(D))', 'dummy'], const = True, results = 'summary')

print(df_regression)

Test run 1:

df_regression = LinReg(df = df, y = 'CONret', x = ['DAXret:c(D)', 'DAXret:(1-c(D))', 'dummy'], const = True, results = '')

Output 1:

                                                       results
Y                                                       CONret
R2                                                      0.0813
p            [0.13194822614949883, 0.45726622261432304, 0.9...
start                                      2017-01-01 00:00:00
stop                                       2017-01-12 00:00:00
obs                                                         12
X                        [DAXret:c(D), DAXret:(1-c(D)), dummy]
Independent       [const, DAXret:c(D), DAXret:(1-c(D)), dummy]
beta                                [88.94, 0.24, -0.01, 2.20]

Test run 2:

df_regression = LinReg(df = df, y = 'CONret', x = ['DAXret:c(D)', 'DAXret:(1-c(D))', 'dummy'], const = True, results = 'summary')

Output 2:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                 CONret   R-squared:                       0.081
Model:                            OLS   Adj. R-squared:                 -0.263
Method:                 Least Squares   F-statistic:                    0.2361
Date:                Thu, 14 Feb 2019   Prob (F-statistic):              0.869
Time:                        16:04:02   Log-Likelihood:                -47.138
No. Observations:                  12   AIC:                             102.3
Df Residuals:                       8   BIC:                             104.2
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
===================================================================================
                      coef    std err          t      P>|t|      [0.025      0.975]
-----------------------------------------------------------------------------------
const              88.9438     53.019      1.678      0.132     -33.318     211.205
DAXret:c(D)         0.2350      0.301      0.781      0.457      -0.459       0.929
DAXret:(1-c(D))    -0.0060      0.391     -0.015      0.988      -0.908       0.896
dummy               2.2005      8.973      0.245      0.812     -18.490      22.891
==============================================================================
Omnibus:                        1.025   Durbin-Watson:                   2.354
Prob(Omnibus):                  0.599   Jarque-Bera (JB):                0.720
Skew:                           0.540   Prob(JB):                        0.698
Kurtosis:                       2.477   Cond. No.                     2.15e+03
==============================================================================