Weighted Least Squares
In [1]:
1 2 3 4 5 6 7 8 | <span class = "kn" > from < / span> <span class = "nn" >__future__< / span> <span class = "kn" > import < / span> <span class = "n" >print_function< / span> <span class = "kn" > import < / span> <span class = "nn" >numpy< / span> <span class = "kn" >as< / span> <span class = "nn" >np< / span> <span class = "kn" > from < / span> <span class = "nn" >scipy< / span> <span class = "kn" > import < / span> <span class = "n" >stats< / span> <span class = "kn" > import < / span> <span class = "nn" >statsmodels.api< / span> <span class = "kn" >as< / span> <span class = "nn" >sm< / span> <span class = "kn" > import < / span> <span class = "nn" >matplotlib.pyplot< / span> <span class = "kn" >as< / span> <span class = "nn" >plt< / span> <span class = "kn" > from < / span> <span class = "nn" >statsmodels.sandbox.regression.predstd< / span> <span class = "kn" > import < / span> <span class = "n" >wls_prediction_std< / span> <span class = "kn" > from < / span> <span class = "nn" >statsmodels.iolib.table< / span> <span class = "kn" > import < / span> <span class = "p" >(< / span><span class = "n" >SimpleTable< / span><span class = "p" >,< / span> <span class = "n" >default_txt_fmt< / span><span class = "p" >)< / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >random< / span><span class = "o" >.< / span><span class = "n" >seed< / span><span class = "p" >(< / span><span class = "mi" > 1024 < / span><span class = "p" >)< / span> |
WLS Estimation
Artificial data: Heteroscedasticity 2 groups
Model assumptions:
- Misspecification: true model is quadratic, estimate only linear
- Independent noise/error term
- Two groups for error variance, low and high variance groups
In [2]:
1 2 3 4 5 6 7 8 9 10 11 12 | <span class = "n" >nsample< / span> <span class = "o" > = < / span> <span class = "mi" > 50 < / span> <span class = "n" >x< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >linspace< / span><span class = "p" >(< / span><span class = "mi" > 0 < / span><span class = "p" >,< / span> <span class = "mi" > 20 < / span><span class = "p" >,< / span> <span class = "n" >nsample< / span><span class = "p" >)< / span> <span class = "n" >X< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >column_stack< / span><span class = "p" >((< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "p" >(< / span><span class = "n" >x< / span> <span class = "o" > - < / span> <span class = "mi" > 5 < / span><span class = "p" >)< / span><span class = "o" > * * < / span><span class = "mi" > 2 < / span><span class = "p" >))< / span> <span class = "n" >X< / span> <span class = "o" > = < / span> <span class = "n" >sm< / span><span class = "o" >.< / span><span class = "n" >add_constant< / span><span class = "p" >(< / span><span class = "n" >X< / span><span class = "p" >)< / span> <span class = "n" >beta< / span> <span class = "o" > = < / span> <span class = "p" >[< / span><span class = "mf" > 5. < / span><span class = "p" >,< / span> <span class = "mf" > 0.5 < / span><span class = "p" >,< / span> <span class = "o" > - < / span><span class = "mf" > 0.01 < / span><span class = "p" >]< / span> <span class = "n" >sig< / span> <span class = "o" > = < / span> <span class = "mf" > 0.5 < / span> <span class = "n" >w< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >ones< / span><span class = "p" >(< / span><span class = "n" >nsample< / span><span class = "p" >)< / span> <span class = "n" >w< / span><span class = "p" >[< / span><span class = "n" >nsample< / span> <span class = "o" > * < / span> <span class = "mi" > 6 < / span><span class = "o" > / < / span><span class = "mi" > 10 < / span><span class = "p" >:]< / span> <span class = "o" > = < / span> <span class = "mi" > 3 < / span> <span class = "n" >y_true< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >dot< / span><span class = "p" >(< / span><span class = "n" >X< / span><span class = "p" >,< / span> <span class = "n" >beta< / span><span class = "p" >)< / span> <span class = "n" >e< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >random< / span><span class = "o" >.< / span><span class = "n" >normal< / span><span class = "p" >(< / span><span class = "n" >size< / span><span class = "o" > = < / span><span class = "n" >nsample< / span><span class = "p" >)< / span> <span class = "n" >y< / span> <span class = "o" > = < / span> <span class = "n" >y_true< / span> <span class = "o" > + < / span> <span class = "n" >sig< / span> <span class = "o" > * < / span> <span class = "n" >w< / span> <span class = "o" > * < / span> <span class = "n" >e< / span> <span class = "n" >X< / span> <span class = "o" > = < / span> <span class = "n" >X< / span><span class = "p" >[:,[< / span><span class = "mi" > 0 < / span><span class = "p" >,< / span><span class = "mi" > 1 < / span><span class = "p" >]]< / span> |
WLS knowing the true variance ratio of heteroscedasticity
In [3]:
1 2 3 | <span class = "n" >mod_wls< / span> <span class = "o" > = < / span> <span class = "n" >sm< / span><span class = "o" >.< / span><span class = "n" >WLS< / span><span class = "p" >(< / span><span class = "n" >y< / span><span class = "p" >,< / span> <span class = "n" >X< / span><span class = "p" >,< / span> <span class = "n" >weights< / span><span class = "o" > = < / span><span class = "mf" > 1. < / span><span class = "o" > / < / span><span class = "n" >w< / span><span class = "p" >)< / span> <span class = "n" >res_wls< / span> <span class = "o" > = < / span> <span class = "n" >mod_wls< / span><span class = "o" >.< / span><span class = "n" >fit< / span><span class = "p" >()< / span> <span class = "k" > print < / span><span class = "p" >(< / span><span class = "n" >res_wls< / span><span class = "o" >.< / span><span class = "n" >summary< / span><span class = "p" >())< / span> |
OLS vs. WLS
Estimate an OLS model for comparison:
In [4]:
1 2 3 | <span class = "n" >res_ols< / span> <span class = "o" > = < / span> <span class = "n" >sm< / span><span class = "o" >.< / span><span class = "n" >OLS< / span><span class = "p" >(< / span><span class = "n" >y< / span><span class = "p" >,< / span> <span class = "n" >X< / span><span class = "p" >)< / span><span class = "o" >.< / span><span class = "n" >fit< / span><span class = "p" >()< / span> <span class = "k" > print < / span><span class = "p" >(< / span><span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >params< / span><span class = "p" >)< / span> <span class = "k" > print < / span><span class = "p" >(< / span><span class = "n" >res_wls< / span><span class = "o" >.< / span><span class = "n" >params< / span><span class = "p" >)< / span> |
Compare the WLS standard errors to heteroscedasticity corrected OLS standard errors:
In [5]:
1 2 3 4 5 6 7 | <span class = "n" >se< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >vstack< / span><span class = "p" >([[< / span><span class = "n" >res_wls< / span><span class = "o" >.< / span><span class = "n" >bse< / span><span class = "p" >],< / span> <span class = "p" >[< / span><span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >bse< / span><span class = "p" >],< / span> <span class = "p" >[< / span><span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >HC0_se< / span><span class = "p" >],< / span> <span class = "p" >[< / span><span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >HC1_se< / span><span class = "p" >],< / span> <span class = "p" >[< / span><span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >HC2_se< / span><span class = "p" >],< / span> <span class = "p" >[< / span><span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >HC3_se< / span><span class = "p" >]])< / span> <span class = "n" >se< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" > round < / span><span class = "p" >(< / span><span class = "n" >se< / span><span class = "p" >,< / span><span class = "mi" > 4 < / span><span class = "p" >)< / span> <span class = "n" >colnames< / span> <span class = "o" > = < / span> <span class = "p" >[< / span><span class = "s" > 'x1' < / span><span class = "p" >,< / span> <span class = "s" > 'const' < / span><span class = "p" >]< / span> <span class = "n" >rownames< / span> <span class = "o" > = < / span> <span class = "p" >[< / span><span class = "s" > 'WLS' < / span><span class = "p" >,< / span> <span class = "s" > 'OLS' < / span><span class = "p" >,< / span> <span class = "s" > 'OLS_HC0' < / span><span class = "p" >,< / span> <span class = "s" > 'OLS_HC1' < / span><span class = "p" >,< / span> <span class = "s" > 'OLS_HC3' < / span><span class = "p" >,< / span> <span class = "s" > 'OLS_HC3' < / span><span class = "p" >]< / span> <span class = "n" >tabl< / span> <span class = "o" > = < / span> <span class = "n" >SimpleTable< / span><span class = "p" >(< / span><span class = "n" >se< / span><span class = "p" >,< / span> <span class = "n" >colnames< / span><span class = "p" >,< / span> <span class = "n" >rownames< / span><span class = "p" >,< / span> <span class = "n" >txt_fmt< / span><span class = "o" > = < / span><span class = "n" >default_txt_fmt< / span><span class = "p" >)< / span> <span class = "k" > print < / span><span class = "p" >(< / span><span class = "n" >tabl< / span><span class = "p" >)< / span> |
Calculate OLS prediction interval:
In [6]:
1 2 3 4 | <span class = "n" >covb< / span> <span class = "o" > = < / span> <span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >cov_params< / span><span class = "p" >()< / span> <span class = "n" >prediction_var< / span> <span class = "o" > = < / span> <span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >mse_resid< / span> <span class = "o" > + < / span> <span class = "p" >(< / span><span class = "n" >X< / span> <span class = "o" > * < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >dot< / span><span class = "p" >(< / span><span class = "n" >covb< / span><span class = "p" >,< / span><span class = "n" >X< / span><span class = "o" >.< / span><span class = "n" >T< / span><span class = "p" >)< / span><span class = "o" >.< / span><span class = "n" >T< / span><span class = "p" >)< / span><span class = "o" >.< / span><span class = "n" > sum < / span><span class = "p" >(< / span><span class = "mi" > 1 < / span><span class = "p" >)< / span> <span class = "n" >prediction_std< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >sqrt< / span><span class = "p" >(< / span><span class = "n" >prediction_var< / span><span class = "p" >)< / span> <span class = "n" >tppf< / span> <span class = "o" > = < / span> <span class = "n" >stats< / span><span class = "o" >.< / span><span class = "n" >t< / span><span class = "o" >.< / span><span class = "n" >ppf< / span><span class = "p" >(< / span><span class = "mf" > 0.975 < / span><span class = "p" >,< / span> <span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >df_resid< / span><span class = "p" >)< / span> |
In [7]:
1 | <span class = "n" >prstd_ols< / span><span class = "p" >,< / span> <span class = "n" >iv_l_ols< / span><span class = "p" >,< / span> <span class = "n" >iv_u_ols< / span> <span class = "o" > = < / span> <span class = "n" >wls_prediction_std< / span><span class = "p" >(< / span><span class = "n" >res_ols< / span><span class = "p" >)< / span> |
Draw a plot to compare predicted values in WLS and OLS:
In [8]:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | <span class = "n" >prstd< / span><span class = "p" >,< / span> <span class = "n" >iv_l< / span><span class = "p" >,< / span> <span class = "n" >iv_u< / span> <span class = "o" > = < / span> <span class = "n" >wls_prediction_std< / span><span class = "p" >(< / span><span class = "n" >res_wls< / span><span class = "p" >)< / span> <span class = "n" >fig< / span><span class = "p" >,< / span> <span class = "n" >ax< / span> <span class = "o" > = < / span> <span class = "n" >plt< / span><span class = "o" >.< / span><span class = "n" >subplots< / span><span class = "p" >(< / span><span class = "n" >figsize< / span><span class = "o" > = < / span><span class = "p" >(< / span><span class = "mi" > 8 < / span><span class = "p" >,< / span><span class = "mi" > 6 < / span><span class = "p" >))< / span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >plot< / span><span class = "p" >(< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "n" >y< / span><span class = "p" >,< / span> <span class = "s" > 'o' < / span><span class = "p" >,< / span> <span class = "n" >label< / span><span class = "o" > = < / span><span class = "s" > "Data" < / span><span class = "p" >)< / span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >plot< / span><span class = "p" >(< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "n" >y_true< / span><span class = "p" >,< / span> <span class = "s" > 'b-' < / span><span class = "p" >,< / span> <span class = "n" >label< / span><span class = "o" > = < / span><span class = "s" > "True" < / span><span class = "p" >)< / span> <span class = "c" > # OLS</span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >plot< / span><span class = "p" >(< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >fittedvalues< / span><span class = "p" >,< / span> <span class = "s" > 'r--' < / span><span class = "p" >)< / span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >plot< / span><span class = "p" >(< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "n" >iv_u_ols< / span><span class = "p" >,< / span> <span class = "s" > 'r--' < / span><span class = "p" >,< / span> <span class = "n" >label< / span><span class = "o" > = < / span><span class = "s" > "OLS" < / span><span class = "p" >)< / span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >plot< / span><span class = "p" >(< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "n" >iv_l_ols< / span><span class = "p" >,< / span> <span class = "s" > 'r--' < / span><span class = "p" >)< / span> <span class = "c" > # WLS</span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >plot< / span><span class = "p" >(< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "n" >res_wls< / span><span class = "o" >.< / span><span class = "n" >fittedvalues< / span><span class = "p" >,< / span> <span class = "s" > 'g--.' < / span><span class = "p" >)< / span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >plot< / span><span class = "p" >(< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "n" >iv_u< / span><span class = "p" >,< / span> <span class = "s" > 'g--' < / span><span class = "p" >,< / span> <span class = "n" >label< / span><span class = "o" > = < / span><span class = "s" > "WLS" < / span><span class = "p" >)< / span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >plot< / span><span class = "p" >(< / span><span class = "n" >x< / span><span class = "p" >,< / span> <span class = "n" >iv_l< / span><span class = "p" >,< / span> <span class = "s" > 'g--' < / span><span class = "p" >)< / span> <span class = "n" >ax< / span><span class = "o" >.< / span><span class = "n" >legend< / span><span class = "p" >(< / span><span class = "n" >loc< / span><span class = "o" > = < / span><span class = "s" > "best" < / span><span class = "p" >);< / span> |
Feasible Weighted Least Squares (2-stage FWLS)
In [9]:
1 2 3 4 5 6 7 8 | <span class = "n" >resid1< / span> <span class = "o" > = < / span> <span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >resid< / span><span class = "p" >[< / span><span class = "n" >w< / span><span class = "o" > = = < / span><span class = "mf" > 1. < / span><span class = "p" >]< / span> <span class = "n" >var1< / span> <span class = "o" > = < / span> <span class = "n" >resid1< / span><span class = "o" >.< / span><span class = "n" >var< / span><span class = "p" >(< / span><span class = "n" >ddof< / span><span class = "o" > = < / span><span class = "nb" > int < / span><span class = "p" >(< / span><span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >df_model< / span><span class = "p" >)< / span><span class = "o" > + < / span><span class = "mi" > 1 < / span><span class = "p" >)< / span> <span class = "n" >resid2< / span> <span class = "o" > = < / span> <span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >resid< / span><span class = "p" >[< / span><span class = "n" >w< / span><span class = "o" >! = < / span><span class = "mf" > 1. < / span><span class = "p" >]< / span> <span class = "n" >var2< / span> <span class = "o" > = < / span> <span class = "n" >resid2< / span><span class = "o" >.< / span><span class = "n" >var< / span><span class = "p" >(< / span><span class = "n" >ddof< / span><span class = "o" > = < / span><span class = "nb" > int < / span><span class = "p" >(< / span><span class = "n" >res_ols< / span><span class = "o" >.< / span><span class = "n" >df_model< / span><span class = "p" >)< / span><span class = "o" > + < / span><span class = "mi" > 1 < / span><span class = "p" >)< / span> <span class = "n" >w_est< / span> <span class = "o" > = < / span> <span class = "n" >w< / span><span class = "o" >.< / span><span class = "n" >copy< / span><span class = "p" >()< / span> <span class = "n" >w_est< / span><span class = "p" >[< / span><span class = "n" >w< / span><span class = "o" >! = < / span><span class = "mf" > 1. < / span><span class = "p" >]< / span> <span class = "o" > = < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >sqrt< / span><span class = "p" >(< / span><span class = "n" >var2< / span><span class = "p" >)< / span> <span class = "o" > / < / span> <span class = "n" >np< / span><span class = "o" >.< / span><span class = "n" >sqrt< / span><span class = "p" >(< / span><span class = "n" >var1< / span><span class = "p" >)< / span> <span class = "n" >res_fwls< / span> <span class = "o" > = < / span> <span class = "n" >sm< / span><span class = "o" >.< / span><span class = "n" >WLS< / span><span class = "p" >(< / span><span class = "n" >y< / span><span class = "p" >,< / span> <span class = "n" >X< / span><span class = "p" >,< / span> <span class = "mf" > 1. < / span><span class = "o" > / < / span><span class = "n" >w_est< / span><span class = "p" >)< / span><span class = "o" >.< / span><span class = "n" >fit< / span><span class = "p" >()< / span> <span class = "k" > print < / span><span class = "p" >(< / span><span class = "n" >res_fwls< / span><span class = "o" >.< / span><span class = "n" >summary< / span><span class = "p" >())< / span> |
Please login to continue.