Prediction (out of sample)
In [1]:
from __future__ import print_function
import numpy as np
import statsmodels.api as sm
Artificial data
In [2]:
nsample = 50
sig = 0.25
x1 = np.linspace(0, 20, nsample)
X = np.column_stack((x1, np.sin(x1), (x1 - 5)**2))
X = sm.add_constant(X)
beta = [5., 0.5, 0.5, -0.02]
y_true = np.dot(X, beta)
y = y_true + sig * np.random.normal(size=nsample)
Estimation
In [3]:
olsmod = sm.OLS(y, X)
olsres = olsmod.fit()
print(olsres.summary())
In-sample prediction
In [4]:
ypred = olsres.predict(X)
print(ypred)
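For a linear model, in-sample prediction amounts to multiplying the design matrix by the estimated coefficient vector. The sketch below illustrates that idea with plain NumPy: `np.linalg.lstsq` stands in for the OLS fit, and the seed and the names `beta_hat`, `ypred_manual`, and `residuals` are assumptions made for this illustration, not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed, only for reproducibility

nsample = 50
x1 = np.linspace(0, 20, nsample)
# Same column layout as X above: constant, x1, sin(x1), (x1 - 5)**2
X = np.column_stack((np.ones(nsample), x1, np.sin(x1), (x1 - 5) ** 2))
beta = np.array([5.0, 0.5, 0.5, -0.02])
y = X @ beta + 0.25 * rng.normal(size=nsample)

# Least-squares coefficients (plain NumPy stand-in for the OLS fit)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# In-sample prediction: fitted values at the observed rows of X
ypred_manual = X @ beta_hat
residuals = y - ypred_manual

print(beta_hat)
# Least-squares residuals are orthogonal to every column of X,
# so this maximum should be numerically close to zero.
print(np.abs(X.T @ residuals).max())
```

With the fitted statsmodels result, the same quantity should correspond to what `olsres.predict(X)` returns for the estimation sample.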
Create a new sample of explanatory variables Xnew, predict and plot
In [5]:
x1n = np.linspace(20.5, 25, 10)
Xnew = np.column_stack((x1n, np.sin(x1n), (x1n - 5)**2))
Xnew = sm.add_constant(Xnew)
ynewpred = olsres.predict(Xnew)  # predict out of sample
print(ynewpred)
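Out-of-sample prediction only works if the new design matrix has exactly the same columns, in the same order, as the one used for estimation. One way to guard against a mismatch is to build both matrices with a single function. The following is a minimal NumPy-only sketch of that pattern; `design_matrix` is a hypothetical helper, and `np.linalg.lstsq` is used here in place of the OLS fit.

```python
import numpy as np

def design_matrix(x):
    """Hypothetical helper: columns are constant, x, sin(x), (x - 5)**2."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.column_stack((np.ones_like(x), x, np.sin(x), (x - 5) ** 2))

rng = np.random.default_rng(0)  # assumed seed, only for reproducibility
x1 = np.linspace(0, 20, 50)
beta = np.array([5.0, 0.5, 0.5, -0.02])
y = design_matrix(x1) @ beta + 0.25 * rng.normal(size=x1.size)

# Estimate on the original sample
beta_hat, *_ = np.linalg.lstsq(design_matrix(x1), y, rcond=None)

# Predict on new points beyond the observed range of x1
x1n = np.linspace(20.5, 25, 10)
ynewpred_manual = design_matrix(x1n) @ beta_hat

# The helper also handles a single new point, including its constant column
single = design_matrix(22.0) @ beta_hat
print(ynewpred_manual)
print(single)
```

Because the constant column is written explicitly, the matrix for a single new observation still has four columns.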
Plot comparison
In [6]:
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(x1, y, 'o', label="Data")
ax.plot(x1, y_true, 'b-', label="True")
ax.plot(np.hstack((x1, x1n)), np.hstack((ypred, ynewpred)), 'r', label="OLS prediction")
ax.legend(loc="best");
Predicting with Formulas
Using formulas can make both estimation and prediction a lot easier.
In [7]:
from statsmodels.formula.api import ols

data = {"x1": x1, "y": y}

res = ols("y ~ x1 + np.sin(x1) + I((x1-5)**2)", data=data).fit()
We use I() to indicate use of the identity transform, i.e., we don't want any formula expansion magic from using **2. The expression inside I() is evaluated as ordinary arithmetic.
In [8]:
res.params
Out[8]:
Now we only have to pass the single variable x1, and the transformed right-hand-side variables are computed automatically.
In [9]:
res.predict(exog=dict(x1=x1n))
Out[9]:
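The convenience of the formula interface comes from storing the transformations together with the model, so the same expansion is applied to the raw variable at both fit and predict time. The block below is a conceptual analogy in plain NumPy, not how statsmodels or its formula machinery is implemented; `TERMS` and `build_exog` are hypothetical names introduced for this sketch.

```python
import numpy as np

# Hypothetical stand-in for a formula: each right-hand-side term is a
# function of the raw variables, applied identically at fit and predict time.
TERMS = [
    ("Intercept", lambda d: np.ones_like(d["x1"], dtype=float)),
    ("x1", lambda d: d["x1"]),
    ("np.sin(x1)", lambda d: np.sin(d["x1"])),
    ("I((x1 - 5) ** 2)", lambda d: (d["x1"] - 5) ** 2),
]

def build_exog(data):
    """Expand a dict of raw variables into the full design matrix."""
    data = {k: np.asarray(v, dtype=float) for k, v in data.items()}
    return np.column_stack([f(data) for _, f in TERMS])

rng = np.random.default_rng(0)  # assumed seed, only for reproducibility
x1 = np.linspace(0, 20, 50)
true_beta = np.array([5.0, 0.5, 0.5, -0.02])
y = build_exog({"x1": x1}) @ true_beta + 0.25 * rng.normal(size=x1.size)

# Fit on the expanded matrix (NumPy least squares in place of ols(...).fit())
params, *_ = np.linalg.lstsq(build_exog({"x1": x1}), y, rcond=None)

# Predict by passing only the raw variable; the expansion happens inside
x1n = np.linspace(20.5, 25, 10)
pred = build_exog({"x1": x1n}) @ params
print(dict(zip([name for name, _ in TERMS], params.round(3))))
print(pred)
```

The caller supplies only `x1`, which mirrors passing `dict(x1=x1n)` to `res.predict` above.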