Robust Linear Models

In [1]:
from __future__ import print_function
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
from statsmodels.sandbox.regression.predstd import wls_prediction_std

Estimation

Load data:

In [2]:
data = sm.datasets.stackloss.load()
data.exog = sm.add_constant(data.exog)

Huber's T norm with the (default) median absolute deviation scaling

In [3]:
huber_t = sm.RLM(data.endog, data.exog, M=sm.robust.norms.HuberT())
hub_results = huber_t.fit()
print(hub_results.params)
print(hub_results.bse)
print(hub_results.summary(yname='y',
            xname=['var_%d' % i for i in range(len(hub_results.params))]))
[-41.0265   0.8294   0.9261  -0.1278]
[ 9.7919  0.111   0.3029  0.1286]
                    Robust linear Model Regression Results
==============================================================================
Dep. Variable:                      y   No. Observations:                   21
Model:                            RLM   Df Residuals:                       17
Method:                          IRLS   Df Model:                            3
Norm:                          HuberT
Scale Est.:                       mad
Cov Type:                          H1
Date:                Tue, 02 Dec 2014
Time:                        12:54:10
No. Iterations:                    19
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
var_0        -41.0265      9.792     -4.190      0.000       -60.218   -21.835
var_1          0.8294      0.111      7.472      0.000         0.612     1.047
var_2          0.9261      0.303      3.057      0.002         0.332     1.520
var_3         -0.1278      0.129     -0.994      0.320        -0.380     0.124
==============================================================================

Note that if the model instance has since been used for another fit with different fit
parameters, the fit options reported in the summary may no longer match these results.
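
One way to avoid that ambiguity, sketched here only as an illustration (not part of the original notebook), is to create a fresh model instance per fit configuration, so each results object corresponds to exactly one set of fit options:

# one model instance per configuration; "H1" is the default covariance type
huber_fresh = sm.RLM(data.endog, data.exog, M=sm.robust.norms.HuberT())
hub_results_h1 = huber_fresh.fit(cov="H1")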

Huber's T norm with 'H2' covariance matrix

In [4]:
hub_results2 = huber_t.fit(cov="H2")
print(hub_results2.params)
print(hub_results2.bse)
[-41.0265   0.8294   0.9261  -0.1278]
[ 9.0895  0.1195  0.3224  0.118 ]
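
As the output shows, the covariance option only changes the standard errors; the coefficient estimates are identical to the earlier fit. An illustrative check (not in the original notebook):

print(np.allclose(hub_results.params, hub_results2.params))   # True: same coefficients
print(hub_results.bse - hub_results2.bse)                     # only the standard errors differ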

Andrew's Wave norm with Huber's Proposal 2 scaling and 'H3' covariance matrix

In [5]:
andrew_mod = sm.RLM(data.endog, data.exog, M=sm.robust.norms.AndrewWave())
andrew_results = andrew_mod.fit(scale_est=sm.robust.scale.HuberScale(), cov="H3")
print('Parameters: ', andrew_results.params)
Parameters:  [-40.8818   0.7928   1.0486  -0.1336]

See help(sm.RLM.fit) for more fit options and the sm.robust.scale module for the available scale estimators.
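
For example, the same model can be fit with a different norm; TukeyBiweight is another norm available in sm.robust.norms (illustrative sketch, not part of the original notebook):

tukey_mod = sm.RLM(data.endog, data.exog, M=sm.robust.norms.TukeyBiweight())
tukey_results = tukey_mod.fit(scale_est=sm.robust.scale.HuberScale())
print(tukey_results.params)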

Comparing OLS and RLM

Artificial data with outliers:

In [6]:
nsample = 50
x1 = np.linspace(0, 20, nsample)
X = np.column_stack((x1, (x1-5)**2))
X = sm.add_constant(X)
sig = 0.3   # smaller error variance makes OLS<->RLM contrast bigger
beta = [5, 0.5, -0.0]
y_true2 = np.dot(X, beta)
y2 = y_true2 + sig*1. * np.random.normal(size=nsample)
y2[[39,41,43,45,48]] -= 5   # add some outliers (10% of nsample)

Example 1: quadratic function with linear truth

Note that in the OLS regression the quadratic term partly absorbs the effect of the outliers, even though the true quadratic coefficient is zero.

In [7]:
res = sm.OLS(y2, X).fit()
print(res.params)
print(res.bse)
print(res.predict())
[ 5.1406  0.52   -0.0144]
[ 0.4601  0.071   0.0063]
[  4.7816   5.05     5.3137   5.5727   5.8268   6.0761   6.3207   6.5604
   6.7954   7.0256   7.2511   7.4717   7.6875   7.8986   8.1049   8.3064
   8.5031   8.695    8.8821   9.0645   9.2421   9.4149   9.5829   9.7461
   9.9045  10.0582  10.207   10.3511  10.4904  10.6249  10.7546  10.8796
  10.9997  11.1151  11.2257  11.3315  11.4325  11.5287  11.6202  11.7068
  11.7887  11.8658  11.9381  12.0056  12.0684  12.1263  12.1795  12.2279
  12.2715  12.3103]

Estimate RLM:

In [8]:
resrlm = sm.RLM(y2, X).fit()
print(resrlm.params)
print(resrlm.bse)
[ 5.1141  0.5015 -0.0036]
[ 0.1318  0.0204  0.0018]
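
As a quick, illustrative check (not part of the original notebook), the final IRLS weights show how strongly each observation was downweighted; the injected outliers at indices 39, 41, 43, 45 and 48 should receive the smallest weights.

print(np.argsort(resrlm.weights)[:5])   # indices of the most heavily downweighted points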

Draw a plot to compare OLS estimates to the robust estimates:

In [9]:
fig = plt.figure(figsize=(12,8))
ax = fig.add_subplot(111)
ax.plot(x1, y2, 'o', label="data")
ax.plot(x1, y_true2, 'b-', label="True")
prstd, iv_l, iv_u = wls_prediction_std(res)
ax.plot(x1, res.fittedvalues, 'r-', label="OLS")
ax.plot(x1, iv_u, 'r--')
ax.plot(x1, iv_l, 'r--')
ax.plot(x1, resrlm.fittedvalues, 'g.-', label="RLM")
ax.legend(loc="best")
Out[9]:
<matplotlib.legend.Legend at 0x2b147bfb5f10>

Example 2: linear function with linear truth

Fit a new OLS model using only the linear term and the constant:

In [10]:
X2 = X[:,[0,1]]
res2 = sm.OLS(y2, X2).fit()
print(res2.params)
print(res2.bse)
[ 5.7194  0.3764]
[ 0.4006  0.0345]

Estimate RLM:

In [11]:
resrlm2 = sm.RLM(y2, X2).fit()
print(resrlm2.params)
print(resrlm2.bse)
[ 5.2347  0.4696]
[ 0.1041  0.009 ]
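
Since the data were generated with a true slope of 0.5, a direct comparison (illustrative only, not in the original notebook) shows the robust estimate staying much closer to it than the OLS estimate, which is pulled down by the outliers:

print("true slope:", beta[1])
print("OLS slope: ", res2.params[1])
print("RLM slope: ", resrlm2.params[1])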

Draw a plot to compare OLS estimates to the robust estimates:

In [12]:
prstd, iv_l, iv_u = wls_prediction_std(res2)

fig, ax = plt.subplots(figsize=(8,6))
ax.plot(x1, y2, 'o', label="data")
ax.plot(x1, y_true2, 'b-', label="True")
ax.plot(x1, res2.fittedvalues, 'r-', label="OLS")
ax.plot(x1, iv_u, 'r--')
ax.plot(x1, iv_l, 'r--')
ax.plot(x1, resrlm2.fittedvalues, 'g.-', label="RLM")
legend = ax.legend(loc="best")