stats.gof.powerdiscrepancy()

statsmodels.stats.gof.powerdiscrepancy

statsmodels.stats.gof.powerdiscrepancy(observed, expected, lambd=0.0, axis=0, ddof=0)

Calculates power discrepancy, a class of goodness-of-fit tests as a measure of discrepancy between observed and expected data.

This contains several goodness-of-fit tests as special cases; see the description of lambd, the exponent of the power discrepancy. The p-value is based on the asymptotic chi-square distribution of the test statistic.

freeman_tukey: D(x|\theta) = \sum_j (\sqrt{x_j} - \sqrt{e_j})^2
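
The sum in the Freeman-Tukey formula can be checked by hand. The sketch below uses plain Python and the data from the Examples section (observed counts 2, 4, 2, 1, 1 against an expected count of 2 per cell). The helper name freeman_tukey_sum is hypothetical and not part of statsmodels. One observation from the arithmetic, not from the formula line itself: the statistic printed in the lambd='freeman_tukey' example (about 2.745) equals four times this sum.

```python
import math

observed = [2.0, 4.0, 2.0, 1.0, 1.0]
expected = [2.0, 2.0, 2.0, 2.0, 2.0]  # 10 * 0.2 in every cell


def freeman_tukey_sum(obs, exp):
    """Sum of squared differences of square roots (hypothetical helper)."""
    return sum((math.sqrt(x) - math.sqrt(e)) ** 2 for x, e in zip(obs, exp))


ft_sum = freeman_tukey_sum(observed, expected)

# Four times the sum reproduces the value shown in the Examples section.
ft_scaled = 4.0 * ft_sum
```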

Parameters:

observed : Iterable

Observed values

expected : Iterable

Expected values

lambd : float or string

  • float : exponent a for power discrepancy
  • 'loglikeratio': a = 0
  • 'freeman_tukey': a = -0.5
  • 'pearson': a = 1 (standard chi-square test statistic)
  • 'modified_loglikeratio': a = -1
  • 'cressie_read': a = 2/3
  • 'neyman': a = -2 (Neyman-modified chi-square, reference from a book?)
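
To make the role of the exponent concrete, here is a minimal sketch of the Cressie-Read power-divergence statistic in plain Python, with the string aliases from the list above mapped to their exponents. It is an illustration under stated assumptions, not the statsmodels implementation: power_discrepancy_stat and LAMBD_ALIASES are hypothetical names, all counts are assumed positive, and observed and expected are assumed to have the same total.

```python
import math

# Aliases and exponents as listed for the lambd parameter.
LAMBD_ALIASES = {
    "loglikeratio": 0.0,
    "freeman_tukey": -0.5,
    "pearson": 1.0,
    "modified_loglikeratio": -1.0,
    "cressie_read": 2.0 / 3.0,
    "neyman": -2.0,
}


def power_discrepancy_stat(obs, exp, lambd=0.0):
    """Cressie-Read statistic for one series (hypothetical helper).

    Assumes positive counts and sum(obs) == sum(exp).
    """
    a = LAMBD_ALIASES[lambd] if isinstance(lambd, str) else float(lambd)
    pairs = list(zip(obs, exp))
    if a == 0.0:
        # Limit a -> 0: likelihood-ratio statistic.
        return 2.0 * sum(o * math.log(o / e) for o, e in pairs)
    if a == -1.0:
        # Limit a -> -1: modified likelihood-ratio statistic.
        return 2.0 * sum(e * math.log(e / o) for o, e in pairs)
    return 2.0 / (a * (a + 1.0)) * sum(o * ((o / e) ** a - 1.0) for o, e in pairs)


observed = [2.0, 4.0, 2.0, 1.0, 1.0]
expected = [2.0, 2.0, 2.0, 2.0, 2.0]

pearson = power_discrepancy_stat(observed, expected, "pearson")
loglike = power_discrepancy_stat(observed, expected, "loglikeratio")
cressie = power_discrepancy_stat(observed, expected, "cressie_read")
```

With these inputs the three values agree with the statistics printed in the Examples section for lambd=1, lambd=0 and lambd=2/3.0.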

axis : int

Axis along which the observations of one series are stored.

ddof : int

Degrees of freedom correction.

Returns:

D_obs : Discrepancy of observed values

pvalue : p-value of the test, based on the asymptotic chi-square distribution
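
As a rough illustration of how a p-value follows from the statistic, the sketch below evaluates the chi-square upper tail in plain Python. Two assumptions are made here and are not stated on this page: the degrees of freedom are taken as k - 1 - ddof, where k is the number of cells, and the closed form used only covers even degrees of freedom. The name chi2_sf_even_df is hypothetical. In practice scipy.stats.chi2.sf handles any degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability for even df (hypothetical helper).

    For df = 2m: sf(x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)**i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)**i / i! incrementally
        total += term
    return math.exp(-half) * total


k = 5            # number of cells in the example data
ddof = 0
df = k - 1 - ddof  # assumed degrees of freedom

# Pearson statistic of 3.0 from the Examples section.
pvalue = chi2_sf_even_df(3.0, df)
```

Under these assumptions the result agrees with the p-value printed for lambd=1 in the Examples section (about 0.5578).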

References

Cressie, Noel and Timothy R. C. Read, Multinomial Goodness-of-Fit Tests, Journal of the Royal Statistical Society. Series B (Methodological), Vol. 46, No. 3 (1984), pp. 440-464

Campbell B. Read, Freeman-Tukey chi-squared goodness-of-fit statistics, Statistics & Probability Letters 18 (1993), pp. 271-278

Nobuhiro Taneichi, Yuri Sekiya, Akio Suzukawa, Asymptotic Approximations for the Distributions of the Multinomial Goodness-of-Fit Statistics under Local Alternatives, Journal of Multivariate Analysis 81 (2002), pp. 335-359

Steele, M., C. Hurst and J. Chaseling, Simulated Power of Discrete Goodness-of-Fit Tests for Likert Type Data

Examples

>>> import numpy as np
>>> from statsmodels.stats.gof import powerdiscrepancy
>>> observed = np.array([2., 4., 2., 1., 1.])
>>> expected = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

Checking for correct dimensions with multiple series:

>>> powerdiscrepancy(np.column_stack((observed, observed)).T, 10*expected, lambd='freeman_tukey', axis=1)
(array([[ 2.745166,  2.745166]]), array([[ 0.6013346,  0.6013346]]))
>>> powerdiscrepancy(np.column_stack((observed, observed)).T, 10*expected, axis=1)
(array([[ 2.77258872,  2.77258872]]), array([[ 0.59657359,  0.59657359]]))
>>> powerdiscrepancy(np.column_stack((observed, observed)).T, 10*expected, lambd=0, axis=1)
(array([[ 2.77258872,  2.77258872]]), array([[ 0.59657359,  0.59657359]]))
>>> powerdiscrepancy(np.column_stack((observed, observed)).T, 10*expected, lambd=1, axis=1)
(array([[ 3.,  3.]]), array([[ 0.5578254,  0.5578254]]))
>>> powerdiscrepancy(np.column_stack((observed, observed)).T, 10*expected, lambd=2/3.0, axis=1)
(array([[ 2.89714546,  2.89714546]]), array([[ 0.57518277,  0.57518277]]))
>>> powerdiscrepancy(np.column_stack((observed, observed)).T, expected, lambd=2/3.0, axis=1)
(array([[ 2.89714546,  2.89714546]]), array([[ 0.57518277,  0.57518277]]))
>>> powerdiscrepancy(np.column_stack((observed, observed)), expected, lambd=2/3.0, axis=0)
(array([[ 2.89714546,  2.89714546]]), array([[ 0.57518277,  0.57518277]]))

Each series can have a different total count (sum):

>>> powerdiscrepancy(np.column_stack((observed, 2*observed)), expected, lambd=2/3.0, axis=0)
(array([[ 2.89714546,  5.79429093]]), array([[ 0.57518277,  0.21504648]]))
>>> powerdiscrepancy(np.column_stack((2*observed, 2*observed)), expected, lambd=2/3.0, axis=0)
(array([[ 5.79429093,  5.79429093]]), array([[ 0.21504648,  0.21504648]]))
>>> powerdiscrepancy(np.column_stack((2*observed, 2*observed)), 20*expected, lambd=2/3.0, axis=0)
(array([[ 5.79429093,  5.79429093]]), array([[ 0.21504648,  0.21504648]]))
>>> powerdiscrepancy(np.column_stack((observed, 2*observed)), np.column_stack((10*expected, 20*expected)), lambd=2/3.0, axis=0)
(array([[ 2.89714546,  5.79429093]]), array([[ 0.57518277,  0.21504648]]))
>>> powerdiscrepancy(np.column_stack((observed, 2*observed)), np.column_stack((10*expected, 20*expected)), lambd=-1, axis=0)
(array([[ 2.77258872,  5.54517744]]), array([[ 0.59657359,  0.2357868 ]]))
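
The outputs above do not change when expected is passed as proportions, as 10*expected, or as 20*expected. That is consistent with the expected values being rescaled to each series' observed total before the statistic is computed. A minimal sketch of such a rescaling in plain Python follows. The rule is inferred from the examples rather than taken from the library source, and rescale_expected is a hypothetical helper, not a statsmodels function.

```python
def rescale_expected(obs, exp):
    """Scale exp so it sums to the same total as obs (hypothetical helper)."""
    total_obs = sum(obs)
    total_exp = sum(exp)
    return [e * total_obs / total_exp for e in exp]


observed = [2.0, 4.0, 2.0, 1.0, 1.0]
proportions = [0.2, 0.2, 0.2, 0.2, 0.2]

# Proportions and any positive multiple of them give the same expected counts.
from_props = rescale_expected(observed, proportions)
from_counts = rescale_expected(observed, [20.0 * p for p in proportions])

# A series with twice the total gets twice the expected count per cell.
doubled = rescale_expected([2.0 * o for o in observed], proportions)
```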