statistics.variance()

statistics.variance(data, xbar=None)

Return the sample variance of data, an iterable of at least two real-valued numbers. Variance, or second moment about the mean, is a measure of the variability (spread or dispersion) of data. A large variance indicates that the data is spread out; a small variance indicates it is clustered closely around the mean.

If the optional second argument xbar is given, it should be the mean of data. If it is missing or None (the default), the mean is automatically calculated.

Use this function when your data is a sample from a population. To calculate the variance from the entire population, see pvariance().

Raises StatisticsError if data has fewer than two values.

Examples:

>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> variance(data)
1.3720238095238095

If you have already calculated the mean of your data, you can pass it as the optional second argument xbar to avoid recalculation:

>>> m = mean(data)
>>> variance(data, m)
1.3720238095238095

This function does not attempt to verify that you have passed the actual mean as xbar. Using arbitrary values for xbar can lead to invalid or impossible results.

Decimal and Fraction values are supported:

>>> from decimal import Decimal as D
>>> variance([D("27.5"), D("30.25"), D("30.25"), D("34.5"), D("41.75")])
Decimal('31.01875')
 
>>> from fractions import Fraction as F
>>> variance([F(1, 6), F(1, 2), F(5, 3)])
Fraction(67, 108)

Note

This is the sample variance s² with Bessel’s correction, also known as variance with N-1 degrees of freedom. Provided that the data points are representative (e.g. independent and identically distributed), the result should be an unbiased estimate of the true population variance.

If you somehow know the actual population mean μ you should pass it to the pvariance() function as the mu parameter to get the variance of a sample.

Links:

https://docs.python.org/3.5/library/statistics.html#statistics.variance

doc_python

2025-01-10 15:47:30

Comments