Extensions of the Central Limit Theorem

applet-magic.com Thayer Watkins Silicon Valley & Tornado Alley USA

The Central Limit Theorem

The Central Limit Theorem (CLT) is a powerful and important result of mathematical analysis. In its standard form it says that if a stochastic variable x has a finite variance then the distribution of the sums of n samples of x, suitably centered and scaled, will approach a normal distribution as the sample size n increases without limit. In this standard version there is the requirement that the elements of the sum be independent and identically distributed. The requirement of identical distributions serves mainly to make the proof feasible. The CLT also applies to the case of non-identical distributions so long as the set of distributions is suitably bounded in terms of mean and variance. The CLT can also be extended to sample statistics beyond sums and means. For example, sample statistics of the form

s = f^{-1}( Σ_{i=1}^{n} f(x_i) )

will have a limiting distribution which is a transform of a normal distribution.
Consider the instance of this case in which f(x)=x². Apart from a factor of 1/√n, this would be the sample standard deviation when the distribution of x is known to have a zero mean. Since the CLT applies for any distribution of finite variance it would apply to the distribution of x², provided x² itself has a finite variance. Thus the distribution of the sum of squares would approach a normal distribution. The statistic s would then approach a distribution which is the square root transform of a normal distribution.
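As a small numerical sketch of the f(x)=x² case, the snippet below computes s as the square root of the sum of squares for one sample and compares s/√n with the root-mean-square deviation about a known mean of zero. The function name root_sum_of_squares and the sample values are illustrative assumptions, not part of the original text.

```python
import math

def root_sum_of_squares(xs):
    """s = f^-1(sum of f(x_i)) with f(x) = x**2 and f^-1 the positive square root."""
    return math.sqrt(sum(x * x for x in xs))

# Hypothetical sample, assumed to come from a zero-mean distribution.
sample = [0.3, -0.1, 0.4, -0.2, 0.0, -0.4]
n = len(sample)

s = root_sum_of_squares(sample)
# Root-mean-square deviation about the known mean of zero.
rms = math.sqrt(sum(x * x for x in sample) / n)

# s / sqrt(n) and rms should agree up to rounding.
print(s, s / math.sqrt(n), rms)
```

Dividing by √n only rescales s, so it does not change the shape of the limiting distribution.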
As another instance of this case consider the geometric means of samples:

g = [ Π_{i=1}^{n} x_i ]^{1/n}

This can be put into the form

s = log(g) = [ Σ_{i=1}^{n} log(x_i) ]/n

The sum of the logarithms of x will approach a normal distribution and likewise the mean of the sample logarithms. Therefore the distribution of log(g) will approach a normal distribution and hence g will have a limiting distribution which is the exponential transform of a normal distribution.
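The geometric-mean argument can be checked with a rough simulation. The sketch below assumes x is uniform on (0.5, 1.5) so that log(x) is defined, and uses a sample size of 25 and 2000 repetitions. Those choices, the seed, and the helper names geometric_mean and skewness are assumptions made for illustration. A skewness of log(g) near zero is consistent with an approximately normal shape, though it does not prove it.

```python
import math
import random
import statistics

def geometric_mean(xs):
    """Geometric mean computed as exp(mean of logs); requires every x > 0."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def skewness(values):
    """Population skewness: third central moment divided by sd cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

rng = random.Random(0)          # fixed seed, an arbitrary choice
n, repetitions = 25, 2000       # assumed sizes, not from the original text

log_gs = []
for _ in range(repetitions):
    sample = [rng.uniform(0.5, 1.5) for _ in range(n)]
    log_gs.append(math.log(geometric_mean(sample)))

skew_log = skewness(log_gs)
print("mean of log(g):", statistics.fmean(log_gs))
print("skewness of log(g):", skew_log)
```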
In general then the sum of the f(x)'s will have a limit distribution which is normal and the statistic s will have a limit distribution which is the f^-1() transform of a normal distribution.
The Functional Transform of Distribution

If z has a probability distribution p(z) what is the distribution of f(z)? Consider first the case in which f(z) is a monotonically increasing function. The probability that z lies between a and b is given by

P(a ≤ z ≤ b) = ∫_a^b p(z) dz

The probability distribution for f(z) is given mathematically by the change of variable in the integral; i.e., the probability that w=f(z) is between f(a) and f(b) is given by:

P(f(a) ≤ f(z) ≤ f(b)) = ∫_{f(a)}^{f(b)} [ p(f^{-1}(w)) (dz/dw) ] dw
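The change of variable can be checked numerically. The sketch below assumes the increasing function w = f(z) = exp(z) with z standard normal, so that z = log(w) and dz/dw = 1/w. It integrates the transformed density over w from f(a) to f(b) and compares the result with P(a ≤ z ≤ b). The example function, the interval, and the midpoint-rule helper are assumptions chosen for illustration.

```python
import math

def normal_pdf(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def integrate(func, lo, hi, steps=20000):
    """Simple midpoint-rule quadrature."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

def density_w(w):
    """p(f^-1(w)) * dz/dw for the assumed f(z) = exp(z)."""
    return normal_pdf(math.log(w)) * (1.0 / w)

a, b = -1.0, 0.5
prob_z = normal_cdf(b) - normal_cdf(a)                   # P(a <= z <= b)
prob_w = integrate(density_w, math.exp(a), math.exp(b))  # integral over w

print(prob_z, prob_w)
```

The two printed numbers should agree to within the quadrature error.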

For instance suppose w=f(z)=z³ and hence z=w^{1/3}. If z has the normal distribution (1/√(2π)) exp[-z²/2] then

dz/dw = (1/3) w^{-2/3}
and hence the probability distribution for w is
(1/(3√(2π))) exp[-w^{2/3}/2] w^{-2/3}
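A Monte Carlo check of the cube example is sketched below. It draws standard normal values, cubes them, and compares the fraction falling in an interval with the integral of the density derived above. The interval from 0.125 to 3.375 (the cubes of 0.5 and 1.5), the seed, and the number of draws are arbitrary assumptions. The interval is kept away from w=0, where the density is unbounded.

```python
import math
import random

def density_w(w):
    """Density of w = z**3 for standard normal z, for w != 0.
    abs() is used so that fractional powers stay real for negative w."""
    r = abs(w) ** (2 / 3)
    return math.exp(-r / 2) / (3 * math.sqrt(2 * math.pi) * r)

def integrate(func, lo, hi, steps=20000):
    """Simple midpoint-rule quadrature."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

rng = random.Random(1)  # fixed seed, an arbitrary choice
draws = [rng.gauss(0, 1) ** 3 for _ in range(100000)]

lo, hi = 0.125, 3.375   # cubes of 0.5 and 1.5
fraction = sum(lo <= w <= hi for w in draws) / len(draws)
predicted = integrate(density_w, lo, hi)

print(fraction, predicted)
```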

When f(z) is monotonically decreasing the result is essentially the same, except that the limits of integration must be reversed. The reversal introduces a negative sign which, multiplied by the negative sign of dz/dw, is equivalent to taking the absolute value of dz/dw.
When f(z) is not monotonic, the possibility of multiple solutions for z of the equation f(z)=w must be taken into account. The probability density function for w, P(w), is then given by

P(w) = Σ_α p(z_α) |dz/dw|_{z_α}

where the sum is over all z_α such that f(z_α)=w and the derivative dz/dw is evaluated at those values of z_α.
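The sum over roots translates fairly directly into code. In the sketch below the caller supplies the density p, a function returning the roots z_α for a given w, and a function returning |dz/dw| at a root. All of these names are hypothetical. As an assumed example it uses f(z)=z² with a standard normal z, for which the result should match the well-known chi-square density with one degree of freedom.

```python
import math

def normal_pdf(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def density_of_transform(p, roots, abs_dz_dw, w):
    """P(w) = sum over roots z_a of f(z) = w of p(z_a) * |dz/dw| at z_a."""
    return sum(p(z) * abs_dz_dw(z) for z in roots(w))

# Assumed example: f(z) = z**2, with roots +sqrt(w) and -sqrt(w).
def roots_of_square(w):
    if w <= 0:
        return []           # no real roots (w = 0 is excluded: density unbounded)
    r = math.sqrt(w)
    return [r, -r]

def abs_dz_dw_square(z):
    # From w = z**2, dw/dz = 2z, so |dz/dw| = 1 / (2|z|).
    return 1.0 / (2.0 * abs(z))

w = 1.5
value = density_of_transform(normal_pdf, roots_of_square, abs_dz_dw_square, w)
# Chi-square density with one degree of freedom, for comparison.
chi_square_1 = math.exp(-w / 2) / math.sqrt(2 * math.pi * w)

print(value, chi_square_1)
```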

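Consider the case of f(z)=z². Then z = ±w^{1/2} and |dz/dw| = (1/2) w^{-1/2} at both roots. The variable w can have only non-negative values. When p(z) is symmetric about zero, so that p(-w^{1/2}) = p(w^{1/2}), its probability density function is given by:

P(w) = 2 p(w^{1/2}) (1/2) w^{-1/2} = p(w^{1/2}) w^{-1/2}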

Suppose the probability density function for z is

p(z) = 1 for -0.5≤z≤+0.5
p(z) = 0 for all other values of z

The square of z can then only have values between 0 and 0.25. Thus the probability density function for w=z² is given by

P(w) = w^{-1/2} for 0<w≤0.25
P(w) = 0 for all other values of w
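Integrating this density from 0 to w gives the cumulative probability 2√w, which reaches 1 at w=0.25. The sketch below compares that expression with the empirical fraction of squared uniform draws at a few points. The seed, the number of draws, the test points, and the function names are assumptions for illustration.

```python
import math
import random

def cdf_w(w):
    """Integral of w**(-1/2) from 0 to w, valid for 0 <= w <= 0.25."""
    return 2 * math.sqrt(w)

rng = random.Random(2)  # fixed seed, an arbitrary choice
squares = [rng.uniform(-0.5, 0.5) ** 2 for _ in range(50000)]

def empirical_cdf(w):
    return sum(v <= w for v in squares) / len(squares)

for w in (0.01, 0.04, 0.16):
    print(w, empirical_cdf(w), cdf_w(w))
```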

Illustration of this Extension of the Central Limit Theorem

Below are shown the histograms for 2000 repetitions of taking samples of n values of a random variable which is uniformly distributed between -0.5 and +0.5 and computing the sum of their squares. The sum is normalized by dividing by the square root of the sample size n. This keeps the dispersion of the distribution constant; otherwise with larger n the distribution would be more spread out. Although the random variable is distributed between -0.5 and +0.5, its square is distributed between 0 and 0.25. Each time the display is refreshed a new set of 2000 repetitions of the samples is created.
As can be seen, as the sample size n gets larger the distribution more closely approximates the shape of the normal distribution.

Although the distribution for n=1 is decidedly non-normal, for n=16 the distribution looks quite close to a normal distribution even though the sample value can take on only positive values.
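The experiment described above can be approximated with the following sketch. It reports the variance and skewness of the normalized sum of squares for n = 1, 4 and 16 instead of drawing histograms. The seed, the choice of skewness as a rough measure of departure from normality, and the function names are assumptions, not the original applet's code. For this uniform variable the variance of x² is 1/180, so the normalized sum should have a variance near that value for every n.

```python
import math
import random
import statistics

def normalized_sum_of_squares(n, rng):
    """Sum of n squared uniform(-0.5, 0.5) draws, divided by sqrt(n)."""
    return sum(rng.uniform(-0.5, 0.5) ** 2 for _ in range(n)) / math.sqrt(n)

def skewness(values):
    """Population skewness: third central moment divided by sd cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

rng = random.Random(3)   # fixed seed, an arbitrary choice
repetitions = 2000

results = {}
for n in (1, 4, 16):
    values = [normalized_sum_of_squares(n, rng) for _ in range(repetitions)]
    results[n] = (statistics.pvariance(values), skewness(values))
    print(n, results[n])
```

The variance should stay roughly constant while the skewness shrinks as n grows.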
If the square root is taken of the sum of the squares, the distributions of the results are as shown below:

The positive square root of the square of the random variable is distributed from 0 to 0.5. Although the distributions for larger sample sizes look generally like normal distributions, they are transforms of normal distributions.
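A rough sketch of the square-root case follows. It assumes the square root is taken of the unnormalized sum of squares, which the text does not state explicitly, and uses n = 16 with 2000 repetitions. Since the sum of squares is centered near n/12, the square root should be centered near √(n/12) and can never exceed √n/2. The seed and the variable names are assumptions.

```python
import math
import random
import statistics

rng = random.Random(4)       # fixed seed, an arbitrary choice
n, repetitions = 16, 2000

# Square root of the (unnormalized, by assumption) sum of squares.
values = [
    math.sqrt(sum(rng.uniform(-0.5, 0.5) ** 2 for _ in range(n)))
    for _ in range(repetitions)
]

mean_s = statistics.fmean(values)
print("mean of s:", mean_s)
print("sqrt(n/12):", math.sqrt(n / 12))
print("range of s:", min(values), max(values))
```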
Not All Sample Statistics Approximate a Normal Distribution

Consider the distribution of sample maximums for samples of a random variable uniformly distributed between -0.5 and +0.5. For n=1 the sample maximum is just the sample value.

The above distributions suggest that for an extension of the central limit theorem to apply the sample statistic must be representable as a sum.
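The behavior of the sample maximum can be sketched as follows. For n independent draws uniform on (-0.5, +0.5), the probability that the maximum is at most x is (x+0.5)^n, so the distribution piles up against +0.5 and stays strongly skewed instead of becoming symmetric. The seed, n = 16, the comparison point 0.4, and the helper names are assumptions for illustration.

```python
import random
import statistics

def sample_maximum(n, rng):
    return max(rng.uniform(-0.5, 0.5) for _ in range(n))

def skewness(values):
    """Population skewness: third central moment divided by sd cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

rng = random.Random(5)       # fixed seed, an arbitrary choice
n, repetitions = 16, 2000
maxima = [sample_maximum(n, rng) for _ in range(repetitions)]

x = 0.4
empirical = sum(m <= x for m in maxima) / repetitions
exact = (x + 0.5) ** n       # P(max <= x) for n independent uniform draws

print("P(max <= 0.4):", empirical, exact)
print("skewness of maxima:", skewness(maxima))
```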

