Raw vs Central Moments

The choice of reference point \(a\) in \(E[(X-a)^r]\) fundamentally changes what the moments measure. This page explains the difference and gives guidance on when to use each.

Raw Moments (a = 0)

Raw moments are computed about the origin:

\[\mu'_r = E[X^r]\]

They depend on both the location and the shape of the distribution. Shifting the distribution by a constant \(c \neq 0\) changes the mean and, in general, every higher raw moment:

\[E[(X + c)^r] \neq E[X^r] \quad \text{in general}\]

When to use raw moments:

You need the mean (\(\mu'_1 = E[X]\))
You want to compute variance via the shortcut \(\sigma^2 = \mu'_2 - (\mu'_1)^2\)
You are computing moments of a distribution that is naturally centred at the origin (e.g. a zero-mean distribution such as Normal(0, σ), for which raw and central moments coincide)
You are verifying a specific formula that is expressed in terms of raw moments

Limitations:

Raw moments of high order can be very large even for narrow distributions, because they grow with the distance of the distribution from the origin. This makes them less interpretable and more numerically sensitive.
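As a rough illustration of this limitation, the sketch below uses a made-up three-point distribution located near 10^8. The values, the probabilities, and the helper `moment` are illustrative assumptions, not part of ToFUL. The fourth raw moment is dominated by the location, the fourth central moment depends only on the spread, and the shortcut variance loses the true value (0.5) to floating-point cancellation.

```python
def moment(values, probs, r, a=0.0):
    """Return E[(X - a)^r] for a discrete distribution (illustrative helper)."""
    return sum(p * (x - a) ** r for x, p in zip(values, probs))

# Hypothetical narrow distribution located far from the origin.
values = [1e8 - 1.0, 1e8, 1e8 + 1.0]
probs = [0.25, 0.5, 0.25]

mean = moment(values, probs, 1)
raw4 = moment(values, probs, 4)              # dominated by the location, about 1e32
central4 = moment(values, probs, 4, a=mean)  # depends only on the spread

# Variance two ways: directly about the mean, and via the raw-moment shortcut.
var_direct = moment(values, probs, 2, a=mean)
var_shortcut = moment(values, probs, 2) - mean ** 2

print(f"raw 4th moment:      {raw4:.6e}")
print(f"central 4th moment:  {central4}")
print(f"variance (direct):   {var_direct}")
print(f"variance (shortcut): {var_shortcut}")
```

The shortcut subtracts two numbers near 10^16, where adjacent double-precision values are 2 apart, so a difference of 0.5 cannot survive the subtraction.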

Central Moments (a = μ)

Central moments are computed about the mean:

\[\mu_r = E\left[(X - \mu)^r\right]\]

They are location-invariant: shifting the distribution does not change any central moment. This makes them purely descriptive of the distribution’s shape.

Key properties:

\(\mu_1 = 0\) always (by definition of the mean)
\(\mu_2 = \sigma^2\): the variance, always non-negative
\(\mu_3\): measures asymmetry; the skewness coefficient is \(\mu_3 / \sigma^3\)
\(\mu_4\): measures tailedness; the kurtosis is \(\mu_4 / \sigma^4\)
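Location invariance can be checked numerically. The following is a minimal sketch that assumes a small hypothetical distribution with exact rational probabilities; the helper `moment` and the shift of 10 are chosen only for illustration.

```python
from fractions import Fraction

def moment(pmf, r, a=0):
    """Return E[(X - a)^r] for a discrete pmf given as {value: probability}."""
    return sum(p * (x - a) ** r for x, p in pmf.items())

# Hypothetical skewed distribution, with exact rational probabilities.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 5: Fraction(1, 8)}
shifted = {x + 10: p for x, p in pmf.items()}  # same shape, moved by c = 10

mu, mu_shifted = moment(pmf, 1), moment(shifted, 1)

for r in range(1, 5):
    central = moment(pmf, r, a=mu)
    central_shifted = moment(shifted, r, a=mu_shifted)
    raw, raw_shifted = moment(pmf, r), moment(shifted, r)
    # Central moments agree exactly after the shift; raw moments do not.
    print(r, central == central_shifted, raw == raw_shifted)
```

Using `Fraction` keeps the comparison exact, so the equality test is not affected by rounding.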

When to use central moments:

You want variance, skewness, or kurtosis
You are comparing the shape of distributions with different means
You are verifying distributional properties that should be independent of location
You want the Statistics tab in ToFUL to populate with derived measures

Moments about a Custom Point (a = c)

For any constant \(c\):

\[\mu_r(c) = E\left[(X - c)^r\right]\]

This generalises both raw (\(c = 0\)) and central (\(c = \mu\)) moments.

When to use custom moments:

You want to measure deviations from a natural reference other than the mean (e.g. a threshold, a benchmark, or a regulatory bound)
You are using the parallel axis theorem to transform between moment types analytically
You are computing conditional moments given a specific point of interest
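To make the role of the reference point concrete, here is a small sketch that treats four hypothetical measurements as an equally weighted empirical distribution. The data, the benchmark value, and the function name `moment_about` are assumptions made for this example.

```python
def moment_about(samples, r, c):
    """Average of (x - c)^r over the samples: an empirical moment about c."""
    return sum((x - c) ** r for x in samples) / len(samples)

# Hypothetical measurements and a hypothetical benchmark value.
samples = [4.0, 5.5, 6.0, 4.5]
benchmark = 4.0

mean = moment_about(samples, 1, 0.0)                 # c = 0: raw first moment
variance = moment_about(samples, 2, mean)            # c = mean: central second moment
msd_benchmark = moment_about(samples, 2, benchmark)  # mean squared deviation from the benchmark

print(mean, variance, msd_benchmark)
```

The same function covers all three cases; only the value passed as `c` changes.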

The Parallel Axis Theorem for Moments 

If you know the mean and the central moments up to order \(r\), you can compute the moment of order \(r\) about any other point \(c\) without returning to the distribution:

\[\mu_r(c) = \sum_{k=0}^{r} \binom{r}{k} (\mu - c)^{r-k} \mu_k\]

For \(r = 2\), using \(\mu_0 = 1\) and \(\mu_1 = 0\), the sum reduces to the classical parallel axis theorem:

\[E\left[(X-c)^2\right] = \sigma^2 + (\mu - c)^2\]

ToFUL always computes directly from the definition rather than via this formula, to avoid cancellation errors.
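For analytic checks, the transformation can be sketched as follows and compared with the direct definition. This is an illustrative implementation under assumptions, not ToFUL's code: the function names, the example distribution, and the reference point `c = 3` are all hypothetical, and exact rational arithmetic is used so that cancellation does not affect the comparison.

```python
from fractions import Fraction
from math import comb

def moment(pmf, r, a=0):
    """Return E[(X - a)^r] computed directly from the definition."""
    return sum(p * (x - a) ** r for x, p in pmf.items())

def shifted_moment(central, mu, c, r):
    """Moment of order r about c from the mean mu and central moments central[0..r]."""
    return sum(comb(r, k) * (mu - c) ** (r - k) * central[k] for k in range(r + 1))

# Hypothetical distribution with exact rational probabilities.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 5: Fraction(1, 8)}
mu = moment(pmf, 1)
central = [moment(pmf, k, a=mu) for k in range(5)]  # central[0] = 1, central[1] = 0

c = Fraction(3)  # arbitrary reference point
for r in range(1, 5):
    via_formula = shifted_moment(central, mu, c, r)
    direct = moment(pmf, r, a=c)
    print(r, via_formula, direct, via_formula == direct)
```

In floating point the two routes can differ slightly, which is the cancellation concern mentioned above.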

Practical Comparison 

For a Geometric distribution with \(p = 0.3\), counting failures before the first success (support 0, 1, 2, ..., so that \(\mu = (1-p)/p\)):

Order	Raw moment (a = 0)	Central moment (a = μ)
1	2.3333 (= μ)	0.0000
2	13.2222	7.7778 (= σ²)
3	111.2222	44.0741
4	1247.2963	552.2222

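The central moments are smaller than the raw moments of the same order because they exclude the contribution of the mean, so they describe the shape directly. For example, \(\sigma^2 \approx 7.7778\) matches the known variance formula \((1-p)/p^2 = 0.7/0.09\).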
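A comparison of this kind can be reproduced with a short script. The sketch below is not how ToFUL computes its values: it assumes the failures-before-first-success convention, sums the definition over a truncated support, and the truncation point `N = 500` is an assumption chosen so that the neglected tail is far below double precision.

```python
import math

p = 0.3
q = 1.0 - p
N = 500  # truncation point (assumption): q**N is far below double precision

support = range(N)
pmf = [p * q ** k for k in support]  # P(X = k), k failures before the first success

def moment(r, a=0.0):
    """Return E[(X - a)^r] by summing the definition over the truncated support."""
    return math.fsum(w * (k - a) ** r for k, w in zip(support, pmf))

mu = moment(1)
print("order  raw        central")
for r in range(1, 5):
    print(f"{r}      {moment(r):<10.4f} {moment(r, a=mu):.4f}")

# Closed forms for comparison: mean q/p, variance q/p^2, third central moment q(1+q)/p^3.
print(q / p, q / p**2, q * (1 + q) / p**3)
```

The closed forms on the last line give an independent check on the first three orders.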
