Wednesday, June 30, 2010

Unbiasedness and 2 counterexamples

The role of unbiasedness in statistical decision theory is ambiguous. Of the various criteria being considered here, it is the only one that does not depend solely on the risk function. Often we find that biased estimators perform better than unbiased ones from the point of view of, say, minimising the mean squared error. For this reason, many modern statisticians consider the whole concept of unbiasedness to be somewhere between a distraction and a total irrelevance.

- Essentials of Statistical Inference, Young, G.A. and R.L. Smith, CUP (2005), p. 10

I will supply two examples, one innocuous and one dramatic, to illustrate the problems with unbiasedness as a criterion for selecting estimators. These are both taken from here, but the second is originally from here.

  1. Consider the problem of estimating the variance from observations $\{x_i\}_{i=1}^{n}$ drawn from a Gaussian density with mean $\mu$ and variance $\sigma^2$. Consider the estimators \[s^2 =\frac{1}{n-1}\sum_{i=1}^n(x_i-\bar{x})^2\] and \[t^2 =\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2\] for $\sigma^2$, where $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$. $s^2$ is often insisted upon on the grounds that it is unbiased. However, mean squared error is the more relevant criterion: what good is an unbiased estimator for a parameter with true value zero (say) if it is -100 half the time and 100 the other half? I'd rather have the biased estimator that is +0.0001 with probability 0.6 and -0.0001 with probability 0.4. On that criterion $t^2$ is superior in this case, as \[\mathbb{E}(t^2-\sigma^2)^2 < \mathbb{E}(s^2-\sigma^2)^2.\]

  2. The difference between the two estimators above diminishes as the sample size increases: both converge to the same probability limit, $\sigma^2$, and thus both are consistent. A remarkable situation where the unbiased estimator is not so benign arises when we have a single observation $X \sim \mathrm{Poisson}(\lambda)$, $\lambda \in \mathbb{R}^+$. An unbiased estimator $\delta(X)$ for $e^{-2\lambda}$ must satisfy \[\mathbb{E}(\delta(X)) = \sum_{x=0}^{\infty}\delta(x)\frac{e^{-\lambda}\lambda^x}{x!} = e^{-2\lambda},\] where $\frac{e^{-\lambda}\lambda^x}{x!}$ is the Poisson mass function.
    Multiplying both sides by $e^{\lambda}$ gives $\sum_{x=0}^{\infty}\delta(x)\frac{\lambda^x}{x!} = e^{-\lambda} = \sum_{x=0}^{\infty}\frac{(-\lambda)^x}{x!}$, so from the power series expansion of $e^{-\lambda}$ the only possible such function is $\delta(X) = (-1)^{X}$. Then if $X=100$, we are led to estimate $e^{-2\lambda}$ as 1, even though $\lambda$ would need to be fairly high to produce a realisation of 100, which would put $e^{-2\lambda}$ very close to 0. Even more incredibly, if $X=3$, then $\delta(3) = -1$, an estimate of a parameter which must lie in $(0, 1]$. A better estimator for $e^{-2\lambda}$ is $e^{-2X}$ which is, in this case, the maximum likelihood estimator.
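
Both examples can be checked numerically. What follows is only a minimal Python sketch: the function names, the choices $n = 5$ and $\lambda = 1.5$, the number of Monte Carlo trials and the truncation of the Poisson sum at 200 terms are all illustrative assumptions, not anything taken from the sources above. For the Gaussian case the simulated values can be compared against the standard closed forms $\mathbb{E}(s^2-\sigma^2)^2 = \frac{2\sigma^4}{n-1}$ and $\mathbb{E}(t^2-\sigma^2)^2 = \frac{(2n-1)\sigma^4}{n^2}$.

```python
import math
import random


def variance_estimator_mse(n, sigma2=1.0, trials=20000, seed=0):
    """Monte Carlo MSE of s^2 (divisor n-1) and t^2 (divisor n), Gaussian data."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    sigma = math.sqrt(sigma2)
    sq_err_s = sq_err_t = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
        sq_err_s += (ss / (n - 1) - sigma2) ** 2
        sq_err_t += (ss / n - sigma2) ** 2
    return sq_err_s / trials, sq_err_t / trials


def poisson_pmf(x, lam):
    """Poisson mass function, computed in log space to avoid overflow in x!."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def poisson_expectation(f, lam, terms=200):
    """E[f(X)] for X ~ Poisson(lam), truncating the series (an assumption
    that is only reasonable when lam is small relative to `terms`)."""
    return sum(f(x) * poisson_pmf(x, lam) for x in range(terms))


def estimator_mse(delta, lam):
    """MSE of delta(X) as an estimator of exp(-2 * lam)."""
    theta = math.exp(-2.0 * lam)
    return poisson_expectation(lambda x: (delta(x) - theta) ** 2, lam)


def unbiased(x):
    return (-1) ** x  # the only unbiased estimator of exp(-2 * lam)


def mle(x):
    return math.exp(-2.0 * x)  # maximum likelihood estimator


if __name__ == "__main__":
    mse_s, mse_t = variance_estimator_mse(n=5)
    print("Gaussian, n=5: MSE(s^2) =", mse_s, " MSE(t^2) =", mse_t)

    lam = 1.5
    print("target exp(-2*lam)  =", math.exp(-2.0 * lam))
    print("E[(-1)^X]           =", poisson_expectation(unbiased, lam))
    print("E[exp(-2X)]         =", poisson_expectation(mle, lam))
    print("MSE of (-1)^X       =", estimator_mse(unbiased, lam))
    print("MSE of exp(-2X)     =", estimator_mse(mle, lam))
```

The Poisson part uses exact (truncated) sums rather than simulation, so it shows directly that $(-1)^X$ has the right mean while $e^{-2X}$ does not, and yet the biased estimator has the smaller mean squared error at this value of $\lambda$. Other values of $\lambda$ and $n$ are worth trying before drawing any general conclusion from the sketch.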

1 comment:

Arash Beheshtian said...


My name is Arash, a PhD student at Cornell University, USA.

I just noticed you are an expert in ‘R’ and was wondering if I could ask you a question.

Is there any online tutorial or workshop for the ‘mlogit’ package?

The reason I ask is that my research topic is the microeconomic impact of vehicle alternatives (fuel cell vehicle vs hybrid vs …) and I need to get comfortable with the ‘mlogit’ package.

Also, I took a course at Cornell and reviewed some online/pdf materials, but I think I need to learn more details and do more in-depth study. Any help from your side would be greatly appreciated.

Thanks a bunch,