The Average as a Surrogate for the Total
In statistics and probability, the arithmetic mean or the
average value is often considered to be especially important. There are some good reasons for this –
sometimes. Other times, not so much. For starters, there really is no such thing as an average person. So, knowing the average value for a
population may not give you very much information about yourself or any other
specific individual. But, for the
purpose of providing a quantitative description of a population, the average
often works rather well. The reason is
simple; the average is proportional to the total:
Average = Population Total / number of persons
Therefore, as long as the utility function is also
proportional to the quantitative value, the average serves as a utilitarian
measure of value. Even though that proposition is dubitable to
the point of being obviously wrong under some circumstances (e.g. for risk
assessments where the risk is driven by extreme values), the "average person" often
serves as a useful stand-in for "everyone".
The use of mathematical probability in finance and insurance
often follows the same underlying logic:
Expected Value = Total Net Return / number of bets
Given that there are sure to be some bad loans and bad insurance risks,
the key to having a profitable business is for the average return to be
positive. At least, that is what
investors expect.
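To make the arithmetic concrete, here is a minimal Python sketch of the relation Expected Value = Total Net Return / number of bets. The loan book is entirely hypothetical (the counts and dollar amounts are made up for illustration); the only point is that a handful of bad loans can be tolerated as long as the average comes out positive.

```python
# Hypothetical loan book (illustrative numbers, not real data):
# 95 loans each return a net gain of 50, 5 loans default and lose 600 each.
net_returns = [50.0] * 95 + [-600.0] * 5

number_of_bets = len(net_returns)
total_net_return = sum(net_returns)

# The average is just the total rescaled by the number of bets.
expected_value = total_net_return / number_of_bets

print("total net return:", total_net_return)
print("expected value per loan:", expected_value)
```

Because the average is only a rescaled total, it says nothing about which loans go bad, only whether the book as a whole makes money.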
Measurement Error
Using the Standard Error of the Mean to characterize the
uncertainty associated with scientific measurements has a long history. Writing in 1755, Thomas Simpson adapted the Bernoulli
theorem (aka the law of large numbers) to make the following observation
(quote from Stigler, 1986):
Upon the whole …. It appears, that the taking of the Mean of a number of observations, greatly diminishes the chances for all the smaller errors, and cuts off almost all possibility of any great ones: which last consideration, alone, seems sufficient to recommend the use of the method, not only to astronomers, but to all others concerned in making experiments of any kind (to which the above reasoning is equally applicable). And the more observations or experiments that are made, the less will the conclusion be liable to err, provided they admit of being repeated under similar circumstances.
However, Simpson’s claim met with immediate criticism from
Thomas Bayes, who noted (also quoted in Stigler, 1986):
As I see no mistakes in Mr. Simpson’s calculations, I will venture to say that there is one in the Hypothesis on which he proceeds. And I think it is manifestly this, when we observe with imperfect instruments or organs; he supposes that the chances for the same error in excess or defect are exactly the same, and upon this hypothesis only has he shown the incredible advantage, which he would prove arises from taking the mean of a great many observations.
In other words, the standard error of the mean accurately
characterizes the uncertainty of a measurement only when, as Simpson assumed,
the true value corresponds to the arithmetic mean. If it doesn’t, then even though the theorem
is true, the result is irrelevant. For
example, if the errors are lognormally distributed, then the true value will
correspond to the geometric mean rather than the arithmetic mean. If the distribution of the
measurements relative to the true value is unknown, then so is the relationship of the
true value to the distribution. Calling
the mean value the expected value doesn’t help at all.
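A small simulation may help illustrate the lognormal case. This is only a sketch under stated assumptions: the true value (10), the log-scale error spread (0.8), and the sample size are hypothetical choices, and the error is assumed to be multiplicative with a median of 1. It uses only the Python standard library.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 10.0  # hypothetical quantity being measured
SIGMA = 0.8        # assumed spread of the error on the log scale
N = 20000          # assumed number of repeated measurements

# Each measurement is the true value times a lognormal error with median 1.
measurements = [TRUE_VALUE * random.lognormvariate(0.0, SIGMA) for _ in range(N)]

arithmetic_mean = statistics.fmean(measurements)
geometric_mean = statistics.geometric_mean(measurements)

# Standard error of the (arithmetic) mean.
sem = statistics.stdev(measurements) / math.sqrt(N)

# For a lognormal error, the arithmetic mean converges on this value,
# not on TRUE_VALUE.
limit_of_arithmetic_mean = TRUE_VALUE * math.exp(SIGMA ** 2 / 2)

print("arithmetic mean:", arithmetic_mean)
print("geometric mean:", geometric_mean)
print("standard error of the mean:", sem)
print("limit of arithmetic mean:", limit_of_arithmetic_mean)
```

Under these assumptions the standard error shrinks as more measurements are taken, but it describes how tightly the arithmetic mean has been pinned down, not how close that mean is to the true value. The geometric mean is the one that settles near 10.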
Averaging the Truth
In the realm of the probability of chance, the mean value is
almost certainly given far more credence than it deserves. Still, under most circumstances the arithmetic
mean is close enough to the actual value of interest to be considered
approximately true. On the other hand, with the probability of
causes, or any other notion of probability arising from competing
theoretical propositions, there is no basis for using a mean value at all. For example, consider the probability that
the earth is round as opposed to flat. As a decision problem, under no circumstances
would it make any sense to average the flat earth theory with the round one. Yet, that is essentially what Bayesian Model Averaging
does.
The admirable trait of Bayesian Model Averaging (BMA) is
that it acknowledges that different plausible models may yield
quite different estimates (Hoeting et al., 1999).
Like a probability tree treatment
of model uncertainty (e.g. Evans et al., 1994; Carrington et al., 2013), BMA requires
identifying a set of alternative plausible models and establishing
model probabilities that will surely require some degree of subjective
judgement. But with BMA the subjective
probability is just the prior probability rather than the finished product. Bayesian updating and averaging are the next
steps.
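The mechanics just described can be sketched in a few lines of Python. This is a toy illustration, not the procedure from Hoeting et al.: the two models, their prior probabilities, the likelihoods of the observed data, and their estimates are all hypothetical numbers chosen to make the arithmetic easy to follow.

```python
# Two hypothetical competing models. "prior" is the subjective model
# probability, "likelihood" is the assumed probability of the observed data
# under that model, and "estimate" is what the model says the answer is.
models = {
    "model_A": {"prior": 0.5, "likelihood": 0.02, "estimate": 1.0},
    "model_B": {"prior": 0.5, "likelihood": 0.06, "estimate": 100.0},
}

# Bayesian updating: posterior weight is proportional to prior * likelihood.
evidence = sum(m["prior"] * m["likelihood"] for m in models.values())
posterior = {
    name: m["prior"] * m["likelihood"] / evidence
    for name, m in models.items()
}

# Averaging: the BMA estimate is the posterior-weighted arithmetic mean.
bma_estimate = sum(posterior[name] * m["estimate"] for name, m in models.items())

for name, weight in posterior.items():
    print(name, "posterior weight:", round(weight, 3))
print("BMA estimate:", round(bma_estimate, 2))
```

With these made-up inputs the posterior weights come out at roughly 0.25 and 0.75, and the averaged estimate lands between 1 and 100, a value that neither model actually proposes.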
The differences between BMA and an unvarnished probability
tree are all attributable to different notions of probability. Like Bayesian schemes in general, BMA is
intended to give the probability of causes a mathematical treatment that
resembles that used for the probability of chance; and the fixation on the
arithmetic mean comes with that package.
A probability tree approach that embellishes a weight-of-the-evidence evaluation
is apt to use something like the Bradford-Hill criteria (Hill, 1965) to establish
model probabilities, and none of those criteria assigns any importance to the arithmetic
mean. Given that an assumption about the mean
is what led Thomas Bayes to criticize Simpson, it seems that the real Bayes
would never have approved of BMA.
Along with a range or outer bounds, the mean is perhaps a
useful central estimate even when uncertainty arises from competing plausible propositions. But, since
it corresponds to a common legal standard of proof (“preponderance of the evidence”),
the median is better for many purposes.
But there may be room for both.
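As a rough illustration of how the mean, the median, and the outer bounds can differ when the uncertainty comes from competing propositions, consider the sketch below. The three (estimate, probability) pairs are hypothetical, and the helper names `weighted_mean` and `weighted_median` are introduced here only for illustration.

```python
def weighted_mean(outcomes):
    """Probability-weighted arithmetic mean of (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)


def weighted_median(outcomes):
    """Smallest value whose cumulative probability reaches one half."""
    cumulative = 0.0
    for value, prob in sorted(outcomes):
        cumulative += prob
        if cumulative >= 0.5:
            return value
    raise ValueError("probabilities must sum to at least 0.5")


# Hypothetical estimates from three competing models, with model probabilities.
outcomes = [(1.0, 0.20), (10.0, 0.55), (1000.0, 0.25)]

mean_estimate = weighted_mean(outcomes)
median_estimate = weighted_median(outcomes)
bounds = (min(v for v, _ in outcomes), max(v for v, _ in outcomes))

print("mean:", mean_estimate)
print("median:", median_estimate)
print("outer bounds:", bounds)
```

In this made-up case the mean is pulled far toward the least likely model, while the median is the estimate of the model that it is more likely than not that the answer does not exceed. Reporting both, together with the bounds, shows how much the choice of summary matters.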
The real problem with BMA is that it proffers the arithmetic mean as the
value of interest. It isn’t really; the value
at stake is the truth. If current science is unable to divulge it,
then we really don’t know what to expect.
References
Carrington CD, Murray C, and Tao S (2013). A Quantitative Assessment of Inorganic Arsenic in Apple Juice.
Evans JS, Graham JD, Gray GM, and Sielken RL Jr (1994). A distributional approach to characterizing low-dose cancer risk. Risk Anal 14:25-34.
Hill AB (1965). The Environment and Disease: Association or Causation? Proc Royal Soc Med 58:295-300.
Hoeting JA, Madigan D, Raftery AE, and Volinsky CT (1999). Bayesian Model Averaging: A Tutorial. Statistical Science 14:382-417.
Stigler SM (1986). Probabilists and the Measurement of Uncertainty. In: The History of Statistics: The Measurement of Uncertainty before 1900. Belknap Press, Cambridge MA, pp. 62-98.
Official Post Soundtrack
Supertramp (1975). The Meaning. In: Crisis? What Crisis?, Track 9.
Post Notes
Thesis Post #47. Best read in conjunction with "A Dictionary of Probability" and "Quantifiers".