Tuesday, March 31, 2026

To Bayes or Not to Bayes, That Is the Question


The Responsibility Gap

There is no reason to assign probabilities to competing theories unless there is an actual decision that must be made.  Pascal knew that, and that’s one of the things you would learn from chapter 8 (Hacking, 1975).  It obviously can’t be a flawless process; otherwise the correct theory would be assigned a probability of 1 every time.

So, I figure the next best thing is to have probability assignments that are scientifically defensible.  At least that’s what I tried to do when working for a regulatory agency as a scientific advisor because a) I figured it was my job, b) I liked the job, and c) I was allowed to do it.  I always thought that would start a conversation about what the probability assignments should be.  But that didn’t happen for several related reasons:

1) It may be easier to not make a decision at all.  It’s sort of an FDA tradition to declare an emergency and then at a later date declare victory without doing anything at all.
2) If a decision absolutely must be made then it is much easier to make it with a formula, e.g. safety/uncertainty factors that don’t need to be scientifically defensible.
3) If a risk estimate is necessary then it’s much easier to use a default assumption (linear extrapolation from high doses).

All of those techniques insulate the expert from the decision, which obviates the need to assign probabilities to alternative hypotheses.  

Bayesian methodology also insulates experts from the decision, but to a lesser degree.  Since it does lay out the competing theories that underlie a decision, I think it is far preferable to the backroom methodologies outlined above.  But perhaps the best thing about Bayesian methodology is that the practitioners actually want the job.

Help Wanted

I started assigning probabilities to alternative hypotheses because I figured it was my job and for many years I was allowed to do it.  I was also allowed to participate in the writing of my own job description and I always made sure it said I was supposed to “convey uncertainty about potential health effects arising from contaminants in food to decision makers”.  But that didn’t last; after a reorganization my job description changed to something like “support a decision that has already been made”.  That’s when I pretty much figured I’d rather write a blog than work at the USFDA.

I don’t really want my old job back.  I’m too old for that.  But I still figure someone else should have it.  There’s been another reorganization, but there is still a contaminants branch facing the same old problems.  Seems like a lot of people in the EPA and WHO should know their way around a probability tree as well.  

But it isn’t really just a government issue; it’s primarily a science problem.  The Bayesians shouldn’t need to identify plausible theories after a study has already been conducted.  That should have happened before the effort to design experiments and/or collect observational data began.  But of course, it is entirely possible that all of the hypotheses considered at the outset of a study are disproven by the new data.  That makes it time for a new theory.  Neither a frequentist nor a Bayesian analysis can help with that.  But a probability tree can.

Model Shopping

Perhaps the scientists who need probability trees the most are epidemiologists, especially the ones doing multivariate analysis with multiple putative causal influences.  I’ve been over some of it before from a historical perspective (Neyman was right, but Fisher sold more textbooks); null hypothesis testing doesn’t necessarily test the hypotheses that really matter.  That can easily set up a model shopping exercise that is solely interested in generating statistical significance, possibly by using a model that isn’t plausible in the first place.  I’ll also add the general point made in my last post: turning hypothesis testing into a statistical exercise treats observations as instances in stochastic probability theory rather than as evidence for or against a theory, even when the underlying theory isn’t stochastic at all.

Statistical significance testing isn’t crazy when the number of alternative theories is exactly two, and the number of observations is small.  But otherwise, it’s nuts.  Model shopping isn’t such a bad idea when you are shopping for plausible theories.  However, it needs to happen as part of an open discussion that even someone working for the government can take part in.  That means the set of recorded observations used in published studies needs to be shared.  Furthermore, the search for more plausible theories doesn't stop just because a paper has already been published.

Reference

Hacking, I (1975).  The Great Decision.  In: The Emergence of Probability.  Cambridge University Press, pp. 63-72.

Official Sound Track

Talking Heads (1977).  Don't Worry About the Government.  In: Talking Heads: 77, Track 8.


Tuesday, March 10, 2026

An Ode to Regression Analysis

Hello Again

It's been a while.  But the bug to write yet another essay has bitten me and I don't know what to do with it besides putting it here.  It more or less started with having a wiki article on the History of Probability called to my attention.  I was gratified to see that it opened by acknowledging the duality of probability that I figure is a matter of psychological fact.  But then, as usual in my experience, the rest of the article proceeded to focus on stochastic probability, aka frequency of occurrence.
 
Most annoying to me is that even though Ian Hacking’s The Emergence of Probability was referenced three times, the chapter (8) that discussed Pascal’s Wager wasn’t mentioned at all.  That’s where Hacking discussed the ins and outs of using probability trees (aka probability logic) to represent the evidential status of competing theories.

I get it; frequentist probability is much more amenable to a mathematical treatment than the evidential ancillary could ever hope to be; and that’s what the article is really about.  Hume (1739) and Hill (1965) resorted to subjective rules of evidence with no mathematics whatsoever rather than a deductive process where the premises automatically dictate the conclusions.  That doesn’t fit into a history of mathematical probability.

But I've been over all that before on this blog.  What's got me going again is that I've come up with another angle.  Quantifying stochastic probability has been done, but quantifying evidence is another thing altogether.

An Important Caution

Devising a mathematical methodology for assigning a numerical probability to competing theories is rather obviously not always possible.  For example, Pascal’s wager was on “God Is” vs “God Is Not”.  Deciding who the killer is in a murder mystery doesn't involve mathematics either even when it is quantified beyond a reasonable doubt.  Furthermore, mathematics is a deductive process; the conclusions must follow from the premises.  On the other hand, weighing evidence or generating hypotheses in the first place is inductive.  Or so they say, because it is also usually conceded that weighing evidence involves subjective judgment.  But when the theories themselves are mathematical then perhaps something useful can be done to make the evaluation not quite so subjective.  

Regression Analysis

As it turns out, if you ascribe discrepancies between a quantitative model and a set of observations to measurement error with a stochastic distribution then you can turn the estimation of model parameter values into a statistical problem.  It’s a neat trick.  Sure, you can use a ruler to draw a straight line through scattered data, but different line drawers may end up with somewhat different slopes.  But with least squares linear regression you get the same result every time.
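To make the "same result every time" point concrete, here is a minimal sketch of least squares for a straight line using the standard closed-form solution.  The function name and the data are made up for illustration; nothing here comes from a particular study.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scattered data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)
```

There is no judgment call anywhere in the calculation, so anyone who runs it on the same data gets the same slope and intercept.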

Linear regression can be performed with relatively straightforward mathematics.  But the model has to be linear.  However, with the aid of a computer using a trial and error methodology, where parameters are adjusted up and down to see whether the fit improves, least squares regression can be applied to any model.  Furthermore, it doesn’t even have to be least squares; any other methodology that weights the relative importance of discrepancies between model and observations may be used instead.  In particular, weighting unsquared residuals places less weight on large deviations than squared residuals do.
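The trial and error idea can be sketched in a few lines.  This is a deliberately crude search of my own devising for illustration, not any particular package's algorithm: it nudges each parameter up and down, keeps any change that improves the fit, and halves the step when nothing helps.  The function names, the starting values, and the data (with one deliberate outlier) are all assumptions.

```python
def total_penalty(model, params, xs, ys, penalty):
    """Sum the penalty over all residuals (observed minus predicted)."""
    return sum(penalty(y - model(x, params)) for x, y in zip(xs, ys))


def fit_by_trial_and_error(model, params, xs, ys, penalty,
                           step=1.0, min_step=1e-6):
    """Adjust each parameter up and down, keeping changes that improve the fit."""
    params = list(params)
    best = total_penalty(model, params, xs, ys, penalty)
    while step > min_step:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                loss = total_penalty(model, trial, xs, ys, penalty)
                if loss < best:
                    params, best, improved = trial, loss, True
        if not improved:
            step /= 2.0  # nothing helped at this step size, so try finer nudges
    return params, best


def line(x, params):
    slope, intercept = params
    return slope * x + intercept


# Hypothetical data on y = 2x, except for the last point, which is an outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 30.0]

squared_fit, squared_loss = fit_by_trial_and_error(
    line, [0.0, 0.0], xs, ys, lambda r: r * r)
absolute_fit, absolute_loss = fit_by_trial_and_error(
    line, [0.0, 0.0], xs, ys, abs)
print(squared_fit, absolute_fit)
```

With this made-up data the squared penalty should drag the line toward the outlier (a slope near 6), while the unsquared penalty should leave it near the slope of 2 that the other four points agree on.  A search this simple can stall on harder problems, which is why real software uses more careful optimizers.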

However, the underlying rationale for regression analysis is not entirely justified.  First, even if the discrepancies between model and observation are a result of measurement error, the actual distribution is usually unknown.  Second, the set of observations may not be entirely representative of the actual distribution.  Third, the model may not be entirely correct.  That is especially likely with multivariate analyses where mismodeling one quantitative relationship can end up with misestimation of the other model parameters as well.  At that point it may be time to consider a new hypothesis.

So calling regression analysis “statistical” or even mathematical is a big stretch.  But I still think it’s very useful because it is using data as evidence for models and theories.  In fact, it can be thought of as quantitative induction.  That is good, very good in fact.  Furthermore, it seems clear that regression analysis has a role to play in filling out probability trees for competing theories with numerical probability assignments that sum to one.

Quantifying the Probability of Competing Hypotheses

The Bayesian Strategy

Those who employ Bayes’ theorem definitely understand that probability is not the same as frequency of occurrence and they are also comfortable with assigning probabilities to competing models.  Known as Bayesian inference, this is accomplished by assimilating the alternative models into a supermodel and then performing a regression analysis that assigns greater probabilities to the model(s) that fit the best.
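As a rough illustration of how fit turns into model probabilities, here is a much simplified sketch.  It assumes two candidate models with fixed parameters, Gaussian measurement error with a known standard deviation, and equal prior probabilities; a full Bayesian analysis would also average over uncertain parameter values.  The models, data, and function names are all hypothetical.

```python
import math


def gaussian_log_likelihood(residuals, sigma):
    """Log-likelihood of the residuals under Gaussian error with known sigma."""
    n = len(residuals)
    return (-0.5 * sum((r / sigma) ** 2 for r in residuals)
            - n * math.log(sigma * math.sqrt(2.0 * math.pi)))


def posterior_model_probabilities(priors, log_likelihoods):
    """Bayes' theorem across models: posterior is proportional to prior times likelihood."""
    log_weights = [math.log(p) + ll for p, ll in zip(priors, log_likelihoods)]
    top = max(log_weights)  # subtract the largest value to avoid underflow
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical observations and two hypothetical competing models.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
models = [lambda x: 2.0 * x,   # model A: y = 2x
          lambda x: x + 3.0]   # model B: y = x + 3

sigma = 0.5          # assumed measurement error
priors = [0.5, 0.5]  # assumed prior probabilities, before seeing the data

log_likelihoods = [
    gaussian_log_likelihood([y - m(x) for x, y in zip(xs, ys)], sigma)
    for m in models
]
posteriors = posterior_model_probabilities(priors, log_likelihoods)
print(posteriors)
```

The output is a set of probabilities that sum to one, with the better fitting model getting the larger share.  Note where the judgment enters: in the priors and the assumed error distribution, before the data are consulted.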

A Bayesian analysis can also let subjective expert opinion be part of the process, but there’s a catch; the contributions from the experts come before the regression analysis.  That suffers from the same general problem of trying to make grading evidence a deductive exercise; it’s just not consistent with the way science works.  After all, the issue underlying hypotheses lies in evaluating whether they are true rather than how often they are true.

The Pearson Strategy

There are two sorts of correlation coefficients, aka r values.  The first measures the association between two different measurements on a set of observations (e.g. genetics and the occurrence of a disease).  But a Pearson correlation coefficient can also be used to measure the relationship between the values predicted by a model and those observed, and it’s generated by linear regression.  You can easily produce something analogous to the Pearson r value for any regression methodology.  The Bayes factor is also functionally equivalent to an r value.
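For the second use, the calculation is the ordinary Pearson formula applied to predicted and observed values.  A minimal sketch, with a hypothetical function name and made-up numbers:

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation between model predictions and observations."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o)
              for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Hypothetical observations and the predictions of two hypothetical models.
observed = [2.1, 3.9, 6.2, 7.8, 10.1]
model_a = [2.0, 4.0, 6.0, 8.0, 10.0]
model_b = [5.0, 5.5, 6.5, 6.0, 7.0]

print(pearson_r(model_a, observed), pearson_r(model_b, observed))
```

Since the predictions can come from any fitted model, linear or not, the same function works for any regression methodology.  One thing to keep in mind is that r only measures how well predictions and observations move together, so predictions that are uniformly too high or too low can still score well.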

Pearson himself thought the r value was useful for grading the strength of an inference (Porter, 1986).  It plugs in nicely to the first three Hill criteria, namely strength, consistency, and specificity.  It’s not too hard to argue that a model or theory with a higher r value deserves a higher probability assignment.  You could even devise an algorithm or equation that at least somewhat fairly directs the relationship.  Yes, it would be somewhat arbitrary, but I’ll take it all day over safety factors or default assumptions.
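Just to show what such an equation might look like, here is one admittedly arbitrary rule: weight each theory by the square of its r value (counting a negative r as zero) and rescale so the probabilities sum to one.  This particular rule is an assumption made up for illustration, not an established method, and the r values are hypothetical.

```python
def probabilities_from_r(r_values):
    """Turn r values for competing theories into probabilities that sum to one.

    Arbitrary illustrative rule: weight by r squared, treating negative r as
    zero support.
    """
    weights = [max(r, 0.0) ** 2 for r in r_values]
    total = sum(weights)
    if total == 0.0:
        # No theory has any support, so fall back to equal probabilities.
        return [1.0 / len(r_values)] * len(r_values)
    return [w / total for w in weights]


# Hypothetical r values for three competing theories.
r_values = [0.9, 0.6, 0.3]
probabilities = probabilities_from_r(r_values)
print(probabilities)
```

The output is only a starting point for filling out a probability tree.  Experts would still be free to argue that the numbers should be moved up or down.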

In Summary

There are two approaches for combining data and expert opinion.  The Bayesian approach starts with expert opinion and then uses data to produce final evidential judgments.  The Pearson-Hill approach produces a measure of how well the data fits each hypothesis, but leaves the final evidential judgment to experts.  I'll discuss pros and cons next.

References

Hacking, I (1975).  The Great Decision.  In: The Emergence of Probability.  Cambridge University Press, pp. 63-72.

Hill, AB (1965).  "The Environment and Disease: Association or Causation?". Proceedings of the Royal Society of Medicine. 58 (5): 295–300.

Hume, D (1739).  A Treatise of Human Nature.  Book I, Part III, Section XV.

Porter, TM (1986).  The Rise of Statistical Thinking 1820-1900. Princeton University Press.

Official Sound Track

Beatles (1969).  Come Together.  In: Abbey Road, Track 4.