Tuesday, March 3, 2015

Convincing Arguments

The Voice of Experience

In his seminal paper on the evaluation of epidemiological data, Sir Austin Bradford Hill sought to share his experience in occupational epidemiology with those working in the nascent field of environmental epidemiology.  In doing so, he laid out the problem of detecting “relationships between injury, sickness, and conditions of work” as follows:

“There are, of course, instances in which we can reasonably answer these questions from the general body of medical knowledge.  A particular, and perhaps extreme, physical environment cannot fail to be harmful; a particular chemical is known to be toxic to man and therefore suspect on the factory floor.  Sometimes, alternatively, we may be able to consider what might a particular environment do to man, and then see whether such consequences are indeed to be found.  But more often than not we have no such guidance, no such means of proceeding; more often than not we are dependent on our observation and enumeration of defined events for which we then seek antecedents.  In other words we see that the event B is associated with environmental feature A, that, to take a specific example, some form of respiratory illness is associated with dust in the environment.  In what circumstances can we pass from this observed association to a verdict of causation?”

The criteria laid out by Hill collectively provide a framework for grading the evidence for a causal relationship.  Briefly, these are as follows:
  • Strength.  This criterion expressly states that association is evidence for causation; it also suggests that the strength of the evidence rises with the degree of association.
  • Consistency.  This criterion suggests that observation of the same association by different people at different times strengthens the evidence.  Replicated associations are to be preferred over those that are not.
  • Specificity.  Hill suggests that an association with one disease rather than many slightly strengthens the argument for causality.
  • Temporality.  Obviously, the cause must precede the effect.  That said, the evidence is generally stronger when the putative effect occurs immediately.
  • Biological Gradient.  The evidence is strengthened if there is a quantitative link between the magnitude of the putative causal agent and the magnitude of the outcome.  Or, in other words, a dose-response relationship is to be expected.
  • Plausibility.  While Hill stipulates that biological plausibility can strengthen an argument for causation, he also acknowledges that current knowledge may be inadequate or even mistaken.  There also may be multiple plausible explanations for an association, only some of which entail causation.
  • Coherence.  A causal interpretation is weakened if contradicted by other known facts.  However, the caveats concerning the limits of current knowledge also apply here.
  • Experiment.  By removing a putative cause, an intervention will test a causal hypothesis.  This is also a form of evidence by association, but here the association, if there is one, is produced deliberately.
  • Analogy.  A causal argument is more acceptable when it bears a similarity to other causal arguments that have already been accepted.
Although these nine criteria are sometimes taken as a sort of gospel, it is clear that Hill intended them only as rough guidelines.  As a collection of arguments that scientists often give to support their theories, they were not novel even 50 years ago.  The associational criteria (strength, consistency, specificity, and experiment) may be grouped together under the more general concept of empirical support, while the other criteria may all be invoked as forms of theoretical support.
Some of the Hill criteria are readily amenable to quantification.  In particular, odds ratios and relative risks are often used to characterize strength of association, and linear regression techniques may be used to demonstrate a dose-response trend.  However, even though most of the theoretical criteria are also matters of degree, any formal methodology used to measure their evidential weight is apt to be as subjective as the theories it is applied to.  In addition, even though Hill clearly thinks some of the criteria are more important than others, he provides no guidance as to how the criteria should be weighed together to provide an overall grading of the evidence.  As a result, the enterprise of weighing evidence for causality is inherently subjective, and is often done by committee or workgroup.
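
To make the quantifiable criteria concrete, here is a minimal sketch of how a relative risk, an odds ratio, and a dose-response slope might be computed.  The counts, doses, and incidence values are invented for illustration, and the names (TwoByTwo, least_squares_slope, and so on) are hypothetical helpers, not anything taken from Hill's paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TwoByTwo:
    """Counts from a hypothetical cohort study (all numbers are made up)."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int


def relative_risk(t: TwoByTwo) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = t.exposed_cases / (t.exposed_cases + t.exposed_noncases)
    risk_unexposed = t.unexposed_cases / (t.unexposed_cases + t.unexposed_noncases)
    return risk_exposed / risk_unexposed


def odds_ratio(t: TwoByTwo) -> float:
    """Odds of disease among the exposed divided by odds among the unexposed."""
    return (t.exposed_cases * t.unexposed_noncases) / (
        t.exposed_noncases * t.unexposed_cases
    )


def least_squares_slope(doses: Sequence[float], responses: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line of response on dose."""
    n = len(doses)
    mean_d = sum(doses) / n
    mean_r = sum(responses) / n
    sxy = sum((d - mean_d) * (r - mean_r) for d, r in zip(doses, responses))
    sxx = sum((d - mean_d) ** 2 for d in doses)
    return sxy / sxx


# Illustrative counts only: 30 of 100 exposed and 10 of 100 unexposed fall ill.
table = TwoByTwo(exposed_cases=30, exposed_noncases=70,
                 unexposed_cases=10, unexposed_noncases=90)
print("relative risk:", round(relative_risk(table), 3))
print("odds ratio:", round(odds_ratio(table), 3))

# Illustrative dose groups and the incidence observed in each.
doses = [0.0, 1.0, 2.0, 3.0]
incidence = [0.10, 0.15, 0.20, 0.25]
print("dose-response slope:", round(least_squares_slope(doses, incidence), 3))
```

A ratio well above 1 speaks to the strength criterion, and a positive slope speaks to the biological gradient; neither number says anything about the theoretical criteria, which is where the subjectivity enters.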

The IARC Weight of the Evidence Categories

Since 1972, the best-known agency tasked with conducting weight-of-the-evidence evaluations has been the United Nations-sponsored International Agency for Research on Cancer (IARC).  IARC-sponsored workgroups have evaluated hundreds of chemicals and substances with the goal of placing each in one of four evidential categories (from IARC, 2006):


  • Sufficient evidence of carcinogenicity: The Working Group considers that a causal relationship has been established between exposure to the agent and human cancer. That is, a positive relationship has been observed between the exposure and cancer in studies in which chance, bias and confounding could be ruled out with reasonable confidence. A statement that there is sufficient evidence is followed by a separate sentence that identifies the target organ(s) or tissue(s) where an increased risk of cancer was observed in humans. Identification of a specific target organ or tissue does not preclude the possibility that the agent may cause cancer at other sites.
  • Limited evidence of carcinogenicity: A positive association has been observed between exposure to the agent and cancer for which a causal interpretation is considered by the Working Group to be credible, but chance, bias or confounding could not be ruled out with reasonable confidence.
  • Inadequate evidence of carcinogenicity: The available studies are of insufficient quality, consistency or statistical power to permit a conclusion regarding the presence or absence of a causal association between exposure and cancer, or no data on cancer in humans are available.
  • Evidence suggesting lack of carcinogenicity: There are several adequate studies covering the full range of levels of exposure that humans are known to encounter, which are mutually consistent in not showing a positive association between exposure to the agent and any studied cancer at any observed level of exposure. The results from these studies alone or combined should have narrow confidence intervals with an upper limit close to the null value (e.g. a relative risk of 1.0). Bias and confounding should be ruled out with reasonable confidence, and the studies should have an adequate length of follow-up. A conclusion of evidence suggesting lack of carcinogenicity is inevitably limited to the cancer sites, conditions and levels of exposure, and length of observation covered by the available studies. In addition, the possibility of a very small risk at the levels of exposure studied can never be excluded.
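
The requirement in the last category for "narrow confidence intervals with an upper limit close to the null value" can be illustrated with a short sketch.  It uses the common log-scale (Wald-type) approximation for the confidence interval of a relative risk, which is a textbook method and not something the IARC Preamble prescribes.  The counts and the cutoff of 1.5 are assumptions made purely for illustration.

```python
import math


def rr_confidence_interval(cases_exposed, total_exposed,
                           cases_unexposed, total_unexposed, z=1.96):
    """Relative risk with an approximate 95% interval built on the log scale."""
    rr = (cases_exposed / total_exposed) / (cases_unexposed / total_unexposed)
    # Standard error of ln(RR) for cumulative-incidence (cohort) data.
    se_log_rr = math.sqrt(
        1 / cases_exposed - 1 / total_exposed
        + 1 / cases_unexposed - 1 / total_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical cohort counts: the same number of cases in both groups.
rr, lower, upper = rr_confidence_interval(100, 10_000, 100, 10_000)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")

# Hypothetical cutoff for "close to the null"; IARC gives no numeric threshold.
UPPER_LIMIT_CUTOFF = 1.5
print("upper limit below cutoff:", upper < UPPER_LIMIT_CUTOFF)
```

Even with a point estimate sitting exactly on the null, the interval still admits a modest excess risk, which is the sense in which a very small risk can never be excluded.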


Although placement into categories is clearly concerned with causal interpretation of epidemiological data, IARC evaluations do not formally use the Hill criteria.  However, the lines of reasoning used in IARC monographs are found on Hill’s list.  In particular, strength of association and theoretical arguments are both brought forward in the evaluations.

Although IARC evaluations are primarily concerned with human carcinogenesis, they generally consider the evidence for carcinogenicity in animals as well.  In some cases, the overall classification turns on how the human and animal evidence combine.  In particular, a Group 2A designation (probably carcinogenic to humans) typically indicates limited evidence in humans together with sufficient evidence in experimental animals, while Group 2B (possibly carcinogenic to humans) typically indicates limited evidence in humans with less than sufficient evidence in animals, or inadequate evidence in humans with sufficient evidence in animals.

The IARC weight of the evidence evaluations are implicitly geared towards a two-node probability tree.  Either the chemical is carcinogenic, or it is not.  It is also clear that, even though it is not numerically defined, the probability is graded; the probability that a chemical is carcinogenic can be very low or very high.

References

Hill, Sir Austin Bradford (1965).  The Environment and Disease: Association or Causation?  Proc Royal Soc Med 58:295-300.

International Agency for Research on Cancer (2006).  The Preamble to the IARC Monographs.  

Official Post Soundtrack


Burgess, M and God, Jrs (1994).  Up On the Hill.  In: Zima Junction, Track 9


Post Notes

This is thesis #3.  Mostly standard fare.  The two twists are a) grouping the Hill criteria into empirical and theoretical categories, and b) suggesting the use of weight of the evidence evaluations for assigning weight to probability tree nodes.
