Sunday, July 10, 2016

SPSG #13: Ending the Game

This chapter outlines strategies for overcoming the difficulties noted in earlier chapters. First, adopting some common legal strategies for separating Matters of Fact from Matters of Law could do wonders. The creation of job positions for Science Judges whose sole responsibility is to disentangle science matters from policy matters could facilitate that. Second, while eliminating the science-policy shell game entirely is probably not possible, there is no reason why it should be condoned or institutionalized. Therefore, the EPA assessment guidelines for cancer and noncancer endpoints both need to be rewritten. As the face of the Safety Assessment Paradigm, the Reference Dose background document should be rewritten to make it clear that the product of the assessment is a regulatory policy rather than a statement of scientific fact. As for the cancer risk assessment guidelines, instead of enshrining the default option with the Point of Departure, the guidelines should use probability trees to make the default option go away entirely. Furthermore, there is no reason to restrict the use of quantitative risk assessment to just cancer. However, solving that problem will create another problem: At least in public health, one legacy of the institutionalized use of the science-policy shell game is that risk management has been virtually eliminated as a federal job position. So, doing a risk assessment is of very little use if no one in the federal government has the responsibility for managing issues. That can change, but position descriptions will need to be rewritten. Third, research should be funded to support science, instead of supporting technocratic shell games. In particular, the enterprise of environmental epidemiology needs to be redesigned. Statistical significance testing should be eliminated as the primary means of drawing conclusions from data, and studies should instead be designed to increase or reduce the evidentiary weight accorded to causal theories.
Perhaps most importantly, observational data need to be shared. When it comes to analyzing data, regardless of what their source of funding is, investigators cannot be given complete deference in conveying what the data imply. As an academic recommendation, teaching Statistics and Probability as separate subjects would clear up more than a few nagging philosophical problems. Finally, the facade of impersonal scientific objectivity needs to be abandoned. Scientists should be both free to speculate and humble enough to admit their theories may be wrong.

SPSG #12: Personal Technique

This chapter is largely written in the first person, mainly for the purpose of disparaging the concept of scientific objectivity. It starts out by describing how several reorganizations dramatically affected the branch at the USFDA that I worked in for 25 years. In the end, it was swallowed up by the shell game. It then goes on to discuss the importance of recognizing the subjective nature of science, particularly when the science is unsettled and uncertain. The objectivity facade is partly attributable to a scientific writing style that takes the author out of discussions of factual issues, which hides the fact that personal opinions are being expressed. To demonstrate what science is really like, I walk through the personal choices I made in developing the dose-response model for arsenic and lung cancer that was used for the apple juice and rice risk assessments discussed in Chapter 11. Some of those choices were made by committee and some were not, but either way they all involved subjective scientific judgments made in a fog of uncertainty. In one case, I made a different choice than I had previously because new information influenced my subjective judgment about how to go about estimating lifetime risks from a prospective epidemiological study, which underscores the notion that “objective” reality evolves with scientific inquiry. The resulting dose-response model is then used to provide risk estimates for someone with a high-end (for the United States) arsenic intake. In addition to the lifetime risk estimates that were also given in the FDA reports on apple juice and rice, estimated changes in average life expectancy are provided. For the purpose of making an individual choice, the latter measure is far more meaningful.
The chapter then suggests that the inability of EPA to provide a dose-response characterization for arsenic may stem from a wrongheaded demand for objectivity that dictates the use of the wrong probability and the wrong personnel for a job that needs statistical theory instead of statistical probability.

Saturday, July 9, 2016

SPSG #11: The Technocracide

Since it kills the Safety Assessment Paradigm every time, it has always been very clear that arsenic would never make it as a food additive. Although arsenic commonly occurs in food as a contaminant, the concern for arsenic in food was always mitigated by the fact that the largest exposures have generally been from drinking water. That equation changed in 2001, when the EPA issued a regulation for arsenic in drinking water. Because the drinking water rule required a cost-benefit analysis, the decision process was supported by a risk assessment that produced risk estimates; in spite of the guidelines, it was consistent with the risk assessment paradigm. However, the Office of Water did find it necessary to hire outside consultants to accomplish that goal. In any case, arsenic exposure from water was reduced and arsenic in food became a relatively bigger issue as a result. In response to public attention in 2011, the FDA issued guidance "action" levels for arsenic in apple juice in 2013 and rice in infant foods in 2016. From a risk management standpoint, both efforts were abject failures. Although risk assessments were produced, they didn't really support the guidance in either case. At least part of the reason is that setting levels usually isn't an effective way to manage risks from contaminants. Preventing something from getting into the food in the first place can be far easier, but that isn't always possible. Nonetheless, the FDA went ahead with action levels anyway. In the case of apple juice, it isn't too hard to figure out why; the FDA commissioner publicly promised an action level before the risk assessment was done. The reasoning that went into the rice guidance is more mysterious. Even though there is nothing in the risk assessment to indicate that infants and pregnant women are uniquely susceptible to arsenic, the FDA advised both groups to reduce their rice intake, but gave no advice for anyone else.
In fact, the exposure assessment indicated that the greatest exposures to arsenic from rice are in adult males. On a more positive note, the FDA cancer risk assessments for both apple juice and rice solved the default option problem by using probability trees to represent the theoretical probability associated with the dose-response relationship for arsenic and both lung and bladder cancer. The main lesson to be learned from those exercises is that regardless of how well an assessment represents current science, if the message takes precedence over the result, there will be no reason to expect public health to improve.

SPSG #10: The Paradigm War

Methylmercury in fish has been a major issue for both the EPA and the FDA since several epidemics occurred in Japan and Iraq in the 60s and 70s. At about the same time (early 90s) as the FDA started quantifying risks for methylmercury in fish and issuing consumer advice for commercial seafood, the EPA started giving recreational fish consumption advice based on the EPA Reference Dose (RfD). In 1999 Congress asked the National Academy of Sciences (NAS) to evaluate the RfD for methylmercury, and a report was issued in 2000. The fact that Congress even asked the question of the NAS cemented in many people's minds that the RfD was and is a statement of scientific fact. That meant that if it was true for EPA then it had to be true for FDA too, and as a result any attempt to quantify the risks and provide information about what the risks are came to be viewed as a political attempt to undercut the science. By 2004, the FDA and EPA had agreed to give joint advice for fish consumption, but there was no agreement about what the basis or the rationale for the fish advisory was. While the EPA thought the RfD was paramount, the FDA chose to pursue a quantitative strategy that balanced benefits and risks; so the fish risk-benefit assessment was an FDA-only affair. But perhaps the most important difference was about what information, if any, would be given to consumers. The RfD treats consumers in the same manner as it treats agency managers; it decides for them, and as a result there is no basis for providing the information to consumers that will let them decide for themselves. As an alternative, a few representative risk estimates are provided for the consumption of fish during pregnancy using a version of the risk assessment model developed for the FDA that is designed to estimate risks for individual consumers.

Tuesday, July 5, 2016

SPSG #9: A Practical Guide to Theoretical Probability

This Risk Analysis methodology chapter is the applied version of the philosophical discussion of probability presented in Chapter 2. It also fixes the flaws in the Redbook paradigm discussed in Chapter 4, resulting in the Guillotine paradigm. It begins with a discussion of characterizing uncertainty when there are both statistical and theoretical probabilities involved. While a theoretical probability does not need to be quantified when it is the only probability involved or when there is no decision at stake, giving it the same epistemic standing as a statistical one is unavoidable when both matter. However, that does not mean a theoretical probability can be used as if it were a statistical probability. A theoretical probability is perhaps true always or perhaps false always; it is not true sometimes and false at other times. The discussion then turns to the problem of assigning probabilities to alternative theories. Declaring that the probabilities of all the alternatives sum to one is a simple matter, but deciding the probability of each theory is not. Since theoretical probabilities are subjective, depending in some way on the opinions of those who have one (usually experts) is inevitable. However, instead of asking experts to assign probabilities to theories directly, there are advantages to garnering opinion in the form of evidential weights, where each alternative theory is evaluated more or less independently. Although formal weight-of-the-evidence (WoE) schemes have been developed for many regulatory purposes, they are not usually thought of as quantitative exercises. However, it has been done and could be done better. It is also argued that WoE analysis and dose-response modeling need to be more tightly integrated, especially when the judgment that there is a causal relationship becomes more likely than not. First, the shape of the dose-response relationship may influence the judgment that there is a causal relationship.
Second, the last vestiges of causal uncertainty may not matter if the estimated risks are too low to matter or high enough to be a concern even if they are only probable.

Monday, July 4, 2016

SPSG #8: The Wrong Probability

This chapter is about the problems associated with using statistical probability as the only probability, especially in epidemiology. While different scientific disciplines typically rely on somewhat different collections of convincing arguments, the conduct of epidemiology can aptly be compared to a trial for murder. Since the issue is causality, theoretical probability is front and center. Yet, at least when designing and publishing studies, environmental epidemiologists often rely on statistical significance testing for drawing conclusions. Austin Bradford Hill disparaged this practice over 50 years ago, and he is still quite right; statisticians are using the wrong probability. But that isn't the only problem. Epidemiologists (or their statisticians) often treat measures designed to quantify strength of association for the purpose of arguing causality as if they were measures of effect, thereby completely missing the point of having them in the first place. Next, epidemiologists are often reluctant to share raw data. While there are many possible explanations for this practice, the fact that other analysts would be able to use the data to explore and support theories not utilized in the published report is chief among them. The data sharing problem becomes especially evident when the theories used in published analyses are obviously wrong, either when they are first published, or perhaps later. This more or less forces the court of scientific opinion to rely on hearsay evidence. As an example of that, the use of log transformed measures of dose in multivariate regression analyses is discussed. Since it is an established analytical procedure, there is a tendency to think of regression analysis as a "theory-free" analysis that provides conclusions that are largely empirical. But that isn't true at all. Linear regression analysis presumes that the quantitative dose-response relationship is linear.
Similarly, doing a linear regression analysis with the log of dose presumes that the quantitative dose-response relationship is loglinear. But that results in a supralinear function where not only do the effects get bigger as the dose gets smaller, the effect approaches infinity as the dose approaches zero. Even though that's quite impossible, the practice continues, and that is probably because testing a theory that is definitely wrong is a reliable way of producing scary statistically significant low dose effects.
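
The point about log-transformed dose can be checked with a few lines of arithmetic. The sketch below is only an illustration, not an analysis from the book: the function names and the coefficient value are hypothetical. It assumes a fitted model of the form response = intercept + slope * ln(dose), whose effect per unit of dose is slope / dose, and shows that this quantity grows without bound as the dose shrinks.

```python
import math


def loglinear_response(dose, intercept, slope):
    """Predicted response under the assumed model: intercept + slope * ln(dose)."""
    return intercept + slope * math.log(dose)


def effect_per_unit_dose(dose, slope):
    """Derivative of the assumed log-dose model with respect to dose (slope / dose)."""
    return slope / dose


# Hypothetical coefficients, chosen only for illustration.
INTERCEPT = 100.0
SLOPE = -2.0

for dose in (10.0, 1.0, 0.1, 0.01, 0.001):
    predicted = loglinear_response(dose, INTERCEPT, SLOPE)
    per_unit = effect_per_unit_dose(dose, SLOPE)
    print(f"dose={dose:<6} predicted response={predicted:8.2f} effect per unit dose={per_unit:10.1f}")
```

Each tenfold reduction in dose multiplies the effect per unit of dose by ten, which is the supralinear behavior described above; at a dose of exactly zero the model is undefined.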

SPSG #7: The Sociology of Technocracy

This chapter is like a sociology of science essay, except that it is really about politicians in lab coats, with most of it being concerned with toxicology. Many of the roots of the SPSG can be found in academia. First, there is a discussion of the Information Quality Act of 2002 that sought to make the information used by the federal government more objective. Yet the Office of Management and Budget interpreted "objectivity" as meaning "peer reviewed". While that could potentially subject scientific claims to cross-examination by outside experts, without separation of science and policy, peer review can also be used to prevent cross-examination altogether. While toxicology initially was primarily associated with the drug industry, it has become increasingly concerned with environmental regulation. The growth of environmental toxicology programs that are almost entirely concerned with government as a career path is a prime example. As a result, environmental toxicologists can potentially complete their careers by only talking among themselves. Although the Society of Toxicology did not have a Code of Ethics when it was formed in 1961, it does now. Many of the recommendations seem to be political statements that on closer examination aren't necessarily ethical at all. In particular, members are required to be "advocates of public health" and "Abstain from professional judgments influenced by undisclosed conflict of interest". Both of these statements favor the interests of public sector members (i.e., technocrats) over those with private interests. Nutritionists can be technocrats too, and when nutrients are also toxic, that can create a clash of technocratic cultures. While nutritionists don't use safety factors, when it comes to considering dose-response relationships, their tradition is quite limited, perhaps by design.
In the “Risk Analysis Paradigm”, Risk Communication is often recognized as a third component of the regulatory decision process, along with Risk Assessment and Risk Management. However, the roots of the discipline lie in the study of consumer responses, which makes it well suited for selling a decision that has already been made. Quantification can be part of the SPSG too. Because of their ability to seemingly automate a decision process, often by ignoring or assiduously hiding theoretical probability, statisticians can play the shell game too. Since they often equate the behavior of scientists with science, sociologists sometimes seem to backhandedly endorse the shell game.

Note

Although it wasn't my original intent, this chapter does read like a populist manifesto.  

Sunday, July 3, 2016

SPSG #6: Two Charades

While the earlier chapters are largely historical, this is the chapter that begins to speak of current practice, and it also gives the book its title. Quite simply, the Science-Policy Shell-Game (SPSG) is a technocratic game played by treating science and policy as if they were interchangeable in both directions. It consists of two components. On the one hand, a statement is purported to be science in front of a political audience. On the other hand, the same statement is purported to be policy in front of a scientific audience. The end result is that a regulatory decision is shielded from both scientific and political scrutiny. Although there were many examples of it before, the SPSG was deliberately institutionalized by a committee of EPA technocrats in 1986. The object of their creation was the EPA Reference Dose (RfD). As an example of the Safety Assessment Paradigm, the RfD was no different from the ADI, except for one thing: It was claimed to be a scientific fact. As a further technocratic assault, the 2005 EPA Cancer Assessment Guidelines replaced plausible worst-case estimates with what amounted to a ban on theoretical reasoning. That was accomplished by interposing a "Point of Departure" between toxicological theory and regulatory decision making. That made no sense then, and it still doesn't. Why the SPSG is played is debatable; it may be some combination of agency managers hiding decisions they would rather not defend, scientists who don't want to cede regulatory control to agency managers or elected officials, statistical decision theorists who claim to make decision processes objective, or it may just be career maintenance and research dollars. But whatever the reasons are, the SPSG is antiscientific and antidemocratic. It cuts off scientific discussion from policy making with technocratic short cuts, and it leads scientists and the public to believe that the decisions faced by the government and themselves are far simpler than they really are.
The “Fifth Branch” is introduced as a term to describe the technocrats inside and outside of the federal government who play the SPSG.

SPSG #5: Dose-Response Theory

This chapter is a compendium of pharmacology and toxicology theory, and since it doesn’t build on any of the previous chapters, it is essentially a third introductory chapter. Although it does get a bit technical, it is designed to give a sense of what the quantitative issues are without delving into mathematics. Although it isn’t necessary for most of the later chapters, it is provided as background material for the discussions involving theoretical probability in some of them, especially eight through eleven. The chapter commences with a survey of basic concepts including biochemical mechanisms underlying the interaction of a toxic chemical with a biological molecule, and toxicokinetic theory that describes what happens before it gets there. There is also discussion of statistical theories like probit analysis that treat the causal issue as a problem of describing how much the dose required to produce a given effect varies in a population. There are also hybrid or two-dimensional models that describe both the magnitude of individual effect and population variation. It has long been recognized that there is a temporal component to dose-response relationships that can vary between both chemicals and effects; yet the temporal component is often ignored. All said, biology is complex and dose-response theory is imperfect; there is plenty of room for improvement. Nonetheless, one thing seems clear; the dose makes the poison. Effects tend to get bigger as the dose gets bigger, but not necessarily in proportion.


Saturday, July 2, 2016

SPSG #4: The Risk Assessment Paradigm

This chapter is about the Risk Assessment Paradigm (RAP) that was originally introduced to chemical safety in food and the environment as a mechanism for dealing with carcinogens in food. The advantages of the RAP were touted by a 1983 report from the National Academy of Sciences (NAS) that is often referred to as the Redbook. In particular, the RAP was widely heralded as a democratic alternative to the technocratic SAP. This was accomplished by distinguishing a risk assessment process that characterizes what the risks are from a risk management process that uses the information from the risk assessment to make a decision. However, the Redbook version of the RAP had some significant flaws that kept it from really accomplishing the goal of separating science from policy. Therefore, this book also uses a more basic definition of RAP called "the Guillotine Paradigm", which is predicated on the separation of "Is" from "Ought": The scientific discussion of what is known is a separate endeavor from the policy decision of what ought to be done. The problems with the Redbook were identified in subsequent NRC reports. First, the Redbook paradigm begins with a Hazard Identification step. Since this is to be done by scientific experts, it gave them control over what questions are to be answered. That problem can be addressed by viewing the RAP as an iterative process, where the policy deliberation identifies the questions, and the risk assessment provides the answers. Viewing the RAP as a dialogue instead of a monologue opens the decision process up to democratic participation. Secondly, the Redbook embraced the notion of the “default option”, where theoretical probabilities were to be resolved by giving one theoretical alternative preference as a matter of regulatory policy. That problem can be solved with a probability tree that acknowledges the theoretical alternatives.
Thirdly, the Redbook paradigm was derived from procedures used at the FDA for dealing with cancer and the Delaney Clause, and it was therefore sometimes interpreted as being applicable only to cancer.  Although the RAP is named after the analytical component, the fact that policymaking (aka Risk Management) is recognized as a distinct process is at least as important.  The RAP is especially indispensable when regulatory decision making requires a rational process where the risk is balanced against something else, such as a benefit, another risk, or the cost of avoiding it.

SPSG #3: The Safety Assessment Paradigm

This chapter is about the procedure originally developed for the premarket approval of pesticides and food additives, which is referred to throughout the book as the Safety Assessment Paradigm (SAP). The SAP gave birth to the Acceptable Daily Intake (ADI), which was the daily exposure to a chemical that would be considered acceptable by the FDA. The ADI was originally calculated by dividing the highest dose found to not have an observable (statistically significant) effect in a laboratory study by a safety factor of 100. There are three key features of Safety Assessment that are especially worth noting. First, since the SAP delegates the regulatory decision to experts, it is thoroughly technocratic. Second, premarket approval and the SAP were designed to be precautionary; a chemical couldn't be used until it was shown to be safe. Third, it is presumed that the way to limit exposure to a chemical is to set a level that the government will consider to be acceptable. Although the SAP has evolved from its 1954 introduction, it still retains its unmistakable premarket approval origins.
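
The original ADI arithmetic described above is simple enough to write down. The following is a minimal sketch, assuming the dose is expressed in mg per kg of body weight per day; the function name, the units, and the example dose are illustrative assumptions, not taken from the book or from any agency document.

```python
def acceptable_daily_intake(no_effect_dose, safety_factor=100.0):
    """Divide the highest dose with no observable effect by a safety factor.

    no_effect_dose: highest laboratory dose with no statistically significant
        effect (units assumed here: mg/kg body weight/day).
    safety_factor: 100 in the original calculation described in the text.
    """
    if no_effect_dose < 0:
        raise ValueError("no_effect_dose must be non-negative")
    if safety_factor <= 0:
        raise ValueError("safety_factor must be positive")
    return no_effect_dose / safety_factor


# Hypothetical example: a no-effect dose of 50 mg/kg/day.
print(acceptable_daily_intake(50.0))
```

Note that nothing in the calculation estimates a risk; the result is a level deemed acceptable, which is the feature of the SAP that the chapter emphasizes.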


Friday, July 1, 2016

SPSG #2: Two Probabilities and Frequency Too

This is a history of philosophy of science presentation, and it sets up the terminology used throughout the rest of the book: If the first chapter is about "safety", this one is about "probability". There are two very different concepts of probability, both of which are very old and generally familiar. From an etymological standpoint, the legal concept of probability dates back to Roman law. In common parlance, the evidential form of probability is being used whenever a proposition is said to be "probably true". Even though it wasn't called probability until Pascal and friends gave it that name in the 17th century, the concept of chance and its relationship to frequency of occurrence has been around since Aristotle. In common parlance, the probability of chance is being used whenever it is said that an event will "probably happen". In scientific and technical literature, it is the second "statistical" meaning of probability that is used almost exclusively, and that is almost certainly because it is more objective in an empirical sense. However, the other subjective form of probability, which is called "theoretical" or "evidential" probability in the book to emphasize its role in science, can still be found in scientific discussion all the time, but it usually appears in the words rather than the numbers.

There are two other important concepts introduced in this chapter as well. First, as a result of the statistical definition of probability, uncertainty and frequency of occurrence are often treated as identical concepts. That is incorrect both from a grammatical standpoint, and because even without the appendage of "probability", statistical theories concerning frequency of occurrence are important in many scientific disciplines. For example, in public health, regulatory issues often revolve around evidential probabilities of statistical theories. Second, abstract representations of probability are also sometimes referred to as probability themselves without reference to usage, which leads to yet another opportunity for semantic confusion. As they are taught in introductory courses concerned with probability and statistics, mathematical probability distributions that can be used to represent either frequency of occurrence or the probability of chance are well known. However, even though theoretical probability is clearly a matter of degree, it is much harder to quantify. A probability tree can be used to represent theoretical probability and provide a quantitative interpretation as well. The basic concept is very simple: The probability of all alternative theories or hypotheses under consideration sums to one. Scientific evidence may then be weighed in order to determine which hypotheses, and to what degree, are more probable than the others. Causal relationships are the most common issue in which scientific issues involving theoretical probability arise. For example, whether or not a particular chemical causes cancer, and if so how, involves theoretical probability. Although they have evolved somewhat, guidelines for establishing causal relationships in science have been around for centuries. They have been around in law for even longer than that.
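
The probability tree described above can be sketched in a few lines. This is only an illustration under stated assumptions: the nested-dictionary representation, the function name, the branch labels, and every probability value are hypothetical and are not taken from the book. The one rule the sketch enforces is the one given in the text, that the probabilities of the alternatives at each branching point sum to one.

```python
def leaf_probabilities(tree, prob=1.0, path=()):
    """Yield (path, probability) for every leaf of a nested probability tree.

    tree maps each branch label to (conditional probability, subtree or None).
    The alternatives at each branching point must sum to one.
    """
    total = sum(p for p, _ in tree.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"probabilities at {path or 'root'} sum to {total}, not 1")
    for label, (p, subtree) in tree.items():
        if subtree is None:
            yield path + (label,), prob * p
        else:
            yield from leaf_probabilities(subtree, prob * p, path + (label,))


# Hypothetical tree for "does this chemical cause cancer, and if so how?"
# All numbers are made up for illustration.
tree = {
    "no causal relationship": (0.3, None),
    "causal relationship": (0.7, {
        "linear at low dose": (0.4, None),
        "threshold": (0.6, None),
    }),
}

leaves = dict(leaf_probabilities(tree))
for path, p in leaves.items():
    print(" -> ".join(path), round(p, 2))
```

Because each branching point sums to one, the leaf probabilities also sum to one, so every theoretical alternative under consideration is accounted for without singling one out as a default.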

Notes

Perhaps this should be the first chapter, but in the interest of not straining credulity at the outset I have put it second.

After having been at the FDA for several years, I ran into my first model uncertainty problem.  I traipsed over to the Office of Mathematics to inquire about the proper methodology for calculating the probability of a model being true. I asked a statistician who had been at the FDA for thirty years, and the answer I got astounded me:
You are not allowed to ask that question 
I obtained a second opinion from a younger statistician, this time just asking how to go about identifying the best model among several, and got an answer that I found to be no more satisfactory:
Find a biologist and beat it out of them
As a biologist who did not want to be beaten, I soon embarked upon a philosophy of science reading binge that lasted several years in the mid-90s. The main thing I learned is that there is another probability that is quite different from the one the statisticians were using. However, in the end I was unable to learn very much about it from the philosophers. This chapter is a compilation of the few gems that I managed to gather from my survey. Perhaps not surprisingly, some of the most important insights came from practicing scientists and risk analysts. In any case, I managed to answer my own question, at least to my own satisfaction. Although beating the probabilities out of the biologists can be a fair characterization of the method, the beating can be avoided with a willing confession. The description of that solution begins in this chapter and finishes in chapter 9.

However, I ran into other difficulties subsequently.  It seems that not everyone wanted the problem to be solved at all.  The rest of the book is about that.


SPSG #1: Food Law and Chemical Safety

This is an historical review of food law concerned with chemical safety. It commences with a discussion of the 1906 Pure Food and Drug Act and ends with the Dietary Supplement amendments of 1994. All told, there are about a half dozen statutes that pertain to the regulation of chemicals in food by the U.S. Food and Drug Administration (FDA), and they differ in many ways. While virtually all of the laws governing the safety of chemicals in food require scientific interpretation, the way in which scientific expertise is to be utilized to create a legal definition of what is considered safe or not by the agency necessarily differs among statutes. The statutes differ in the definition of harm, the burden of proof, and in their evidentiary standards. For the purposes of the rest of the book, the most important distinction is between additives and contaminants. Food additives are deliberately added to food, have an intended use, must be approved by the FDA before they can be used, and the burden of proof lies on the manufacturer. Because the approval process is structured and planned, the way in which scientific expertise is utilized also occurs in a somewhat predetermined manner. The evidentiary standard for arguing that an additive has not been shown to be safe is also very low; it must only be argued that there is substantial evidence that harm is possible. Contaminants are, by definition, present in food unintentionally and many occur naturally. Therefore, they don't have an intended use, they don't need to be approved, and the burden of proof is on the government to show that the contaminant is harmful enough to be worthy of regulation. The judicial branch often makes the final determination of what will be considered "safe". Other classes of chemicals fall somewhere between those two extremes. In particular, food additives in use prior to the Federal Food, Drug, and Cosmetic Act amendments of 1958 were exempted from the approval process.
This created a class of chemicals that are “Generally Recognized as Safe” where FDA approval is obtained by demonstrating history of use rather than going through a rigorous testing regimen.  As the end result, there is no consistent definition in either law or science about what the word "safe" really means.