Emilio A.L. Gianicolo, Dr. rer. physiol.,1,2,* Martin Eichler, Dr. phil.,3 Oliver Muensterer, Univ.-Prof. Dr. med.,4 Konstantin Strauch, Prof. Dr. rer. nat.,1,5 and Maria Blettner, Prof. Dr. rer. nat.1
1 Institute for Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg University of Mainz
2 Institute of Clinical Physiology of the Italian National Research Council, Lecce, Italy
3 Technical University Dresden, University Hospital Carl Gustav Carus, Medical Clinic 1, Dresden
4 Department of Pediatric Surgery, Faculty of Medicine, Johannes Gutenberg University of Mainz
5 Institute of Genetic Epidemiology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg; Chair of Genetic Epidemiology, Institute for Medical Information Processing, Biometry, and Epidemiology, Ludwig-Maximilians-Universität, München
* Institut für Medizinische Biometrie, Epidemiologie und Informatik, Universitätsmedizin der Johannes Gutenberg-Universität Mainz, Abteilung Epidemiologie und Versorgungsforschung, Obere Zahlbacher Str. 69, 55131 Mainz, Germany, [email protected]

Received 2019 Aug 2; Accepted 2019 Nov 18.

Abstract

Background
In clinical medical research, causality is demonstrated by randomized controlled trials (RCTs). Often, however, an RCT cannot be conducted for ethical reasons, and sometimes for practical reasons as well. In such cases, knowledge can be derived from an observational study instead. In this article, we present two methods that have not been widely used in medical research to date.

Methods
The methods of assessing causal inferences in observational studies are described on the basis of publications retrieved by a selective literature search.

Results
Two relatively new approaches, regression-discontinuity methods and interrupted time series, can be used to demonstrate a causal relationship under certain circumstances. The regression-discontinuity design is a quasi-experimental approach that can be applied if a continuous assignment variable is used with a threshold value. Patients are assigned to different treatment schemes on the basis of the threshold value. For assignment variables that are subject to random measurement error, it is assumed that, in a small interval around a threshold value, e.g., cholesterol values of 160 mg/dL, subjects are assigned essentially at random to one of two treatment groups. If patients with a value above the threshold are given a certain treatment, those with values below the threshold can serve as a control group. Interrupted time series are a special type of regression-discontinuity design in which time is the assignment variable, and the threshold is a cutoff point in time. This is often an external event, such as the imposition of a smoking ban.
A before-and-after comparison can be used to determine the effect of the intervention (e.g., the smoking ban) on health parameters such as the frequency of cardiovascular disease.

Conclusion
The approaches described here can be used to derive causal inferences from observational studies. They should only be applied after the prerequisites for their use have been carefully checked.

The fact that correlation does not imply causality was frequently mentioned in 2019 in the public debate on the effects of diesel emission exposure. This truism is well known and generally acknowledged. A more difficult question is how causality can be unambiguously defined and demonstrated. According to the eighteenth-century philosopher David Hume, causality is present when two conditions are satisfied: 1) B always follows A, in which case A is called a “sufficient cause” of B; 2) if A does not occur, then B does not occur, in which case A is called a “necessary cause” of B. These strict logical criteria are only rarely met in the medical field. In the context of exposure to diesel emissions, they would be met only if fine-particle exposure always led to lung cancer, and lung cancer never occurred without prior fine-particle exposure. Of course, neither of these is true. So what is biological, medical, or epidemiological causality? In medicine, causality is generally expressed in probabilistic terms, i.e., exposure to a risk factor such as cigarette smoking or diesel emissions increases the probability of a disease, e.g., lung cancer. The same understanding of causality applies to the effects of treatment: for instance, a certain type of chemotherapy increases the likelihood of survival in patients with a diagnosis of cancer, but does not guarantee it.

BOX 1: Causality in epidemiological observational studies (modified from Parascandola and Weed [34])
In many scientific disciplines, causality must be demonstrated by an experiment. In clinical medical research, this purpose is achieved with a randomized controlled trial (RCT). An RCT, however, often cannot be conducted for either ethical or practical reasons. If a risk factor such as exposure to diesel emissions is to be studied, persons cannot be randomly allocated to exposure or non-exposure. Nor is any randomization possible if the research question is whether or not an accident associated with an exposure, such as the Chernobyl nuclear reactor disaster, increased the frequency of illness or death. The same applies when a new law or regulation, e.g., a smoking ban, is introduced.

When no experiment can be conducted, observational studies need to be performed. The object under study, i.e., the possible cause, cannot be varied in a targeted and controlled way; instead, the effect this factor has on a target variable, such as a particular illness, is observed and documented.

Several publications in epidemiology have dealt with the ways in which causality can be inferred in the absence of an experiment, starting with the classic work of Bradford Hill and the nine aspects of causality (viewpoints) that he proposed and continuing up to the present.

BOX 2: The Bradford Hill criteria for causality (modified from [5])
Aside from the statistical uncertainty that always arises when only a sample of an affected population is studied, rather than its entirety, the main obstacle to the study of putative causal relationships comes from confounding variables (“confounders”). These are so named because they can, depending on the circumstances, either obscure a true effect or simulate an effect that is, in fact, not present. Age, for example, is a confounder in the study of the association between occupational radiation exposure and cataract, because both cumulative radiation exposure and the risk of cataract rise with increasing age.

The various statistical methods of dealing with known confounders in the analysis of epidemiological data have already been presented in other articles in this series. In the current article, we discuss two new approaches that have not been widely applied in medical and epidemiological research to date.

Methods of evaluating causal inferences in observational studies
The main advantage of an RCT is randomization, i.e., the random allocation of the units of observation (patients) to treatment groups. Potential confounders, whether known or unknown, are thereby distributed to the treatment groups at random as well, although differences between groups may still arise by chance (sampling variation). Whenever randomization is not possible, the effect of confounders must be taken into account in the planning of the study and in data analysis, as well as in the interpretation of study findings.

Classic methods of dealing with confounders in study planning are stratification and matching, as well as so-called propensity score matching (PSM).

The best-known and most commonly used method of data analysis is regression analysis, e.g., linear, logistic, or Cox regression.
This method is based on a mathematical model created in order to explain the probability that any particular outcome will arise as the combined result of the known confounders and the effect under study.

Regression analyses are used in the analysis of clinical or epidemiological data and are found in all commonly used statistical software packages. However, they are often used inappropriately because the prerequisites for their correct application have not been checked. They should not be used, for example, if the sample is too small, if the number of variables is too large, or if strong correlation among the model variables (collinearity) makes the results uninterpretable.

Regression-discontinuity methods
Regression-discontinuity methods have been little used in medical research to date, but they can be helpful in the study of cause-and-effect relationships from observational data. Regression-discontinuity design is a quasi-experimental approach that was developed in educational psychology in the 1960s. It can be used when a threshold value of a continuous variable (the “assignment variable”) determines the treatment regimen to which each patient in the study is assigned.

BOX 3: Terms used to characterize experiments
BOX 4: Regression-discontinuity methods
In the simplest case, that of a linear regression, the parameters in the following model are to be estimated:
$y_i = \beta_0 + \beta_1 z_i + \beta_2 (x_i - x_c) + e_i$,
where:
$i = 1, \dots, N$ indexes the statistical units
$y_i$ is the outcome
$\beta_0$ is the y-intercept
$z_i$ is a dichotomous variable (0, 1) indicating whether the patient was treated (1) or not treated (0)
$x_i$ is the assignment variable
$x_c$ is the threshold
$\beta_1$ is the effect of treatment
$\beta_2$ is the regression coefficient of the assignment variable
$e_i$ is the random error

A possible assignment variable could be, for example, the serum cholesterol level: consider a study in which patients with a cholesterol level of 160 mg/dL or above are assigned to receive a therapy. Since the cholesterol level (the assignment variable) is subject to random measurement error, it can be assumed that patients whose level of cholesterol is close to the threshold (160 mg/dL) are randomly assigned to the different treatment regimens. Thus, in a small interval around the threshold value, the assignment of patients to treatment groups can effectively be considered random. This sample of patients with near-threshold measurements can thus be used for the analysis of treatment efficacy. For this line of argument to be valid, it must truly be the case that the value being measured is subject to measurement error, and that there is practically no difference between persons with measured values slightly below or slightly above the threshold. Only then can treatment allocation in this narrow range be considered quasi-random.

This method can be applied only if its prerequisites are met.
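The linear model in Box 4 can be illustrated with a minimal sketch on simulated data. Everything numerical here is an assumption made for illustration only: the outcome scale, the simulated treatment effect of -5, the slope of 0.3, the noise level, and the bandwidth of 20 mg/dL around the 160 mg/dL threshold. The helper names `ols` and `fit_rd` are hypothetical and are not taken from any of the cited studies; the fit uses plain least squares from the Python standard library.

```python
import random


def ols(rows, ys):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def fit_rd(x, y, threshold, bandwidth):
    """Fit y = b0 + b1*z + b2*(x - threshold) on observations lying within
    `bandwidth` of the threshold; z = 1 at or above the threshold, else 0."""
    rows, ys = [], []
    for xi, yi in zip(x, y):
        if abs(xi - threshold) <= bandwidth:
            z = 1.0 if xi >= threshold else 0.0
            rows.append([1.0, z, xi - threshold])
            ys.append(yi)
    return ols(rows, ys)


# Simulated data: all numbers below are invented for illustration.
rng = random.Random(42)
THRESHOLD = 160.0   # mg/dL, as in the cholesterol example
TRUE_EFFECT = -5.0  # hypothetical treatment effect on the outcome
x = [rng.uniform(120.0, 200.0) for _ in range(4000)]
y = [
    50.0
    + TRUE_EFFECT * (1.0 if xi >= THRESHOLD else 0.0)
    + 0.3 * (xi - THRESHOLD)
    + rng.gauss(0.0, 2.0)
    for xi in x
]

b0, b1, b2 = fit_rd(x, y, THRESHOLD, bandwidth=20.0)
print(f"estimated treatment effect b1 = {b1:.2f} (simulated truth {TRUE_EFFECT})")
```

The coefficient `b1` estimates the jump in the outcome at the threshold. In a real analysis the bandwidth would be chosen and justified in advance, and the linearity of the relationship between outcome and assignment variable would need to be checked rather than assumed, as it is in this sketch.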
Example 1: The one-year mortality of neonates as a function of the intensity of medical and nursing care was to be studied, where the intensity of care was determined by a birth-weight threshold: infants with very low birth weight (<1500 g) (group A) were cared for more intensively than heavier infants (group B). The question to be answered was whether the greater intensity of care in group A led to a difference in mortality between the two groups. It was assumed that children with birth weight near the threshold are identical in all other respects, and that their assignment to group A or group B is quasi-random, because the measured value (birth weight) is subject to a relatively small error. Thus, for example, one might compare children weighing 1450–1499 g to those weighing 1500–1550 g at birth to study whether, and how, a greater intensity of care affects mortality. In this example, it is assumed that the variable “birth weight” has a random measurement error, and thus that neonates whose (true) weight is near the threshold will be randomly allocated to one or the other category. But birth weight itself is an important factor affecting infant mortality, with lower birth weight associated with higher mortality; thus, the interval taken around the threshold for the purpose of this study had to be kept narrow. The study, in fact, showed that the children treated more intensively because their birth weight was just below the threshold had a lower mortality than those treated less intensively because their birth weight was just above the threshold.

Example 2: A regression-discontinuity design was used to evaluate the effect of a measure taken by the Canadian government: the introduction of a minimum age of 19 years for alcohol consumption.
The researchers compared the number of alcohol-related disorders and of violent attacks, accidents, and suicides under the influence of alcohol in the months leading up to (group A) and subsequent to (group B) the 19th birthday of the persons involved. It was found that persons in group B had a greater number of alcohol-related inpatient treatments and emergency hospitalizations than persons in group A. With the aid of this quasi-experimental approach, the researchers were able to demonstrate the success of the measure. It may be assumed that the two groups differed only with respect to age, and not with respect to any other property affecting alcohol consumption.

Interrupted time series
Interrupted time series are a special type of regression-discontinuity design in which time is the assignment variable. The cutoff point is often an external event that is unambiguously identifiable as having occurred at a certain point in time, e.g., an industrial accident or a change in the law. A before-and-after comparison is made in which the analysis must still take adequate account of any relevant secular trends and seasonal fluctuations.

BOX 5: Interrupted time series
In the simplest case of a study involving an interrupted time series, the temporal sequence is analyzed with a piecewise regression.
The following model is used to study both a shift in slope and a shift in the level of an outcome before and after an intervention, e.g., the introduction of a law banning smoking (figure 2):
$y = \beta_0 + \beta_1 \times \text{time} + \beta_2 \times \text{intervention} + \beta_3 \times \text{time} \times \text{intervention} + e$,
where:
$y$ is the outcome, e.g., cardiovascular diseases
intervention is a dummy variable for the time before (0) and after (1) the intervention (e.g., smoking ban)
time is the time since the beginning of the study
$\beta_0$ is the baseline incidence of cardiovascular diseases
$\beta_1$ is the slope in the incidence of cardiovascular diseases over time before the introduction of the smoking ban
$\beta_2$ is the change in the incidence level of cardiovascular diseases after the introduction of the smoking ban (level effect)
$\beta_3$ is the change in the slope over time (cf. $\beta_1$) after the introduction of the smoking ban (slope effect)
$e$ is the random error

The prerequisites for the use of this method must be met.
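The piecewise regression in Box 5 can be sketched on simulated monthly data. The series below is entirely invented: a 60-month series, an intervention at month 36 (`T0`), a drop of 8 units at the intervention, and a slope change of -0.3 per month are all assumptions for illustration, and the helper `ols` is a hypothetical stand-in for the regression routine of any statistics package. The sketch deliberately leaves out seasonality and autocorrelation, which a real analysis would have to address.

```python
import random


def ols(rows, ys):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


T0 = 36  # hypothetical: the intervention takes effect at month 36
rng = random.Random(7)
months = list(range(60))

# Simulated outcome: rising trend, then a drop of 8 at T0 and a flatter slope.
y = [
    100.0
    + 0.5 * t
    + (-8.0 - 0.3 * (t - T0) if t >= T0 else 0.0)
    + rng.gauss(0.0, 1.0)
    for t in months
]

# Design matrix for y = b0 + b1*time + b2*intervention + b3*time*intervention
rows = []
for t in months:
    d = 1.0 if t >= T0 else 0.0
    rows.append([1.0, float(t), d, t * d])

b0, b1, b2, b3 = ols(rows, y)

# Vertical gap between the post- and pre-intervention fitted lines at T0.
level_change_at_t0 = b2 + b3 * T0

print(f"pre-intervention slope b1 = {b1:.2f}")
print(f"slope change b3 = {b3:.2f}")
print(f"level change at the intervention = {level_change_at_t0:.2f}")
```

One detail of this parameterization is worth noting: because time is counted from the beginning of the study, the gap between the two fitted lines at the moment of the intervention is `b2 + b3 * T0`, not `b2` alone. `b2` equals the level change at the intervention only if time is re-centered so that it is zero at the cutoff.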
Example 1: In one study, the rates of acute hospitalization for cardiovascular diseases before and after the temporary closure of Heathrow Airport because of volcanic ash were determined to investigate the putative effect of aircraft noise. The intervention (airport closure) took place from 15 to 20 April 2010. The hospitalization rate was found to have decreased among persons living in the urban area with the most aircraft noise. The number of observation points was too low, however, to show a causal link conclusively.

Example 2: In another study, the rates of hospitalization before and after the implementation of a smoking ban (the intervention) in public areas in Italy were determined. The intervention occurred in January 2005 (the cutoff time). The number of hospitalizations for acute coronary events was measured from January 2002 to November 2006 (figure 1). The analysis took account of seasonal dependence, and an effect modification for two age groups (persons under age 70 and persons aged 70 and up) was determined as well. The hospitalization rate declined after the ban in the younger group, but not in the older one.

Figure 1: Age-standardized hospitalization rates for acute coronary events (ACE) in persons under age 70 before and after the implementation of a smoking ban in public places in Italy, analyzed as an interrupted time series. The observed and predicted rates are shown (circles and solid lines, respectively). The dashed lines show the seasonally adjusted trend in ACE before and after the introduction of the nationwide smoking ban.

Discussion
The necessary distinction between causality and correlation is often emphasized in scientific discussions, yet it is often not applied strictly enough. Furthermore, causality in medicine and epidemiology is mostly probabilistic in nature, i.e., an intervention alters the probability that the event under study will take place.
A good illustration of this principle is offered by research on the effects of radiation, in which a strict distinction is maintained between deterministic radiation damage on the one hand, and probabilistic (stochastic) radiation damage on the other. Deterministic radiation damage (radiation-induced burns or death) arises with certainty whenever a subject receives a certain radiation dose (usually a high one). On the other hand, the risk of cancer-related mortality after radiation exposure is a stochastic matter. Epidemiological observations and biological experiments should be evaluated in tandem to strengthen conclusions about probabilistic causality.

While RCTs still retain their importance as the gold standard of clinical research, they cannot always be carried out. Some indispensable knowledge can only be obtained from observational studies. Confounding factors must be eliminated, or at least accounted for, early on when such studies are planned. Moreover, the data that are obtained must be carefully analyzed. And, finally, a single observational study hardly ever suffices to establish a causal relationship.

In this article, we have presented two newer methods that are relatively simple and which, therefore, could easily be used more widely in medical and epidemiological research. Either one should be used only after the prerequisites for its applicability have been meticulously checked.

In regression-discontinuity methods, the assumption of continuity must be verified: in other words, it must be checked whether other properties of the treatment and control groups are the same, or at least equally balanced. The rules of group assignment and the role played by the continuous assignment variable must be known as well. Regression-discontinuity methods can generate causal conclusions, but any such conclusion will not be generalizable if the treatment effects are heterogeneous over the range of the assignment variable.
The estimate of effect size is applicable only in a small, predefined interval around the threshold value. It must also be checked whether the outcome and the assignment variable are in a linear relationship, and whether there is any interaction between the treatment and assignment variables that needs to be considered.

In the analysis of interrupted time series, the assumption of continuity must be tested as well. Furthermore, the method is valid only if the occurrence of any other intervention at the same time point as the one under study can be ruled out. Finally, the type of temporal sequence must be considered, and more complex statistical methods must be applied, as needed, to take such phenomena as autoregression into account.

Observational studies often suggest causal relationships that will then be either supported or rejected after further studies and experiments. Knowledge of the effects of radiation exposure was derived, at first, mainly from observations on victims of the Hiroshima and Nagasaki atomic bomb explosions. These findings were reinforced by further epidemiological studies on other populations exposed to radiation (e.g., through medical procedures or as an occupational hazard), by physical considerations, and by biological experiments. A classic example from the mid-19th century is the observational study by Snow: until then, the biological cause of cholera was unknown. Snow found that there had to be a causal relationship between the contamination of a well and a subsequent outbreak of cholera. This new understanding led to improved hygienic measures, which did, indeed, prevent infection with the cholera pathogen. Cases such as these show that it is sometimes reasonable to take action on the basis of an observational study alone. They also demonstrate, however, that further studies are necessary for the definitive establishment of a causal relationship.
Figure 2: The effect of a smoking ban on the incidence of cardiovascular diseases

Key messages
Acknowledgments
Translated from the original German by Ethan Taub, M.D.

Footnotes
Conflict of interest statement
The authors state that they have no conflict of interest.

References
1. Köhler D. Feinstaub und Stickstoffdioxid (NO2): Eine kritische Bewertung der aktuellen Risikodiskussion. Dtsch Arztebl. 2018;115(38):A-1645.
2. Deutsche Gesellschaft für Epidemiologie, Deutsche Gesellschaft für Medizinische Informatik Biometrie und Epidemiologie, Deutsche Gesellschaft für Public Health, Deutsche Gesellschaft für Sozialmedizin und Prävention. Offener Brief bzw. Stellungnahme auf den Webseiten der beteiligten Fachgesellschaften 2019. www.dgepi.de/assets/News/84b5207b3d/NOxFeinstaubStellungnahme2019_01_29.pdf (last accessed on 11 January 2020).