>From: hes@unity.ncsu.edu
>Subject: Re: put "Flaws in Kellerman" in my ftp dir?
>To: chan@shell.portal.com (Jeff Chan)
>Date: Sun, 26 Dec 1993 19:51:45 -0500 (EST)
>
>Jeff,
>  Sure - feel free to put this in your ftp archive and to distribute it.
>You can use the title/author as below.  (I hope to do a little
>rewriting, and so that's why I've included a date.)

Serious Flaws in Kellerman, et al (1993) NEJM
(December, 1993)
by Henry E. Schaffer, Ph. D.

Summary and Overview

The Kellerman, et al (1993) study in the NEJM attempts to use the case-control method (CCM) to show that gun ownership increases homicide in the home. The limitations of the CCM, and serious flaws in the study methodology, result in invalidation of the study's conclusions.

The CCM has a number of limitations in what it can accomplish, and has a number of conditions (assumptions) which must be satisfied for it to be able to satisfactorily accomplish even the limited goals for which it is suitable.

The biggest limitation is that the CCM can't demonstrate causation. The CCM finds 'associations' between studied factors and the 'outcome' which defines the 'cases'. These 'associations' may suggest that there is a causal relationship, and may then be used to justify a study of causal relationships, but it is incorrect to jump from the discovery of an association to a conclusion of causation.

Other weak points in the CCM have to do with susceptibility to biases in the selection of the cases, and with confounding factors which can affect the choice of the controls. These can easily lead to spurious associations when there actually are none, or to associations which are reversed in direction from what actually exists.

The Kellerman, et al (1993) study has been widely quoted as demonstrating that there is a causal relationship between handguns in the home and homicides. The paper itself doesn't go that far, but it uses suggestive language, which suggests that there is more than merely an 'association'.
The flaws in the paper are such as to make the reader suspicious of the association found. Showing flaws in the methods does not prove that the paper is wrong, but it causes a loss of confidence in the results. Conclusions which are not properly supported must be considered invalid until proper support becomes available, if ever. It is the responsibility of the authors to support their conclusions. It isn't the responsibility of the readers to go out to collect data to prove that the flaws in the paper lead to incorrect conclusions.

The detailed treatment of these flaws, with supporting data, examples and methods is necessarily quite long, but it does illustrate that the Kellerman, et al paper is based on unsupported assumptions and that the conclusions must be viewed with suspicion or rejected as being unsupported.

Acknowledgement

I was helped in this project by the advice, criticism and encouragement of Dan Day, Fran Haga, Steve Holland and Paul Stoufflet. Many other people on the net also helped. I have full responsibility for any defects.

Detailed Examination

Subgroups and confounding factors

The methods used in Kellerman, et al do not take into account subgrouping or stratification in our society, and this can be shown to be able to cause a spurious association comparable to the one found.

The case-control method (described in an Appendix) has an assumption of homogeneity for all relevant variables which are not taken into account in the study. If this is violated, it is possible to have an 'apparent association' result when there is actually no true association. Technically this would probably be considered to be "confounding" due to whatever factors were heterogeneous. Unlike other types of studies in which randomization is used to protect against unaccounted for variation, the case-control method remains susceptible.
Here are some simple examples of an association in the overall data: I) when there is no association in two subgroups, and II) when the association in the subgroups is actually of the opposite direction (i.e. where gun possession has a protective effect.) The computation of association is shown in an Appendix.

I) No association in subgroups - spurious harmful association overall

Consider that the population is composed of a minority subgroup which has a high risk of homicide, and a relatively high gun ownership rate. This subgroup is composed of 'career criminals', gang members and others who have a repeated history of criminal activity. The majority subgroup has a low risk of homicide and a lower gun ownership rate. This majority is the general law-abiding public. This type of subgrouping does occur in the US, and is discussed in an Appendix. Subgroup sizes of 10%/90% are used in this example to be in the range of numbers found in the studies cited in the Appendix. There is no causal relationship between homicide and gun ownership in either subgroup.

No causal effect in subgroups - spurious harmful association overall:

Gun              High Risk               Low Risk
Ownership     dead      alive        dead       alive
Own gun        165    665,000        27.5   2,992,500
No gun         165    665,000        82.5   8,977,500
               ---  ---------       -----  ----------
Totals         330  1,330,000         110  11,970,000

(Population Total 13,300,000, total dead-in-home 440)

The 'odds ratio' measure of association is 1.0 in each subgroup, indicating a lack of association of gun ownership with homicide.

However, when we put these two groups together into the single population which they compose we get the data:

Gun            Total Population
Ownership     dead       alive
Own gun      192.5   3,657,500
No gun       247.5   9,642,500

The 'odds ratio' now is 2.0 which indicates an association of gun ownership with homicide. This is not due to gun ownership having a causal effect, but rather there is a 'confounding' variable of subgroup membership and gun ownership is associated with subgroup.
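The odds ratios quoted for example (I) can be rechecked with a few lines of Python. This is only a sketch: the helper name odds_ratio is illustrative and not part of the original analysis, and it simply applies the ad/bc formula given in the Appendix to the table values above.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for a 2x2 table laid out as
    a = own gun, dead     b = own gun, alive
    c = no gun, dead      d = no gun, alive
    """
    return (a * d) / (b * c)

# Table values from example (I): (a, b, c, d) for each subgroup.
high_risk = (165, 665_000, 165, 665_000)
low_risk = (27.5, 2_992_500, 82.5, 8_977_500)

# Pooling the subgroups is just cell-by-cell addition.
pooled = tuple(h + l for h, l in zip(high_risk, low_risk))

print("high risk:", odds_ratio(*high_risk))
print("low risk: ", odds_ratio(*low_risk))
print("pooled:   ", round(odds_ratio(*pooled), 2))
```

Each subgroup comes out at 1.0, while the pooled table comes out at about 2.05, which the text rounds to 2.0.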
So the association of gun ownership with homicide would be called an 'apparent association' in the literature.

II) Protective effect in subgroups - spurious harmful association overall

Since the 2.0 odds ratio in (I) above is fairly large (it is comparable to the 1.6 odds ratio found in the paper,) it is clear that the same type of apparent harmful association can arise even when there is a protective effect of ownership within each of the subgroups. Arbitrarily modifying the example numbers above to introduce a similar protective effect in each subgroup produces:

Gun              High Risk               Low Risk
Ownership     dead      alive        dead       alive
Own gun        151    665,000          24   2,992,500
No gun         179    665,000          86   8,977,500

These show an odds ratio of .84 for each of these subgroups. Note that odds ratios < 1 represent protective associations.

However, when we put these two groups together into the single population which they compose we get the data:

Gun            Total Population
Ownership     dead       alive
Own gun        175   3,657,500
No gun         265   9,642,500

The odds ratio is now 1.74 which is a (spurious) harmful association. This must be considered to be an "apparent association" of gun ownership with homicide because it has resulted from data in which there was a clear protective effect, and yet it resulted in a spurious indication of harm comparable to the 1.6 value given in the paper.

Note that all the above has used the entire population in the calculation - but since the odds ratio is unaffected by dividing a column by a constant, the exact same odds ratios would be produced if a sample was taken from the "alive" column (corresponding to the choice of 'alive' controls.) In this case the table immediately above would be:

Gun            Total Population
Ownership     dead       alive
Own gun        175         121
No gun         265         319

which produces the identical 1.74 odds ratio.

The Kellerman, et al (1993) study in the NEJM didn't use the same calculation that is shown above.
They used the "Mantel-Haenszel chi-square analysis for matched pairs" but didn't give any analysis. This analysis is able to adjust for differences in stratified data *if* the stratification (subdivision of the overall population into the two subgroups) is known and is taken into account when matching. Matching control pairs is an attempt to get each case and its matched control into the same subgroup - when the population is divided into subgroups. If this is done, then it appears that the Mantel-Haenszel analysis will produce an association calculation which is free of the confounding demonstrated above.

However it is not clear that the Kellerman, et al matching does select controls from the same subgroups as the cases. The control selection was done using a random selection starting outside a "one-block avoidance zone" away from the case homicide, and the matching criteria did not include any life-style or related indicators.

If the population is composed of subgroups which differ in homicide rates, then the matching procedure would be hoped to select the matching control from the same subgroup as the case it is supposed to match. This could happen with the matching method used if the subgroups were settled in distinct large geographic areas. Because of the avoidance method used these areas would have to be larger than one block in size (how much larger is hard to tell, since the paper doesn't say how far outside the zone it was necessary to travel to find a matching control who would agree to cooperate.)

But it doesn't appear that risk sub-groups are distributed in such a coarse-grain manner. I discussed this with a colleague who is a sociologist/criminologist who pointed out that risk subgroup factors (drug dealing, violent criminal events, violently abusive family relationships, etc.) often are fine-grained. They vary between different families in one apartment building, and certainly vary between different families in a block.
Therefore choosing a control who lives 1 or more blocks away will not assure matching with respect to the sub-group. For a minority sub-group (e.g. the 10% "High Risk" group in the examples above) the chances are good that homicides in the high risk group will be matched with low risk group controls.

The Kellerman, et al paper presented all of its data in terms of the overall group numbers, similar to the total population information presented in the examples above. Therefore there is no way to rework the analyses and check on the Mantel-Haenszel analysis results. Without proper within-sub-group matching the Mantel-Haenszel result would be affected by confounding and therefore produce incorrect results just as found by the odds ratio analysis used in the above examples.

This can be shown by using two situations based on (I) above. We take, as before, the High Risk subgroup as 10% of the population and the Low Risk group as the other 90%. We have the same 440 cases as in (I) above, and they will be matched in two ways. A) will be with controls selected without consideration of subgroup membership. B) will be with controls selected to be in the same subgroup as the case.

The 440 case individuals are

  High-Risk with gun       110
  High-Risk without gun    110
  Low-Risk with gun         55
  Low-Risk without gun     165

The population as a whole is

  High-Risk with gun        5.0%
  High-Risk without gun     5.0%
  Low-Risk with gun        22.5%
  Low-Risk without gun     67.5%

The population figures result from 10% High-Risk of which 50% are with gun and 50% without gun. The 90% Low-Risk is 25% with gun, and 75% without gun.
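The expected numbers of matched pairs under the two matching schemes can be worked out mechanically from the figures just given. The following Python sketch is one way to do it. The names (cases, gun_rate, share, expected_pairs, matched_odds_ratio) are illustrative, and it assumes, as the setup above does, that a control's gun ownership is simply drawn at the ownership rate of the pool it is sampled from.

```python
# Case counts from the text, keyed by (subgroup, case_has_gun).
cases = {
    ("high", True): 110,
    ("high", False): 110,
    ("low", True): 55,
    ("low", False): 165,
}
# Gun ownership rate within each subgroup, and each subgroup's population share.
gun_rate = {"high": 0.50, "low": 0.25}
share = {"high": 0.10, "low": 0.90}

def expected_pairs(control_gun_prob):
    """Expected counts of the four pair types.

    control_gun_prob(subgroup) is the probability that the control matched
    to a case from `subgroup` lives in a home with a gun.
    Result keys are (case_has_gun, control_has_gun).
    """
    pairs = {(cg, kg): 0.0 for cg in (True, False) for kg in (True, False)}
    for (group, case_gun), n in cases.items():
        p = control_gun_prob(group)
        pairs[(case_gun, True)] += n * p
        pairs[(case_gun, False)] += n * (1 - p)
    return pairs

def matched_odds_ratio(pairs):
    """Matched-pairs odds ratio: pairs where only the case has a gun,
    divided by pairs where only the control has a gun."""
    return pairs[(True, False)] / pairs[(False, True)]

# A) controls drawn from the whole population, ignoring subgroup
overall_rate = sum(share[g] * gun_rate[g] for g in share)
pairs_a = expected_pairs(lambda group: overall_rate)

# B) controls drawn from the same subgroup as the case
pairs_b = expected_pairs(lambda group: gun_rate[group])

print("A:", pairs_a, round(matched_odds_ratio(pairs_a), 2))
print("B:", pairs_b, round(matched_odds_ratio(pairs_b), 2))
```

Scheme A ignores subgroup and gives a ratio of about 1.58. Scheme B matches within subgroup and gives 1.0.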
A) Working out the expected numbers of the four types of matched case-control pairs when the controls are selected *without* consideration of subgroup membership:

  case with gun   case with gun     case without gun   case without gun
  control with    control without   control with       control without
     45.375          119.625            75.625             199.375

The odds ratio is 119.625/75.625 = 1.58

Remember that there is no association within each of the subgroups, and therefore this is a spurious association comparable to the 1.6 found.

B) Working out the expected numbers of the four types of matched case-control pairs when the controls are selected from within the same subgroup as the case:

  case with gun   case with gun     case without gun   case without gun
  control with    control without   control with       control without
     68.75            96.25             96.25              178.75

The odds ratio is 96.25/96.25 = 1. This is the same (no association) result which is found within each subgroup.

This indicates that the Mantel-Haenszel method correctly compensates for stratification only when the stratification is recognized.

Therefore it can be seen that this type of subgrouping could, by itself, account for the results of the study.

Bias due to failure to respond honestly

The cases and the controls were asked about gun ownership in the home. The raw results were that 174 of the cases (45.4%) said that there was ownership and 139 of the controls (35.8%) said the same, for a crude odds ratio of 1.6. Might there be a bias in these responses?

Considering that each of the cases was a homicide reported to the police, we can expect that there was a police investigation and not only was a gun found if there was one in the home, but that there would be little reluctance to admit the fact.

What about the controls? The authors refer to "a pilot study of homes listed as the addresses of owners of registered handguns confirmed that respondents' answers to questions about gun ownership were generally valid." (This study by Kellerman, et al, 1990 is cited below.)
This sounds impressive - until considering what "generally" means. In the study referred to, the authors found that 97.1% of the families (34 of 35) which were listed as being the location of a registered handgun admitted to having guns in the home, either at the time or recently. This sounds very impressive until the numbers are placed in perspective. 75 homes were chosen, but due to difficulties in address records, only 55 could be found, and of these only 35 consented to the interview. Therefore we can only conclude that 31 of the 55 homes contacted (56.4%) and 31 of the total of 75 homes (41.3%) admitted to gun ownership. This is considering only *registered* owners.

One might plausibly think the difficulties in finding 20 (= 26.7%) of the registered owners might be related to their unwillingness to be connected with ownership. The refusal to be interviewed might have the same cause, and owners of unregistered guns would be even more reluctant to admit to ownership. Criminals and owners of illicit guns are likely to refuse to be interviewed, let alone admit to ownership. Therefore it appears that Kellerman is quoting his own previous work in a way which overstates its conclusion.

The reason this % is important can be seen by looking at the amount by which gun ownership is stated to be lower in the controls than in the cases. This is the root of the 'association' which is claimed to exist between gun ownership and homicide. It would take only 37 controls who possessed guns, but denied possession, to make the control ownership exactly equal to the cases (and produce a crude odds ratio of 1.0.) Note that the chance of lying in denial is raised by the fact that most of the time (51.7% of the time) the control, instead of a proxy, was interviewed, and therefore there could be maximum personal interest in denying gun ownership. If 45% of the controls actually owned guns, this 'deficit' of 37 would represent a 21.1% 'false-denial-rate.'
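The percentages in this section are simple ratios and can be rechecked as follows. This is a sketch only: the variable names are illustrative, and the figure of 388 matched pairs used as the denominator is an assumption taken from the study's count of matched pairs mentioned elsewhere in this critique, not from the paragraphs above.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Pilot study counts as quoted in the text.
homes_chosen = 75
homes_found = 55
homes_interviewed = 35
admitted_incl_recent = 34   # "at the time or recently"
admitted = 31               # count used in the text's recalculation

print(pct(admitted_incl_recent, homes_interviewed))    # share of interviewed homes
print(pct(admitted, homes_found))                      # share of homes found
print(pct(admitted, homes_chosen))                     # share of homes chosen
print(pct(homes_chosen - homes_found, homes_chosen))   # homes that could not be found

# Main study: how many controls would have to be false denials?
matched_pairs = 388          # assumption: number of matched pairs in the study
case_rate = 0.454            # reported ownership rate among cases
controls_reporting = 139     # controls reporting a gun in the home
deficit = round(case_rate * matched_pairs) - controls_reporting
false_denial_rate = deficit / (0.45 * matched_pairs)

print(deficit, round(100 * false_denial_rate, 1))
```

The deficit comes out at 37 controls, and the implied false-denial rate at roughly 21 percent of the assumed true owners, in line with the figure quoted above.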
Such a rate is quite consistent with the results of the pilot study, even though the authors do not admit to it.

Therefore this bias could, by itself, account for the results of the study.

(The study is Kellermann, A. L., F. P. Rivara, J. Banton, D. Reay, and C. L. Fligner. Validating survey responses about gun ownership among owners of registered handguns. Am J Epidemiol 1990; 131:1080-4.)

Selection Bias and Response Bias

A major point is made in this study that *all* of the homicides meeting the study's 'in the home' criterion were included. This is a benefit to a CCM study since it eliminates the possibility of case 'selection bias' affecting the results. However, upon closer inspection, it appears that there is far from total inclusion and that there is room for selection bias to act.

The authors try to give the impression that there was a very high response - they do this by giving 'partial' percentages several times, rather than stating the end result. There were 444 homicides meeting the 'home' criterion. 24 were excluded for "various reasons" leaving 94.6%. But then 7% were dropped because of failure to interview the proxy, and an additional 1% due to failure to find a control, leaving 388 matched pairs. This is down to 87.4%.

The authors state, "Although case-control studies offer many advantages over ecologic studies, they are prone to several sources of bias. To minimize selection bias, we included *all* cases of homicide in the home and rigorously followed an explicit procedure for randomly selecting neighborhood control subjects. High response rates among case proxies (92.6 percent) and matching controls (80.6 percent) minimized nonresponse bias." (emphasis added)

Are the authors overstating their case? Perhaps just a little, but many would be willing to allow 87.4% to be described as "all".
However, this is not the end - even though there were 388 matched pairs, it appears that the study did not obtain complete data on all of them, and the multivariate analyses used require complete data, and so there were only 316 matched pairs used in the final analyses. This represents 71.2% of the 444 homicides. It is very difficult to accept that "all" fairly describes this 71.2%.

This does not prove that there was any selection or response bias in this study, it just shows that there was room for such biases to act. It also shows that the authors avoided coming to grips with this issue and misled the readers into thinking that there could be little or no such bias.

APPENDICES

Appendix on the Case Control method.

The case control technique is described in:

  Designing Clinical Research
  Stephen B. Hulley & Steven R. Cummings, editors
  Williams & Wilkins 1988

Here are some quotes from relevant sections, with some notes of mine on how they apply to the current topic.

  Chapter 8  Designing a New Study: II. Cross-sectional and Case-control Studies
  by Thomas B. Newman, et al.
  Case-control Studies are covered on pages 78 - 86

Emphasis marked with _ _ is in the original. [My comments are in square brackets.]

" ... case-control studies are generally _retrospective_. They identify groups of subjects with and without the disease, then look backward in time to find differences in predictor variables that may explain why the cases got the disease and the controls did not."

(footnote "The terms "predictor" and "outcome" variable can be confusing in a case-control study. From a statistical viewpoint, the search for associations in these studies uses the presence or absence of the disease as the predictor and the level of various risk factors as the outcome. However, this reverses the biological meaning of these terms, and we have elected to continue the convention of using predictors and outcomes to reflect the putative cause and effect relationships.")

"The design of a case-control study is challenging because of the increased opportunities for bias, but there are many examples of well-designed studies that have yielded important results ..."

Strengths of Case-Control Studies - two are discussed:

  Efficiency for rare outcomes
  Usefulness for generating hypotheses

Weaknesses of case-control studies

"Case control studies are a cheap and practical way to investigate risk factors for rare diseases, or to generate hypotheses about new diseases or unusual outbreaks. These are great strengths, but they are achieved at a considerable cost." ...

"But the biggest weakness of case-control studies is their _increased_susceptibility_to_bias_. This bias comes chiefly from two sources: the separate sampling of the cases and controls, and the retrospective measurement of the predictor variables."

"In general, sampling bias is important when the sample of cases is unrepresentative _with_respect_to_the_risk_factor_being_studied_."

[Note that this is a tricky concept. In Kellerman, et al, they took all of the in-home homicides in those locales/times - and so it might be thought that there was no sampling and therefore no possible bias for cases. But there are several ways in which there could be a sampling bias.

1) Not all of the homicides got into the analysis - 444 cases were reduced to 420 for the study, then to 405 because of lack of controls, then to 388 - of which only 357 had controls matched for all four matching characteristics. But then they only used 316 of the matched pairs in the final multivariate analysis, in order to only include pairs for which they had data on "the six variables of interest". They didn't say what number of these 316 were matched for all four characteristics. So instead of a 100% sample which has no selection and therefore can't be biased, there was a 71% sample which allows for the existence of bias.
2) Only 3 counties in the US were used, and the study is being used to reach conclusions about the whole US - therefore this raises the question whether these 3 counties are an unbiased sample of the US. Note that the homicide rate in these counties was approximately 50% greater than the overall U.S. rate. This then brings into question whether the results, even without question of procedural flaws, could be representative of the U.S. population.

3) The cases were selected by the criterion of an in-home homicide, but the risk factor most discussed is the keeping of a gun for purposes of protection. There is nothing to show that the homicide cases studied are representative of the households which keep a gun for protection.]

"The more difficult decisions faced ... usually relate to the more open-ended task of _selecting_the_controls_. The general goal is to find an accessible population at risk of the disease who otherwise represent the same population as the cases, and there are four main strategies for achieving this goal."

"1. Sampling the cases and controls in the same way: One strategy is to choose a control group that _compensates_ for an unrepresentative sample of cases by being unrepresentative in the same way."

"2. Matching: Matching is a simple method of ensuring that cases and controls are comparable with respect to major factors that are related to the disease but not of interest to the investigator."

"Differential measurement bias, and how to control it: The second particular problem of case-control studies is bias that affects one group more than the other caused by the retrospective approach to measuring the predictor variables."

[e.g. "_differential_ recall bias" in which one group, case or control, is more likely to remember or report risk factors. In particular, any difference in reporting gun ownership would cause a bias in the Kellerman, et al study.]

"...
there are two specific strategies for avoiding bias in measuring risk factors in case-control studies."

"1. Use of data recorded before the outcome occurred: ..."

"2. Blinding: ... Ideally, neither the study subjects nor the investigators should know which subjects are cases and which are controls. ... In practice, this is often difficult. The subjects know whether they are sick or well, ..."

[In the Kellerman, et al study, there is a distinction of whether or not a homicide occurred at home - which doesn't seem to be amenable to blinding.]

Measures of association (Appendix 8A)

  Predictor        Outcome variable
  variable        present    absent
  present            a          b
  absent             c          d

  Relative risk ~~ Odds ratio = ad/bc      [~~ is used for wavy =]

[Appendix 8B is "Why the odds ratio can be used as an estimate for relative risk in a case-control study"]

[Appendix 10A is "Hypothetical example of confounding" in which an apparent association between coffee drinking and MI is shown to result not from any actual association of coffee drinking with MI (which is zero) but from a high association of smoking with coffee drinking.]

[This is a very relevant example, because overlooking confounding variables (such as membership in a high-risk group) can easily produce significant but spurious associations in the results. This is easy to demonstrate.]

Appendix - calculation of odds ratio

  Gun              Outcomes
  Ownership     dead    alive
  Own gun         a       b
  No gun          c       d

The odds ratio is ad/bc.

Appendix - The Mantel-Haenszel Chi-square analysis for matched pairs

This is a special case of their analysis for a stratified sample in case-control studies. The odds ratio is B/C where B is the number of pairs where the case has gun ownership and the control doesn't and C is the opposite (discordant pairs).

Appendix - Justification for existence of sub-group/stratification.

A sociologist colleague lent me a copy of a textbook, Criminology, 2nd ed. by John E. Conklin, 1986, Macmillan Pub. Co. It has a Chapter on "Criminal Careers" about people who commit crimes repeatedly.
In this chapter, a section on 'Delinquent Careers' (starting on pg. 308) gives some direct data on subgrouping. Two research studies in different cities are discussed. A study of a birth cohort (Delinquency in a Birth Cohort, Wolfgang, Figlio and Sellin, 1972) covering males 10 - 18 years old in Philadelphia showed the following results:

  Type of Offender        % of Cohort    % of all of Cohort's
                                         Police Contacts
  Nondelinquents              65.1             0
  One-time offenders          16.2            15.8
  Nonchronic offenders        12.4            32.3
  Chronic offenders            6.3            51.9

Less than 5 contacts counts as nonchronic. They point out that the one-time offender group usually were involved in relatively trivial offenses. Note that 10% of the offenders would account for roughly 2/3 of all police contacts.

Another cohort study in Racine, Wisc. of juveniles and young adults (Shannon, 1982) showed similar concentration with

  5 - 7 %   accounted for over 1/2 of all non-traffic police contacts
   ~20%         "      "    "  80%  "   "       "       "
  5 - 14%       "      "   ALL of the felony arrests.

These are cohort studies, and are therefore not as susceptible to sampling bias and other such problems as many of the other (easier to run) studies are.

We have the inescapable conclusion that there is subgrouping in the population, with a small fraction of the population accounting for a large portion of serious criminal behavior.

Appendix - population figures

  County     population      dates of study     duration      pop-years
             (1990 census)
  Cuyahoga     1412140       1/1/90-10/23/92    2.81 years    3,970,000
  Shelby        826330       10/23/87-92        4 years       3,305,000
  King         1507319          "     "           "   "       6,030,000

  Total pop-years:  ~13,300,000

--------------
--henry schaffer