>From: hes@unity.ncsu.edu
>Subject: Re: put "Flaws in Kellerman" in my ftp dir?
>To: chan@shell.portal.com (Jeff Chan)
>Date: Sun, 26 Dec 1993 19:51:45 -0500 (EST)
>
>Jeff,
>  Sure - feel free to put this in your ftp archive and to distribute it. 
>You can use the title/author as below.  (I hope to do a little
>rewriting, and so that's why I've included a date.)


           Serious Flaws in Kellerman, et al (1993) NEJM
                        (December, 1993)
                 by Henry E. Schaffer, Ph. D.


                      Summary and Overview

     The Kellerman, et al (1993) study in the NEJM attempts to use the 
case-control method (CCM) to show that gun ownership increases homicide
in the home.  The limitations of the CCM, and serious flaws in the study 
methodology, result in invalidation of the study's conclusions.

     The CCM has a number of limitations in what it can accomplish, and
has a number of conditions (assumptions) which must be satisfied for it
to be able to satisfactorily accomplish even the limited goals for
which it is suitable.  The biggest limitation is that the CCM can't
demonstrate causation.  The CCM finds 'associations' between studied
factors and the 'outcome' which defines the 'cases'.  These
'associations' may suggest that there is a causal relationship, and may
then be used to justify a study of causal relationships, but it is
incorrect to jump from the discovery of an association to a conclusion
of causation.  Other weak points in the CCM have to do with
susceptibility to biases in the selection of the cases, and with
confounding factors which can affect the choice of the controls.  These
can easily lead to spurious associations when there actually are none,
or to associations which are reversed in direction from what actually
exists.

     The Kellerman, et al (1993) study has been widely quoted as
demonstrating that there is a causal relationship between handguns in
the home and homicides.  The paper itself doesn't go that far, but its
language suggests that there is more than merely an 'association'.
The flaws in the paper are such as to make the reader suspicious of
the association found.  Showing flaws in the
methods does not prove that the paper is wrong, but it causes a loss of
confidence in the results.  Conclusions which are not properly
supported must be considered invalid until proper support becomes
available, if ever.  It is the responsibility of the authors to support
their conclusions.  It isn't the responsibility of the readers to go
out to collect data to prove that the flaws in the paper lead to
incorrect conclusions.

     The detailed treatment of these flaws, with supporting data,
examples and methods is necessarily quite long, but it does illustrate
that the Kellerman, et al paper is based on unsupported assumptions and
that the conclusions must be viewed with suspicion or rejected as being
unsupported.


                        Acknowledgement

     I was helped in this project by the advice, criticism and
encouragement of Dan Day, Fran Haga, Steve Holland and Paul Stoufflet.
Many other people on the net also helped.  I have full responsibility 
for any defects.


                     Detailed Examination

                   Subgroups and confounding factors

     The methods used in Kellerman, et al do not take into account
subgrouping or stratification in our society, and this can be shown to
be able to cause a spurious association comparable to the one found.

     The case-control method (described in an Appendix) has an
assumption of homogeneity for all relevant variables which are not
taken into account in the study.  If this is violated, it is possible
to have an 'apparent association' result when there is actually no true
association.  Technically this would probably be considered to be
"confounding" due to whatever factors were heterogeneous.  Unlike other
types of studies in which randomization is used to protect against
unaccounted-for variation, the case-control method remains susceptible.

     Here are some simple examples of an association in the overall data:

I) when there is no association in two subgroups, and 
II) when the association in the subgroups is actually of the opposite 
direction (i.e. where gun possession has a protective effect.)

The computation of association is shown in an Appendix.

I)  No association in subgroups - spurious harmful association overall

     Consider that the population is composed of a minority subgroup
which has a high risk of homicide, and a relatively high gun ownership
rate.  This subgroup is composed of 'career criminals', gang members
and others who have a repeated history of criminal activity.  The
majority subgroup has a low risk of homicide and a lower gun ownership
rate.  This majority is the general law-abiding public.  This type of
subgrouping does occur in the US, and is discussed in an Appendix.
Subgroup sizes of 10%/90% are used in this example to be in the range
of numbers found in the studies cited in the Appendix.  There is no
causal relationship between homicide and gun ownership in either
subgroup.

  No causal effect in subgroups - spurious harmful association overall:

Gun                High Risk                  Low Risk
Ownership          dead   alive            dead    alive
Own gun            165    665,000          27.5  2,992,500
No gun             165    665,000          82.5  8,977,500
                   ---   --------         ----- ----------
     Totals        330  1,330,000         110   11,970,000

        (Population Total 13,300,000, total dead-in-home 440)

     The 'odds ratio' measure of association is 1.0 in each subgroup,
indicating a lack of association of gun ownership with homicide.
However, when we put these two groups together into the single
population which they compose we get the data:

Gun               Total Population
Ownership         dead     alive
Own gun           192.5    3,657,500
No gun            247.5    9,642,500

     The 'odds ratio' now is 2.0 which indicates an association of gun
ownership with homicide.  This is not due to gun ownership having a
causal effect, but rather there is a 'confounding' variable of subgroup
membership and gun ownership is associated with subgroup.  So the
association of gun ownership with homicide would be called an 'apparent
association' in the literature.
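
     The arithmetic in example (I) can be checked with a short Python
sketch.  The function name odds_ratio and the tuple layout are
illustrative choices, not anything taken from the paper.  Each subgroup
gives exactly 1.0, and the pooled table gives about 2.05, which is the
2.0 quoted above.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for a 2x2 table laid out as
    a = own gun, dead     b = own gun, alive
    c = no gun, dead      d = no gun, alive
    """
    return (a * d) / (b * c)


# Subgroup tables from example (I), in the order (a, b, c, d)
high_risk = (165, 665_000, 165, 665_000)
low_risk = (27.5, 2_992_500, 82.5, 8_977_500)

# Pooling the subgroups means adding the two tables cell by cell
pooled = tuple(h + l for h, l in zip(high_risk, low_risk))

or_high = odds_ratio(*high_risk)
or_low = odds_ratio(*low_risk)
or_pooled = odds_ratio(*pooled)

print(f"high risk: {or_high:.2f}")
print(f"low risk:  {or_low:.2f}")
print(f"pooled:    {or_pooled:.2f}")
```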

II) Protective effect in subgroups - spurious harmful association overall

     Since the 2.0 odds ratio in (I) above is fairly large (it is
comparable to the 1.6 odds ratio found in the paper,) it is clear
that the same type of apparent harmful association can arise even when
there is a protective effect of ownership within each of the subgroups.
Arbitrarily modifying the example numbers above to introduce a similar
protective effect in each subgroup produces:

Gun                High Risk                    Low Risk
Ownership          dead   alive                dead    alive
Own gun            151    665,000               24    2,992,500
No gun             179    665,000               86    8,977,500

     These show an odds ratio of .84 for each of these subgroups.  Note 
that odds ratios < 1 represent protective associations.  However, when 
we put these two groups together into the single population which they 
compose we get the data:

Gun               Total Population
Ownership         dead     alive
Own gun           175      3,657,500
No gun            265      9,642,500

     The odds ratio is now 1.74 which is a (spurious) harmful
association.  This must be considered to be an "apparent association"
of gun ownership with homicide because it has resulted from data in
which there was a clear protective effect, and yet it resulted in a
spurious indication of harm comparable to the 1.6 value given in the
paper.

     Note that all the above has used the entire population in the
calculation - but since the odds ratio is unaffected by dividing a
column by a constant, the exact same odds ratios would be produced if
a sample was taken from the "alive" column (corresponding to the choice
of 'alive' controls.)  In this case the table immediately above would be:

Gun               Total Population
Ownership         dead     alive
Own gun           175      121
No gun            265      319

which produces the identical 1.74 odds ratio.
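
     A sketch of example (II) in Python follows.  It pools the two
subgroup tables cell by cell and then scales the "alive" column down to
440 sampled controls, to show that the reversal of direction survives
pooling and that sampling the controls leaves the odds ratio unchanged.
The helper names are hypothetical, and the choice of 440 controls (one
per case) is an assumption made to match the sampled table above.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc; a, c are the dead counts, b, d the alive counts."""
    return (a * d) / (b * c)


def sample_alive(table, fraction):
    """Keep every dead individual but only a fraction of the alive column."""
    a, b, c, d = table
    return (a, b * fraction, c, d * fraction)


# Subgroup tables from example (II):
# (own gun dead, own gun alive, no gun dead, no gun alive)
high_risk = (151, 665_000, 179, 665_000)
low_risk = (24, 2_992_500, 86, 8_977_500)
pooled = tuple(h + l for h, l in zip(high_risk, low_risk))

# Assumption: 440 controls drawn at random from the 13,300,000 alive
sampled = sample_alive(pooled, 440 / 13_300_000)

or_high = odds_ratio(*high_risk)
or_low = odds_ratio(*low_risk)
or_pooled = odds_ratio(*pooled)
or_sampled = odds_ratio(*sampled)

print(f"subgroups: {or_high:.2f} and {or_low:.2f} (protective)")
print(f"pooled:    {or_pooled:.2f}")
print(f"sampled:   {or_sampled:.2f}")
```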

     The Kellerman, et al (1993) study in the NEJM didn't use the same
calculation that is shown above.  They used the "Mantel-Haenszel
chi-square analysis for matched pairs" but didn't show the details of
that analysis.  This analysis is able to adjust for differences in
stratified data *if* the stratification (subdivision of the overall
population into the two subgroups) is known and is taken into account
when matching.

     Matching control pairs is an attempt to get each case and its
matched control to be in the same subgroup - when the population is
divided into subgroups.  If this is done, then it appears that the
Mantel-Haenszel analysis will produce an association calculation which
is free of the confounding demonstrated above.  However it is not clear
that the Kellerman, et al matching does select controls from the same
subgroups as the cases.  The control selection was done using a random
selection starting outside a "one-block avoidance zone" away from the
case homicide, and the matching criteria did not include any life-style
or related indicators.

     If the population is composed of subgroups which differ in
homicide rates, then one would hope that the matching procedure selects
the matching control from the same subgroup as the case it is supposed
to match.  This could happen with the matching method used if the
subgroups were settled in distinct, large geographic areas.
Because of the avoidance method used these areas would have to be
larger than one-block in size (how much larger is hard to tell, since
the paper doesn't say how far outside the zone it was necessary to
travel to find a matching control who would agree to cooperate.)  But
it doesn't appear that risk sub-groups are distributed in such a
coarse-grain manner.  I discussed this with a colleague who is a
sociologist/ criminologist who pointed out that risk subgroup factors
(drug dealing, violent criminal events, violently abusive family
relationships, etc.) often are fine-grained.  They vary between
different families in one apartment building, and certainly vary
between different families in a block.  Therefore choosing a control
who lives 1 or more blocks away will not assure matching with respect
to the sub-group.  For a minority sub-group (e.g. the 10% "High Risk"
group in the examples above) the chances are good that homicides in the
high risk group will be matched with low risk group controls.

     The Kellerman, et al paper presented all of its data in terms of
the overall group numbers, similar to the total population information
presented in the examples above.  Therefore there is no way to rework
the analyses and check on the Mantel-Haenszel analysis results.
Without proper within-sub-group matching the Mantel-Haenszel result
would be affected by confounding and therefore produce incorrect
results just as found by the odds ratio analysis used in the above
examples.

     This can be shown by using two situations based on (I) above.  We
take, as before, the High Risk subgroup as 10% of the population and the
Low Risk group as the other 90%.  We have the same 440 cases as
in (I) above, and they will be matched in two ways. 

A) will be with controls selected without consideration of subgroup 
membership.  
B) will be with controls selected to be in the same subgroup as the case.  

  The 440 case individuals are 
High-Risk with gun       110
High-Risk without Gun    110
Low-Risk  with gun        55
Low Risk  without Gun    165

  The population as a whole is
High-Risk with gun         5.0% 
High-Risk without gun      5.0% 
Low-Risk  with gun        22.5%
Low-Risk  without gun     67.5%.  

The population figures result from 10% High-Risk of which 50% are with
gun and 50% without gun.  The 90% Low-Risk is 25% with gun, and 75%
without gun.

     A)  Working out the expected numbers of the four types of matched
case-control pairs when the controls are selected *without*
consideration of subgroup membership:

   case with gun   case with gun    case without gun  case without gun
   control with    control without  control with      control without
     45.375          119.625           75.625            199.375

     The odds ratio is 119.625/75.625 = 1.58

     Remember that there is no association within each of the subgroups,
and therefore this is a spurious association comparable to the 1.6 found.

     B) Working out the expected numbers of the four types of matched
case-control pairs when the controls are selected from within the same
subgroup as the case.

   case with gun   case with gun    case without gun  case without gun
   control with    control without  control with      control without 
     68.75           96.25            96.25              178.75

     The odds ratio is 96.25/96.25 = 1.

     This is the same (no association) result which is found within each
subgroup.  This indicates that the Mantel-Haenszel method correctly
compensates for stratification only when the stratification is recognized.

     Therefore it can be seen that this type of subgrouping could, by
itself, account for the results of the study.
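
     The expected pair counts in (A) and (B) can be reproduced with
the Python sketch below.  It uses only the case counts and population
percentages given above; the function and variable names are
illustrative, and each control is assumed to be drawn at random from
either the whole population (A) or the case's own subgroup (B).

```python
# Cases from the text, keyed by (subgroup, has_gun)
cases = {
    ("high", True): 110,
    ("high", False): 110,
    ("low", True): 55,
    ("low", False): 165,
}

# Population makeup from the text
share = {"high": 0.10, "low": 0.90}       # subgroup share of population
gun_rate = {"high": 0.50, "low": 0.25}    # gun ownership within subgroup


def discordant_pairs(control_gun_prob):
    """Expected discordant pair counts (B, C).

    B: case has a gun, control does not.
    C: case has no gun, control does.
    control_gun_prob(subgroup) is the chance that the control matched
    to a case from that subgroup owns a gun.
    """
    b = c = 0.0
    for (group, has_gun), n in cases.items():
        p = control_gun_prob(group)
        if has_gun:
            b += n * (1 - p)
        else:
            c += n * p
    return b, c


# (A) controls drawn without regard to subgroup: overall ownership rate
overall_rate = sum(share[g] * gun_rate[g] for g in share)
b_a, c_a = discordant_pairs(lambda group: overall_rate)

# (B) controls drawn from the same subgroup as the case
b_b, c_b = discordant_pairs(lambda group: gun_rate[group])

or_a = b_a / c_a
or_b = b_b / c_b
print(f"(A) B={b_a:.3f} C={c_a:.3f} odds ratio={or_a:.2f}")
print(f"(B) B={b_b:.3f} C={c_b:.3f} odds ratio={or_b:.2f}")
```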


                Bias due to failure to respond honestly

     The cases and the controls were asked about gun ownership in the
home.  The raw results were that 174 of the cases (45.4%) said that
there was ownership and 139 of the controls (35.8%) said the same, for
a crude odds ratio of 1.6.  Might there be a bias in these responses?
Considering that each of the cases was a homicide reported to the
police, we can expect that there was a police investigation and not
only was a gun found if there was one in the home, but that there would
be little reluctance to admit the fact.  What about the controls?

     The authors refer to "a pilot study of homes listed as the
addresses of owners of registered handguns confirmed that respondents'
answers to questions about gun ownership were generally valid."  (This
study by Kellerman, et al, 1990 is cited below.)  This sounds
impressive - until considering what "generally" means.  In the study
referred to, the authors found that 97.1% of the families (34 of 35)
which were listed as being the location of a registered handgun
admitted to having guns in the home, either at the time or recently.
This sounds very impressive until the numbers are placed in
perspective.  75 homes were chosen, but due to difficulties in address
records, only 55 could be found, and of these only 35 consented to the
interview.  Therefore we can only conclude that 31 of the 55 homes
contacted (56.4%) and 31 of the total of 75 homes (41.3%) admitted to
gun ownership. This is considering only *registered* owners.  One might
plausibly think the difficulties in finding 20 (= 26.7%) of the
registered owners might be related to their unwillingness to be
connected with ownership.  The refusal to be interviewed might have the
same cause, and owners of unregistered guns would be even more
reluctant to admit to ownership.  Criminals and owners of illicit guns
are likely to refuse to be interviewed, let alone admit to ownership.
Therefore it appears that Kellerman is quoting his own previous work in
a way which overstates its conclusion.

     The reason this % is important can be seen by looking at the
amount by which gun ownership is stated to be lower in the controls
than in the cases.  This is the root of the 'association' which is
claimed to exist between gun ownership and homicide.  It would take
only 37 controls who possessed guns, but denied possession, to make the
control ownership exactly equal to the cases (and produce a crude odds
ratio of 1.0.)  Note that the chance of lying in denial is raised by the
fact that most of the time (51.7% of the time) the control, instead of
a proxy, was interviewed, and therefore there could be maximum personal
interest in denying gun ownership.  If 45% of the controls actually
owned guns, this 'deficit' of 37 would represent a 21.1%
'false-denial-rate.'  Such a rate is quite consistent with the results
of the pilot study, even though the authors do not admit to it.

     Therefore this bias could, by itself, account for the results of
the study.
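
     The 'deficit' and 'false-denial-rate' arithmetic above can be
sketched in Python.  The figure of 388 controls (one per matched pair)
is an assumption; the text gives only the counts and percentages of
respondents reporting a gun.  Under that assumption the sketch gives a
deficit of 37 and a rate of roughly 21%.

```python
n_controls = 388           # assumption: one control per matched pair
controls_reporting = 139   # controls who reported a gun in the home
case_rate = 0.454          # share of cases reporting a gun

# Extra gun-owning controls needed for the control rate to equal
# the case rate, i.e. for a crude odds ratio of 1.0
deficit = round(case_rate * n_controls) - controls_reporting

# If 45% of controls really owned guns, the share of those owners
# who would have had to deny ownership to produce the deficit
true_owners = 0.45 * n_controls
false_denial_rate = deficit / true_owners

print(f"deficit: {deficit} controls")
print(f"false-denial rate: {false_denial_rate:.1%}")
```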

(The study is Kellermann, A. L., F. P. Rivara, J. Banton, D. Reay,
and C. L. Feigner.  Validating survey responses about gun ownership
among owners of registered handguns.  Am J Epidemiol 1990; 131:1080-4.)


                   Selection Bias and Response Bias

     A major point is made in this study that *all* of the homicides
meeting the study's 'in the home' criterion were included.  This is a
benefit to a CCM study since it eliminates the possibility of case
'selection bias' affecting the results.  However, upon closer
inspection, it appears that there is far from total inclusion and that
there is room for selection bias to act. The authors try to give the
impression that there was a very high response - they do this by giving
'partial' percentages several times, rather than stating the end
result.  There were 444 homicides meeting the 'home' criterion.  24
were excluded for "various reasons" leaving 94.6%.  But then 7% were
dropped because of failure to interview the proxy, and an additional 1%
due to failure to find a control, leaving 388 matched pairs.  This is
down to 87.4%.  The authors state, "Although case-control studies offer
many advantages over ecologic studies, they are prone to several
sources of bias. To minimize selection bias, we included *all* cases of
homicide in the home and rigorously followed an explicit procedure for
randomly selecting neighborhood control subjects. High response rates
among case proxies (92.6 percent) and matching controls (80.6 percent)
minimized nonresponse bias." (emphasis added)

     Are the authors overstating their case?  Perhaps just a little, but
many would be willing to allow 87.4% to be described as "all".  However,
this is not the end - even though there were 388 matched pairs, it
appears that the study did not obtain complete data on all of them, and
the multivariate analyses used require complete data, and so there were
only 316 matched pairs used in the final analyses.  This represents
71.2% of the 444 homicides.  It is very difficult to accept that "all"
fairly describes this 71.2%.
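
     The retention percentages quoted above follow directly from the
counts.  A small Python sketch (the stage labels are descriptive
shorthand, not terms from the paper):

```python
total_homicides = 444

# Number of homicides still in the study after each step
stages = [
    ("after the 24 exclusions", 444 - 24),
    ("matched pairs", 388),
    ("pairs used in final analysis", 316),
]

# Percentage of the original 444 retained at each stage
retained = {label: 100 * n / total_homicides for label, n in stages}

for label, pct in retained.items():
    print(f"{label}: {pct:.1f}%")
```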

     This does not prove that there was any selection or response bias
in this study, it just shows that there was room for such biases to
act.  It also shows that the authors avoided coming to grips with this
issue and misled the readers into thinking that there could be little
or no such bias.


                      APPENDICES

Appendix on the Case Control method.

The case control technique is described in:

Designing Clinical Research
Stephen B. Hulley & Steven R. Cummings, editors
Williams & Wilkins  1988

Here are some quotes from relevant sections, with some notes of mine on
how it applies to the current topic.

Chapter 8 Designing a New Study: II. Cross-sectional and Case-control
Studies  by Thomas B. Newman, et al.  

Case-control Studies are covered on pages 78 - 86
Emphasis marked with _ _ is in the original. [My comments are in square
brackets.]


" ... case-control studies are generally _retrospective_.  They identify
groups of subjects with and without the disease, then look backward in
time to find differences in predictor variables that may explain why the
cases got the disease and the controls did not."  (footnote "The terms
"predictor" and "outcome" variable can be confusing in a case-control
study.  From a statistical viewpoint, the search for associations in
these studies uses the presence or absence of the disease as the
predictor and the level of various risk factors as the outcome.
However, this reverses the biological meaning of these terms, and we
have elected to continue the convention of using predictors and outcomes
to reflect the putative cause and effect relationships.")

"The design of a case-control study is challenging because of the
increased opportunities for bias, but there are many examples of
well-designed studies that have yielded important results ..."

Strengths of Case-Control Studies - two are discussed:
  Efficiency for rare outcomes
  Usefulness for generating hypotheses

Weaknesses of case-control studies
  "Case control studies are a cheap and practical way to investigate
risk factors for rare diseases, or to generate hypotheses about new
diseases or unusual outbreaks.  These are great strengths, but they are
achieved at a considerable cost." ... "But the biggest weakness of
case-control studies is their _increased_susceptibility_to_bias_.  This
bias comes chiefly from two sources: the separate sampling of the cases
and controls, and the retrospective measurement of the predictor
variables."

  "In general, sampling bias is important when the sample of cases is
unrepresentative _with_respect_to_the_risk_factor_being_studied_."
[Note that this is a tricky concept.  In Kellerman, et al, they took all
of the in-home homicides in those locales/times - and so it might be
thought that there was no sampling and therefore no possible bias for
cases.  But there are several ways in which there could be a sampling 
bias.  1) Not all of the homicides got into the analysis - 444 cases
were reduced to 420 for the study, then to 405 because of lack of
controls, then to 388 - of which only 357 had controls matched for all 
four matching characteristics.  But then they only used 316 of matched
pairs in the final multivariate analysis, in order to only include pairs
for which they had data on "the six variables of interest".  They didn't 
say what number of these 316 were matched for all four characteristics.  
So instead of a 100% sample which has no selection and therefore can't
be biased, there was a 71% sample which allows for the existence of bias.
2) Only 3 counties in the US were used, and the study is being used to 
reach conclusions about the whole US - therefore this raises the question 
whether these 3 counties are an unbiased sample of the US.  Note that
the homicide rate in these counties was approximately 50% greater than the
overall U.S. rate.  This then brings into question whether the results,
even without question of procedural flaws, could be representative of
the U.S. population.  3) The cases were selected by the criterion of an 
in-home homicide, but the risk factor most discussed is the keeping of a 
gun for purposes of protection.  There is nothing to show that the 
homicide cases studied are representative of the households which keep a 
gun for protection.]

  "The more difficult decisions faced ... usually relate to the more
open-ended task of _selecting_the_controls_.  The general goal is to find
an accessible population at risk of the disease who otherwise represent
the same population as the cases, and there are four main strategies for
achieving this goal."

  "1. Sampling the cases and controls in the same way:  One strategy is
to choose a control group that _compensates_ for an unrepresentative
sample of cases by being unrepresentative in the same way."

  "2. Matching: Matching is a simple method of ensuring that cases and
controls are comparable with respect to major factors that are related
to the disease but not of interest to the investigator."

  "Differential measurement bias, and how to control it:  The second
particular problem of case-control studies is bias that affects one
group more than the other caused by the retrospective approach to
measuring the predictor variables."  [e.g. "_differential_ recall bias"
in which one group, case or control, is more likely to remember or
report risk factors.  In particular, any difference in reporting gun
ownership would cause a bias in the Kellerman, et al study.]

  "... there are two specific strategies for avoiding bias in measuring
risk factors in case-control studies."
  "1. Use of data recorded before the outcome occurred: ..."
  "2. Blinding: ... "Ideally, neither the study subjects nor the
investigators should know which subjects are cases and which are
controls. ... In practice, this is often difficult.  The subjects know
whether they are sick or well, ..." [In the Kellerman, et al study,
there is a distinction of whether or not a homicide occurred at home -
which doesn't seem to be amenable to blinding.]

Measures of association (Appendix 8A)

Predictor variable   Outcome variable
                       present absent
     present             a       b
     absent              c       d

Relative risk  ~~ Odds ratio  = ad/bc    [~~ is used for wavy =]

[Appendix 8B is "Why the odds ratio can be used as an estimate for
relative risk in a case-control study"]

[Appendix 10A is "Hypothetical example of confounding" in which an
apparent association between coffee drinking and MI is shown to result
not from any actual association of coffee drinking with MI, but instead
from a high association of smoking with coffee drinking.]

  [This is a very relevant example, because overlooking confounding
variables (such as membership in a high-risk group) can easily produce 
significant but spurious associations in the results.  This is easy to
demonstrate.]

Appendix - calculation of odds ratio

Gun                 Outcomes
Ownership         dead     alive
Own gun             a        b
No gun              c        d

  The odds ratio is ad/bc.

Appendix - The Mantel-Haenszel Chi-square analysis for matched pairs 
is a special case of their analysis for a stratified sample in 
case-control studies.

The odds ratio is

     B/C    where B is the number of pairs where the
            case has gun ownership and the control doesn't  and
            C is the opposite (discordant pairs)
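
     A minimal Python sketch of the B/C calculation, assuming each
matched pair is recorded as a (case has gun, control has gun) tuple.
The pair data below are hypothetical, made up only to show the
counting; pairs in which case and control agree do not enter the
ratio at all.

```python
def matched_pair_odds_ratio(pairs):
    """Matched-pair odds ratio B/C from a list of
    (case_has_gun, control_has_gun) boolean tuples.

    Raises ZeroDivisionError if no pair has a gun-owning control
    alongside a case without a gun (C = 0).
    """
    b = sum(1 for case, control in pairs if case and not control)
    c = sum(1 for case, control in pairs if control and not case)
    return b / c


# Hypothetical pairs, for illustration only
pairs = (
    [(True, False)] * 3     # case has gun, control doesn't  -> B
    + [(False, True)] * 2   # control has gun, case doesn't  -> C
    + [(True, True)] * 4    # both have guns: ignored
    + [(False, False)] * 1  # neither has a gun: ignored
)

ratio = matched_pair_odds_ratio(pairs)
print(ratio)
```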

Appendix - Justification for existence of sub-group/stratification.

     A sociologist colleague lent me a copy of a textbook, Criminology,
2nd ed. by John E. Conklin, 1986, Macmillan Pub. Co.  It has a Chapter
on "Criminal Careers" about people who commit crimes repeatedly.  In
this chapter, a section on 'Delinquent Careers' (starting on pg. 308)
gives some direct data on subgrouping.  Two research studies in
different cities are discussed.

     A study of a birth cohort (Delinquency in a Birth Cohort, Wolfgang,
Figlio and Sellin, 1972) covering males 10 - 18 years old in
Philadelphia showed the following results:

Type of Offender   % of Cohort  % of all of Cohort's
                                Police Contacts
Nondelinquents         65.1          0
One-time offenders     16.2         15.8
Nonchronic offenders   12.4         32.3
Chronic offenders       6.3         51.9

     Fewer than 5 contacts count as nonchronic. They point out that the 
one-time offender group usually were involved in relatively trivial
offenses.  Note that 10% of the offenders would account for roughly 2/3
of all police contacts.

     Another cohort study in Racine, Wisc. of juveniles and young adults
(Shannon, 1982) showed similar concentration with

  5 - 7 %   accounted for over 1/2 of all non-traffic police contacts
   ~20%      "        "    "   80%    "      "         "       "
  5 - 14%    "        "     ALL    of the felony arrests.

     These are cohort studies, and are therefore not susceptible to
sampling bias and other such problems, as are many of the other (easier
to run) studies.  We have the inescapable conclusion that there is
subgrouping in the population, with a small fraction of the population
accounting for a large portion of serious criminal behavior.

Appendix - population figures 

County      population     dates of study    duration     pop-years
            (1990 census)
 Cuyahoga   1412140        1/1/90-10/23/92    2.81 years  3,970,000
 Shelby      826330        10/23/87-92        4 years     3,305,000
 King       1507319          "     "          "   "       6,030,000

Total pop-years:  ~13,300,000
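
     The pop-years column is population times study duration.  A
Python sketch using the populations and durations from the table above
(the dictionary layout is an illustrative choice):

```python
# County: (1990 census population, study duration in years)
counties = {
    "Cuyahoga": (1_412_140, 2.81),
    "Shelby": (826_330, 4),
    "King": (1_507_319, 4),
}

# Person-years of exposure for each county
pop_years = {name: pop * years for name, (pop, years) in counties.items()}
total_pop_years = sum(pop_years.values())

for name, value in pop_years.items():
    print(f"{name:10s} {value:12,.0f}")
print(f"{'Total':10s} {total_pop_years:12,.0f}")
```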

--------------

--henry schaffer