Instance-Control Studies


Accomplice studies take an intuitive logic to them, simply they can be very problematic when:

  1. The outcomes existence investigated are rare;
  2. There is a long time period between the exposure of interest and the development of the disease; or
  3. Information technology is expensive or very difficult to obtain exposure information from a cohort.

In the first case, the rarity of the disease requires enrollment of very big numbers of people. In the 2d case, the long period of follow-up requires efforts to go along contact with and collect issue information from individuals. In all three situations, toll and feasibility become an important business organisation.

A case-command design offers an alternative that is much more than efficient. The goal of a case-command written report is the same every bit that of accomplice studies, i.e. to estimate the magnitude of association betwixt an exposure and an outcome. All the same, instance-control studies employ a different sampling strategy that gives them greater efficiency.   As with a accomplice study, a case-command written report attempts to place all people who have adult the disease of involvement in the divers population. This is non because they are inherently more important to estimating an clan, just considering they are almost always rarer than non-diseased individuals, and one of the requirements of accurate estimation of the association is that at that place are reasonable numbers of people in both the numerators (cases) and denominators (people or person-time) in the measures of illness frequency for both exposed and reference groups. However, because nearly of the denominator is fabricated up of people who practice not develop disease, the instance-control blueprint avoids the need to collect information on the entire population by selecting a sample of the underlying population.

Rothman describes the case-control strategy as follows:

"Case-control studies are best understood by considering as the starting point a source population, which represents a hypothetical study population in which a cohort written report might have been conducted. The source population is the population that gives ascension to the cases included in the study. If a cohort study were undertaken, nosotros would ascertain the exposed and unexposed cohorts (or several cohorts) and from these populations obtain denominators for the incidence rates or risks that would exist calculated for each accomplice. Nosotros would then place the number of cases occurring in each cohort and calculate the risk or incidence rate for each. In a instance-command study the same cases are identified and classified as to whether they belong to the exposed or unexposed cohort. Instead of obtaining the denominators for the rates or risks, however, a control group is sampled from the unabridged source population that gives ascension to the cases. Individuals in the control group are then classified into exposed and unexposed categories. The purpose of the control group is to make up one's mind the relative size of the exposed and unexposed components of the source population."

To illustrate this consider the following hypothetical scenario in which the source population is Plymouth County in Massachusetts, which has a full population of 6,647 (hypothetical). Xiii people in the county have been diagnosed with an unusual illness and seven of them have a particular exposure that is suspected of existence an important contributing factor. The chief problem hither is that the disease is quite rare.

Map of Plymouth County showing red icons of people who developed hepatitis A in the outbreak

If I somehow had exposure and outcome data on all of the subjects in the source population and looked at the association using a cohort design, information technology might expect similar this:

Diseased

Non-diseased

Full

Exposed

7

1,000

1,007

Not-exposed

6

5,634

five,640

Therefore, the incidence in the exposed individuals would be seven/1,007 = 0.lxx%, and the incidence in the not-exposed individuals would be 6/5,640 = 0.eleven%. Consequently, the take chances ratio would be 0.seventy/0.xi=vi.52, suggesting that those who had the run a risk cistron (exposure) had 6.v times the gamble of getting the disease compared to those without the take a chance cistron. This is a stiff association.

In this hypothetical example, I had data on all half dozen,647 people in the source population, and I could compute the probability of illness (i.eastward., the run a risk or incidence) in both the exposed group and the not-exposed group, because I had the denominators for both the exposed and non-exposed groups.

The problem, of course, is that I usually don't take the resource to get the data on all subjects in the population. If I took a random sample of fifty-fifty five-10% of the population, I might not have any diseased people in my sample.

An alternative approach would be to use surveillance databases or administrative databases to discover most or all xiii of the cases in the source population and decide their exposure condition. However, instead of enrolling all of the other five,634 residents, suppose I were to just take a sample of the not-diseased population. In fact, suppose I only took a sample of i% of the non-diseased people and I then determined their exposure status. The data might look something like this:

Diseased

Not-diseased

Total

Exposed

7

ten

unknown

Non-exposed

half dozen

56

unknown

With this sampling approach I can no longer compute the probability of disease in each exposure group, considering I no longer have the denominators in the last column. In other words, I don't know the exposure distribution for the entire source population. All the same, the pocket-size command sample of not-diseased subjects gives me a way to estimate the exposure distribution in the source population. Then, I can't compute the probability of disease in each exposure group, but I can compute the odds of disease in each group.

The Odds Ratio

The odds of disease in the exposed grouping are seven/x, and the odds of disease in the non-exposed grouping are six/56. If I compute the odds ratio, I go (vii/x) / (5/56) = vi.56, very close to the risk ratio that I computed from data for the entire population. We volition consider odds ratios and case-control studies in much greater depth in a later module. Nonetheless, for the time existence the key things to remember are that:

  1. The sampling strategy for a case-control study is very dissimilar from that of cohort studies, despite the fact that both have the goal of estimating the magnitude of association between the exposure and the event.
  2. In a case-control study in that location is no "follow-upwardly" period. One starts past identifying diseased subjects and determines their exposure distribution; one then takes a sample of the source population that produced those cases in order to estimate the exposure distribution in the overall source population that produced the cases. [In cohort studies none of the subjects take the outcome at the first of the follow-up period.]
  3. In a case-control report, you cannot measure incidence, considering you start with diseased people and non-diseased people, so you cannot calculate relative chance.
  4. The case-command design is very efficient. In the instance above the case-control study of but 79 subjects produced an odds ratio (6.56) that was a very shut approximation to the take a chance ratio (vi.52) that was obtained from the data in the entire population.
  5. Example-command studies are particularly useful when the event is rare is uncommon in both exposed and non-exposed people.

The Difference Between "Probability" and "Odds"?

  • The probability that an upshot will occur is the fraction of times you expect to run into that event in many trials. Probabilities always range between 0 and ane.
  • The odds are defined as the probability that the upshot will occur divided past the probability that the event volition not occur.

If the probability of an event occurring is Y, so the probability of the event not occurring is 1-Y. (Case: If the probability of an effect is 0.80 (80%), then the probability that the consequence will non occur is 1-0.80 = 0.xx, or 20%.

The odds of an effect represent the ratio of the (probability that the consequence will occur) / (probability that the upshot will not occur). This could exist expressed as follows:

Odds of consequence = Y / (i-Y)

So, in this instance, if the probability of the event occurring = 0.80, then the odds are 0.80 / (1-0.lxxx) = 0.80/0.20 = 4 (i.eastward., iv to 1).

  • If a race equus caballus runs 100 races and wins 25 times and loses the other 75 times, the probability of winning is 25/100 = 0.25 or 25%, just the odds of the horse winning are 25/75 = 0.333 or 1 win to iii loses.
  • If the horse runs 100 races and wins five and loses the other 95 times, the probability of winning is 0.05 or five%, and the odds of the horse winning are 5/95 = 0.0526.
  • If the horse runs 100 races and wins 50, the probability of winning is 50/100 = 0.50 or 50%, and the odds of winning are 50/l = i (even odds).
  • If the horse runs 100 races and wins 80, the probability of winning is fourscore/100 = 0.fourscore or eighty%, and the odds of winning are 80/20 = four to i.

Notation that when the probability is depression, the odds and the probability are very similar.

Thinking man icon for upcoming question

On Sept. 8, 2011 the New York Times ran an article on the economy in which the writer began past saying "If history is a guide, the odds that the American economic system is falling into a double-dip recession have risen sharply in recent weeks and may fifty-fifty accept reached fifty pct." Farther down in the article the author quoted the economist who had been interviewed for the story. What the economist had really said was, "Whether we reach the technical definition [of a double-dip recession] I retrieve is probably close to l-50."

Question: Was the author correct in saying that the "odds" of a double-dip recession may have reached 50 percentage?

Respond

Key Concept: In a study that is designed and conducted as a case-control study, y'all cannot calculate incidence. Therefore, you cannot calculate hazard ratio or hazard divergence. Yous can simply calculate an odds ratio. Even so, in sure situations a instance-control study is the only feasible study design.