
Excerpted from "What are the differences among different study designs, and what are the relative advantages of each?", with section headings and formatting added.

1. Two Classic Types of Studies

The two classic types of studies in biomedical research are controlled (also called experimental or intervention ([ˌɪntə’venʃn]; intervene [ˌɪntə’vi:n], to come between)) and observational (also called epidemiological [ˌepɪˌdi:mɪə’lɒdʒɪkl]). There are advantages and disadvantages to each, and an awareness of these differences makes for a savvier (savvy [‘sævɪ], informal: knowledgeable) consumer of public health information.

2. Controlled Studies

2.1 Controlled studies are sometimes called Randomized Clinical Trials (RCTs), but the terminology is confusing

Controlled and observational studies come in many forms, and to complicate matters there is some confusion in the terminology: controlled studies are sometimes called Randomized Clinical Trials (RCTs), but in fact, not all controlled studies are clinical (relating to the treatment of patients) trials. Even worse, the acronym [‘ækrənɪm] RCT is also used to mean Randomized Controlled Trial. We parse the words below in order to shed some light on the differences.

2.2 Randomized Clinical Trials

A Randomized Clinical Trial compares the impact of different drugs or treatment regimens ([ˈredʒɪmən], a prescribed course of treatment) among separate groups. A two-armed Clinical Trial would typically compare the response to different treatments in two different groups whose members are otherwise as similar as possible to each other. For example, a Randomized Clinical Trial on a proposed cancer drug would enlist a large group of people who are demographically (demography [dɪˈmɒgrəfi], the statistical study of populations; demographic [ˌdemə’ɡræfɪk], relating to population statistics) similar, and who all need cancer treatment. These people would then be randomly assigned to two groups (let’s call them A and B). Group A would be given the proposed drug and Group B would be given standard treatment. The researchers would then measure whether Group A has a better outcome than Group B. The study is called randomized because whether a person receives the new drug or the standard treatment would be decided randomly.
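The random assignment described above can be sketched in a few lines of Python. This is only an illustration, not a trial-grade procedure: the participant IDs, the function name `randomize_two_arms`, and the fixed seed are all assumptions made for the example, and real trials often use more elaborate schemes such as block or stratified randomization.

```python
import random


def randomize_two_arms(participant_ids, seed=None):
    """Randomly split participants into two arms of (nearly) equal size.

    Hypothetical helper for illustration: shuffle the list, then cut it in half.
    """
    rng = random.Random(seed)          # a seed makes the allocation reproducible
    shuffled = list(participant_ids)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}


# Ten made-up participant IDs, P001 through P010.
participants = [f"P{i:03d}" for i in range(1, 11)]
arms = randomize_two_arms(participants, seed=42)

print("Group A (proposed drug):     ", arms["A"])
print("Group B (standard treatment):", arms["B"])
```

Because chance alone decides the group, neither the participants nor the researchers can steer particular people toward the new drug.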

The most robust Randomized Clinical Trials are double blinded, meaning that even the doctors or nurses interacting with the patients don’t know who received which treatment. This precautionary ([prɪ’kɔ:ʃənərɪ], of care taken in advance) method prevents the possibility that the patient picks up on subtle signs from the doctors, which could in turn influence his or her psychology, possibly having an effect on the outcome. It also prevents the possibility that the doctors or nurses give the patients different treatment based on their knowledge of whether they are receiving the tested drug or standard care. In many cases, it is not possible to conduct a blinded study; for example, a study on whether talk-therapy is helpful for Post-Traumatic [trɔ:ˈmætɪk] Stress Disorder could not be designed in a way that the recipients of therapy would not know that they received it.

2.3 Randomized Controlled Trials

A Randomized Controlled Trial compares an experimental group with a control group (rather than with another treatment group); hence the name. The terms Randomized Controlled Trial and Randomized Clinical Trial are often used interchangeably. Strictly speaking, the distinction is that a Randomized Controlled Trial compares an experimental group with a control (placebo [plə’si:bəʊ]) group, while a Randomized Clinical Trial compares different treatments across different groups of people, without a no-treatment control group. Many researchers do not worry about this distinction and call both RCTs.

3. Observational Studies

In contrast, an observational study might search out data about people already using a particular drug, such as hormonal ([hɔ:’məʊnl], relating to hormones) birth control, and compare it to data on similar people who are not using the drug. It is called observational because the tactic [‘tæktɪk] is to observe the effect of the drug (or an activity or a lifestyle, or whatever else) in the general population, without recruiting (recruit [rɪ’kru:t]) participants into a controlled environment. Observational studies can be designed in many different ways, depending on how groups are compared and over what period of time.

3.1 Cross-Sectional

A cross-sectional study looks at data that were collected across a whole population to provide a snapshot of that population at a single point in time. This kind of study is used to look for associations between observed properties, such as income level versus years of education, or disease incidence based on geographic location. It can also be used to assess the prevalence of disease: by sampling from the population, for example, we can estimate how common breast cancer is among women.
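As a rough sketch of how prevalence might be estimated from a sample, the snippet below computes a proportion together with a simple normal-approximation (Wald) 95% confidence interval. The function name `prevalence_estimate` and the counts are made up for illustration; they are not real survey data, and other interval methods may be preferable when the condition is very rare.

```python
import math


def prevalence_estimate(cases, sample_size, z=1.96):
    """Point estimate and approximate 95% interval for a prevalence.

    Uses the normal (Wald) approximation, which is an assumption of this sketch.
    """
    p = cases / sample_size
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    lower = max(0.0, p - z * standard_error)
    upper = min(1.0, p + z * standard_error)
    return p, lower, upper


# Hypothetical snapshot: 130 women with the condition among 10,000 sampled.
estimate, lower, upper = prevalence_estimate(cases=130, sample_size=10_000)
print(f"Estimated prevalence: {estimate:.2%} (95% CI {lower:.2%} to {upper:.2%})")
```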

The US census [ˈsensəs] provides data for many cross-sectional studies. Social scientists often use survey data collected with certain guidelines; these cross-sectional studies are called Knowledge, Attitude and Practice (KAP) studies. Many public and health policy decisions are informed by KAP studies, such as how best to introduce community storm water reduction practices in a major American suburb, or how to inform the public in Pakistan about malaria ([məˈleəriə], a mosquito-borne disease) prevention.

3.2 Longitudinal

By contrast, an observational study is called longitudinal if it includes multiple observations for each individual over time. This kind of study is used to measure changes in one or more measured characteristics of a population or subgroup that has been selected based on possession of specific properties. Cohort studies, for example, are longitudinal studies that look at a specific group of people (the cohort [ˈkəuhɔ:t]) over a period of time, typically several years. For example, a cohort study following male American veterans recently examined whether HIV infection was correlated with heart failure in this group of people. Cohort studies can be prospective (looking forward in time: starting from the present and examining future data) by enrolling people in the study in advance and following them over a period of time, or retrospective (looking backward in time: ending at the present and examining earlier data) by, for example, using archived records. Birth cohorts, groups of people born around the same time, are studied from newborn stage through several years of growth, in order to study their development with respect to correlations like their mothers’ pregnancy choices (such as consuming alcohol or caffeine).

3.3 Which is better: cross-sectional or longitudinal?

Cross-sectional studies suggest correlations, point at areas for further study, and tend to be cheaper to conduct than longitudinal studies. Many cross-sectional studies are a secondary analysis of data already collected for other purposes. However, a single correlation may be consistent with multiple hypotheses, and a single snapshot in time may give little insight into which one is correct. Cross-sectional studies cannot infer causal relationships, and cannot measure population changes over time.

For example, a population census may show that 10% of a population is at or below the poverty level. But this does not answer whether poverty is entrenched. Perhaps 10% of the population is always poor. Or, perhaps, this 10% might not be in poverty a year later, but a different 10% is. Cross-sectional studies can’t measure the difference; measurements of changes over time are required to understand this statistic more fully.
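The poverty example can be made concrete with a toy calculation. In the sketch below, two invented populations of 100 people each show a 10% poverty rate in both years, so a yearly snapshot cannot tell them apart; only tracking the same individuals reveals whether the same people stay poor. All names and numbers are hypothetical.

```python
def poverty_rate(poor_ids, population_size):
    """Cross-sectional measure: share of the population that is poor in one year."""
    return len(poor_ids) / population_size


def persistence(poor_year1, poor_year2):
    """Longitudinal measure: share of year-1 poor who are still poor in year 2.

    Requires person-level tracking, which a single snapshot does not provide.
    """
    return len(poor_year1 & poor_year2) / len(poor_year1)


population_size = 100

# Each scenario maps to (IDs of people poor in year 1, IDs of people poor in year 2).
scenarios = {
    "entrenched": (set(range(10)), set(range(10))),      # same 10 people both years
    "churning": (set(range(10)), set(range(10, 20))),    # a different 10 people
}

for name, (year1, year2) in scenarios.items():
    print(
        f"{name}: year 1 rate {poverty_rate(year1, population_size):.0%}, "
        f"year 2 rate {poverty_rate(year2, population_size):.0%}, "
        f"still poor in year 2: {persistence(year1, year2):.0%}"
    )
```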

Longitudinal studies, by making multiple measurements over time, can measure changes and give greater validity to correlations observed in a cross-sectional study. They often follow specific subgroups of a population, and can be built to keep track of data from specific individuals. In this way, confounders and bias can (at least partially) be accounted for (sometimes written "be accounted-for"; it means to be taken into account, i.e. taken into consideration), reducing erroneous [ɪˈrəʊniəs] conclusions found in cross-sectional surveys. However, these studies are more expensive to conduct.

4. Which is better: controlled or observational?

Both have their advantages, but a controlled study is generally considered more reliable. There are several reasons why: First, the people who participate are randomly assigned to one treatment regimen or another. This means that there is no inherent bias as to who is taking the medicine based on the fact that some people opt to take the drug and others don’t. Second, in a controlled study, confounding factors such as age, race, weight, sex, etc. can all be accounted for in the design of the study itself. The statisticians involved typically calculate how many participants are needed in order to identify true associations.
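To give a feel for the kind of calculation statisticians perform when sizing a trial, here is a minimal sketch using one common textbook approximation for comparing two proportions. The function name, the example outcome rates, and the defaults of a 5% two-sided significance level and 80% power are assumptions for illustration; actual trials typically rely on more detailed calculations that also allow for dropout and other design features.

```python
import math
from statistics import NormalDist


def participants_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate number of participants needed in each arm of a two-arm trial.

    Normal-approximation formula for detecting a difference between two proportions:
        n = (z_alpha + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical rates: 30% have headaches on standard care, 20% on the new medicine.
large_effect = participants_per_arm(0.30, 0.20)
# A smaller expected difference (30% versus 25%) needs many more participants.
small_effect = participants_per_arm(0.30, 0.25)

print("Per arm, 30% vs 20%:", large_effect)
print("Per arm, 30% vs 25%:", small_effect)
```

The smaller the difference the study hopes to detect, the more participants it needs, which is one reason controlled studies can be expensive.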

The other reason to favor controlled studies is that they are the gold standard to attempt to discern the cause of an observed correlation. If the group of medicine-takers has a lower rate of headaches than the non-medicine users, then it is reasonable to conclude that the medicine is the reason for the disappearing headaches. In a controlled study, the medicine-takers are similar in every measurable quality (race, age, smoking status, etc.) to the non-medicine takers.

If the study were observational and found that those who take the medicine have a lower rate of headaches, one could only conclude that the medicine and fewer headaches are correlated. There may not be a direct causal relationship. For example, it could be that those who proactively (acting in advance) take medicine also seek to reduce stress at work, and reduced job-related stress is the reason that their headaches disappear.
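The headache example can be illustrated with invented counts. In the sketch below, the medicine has no effect at all within either stress level, yet the overall comparison makes it look helpful, simply because medicine-takers are more often in the low-stress group. The table and the helper `headache_rate` are hypothetical and exist only to show the pattern.

```python
# (stress level, takes medicine) -> (number of people, number with headaches)
# All counts are made up for illustration.
counts = {
    ("low stress", True): (80, 8),
    ("low stress", False): (20, 2),
    ("high stress", True): (20, 8),
    ("high stress", False): (80, 32),
}


def headache_rate(takes_medicine, stress=None):
    """Headache rate for takers or non-takers, optionally within one stress level."""
    people = headaches = 0
    for (level, medicine), (n, h) in counts.items():
        if medicine == takes_medicine and stress in (None, level):
            people += n
            headaches += h
    return headaches / people


# Overall comparison, ignoring stress: the medicine appears to help.
print("All takers:    ", headache_rate(True))
print("All non-takers:", headache_rate(False))

# Comparison within each stress level: no difference at all.
for level in ("low stress", "high stress"):
    print(level, "takers:", headache_rate(True, level),
          "non-takers:", headache_rate(False, level))
```

Random assignment in a controlled study is meant to prevent exactly this kind of imbalance between the groups.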

However, observational studies have two important advantages: first, they are typically much larger than controlled studies. One can collect data about thousands or even millions of people (as the census bureau does) and draw conclusions that are more resilient ([rɪ’zɪlɪənt], here: robust) just because there are so many people participating in the study.

Second, controlled studies pose an ethical dilemma, that of taking away the right of the participant to make his or her own decisions. This is why, for example, all the studies on the harm of smoking are based on observational studies. Once there was sufficient reason to suspect that smoking was harmful, it would have been unethical to take two groups (randomly assigned) and have one smoke and the other not smoke in order to evaluate how smoking affects an assortment of diseases. Similarly, most studies on pregnant women and how their lifestyles are correlated with health measurements of their resulting babies are observational; it would be hard to convince a randomly assigned group of women to drink a glass of wine with each meal in order to see if alcohol causes developmental delay.

5. Case-Control Studies: A Third, Less Frequently Used Kind of Study

In addition to controlled and observational studies, there is a third kind of study (which comes up far less frequently in the media) - a case-control study. In this type of study, a group of people who have a particular trait such as a disease or condition (the case group) is compared and contrasted with individuals who do not (the control group). Such a study specifically pairs a case with one or more controls, all of whom have comparable exposure to specific known risk factors. From the comparison, the likelihood of exposure to certain risks as a cause of the trait is measured.

For example, suppose we want to evaluate the relationship between miscarriage (the loss of a pregnancy) and coffee consumption in the first trimester ([traɪˈmestə(r)], a three-month period) of pregnancy. In this scenario, two groups are formed: one is made up of women who have had miscarriages and the other is made up of women who haven’t. The women may be “matched”, by controlling for factors such as age, economic status, medical care, number of pregnancies, etc. Then the two groups are questioned about their coffee intake in the first trimester, and their answers are compared. Depending on the outcome, there may be a correlation between heavy coffee drinking and miscarriage. One should be wary, however, of “recall bias” in a study like this. Women who have had miscarriages may be more likely to remember having consumed coffee, perhaps because they feel guilty or generally remember the first trimester better because of the trauma of the miscarriage.
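One common way to compare the answers of the two groups is an odds ratio computed from a 2x2 table. The sketch below uses invented counts, not data from any real study, and treats the groups as unmatched for simplicity; a matched design would normally call for a matched-pairs analysis instead. The interval uses the standard log-odds (Woolf) approximation.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def odds_ratio_interval(a, b, c, d, z=1.96):
    """Approximate 95% interval for the odds ratio via the log-odds standard error.

    a, b: exposed and unexposed cases; c, d: exposed and unexposed controls.
    """
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Hypothetical answers from 100 cases (miscarriage) and 100 controls (no miscarriage).
heavy_coffee_cases, other_cases = 40, 60
heavy_coffee_controls, other_controls = 25, 75

ratio = odds_ratio(heavy_coffee_cases, other_cases, heavy_coffee_controls, other_controls)
lower, upper = odds_ratio_interval(
    heavy_coffee_cases, other_cases, heavy_coffee_controls, other_controls
)
print(f"Odds ratio: {ratio:.2f} (approximate 95% CI {lower:.2f} to {upper:.2f})")
```

An odds ratio above 1 indicates that heavy coffee drinking was reported more often among cases than controls; it does not by itself rule out recall bias or confounding.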

5.1 Which is better: cross-sectional or case-control?

As with the comparisons made above, the answer to such a question is determined by the goals of the study. Suppose we were conducting a study about non-Hodgkin’s [ˈhɔdʒkin] lymphoma ([lɪm’fəʊmə], a cancer of the lymphatic system). If we wanted to get an idea of the prevalence of non-Hodgkin’s lymphoma across a population (approximately 20 per 100,000 people in the US), we would design a cross-sectional study. Such a study could even be designed to identify associated risk factors for further research, such as exposure to certain pesticides. But the cross-sectional study will only identify the potential risk factors, since it cannot assess causality (without the application of sophisticated analytic tools which have their own limitations).

A case-control study on this rare disease could not, for its part, give a sense of the overall likelihood of being diagnosed with the disease, but it can be designed to measure associated risk factors. By pairing cases of the disease with controls who do not have the disease, but have the same set of exposure to presumed (presume: to assume to be true without proof) risks (such as pesticides ([ˈpestɪsaɪd], a chemical used to kill pests)), the relative risk of contracting non-Hodgkin’s lymphoma as a result of exposure to pesticides can be estimated (in practice through the odds ratio, which approximates the relative risk when the disease is rare). It is possible to build significant confounder-reduction into the design of a case-control study, something much harder to do in a cross-sectional study (for example, maybe during the year for which data are collected there was a chemical spill at a large plant producing pesticides). In contrast, a cohort study on a rare disease would be ineffective, since there would be too few cases of the disease.

Any type of study needs to control for confounding factors in order to draw appropriate conclusions. A study on light drinking while pregnant, for example, should account for the fact that women who drink are also more likely to smoke, and smoking causes fetal injury. Even in a controlled environment, confounding factors can creep in and affect the results, usually because the study designers failed to account for them. The art of conducting well-designed studies has its challenges, regardless of the nature of the study.