3.5 Practical Matters: Beginning a Meta-Analytic Database



meta-analysis. I suggest that you be rather inclusive during this initial screening, retaining any studies that might meet your inclusion criteria. You should
also retain any nonempirical works, such as reviews or theoretical papers;
although these do not provide empirical results for your meta-analysis, they are
worth reading (1) to identify additional studies cited in these
papers, and (2) to inform your interpretation of the results of your meta-analysis.
As you are identifying works that you will retain, it is critical to have some
way of organizing this information. I use spreadsheets such as that shown in
Table 3.1. (I have shown only four studies here; your spreadsheet will likely
be much larger.) Although you should develop an approach that meets your
own needs, this example spreadsheet contains several pieces of information
that I recommend recording. The first column contains a number for each
paper (article, chapter, dissertation, etc.) identified in the search. The number is arbitrary, but it is useful for filing purposes (as the number of papers
becomes large, it is useful to file them by number rather than, e.g., author
name). The next four columns contain citation information for the paper. This
information is useful not only for citing the paper in your write-up, but also for
identifying duplicate papers across your multiple search strategies (for this
purpose, having this information in a searchable spreadsheet is useful). The
sixth column contains the abstract, which is useful if you want to search for
specific terms within your spreadsheet. I recommend copying this information into your spreadsheet if it is electronically available, but it probably is not
worth the time needed to type this in manually. The seventh column identifies where and when the paper was found; recording the date is important
because (1) you might want to update the search near the completion of your
meta-analysis, and (2) you should report the last search dates in your presentation of your meta-analysis. The two rightmost columns (columns eight and
nine) contain information for retrieving and coding the reports. One column
indicates whether you have the report, or the status of your attempt to retrieve
it (e.g., the third paper notes that I had requested this dissertation through my
university’s interlibrary loan system). The last column will become relevant
when you begin coding the studies (see Chapters 4–8). Here, I have recorded
the person (BS = Brian Stucky, the second author on this paper) who coded
this report and the date it was coded. Recording both pieces of information
is valuable in case you later identify a problem in the coding (e.g., if one
coder was making a consistent error) or if you revise the coding protocol (you
then need to modify the coding of all studies coded before this change). In
this column, I also record when studies are excluded for a particular reason;
for instance, the fourth study was excluded because it used an adult sample
(which was one of the specified exclusion criteria in this review).
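
The nine-column layout described above could also be kept as structured records and exported to a spreadsheet. The following is only a minimal sketch in Python: the class name `PaperRecord`, its field names, the helper functions, and all example entries are hypothetical placeholders, not the layout or data of Table 3.1. It shows one way to store the columns and to flag papers that turn up under more than one search strategy.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class PaperRecord:
    """One spreadsheet row; all field names are illustrative, not from the text."""
    number: int          # arbitrary filing number
    authors: str
    year: str            # kept as text so "In press" can be stored
    title: str
    source: str          # journal, dissertation, chapter, etc.
    abstract: str = ""   # pasted only if electronically available
    found: str = ""      # where and when the paper was found
    have: str = ""       # "Yes" or the status of the retrieval attempt
    coded: str = ""      # coder initials and date, or reason for exclusion


def find_duplicates(records):
    """Return (first_number, repeat_number) pairs for papers found more than once."""
    seen = {}
    duplicates = []
    for rec in records:
        # Compare citation fields case-insensitively, ignoring stray whitespace.
        key = (rec.authors.strip().lower(), rec.year.strip().lower(),
               rec.title.strip().lower())
        if key in seen:
            duplicates.append((seen[key], rec.number))
        else:
            seen[key] = rec.number
    return duplicates


def to_csv(records):
    """Serialize the records to CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PaperRecord)])
    writer.writeheader()
    for rec in records:
        writer.writerow(asdict(rec))
    return buffer.getvalue()


# Hypothetical entries (placeholders, not real citations).
records = [
    PaperRecord(1, "Author A & Author B", "2007", "Example title one",
                "Example Journal", found="Database search (May 2007)",
                have="Yes", coded="NC, 9/12/07"),
    PaperRecord(2, "Author C", "In press", "Example title two",
                "Example Journal", found="E-mail request", have="Requested"),
    PaperRecord(3, "author a & author b", "2007", "Example Title One ",
                "Example Journal", found="Backward search (Nov. 2005)"),
]

print(find_duplicates(records))  # record 3 repeats record 1
```

Matching on authors, year, and title is a deliberately crude rule; in practice you would probably still inspect flagged pairs by hand, because the same paper is often cited with small variations.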


TABLE 3.1. Example Spreadsheet for Organizing a Literature Search
[Table body garbled in extraction; the surviving cell fragments are listed by column, without row alignment.]
Authors: Hawley, Little, and Card; Bailey and . . .; Crick and . . .
Year: In press
Title: . . . forms . . .; Predictors of peer . . .; The allure of a mean friend . . .; . . . aggression . . .
Source: Journal of Youth and Adolescence; Dissertation, University of California, Berkeley; International Journal of Behavioral Development, 31(2), . . .; Child Development, 66(3), . . .
Abstract: . . . purpose . . .; . . . the role . . .; . . . theory . . .; Assessed a form . . .
Where and when found: E-mail request; (May 2007); (Nov. 2005); Found in . . .
Coded: NC, 9/12/07; BS, 12/1/05
Note. The table lists Bailey and Ostrov as “in press” even though it was published in 2008. I left the date in the table as “in press,” however, because the table is meant to
show progress as it occurred during the time of this research (which was prior to this work being published).



Of course, you may use a different way of organizing information from
your literature search. The point is that you should have some way of organizing information that clearly records important information and avoids any
duplication of effort.

3.6 Summary
One of the most important steps of a meta-analytic review is obtaining the
sample of studies that will provide the data for your analyses. To define this
sample, we need to specify a clear set of inclusion and exclusion criteria
stating what types of studies will and will not make up this sample. We
then search the literature for studies fitting these inclusion criteria. Several
approaches to searching for literature exist, and I have described some of
the more common methods. The goal of this search is to obtain an unbiased,
typically exhaustive (i.e., complete) sample of studies.

3.7 Recommended Readings
Reed, J. G., & Baxter, P. M. (2009). Using reference databases. In H. Cooper, L. V. Hedges,
& J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd
ed., pp. 73–101). New York: Russell Sage Foundation. This chapter provides a very
detailed, practical guide to using electronic databases, including forward search databases.

Hopewell, S., Clarke, M., & Mallett, S. (2005). Grey literature and systematic reviews. In H.
R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 49–72). Hoboken, NJ: Wiley. This chapter
describes several ways of identifying and retrieving studies that are more obscure
than traditional journal articles, and discusses the biases potentially introduced by not
including this literature.

  1. The details (e.g., effect sizes, distributions around the mean) of this example will
become clearer as you read subsequent chapters. For now, you should just try to
understand the gist of this example.
  2. In principle, a meta-analysis does not need to include all studies that exist.
Instead, you can select a random sample of all existing studies on which to perform your analyses, assuming the studies you have selected provide adequate
statistical power to evaluate your research questions. I view this type of random
sampling as an extremely valuable approach to performing reviews in areas where
there is so much empirical literature that a full meta-analysis is not practical.
However, very few meta-analytic reviews use this random-sampling approach;
nearly all attempt to be exhaustive in their inclusion of studies. Unfortunately,
this typical practice of being exhaustive seems to have created a standard where
meta-analytic reviews are expected to be exhaustive, and the random-sampling
approach would likely draw criticism.
  3. Developing clear operational definitions of constructs is
important regardless of the effect sizes used, whether they involve single variables
(e.g., means or proportions) or are multivariate effect sizes (see Chapter 7).
  4. If you are particularly interested in drawing cross-cultural conclusions and there
exist adequate numbers of studies written in a tractable number of languages,
it may be possible to hire translators. However, you should remember that coding studies is an intensive effort (see Chapters 4 and 5) that requires considerable technical expertise. Because it would be difficult to find someone with both
multilingual and meta-analytic skills, and the work would require a considerable amount of that person's
time, this is not a viable alternative in the vast majority of cases. For this reason,
restricting the population of studies to those written in languages you know is
often reasonable as long as you recognize this restriction.
  5. This condition is necessary to include a study in your analyses. However, you
should also consider whether the studies that report insufficient information differ in meaningful ways, with the most relevant possibility being that the results
were nonsignificant. If you find that a considerable number of studies report
insufficient information to compute effect sizes (and other efforts, such as contacting the authors, do not alleviate this problem), then you should list these
studies in your report for transparency.
  6. Here, performing the meta-analysis with a random sample of studies might be
preferable to changing your inclusion/exclusion criteria, especially if doing so
makes the population of studies of lesser interest. Footnote 2 of this chapter
describes some of the challenges to this approach.
  7. To illustrate this cost, consider my experience when publishing the example
meta-analysis I use throughout this book: During this review process, one of the
reviewers suggested that I “plow through” the approximately 30,000 studies that
could be identified using a very general search term like “aggression.” Assuming
10 minutes to review each study for possible inclusion (which is a conservative estimate), this process would have taken over two years of 40 hours/week
reviewing. During this time, approximately 3,000 additional studies identified
using this search term would have been added, thus requiring another 3 to 4
months of full-time reviewing. Furthermore, during the coding, analysis, and
write-up of these results, a couple thousand more works would likely have been
added to the database. Although this reviewer was certainly trying to be helpful
by ensuring high recall, this example illustrates that the cost of low precision
can be substantial enough to make a meta-analysis impossible.

  8. The use of nonacademic search engines (e.g., Google Scholar) might be especially
plagued by inconsistency in what works are included. I personally do not use
these nonacademic search engines. If you do decide to use one, I recommend
not using it as a primary search method, but rather as a check of the adequacy
of your other search procedures (i.e., after searching for literature using other
methods, does this nonacademic search engine uncover additional works that
should have been included?).
  9. We did not do so in the actual meta-analysis because the number of studies using
samples outside of this age range was reasonably small.
10. To my knowledge, no one has evaluated this possibility empirically. I also suspect that factors unrelated to the effect sizes (e.g., length of time since the presentation, your persuasiveness and persistence in requesting presentations) are
more influential with regard to response than the effect sizes. But this possibility
of biased response should be kept in mind when response rates are low, and it
might be worthwhile to evaluate this possibility (through, e.g., funnel plots or
effect size–sample size correlations; see Chapter 11) among the conference presentations included in your meta-analysis.
11. I do not believe that anyone has evaluated this empirically.
12. I find it comforting to consider that, just as there has never been a flawless study
(see quote by Cooper, 2003, in Chapter 2 of the present volume), there has never
been, and never will be, a flawless meta-analysis. Although you might strive
to obtain every study within your sample, there comes a point of diminishing
returns where a tremendous amount of additional effort yields very few additional benefits. When this point is reached, your field benefits more from timely
completion and dissemination of your meta-analysis than from futile efforts to obtain
additional studies.
13. This image might seem quaint to some readers. If you prefer, point-and-click
your way through the online tables of contents of some relevant journals.
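
The arithmetic behind Note 7 can be checked with a short sketch. The function name `screening_time` and its parameter names are illustrative; the inputs (30,000 studies, 10 minutes each, 40 hours per week, then 3,000 further studies) are the figures quoted in the note, and the conversion to years assumes 52 working weeks per year.

```python
def screening_time(n_studies, minutes_per_study=10, hours_per_week=40):
    """Return (total_hours, full_time_weeks) needed to screen n_studies."""
    hours = n_studies * minutes_per_study / 60
    weeks = hours / hours_per_week
    return hours, weeks


# Initial pool identified by the very general search term.
hours, weeks = screening_time(30_000)
years = weeks / 52  # assumption: 52 working weeks per year

# Studies added to the database while the first pass was under way.
extra_hours, extra_weeks = screening_time(3_000)

print(f"{hours:.0f} hours = {weeks:.0f} weeks (about {years:.1f} years)")
print(f"{extra_hours:.0f} more hours = {extra_weeks:.1f} more weeks")
```

Under these assumptions the first pass alone comes to 5,000 hours, or 125 full-time weeks, which is consistent with the "over two years" stated in the note.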

Part II

The Building Blocks
Coding Individual Studies


Coding Study Characteristics

Performing the simplest meta-analysis, in which the goal is simply to estimate
a typical (mean) effect size and perhaps to make statistical inferences about
this effect size (see Chapters 8 and 10), requires only that you code the effect
sizes and sample sizes (to compute the standard errors of the effect sizes) from
each study (see Chapter 5). If you wish to correct these effect
sizes for artifacts, it is also necessary to code information for these corrections, such as the
reliabilities and dichotomizations of the variables comprising the effect sizes (see
Chapter 6).
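
To make concrete why effect sizes and sample sizes are the minimum that must be coded, here is a hedged sketch of one common way to combine them. It assumes, purely for illustration, that the effect sizes are correlations, that they are combined on the Fisher z scale with standard error 1/sqrt(N - 3), and that a fixed-effect, inverse-variance weighted mean is wanted; the procedures the later chapters actually recommend may differ. The function name and the input values are hypothetical.

```python
import math


def mean_correlation(studies):
    """Fixed-effect mean of correlations via Fisher's z.

    studies: list of (r, n) pairs, one per study.
    Returns (mean_r, ci_lower, ci_upper) with a 95% confidence interval.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for r, n in studies:
        z = math.atanh(r)      # Fisher's z transformation of r
        weight = n - 3         # inverse of the sampling variance 1 / (n - 3)
        total_weight += weight
        weighted_sum += weight * z
    mean_z = weighted_sum / total_weight
    se = 1 / math.sqrt(total_weight)   # standard error of the weighted mean
    lower, upper = mean_z - 1.96 * se, mean_z + 1.96 * se
    # Back-transform from the z scale to the correlation scale.
    return math.tanh(mean_z), math.tanh(lower), math.tanh(upper)


# Hypothetical coded values: (correlation, sample size) for three studies.
coded = [(0.25, 120), (0.40, 60), (0.10, 300)]
mean_r, lower, upper = mean_correlation(coded)
print(f"mean r = {mean_r:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Only two coded values per study enter this calculation, which is the point of the paragraph above; moderator analyses need more.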
Performing this sort of simple meta-analysis may seem adequate if it
answers all of your research questions. However, this approach would fail to
provide information about why effect sizes might differ across studies, a question that might be a key motivator of the meta-analysis (see Chapter 2) or a
valuable follow-up to observed heterogeneity (Chapter 8). Moderator analyses attempt to explain this heterogeneity among effect sizes by evaluating
whether coded study characteristics systematically predict variation in effect
sizes across studies (see Chapter 9). To perform these moderator analyses, it
is necessary that you code relevant study characteristics that might be useful in
predicting variation in effect sizes across studies.
In addition to coding study characteristics for moderator analyses, thorough coding of these characteristics is important simply for describing the
research basis for your meta-analysis. In other words, what does the sample
of studies from which you draw your conclusions look like? This description is
useful both in describing the population about which you can draw conclusions
(see Chapter 3 for a discussion of conceptualizing samples and populations
of studies) and in identifying gaps within the research. For example, does your
meta-analysis rely primarily on studies using a particular measure or type of
measure to the exclusion of others, or certain types of samples to the neglect
of others? Answers to these questions inform both the extent to which you can
generalize your conclusions and where it might be valuable to perform future
primary research.
In short, almost every meta-analysis will benefit from careful coding of
study characteristics, whether you use them for performing moderator analyses
or for describing the sample of studies. In this chapter, I first describe considerations in selecting study characteristics to code (Section 4.1) and then turn
to the specific topic of coding study quality (Section 4.2). I next describe the
important step of evaluating coding decisions (Section 4.3). Finally, I provide
practical suggestions for developing a coding protocol to guide the coding of
studies (Section 4.4).

4.1 Identifying Interesting Moderators
Decisions about which study characteristics to code need to be heavily
informed by your knowledge of the content area in which you are performing
a meta-analytic review. Nevertheless, I describe two sets of general considerations that I believe apply to meta-analytic reviews across fields: considering
the research questions you are interested in and considering coding certain
specific aspects of studies.
4.1.1 Considering Research Questions of Interest
Just as planning a primary research study requires you to select variables
based on your research questions, planning a meta-analysis requires that you
base your decisions about which study characteristics to code on the research
questions that you wish to answer. If your research questions are exclusively
about average effect sizes across studies (i.e., combining studies), then you
might not need to code much beyond effect sizes, sample sizes, and information for any artifact corrections you wish to make. I qualify this statement
by noting that it is still valuable to be able to provide basic descriptive information about this sample of studies to inform the generalizability of your
review. Nevertheless, the number of study characteristics that you will need
to code to address this research question adequately is small.
In contrast, if at least some of your research questions involve comparing studies (i.e., identifying whether studies with certain features yield
larger effect sizes than studies with other features), then it will be much more
important to code many study characteristics. Obviously, if you put forth
a research question about a specific characteristic moderating effect sizes
(e.g., do studies with this characteristic yield larger effect sizes than studies
without this characteristic?), then it will be necessary to code this specific
characteristic. However, you should also consider what study characteristics
might commonly co-occur with the characteristic you are interested in, and
code these. For example, if you are interested in investigating whether studies with certain types of samples yield different effect sizes (e.g., children vs.
adults), you should carefully consider the other study characteristics that
are likely to differ across these types of samples (e.g., studies of adults might
frequently rely on self-reports, whereas studies of children might frequently
rely on parent reports, observations, etc.). If you fail to code these other study
characteristics, then you cannot empirically rule out the possibility that your
results involving the coded study characteristic of interest are actually due
to these co-occurring characteristics. In contrast, if you do code these characteristics, then you are able to evaluate empirically such competing explanations (see Chapter 9).
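
One simple way to see whether two coded characteristics co-occur is to cross-tabulate them before running any moderator analysis. The sketch below uses only hypothetical coded data and hypothetical names (`crosstab`, `perfectly_confounded`, the keys `sample` and `reporter`); it is an illustration of the idea, not a procedure prescribed in this chapter.

```python
from collections import Counter


def crosstab(studies, row_key, col_key):
    """Count studies for each combination of two coded characteristics."""
    return Counter((study[row_key], study[col_key]) for study in studies)


def perfectly_confounded(table):
    """True if every level of the first characteristic occurs with only one
    level of the second, so their effects cannot be separated empirically."""
    levels = {}
    for (row, col), count in table.items():
        if count > 0:
            levels.setdefault(row, set()).add(col)
    return all(len(cols) == 1 for cols in levels.values())


# Hypothetical coding of six studies.
coded = [
    {"sample": "adults", "reporter": "self"},
    {"sample": "adults", "reporter": "self"},
    {"sample": "adults", "reporter": "self"},
    {"sample": "children", "reporter": "parent"},
    {"sample": "children", "reporter": "parent"},
    {"sample": "children", "reporter": "self"},
]

table = crosstab(coded, "sample", "reporter")
for (sample, reporter), count in sorted(table.items()):
    print(sample, reporter, count)
print("perfectly confounded:", perfectly_confounded(table))
```

In this made-up data set, all adult studies use self-reports, but one child study does as well, so the two characteristics overlap heavily without being inseparable. A table like this indicates how much leverage the data offer for evaluating the competing explanation.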
As a more extreme version of research questions involving specific moderators, some meta-analysts aim to predict all heterogeneity in effect sizes by
coded study characteristics. Although this goal tends to be quite exploratory,
and you would therefore view the findings of moderation by specific characteristics cautiously, it nevertheless is a goal you might consider. If so, then
you will necessarily code a large number of study characteristics; specifically,
you will code any study characteristics that meet two conditions. First, the
study characteristics are consistently reported in many or even most studies;
this is necessary to avoid a preponderance of missing data when you evaluate the coded characteristic as a moderator. The second condition is that the
study characteristic varies across at least some studies; this variability across
studies is necessary for the study characteristic to covary with effect sizes.
You would then enter these coded study characteristics into some large predictive model (e.g., forward stepwise regression) to explore relations between
them and variation in effect sizes.
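
The two conditions just described can be applied mechanically to a table of coded studies. The following sketch is an illustration under stated assumptions: the function name `candidate_moderators`, the use of `None` for unreported values, and the 50% reporting threshold are all hypothetical choices (the text says only "many or even most studies"), and the coded data are invented.

```python
def candidate_moderators(studies, characteristics, min_reported=0.5):
    """Return the characteristics that (1) are reported in at least
    min_reported of the studies and (2) take more than one value."""
    if not studies:
        return []
    keep = []
    for name in characteristics:
        # None (or a missing key) marks a characteristic the study did not report.
        values = [s.get(name) for s in studies if s.get(name) is not None]
        reported_share = len(values) / len(studies)
        varies = len(set(values)) > 1
        if reported_share >= min_reported and varies:
            keep.append(name)
    return keep


# Hypothetical coding of four studies.
coded = [
    {"reporter": "self", "country": "US", "mean_iq": None},
    {"reporter": "parent", "country": "US", "mean_iq": 101},
    {"reporter": "self", "country": "US", "mean_iq": None},
    {"reporter": None, "country": "US", "mean_iq": None},
]

print(candidate_moderators(coded, ["reporter", "country", "mean_iq"]))
```

Here `country` is dropped because it never varies, and `mean_iq` is dropped because only one study reports it. The surviving characteristics would then go into the exploratory predictive model described above.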
4.1.2 Considering Specific Aspects of Studies
As I mentioned, the exact study characteristics you code will depend on your
research questions and be informed by your knowledge of the topic area.
Nevertheless, four general types of characteristics should be considered in
any meta-analysis in the social sciences: characteristics of the sample, measurement, design, and source (see also Lipsey, 2009; Lipsey & Wilson, 2001,
pp. 83–86). These are summarized in Table 4.1.



TABLE 4.1. Summary of Study Characteristics to Consider Coding

Broad aspect                 Narrow aspects              Examples
Sample characteristics       Sampling procedures         Sampling from unique settings, representative sample, country
                             Demographic features        Gender composition, ethnic composition, socioeconomic status, age, IQ
Measurement characteristics  Sources of information      Self-report, other reporter (e.g., spouse, parent, teacher), observations
                             Measurement process         Covert versus overt observations, timed versus untimed performance
                             Specific measures used      Specific measure, original versus short forms, translations
Design characteristics       Type of design              Experimental, quasi-experimental, pre–post comparisons, regression discontinuity
                             Specific design features    Type of control group, length of longitudinal time span
Source characteristics       Publication status          Published versus unpublished, publication . . .
                             Year of study               Year of publication, year of data collection
                             . . .                       Funded versus unfunded, source of . . .
                             Researcher characteristics  Discipline, gender, ethnicity
Study quality                Internal validity           Use of random assignment, condition concealment, attrition
                             External validity           Use of random sampling procedures, samples based on specific subpopulations
                             Construct validity          Reliability of measures (for correction rather than exclusion or moderator
                                                         analyses), relevant measurement characteristics (described above)

4.1.2.a Sample Characteristics
Potentially relevant characteristics of the sample that you might consider
include aspects of the sampling procedure and the demographic features
of the sample. For instance, you might code sampling procedures such as
whether the sample was drawn from unique settings (e.g., from a university setting, some sort of clinical setting, a correctional facility, or specific
other settings relevant to the area), whether the study attempted to draw a