Appendix 10.1 Input of Lisrel for Data Analysis of a Classic MTMM Study
Estimation of Reliability, Validity, and Method Effects
y_{ij} = r_{ij} t_{ij} + e_{ij}    (10.2A.1)
t_{ij} = v_{ij} f_i + m_{ij} m_j    (10.2A.2)
From this model, one can derive the most commonly used MTMM model by
substituting Equation (10.2A.2) into Equation (10.2A.1). This results in model
(10.2A.3) or, equivalently, (10.2A.4):
y_{ij} = r_{ij} v_{ij} f_i + r_{ij} m_{ij} m_j + e_{ij}    (10.2A.3)
or
y_{ij} = q_{ij} f_i + s_{ij} m_j + e_{ij}    (10.2A.4)
where q_{ij} = r_{ij} v_{ij} and s_{ij} = r_{ij} m_{ij}.
One advantage of this formulation is that q_{ij} represents the strength of the relationship between the variable of interest and the observed variable and is an important
indicator of the total quality of an instrument. In addition, s_{ij} represents the systematic
effect of method j on response y_{ij}. Another advantage is that it simplifies Equation
(9.1) to (10.2A.5):
r(y_{1j}, y_{2j}) = q_{1j} r(f_1, f_2) q_{2j} + s_{1j} s_{2j}    (10.2A.5)
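To make the relation between the two parameterizations concrete, the following minimal sketch computes q and s from a reliability and a validity coefficient and then applies Equation (10.2A.5). The function names and the input values are hypothetical. The sketch also assumes a fully standardized model in which the method effect equals sqrt(1 - v^2); that identity is an assumption of the example, not a statement taken from the text above.

```python
import math


def quality_and_method_coefficients(r, v, m=None):
    """Return (q, s) for one measure, with q = r * v and s = r * m.

    If m is not given, it is set to sqrt(1 - v**2). This is an assumption
    of the sketch (standardized true score, uncorrelated trait and method).
    """
    if m is None:
        m = math.sqrt(1.0 - v ** 2)
    return r * v, r * m


def implied_correlation(r1, v1, r2, v2, trait_corr):
    """Correlation between two measures sharing one method, Eq. (10.2A.5)."""
    q1, s1 = quality_and_method_coefficients(r1, v1)
    q2, s2 = quality_and_method_coefficients(r2, v2)
    return q1 * trait_corr * q2 + s1 * s2


# Hypothetical coefficients, chosen only for illustration.
q, s = quality_and_method_coefficients(r=0.8, v=0.9)
rho = implied_correlation(0.8, 0.9, 0.8, 0.9, trait_corr=0.5)
print(round(q, 4), round(s, 4), round(rho, 4))
```

The example shows why q and s are mixtures: the same q can arise from many combinations of r and v, which is the limitation discussed next.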
Although this model is quite instrumental, some limitations are connected with it.
One of these is that the parameters themselves are products of more fundamental
parameters. This creates problems because the estimates for the data quality of any
model are derived only after the MTMM experiment is completed and the data analyzed. Therefore, in order to apply this approach for each item in the survey, two
more requests have to be asked to estimate the item quality. The cost of doing this
makes this approach unrealistic for standard survey research.
An alternative is to study how different questionnaire
design choices affect the quality criteria and to use the results for predicting the data quality before and after the data are collected. By conducting a meta-analysis to determine the effects of the question design choices on the quality
criteria, we would eliminate the additional survey items needed in substantive
surveys. It is an approach that has been suggested by Andrews (1984) and has
been applied in several other studies (Költringer 1995; Scherpenzeel and Saris
1997; Corten et al. 2002; Saris and Gallhofer 2007b).
In such a meta-analysis, it is desirable that the parameters to be estimated
represent only one criterion and not mixtures of different criteria, in order to keep
the explanation clear. It is for this particular reason that Saris and Andrews (1991)
have suggested an alternative parameterization of the classical model: the TS model,
presented in Equations (10.2A.1) and (10.2A.2), where the reliability and validity
coefficients are separated and hence can be estimated independently from each
other. Both coefficients can also vary between 0 and 1, which does not occur if one
employs the reliability and the validity coefficient as Andrews (1984) did, starting
with the classical model (10.2A.5). In agreement with Saris and Andrews (1991), we
suggested that for the meta-analysis the TS MTMM model has major advantages,
and therefore, we have presented the TS model in this chapter.
11 Split-Ballot Multitrait–Multimethod Designs¹
Although the classical MTMM design is effective, it has one major problem, namely,
that one has to ask the respondents three times nearly the same questions. As a
consequence, people can become bored and answer with less seriousness as questions
are repeated. It is also possible that they remember what they have said before, which
means that the observed responses are not independent of each other.
In order to cope with this problem, there are two possible strategies: (1) to increase
the time between the observations so that the previous answers can no longer be remembered or (2) to reduce the number of repeated observations. The first
approach has been tried in the past. As discussed earlier, the same questions can be
repeated after 25 minutes if similar questions are asked in between. This is a
solution for the second measures but not for the third ones. For that reason, as mentioned
before, Scherpenzeel (1995) used in many experiments a panel design where, at
each point in time, only two observations of the same questions are made. However,
this approach requires a panel, which is commonly not available. The second strategy is
to ask each respondent fewer questions while compensating for the “missing data by
design” by collecting data from different subsamples of the population. In doing so,
the designs look very similar to the frequently used split-ballot experiments and
hence are called the “split-ballot MTMM design” or SB-MTMM design. This design
¹ This chapter is based on a paper by Saris, W. E., A. Satorra, and G. Coenders (2004). A new approach to
evaluating the quality of measurement instruments: The split-ballot MTMM design. Sociological
Methodology, complemented with the results recently published by Revilla and Saris (2013).
Design, Evaluation, and Analysis of Questionnaires for Survey Research, Second Edition.
Willem E. Saris and Irmtraud N. Gallhofer.
© 2014 John Wiley & Sons, Inc. Published 2014 by John Wiley & Sons, Inc.
will be discussed in this chapter because it is the design that has been used for all
MTMM experiments in the European Social Survey (ESS).
11.1 The Split-Ballot MTMM Design
In the commonly used split-ballot experiments, random samples from the same
population receive different versions of the same requests. In other words, each
respondent group gets one method. The split-ballot design makes it possible to compare the response distributions of the different requests across their forms and to
assess their possible relative biases (Schuman and Presser 1981; Billiet et al. 1986).
In the SB-MTMM design, random samples of the same population are also used
but with the difference that these groups get two different forms of the same request.
In total, it is one less repetition than in the classical MTMM design and one more
than in the commonly used split-ballot designs. We will show that the design,
suggested by Saris (1998c), combines the benefits of the split-ballot approach and
the MTMM approach in that it enables researchers to evaluate measurement bias,
reliability, and validity simultaneously, and that it does so, while reducing response
burden. Applications of this approach can also be found in Saris (1998c) and
Kogovšek et al. (2001). A more complex alternative design has been suggested by
Bunting et al. (2002). The suggestion to use split-ballot designs for structural equation
models (SEM) can be traced back to Arminger and Sobel (1991).
11.1.1 The Two-Group Design
The two-group SB-MTMM design is structured as follows. The sample is split
randomly into two groups. One group has to answer three survey items formulated by
method 1, while the other group is given the same survey items presented in a second
form, in the MTMM literature called “method 2.” In the last part of the questionnaire,
all respondents are presented with the three items, which are now formulated in
method 3 format. The design can be summarized as tabulated in Figure 11.1.
In summary, under the two-group design, the researcher draws two comparable
random samples from the same population and presents requests about at least three
traits to each sample twice: the first time in a form (method) that differs between the
samples and, after sufficient time has elapsed, a second time in another form that is the same for both samples. Van Meurs
and Saris (1990) have demonstrated that after 25 minutes the memory effects are
negligible if similar questions have been asked in between the two sets of questions.
This time gap is enough to obtain independent measures in most circumstances.
          Time 1   Time 2
Sample 1  Form 1   Form 3
Sample 2  Form 2   Form 3
Figure 11.1 The two-group SB-MTMM design.
Table 11.1 Samples providing data for correlation estimation
          Method 1   Method 2   Method 3
Method 1  Sample 1
Method 2  None       Sample 2
Method 3  Sample 1   Sample 2   Sample 1 + 2
The design in Figure 11.1 matches the standard split-ballot design at time 1 and
provides information about differences in response distributions between the
methods. Combined with the information obtained at time 2, this design provides
extra information. The question still remains whether the reliability, validity, and
method effects can be estimated from these data, since each respondent answers only
two requests about the same trait and not three, as is required in the classical
MTMM design. The answer is not immediately evident since the necessary
information for the 9 × 9 correlation matrix comes from different groups and is by
design incomplete (see Table 11.1). Table 11.1 shows the groups that provide data for
estimating variances and correlations between requests using either the same or different forms (methods).
In contrast to the classical design, no correlations are obtained for form 1 and form
2 requests, as they are missing by design. Otherwise, all correlations in the 9 × 9
matrix can be obtained on the basis of one or two samples, but the data come from
different samples.
Each respondent is given the same requests only twice, which reduces the response
burden considerably. Moreover, in large surveys, the sample can be split into more
subsamples so that more than one set of requests can be evaluated. However, the correlations between forms 1 and 2 cannot be estimated, resulting in a loss of degrees of
freedom (df) when estimating the model on the now incomplete correlation matrix.
This might make the estimation less effective than in the standard design, where all
correlations are available, as they are in the three-group design.
11.1.2 The Three-Group Design
The three-group design proceeds as the previous design except that three groups or
samples are used instead of two, leaving us with the following scheme (Fig. 11.2):
Using this design, all request forms are treated equally: they are measured once at
the first and later at a second point in time. Therefore, there are also no missing
correlations in the correlation matrix, as shown in Table 11.2.
Evidently, the major advantage of this approach is that all correlations can be
obtained. A second advantage is that the order effects are canceled out because each
measure comes once at the first position and another time at the second position
within the questionnaire.
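The pattern of available and missing correlations in Tables 11.1 and 11.2 follows mechanically from which forms each sample answers. The sketch below derives that pattern for the two-group and three-group designs as laid out in Figures 11.1 and 11.2. The function name and the dictionary encoding of a design are hypothetical conveniences, not part of the original text.

```python
from itertools import combinations_with_replacement


def correlation_sources(design):
    """Map each pair of methods to the samples that answer both methods.

    `design` maps a sample label to the set of methods (forms) presented
    to that sample. A pair (j, j) stands for the within-method block.
    """
    methods = sorted(set().union(*design.values()))
    return {
        (a, b): sorted(s for s, forms in design.items() if {a, b} <= forms)
        for a, b in combinations_with_replacement(methods, 2)
    }


# Designs encoded from Figures 11.1 and 11.2: sample -> forms answered.
two_group = {1: {1, 3}, 2: {2, 3}}
three_group = {1: {1, 2}, 2: {2, 3}, 3: {3, 1}}

for name, design in [("two-group", two_group), ("three-group", three_group)]:
    print(name)
    for pair, samples in correlation_sources(design).items():
        print("  methods", pair, "-> samples", samples or "none")
```

For the two-group design, the pair of methods 1 and 2 has no source sample, which is the block that is missing by design. In the three-group design, every pair is covered by at least one sample.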
A major disadvantage, however, is that the main questionnaire has to be prepared
in three different formats for the three different groups. In addition, the same measures are not obtained from all respondents. This may raise a serious issue in the
          Time 1   Time 2
Sample 1  Form 1   Form 2
Sample 2  Form 2   Form 3
Sample 3  Form 3   Form 1
Figure 11.2 The three-group SB-MTMM design.
Table 11.2 Samples providing data for correlation estimation
          Method 1          Method 2          Method 3
Method 1  Samples 1 and 3
Method 2  Sample 1          Samples 1 and 2
Method 3  Sample 3          Sample 2          Samples 2 and 3
analysis because the sample size is reduced for the relationships with the
other variables.² This design was first used by Kogovšek et al. (2001).
11.1.3 Other SB-MTMM Designs
Other designs that are guided by the principles discussed previously can also be
designed. The effects of different factors can be studied simultaneously, and interaction effects can be estimated. However, an alternative to this type of study is to
employ a meta-analysis of many separate MTMM experiments under different
conditions, which will be elaborated in the next chapter.
There is one other design that deserves special attention: the SB-MTMM design
that makes use of exact replications of methods. In doing so, the occasion effects
can be studied without placing an extra response burden on respondents. A possible
design is illustrated in Figure 11.3.
Figure 11.3 models a complete four-group design for two methods and their
replications. The advantage of this design is that the same information as with the
other two designs is obtained, and in addition to the previous design, the occasion-specific variance can be estimated. This is only possible if exact repetition of the
same measures is included in the design. In order to estimate these effects, the model
specified in Chapter 10 has to be extended with an occasion-specific factor (Saris
et al. 2004). This design can be reduced to a three-group design by leaving out sample
2 or 3 or alternatively sample 1 or 4, assuming that the order effects are negligible or
that the occasion effects are the same for the different methods.
² A possible alternative would be to add a relatively small subsample to the study. For the whole sample,
one would use method 1, the method expected to give the best results, in the main questionnaire; method 2
for one subgroup; and method 3 for another subgroup in an additional part of the questionnaire that relates
to methodology. With the subsample, one would use method 2 for the main questionnaire and method 3 in
the methodological part. In this way, method 1 is available for all people, and all three combinations of the
forms are also available. Also, one could get an estimate of the complete covariance matrix for the MTMM
analysis without harming the substantive analysis. But this design would cost extra money for the additional
subsample. The appropriate size of the subsamples is a matter for further research.
          Time 1   Time 2
Sample 1  Form 1   Form 1
Sample 2  Form 1   Form 2
Sample 3  Form 2   Form 1
Sample 4  Form 2   Form 2
Figure 11.3 A four-group SB-MTMM design with exact replications.
Another similar design can be developed including three different methods;
however, it is beyond the scope of this chapter to discuss further possibilities. For
further information, we refer to Saris et al. (2004) and the first two large-scale
applications of this design in the ESS (2002).
We hope to have made clear that the major advantage of these designs is the reduction
of the response burden from three to two observations. To show
that these designs can also be applied in practice, we next discuss how the
parameters can be estimated from the collected data.
11.2 Estimating and Testing Models for Split-Ballot MTMM Experiments
The split-ballot MTMM experiment differs from the standard approach in that
different equivalent samples of the same population are studied instead of just one.
Given that the random samples are drawn from the same population, it is natural to
assume that the model is exactly the same for all respondents and equal to the model
we have specified in Figure 10.4, which includes the restrictions on the parameters
suggested by Saris and Andrews (1991). The only difference is that not all requests
have been asked in every group.
Since the assignment of individuals to groups has been made at random, and there
is a large sample in each group, the most natural approach for estimating is the
multiple-group SEM method (Jöreskog 1971). It is available in most of the SEM
software packages. We refer to this approach as multiple-group structural equation
model or MGSEM.³ As indicated in the previous section, a common model is fitted
across the samples, with equality constraints for all the parameters across groups.
With the current software and applying the theory for multiple-group analysis,
estimation can be made by using the maximum likelihood (ML) method or any other
standard estimation procedure in SEM. In the case of non-normal data, robust
standard errors and test statistics are available in the standard software packages.
³ Because each group will be confronted with partially different measures of the same traits, certain
software for multiple-group analysis will require some small tricks to be applied. This is the case for
LISREL, where the standard approach expects the same set of observable variables in each group. Simple
tricks to handle such a situation of the set of observable variables differing across groups were already
described in the early work of Jöreskog (1971) and in the manual of the early versions of the LISREL
program; such tricks are also described in Allison (1987). Multiple-group analysis with the software EQS,
for example, does not require the same number of variables in the different groups.
For a review of multiple-group analysis in SEM as applied to all the designs enumerated in the present chapter, see Satorra (2000).
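For orientation, the multiple-group ML approach referred to above minimizes a weighted sum of group-specific discrepancy functions. The expression below gives the standard textbook form. The notation (G groups, sample sizes n_g with total N, observed matrices S_g, model-implied matrices Sigma_g, and p_g observed variables in group g) is introduced here for illustration only and is not taken from the chapter. The exact weighting (for example, n_g versus n_g - 1) differs across programs.

```latex
% Multiple-group ML discrepancy function (standard form; notation assumed)
F(\theta) = \sum_{g=1}^{G} \frac{n_g}{N}\, F_g(\theta),
\qquad
F_g(\theta) = \ln\lvert \Sigma_g(\theta) \rvert
            + \operatorname{tr}\!\left( S_g\, \Sigma_g(\theta)^{-1} \right)
            - \ln\lvert S_g \rvert - p_g
```

Because the same parameter vector appears in every group, the equality constraints across groups are built into the function, while each group contributes information only on the variables it actually observed.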
The incomplete data setup we are facing could also be considered as a missing
data problem (Muthén et al. 1987). However, the missing data approach assumes
normality and does not provide the theoretical basis for the robust standard
errors and corrected test statistics that are currently available in MGSEM software.
Thus, since the multiple-group option offers the possibility of standard errors and test
statistics that are protected from non-normality, we suggest that the multiple-group
approach is preferable.
Given this situation, we suggest the MGSEM approach for estimating and testing
the model on SB-MTMM data. In doing so, the correlation matrices are analyzed,
while the data quality criteria (reliability, validity coefficients, and method effects)
are obtained by standardizing the solution.
Although the statistical literature suggests that data quality indicators can be estimated
using the SB-MTMM designs, we need to be careful when using the two-group designs
with incomplete data, because they may lead to empirical underidentification problems. Before addressing this issue, we will illustrate an application of the two designs
based on data from the same study discussed in the previous chapters.
11.3 Empirical Examples
In Chapters 9 and 10, an empirical example of the classical MTMM experiment was
discussed. In order to illustrate the difference between this design and the SB-MTMM
designs, we have randomly split the total sample of that study (n = 428) into two
groups (n = 210) and into three groups (n = 140). Thereafter, we took only those variables that
would have been collected had the two- or three-group MTMM design been used, for
each group. In this way, we obtained incomplete correlation matrices for each group.
Next, we estimated the model, using the multiple-group approach. Now, we will
investigate the results, starting with the three-group design, where every
correlation can be estimated from at least one group. Later, we discuss the results for the
two-group design, where the correlation information is incomplete.
11.3.1 Results for the Three-Group Design
The random sampling of the different groups and selection of the variables according
to the three-group design has led to the results summarized in Table 11.3. First, this
table indicates that in each sample incomplete data are obtained for the MTMM
matrix. The correlations for the unobserved variables are represented by 0s and the
variances by 1s. This presentation is necessary for the multiple-group analysis with
incomplete data in LISREL but does not have to be used in general.
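The fill-in convention just described (0 for correlations involving an unobserved variable, 1 for its variance) can be sketched as follows. The function name is hypothetical, and the tiny three-variable example is for illustration only. The value .469 is the first off-diagonal entry reported for the first subsample in Table 11.3.

```python
def pad_for_lisrel(observed_corr, observed_idx, n_vars=9):
    """Embed an observed correlation matrix in an n_vars x n_vars matrix.

    Cells that involve an unobserved variable are set to 0 and the
    diagonal cells of unobserved variables are set to 1.
    """
    # Start from an identity matrix: 1s on the diagonal, 0s elsewhere.
    full = [[1.0 if i == j else 0.0 for j in range(n_vars)]
            for i in range(n_vars)]
    # Copy the observed correlations into their positions.
    for a, i in enumerate(observed_idx):
        for b, j in enumerate(observed_idx):
            full[i][j] = observed_corr[a][b]
    return full


# Illustration: two observed variables (positions 0 and 1) out of three.
observed = [[1.0, 0.469],
            [0.469, 1.0]]
padded = pad_for_lisrel(observed, observed_idx=[0, 1], n_vars=3)
for row in padded:
    print(row)
```

The padded cells carry no information. They only give every group a matrix of the same dimension, which is why the degrees of freedom reported by the program have to be corrected afterward.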
Keep in mind that these correlation matrices are incomplete because at each time
interval, one set of variables is missing. We see also that we have summarized the
response distributions in means and standard deviations, which can be compared
across groups as is done in the standard split-ballot experiments. However, in this
Table 11.3 Data for three-group SB-MTMM analysis on the basis of three random
samples from the British pilot study of the ESS
Correlations, means, and standard deviations of the first subsample
Correlations
1.00
.469 1.00
.250 .415 1.00
.0 .0 .0 1.00
.0 .0 .0 .0 1.00
.0 .0 .0 .0 .0 1.00
−.524 −.322 −.212 .0 .0 .0 1.00
−.313 −.523 −.273 .0 .0 .0 .509 1.00
−.244 −.313 −.517 .0 .0 .0 .442 .461 1.00
Means
2.39 2.69 2.41 .0 .0 .0 2.09 1.77 2.02
Standard deviations
.70 .71 .78 1.0 1.0 1.0 .71 .68 .73
Correlations, means, and standard deviations of the second subsample
Correlations
1.00
.0 1.00
.0 .0 1.00
.0 .0 .0 1.00
.0 .0 .0 .598 1.00
.0 .0 .0 .601 .694 1.00
.0 .0 .0 .588 .398 .517 1.00
.0 .0 .0 .395 .690 .504 .547 1.00
.0 .0 .0 .397 .462 .571 .545 .564 1.00
Means
.0 .0 .0 5.22 4.30 4.98 1.91 1.69 2.00
Standard deviations
1.0 1.0 1.0 2.27 2.51 2.47 .69 .65 .71
Correlations, means, and standard deviations of the third subsample
Correlations
1.00
.469 1.00
.393 .605 1.00
−.669 −.454 −.489 1.00
−.512 −.669 −.564 .707 1.00
−.495 −.508 −.742 .693 .729 1.00
.0 .0 .0 .0 .0 .0 1.00
.0 .0 .0 .0 .0 .0 .0 1.00
.0 .0 .0 .0 .0 .0 .0 .0 1.00
Means
2.41 2.65 2.50 5.18 4.32 4.99 .0 .0 .0
Standard deviations
.78 .77 .90 2.39 2.39 2.53 1.0 1.0 1.0
Table 11.4 Estimates of parameters for the full sample using three methods and for
the three-group design with incomplete data in each group
                             Full sample          Three-group SB-MTMM design
                             M1    M2    M3       M1     M2    M3
Reliability coefficient for
  Q1                         .79   .91   .82      .78    .91   .84
  Q2                         .85   .94   .87      .82    .97   .86
  Q3                         .81   .93   .84      .83    .95   .77
Validity coefficient for
  Q1                         .93   .91   .85      .94    .91   .86
  Q2                         .94   .92   .87      .94    .93   .85
  Q3                         .95   .93   .88      .96    .93   .84
Method variance              .05   .73   .09      .04a   .73   .09
a This coefficient is not significantly different from 0, while all others are significantly different from 0.
case, we also want estimates for the reliability, validity, and method effects. In
estimating these coefficients from the data for the three randomly selected groups
simultaneously, we have assumed that the model is the same for all groups except for
the specification of variables selected for the three groups. The technical details of
this analysis are given in Appendix 11.1, where the LISREL input is presented.
In Table 11.4, we provide the results of the estimation as provided by LISREL
using the ML estimator.⁴ The table also contains the full sample estimates for
comparison. Given that differences between the groups can be expected on the
basis of sampling fluctuations alone, the similarity between the results for
the two designs indicates that the three-group SB-MTMM design can provide
estimates for the parameters of the MTMM model that are very close to the estimates of the classical design. This holds even though the correlation matrices are rather
incomplete, since the respondents are asked to answer fewer requests about the
same topic.
Moreover, the fact that the program did not indicate identification problems
suggests that the model is identified even though the correlation matrices in the
different subgroups are incomplete. Let us now investigate the same example in an
identical manner assuming that a two-group design has been used.
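The correction of the degrees of freedom mentioned in the footnote to the LISREL results can be verified with a short calculation. The sketch below counts the distinct variances and correlations in a matrix of p variables as p(p + 1)/2 and subtracts the cells that were only filled in with 0s and 1s. The function names are hypothetical. The inputs (a reported df of 111, three groups, and six observed variables out of nine in each group) are taken from the text.

```python
def unique_moments(p):
    """Number of distinct variances and correlations for p variables."""
    return p * (p + 1) // 2


def corrected_df(reported_df, n_groups, n_total_vars, n_observed_vars):
    """Subtract the padded (uninformative) cells from the reported df.

    Returns the corrected df and the number of padded cells per group.
    """
    missing_per_group = (unique_moments(n_total_vars)
                         - unique_moments(n_observed_vars))
    return reported_df - n_groups * missing_per_group, missing_per_group


df, missing = corrected_df(reported_df=111, n_groups=3,
                           n_total_vars=9, n_observed_vars=6)
print(missing, df)  # prints: 24 39
```

Each 9 × 9 matrix has 45 distinct elements, of which only 21 are observed, so 24 elements per group are padding. Removing 3 × 24 = 72 from the reported 111 gives the 39 degrees of freedom stated in the footnote.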
11.3.2 Two-Group SB-MTMM Design
Using the two-group design, the same model is assumed to apply for the whole
group, and the analysis is carried out in exactly the same manner. The data for this
design are presented in Table 11.5. The procedure for filling in the empty cells in the
⁴ In this case, LISREL reports a chi-square of 54.7 with df = 111. However, this number of df is incorrect because
in each matrix 24 correlations and variances were missing, so the df should be reduced by 3 × 24 = 72, and
the correct df are 39.
Table 11.5 Data for two-group SB-MTMM analysis on the basis of two random
samples from the British pilot study of the ESS
Correlations, means, and standard deviations of the first subsample
Correlations
1.00
.457 1.00
.347 .478 1.00
.0 .0 .0 1.00
.0 .0 .0 .0 1.00
.0 .0 .0 .0 .0 1.00
−.564 −.365 −.344 .0 .0 .0 1.00
−.366 −.597 −.359 .0 .0 .0 .546 1.00
−.350 −.386 −.530 .0 .0 .0 .512 .498 1.00
Means
2.42 2.75 2.43 .0 .0 .0 2.01 1.70 1.99
Standard deviations
.74 .76 .83 1.0 1.0 1.0 .71 .67 .73
Correlations, means, and standard deviations of the second subsample
Correlations
1.00
.0 1.00
.0 .0 1.00
.0 .0 .0 1.00
.0 .0 .0 .686 1.00
.0 .0 .0 .669 .742 1.00
.0 .0 .0 .585 .449 .441 1.00
.0 .0 .0 .464 .684 .546 .568 1.00
.0 .0 .0 .397 .516 .674 .516 .607 1.00
Means
.0 .0 .0 5.26 4.49 5.10 2.01 1.80 2.02
Standard deviations
1.0 1.0 1.0 2.38 2.40 2.51 .74 .73 .81
table was the same in Table 11.5 as in Table 11.3. An important difference between
the two designs is that in the two-group design, no correlations between the first and
the second methods are available, and so, the coefficients have to be estimated on the
basis of incomplete data.
The first analysis of these matrices did converge, but the variance of the first
method factor was negative. This issue may also arise in the classical MTMM
approach when a method factor has a variance very close to 0.
In Table 11.4, we have seen that the method variance for the first factor was not
significantly different from 0 and rather small even though the estimate was based on
two groups of 140 or 280 cases. In the two-group design, the variance has to be
estimated on the basis of 210 cases, and the program does not provide a proper
solution. A common remedy is to fix one parameter at a value close to 0. If we fix