1 The Quality of Questions Involved in the MTMM Experiments




Figure 13.2  The first page shown once the MTMM questions have been selected.

On this screen, the user can make a selection for specific questions. The study, the
language, and the country of the questions can be specified by making a selection in
the boxes at the top left. One can even ask for a specific question by typing the number
or name of the question. Imagine that we want to look at questions of round 3 of the
ESS, asked in Ireland. We can do so by the specifications presented in Figure 13.3.
Say that we want to see the results for one specific question, for example B38. One
can type the number in the text box or click directly on the question on
the screen. Doing so brings up the screen presented in Figure 13.4.
The pop-up screen presents how the question was formulated, and it indicates
what information is available for this question. First of all, we see that the quality of
the question estimated in the MTMM experiment is given, which is .557. This means
that close to 56% of the variance in the observed variable comes from the variable
that it should measure. It also means that close to 44% of the variance is error.
Sometimes, there are also other estimates of the quality of the question available,
especially predictions based on the coding of the question and the prediction program
discussed in the last chapter. MTMM questions are normally also coded, so a
prediction of the quality by the program can be obtained as well. This is also true
in this case. The so-called “authorized” prediction is .601. This prediction is called
“authorized” because it is based on a coding of this question that has been checked
for correctness by our research team at RECSM.
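The arithmetic behind these figures can be sketched in a few lines of Python. The function name `variance_shares` is our own illustration and not part of SQP; the inputs .557 and .601 are the values reported above for question B38.

```python
def variance_shares(quality):
    """Split the observed variance into a valid share and an error share.

    `quality` is a quality estimate as reported by SQP, that is, a
    proportion of variance between 0 and 1.
    """
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must lie between 0 and 1")
    return quality, 1.0 - quality


# Values reported in the text for question B38 (ESS round 3, Ireland).
mtmm_quality = 0.557       # estimate from the MTMM experiment
predicted_quality = 0.601  # "authorized" SQP prediction

valid_share, error_share = variance_shares(mtmm_quality)
gap = predicted_quality - mtmm_quality

print(f"variance due to the variable of interest: {valid_share:.1%}")
print(f"error variance: {error_share:.1%}")
print(f"prediction minus MTMM estimate: {gap:.3f}")
```

The gap between the two sources is small here, which is what the text means by the predicted value being "not very different" from the MTMM estimate.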
We see that, in this case, the predicted value is not very different from the value
obtained by the MTMM experiment. One can get more detailed information about the quality
of the question, in particular by splitting the quality into reliability and validity. This
can be done for the MTMM results by clicking on “View MTMM Results,” but one
can also click on “View prediction details.” In this case, one gets the details of the


The SQP 2.0 Program for Prediction of Quality and Improvement

Figure 13.3  The first 20 questions asked in Ireland and involved in MTMM experiments.

Figure 13.4  The question B38 of round 3 in English of Ireland.

MTMM and the SQP prediction results together. Choosing the latter option, the
screen presented in Figure 13.5 appears.
We see that in this case, the estimates by MTMM and the predictions by SQP 2.0
are rather similar. In general, this can be expected given the high correlation between



Figure 13.5  Detailed information about the quality indicators of question B38 from Ireland.

these two estimates reported in the last chapter, but there are exceptions because,
occasionally, questions can deviate from the most common questions or because the
analysis has led to a rather deviant result.
The common method variance (CMV) is also presented on the screen. The
CMV is an estimate of the correlation that the method would produce between
variables that are measured with the same method and have the same quality. In
this case, one can say that due to the method used, the correlations would be .170
too high. In Chapter 16, we will discuss how this information with respect to the
data quality can be used in data analysis in order to correct for measurement
error.
To get a different picture of the quality, one can also ask for the quality
coefficients by clicking on “View quality coefficients.” By doing so, the screen image
presented in Figure 13.6 appears.
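To make the role of the CMV mentioned above more concrete, the following is a minimal sketch of how such a correction could look. It assumes that the observed correlation equals the correlation between the variables of interest multiplied by the two quality coefficients, plus the CMV; the full treatment is left to Chapter 16. The function name `correct_correlation` and the observed correlation of .40 are hypothetical and serve only as illustration; the quality (.557) and the CMV (.170) are the values reported for B38.

```python
import math


def correct_correlation(observed_r, q1_coef, q2_coef, cmv):
    """Remove CMV and attenuation from an observed correlation.

    Assumption (not taken from the SQP screens):
        observed_r = q1_coef * true_r * q2_coef + cmv
    where q1_coef and q2_coef are quality coefficients, that is, the
    square roots of the quality estimates of the two questions.
    """
    if q1_coef <= 0 or q2_coef <= 0:
        raise ValueError("quality coefficients must be positive")
    return (observed_r - cmv) / (q1_coef * q2_coef)


quality = 0.557   # quality of B38 from the MTMM experiment
cmv = 0.170       # CMV reported for B38
q_coef = math.sqrt(quality)

# Hypothetical observed correlation between two questions that share
# the method and the quality of B38.
observed_r = 0.40

corrected_r = correct_correlation(observed_r, q_coef, q_coef, cmv)
print(f"corrected correlation: {corrected_r:.3f}")
```

Under these assumptions, subtracting the CMV lowers the correlation, while dividing by the product of the quality coefficients raises it again, so the net effect depends on both quantities.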
The quality coefficients are comparable with standardized factor loadings. In the
MTMM experiments, these coefficients have been estimated. They are the square
roots of the estimates for reliability, validity, and quality. In this screenshot, the
uncertainty ranges are also presented for all three estimates. Clearly, they overlap to a
large extent, which is what one expects if the two estimates give approximately the
same result. It should, however, be clear that these two sets of estimates are based on
very different information. One is based on the MTMM data and the other on the
codes of the question characteristics and the prediction procedure described in the
previous chapter.
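The relation between the quality estimates and the quality coefficients, and the check for overlapping uncertainty ranges, can be sketched as follows. The helper names are our own, and the two ranges are made-up numbers chosen only to show the comparison; they are not read from Figure 13.6.

```python
import math


def quality_coefficient(quality):
    """Return the coefficient that belongs to a quality estimate.

    The coefficient is the square root of the estimate, which makes it
    comparable with a standardized factor loading.
    """
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must lie between 0 and 1")
    return math.sqrt(quality)


def ranges_overlap(first, second):
    """True when two (low, high) ranges share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


# Quality estimates reported for B38.
mtmm_coef = quality_coefficient(0.557)       # MTMM experiment
predicted_coef = quality_coefficient(0.601)  # authorized SQP prediction

# Hypothetical uncertainty ranges, for illustration only.
mtmm_range = (0.70, 0.79)
predicted_range = (0.73, 0.82)

print(f"MTMM quality coefficient: {mtmm_coef:.3f}")
print(f"predicted quality coefficient: {predicted_coef:.3f}")
print("ranges overlap:", ranges_overlap(mtmm_range, predicted_range))
```

Squaring a coefficient gives back the corresponding estimate, so the two screens present the same information on different scales.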
In order to see the codes, one should click on “View prediction codes.” In doing
so, the screenshot presented in Figure 13.7 appears.



Figure 13.6  The comparison of the quality coefficients.

Figure 13.7  The codes selected for the different characteristics of the question B38.

At the right-hand side, we can see the codes for all the characteristics of the question.
These are the authorized codes of the characteristics, approved by the team of RECSM.
On the left side is the text of the question.

13.1.2  Looking for Optimal Measures for a Concept

Typical for the MTMM experiments is that alternative forms of questions for a concept
have been tested, that the quality of the questions is evaluated in the experiments, and that
a prediction of the quality is also available if the codes of the questions were completed.
For the concept discussed previously concerning good or bad consequences of
immigration for the economy, this is typically the case. We can get this information
by first going back to the screen presented in Figure 13.3 and asking the program to



Figure 13.8  The result of the search for questions with the typical word “bad.”

search for questions that are characterized by a keyword. In this case, the keyword
“bad” can be used because that is typical for these questions. There may be other
questions that are also characterized by this word. Nevertheless, we will surely end
up with all questions characterized by the word “bad.” If we search with this word,
we get the screen presented in Figure 13.8.
It happens that in this case, all four questions do indeed measure bad consequences
for the economy due to immigration. It seems that no other question contains the
same word. In round 3 of the ESS, an MTMM experiment was carried out involving
four questions for the same concept-by-intuition. In the main questionnaire, question
B38 was asked according to an item-specific scale with 11 categories going from
“bad for the economy” to “good for the economy.”
In the supplementary questionnaire, all three questions were of the agree/disagree
type. In the first group, question HS4 was asked with the use of a fully labeled
five-point scale going from “agree strongly” to “disagree strongly.” In the second group,
HS16 was presented with an 11-point scale with only the endpoints labeled from
“strongly disagree” to “strongly agree.” In the third group, question HS28 used an
eight-point scale going from “disagree strongly” to “agree strongly.”
Figure 13.8 shows that all questions were evaluated in the MTMM experiment
(m) and completely coded and checked by RECSM, which is called an authorized
coding (a), while one question was also completely coded by another user, whose
coding was not checked by RECSM (u). Clicking on a question, one can get the quality
estimates from each of the sources and more detailed information as we have shown
previously. For illustrative purposes, we have summarized the estimates provided for
the different questions in Table 13.1.
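The four alternative forms and the sources of their quality information can be summarized in a small data structure. This is only an illustrative sketch: the class `QuestionForm` and the function `with_source` are our own names, not part of SQP, and assigning the user coding (u) to HS16 follows from the discussion of that question in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionForm:
    name: str             # question label in the ESS round 3 questionnaire
    response_type: str    # "item-specific" or "agree/disagree"
    scale_points: int     # number of answer categories
    sources: frozenset    # "m" = MTMM, "a" = authorized coding, "u" = other user


forms = [
    QuestionForm("B38", "item-specific", 11, frozenset("ma")),
    QuestionForm("HS4", "agree/disagree", 5, frozenset("ma")),
    QuestionForm("HS16", "agree/disagree", 11, frozenset("mau")),
    QuestionForm("HS28", "agree/disagree", 8, frozenset("ma")),
]


def with_source(question_forms, code):
    """Return the names of the questions that have the given source code."""
    return [form.name for form in question_forms if code in form.sources]


print("MTMM estimate available:", with_source(forms, "m"))
print("authorized coding available:", with_source(forms, "a"))
print("coded by another user:", with_source(forms, "u"))
```

Keeping the source codes next to the scale characteristics makes it easy to see for which forms several quality estimates can be compared.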
For the first two questions, the MTMM estimates of the quality and the predictions based on the authorized codings are rather comparable. This is not the case for



Table 13.1  The quality estimates of four questions about consequences
of immigration for the economy
[Table body not recoverable; surviving column headers: SQP predictions, Other user]

the last two questions, that is, the agree/disagree questions with 8- and 11-point
scales. These two questions are typical cases described in the previous chapter
where the MTMM estimate of the quality is very low and the prediction based on
the question characteristics is more moderate. This is probably so because it is not
always the case in our data set that 8- and 11-point scales have such a negative
effect on the quality. In fact, the length of the scale can even have a positive effect,
but not in the case of the agree/disagree scales as we have seen in Chapter 12. It
may be, therefore, that the prediction overestimates the quality of these questions.
It seems that a user did not trust the predicted low quality of question HS16 and
coded the question again. Based on these codes, the quality prediction was, however,
exactly the same. This does not have to happen. One can compare these codes, and
then, one can see if the codes were indeed exactly the same.
We give this example because it illustrates that SQP can be used to see which type
of question is better for the concept-by-intuition one would like to measure. This
information helps to make a choice for a question in a new study. In this specific case,
the last two types of agree/disagree questions should not be chosen. The first two types
are better. The two questions have very different forms, but they are quite similar in
quality. However, this is not the end of the story because we will show in Section 13.3
that the SQP program can also give suggestions for the improvement of the questions.
In that respect, the first type of question can be improved more than the second.
This example illustrates how one can use the information from the MTMM experiments
provided by SQP for selection of higher-quality questions for new research. However, it
should be mentioned that this information does not actually exist for many questions. One
cannot do MTMM experiments for all questions given the costs and the amount of work
associated with them and because different forms of the same questions cannot
always be formulated. This is true, for example, of factual and background questions.

13.2  The Quality of Non-MTMM Questions in the Database

Moving to the second option of the SQP program, we have to go back to the first
page by clicking on the word “Home,” and then, we click on “View all questions that
are currently available.” If, on this screen, we select round 1 as the study, English as the
language, and Ireland as the country, we get the screen presented in Figure 13.9. Here,



Figure 13.9  Overview of the questions of round 1 from Ireland.

there are two possibilities: either the questions have already been coded or they have
not. Looking at Figure 13.9, we see both examples; question A3, for example, has
been coded, while question A4 has not been coded. A1 was already coded, but A2
was not coded. In reality, the latter is now coded because we used this question as an
example.
If we select question A1, we once again get the pop-up screen for this question as
before, presenting the quality prediction by SQP based on the approved coding. We
can also find the details of the quality estimates and the specification of the codes. So
far, the procedure is the same as before.¹
If we select A2, however, the process is different because A2 has not been coded
so far. In selecting this question, we get the screen presented in Figure 13.10.
In order to get a prediction of the quality of this question, the first step is to code
the question. If you click on “Code question to create my own prediction,” the screen
appears that is presented in Figure 13.11.
Selecting “Begin coding” leads one to the screen presented in Figure 13.12.
On the lower left-hand side, the question and answer categories are presented. On
the top left side, the first characteristic that we should code is presented. This is the
domain of the question. The possible categories have been indicated. At the side,
some information about this characteristic is indicated. If one selects a category, the
choice is presented on the right side of the screen, and the next characteristic to be
coded appears on the left side. This characteristic is coded in the same way, and this
process goes on until all characteristics are coded.

¹ Except that in this case, one coder did not finish the task and therefore no prediction was generated (−99).