Example 6.6: Estimation of Total and Row Proportions for the Cross-Tabulation of Gender and Lifetime Major Depression Status Using the NCS-R Data
Applied Survey Data Analysis
Table 6.4
Estimated Proportions of U.S. Adults by Gender and Lifetime Major Depression Status

Description       Parameter   Estimated Proportion   Linearized SE   95% CI           Design Effect

Total Proportions
Male, no MDE      π_A0        0.406                  0.007           (0.393, 0.421)   1.87
Male, MDE         π_A1        0.072                  0.003           (0.066, 0.080)   1.64
Female, no MDE    π_B0        0.402                  0.005           (0.391, 0.413)   1.11
Female, MDE       π_B1        0.120                  0.003           (0.114, 0.126)   0.81

Row Proportions
No MDE|Male       π_0|A       0.849                  0.008           (0.833, 0.864)   2.08
MDE|Male          π_1|A       0.151                  0.008           (0.136, 0.167)   2.08
No MDE|Female     π_0|B       0.770                  0.006           (0.759, 0.782)   0.87
MDE|Female        π_1|B       0.230                  0.006           (0.218, 0.241)   0.87

Source: Analysis based on NCS-R data.
variance–covariance matrix. Internally, Stata labels the estimated row proportions
for MDE = 1 as _prop_2. The lincom command is then executed to estimate
the contrast of the male and female proportions and the standard error of this difference. The relevant Stata commands are as follows:
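* Declare the design: cluster, weight, and strata variables; linearized variance estimation
svyset seclustr [pweight=ncsrwtsh], strata(sestrat) ///
vce(linearized) singleunit(missing)
* Estimate the proportion in each MDE category separately for males and females
svy: proportion mde, over(sex)
* Contrast the MDE = 1 proportions for males and females
lincom [_prop_2]Male - [_prop_2]Female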
The output of the lincom command provides the following estimate of the male–
female difference, its standard error, and a 95% CI for the contrast of proportions.
$\hat{\Delta} = p_{male} - p_{female}$      $se(\hat{\Delta})$      $CI_{.95}(\Delta)$
–0.079                                      0.010                   (–0.098, –0.059)
Because the design-based 95% CI for the difference in proportions does not
include 0, the data suggest that the rate of lifetime major depressive episodes for
women is significantly higher than that for men.
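As a rough check on the lincom output, the contrast and its interval can be sketched in Python from the rounded MDE row proportions in Table 6.4. This is an illustration only, not the Stata computation. It assumes a zero covariance between the male and female estimates (lincom uses the full estimated variance–covariance matrix), 42 design degrees of freedom, and a hard-coded approximate t critical value of 2.018, so the endpoints agree with the reported interval only up to rounding.

```python
import math

# Rounded row proportions and linearized SEs for MDE = 1 (Table 6.4)
p_male, se_male = 0.151, 0.008
p_female, se_female = 0.230, 0.006

# Assumption: zero covariance between the two subpopulation estimates
cov = 0.0

diff = p_male - p_female
se_diff = math.sqrt(se_male**2 + se_female**2 - 2 * cov)

# Assumption: t(0.975) with 42 design degrees of freedom is about 2.018
t_crit = 2.018
ci = (diff - t_crit * se_diff, diff + t_crit * se_diff)

print(f"difference = {diff:.3f}, se = {se_diff:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The upper limit stays below zero under these assumptions, which is the feature of the interval that the conclusion above relies on.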
6.4.4 Chi-Square Tests of Independence of Rows and Columns
For a 2 × 2 table, the contrast of estimated subpopulation proportions examined in Example 6.7 is equivalent to a test of whether the response variable
(MDE) is independent of the factor variable (SEX). More generally, under
SRS, two categorical variables are independent of each other if the following
relationship of the expected proportion in row r, column c of the cross-tabulation holds:

$$\hat{\pi}_{rc} = \text{Expected proportion in row } r, \text{ column } c = \frac{n_{r+}}{n_{++}} \cdot \frac{n_{+c}}{n_{++}} = p_{r+} \cdot p_{+c} \tag{6.6}$$

© 2010 by Taylor and Francis Group, LLC
Under SRS, formal testing of the hypothesis that two categorical variables
are independent can be conducted by comparing the expected cell proportion
under the independence assumption, $\hat{\pi}_{rc}$, to the observed proportion from
the survey data, $p_{rc}$. Intuitively, if the differences $(\hat{\pi}_{rc} - p_{rc})$ are large, there is
evidence that the independence assumption does not hold and that there is
an association between the row and column variables. In standard practice,
two statistics are commonly used to test the hypothesis of independence in
two-way tables:
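Pearson’s chi-square test statistic:

$$X^2_{Pearson} = n_{++} \cdot \sum_r \sum_c (p_{rc} - \hat{\pi}_{rc})^2 / \hat{\pi}_{rc} \tag{6.7}$$

Likelihood ratio test statistic:

$$G^2 = 2 \cdot n_{++} \cdot \sum_r \sum_c p_{rc} \times \ln\left(\frac{p_{rc}}{\hat{\pi}_{rc}}\right) \tag{6.8}$$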
Under the null hypothesis of independence for rows and columns of a two-way table, these test statistics both follow a central χ² distribution with (R – 1)
× (C – 1) degrees of freedom.
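Equations 6.6 through 6.8 can be sketched in a few lines of Python. The function below is a minimal illustration rather than a survey-ready routine: it takes a table of cell proportions and a sample size, forms the expected proportions under independence, and accumulates both statistics. The cell proportions are the total proportions from Table 6.4; the sample size n = 1000 is hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def independence_tests(p, n):
    """Return (X2_Pearson, G2) for a table of cell proportions p and sample size n."""
    row = [sum(r) for r in p]            # row margins p_{r+}
    col = [sum(c) for c in zip(*p)]      # column margins p_{+c}
    x2 = 0.0
    g2 = 0.0
    for r, p_row in enumerate(p):
        for c, p_rc in enumerate(p_row):
            pi_rc = row[r] * col[c]                  # Equation 6.6
            x2 += (p_rc - pi_rc) ** 2 / pi_rc        # summand of Equation 6.7
            if p_rc > 0:                             # empty cells contribute 0 to G2
                g2 += p_rc * math.log(p_rc / pi_rc)  # summand of Equation 6.8
    return n * x2, 2 * n * g2

# Cell proportions from Table 6.4 (rows: male, female; columns: no MDE, MDE)
p = [[0.406, 0.072],
     [0.402, 0.120]]
n = 1000  # hypothetical sample size, for illustration only

x2, g2 = independence_tests(p, n)
print(f"X2 = {x2:.2f}, G2 = {g2:.2f}")
```

For a 2 × 2 table both statistics would be referred to a χ² distribution with 1 degree of freedom; the two values are typically close when no cell is sparse.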
Ignoring the complex sample design that underlies most survey data sets can
introduce bias in the estimated values of these test statistics. To correct the bias
in the estimates of the population proportions used to construct the test statistic, weighted estimates of the cell, row, and column proportions are substituted,
for example, $p_{rc} = \hat{N}_{rc} / \hat{N}_{++}$. To correct for the design effects on the sampling
variances of these proportions, two general approaches have been introduced
in the statistical literature. Both approaches involve scaling the standard $X^2_{Pearson}$
and $G^2$ test statistics by dividing them by an estimate of a generalized design
effect factor (GDEFF). Theory Box 6.2 provides a mathematical explanation of
how the generalized design effect adjustments are computed.
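The weighted cell proportions $p_{rc} = \hat{N}_{rc} / \hat{N}_{++}$ are simply ratios of summed sampling weights. A minimal Python sketch, assuming respondent-level records of the form (row category, column category, weight); the five records and their weights are hypothetical and are not NCS-R data:

```python
from collections import defaultdict

def weighted_cell_proportions(records):
    """records: iterable of (row, col, weight). Returns {(row, col): p_rc}."""
    totals = defaultdict(float)
    for row, col, weight in records:
        totals[(row, col)] += weight       # estimated cell total, N-hat_rc
    grand_total = sum(totals.values())     # estimated population total, N-hat_++
    return {cell: w / grand_total for cell, w in totals.items()}

# Hypothetical respondents: (sex, mde, sampling weight)
records = [
    ("male", 0, 1.5), ("male", 1, 0.5),
    ("female", 0, 1.0), ("female", 0, 2.0), ("female", 1, 1.0),
]

p = weighted_cell_proportions(records)
for cell in sorted(p):
    print(cell, round(p[cell], 3))
```

This addresses only the bias in the point estimates; the sampling variances still require the design-based corrections described next.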
Fellegi (1980) was the first to propose such a correction based on a generalized
design effect. Rao and Scott (1984) and later Thomas and Rao (1987) extended the
theory of generalized design effect corrections for these test statistics. The
Rao–Scott method requires the computation of generalized design effects that are
analytically more complicated than the Fellegi approach. The Rao–Scott procedures
are now the standard procedures for the analysis of categorical survey data in
software systems such as Stata and SAS. The design-adjusted Rao–Scott Pearson and
likelihood ratio chi-square test statistics are computed as follows:

$$X^2_{R-S} = X^2_{Pearson}/GDEFF, \qquad G^2_{R-S} = G^2/GDEFF \tag{6.11}$$

Under the null hypothesis of independence of rows and columns, both of these
adjusted test statistics can be referred to a χ² distribution with (R – 1) × (C – 1)
degrees of freedom. Thomas and Rao (1987) showed that a transformation of the
design-adjusted $X^2_{R-S}$ and $G^2_{R-S}$ values produced a more stable test statistic
that under the null hypothesis closely approximated an F distribution. Table 6.5
defines the F-transformed version of these two chi-square test statistics and the
corresponding F reference distribution to be used in testing the independence of
rows and columns.

Theory Box 6.2 First- and Second-Order Design Effect Corrections

The Fellegi (1980) method for a generalized design effect correction to the
chi-square test statistic is best summarized as a three-step process. First, the
average of the design effects for the R × C (unweighted) proportions involved in
the computation of the chi-square statistic is computed. The standard Pearson or
likelihood ratio chi-square statistic computed under simple random sampling is
then divided by the average design effect. The resulting adjusted chi-square test
statistic is referred to a χ² distribution with degrees of freedom equal to
(R – 1) × (C – 1) to test the null hypothesis of independence.

Rao and Scott (1984) built on this method by advocating the use of weighted
estimates of the proportions in the construction of the standard chi-square
statistics. Under the Rao–Scott method, the generalized design effect is defined
as the mean of the eigenvalues of the following matrix, D:

$$D = V_{Design} V_{SRS}^{-1} \tag{6.9}$$

In Equation 6.9, $V_{Design}$ is the matrix of design-based (e.g., linearized)
variances and covariances for the R × C vector of estimated proportions used to
construct the chi-square test statistic, and $V_{SRS}$ is the matrix of variances
and covariances for the estimated proportions given a simple random sample of the
same size. The Rao–Scott generalized design effect factor, GDEFF, for two-way
tables can then be written as follows:

$$GDEFF = \frac{\sum_r \sum_c (1 - p_{rc}) \cdot d^2(p_{rc}) - \sum_r (1 - p_{r+}) \cdot d^2(p_{r+}) - \sum_c (1 - p_{+c}) \cdot d^2(p_{+c})}{(R - 1)(C - 1)} \tag{6.10}$$

Here $d^2(\cdot)$ denotes the estimated design effect of the indicated cell or
marginal proportion. The design-adjusted test statistics introduced in Equation
6.11 are computed based on first-order design corrections of this type.

Thomas and Rao (1987) derived second-order design corrections to the test
statistics, which incorporate variability in the eigenvalues of the D matrix.
These second-order design corrections can be implemented by dividing the adjusted
test statistic based on the first-order correction (e.g.,
$X^2_{R-S} = X^2_{Pearson}/GDEFF$) by the quantity $(1 + a^2)$, where a represents
the coefficient of variation of the eigenvalues of the D matrix. The F-transformed
version of this second-order design-corrected version of the Pearson chi-square
statistic is currently the default test statistic reported by Stata’s svy: tab
command for analyses of two-way tables. Thomas and Rao (1987) used simulations to
show that this second-order correction controls Type I error rates much better
when there is substantial variance in the eigenvalues of D.
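Equations 6.10 and 6.11 can be sketched in Python as follows. This is an illustration of the first-order formula only, not the routine that Stata or SAS runs internally. The cell proportions and cell design effects come from Table 6.4; the design effects for the row and column margins and the unadjusted Pearson statistic are hypothetical values introduced here, because the source does not report them.

```python
def gdeff_first_order(p, d_cell, d_row, d_col):
    """Equation 6.10: first-order generalized design effect for an R x C table.

    p      : cell proportions p_rc (list of rows)
    d_cell : design effects of the cell proportions, same shape as p
    d_row  : design effects of the row margins p_r+
    d_col  : design effects of the column margins p_+c
    """
    n_rows, n_cols = len(p), len(p[0])
    p_row = [sum(r) for r in p]
    p_col = [sum(c) for c in zip(*p)]
    cells = sum((1 - p[r][c]) * d_cell[r][c]
                for r in range(n_rows) for c in range(n_cols))
    rows = sum((1 - p_row[r]) * d_row[r] for r in range(n_rows))
    cols = sum((1 - p_col[c]) * d_col[c] for c in range(n_cols))
    return (cells - rows - cols) / ((n_rows - 1) * (n_cols - 1))

# Cell proportions and cell design effects from Table 6.4
p = [[0.406, 0.072], [0.402, 0.120]]
d_cell = [[1.87, 1.64], [1.11, 0.81]]
# Hypothetical design effects for the sex and MDE margins (not from the source)
d_row = [1.20, 1.20]
d_col = [1.50, 1.50]

gdeff = gdeff_first_order(p, d_cell, d_row, d_col)

x2_pearson = 10.0              # hypothetical unadjusted Pearson statistic
x2_rs = x2_pearson / gdeff     # Equation 6.11
print(f"GDEFF = {gdeff:.3f}, X2_RS = {x2_rs:.2f}")
```

A GDEFF above 1 shrinks the unadjusted statistic, so a test that ignores the design would overstate the evidence against independence.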
A third form of the chi-square test statistic that may be used to test the
null hypothesis of independence of rows and columns in a cross-tabulation
of two categorical variables is the Wald chi-square test statistic (see Theory
Box 6.3). We will see in later chapters that Wald statistics play an important
role in hypothesis testing for linear and generalized linear models. However,
simulation studies have shown that the standard Pearson chi-square test statistic
and its design-adjusted forms proposed by Rao and Scott (1984) and Rao and Thomas
(1988) perform best for both sparse and nonsparse tables (Sribney, 1998) and that
these tests are more powerful than the Wald test statistic, especially for larger
tables. As a result, Stata makes the $F_{R-S,Pearson}$ test statistic (with a
second-order design correction incorporated) the default test statistic reported
by its svy: tab procedure, and this is also the default test statistic reported by
SAS PROC SURVEYFREQ (with a first-order design correction incorporated).

Table 6.5
F-Transformations of the Rao–Scott Chi-Square Test Statistics

F-Transformed Test Statistic                          F Reference Distribution under H0
$F_{R-S,Pearson} = X^2_{R-S} / [(R - 1)(C - 1)]$      $F_{(R-1)(C-1),\,(R-1)(C-1) \cdot df}$
$F_{R-S,LRT} = G^2_{R-S} / [(R - 1)(C - 1)]$          $F_{(R-1)(C-1),\,(R-1)(C-1) \cdot df}$

where R is the number of rows, C is the number of columns in the crosstab, and df
is the design degrees of freedom.
Note: Stata employs a special procedure involving a Satterthwaite correction
in deriving these F statistics. This can result in non-integer degrees of
freedom (Stater, 2008). See Table 6.6.

Theory Box 6.3 The Wald Chi-square Test of Independence for Categorical Variables

The Wald chi-square test statistic for the null hypothesis of independence of
rows and columns in a two-way table is defined as follows:

$$Q_{Wald} = \hat{Y}' (H \hat{V}(\hat{N}) H')^{-1} \hat{Y}, \tag{6.12}$$

where

$$\hat{Y} = (\hat{N} - E) \tag{6.13}$$

is a vector of R × C differences between the observed and expected cell counts,
for example, $\hat{N}_{rc} - E_{rc}$, where under the independence hypothesis
$E_{rc} = \hat{N}_{r+} \cdot \hat{N}_{+c} / \hat{N}_{++}$. The matrix term
$H \hat{V}(\hat{N}) H'$ represents the estimated variance–covariance matrix for the
vector of differences. In the case of a complex sample design, the
variance–covariance matrix of the weighted frequency counts, $\hat{V}(\hat{N})$, is
estimated using a TSL, BRR, or JRR approach that captures the effects of
stratification, clustering, and weighting.

Under the null hypothesis of independence, $Q_{Wald}$ follows a χ² distribution
with (R – 1) × (C – 1) degrees of freedom. An F-transform of the Wald chi-square
test statistic reported in SUDAAN and other software programs is

$$F_{Wald} = Q_{Wald} \times \frac{df - (R - 1)(C - 1) + 1}{(R - 1)(C - 1)\, df} \sim F_{(R-1)(C-1),\; df-(R-1)(C-1)+1} \text{ under } H_0. \tag{6.14}$$
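The F-transform of the Wald statistic in Equation 6.14 (Theory Box 6.3) is simple enough to sketch directly. The Python function below is an illustration under stated assumptions: `wald_f` is a hypothetical helper name, the Wald statistic of 6.0 is a made-up input, and the table size and design degrees of freedom are chosen to match a 4 × 2 table with 42 design degrees of freedom.

```python
def wald_f(q_wald, n_rows, n_cols, design_df):
    """Equation 6.14: F-transform of the Wald chi-square statistic.

    Returns (F statistic, numerator df, denominator df).
    """
    k = (n_rows - 1) * (n_cols - 1)
    df2 = design_df - k + 1
    if df2 <= 0:
        raise ValueError("design df too small for a table of this size")
    f_stat = q_wald * df2 / (k * design_df)
    return f_stat, k, df2

# Hypothetical Wald statistic for a 4 x 2 table with 42 design degrees of freedom
f_stat, df1, df2 = wald_f(q_wald=6.0, n_rows=4, n_cols=2, design_df=42)
print(f"F = {f_stat:.3f} on ({df1}, {df2}) df")
```

The multiplier (df – k + 1)/df and the denominator degrees of freedom both shrink as the table dimension k = (R – 1)(C – 1) grows relative to the design degrees of freedom, which is consistent with the loss of power for larger tables noted above.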
Example 6.8: Testing the Independence of Alcohol Dependence and
Education Level in Young Adults (Ages 18–28) Using the NCS-R Data
This example uses the Stata svy: tab command to compute the Rao–Scott
F-statistics and test the independence of two categorical variables that are
available in the NCS-R data set (for Part II respondents): ALD, an indicator of
receiving a diagnosis of alcohol dependence in the lifetime; and ED4CAT, a
categorical variable measuring educational attainment (1 = less than high school,
2 = high school, 3 = some college, 4 = college and above). The analysis is
restricted to the subpopulation of NCS-R Part II respondents 18–28 years of age.
After identifying the complex design features to Stata, we request the
cross-tabulation analysis and any related design-adjusted test statistics by using
the svy: tab command:
svyset seclustr [pweight = ncsrwtlg], strata(sestrat)
* Restrict estimation to ages 18-28; request row proportions, SEs, CIs, and design effects
svy, subpop(if age >= 18 & age < 29): tab ed4cat ald, row se ci deff
ED4CAT is specified as the row (factor) variable and ALD as the column
(response) variable. Weighted estimates of the row proportions are requested
using the row option. Table 6.6 summarizes the estimated row proportions and
standard errors for the ALD × ED4CAT crosstabulation along with the Rao–Scott
F-test of independence.

Table 6.6
Design-Based Analysis of the Association between NCS-R Alcohol Dependence and
Education Level for Young Adults Aged 18–28

Alcohol Dependence Row Proportions (Linearized SE)
Education Level (Grades)   0 = No           1 = Yes          Total
0–11                       0.909 (0.029)    0.091 (0.029)    1.000
12                         0.951 (0.014)    0.049 (0.014)    1.000
13–15                      0.951 (0.010)    0.049 (0.010)    1.000
16+                        0.931 (0.014)    0.069 (0.014)    1.000
Total                      0.940 (0.009)    0.060 (0.009)    1.000

Tests of Independence
Test             Statistic                    Reference probability                      p-value
Unadjusted X2    $X^2_{Pearson}$ = 27.21      $P(\chi^2_{(3)} > X^2_{Pearson})$          p < 0.0001
Rao–Scott F      $F_{R-S,Pearson}$ = 1.64     $P(F_{2.75,\,115.53} > F_{R-S})$           p = 0.18

Parameters of the Rao–Scott Design-Adjusted Test
$n_{18-28}$ = 1,275      Design df = 42      GDEFF = 6.62      a = 0.56
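To make the roles of GDEFF and a in Table 6.6 concrete, the following Python sketch applies Equation 6.11, the second-order divisor $(1 + a^2)$ from Theory Box 6.2, and the F-transformation of Table 6.5 to the reported values. It is only an arithmetic illustration of those formulas. Stata's reported F of 1.64 on (2.75, 115.53) degrees of freedom comes from its own Satterthwaite-based procedure (see the note to Table 6.5), so the numbers printed here are not expected to reproduce it.

```python
# Values reported in Table 6.6
x2_pearson = 27.21
gdeff = 6.62
a = 0.56             # coefficient of variation of the eigenvalues of D
design_df = 42
n_rows, n_cols = 4, 2

k = (n_rows - 1) * (n_cols - 1)

x2_rs = x2_pearson / gdeff               # first-order correction, Equation 6.11
x2_rs_second = x2_rs / (1 + a ** 2)      # second-order divisor, Theory Box 6.2
f_first = x2_rs / k                      # F-transformation, Table 6.5
df1, df2 = k, k * design_df              # reference F distribution, Table 6.5

print(f"first-order X2_RS  = {x2_rs:.2f}")
print(f"second-order X2_RS = {x2_rs_second:.2f}")
print(f"first-order F      = {f_first:.2f} on ({df1}, {df2}) df")
```

Whichever variant is used, the design adjustment shrinks the unadjusted statistic of 27.21 by a large factor, and that shrinkage is what changes the conclusion of the test.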
An estimated 9.1% of young adults in the lowest education group have been
diagnosed with alcohol dependence at some point in their lifetime (95% CI = 4.7%,
17.0%), while an estimated 6.9% of young adults in the highest education group
have been diagnosed with alcohol dependence (95% CI = 4.6%, 10.2%). By default,
Stata reports the standard uncorrected Pearson chi-square test statistic ($X^2_{Pearson}$ =
27.21, p < 0.0001) and then reports the (second-order) design-adjusted Rao–Scott
F-test statistic ($F_{R-S,Pearson}$ = 1.64, p = 0.18) (see Table 6.5). The standard Pearson $X^2$
test rejects the null hypothesis of independence at α = 0.05; however, when the
corrections for the complex sample design are introduced, the Rao–Scott design-adjusted test statistic fails to reject a null hypothesis of independence between education and a lifetime diagnosis of alcohol dependence in this younger population.
The appropriate inference in this case would thus be that there is no evidence of a
bivariate association between these two categorical factors in this subpopulation.
Multivariate analyses examining additional potential predictors of alcohol dependence could certainly be conducted at this point (see Chapter 8 for examples).
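The confidence limits quoted in this example are not symmetric about the point estimates. The Python sketch below shows one construction that produces intervals of this kind: the interval is built on the logit scale and transformed back, which is how Stata documents the ci option of svy: tabulate. The helper name `logit_ci` and the approximate t critical value of 2.018 (for 42 design degrees of freedom) are assumptions introduced here, and because the inputs are the rounded values from Table 6.6, the endpoints match the reported limits only approximately.

```python
import math

def logit_ci(p, se, t_crit):
    """Confidence interval for a proportion, built on the logit scale."""
    logit = math.log(p / (1 - p))
    half_width = t_crit * se / (p * (1 - p))   # delta-method SE of logit(p), times t
    lower = 1 / (1 + math.exp(-(logit - half_width)))
    upper = 1 / (1 + math.exp(-(logit + half_width)))
    return lower, upper

t_crit = 2.018  # assumed: approximate t(0.975) with 42 design degrees of freedom

# Row proportions and SEs for ALD = 1 in the lowest and highest education groups
for label, p, se in [("0-11 grades", 0.091, 0.029), ("16+ grades", 0.069, 0.014)]:
    lo, hi = logit_ci(p, se, t_crit)
    print(f"{label}: ({lo:.3f}, {hi:.3f})")
```

Because the back-transformation is nonlinear, the upper limit sits farther from the estimate than the lower limit when the proportion is small, and the endpoints always stay between 0 and 1.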