
Example 6.6: Estimation of Total and Row Proportions for the Cross-Tabulation of Gender and Lifetime Major Depression Status Using the NCS-R Data


Applied Survey Data Analysis

Table 6.4
Estimated Proportions of U.S. Adults by Gender and Lifetime Major Depression Status

Description        Parameter   Estimated     Linearized   95% CI            Design
                               Proportion    SE                             Effect

Total Proportions
Male, no MDE       π_A0        0.406         0.007        (0.393, 0.421)    1.87
Male, MDE          π_A1        0.072         0.003        (0.066, 0.080)    1.64
Female, no MDE     π_B0        0.402         0.005        (0.391, 0.413)    1.11
Female, MDE        π_B1        0.120         0.003        (0.114, 0.126)    0.81

Row Proportions
No MDE | Male      π_0|A       0.849         0.008        (0.833, 0.864)    2.08
MDE | Male         π_1|A       0.151         0.008        (0.136, 0.167)    2.08
No MDE | Female    π_0|B       0.770         0.006        (0.759, 0.782)    0.87
MDE | Female       π_1|B       0.230         0.006        (0.218, 0.241)    0.87

Source: Analysis based on NCS-R data.

variance–covariance matrix. Internally, Stata labels the estimated row proportions for MDE = 1 as _prop_2. The lincom command is then executed to estimate the contrast of the male and female proportions and the standard error of this difference. The relevant Stata commands are as follows:

svyset seclustr [pweight=ncsrwtsh], strata(sestrat) ///

vce(linearized) singleunit(missing)

svy: proportion mde, over(sex)

lincom [_prop_2]Male - [_prop_2]Female

The output of the lincom command provides the following estimate of the male–female difference, its standard error, and a 95% CI for the contrast of proportions:

Δ̂ = p_male − p_female = −0.079,   se(Δ̂) = 0.010,   CI_0.95(Δ) = (−0.098, −0.059)

Because the design-based 95% CI for the difference in proportions does not

include 0, the data suggest that the rate of lifetime major depressive episodes for

women is significantly higher than that for men.
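The arithmetic behind an interval of this kind can be sketched in a few lines. The Python fragment below is illustrative only: it forms a normal-approximation CI from the estimated row proportions in Table 6.4 and the linearized SE of the difference reported above. Stata's lincom uses a t quantile with the design degrees of freedom, so its bounds differ slightly in the last digit.

```python
from statistics import NormalDist

def diff_ci(p1, p2, se_diff, level=0.95):
    """Normal-approximation CI for a difference of two proportions.

    se_diff is the design-based (linearized) SE of the difference; a t
    quantile with the design degrees of freedom would be slightly wider.
    """
    d = p1 - p2
    z = NormalDist().inv_cdf(0.5 + level / 2)  # 1.96 for a 95% interval
    return d, (d - z * se_diff, d + z * se_diff)

# Estimated MDE row proportions for males and females (Table 6.4) and
# the linearized SE of the difference from the lincom output
d, (lo, hi) = diff_ci(0.151, 0.230, 0.010)
print(round(d, 3), round(lo, 3), round(hi, 3))   # -0.079 -0.099 -0.059
```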

6.4.4 Chi-Square Tests of Independence of Rows and Columns

For a 2 × 2 table, the contrast of estimated subpopulation proportions examined in Example 6.7 is equivalent to a test of whether the response variable

(MDE) is independent of the factor variable (SEX). More generally, under

SRS, two categorical variables are independent of each other if the following

© 2010 by Taylor and Francis Group, LLC


Categorical Data Analysis

relationship for the expected proportion in row r, column c of the cross-tabulation holds:

π̂_rc = expected proportion in row r, column c = (n_r+ / n_++) · (n_+c / n_++) = p_r+ · p_+c   (6.6)

Under SRS, formal testing of the hypothesis that two categorical variables are independent can be conducted by comparing the expected cell proportion under the independence assumption, π̂_rc, to the observed proportion from the survey data, p_rc. Intuitively, if the differences (p_rc − π̂_rc) are large, there is evidence that the independence assumption does not hold and that there is an association between the row and column variables. In standard practice, two statistics are commonly used to test the hypothesis of independence in two-way tables:

Pearson's chi-square test statistic:

X²_Pearson = n_++ · Σ_r Σ_c (p_rc − π̂_rc)² / π̂_rc   (6.7)

Likelihood ratio test statistic:

G² = 2 · n_++ · Σ_r Σ_c p_rc · ln(p_rc / π̂_rc)   (6.8)

Under the null hypothesis of independence for rows and columns of a two-way table, these test statistics both follow a central χ² distribution with (R − 1) × (C − 1) degrees of freedom.
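As a minimal sketch of these two statistics, the fragment below computes X² and G² for a hypothetical 2 × 2 table of SRS counts (the counts are invented for illustration; they are not from the NCS-R data):

```python
import math

def srs_chisq(counts):
    """Pearson X^2 (Eq. 6.7) and likelihood-ratio G^2 (Eq. 6.8) for a
    two-way table of SRS counts, with pi_hat_rc = p_r+ * p_+c under
    the independence hypothesis (Eq. 6.6)."""
    n = sum(sum(row) for row in counts)               # n_++
    p = [[c / n for c in row] for row in counts]      # observed p_rc
    prow = [sum(row) for row in p]                    # p_r+
    pcol = [sum(col) for col in zip(*p)]              # p_+c
    x2 = g2 = 0.0
    for r, row in enumerate(p):
        for c, prc in enumerate(row):
            pi = prow[r] * pcol[c]                    # expected proportion
            x2 += n * (prc - pi) ** 2 / pi
            if prc > 0:
                g2 += 2 * n * prc * math.log(prc / pi)
    return x2, g2

# Hypothetical 2 x 2 table of counts
x2, g2 = srs_chisq([[30, 10], [20, 40]])
print(round(x2, 2), round(g2, 2))   # 16.67 17.26
```

Both values would be referred to a χ² distribution with (2 − 1) × (2 − 1) = 1 degree of freedom.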

Ignoring the complex sample design that underlies most survey data sets can introduce bias in the estimated values of these test statistics. To correct the bias in the estimates of the population proportions used to construct the test statistics, weighted estimates of the cell, row, and column proportions are substituted, for example, p_rc = N̂_rc / N̂_++. To correct for the design effects on the sampling variances of these proportions, two general approaches have been introduced in the statistical literature. Both approaches involve scaling the standard X²_Pearson and G² test statistics by dividing them by an estimate of a generalized design effect factor (GDEFF). Theory Box 6.2 provides a mathematical explanation of how the generalized design effect adjustments are computed.

Fellegi (1980) was the first to propose such a correction based on a generalized design effect. Rao and Scott (1984) and later Thomas and Rao (1987)

extended the theory of generalized design effect corrections for these test

statistics. The Rao–Scott method requires the computation of generalized


Theory Box 6.2: First- and Second-Order Design Effect Corrections

The Fellegi (1980) method for a generalized design effect correction

to the chi-square test statistic is best summarized as a three-step process. First, the average of the design effects for the R × C (unweighted)

proportions involved in the computation of the chi-square statistic is

computed. The standard Pearson or likelihood ratio chi-square statistic computed under simple random sampling is then divided by the

average design effect. The resulting adjusted chi-square test statistic is

referred to a χ2 distribution with degrees of freedom equal to (R – 1) ×

(C – 1) to test the null hypothesis of independence.

Rao and Scott (1984) built on this method by advocating the use of

weighted estimates of the proportions in the construction of the standard chi-square statistics. Under the Rao–Scott method, the generalized design effect is defined as the mean of the eigenvalues of the

following matrix, D:

D = V_Design · V_SRS⁻¹   (6.9)

In Equation 6.9, V_Design is the matrix of design-based (e.g., linearized) variances and covariances for the R × C vector of estimated proportions used to construct the chi-square test statistic, and V_SRS is the matrix of variances and covariances for the estimated proportions given a simple random sample of the same size. The Rao–Scott generalized design effect factor, GDEFF, for two-way tables can then be written as follows:

GDEFF = [ Σ_r Σ_c (1 − p_rc) · d²(p_rc) − Σ_r (1 − p_r+) · d²(p_r+) − Σ_c (1 − p_+c) · d²(p_+c) ] / [(R − 1)(C − 1)]   (6.10)

where d²(p) denotes the estimated design effect for the proportion p.

The design-adjusted test statistics introduced in Equation 6.11 are computed based on first-order design corrections of this type.

Thomas and Rao (1987) derived second-order design corrections to the test statistics, which incorporate variability in the eigenvalues of the D matrix. These second-order design corrections can be implemented by dividing the adjusted test statistic based on the first-order correction (e.g., X²_R−S = X²_Pearson / GDEFF) by the quantity (1 + a²), where a represents the coefficient of variation of the eigenvalues of the D matrix. The F-transformed version of this second-order design-corrected version of the Pearson chi-square statistic is currently the default test statistic reported by Stata's svy: tab command for analyses of two-way tables.


Thomas and Rao (1987) used simulations to show that this second-order correction controls Type I error rates much better when there is substantial variance in the eigenvalues of D.
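The eigenvalue computation this box describes can be sketched with numpy. The covariance matrices below are hypothetical, chosen so that every design-based variance is exactly double its SRS counterpart; every eigenvalue of D is then 2, so GDEFF = 2 and a = 0 by construction:

```python
import numpy as np

def rao_scott_adjust(x2, v_design, v_srs):
    """First- and second-order Rao-Scott adjustments of a chi-square
    statistic.  GDEFF is the mean eigenvalue of D = V_design V_srs^{-1}
    (Eq. 6.9); the second-order step divides by (1 + a^2), where a is
    the coefficient of variation of the eigenvalues of D."""
    d = v_design @ np.linalg.inv(v_srs)
    eig = np.linalg.eigvals(d).real
    gdeff = eig.mean()
    a = eig.std() / eig.mean()            # CV of the eigenvalues
    x2_first = x2 / gdeff                 # first-order correction
    x2_second = x2_first / (1 + a**2)     # second-order correction
    return gdeff, a, x2_first, x2_second

# Hypothetical variance matrices: design-based variances are double SRS
v_srs = np.diag([0.010, 0.020, 0.015])
gdeff, a, x1, x2s = rao_scott_adjust(12.0, 2 * v_srs, v_srs)
print(gdeff, a, x1, x2s)   # 2.0 0.0 6.0 6.0
```

With variable eigenvalues, a > 0 and the second-order statistic would be smaller than the first-order one.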

design effects that are analytically more complicated than the Fellegi

approach. The Rao–Scott procedures are now the standard in procedures for

the analysis of categorical survey data in software systems such as Stata and

SAS. The design-adjusted Rao–Scott Pearson and likelihood ratio chi-square

test statistics are computed as follows:

X²_R−S = X²_Pearson / GDEFF,   G²_R−S = G² / GDEFF   (6.11)

Under the null hypothesis of independence of rows and columns, both of these adjusted test statistics can be referred to a χ² distribution with (R − 1) × (C − 1) degrees of freedom. Thomas and Rao (1987) showed that a transformation of the design-adjusted X²_R−S and G²_R−S values produced a more stable test statistic that under the null hypothesis closely approximated an F distribution. Table 6.5 defines the F-transformed version of these two chi-square test statistics and the corresponding F reference distribution to be used in testing the independence of rows and columns.

A third form of the chi-square test statistic that may be used to test the

null hypothesis of independence of rows and columns in a cross-tabulation

of two categorical variables is the Wald chi-square test statistic (see Theory Box 6.3). We will see in later chapters that Wald statistics play an important

role in hypothesis testing for linear and generalized linear models. However,

simulation studies have shown that the standard Pearson chi-square test statistic and its design-adjusted forms proposed by Rao and Scott (1984) and

Rao and Thomas (1988) perform best for both sparse and nonsparse tables

Table 6.5
F-Transformations of the Rao–Scott Chi-Square Test Statistics

F-Transformed Test Statistic                   F Reference Distribution under H0
F_R−S,Pearson = X²_R−S / [(R − 1)(C − 1)]      F_(R−1)(C−1), (R−1)(C−1)·df
F_R−S,LRT = G²_R−S / [(R − 1)(C − 1)]          F_(R−1)(C−1), (R−1)(C−1)·df

where R is the number of rows and C is the number of columns in the crosstab, and df is the design degrees of freedom.

Note: Stata employs a special procedure involving a Satterthwaite correction in deriving these F statistics. This can result in non-integer degrees of freedom (StataCorp, 2008). See Table 6.6.


Theory Box 6.3: The Wald Chi-Square Test of Independence for Categorical Variables

The Wald chi-square test statistic for the null hypothesis of independence of rows and columns in a two-way table is defined as follows:

Q_Wald = Ŷ′ (H V̂(N̂) H′)⁻¹ Ŷ,   (6.12)

where

Ŷ = (N̂ − E)   (6.13)

is a vector of R × C differences between the observed and expected cell counts, for example, N̂_rc − E_rc, where under the independence hypothesis E_rc = N̂_r+ · N̂_+c / N̂_++. The matrix term H V̂(N̂) H′ represents the estimated variance–covariance matrix for the vector of differences. In the case of a complex sample design, the variance–covariance matrix of the weighted frequency counts, V̂(N̂), is estimated using a TSL, BRR, or JRR approach that captures the effects of stratification, clustering, and weighting.

Under the null hypothesis of independence, Q_Wald follows a χ² distribution with (R − 1) × (C − 1) degrees of freedom. An F-transform of the Wald chi-square test statistic reported in SUDAAN and other software programs is

F_Wald = Q_Wald × [df − (R − 1)(C − 1) + 1] / [(R − 1)(C − 1) · df]  ∼  F_(R−1)(C−1), df−(R−1)(C−1)+1 under H0.   (6.14)
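As a sketch of the quadratic form in Equation 6.12, once the vector of non-redundant observed-minus-expected counts and its design-based covariance matrix are in hand, Q_Wald is a single matrix computation. The numbers below are hypothetical; in a 2 × 2 table there is one non-redundant difference, since the other three are determined by the margins:

```python
import numpy as np

def wald_q(y, v_y):
    """Wald statistic Q = Y' V^{-1} Y (Eq. 6.12), where Y holds the
    non-redundant observed-minus-expected weighted counts and V is
    their design-based variance-covariance matrix (H Vhat(Nhat) H')."""
    y = np.asarray(y, dtype=float)
    v = np.asarray(v_y, dtype=float)
    # solve(V, y) avoids forming the explicit inverse of V
    return float(y @ np.linalg.solve(v, y))

# Hypothetical single difference Nhat_11 - E_11 (in millions, say)
# with a hypothetical design-based variance of that difference
q = wald_q([1.5], [[0.25]])
print(q)   # 9.0
```

Here Q would be referred to a χ² distribution with (R − 1)(C − 1) = 1 degree of freedom, or F-transformed as in Equation 6.14.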

(Sribney, 1998) and that these tests are more powerful than the Wald test

statistic, especially for larger tables. As a result, Stata makes the F_R−S,Pearson test

statistic (with a second-order design correction incorporated) the default test

statistic reported by its svy: tab procedure, and this is also the default test

statistic reported by SAS PROC SURVEYFREQ (with a first-order design correction incorporated).

Example 6.8: Testing the Independence of Alcohol Dependence and Education Level in Young Adults (Ages 18–28) Using the NCS-R Data

This example uses the Stata svy: tab command to compute the Rao–Scott

F-statistics and test the independence of two categorical variables that are available in the NCS-R data set (for Part II respondents): ALD, an indicator of receiving

a diagnosis of alcohol dependence in the lifetime; and ED4CAT, a categorical

variable measuring educational attainment (1 = less than high school, 2 = high


Table 6.6
Design-Based Analysis of the Association between NCS-R Alcohol Dependence and Education Level for Young Adults Aged 18–28 (n_18–28 = 1,275)

                           Alcohol Dependence Row Proportions (Linearized SE)
Education Level (Grades)   0 = No          1 = Yes         Total
0–11                       0.909 (0.029)   0.091 (0.029)   1.000
12                         0.951 (0.014)   0.049 (0.014)   1.000
13–15                      0.951 (0.010)   0.049 (0.010)   1.000
16+                        0.931 (0.014)   0.069 (0.014)   1.000
Total                      0.940 (0.009)   0.060 (0.009)   1.000

Tests of Independence
Unadjusted X²:   X²_Pearson = 27.21;  P(χ²_3 > X²_Pearson): p < 0.0001
Rao–Scott F:     F_R−S,Pearson = 1.64;  P(F_2.75, 115.53 > F_R−S): p = 0.18

Parameters of the Rao–Scott design-adjusted test: design df = 42; GDEFF = 6.62; a = 0.56

school, 3 = some college, 4 = college and above). The analysis is restricted to the

subpopulation of NCS-R Part II respondents 18–28 years of age. After identifying

the complex design features to Stata, we request the cross-tabulation analysis and

any related design-adjusted test statistics by using the svy: tab command:

svyset seclustr [pweight = ncsrwtlg], strata(sestrat)

svy, subpop(if 18<=age<29): tab ed4cat ald, row se ci deff

ED4CAT is specified as the row (factor) variable and ALD as the column

(response) variable. Weighted estimates of the row proportions are requested

using the row option. Table 6.6 summarizes the estimated row proportions and standard errors for the ALD × ED4CAT cross-tabulation along with the Rao–Scott F-test of independence.

An estimated 9.1% of young adults in the lowest education group have been

diagnosed with alcohol dependence at some point in their lifetime (95% CI = 4.7%,

17.0%), while an estimated 6.9% of young adults in the highest education group

have been diagnosed with alcohol dependence (95% CI = 4.6%, 10.2%). By default, Stata reports the standard uncorrected Pearson chi-square test statistic (X²_Pearson = 27.21, p < 0.0001) and then the (second-order) design-adjusted Rao–Scott F-test statistic (F_R−S,Pearson = 1.64, p = 0.18) (see Table 6.5). The standard Pearson X² test rejects the null hypothesis of independence at α = 0.05; however, when the corrections for the complex sample design are introduced, the Rao–Scott design-adjusted test statistic fails to reject a null hypothesis of independence between education and a lifetime diagnosis of alcohol dependence in this younger population.

The appropriate inference in this case is thus that there is no evidence of a bivariate association between these two categorical factors in this subpopulation. Multivariate analyses examining additional potential predictors of alcohol dependence could certainly be pursued at this point (see Chapter 8 for examples).

