# 3…SPSS Procedure for Hierarchical Cluster Analysis

- Select all seven variables and move them to the Variables box.

- Click Statistics, select Agglomeration schedule and Proximity matrix, then click Continue.

- Click Plots, select Dendrogram, All clusters and, under Orientation, Vertical, then click Continue.

- Click Method, select Ward's method as the cluster method and Squared Euclidean distance as the measure, then click Continue.

Click OK to obtain the results of the hierarchical cluster analysis.
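To make the agglomeration schedule requested above more concrete, here is a minimal Python sketch of Ward's method on squared Euclidean distance. It is not SPSS's implementation. The function name `ward_agglomeration`, the five two-variable cases and the reported cost (the increase in within-cluster sum of squares at each merge) are illustrative assumptions, and SPSS may report its coefficients on a different scale.

```python
from itertools import combinations


def ward_agglomeration(points):
    """Return a merge schedule for Ward's method.

    Each entry is (stage, cluster_a, cluster_b, cost), where cost is the
    increase in the within-cluster sum of squares caused by the merge.
    """
    # Every case starts as its own cluster: id -> (size, centroid).
    clusters = {i: (1, list(p)) for i, p in enumerate(points)}
    schedule = []
    stage = 1
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            (na, ca), (nb, cb) = clusters[a], clusters[b]
            # Squared Euclidean distance between the two centroids.
            sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
            cost = na * nb / (na + nb) * sq_dist
            if best is None or cost < best[0]:
                best = (cost, a, b)
        cost, a, b = best
        (na, ca), (nb, cb) = clusters[a], clusters[b]
        # The merged cluster keeps the lower id; its centroid is the
        # size-weighted mean of the two old centroids.
        merged = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[a] = (na + nb, merged)
        del clusters[b]
        schedule.append((stage, a, b, cost))
        stage += 1
    return schedule


# Hypothetical data: five cases measured on two variables.
cases = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 6.0), (9.0, 1.0)]
schedule = ward_agglomeration(cases)
for stage, a, b, cost in schedule:
    print(f"stage {stage}: merge cluster {a} with cluster {b} (cost {cost:.3f})")
```

A sharp jump in the cost from one stage to the next suggests that two dissimilar clusters were joined, which is the same pattern one looks for when reading a dendrogram.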

10.4 Questions
1. Segmentation studies in marketing will employ which of the following statistical techniques?
   a. Regression
   b. Discriminant analysis
   c. Cluster analysis
   d. Conjoint analysis
   e. T-tests

2. The logic of ________ is to group individuals or objects by their similarity or distance from each other.
   a. Cluster analysis
   b. Factor analysis
   c. Discriminant analysis
   d. Regression analysis
   e. Bivariate analysis

3. Which method of analysis does not classify variables as dependent or independent?
   a. Multiple regression analysis
   b. Discriminant analysis
   c. Analysis of variance
   d. Cluster analysis
   e. Simple regression analysis

4. A _________ or tree graph is a graphical device for displaying clustering results. Vertical lines represent clusters that are joined together. The position of the line on the scale indicates the distances at which clusters were joined.
   a. Dendrogram
   b. Scattergram
   c. Scree plot
   d. Icicle diagram
   e. Histogram

Chapter 11

Binary Logistic Regression

11.1 Chapter Overview
This chapter discusses binary logistic regression, a methodology broadly analogous to the linear regression discussed earlier. In a binary logistic regression, a single categorical dependent variable with two categories is predicted from one or more independent variables, which may be metric or non-metric. This chapter also explains what the logistic regression model tells us, namely the interpretation of regression coefficients and odds ratios, using IBM SPSS 20.0. The example detailed in this chapter involves one metric and four non-metric independent variables.

11.2 Logistic Regression
Logistic regression is considered a specialized form of regression. In a regression equation, the researcher explains the relationship between one metric dependent variable and one or more metric independent variables. In logistic regression, we predict a categorical (non-metric) dependent variable from one or more categorical (non-metric) or non-categorical (metric) independent variables. The coefficients derived from the two equations are broadly similar, in that they explain the relative impact of each predictor variable on the dependent variable. In discriminant analysis, the non-metric dependent variable is predicted from metric independent variables, and members or objects are assigned to groups based on their discriminant Z scores. This requires calculating cutting scores, which are then used to assign the observations to each group. The major differences between these three methods are shown in Table 11.1.

S. Sreejesh et al., Business Research Methods,
DOI: 10.1007/978-3-319-00539-3_11,
© Springer International Publishing Switzerland 2014

Table 11.1 Differences among three methods

| Major differences | Regression analysis | Discriminant analysis | Logistic regression |
|---|---|---|---|
| Dependent variable | Metric (non-categorical) | Non-metric (categorical) | Non-metric (categorical) |
| Independent variable | Metric (non-categorical) | Metric (non-categorical) | Metric or non-metric |
| Assumptions | All major assumptions like linearity, normality, equality of variance and no multicollinearity | Normality, linearity, equality of variance and covariance | Not based on these strict assumptions except multicollinearity. Robust even if these assumptions are not met |

11.3 Logistic Curve Versus Regression Line

Fig. 11.1 Logistic regression curve (vertical axis: probability of dependent variable; horizontal axis: level of independent variable)

Figure 11.1 shows the logistic regression curve, which represents the relationship between the dependent and independent variables. Logistic regression uses a binary dependent variable that takes only the values 0 and 1, together with metric or non-metric independent variables, and predicts the probability (ranging from 0 to 1) of the dependent variable based on the level of the independent variable. At very low levels of the independent variable, this probability approaches zero but never reaches it. In a similar fashion, as the independent variable increases, the probability of the dependent variable also increases and approaches one but never exceeds it.

Fig. 11.2 Linear regression curve (vertical axis: dependent variable; horizontal axis: independent variable)

The linear regression line in Fig. 11.2 shows a one-to-one (linear) relationship between a metric dependent and a metric independent variable, so it cannot explain the non-linear relationship that arises with a 0 or 1 outcome. At the same time, a binary dependent variable follows a binomial distribution instead of a normal distribution, which invalidates all statistical testing based on the assumption of normality. A dichotomous (categorical) dependent variable also creates instances of heteroscedasticity, or unequal variances. In linear regression, the predicted value cannot be restricted to the range 0–1, as it is in logistic regression.
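The contrast between the two curves can be sketched numerically. The snippet below is only an illustration: the coefficient values and the function names `logistic_probability` and `linear_prediction` are hypothetical and do not come from the chapter's example.

```python
import math


def logistic_probability(x, b0=0.0, b1=1.0):
    """Predicted probability from a logistic curve (hypothetical coefficients)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def linear_prediction(x, a=0.5, b=0.25):
    """Predicted value from a straight line (hypothetical coefficients)."""
    return a + b * x


# Levels of the independent variable, from very low to very high.
levels = [-6, -3, 0, 3, 6]
logistic_values = [logistic_probability(x) for x in levels]
linear_values = [linear_prediction(x) for x in levels]

for x, p, y in zip(levels, logistic_values, linear_values):
    print(f"x = {x:>2}: logistic = {p:.4f}, linear = {y:.2f}")
```

Over this range the logistic values stay strictly between 0 and 1, while the straight line falls below 0 at the low end and rises above 1 at the high end.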
Logistic regression achieves this restriction (0–1) in two steps. In the first step, it transforms the probability of the event into an odds value. The odds are the ratio of the probabilities of the two outcomes or events, $\frac{\text{Prob}_i}{1-\text{Prob}_i}$. Representing the probability of a 1 as $p$, so that the probability of a 0 is $(1-p)$, the odds are simply $p/(1-p)$. For example, if the purchase probability of product A is 0.50, then the odds for product A are the ratio of the probabilities of the two outcomes (purchase versus non-purchase): $0.50/(1-0.50) = 1$. In this way, the probability is stated in metric form. Any odds value can be converted back into a probability that falls between 0 and 1.
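The conversion in both directions can be written as a short worked derivation. The back-conversion formula is not stated in the text; it follows from solving the odds definition for $p$.

```latex
\text{odds} = \frac{p}{1-p}
\quad\Longleftrightarrow\quad
p = \frac{\text{odds}}{1+\text{odds}}

% Worked example with the purchase probability used above:
p = 0.50 \;\Rightarrow\; \text{odds} = \frac{0.50}{1-0.50} = 1,
\qquad
\text{odds} = 1 \;\Rightarrow\; p = \frac{1}{1+1} = 0.50
```

Because odds are never negative, the recovered $p$ always lies between 0 and 1.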
In the second step, the odds value is converted into a logit value, which is calculated by taking the logarithm of the odds. This step is needed because odds have a lower limit of zero; taking the logarithm removes that limit, so the logit value can be negative as well as positive. Table 11.2 shows the relationship between probability, odds value and logit value.
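The two-step transformation can be checked with a few lines of Python. This is a sketch under one assumption: the logarithm is taken to be the natural logarithm, which is consistent with the values in Table 11.2. The function names are illustrative.

```python
import math


def odds(p):
    """Odds of an event with probability p (undefined when p equals 1)."""
    return p / (1 - p)


def logit(p):
    """Natural log of the odds (undefined when p equals 0 or 1)."""
    return math.log(odds(p))


def probability_from_logit(z):
    """Convert a logit value back into a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))


# Interior probabilities only; 0 and 1 have no finite logit.
probabilities = [0.10, 0.30, 0.50, 0.70, 0.90]
rows = [(p, odds(p), logit(p)) for p in probabilities]
for p, o, z in rows:
    print(f"{p:.2f}  {o:6.3f}  {z:7.3f}")
```

Probabilities of exactly 0 and 1 are left out because the logarithm of zero and division by zero are undefined, which is why those cells of the table cannot be calculated.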
Table 11.2 Relationship between probability, odds and logit value

| Probability | Odds | Log odds (logit value) |
|---|---|---|
| 0.00 | 0.00 | Cannot be calculated |
| 0.10 | 0.111 | -2.197 |
| 0.30 | 0.429 | -0.847 |
| 0.50 | 1.00 | 0.000 |
| 0.70 | 2.333 | 0.847 |
| 0.90 | 9.00 | 2.197 |
| 1.00 | Cannot be calculated | Cannot be calculated |

An odds value less than 1 has a negative logit value, and an odds value greater than 1 has a positive logit value. Once we get the logit value from the odds value, we