10.3 SPSS Procedure for Hierarchical Cluster Analysis
→ Select all seven variables and place them in the Variables box
→ Statistics → Click on Agglomeration Schedule and Proximity Matrix, then click Continue
→ Click on Plots → Select Dendrogram, All Clusters and, under Orientation, Vertical, then click Continue
→ Cluster Method → Select Ward's Method and Squared Euclidean Distance, then click Continue
Click OK to get the results of the hierarchical cluster analysis.
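For readers who want to see what the Ward's method setting does, the following is a minimal sketch in plain Python rather than SPSS. It assumes a small hypothetical data set (five cases, two variables, not the seven variables of the chapter example) and hypothetical function names. At each step it merges the two clusters whose union gives the smallest increase in the within-cluster sum of squares, and it records a merge list that plays the role of an agglomeration schedule. It is not meant to reproduce SPSS output exactly.

```python
from itertools import combinations

def sq_euclidean(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Mean of each variable over the points in a cluster."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def ward_schedule(data):
    """Return a list of (cluster_a, cluster_b, cumulative_within_ss) merges."""
    clusters = {i: [row] for i, row in enumerate(data)}  # each case starts alone
    schedule = []
    total_ss = 0.0
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            na, nb = len(clusters[a]), len(clusters[b])
            # Increase in within-cluster sum of squares if a and b are merged
            increase = na * nb / (na + nb) * sq_euclidean(
                centroid(clusters[a]), centroid(clusters[b]))
            if best is None or increase < best[0]:
                best = (increase, a, b)
        increase, a, b = best
        total_ss += increase
        clusters[a] = clusters[a] + clusters.pop(b)  # merged cluster keeps label a
        schedule.append((a, b, round(total_ss, 4)))
    return schedule

# Hypothetical data: five cases measured on two variables
data = [[1.0, 1.0], [1.0, 2.0], [5.0, 5.0], [7.0, 5.0], [9.0, 9.0]]
schedule = ward_schedule(data)
for step, (a, b, coefficient) in enumerate(schedule, start=1):
    print(f"Stage {step}: merge cluster {a} with cluster {b}, coefficient {coefficient}")
```

A large jump in the coefficient between two consecutive stages suggests that two dissimilar clusters were merged, which is the usual cue for choosing the number of clusters.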
10.4 Questions
1. Segmentation studies in Marketing will employ which of the following statistical techniques?
a. Regression
b. Discriminant analysis
c. Cluster analysis
d. Conjoint analysis
e. T-tests
2. The logic of ________ is to group individuals or objects by their similarity or distance from each other.
a. Cluster analysis
b. Factor analysis
c. Discriminant analysis
d. Regression analysis
e. Bivariate analysis
3. Which method of analysis does not classify variables as dependent or independent?
a. Multiple regression analysis
b. Discriminant analysis
c. Analysis of variance
d. Cluster analysis
e. Simple regression analysis
4. A _________ or tree graph is a graphical device for displaying clustering results. Vertical lines represent clusters that are joined together. The position of the line on the scale indicates the distances at which clusters were joined.
a. Dendrogram
b. Scattergram
c. Scree plot
d. Icicle diagram
e. Histogram
Chapter 11
Binary Logistic Regression
11.1 Chapter Overview
This chapter discusses binary logistic regression, a methodology that is broadly
analogous to the linear regression discussed in an earlier chapter. In a binary
logistic regression, a single categorical dependent variable with two categories
is predicted from one or more independent variables (metric or non-metric).
This chapter also explains what the logistic regression model tells us: the
interpretation of regression coefficients and odds ratios using IBM SPSS 20.0.
The example detailed in this chapter involves one metric and four non-metric
independent variables.
11.2 Logistic Regression
Logistic regression is considered a specialized form of regression. In a
regression equation, the researcher explains the relationship between one metric
dependent variable and one or more metric independent variables. In logistic
regression, we predict a categorical (non-metric) dependent variable in terms of
one or more categorical (non-metric) or non-categorical (metric) independent
variables. The coefficients derived from the two equations are interpreted in a
broadly similar way, in that they explain the relative impact of each predictor
variable on the dependent variable. In discriminant analysis, a non-metric
dependent variable is predicted from metric independent variables, and the
members or objects are categorized into groups based on their discriminant Z
scores. This requires the calculation of cutting scores, on the basis of which
the observations are assigned to each group. The major differences between these
three methods are shown in Table 11.1.
S. Sreejesh et al., Business Research Methods,
DOI: 10.1007/978-3-319-00539-3_11,
© Springer International Publishing Switzerland 2014
Table 11.1 Differences among three methods

| Major differences | Regression analysis | Discriminant analysis | Logistic regression |
|---|---|---|---|
| Dependent variable | Metric (non-categorical) | Non-metric (categorical) | Non-metric (categorical) |
| Independent variable | Metric (non-categorical) | Metric (non-categorical) | Metric or non-metric |
| Assumptions | All major assumptions like linearity, normality, equality of variance and no multicollinearity | Normality, linearity, equality of variance and covariance | Not based on these strict assumptions except multicollinearity. Robust even if these assumptions are not met |
11.3 Logistic Curve Versus Regression Line
Fig. 11.1 Logistic regression curve (vertical axis: probability of dependent variable; horizontal axis: level of independent variable)

Figure 11.1 shows the logistic regression curve, which represents the relationship
between the dependent and independent variables. Logistic regression uses a binary
dependent variable, which takes only the values 0 and 1, and metric or non-metric
independent variables, and it predicts the probability (ranging from 0 to 1) of the
dependent variable based on the levels of the independent variable. At very low
levels of the independent variable, this probability approaches zero but never
reaches it. In a similar fashion, as the independent variable increases, the
probability of the dependent variable also increases and approaches one but never
exceeds it.

Fig. 11.2 Linear regression curve (vertical axis: dependent variable; horizontal axis: independent variable)

The linear regression line in Fig. 11.2 shows a linear relationship between a
metric dependent variable and a metric independent variable; it cannot capture the
non-linear relationship that arises when the dependent variable takes only the
values 0 and 1. At the same time, a binary dependent variable follows a binomial
distribution instead of a normal distribution, which invalidates all statistical
tests based on the assumption of normality. A dichotomous (categorical) dependent
variable also produces heteroscedasticity, or unequal variances. Finally, in linear
regression the predicted values are not restricted to the range 0–1, as they are in
logistic regression.
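The contrast between the two curves can be illustrated with a short sketch. The intercepts, slopes and levels of the independent variable below are hypothetical values chosen only for illustration; they do not come from the chapter's example. The logistic function keeps every predicted value strictly between 0 and 1, whereas the straight line does not.

```python
import math

def logistic_probability(x, intercept=0.0, slope=1.0):
    """Predicted probability from a logistic curve; always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def linear_prediction(x, intercept=0.5, slope=0.25):
    """Straight-line prediction; nothing keeps it inside the 0-1 range."""
    return intercept + slope * x

levels = [-6, -3, 0, 3, 6]  # hypothetical levels of the independent variable
for x in levels:
    print(f"x={x:>3}  logistic={logistic_probability(x):.4f}  "
          f"linear={linear_prediction(x):.2f}")
```

At the lowest and highest levels the linear prediction falls below 0 and above 1, while the logistic prediction only approaches those limits.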
Logistic regression achieves this restriction (0–1) in two steps. In the first
step, it transforms the probability of the event into an odds value. The odds are
the ratio of the probabilities of the two outcomes or events:

$$\text{Odds}_i = \frac{\text{Prob}_i}{1 - \text{Prob}_i}$$

Representing the probability of a 1 as p, so that the probability of a 0 is
(1 − p), the odds are simply p/(1 − p). For example, assume that the purchase
probability of product A is 0.50; the odds for product A are then the ratio of the
probabilities of the two outcomes (purchase versus non-purchase) =
0.50/(1 − 0.50) = 1. In this way, the probability is stated in metric form. Any
odds value can be converted back into a probability that falls between 0 and 1.
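The first step and its reversal can be sketched as follows. The function names are hypothetical; the back-conversion uses the standard identity p = odds/(1 + odds), which follows from solving odds = p/(1 − p) for p.

```python
def odds_from_probability(p):
    """Odds = p / (1 - p); undefined when p equals 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must satisfy 0 <= p < 1")
    return p / (1.0 - p)

def probability_from_odds(odds):
    """Inverse conversion: p = odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds cannot be negative")
    return odds / (1.0 + odds)

purchase_probability = 0.50  # the product A example from the text
purchase_odds = odds_from_probability(purchase_probability)
print(purchase_odds)                         # odds of purchase versus non-purchase
print(probability_from_odds(purchase_odds))  # converts back to the probability
```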
In the second step, the odds value is converted into a logit value, which is
calculated by taking the natural logarithm of the odds. Odds cannot go below
zero, which is their lower limit; taking the logarithm removes this bound, so the
logit value can be negative as well as positive. Table 11.2 shows the relationship
between probability, odds value and logit value.
An odds value less than 1 has a negative logit value, and an odds value greater
than 1 has a positive logit value. Once we get the logit value from the odds value, we
Table 11.2 Relationship between probability, odds and logit value

| Probability | Odds | Log odds (logit value) |
|---|---|---|
| 0.00 | 0.00 | Cannot be calculated |
| 0.10 | 0.111 | -2.197 |
| 0.30 | 0.428 | -0.847 |
| 0.50 | 1.00 | 0.000 |
| 0.70 | 2.333 | 0.847 |
| 0.90 | 9.00 | 2.197 |
| 1.00 | Cannot be calculated | – |
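The logit column of Table 11.2 can be checked with a few lines of code. This is a minimal sketch that assumes the natural logarithm, which is what the tabulated values imply; the function name and the use of None for the rows that cannot be calculated are illustrative choices.

```python
import math

def logit(p):
    """Natural log of the odds p / (1 - p); None where it cannot be calculated."""
    if p <= 0.0 or p >= 1.0:
        return None  # odds are 0 or undefined, so the logarithm does not exist
    return math.log(p / (1.0 - p))

probabilities = [0.00, 0.10, 0.30, 0.50, 0.70, 0.90, 1.00]
for p in probabilities:
    value = logit(p)
    shown = "cannot be calculated" if value is None else f"{value:.3f}"
    print(f"probability={p:.2f}  logit={shown}")
```

Probabilities below 0.50 give negative logit values and probabilities above 0.50 give positive ones, symmetric around zero.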