11 Mann–Whitney U/Wilcoxon Signed-Rank R Tutorial
Researchers were interested in explaining the driving factors for this biogeographic
pattern and whether it could be explained by differences in the habitat preferences of the
two fish species. One parameter that was measured was temperature, with each species
potentially exhibiting a specific thermal preference.
This example was taken from the research conducted by Dr. Pablo Weaver.
Formulate a question about the data that can be addressed by performing a
Mann–Whitney U test.
Question: Do P. hispaniolana and P. dominicensis show preferences for different
water temperatures?
Based on the question, formulate the null and alternative hypotheses that
address the question proposed.
Null Hypothesis (H0): The habitats of P. hispaniolana and P. dominicensis do not
differ in temperature.
Alternative Hypothesis (H1): There is a difference in temperature preference
between P. hispaniolana and P. dominicensis.
Now that an appropriate testable question has been developed along with a set of testable
hypotheses, you can run the statistical analysis.
This tutorial focuses on running a Mann–Whitney U and Wilcoxon signed-rank test in R.
Refer to Chapter 15 for R-specific terminology and instructions on how to invoke and
Check all assumptions prior to running the test.
Mann–Whitney U Test Numbers Tutorial
1. Create two vectors that contain the water temperatures (in degrees Celsius) sampled
from the habitat of each species of fish. Name the vectors by the species name, or
something similarly appropriate. Press enter/return after constructing each vector.
2. Type the vector names into the wilcox.test() function (Note: The Mann–Whitney U
is also called the Mann–Whitney–Wilcoxon or Wilcoxon rank sum test, which is why
we use the wilcox.test() function). Press enter/return.
Note: The wilcox.test() function can also be used to perform a Wilcoxon signed-rank
test by simply adding the argument paired = TRUE, which would look like the
following: wilcox.test(hispaniolana, dominicensis, paired = TRUE). The
output is interpreted similarly to the output for the Mann–Whitney U.
3. The following screen will appear. This is known as the output.
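As a rough illustration of steps 1 and 2, the following R sketch builds two vectors and passes them to wilcox.test(). The temperature values are invented placeholders, not the measurements from the study, so the W statistic and p-value it prints will not match the output discussed below.

```r
# Hypothetical water temperatures in degrees Celsius.
# These values are placeholders for illustration only, NOT the study's data.
hispaniolana <- c(24.1, 25.3, 23.8, 26.0, 24.7, 25.9, 23.5, 24.4)
dominicensis <- c(27.2, 28.5, 26.4, 29.1, 27.8, 28.0, 26.9, 25.6)

# Mann-Whitney U (Wilcoxon rank-sum) test for two independent samples.
result <- wilcox.test(hispaniolana, dominicensis)
print(result)

# The pieces usually reported: the W statistic and the p-value.
w_value <- unname(result$statistic)
p_value <- result$p.value

# Medians help describe the direction of any difference.
median_hispaniolana <- median(hispaniolana)
median_dominicensis <- median(dominicensis)
```

The object returned by wilcox.test() stores the statistic and p-value, so they can be pulled out by name instead of being copied from the printed output.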
In the case of the water temperature in the habitats of the two fish species, the p-value is
significant (p-value = 0.003234), indicating that we should reject the null hypothesis that
the habitats of the two species do not differ in temperature. Therefore, the two species
occupy habitats with different water temperatures. A quick glance at the data shows us that
the water temperatures are higher for P. dominicensis.
The water temperatures (degrees Celsius) differ significantly (W = 103, p-value =
0.003234) between the habitats of P. hispaniolana and P. dominicensis on the island of
Hispaniola. An examination of the data shows that P. dominicensis prefers water with a
higher temperature.
By the end of this chapter, you should be able to:
1. Understand when a Kruskal–Wallis test is applicable and why.
2. Use statistical programs to perform a Kruskal–Wallis test and determine the
significance (p-value) for your analysis.
3. Evaluate the medians or mean ranks of three or more groups or samples and
construct a logical conclusion for each dataset.
4. Use the skills generated to perform, analyze, and evaluate your own dataset from
8.1 Kruskal–Wallis Background
In the last chapter, we covered the Mann–Whitney U test and the Wilcoxon signed-rank
test, which are two-sample, nonparametric tests. Another commonly used nonparametric
test is the two- or more sample Kruskal–Wallis test. The Kruskal–Wallis is considered
the nonparametric analogue to the parametric, one-way ANOVA. The Kruskal–Wallis test
is also considered to be similar to the Mann–Whitney U test, as it performs a comparison
or analysis of independent samples.
The Kruskal–Wallis is most applicable when comparing two or more samples; the data
within each of the multiple samples do not need to follow a normal distribution. As with
one-way ANOVA, the test aims to determine if there is an overall difference among the
various sampling groups; also similar to one-way ANOVA, the Kruskal–Wallis lacks the
ability to specify where the difference lies between groups.
For example, suppose a pharmacist was testing the efficacy of three novel drugs: Drug A,
Drug B, and Drug C. After running a Kruskal–Wallis test, the results indicate that there is
a significant difference in patient drug response (p < 0.05); however, the results would
not indicate which drug was the most or least effective. Was the difference between Drug
B and Drug C? Or was the difference between Drug A and Drug C? In this case, we cannot
determine significance between each group separately, only that there is a significant
difference somewhere between drugs.
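The drug example can be sketched in R with the base function kruskal.test(). The response scores below are invented for illustration (the text gives no data for this example), and the variable names are hypothetical.

```r
# Hypothetical patient response scores for three drugs (illustrative values only).
drug_a <- c(12, 15, 11, 18, 14)
drug_b <- c(22, 25, 19, 27, 24)
drug_c <- c(16, 13, 21, 17, 20)

# Stack the scores into one vector and label each score with its drug.
response <- c(drug_a, drug_b, drug_c)
drug <- factor(rep(c("A", "B", "C"), each = 5))

# Kruskal-Wallis rank sum test: an overall (omnibus) comparison of the groups.
kw <- kruskal.test(response ~ drug)
print(kw)

# A small p-value says that at least one drug differs from the others,
# but not which one; that question is left to a post hoc comparison.
kw_significant <- kw$p.value < 0.05
```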
A graph can depict where the differences lie, as can interpretations of the medians or
mean ranks. A test such as Dunnett's C post hoc test can also be applied to determine
which of the drugs were the most/least effective. To present your results from a Kruskal–
Wallis test, you want to report the chi-square value (χ²), degrees of freedom (df), p-value,
and median or mean rank for each group.
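One possible way to collect the values listed above in R is shown in the following sketch. The scores and group labels are hypothetical, and the layout of the summary line is only a suggestion.

```r
# Hypothetical scores for three groups of five subjects (illustrative values only).
score <- c(12, 15, 11, 18, 14,  22, 25, 19, 27, 24,  16, 13, 21, 17, 20)
group <- factor(rep(c("A", "B", "C"), each = 5))

kw <- kruskal.test(score ~ group)

# Values to report: chi-square statistic, degrees of freedom, p-value.
chi_sq <- unname(kw$statistic)
dof <- as.integer(kw$parameter)
p_value <- kw$p.value

# Median and mean rank for each group.
group_medians <- tapply(score, group, median)
mean_ranks <- tapply(rank(score), group, mean)

# One-line summary in the style used for reporting results.
summary_line <- sprintf("chi-square = %.3f, df = %d, p = %.3f", chi_sq, dof, p_value)
cat(summary_line, "\n")
print(group_medians)
print(mean_ranks)
```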
8.2 Case Study 1
Snails are known to be the intermediate host for a number of parasites that later infect
mammalian hosts, including humans and domestic livestock. For example, freshwater
areas in Africa are at high risk for containing parasitic blood flukes belonging to the
genus Schistosoma. Schistosomes invade the digestive tract of freshwater snails, and
continue their life cycle until they are excreted into the water in the form of larvae. As
humans come into contact with contaminated water, they run the risk of becoming the
secondary host for schistosomes, leading to the disease schistosomiasis.
In the following hypothetical example, Dr. Perkins, a parasitologist, wants to examine the
parasite load of intermediate hosts from Lake Malawi, the southernmost East African rift
lake. She collected 50 snails from the lake and identified three species of Bulinus
(Bulinus forskalii, Bulinus beccarii, and Bulinus cernicus) from her collection. She
performed a dissection on each of the individuals and kept a tally of the number of
parasites observed in the gastrointestinal tract of each individual. Dr. Perkins proposed
the following question and hypotheses.
Question: “Is there a difference in the number of parasites in the three different
species of Bulinus?”
Null Hypothesis (H0): There is no difference in the number of parasites in the
three different species of Bulinus.
Alternative Hypothesis (H1): There is a difference in the number of parasites in the
three different species of Bulinus.
Reported parasite load (number of parasites) within individuals of the three species of
snails, B. forskalii, B. beccarii, and B. cernicus can be observed in Table 8.1. Given the
experimental data, we can conclude that the median and mean values are not equal for
each of the three groups and that a series of outliers are present within the dataset. From
these observations, we can determine that a Kruskal–Wallis test is best suited for this
dataset.
Table 8.1 Kruskal–Wallis experimental data showing the number of parasites within
individual snails from one of the three species of snails, Bulinus forskalii, Bulinus
beccarii, and Bulinus cernicus. The median and mean are calculated and provided at the
bottom.
B. forskalii (n = 18) B. beccarii (n = 13) B. cernicus (n = 19)
Figure 8.1 displays the SPSS output for the experimental results.
Figure 8.1 Kruskal–Wallis SPSS output.
Figure 8.2 Bar graph illustrating the median number of parasites observed among the
three snail species, Bulinus forskalii, Bulinus beccarii, and Bulinus cernicus.
Because it is a nonparametric test, we display results from a Kruskal–Wallis test using
the medians, see Figure 8.2.
After reviewing the experimental results, Dr. Perkins was able to determine that there
was a significant difference in the number of parasites among the three different species
of Bulinus snails (χ² = 9.392, df = 2, p = 0.009). The reported medians for each
independent group are as follows: Species 1 (B. forskalii, n1 = 18) with a median of 4.5;
Species 2 (B. beccarii, n2 = 13) with a median of 3; and Species 3 (B. cernicus, n3 = 19) with
a median of 2. The Kruskal–Wallis test does not allow for a direct comparison between
pairs of species. A post hoc analysis, such as Dunnett's C, will need to be run in order to
determine the specific pairwise differences.
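As a sketch of what a pairwise follow-up might look like in R, the example below uses pairwise.wilcox.test() from the base stats package with a Bonferroni adjustment. This is a different procedure from Dunnett's C and is swapped in here only because it ships with R. The parasite counts are hypothetical placeholders, not the values from Table 8.1.

```r
# Hypothetical parasite counts per snail (illustrative only, NOT Table 8.1).
parasites <- c(5, 7, 4, 6, 8, 3,   3, 2, 4, 3, 5, 2,   1, 2, 0, 2, 1, 3)
species <- factor(rep(c("forskalii", "beccarii", "cernicus"), each = 6))

# Pairwise Wilcoxon rank-sum tests with Bonferroni-adjusted p-values.
# exact = FALSE requests the normal approximation, because count data
# contain ties and exact p-values cannot be computed with ties.
posthoc <- pairwise.wilcox.test(parasites, species,
                                p.adjust.method = "bonferroni",
                                exact = FALSE)
print(posthoc)

# Matrix of adjusted p-values, one entry per pair of species.
adjusted_p <- posthoc$p.value
```

Each entry of the printed matrix is the adjusted p-value for one pair of species, so pairs with small values are the ones that appear to differ.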
Nuts and Bolts
The Kruskal–Wallis test is most applicable when certain conditions and assumptions are
met:
1. Data type – The dependent variable is ordinal or interval/ratio in nature. Similar to
the Mann–Whitney U test, the Kruskal–Wallis test must report the dependent
variable in terms of a ranking or continuous data, as in IQ or exam scores. The
independent variable should be categorical with two or more groups.
2. Distribution of data – The Kruskal–Wallis is a nonparametric statistical test;
therefore, the data do not need to follow a normal distribution. As we discussed in Chapter 4, the
evaluation of medians allows us to assess the overall data, despite the presence of
outliers skewing the data. The distribution of the groups is important. If the shape of
the distribution is the same across the groups, then the medians can be used in
interpretations. If the shape of the distribution is different, then the mean ranks are
used instead.
3. Sampling groups and observations – The Kruskal–Wallis test is commonly used
when you have two or more sampling groups, unlike the Mann–Whitney U test which
has only two groups. In addition, the Kruskal–Wallis test assumes that the
observations are independent and gathered from different individuals or subjects; one
individual cannot belong to more than one group. In the case where one individual
contributed to more than one observation, the observations are not considered to be
independent.
4. Random sampling – The observations were collected at random.
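To make the idea of mean ranks concrete, the sketch below ranks all observations together, computes the mean rank of each group, and builds the Kruskal–Wallis statistic by hand from the standard formula H = 12 / (N(N + 1)) × Σ(R_i² / n_i) − 3(N + 1), where R_i is the rank sum of group i. The measurements are hypothetical and were chosen to have no ties, so the hand calculation should agree with kruskal.test(), which additionally applies a tie correction when ties are present.

```r
# Hypothetical, tie-free measurements for three independent groups.
values <- c(2.1, 3.4, 1.8, 2.9,   4.2, 5.1, 3.9, 4.8,   6.3, 5.7, 7.0, 6.1)
group <- factor(rep(c("g1", "g2", "g3"), each = 4))

# Rank all observations together, ignoring group membership.
ranks <- rank(values)

# Group sizes, rank sums, and mean ranks.
n_i <- tapply(ranks, group, length)
rank_sums <- tapply(ranks, group, sum)
mean_ranks <- rank_sums / n_i

# Kruskal-Wallis H computed by hand (no tie correction needed here).
N <- length(values)
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Compare with the built-in test.
kw <- kruskal.test(values ~ group)
print(mean_ranks)
print(c(by_hand = H, built_in = unname(kw$statistic)))
```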
8.3 Case Study 2
Researchers from a local sleep institute are interested in studying the effects of alcohol
consumption on the quality of sleep. They recruited 20 healthy people between the ages
of 25 and 35 years to participate in a study. For an average sized person, one standard
alcoholic drink is equivalent to 5 ounces of wine, 1.5 ounces of an 80 proof distilled drink,
or 0.16 g/kg of body weight (Williams and Salamy, 1972). On the initial night of the study,
each participant's body weight was measured. Then, they were randomized into four
groups and asked to consume different amounts of alcohol based on their group and body
weight. The following are the randomized control/treatment groups:
Group 1: No alcohol consumed (0.00 g of alcohol/kg body weight)
Group 2: One standard drink (0.16 g of alcohol/kg body weight)
Group 3: Two standard drinks (0.32 g of alcohol/kg body weight)
Group 4: Three standard drinks (0.48 g of alcohol/kg body weight)
All the participants were given an equal amount of liquid to consume, with varying
amounts of alcoholic content, depending on their assigned group. The participants were
blinded to their group assignment (they were not told to which group they were
assigned), and the taste of alcohol was masked. Alcohol was administered 60 minutes
prior to “bedtime” for all participants, and they were allowed to sleep for an 8-hour
period. The next day, they were asked to rank their quality of sleep on a scale from 1 to 5:
1 = Could not sleep, 2 = Had some difficulty sleeping, 3 = Decent night's sleep, 4 =
Slept comfortably, 5 = Slept great
Formulate a question about the data that can be addressed by performing a
Kruskal–Wallis test.
Question: Does the amount of alcohol consumed before bedtime affect the quality of
sleep?
Formulate the null and alternative hypotheses that address the question
proposed.
Null Hypothesis (H0): There is no difference in the overall quality of sleep following
the consumption of differing levels of alcohol in a standard drink.
Alternative Hypothesis (H1): There is a difference in the overall quality of sleep
following the consumption of differing levels of alcohol in a standard drink.
For the summary of collected sleep satisfaction scores, see Table 8.2. Even though the
data are not highly skewed, because the dataset is particularly small and is composed of
ranked responses, it is appropriate to use a Kruskal–Wallis test for the analysis of these
data.
Figure 8.3 illustrates the SPSS output for the experimental results.
Refer to Figure 8.4 for a graphic illustration of the results.
After running the Kruskal–Wallis analysis, the output indicated a significant difference in
the quality of sleep between the four different treatment groups (χ² = 12.766, df = 3, p =
0.005). Thus, we reject the null hypothesis and the alternative hypothesis is supported.
The ranks chart reported the mean rank for each independent group: 14.90 for the
control group (0 g/kg, n1 = 5), 15.20 for one standard drink (0.16 g/kg, n2 = 5), 6.90 for
two standard drinks (0.32 g/kg, n3 = 5), and 5.00 for three standard drinks
(0.48 g/kg, n4 = 5). Recall that the output suggests there is a difference between groups,
but does not specify which groups differ significantly from which others. To determine
individual effects for each group relative to others, a post hoc test needs to be utilized.
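The reported numbers can be checked for internal consistency with a few lines of R. The sketch below uses only values stated in the text (the four mean ranks, the group sizes, and the test statistic); the variable names are hypothetical. The statistic rebuilt from the mean ranks carries no tie correction, so it is expected to come out somewhat smaller than the reported 12.766, which presumably includes a correction for the many ties on a 1 to 5 scale.

```r
# Values reported in the text for Case Study 2.
mean_ranks <- c(control = 14.90, one_drink = 15.20, two_drinks = 6.90, three_drinks = 5.00)
n_i <- c(5, 5, 5, 5)
N <- sum(n_i)

# Check 1: the rank sums should add up to N(N + 1) / 2.
rank_sums <- mean_ranks * n_i
total_rank_sum <- sum(rank_sums)
expected_total <- N * (N + 1) / 2

# Check 2: H rebuilt from the mean ranks, WITHOUT a tie correction.
H_uncorrected <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Check 3: p-value implied by the reported statistic and degrees of freedom.
p_from_reported <- pchisq(12.766, df = 3, lower.tail = FALSE)

print(c(total = total_rank_sum, expected = expected_total))
print(H_uncorrected)
print(round(p_from_reported, 3))
```

The rank sums total 210, as required for 20 observations, and the chi-square distribution with 3 degrees of freedom gives a p-value that rounds to the reported 0.005.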
Table 8.2 Kruskal–Wallis experimental data showing the summary of collected sleep
satisfaction scores. The median and mean are calculated and provided at the bottom.
Control One Drink Two Drinks Three Drinks
Figure 8.3 Kruskal–Wallis SPSS output.