11.2 Hettmansperger–Randles Estimators of Location and Shape
S. Taskinen and H. Oja
Affine equivariance properties of the functionals imply that in the elliptic case, $\mu(F_y) = \mu$ and $V(F_y) = p\,\Sigma/\mathrm{Tr}(\Sigma) = \Lambda$.
When location and shape functionals are applied to the empirical distribution function based on the sample $Y = (y_1, \ldots, y_n)^T$, we obtain estimators that we denote from now on by $\hat\mu = \mu(Y)$ and $\hat V = V(Y)$. The estimators are then naturally affine equivariant as well and, in the elliptic model, all location and shape estimates estimate the same population quantities $\mu$ and $\Lambda$ and are directly comparable without any modifications.
11.2.2 k-Step Location and Shape Estimators
The location estimator $\hat\mu$ based on a chosen location score function $T(y)$ solves the estimating equation
$$\mathrm{ave}\{T(y_i - \hat\mu)\} = 0.$$
The corresponding location functional $\mu(F_y)$ is then defined by $E[T(y - \mu(F_y))] = 0$.
If the identity score, $T(y) = y$, is used, the classical sample mean vector is obtained, which is optimal in the case of multivariate normality. The optimal location score function in the spherical case is $T(y) = \nabla\rho(\|y\|)$. The spatial median, Brown (1983), is obtained by using the spatial sign score function $S(y)$ defined in (11.1). The spatial median is a highly robust estimator of the symmetry center, with a 50 % breakdown point and a bounded influence function. It can be computed using the simple iteration steps
$$\hat\mu_k = \hat\mu_{k-1} + \frac{\mathrm{ave}\{S(y_i - \hat\mu_{k-1})\}}{\mathrm{ave}\{\|y_i - \hat\mu_{k-1}\|^{-1}\}}.$$
The estimator is, however, only rotation equivariant, that is, it satisfies (11.3) only for orthogonal $p \times p$ matrices $A$.
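The iteration above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `spatial_median`, the sample-mean starting value, the stopping tolerance and the handling of observations that coincide with the current iterate are all assumptions made here.

```python
import numpy as np

def spatial_median(Y, n_iter=200, tol=1e-10):
    """Spatial median via mu_k = mu_{k-1} + ave{S(r_i)} / ave{1/||r_i||}, r_i = y_i - mu_{k-1}."""
    Y = np.asarray(Y, dtype=float)
    mu = Y.mean(axis=0)                    # starting value (an assumption of this sketch)
    for _ in range(n_iter):
        R = Y - mu
        norms = np.linalg.norm(R, axis=1)
        keep = norms > 1e-12               # S(0) = 0, so coinciding points are skipped
        if not keep.any():
            break
        inv = 1.0 / norms[keep]
        # ratio of sums equals the ratio of averages in the iteration formula
        step = (R[keep] * inv[:, None]).sum(axis=0) / inv.sum()
        mu = mu + step
        if np.linalg.norm(step) < tol:
            break
    return mu
```

Because the sample mean is rotation equivariant and each step involves only Euclidean norms, the iterates computed from rotated data $Y Q^T$ with orthogonal $Q$ coincide with the rotated iterates, which is the rotation equivariance noted above.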
The affine equivariant spatial median can be obtained using the so-called transformation-retransformation technique. In that case, the observations are first standardized, the spatial median is then found for the standardized observations, and the estimate is then transformed back to the coordinate system of the original observations. See Chakraborty et al. (1998), Tyler et al. (2009) and Ilmonen et al. (2012) and the references therein. In Hettmansperger and Randles (2002), location and shape are estimated simultaneously: $\hat\mu$ and $\hat V$ are chosen to satisfy
$$\mathrm{ave}\{S(\hat e_i)\} = 0 \quad\text{and}\quad p\,\mathrm{ave}\{S(\hat e_i)S(\hat e_i)^T\} = I_p, \tag{11.4}$$
where $\hat e_i = \hat V^{-1/2}(y_i - \hat\mu)$ and $\hat V$ is standardized so that $\mathrm{Tr}(\hat V) = p$. The resulting location estimate $\hat\mu$ is an affine equivariant spatial median and the shape matrix estimate $\hat V$ is Tyler's M-estimate, Tyler (1987), with respect to the spatial median.
11 k-Step Hettmansperger–Randles Estimates
The statistical properties of HR estimators were studied in Hettmansperger and Randles (2002), Tyler (1987), and Dümbgen and Tyler (2005). They showed that the location and shape estimators have bounded influence functions, positive breakdown points and limiting multivariate normal distributions. The computation is very simple. As in the general M-estimation case, the estimating equations (11.4) can be rewritten in a way that provides the following iteration steps.
Iteration Steps 1 The HR location-scatter estimate is obtained using the following steps:
1. $\hat e_i = \hat V_{k-1}^{-1/2}(y_i - \hat\mu_{k-1})$, $i = 1, \ldots, n$,
2. $\hat\mu_k = \hat\mu_{k-1} + \hat V_{k-1}^{1/2}\,[\mathrm{ave}\{\|\hat e_i\|^{-1}\}]^{-1}\,\mathrm{ave}\{S(\hat e_i)\}$,
3. $\hat V_k = \hat V_{k-1}^{1/2}\,\mathrm{ave}\{S(\hat e_i)S(\hat e_i)^T\}\,\hat V_{k-1}^{1/2}$,
and $\hat V_k$ is standardized so that $\mathrm{Tr}(\hat V_k) = p$.
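The three steps can be sketched as follows. The function names `sym_sqrt` and `hr_k_step` are hypothetical, the symmetric square root is one possible choice of $\hat V^{1/2}$, and the sketch assumes that no standardized residual is exactly zero; it is an illustration under these assumptions rather than a reference implementation.

```python
import numpy as np

def sym_sqrt(V):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, Q = np.linalg.eigh(V)
    return (Q * np.sqrt(w)) @ Q.T

def hr_k_step(Y, mu0, V0, k):
    """Repeat the three iteration steps k times, starting from (mu0, V0)."""
    Y = np.asarray(Y, dtype=float)
    n, p = Y.shape
    mu = np.array(mu0, dtype=float)
    V = np.array(V0, dtype=float)
    V = p * V / np.trace(V)                            # standardize so that Tr(V) = p
    for _ in range(k):
        V_half = sym_sqrt(V)
        E = np.linalg.solve(V_half, (Y - mu).T).T      # step 1: e_i = V^{-1/2}(y_i - mu)
        r = np.linalg.norm(E, axis=1)
        S = E / r[:, None]                             # spatial signs S(e_i), assumes r > 0
        mu = mu + V_half @ S.mean(axis=0) / np.mean(1.0 / r)   # step 2
        V = V_half @ (S.T @ S / n) @ V_half            # step 3, same e_i as in step 2
        V = p * V / np.trace(V)                        # standardize so that Tr(V) = p
    return mu, V
```

A typical call would start from the sample mean and the sample covariance matrix, for example `hr_k_step(Y, Y.mean(axis=0), np.cov(Y, rowvar=False), k=3)`. Steps 2 and 3 both use the residuals computed in step 1 from the previous iterate, as in the list above.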
Unfortunately, there is no proof of the convergence of the above algorithm, nor of the existence and uniqueness of the HR estimates. It is, however, well known that convergence is attained if one repeats steps 1 and 2 alone (spatial median) or steps 1 and 3 alone (Tyler's scatter matrix). In this paper we proceed with the same practical solution as in Taskinen et al. (2010), that is, we start the iteration with some $\sqrt{n}$-consistent estimates and stop iterating after $k$ steps. The estimate then inherits some properties of the initial estimate but, with large $k$, its behavior is almost that of the regular HR estimate. We then give the following.
Definition 11.1. Let $\hat\mu_0$ and $\hat V_0$ be initial location and shape estimators. The k-step HR estimators $\hat\mu_k$ and $\hat V_k$ for location and shape are obtained by starting with $\hat\mu_0$ and $\hat V_0$ and repeating Iteration Steps 1 $k$ times.
Notice that the k-step estimators are affine equivariant if the initial estimators are affine equivariant. In Croux et al. (2010), the robustness and efficiency properties of the k-step Tyler's shape estimator were studied. They showed, for example, that the breakdown property of the k-step estimator is inherited from the initial estimator. The approach used in Croux et al. (2010) differs from ours in that the location center is assumed to be fixed. In the following sections, we derive influence functions and asymptotic properties for the simultaneous k-step HR location and shape estimators.
11.2.3 Influence Functions
The robustness of a functional $T$ against a single outlier $y$ can be measured using the influence function, Hampel et al. (1986). Let
$$F_\varepsilon = (1 - \varepsilon)F + \varepsilon\Delta_y$$
denote the contaminated distribution, where $\Delta_y$ is the cdf of a distribution with probability mass one at point $y$. The influence function of $T$ is then given by
$$\mathrm{IF}(y; T, F) = \lim_{\varepsilon \to 0} \frac{T(F_\varepsilon) - T(F)}{\varepsilon}.$$
A continuous and bounded influence function indicates good local robustness
properties of an estimator. We now find the influence functions of the k-step HR
estimators in the elliptic case.
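A finite-sample analogue of this definition is the sensitivity curve, which is a different but closely related tool: adding one observation $y$ to a sample of size $n$ gives the empirical distribution of the contaminated form with mass $1/(n+1)$ at $y$. The helper below is a hypothetical sketch; `sensitivity_curve` and its arguments are names chosen here, and `estimator` stands for any function mapping a data matrix to an estimate.

```python
import numpy as np

def sensitivity_curve(estimator, Y, y):
    """Difference quotient (T(F_eps) - T(F)) / eps with F the empirical cdf and eps = 1/(n+1)."""
    Y = np.asarray(Y, dtype=float)
    eps = 1.0 / (len(Y) + 1)
    contaminated = np.vstack([Y, y])       # sample with the single outlier y appended
    return (estimator(contaminated) - estimator(Y)) / eps
```

For the sample mean, `sensitivity_curve(lambda A: A.mean(axis=0), Y, y)` equals `y - Y.mean(axis=0)`, which is linear in $y$ and therefore unbounded.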
Due to affine equivariance properties of our estimators, it suffices to derive influence functions at a spherical distribution $F_0$ of $e$. Hampel et al. (1986) and Ollila et al. (2004) showed that in that case, the influence functions of all location and shape functionals, $\mu(F)$ and $V(F)$, are of the form
$$\mathrm{IF}(y; \mu, F_0) = \gamma(r)\,u \tag{11.5}$$
and
$$\mathrm{IF}(y; V, F_0) = \alpha(r)\left[uu^T - \frac{1}{p} I_p\right], \tag{11.6}$$
where $r = \|y\|$, $u = \|y\|^{-1} y$ and the real-valued functions $\gamma(r)$ and $\alpha(r)$ depend both on the functionals and on the underlying distribution $F_0$. When comparing robustness properties of different estimators, it is enough to compare the weight functions $\gamma$ and $\alpha$ only. In the following we will derive these functions for k-step HR estimators.
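Let now $\mu_k = \mu_k(F_y)$ and $V_k = V_k(F_y)$ be the functionals corresponding to the k-step HR estimators $\hat\mu_k$ and $\hat V_k$, that is,
$$\mu_k = \mu_{k-1} + \frac{V_{k-1}^{1/2}\,E[S(e)]}{E[\|e\|^{-1}]} \tag{11.7}$$
and
$$V_k = p\left[\mathrm{Tr}\!\left(V_{k-1}^{1/2}\,E_F[S(e)S(e)^T]\,V_{k-1}^{1/2}\right)\right]^{-1} V_{k-1}^{1/2}\,E_F[S(e)S(e)^T]\,V_{k-1}^{1/2}, \tag{11.8}$$
where $e = V_{k-1}^{-1/2}(y - \mu_{k-1})$.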
We prove the following result in the Appendix.
Theorem 11.1. The influence functions of the k-step HR location and scatter functionals $\mu_k$ and $V_k$ with initial functionals $\mu_0$ and $V_0$ at $F_0$, the distribution of spherical $e$ with $\mathrm{Cov}(e) = I_p$, are given by (11.5) and (11.6), respectively, with
$$\gamma_k(r) = \left(\frac{1}{p}\right)^{k} \gamma_0(r) + \left[1 - \left(\frac{1}{p}\right)^{k}\right] p\,[(p-1)E(\|e\|^{-1})]^{-1},$$
and
$$\alpha_k(r) = \left(\frac{p}{p+2}\right)^{k} \alpha_0(r) + \left[1 - \left(\frac{p}{p+2}\right)^{k}\right] (p+2).$$
[Figure 11.1: two panels plotting the weight function $\gamma_\mu$ against $r$. Curves in each panel: HR location, 3-step, 2-step and 1-step HR location, and the starting functional (mean vector in one panel, S-estimator in the other).]
Fig. 11.1 Functions $\gamma_k$ for the k-step HR location functionals with $k = 0, 1, 2, 3$ and $\infty$ when the regular mean vector (left figure) and the 50 % BP S-estimator with biweight loss function (right figure) are used as starting functionals. The functions are computed for the bivariate standard normal distribution
First note that the influence functions of the regular HR location-scatter estimate are obtained when $k \to \infty$, that is,
$$\gamma(r) = p\,[(p-1)E(\|e\|^{-1})]^{-1} \quad\text{and}\quad \alpha(r) = p + 2.$$
The above influence functions are clearly bounded if those of the initial estimators are bounded. In Fig. 11.1 we illustrate the behaviour of the function $\gamma_k$ in the bivariate standard normal case using two different initial estimators. When the sample mean vector is used as a starting value, the resulting k-step estimators naturally have unbounded influence functions, although after a few steps the influence function is very close to that of the affine equivariant spatial median. When the highly robust 50 % breakdown point S-estimator with biweight loss function, Davies (1987), is used as an initial estimator, bounded influence functions are obtained. Notice that after a few steps the influence function does not differ much from that of the HR location estimator.
In Fig. 11.2, the influence functions for k-step HR shape estimators are illustrated in the bivariate standard normal case. As initial estimators we use the sample covariance matrix as well as the 50 % breakdown point S-estimator with biweight loss function. When we start with the non-robust sample covariance matrix, unbounded influence functions are obtained, but an estimator with better robustness properties is again obtained after a few steps. When the S-estimator is used as a starting value, the influence functions of the resulting k-step estimators are naturally bounded.
In the following section, we will compare the efficiency properties of k-step HR estimators with different initial estimators. We will show that, after only a few steps, the initial estimator has very little influence on the resulting efficiencies.
[Figure 11.2: two panels plotting the weight function $\alpha_V$ against $r$. Curves in each panel: HR shape, 3-step, 2-step and 1-step HR shape, and the starting functional (covariance matrix in one panel, S-estimator in the other).]
Fig. 11.2 Functions $\alpha_k$ for the k-step HR shape functionals with $k = 0, 1, 2, 3$ and $\infty$ when the regular covariance matrix (left figure) and the 50 % BP S-estimator with biweight loss function (right figure) are used as starting functionals. The functions are computed for the bivariate standard normal distribution
11.2.4 Limiting Distributions and Asymptotic Relative Efficiencies
The asymptotic normality of k-step HR estimators follows if the initial estimators are $\sqrt{n}$-consistent and have limiting multinormal distributions. In the following, we write $\mathrm{vec}(V)$ for the vectorization of a matrix $V$, obtained by stacking the columns of $V$ on top of each other. We also denote
$$C_{p,p}(V) = (I_{p^2} + K_{p,p})(V \otimes V) - \frac{2}{p}\,\mathrm{vec}(V)\,\mathrm{vec}^T(V),$$
where $K_{p,p}$ is the commutation matrix, that is, a $p^2 \times p^2$ block matrix with $(i,j)$-block being equal to a $p \times p$ matrix that has 1 at entry $(j,i)$ and zero elsewhere.
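The block description of $K_{p,p}$ translates directly into code. The sketch below builds the commutation matrix and $C_{p,p}(V)$ with NumPy; the function names `commutation_matrix`, `vec` and `C_pp` are chosen here for illustration, and `vec` stacks columns, matching the definition above.

```python
import numpy as np

def commutation_matrix(p):
    """K_{p,p}: the (i, j) block is a p x p matrix with a single 1 at entry (j, i)."""
    K = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            K[i * p + j, j * p + i] = 1.0
    return K

def vec(A):
    """Stack the columns of A on top of each other."""
    return np.asarray(A, dtype=float).reshape(-1, order="F")

def C_pp(V):
    """C_{p,p}(V) = (I + K_{p,p})(V kron V) - (2/p) vec(V) vec(V)^T."""
    V = np.asarray(V, dtype=float)
    p = V.shape[0]
    v = vec(V)
    return (np.eye(p * p) + commutation_matrix(p)) @ np.kron(V, V) - (2.0 / p) * np.outer(v, v)
```

Two quick sanity checks are that `commutation_matrix(p) @ vec(A)` equals `vec(A.T)` for any $p \times p$ matrix `A`, and that `C_pp(np.eye(p)) @ vec(np.eye(p))` is the zero vector, which reflects the trace standardization of the shape matrix.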
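Theorem 11.2. Let $y_1, \ldots, y_n$ be a random sample from $F_0$, the distribution of spherical $e$ with $\mathrm{Cov}(e) = I_p$. Assume that $\sqrt{n}\,\hat\mu_0$ and $\sqrt{n}\,\mathrm{vec}(\hat V_0 - I_p)$ have a joint limiting multivariate normal distribution. Then
$$\sqrt{n}\,\hat\mu_k \xrightarrow{d} N(0, \tau_{1k} I_p) \quad\text{and}\quad \sqrt{n}\,\mathrm{vec}(\hat V_k - I_p) \xrightarrow{d} N(0, \tau_{2k}\,C_{p,p}(I_p)),$$
where
$$\tau_{1k} = \frac{E[\gamma_k^2(\|e\|)]}{p} \quad\text{and}\quad \tau_{2k} = \frac{E[\alpha_k^2(\|e\|)]}{p(p+2)}.$$
Functions $\gamma_k$ and $\alpha_k$ are given in Theorem 11.1.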
The limiting distributions at an elliptical distribution follow from the affine equivariance properties of the estimators. See for example Ollila et al. (2004) and Taskinen et al. (2010).
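Corollary 11.1. Let $y_1, \ldots, y_n$ be a random sample from $F$, an elliptical distribution of $\Sigma^{1/2} e + \mu$ where $e$ is spherical with $\mathrm{Cov}(e) = I_p$. Write $\Lambda = (p/\mathrm{tr}(\Sigma))\Sigma$. Then
$$\sqrt{n}\,(\hat\mu_k - \mu) \xrightarrow{d} N(0, \tau_{1k}\Sigma) \quad\text{and}\quad \sqrt{n}\,\mathrm{vec}(\hat V_k - \Lambda) \xrightarrow{d} N(0, \tau_{2k}\,W C_{p,p}(\Lambda) W^T),$$
where $\tau_{1k}$ and $\tau_{2k}$ are given in Theorem 11.2 and $W = I_{p^2} - p^{-1}\,\mathrm{vec}(\Lambda)\,\mathrm{vec}(I_p)^T$.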
In order to compare asymptotic relative efficiencies of different estimators, one only has to compare the scalars $\tau_{1k}$ and $\tau_{2k}$. In Table 11.1 we list the asymptotic relative efficiencies of k-step HR location estimators as compared to the sample mean at different p-variate t-distributions with selected values of dimension $p$ and degrees of freedom $\nu$, where $\nu = \infty$ refers to the multinormal case. As in the previous section, we use the sample mean and the 50 % BP S-estimator as starting values.
Table 11.1 Asymptotic relative efficiencies of k-step HR location estimators as compared to the sample mean at different p-variate t-distributions with selected values of dimension p and degrees of freedom ν. The sample mean (a) and the 50 % BP S-estimator (b) are used as starting values

               (a)                      (b)
p    k    ν=3    ν=6    ν=∞      ν=3    ν=6    ν=∞
2    1    1.600  1.135  0.936    2.025  1.035  0.697
2    2    1.882  1.135  0.867    2.045  1.074  0.747
2    3    1.969  1.115  0.827    2.031  1.083  0.768
2    4    1.992  1.101  0.806    2.017  1.085  0.777
2    5    1.998  1.093  0.796    2.009  1.085  0.781
2    ∞    2.000  1.084  0.785    2.000  1.084  0.785
5    1    2.094  1.238  0.937    2.359  1.259  0.898
5    2    2.274  1.250  0.912    2.318  1.253  0.904
5    3    2.300  1.250  0.907    2.308  1.251  0.905
5    4    2.304  1.250  0.906    2.306  1.250  0.905
5    5    2.305  1.250  0.905    2.306  1.250  0.905
5    ∞    2.306  1.250  0.905    2.306  1.250  0.905
10   1    2.302  1.297  0.960    2.451  1.320  0.950
10   2    2.412  1.312  0.952    2.426  1.314  0.951
10   3    2.421  1.313  0.951    2.423  1.313  0.951
10   4    2.422  1.313  0.951    2.423  1.313  0.951
10   5    2.422  1.313  0.951    2.422  1.313  0.951
10   ∞    2.422  1.313  0.951    2.422  1.313  0.951
Consider first the efficiency results for the simple k-step estimator that uses the sample mean vector as a starting value. In the case of high-dimensional data, the k-step estimators are very efficient even in the multinormal case, and after a few steps the efficiencies are already very close to those of the regular HR estimators. As seen in the previous section, in the case of low-dimensional data, several steps are needed to obtain an estimator with reasonable robustness properties. When multinormal data are considered, such an estimator seems to lack efficiency. To study the effect of the initial estimator on the efficiencies, the 50 % BP S-estimator was also used as a starting value. When $k$ is large enough, the initial estimator has very little influence on the efficiencies. For example, when different 5-step estimators are compared, the efficiencies are almost alike regardless of the distribution.
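The multinormal entries of Table 11.1(a) can be recomputed from Theorems 11.1 and 11.2 in closed form. The sketch below assumes that the sample mean has weight function $\gamma_0(r) = r$ at $F_0$ (its influence function is $y = ru$), so that its scalar in the limiting covariance equals one; `tau_1k` is the code name used here for the scalar multiplying $I_p$ in Theorem 11.2. The function names are hypothetical and $p \ge 2$ is required.

```python
import math

def chi_moment(p, m):
    """E[r^m] for r = ||e|| with e ~ N_p(0, I_p); valid when p + m > 0."""
    return 2.0 ** (m / 2.0) * math.gamma((p + m) / 2.0) / math.gamma(p / 2.0)

def are_location_mean_start(p, k):
    """ARE of the k-step HR location estimator (sample-mean start) w.r.t. the sample mean."""
    limit = p / ((p - 1) * chi_moment(p, -1))   # gamma(r) of the regular HR functional
    a = (1.0 / p) ** k                          # weight left on gamma_0(r) = r
    b = (1.0 - a) * limit
    # E[gamma_k(r)^2] for gamma_k(r) = a * r + b
    second_moment = a * a * chi_moment(p, 2) + 2.0 * a * b * chi_moment(p, 1) + b * b
    tau_1k = second_moment / p
    return 1.0 / tau_1k                         # the sample mean has scalar 1 at F_0
```

For example, `are_location_mean_start(2, 1)` is approximately 0.936 and `are_location_mean_start(10, 1)` approximately 0.960, in agreement with the table, and for large `k` and `p = 2` the value approaches $\pi/4 \approx 0.785$.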
In Table 11.2 the asymptotic relative efficiencies of k-step HR shape estimators are given as compared to the shape estimator based on the sample covariance matrix. We again use the sample covariance matrix as well as the 50 % BP S-estimator as starting values. As seen in Table 11.2, in the multinormal case the k-step shape estimators are very inefficient no matter which estimator is used as a starting value. For heavy-tailed distributions, the k-step estimators outperform the initial sample covariance matrix. Again, after five steps the efficiencies are very close to those of the limiting estimators, and the efficiencies of the estimators started from the sample covariance matrix are very similar to those of the estimators started from the S-estimator.
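As a hand check of the multinormal column of Table 11.2(a), the case $p = 2$, $k = 1$ can be worked out from Theorems 11.1 and 11.2. The sketch assumes that the shape functional based on the covariance matrix has weight function $\alpha_0(r) = r^2$ at $F_0$, which is not stated above but gives $\tau_{20} = E[r^4]/(p(p+2)) = 1$. Here $\tau_{2k}$ denotes the scalar multiplying $C_{p,p}(I_p)$ in Theorem 11.2, and $E[r^2] = 2$, $E[r^4] = 8$ are the moments of $r = \|e\|$ for bivariate standard normal $e$.

$$
\begin{aligned}
\alpha_1(r) &= \tfrac{1}{2}\,r^2 + \bigl(1 - \tfrac{1}{2}\bigr)\cdot 4 = \tfrac{1}{2}\,r^2 + 2,\\
E[\alpha_1^2(r)] &= \tfrac{1}{4}\,E[r^4] + 2\,E[r^2] + 4 = 2 + 4 + 4 = 10,\\
\tau_{21} &= \frac{10}{2\cdot 4} = 1.25, \qquad \mathrm{ARE} = \tau_{20}/\tau_{21} = 0.800 .
\end{aligned}
$$

Letting $k \to \infty$ gives $\alpha(r) = p + 2$ and a limiting scalar $(p+2)^2/(p(p+2)) = (p+2)/p = 2$, that is, an efficiency of $0.500$. Both values agree with the table.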
11.3 Hettmansperger–Randles Estimators of Regression
11.3.1 k-Step Regression Estimators
Assume next the linear regression model
$$y_i = B^T x_i + \Sigma^{1/2} e_i, \quad i = 1, \ldots, n,$$
where $y_i$ are the p-variate response vectors, $B$ is the $q \times p$ matrix of unknown regression parameters and $\Sigma$ is the covariance matrix of the residuals. The q-vector of explanatory variables $x_i$ and the standardized p-variate residuals $e_i$ are independent, and $e_i$ is spherical around zero with $\mathrm{Cov}(e_i) = I_p$. Finally, $(x_i, e_i)$, $i = 1, \ldots, n$, are iid. We may then also write
$$Y = XB + E\Sigma^{1/2}, \tag{11.9}$$
where $Y = (y_1, \ldots, y_n)^T$ and $E = (e_1, \ldots, e_n)^T$ are $n \times p$ matrices, and $X = (x_1, \ldots, x_n)^T$ is an $n \times q$ matrix.
The regression estimator $\hat B$ based on the location score function $T(y)$ solves
$$\mathrm{ave}\{T(y_i - B^T x_i)\,x_i^T\} = 0. \tag{11.10}$$
Table 11.2 Asymptotic relative efficiencies of k-step HR shape estimators as compared to the sample covariance matrix based shape estimator at different p-variate t-distributions with selected values of dimension p and degrees of freedom ν. The sample covariance matrix (a) and the 50 % BP S-estimator (b) are used as starting values

               (a)                      (b)
p    k    ν=5    ν=8    ν=∞      ν=5    ν=8    ν=∞
2    1    1.714  1.091  0.800    1.377  0.687  0.458
2    2    1.778  0.941  0.640    1.472  0.733  0.489
2    3    1.670  0.846  0.566    1.495  0.746  0.496
2    4    1.590  0.796  0.532    1.500  0.749  0.498
2    5    1.546  0.774  0.516    1.500  0.750  0.499
2    ∞    1.500  0.750  0.500    1.500  0.750  0.500
5    1    2.194  1.205  0.831    2.173  1.094  0.744
5    2    2.221  1.119  0.748    2.159  1.081  0.723
5    3    2.170  1.086  0.724    2.149  1.074  0.717
5    4    2.151  1.075  0.717    2.144  1.072  0.715
5    5    2.145  1.073  0.715    2.143  1.072  0.715
5    ∞    2.143  1.071  0.714    2.143  1.071  0.714
10   1    2.512  1.301  0.878    2.521  1.245  0.851
10   2    2.520  1.261  0.841    2.505  1.253  0.836
10   3    2.504  1.252  0.835    2.501  1.251  0.834
10   4    2.501  1.250  0.834    2.500  1.250  0.834
10   5    2.500  1.250  0.833    2.500  1.250  0.833
10   ∞    2.500  1.250  0.833    2.500  1.250  0.833
With the identity score $T(y) = y$, the classical least squares (LS) estimator for model (11.9) is obtained. The solution $\hat B = \hat B(X, Y) = (X^T X)^{-1} X^T Y$ is then fully equivariant, that is, it satisfies
$$\hat B(X, XH + Y) = \hat B(X, Y) + H,$$
for all $q \times p$ matrices $H$ (regression equivariance). Further,
$$\hat B(X, YW) = \hat B(X, Y)\,W,$$
for all nonsingular $p \times p$ matrices $W$ (Y-equivariance) and
$$\hat B(XV, Y) = V^{-1} \hat B(X, Y),$$
for all nonsingular $q \times q$ matrices $V$ (X-equivariance).
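The three equivariance properties of the LS estimator are easy to confirm numerically. The check below uses simulated data; the helper name `ls_estimator`, the dimensions, the seed and the particular matrices `H`, `W` and `V` are arbitrary choices made for illustration.

```python
import numpy as np

def ls_estimator(X, Y):
    """Least squares estimate (X^T X)^{-1} X^T Y for the model Y = X B + residuals."""
    return np.linalg.solve(X.T @ X, X.T @ Y)

rng = np.random.default_rng(0)
n, q, p = 50, 3, 2
X = rng.normal(size=(n, q))
Y = rng.normal(size=(n, p))
H = rng.normal(size=(q, p))
W = np.eye(p) + 0.1 * rng.normal(size=(p, p))   # small perturbation of I_p, assumed nonsingular
V = np.eye(q) + 0.1 * rng.normal(size=(q, q))   # small perturbation of I_q, assumed nonsingular

B = ls_estimator(X, Y)
reg_ok = np.allclose(ls_estimator(X, X @ H + Y), B + H)           # regression equivariance
y_ok = np.allclose(ls_estimator(X, Y @ W), B @ W)                 # Y-equivariance
x_ok = np.allclose(ls_estimator(X @ V, Y), np.linalg.solve(V, B)) # X-equivariance
```

All three flags should be `True` up to floating-point error, since the identities hold exactly for any full-rank `X`.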
As in the case of location estimation, a robust regression estimator is obtained by replacing the identity scores used in (11.10) with spatial sign scores $S(y)$. This choice