Risk measures: Credit Value-at-Risk and Expected Shortfall
Risk measurement for a repo portfolio
$\alpha$ per cent is defined as the expected value of losses exceeding the $\alpha$ per cent VaR, or equivalently the expected outcome in the worst $(1 - \alpha)$ per cent of cases.
To be more precise, let $x \in \mathbb{R}^d$ denote a random variable with a positive density $p(x)$. For each decision vector $\nu$, chosen from a certain subset $\Psi$ of $\mathbb{R}^n$, let $h(\nu, x)$ denote the portfolio loss random variable, having a distribution in $\mathbb{R}$, induced by that of $x$. For a fixed $\nu$ the cumulative distribution function for the portfolio loss variable is given by

$$F(\nu, \eta) = \int_{h(\nu, x) \le \eta} p(x)\,dx = P\{h(\nu, x) \le \eta\}$$

and its inverse is defined as

$$F^{-1}(\nu, \alpha) = \min\{\eta : F(\nu, \eta) \ge \alpha\}.$$
The $\mathrm{VaR}_\alpha$ and $\mathrm{ES}_\alpha$ values for the loss random variable associated with $\nu$ and a specified confidence level $\alpha$ are given by

$$\mathrm{VaR}_\alpha(\nu) = F^{-1}(\nu, \alpha)$$

and

$$\mathrm{ES}_\alpha(\nu) = (1 - \alpha)^{-1} \int_{h(\nu, x) \ge \mathrm{VaR}_\alpha(\nu)} h(\nu, x)\, p(x)\,dx.$$
As mentioned above, $\mathrm{VaR}_\alpha$ is the $\alpha$-quantile of the portfolio loss distribution and $\mathrm{ES}_\alpha$ gives the expected value of losses exceeding $\mathrm{VaR}_\alpha$.
As is well known, unlike VaR, ES is a coherent risk measure.15 One of the main problems with VaR is that, in general, it is not sub-additive, which implies that it is possible to construct two portfolios A and B such that VaR(A+B) > VaR(A) + VaR(B). In other words, the VaR of the combined portfolio exceeds the sum of the individual VaRs, thus discouraging diversification. Another feature of VaR which, compared to ES, makes it unattractive from a computational point of view is its lack of convexity. For these reasons, ES is used as the preferred risk measure here.
Generally, the approaches used for the computation of risk measures can be
classified either as fully- or semi-parametric approaches. In a fully-parametric
15
Artzner et al. (1999) call a risk measure coherent if it is translation invariant, positively homogeneous, sub-additive and monotonic with respect to stochastic dominance of order one.
Heinle, E. and Koivu, M.
approach the portfolio loss distribution is assumed to follow some parametric distribution, e.g. the normal or t-distribution, on the basis of which the relevant risk measures (VaR and ES) can be easily estimated. For example, in case the portfolio loss random variable $h(\nu, x)$ follows a normal distribution, which is widely applied as the basis for market VaR calculations (Gupton et al. 1997), the 99 per cent VaR would be calculated as

$$\mathrm{VaR}_{99\%}(h) = \mu_h + N^{-1}(0.99)\,\sigma_h \approx \mu_h + 2.33\,\sigma_h,$$

where $\mu_h$ and $\sigma_h$ denote the mean and volatility of the losses.
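In the normal case ES also has a closed form, $\mathrm{ES}_\alpha = \mu_h + \sigma_h\,\varphi(N^{-1}(\alpha))/(1-\alpha)$, with $\varphi$ the standard normal density. A minimal sketch (function and parameter names are illustrative, not from the text):

```python
from statistics import NormalDist

def normal_var_es(mu, sigma, alpha=0.99):
    """Parametric VaR and ES when losses are N(mu, sigma^2).

    VaR_a = mu + N^{-1}(a) * sigma, and for the normal distribution
    ES_a = mu + sigma * phi(N^{-1}(a)) / (1 - a), with phi the
    standard normal density.
    """
    z = NormalDist().inv_cdf(alpha)              # N^{-1}(alpha), ~2.33 for 99%
    var = mu + z * sigma
    es = mu + sigma * NormalDist().pdf(z) / (1.0 - alpha)
    return var, es

var99, es99 = normal_var_es(mu=0.0, sigma=1.0, alpha=0.99)
```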
In a semi-parametric approach the portfolio loss distribution is not
explicitly known. However, what is known instead is the (multivariate)
probability distribution of the random variable x, driving the portfolio
losses, and the mapping to the portfolio losses hðn; xÞ. In many cases, as is
also done here, the estimation of the portfolio risk measures has to be done
numerically by utilizing Monte Carlo (MC) simulation techniques.
The estimation of portfolio a-VaR by plain MC simulation can be
achieved by first simulating a set of portfolio losses, organizing the losses in
increasing order and, finally, finding the value under which a ·100% of the
losses lie. ES can then be calculated by taking the average of the losses
exceeding this value.
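This sorting-based procedure can be sketched in a few lines (a minimal illustration with made-up names, assuming equally probable sample points):

```python
import random

def empirical_var_es(losses, alpha=0.99):
    """Plain-MC estimates: sort the simulated losses, read off the value
    under which alpha*100% of the losses lie (VaR), then average the
    losses exceeding that value (ES)."""
    srt = sorted(losses)
    k = int(alpha * len(srt))        # index of the alpha-quantile
    var = srt[k]
    tail = srt[k:]                   # the worst (1 - alpha) share of outcomes
    return var, sum(tail) / len(tail)

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(100_000)]
var99, es99 = empirical_var_es(sample)
# for standard normal losses these should be close to 2.33 and 2.67
```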
However, this sorting-based procedure fails in case the generated sample
points are not equally probable, as happens e.g. when a variance reduction
technique called importance sampling is used to improve the accuracy of
the risk measure estimates. Fortunately, there exists an alternative way to
compute VaR and ES simultaneously, that is also applicable in the presence
of arbitrary variance reduction techniques.
Rockafellar and Uryasev (2000) have shown that $\mathrm{ES}_\alpha$ (with confidence level $\alpha$) can be obtained as the solution of a convex optimization problem

$$\mathrm{ES}_\alpha(\nu) = \min_{m \in \mathbb{R}} \; m + (1 - \alpha)^{-1} \int_{x \in \mathbb{R}^d} [h(\nu, x) - m]^+ \, p(x)\,dx, \qquad (10.1)$$

where $[z]^+ = \max\{z, 0\}$, and the value of $m$ which minimizes equation (10.1) equals $\mathrm{VaR}_\alpha$. An MC-based estimate for $\mathrm{ES}_\alpha$ and $\mathrm{VaR}_\alpha$ is obtained by generating a sample of realizations for the portfolio loss variable and by solving

$$\widehat{\mathrm{ES}}_\alpha(\nu) = \min_{m \in \mathbb{R}} \; m + (1 - \alpha)^{-1} \frac{1}{N} \sum_{i=1}^{N} [h(\nu, x_i) - m]^+.$$
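For an equally weighted sample the minimum of the sample problem is attained at one of the simulated loss values, so a direct scan over candidates suffices. A sketch (illustrative names, quadratic-time for clarity):

```python
def es_rockafellar_uryasev(losses, alpha=0.95):
    """Minimise m + (1 - alpha)^{-1} * mean([loss - m]^+) over m.

    The minimum value estimates ES_alpha; a minimising m estimates
    VaR_alpha (the minimiser need not be unique in finite samples)."""
    n = len(losses)
    best_m, best_val = None, float("inf")
    for m in sorted(losses):                 # candidate minimisers
        val = m + sum(max(l - m, 0.0) for l in losses) / ((1.0 - alpha) * n)
        if val < best_val:
            best_m, best_val = m, val
    return best_m, best_val

losses = [0.0] * 95 + [1.0, 2.0, 3.0, 4.0, 5.0]   # toy sample of 100 losses
m_star, es95 = es_rockafellar_uryasev(losses, alpha=0.95)
# es95 equals the average of the five worst losses, i.e. 3.0
```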
This problem can easily be solved either by formulating the above
problem as a linear program, as in Rockafellar and Uryasev (2000, 2002),
which requires introducing N auxiliary variables and inequality constraints
to the model, or by directly solving the one-dimensional non-smooth minimization problem, e.g. with a sub-gradient algorithm (see Bertsekas 1999 or Nesterov 2004 for more information on sub-gradient methods).
These expressions are readily applicable in optimization applications
where the optimization is also performed over the portfolio holdings n and
the objective is to find an investment portfolio which minimizes the portfolio
ESa (see Rockafellar and Uryasev 2000, 2002; Andersson et al. 2001).
Another complication related to the specifics of credit risk estimation is that credit events are extremely rare, so that without a highly diversified portfolio the total default probability may fall short of the e.g. 1 per cent tail and thus the 99 per cent VaR will be zero. ES, on the other hand, also accounts for the magnitude of
the tail events and is thus able to capture the difference between credit
exposures concentrated far into the tail.
As an example, consider a portfolio consisting of n different obligors with
4 basis point PD, which corresponds to a rating of AA–. Assuming for
simplicity that the different debtors are uncorrelated, it can be calculated
with elementary probability rules how many obligors the portfolio should
contain in order to obtain a non-zero VaR:
$$P(\text{at least one obligor defaults}) = 1 - P(\text{no obligor defaults}) = 1 - 0.9996^n > 0.01 \;\Leftrightarrow\; n > 25$$
Including the effect of correlation would mean that the number of obligors
should be even higher for VaR to be positive.
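The threshold above can be checked numerically with a short scan:

```python
# Smallest number n of uncorrelated obligors with PD = 0.04% for which
# P(at least one default) = 1 - 0.9996**n exceeds 1%, making 99% VaR positive.
n = 1
while 1.0 - 0.9996 ** n <= 0.01:
    n += 1
# the loop stops at n = 26, consistent with the condition n > 25
```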
6. An efficient Monte Carlo approach for credit risk estimation
The simplest and the best-known method for numerical approximation of
high-dimensional integrals is the Monte Carlo method (MC), i.e. random
sampling. However, in the literature of numerical integration, there exist
many techniques that can be used to improve the performance of MC
sampling schemes in high-dimensional integration. These techniques can be
generally classified as variance reduction techniques since they all aim at
reducing the variance of the MC estimates. The most widely applied variance reduction techniques in financial engineering include importance sampling
(IS), (randomized) quasi-Monte Carlo (QMC) methods, antithetic variates
and control variates; see Glasserman 2004 for a thorough introduction to
these techniques. Often the use of these techniques substantially improves
the accuracy of the resulting MC estimates, thus effectively reducing the
computational burden required by the simulations; see e.g. Jäckel (2002),
Glasserman (2004) and the references therein. In the context of credit risk
(rare event) simulations the application of importance sampling usually
offers substantial variance reductions, but combining IS with e.g. QMC
sampling techniques can improve the computational efficiency even further.
Sections 6.1 and 6.2 provide a brief overview and motivation for the
variance reduction techniques applied in this study: importance sampling
and randomized QMC methods. Section 6.3 demonstrates the effectiveness
of different variance reduction techniques in the estimation of credit risk
measures, in order to find out which combination of variance reduction
techniques would be well suited for the residual risk estimation in the
Eurosystem’s credit operations.
6.1 Importance sampling
Importance sampling is especially suited for rare event simulations. Roughly
speaking, the objective of importance sampling is to make rare events less
rare by concentrating the sampling effort in the region of the sampling space
which matters most to the value of the integrand, i.e. the tail of the distribution in the credit risk context.
Recall that the usual MC estimator of the expectation of a (loss) function $h$,

$$\mu = \int_{x \in \mathbb{R}^d} h(\nu, x)\, p(x)\,dx = E[h(x)],$$

where $x \in \mathbb{R}^d$ is a d-dimensional random variable with a positive density $p(x)$, is

$$\hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} h(x_i),$$

where $N$ is the number of simulated sample points.
When calculating the expectation of a function which takes a non-zero value only upon the occurrence of a rare event, say a default with a probability of e.g. 0.04 per cent, it is not efficient to sample points from the
distribution of x, since the majority of the generated sample points will
provide no information about the behaviour of the integrand. Instead, it
would seem intuitive to generate the random samples so that they are more
concentrated in the region which matters most to the value of the integrand.
This intuition is reflected in importance sampling by replacing the original
sampling distribution with a distribution which increases the likelihood that
‘important’ observations are drawn.
Let $g$ be any other probability density on $\mathbb{R}^d$ satisfying $p(x) > 0 \Rightarrow g(x) > 0$ for all $x \in \mathbb{R}^d$. Then $\mu$ can be written as

$$\mu = E[h(x)] = \int_{x \in \mathbb{R}^d} h(\nu, x)\, \frac{p(x)}{g(x)}\, g(x)\,dx = \tilde{E}\!\left[h(x)\, \frac{p(x)}{g(x)}\right],$$

where $\tilde{E}$ indicates that the expectation is taken with respect to the probability measure $g$, and the MC estimator of $\mu$ becomes

$$\hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} h(\tilde{x}_i)\, \frac{p(\tilde{x}_i)}{g(\tilde{x}_i)},$$

with $\tilde{x}_i,\ i = 1, \ldots, N$, independent draws from $g$; the weight $p(\tilde{x}_i)/g(\tilde{x}_i)$ is the ratio of the original density and the new importance sampling density.
The IS estimator of $\mathrm{ES}_\alpha$ and $\mathrm{VaR}_\alpha$ can now be obtained by simply solving the problem

$$\widehat{\mathrm{ES}}_\alpha(\nu) = \min_{m \in \mathbb{R}} \; m + (1 - \alpha)^{-1} \frac{1}{N} \sum_{i=1}^{N} \frac{p(\tilde{x}_i)}{g(\tilde{x}_i)}\, [h(\nu, \tilde{x}_i) - m]^+, \qquad (10.2)$$

where $\tilde{x}_i,\ i = 1, \ldots, N$, are independent draws from the density $g$.
Successful application of importance sampling requires that:
1) g is chosen so that the variance of the IS estimator is less than the
variance of the original MC estimate;
2) g is easy to sample from;
3) p(x)/g(x) is easy to evaluate.
Generally, fulfilling these requirements is not at all trivial. However, in some cases, e.g. when $p(x)$ is a multivariate normal density (the case used in the credit risk estimations in Section 7), these requirements are easily met. For a more detailed discussion of IS with normally distributed risk factors and its applications in finance, see Glasserman (2004) and the references therein.
The density $p(x)$ of a d-dimensional multivariate normal random variable $x$, with mean vector $\theta$ and covariance matrix $\Sigma$, is given by

$$p(x) = \frac{1}{(2\pi)^{d/2} \det(\Sigma)^{1/2}} \exp\!\left(-\frac{1}{2}(x - \theta)^T \Sigma^{-1} (x - \theta)\right).$$
With normally distributed risk factors, the application of IS is relatively straightforward. Impressive results can be obtained by choosing the IS density, $g$, as a multivariate normal distribution with mean vector $\hat{\theta}$ and covariance matrix $\Sigma$, i.e. by simply changing the mean of the original distribution; see Glasserman et al. (1999).
This choice of $g$ clearly satisfies requirement 2) above, but it also satisfies requirement 3), since the density ratio is simply

$$\frac{p(x)}{g(x)} = \frac{c \exp\!\left(-\frac{1}{2}(x - \theta)^T \Sigma^{-1} (x - \theta)\right)}{c \exp\!\left(-\frac{1}{2}(x - \hat{\theta})^T \Sigma^{-1} (x - \hat{\theta})\right)} = \exp\!\left(\left(\tfrac{1}{2}(\hat{\theta} + \theta) - x\right)^{T} \Sigma^{-1} (\hat{\theta} - \theta)\right), \qquad (10.3)$$

where $c = \frac{1}{(2\pi)^{d/2} \det(\Sigma)^{1/2}}$. As demonstrated in Section 6.3, an appropriate choice of $\hat{\theta}$ effectively reduces the variance of the IS estimator in comparison with the plain MC estimate, thus also satisfying the most important requirement 1).
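As a one-dimensional illustration of this mean-shifting scheme (a sketch with made-up parameters, not the chapter's portfolio model), consider estimating the tail probability P(x > 3) for x ~ N(0, 1); with a zero original mean, unit variance and shift s, the likelihood ratio reduces to exp(s²/2 − sx):

```python
import math
import random

def tail_prob_is(z, shift, n=50_000, seed=1):
    """Estimate P(x > z), x ~ N(0,1), by drawing from N(shift, 1) and
    reweighting each point by p(x)/g(x) = exp(shift**2 / 2 - shift * x),
    the one-dimensional case of the density ratio above with theta = 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        if x > z:
            total += math.exp(0.5 * shift ** 2 - shift * x)
    return total / n

p_plain = tail_prob_is(3.0, 0.0)   # shift = 0 recovers plain MC (weight = 1)
p_is = tail_prob_is(3.0, 3.0)      # sampling centred on the rare region
# both are unbiased estimates of P(x > 3) ~ 0.00135; the IS one is far tighter
```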
6.2 Quasi-Monte Carlo methods
QMC methods can be seen as a deterministic counterpart to the MC
method. They are deterministic methods designed to produce point sets
that cover the d-dimensional unit hypercube as uniformly as possible, see
Niederreiter (1992). By suitable transformations, QMC methods can be
used to approximate many other probability distributions as well. They are
just as easy to use as MC, but they often result in faster convergence of the
approximations, thus reducing the computational burden of simulation
algorithms. For a more thorough treatment of the topic the reader is
referred to Niederreiter 1992 and Glasserman 2004.
It is well known that if the function $h(x)$ is square integrable then the standard error of the MC sample average approximation $\hat{\mu}$ is of order $1/\sqrt{N}$. This means that cutting the approximation error in half requires increasing the number of points by a factor of four. In QMC the convergence rate is $\ln(N)^{d-1}/N$, which is asymptotically of order $1/N$, which is
much better than MC. However, this asymptotic convergence rate is practically never useful, since even for a very moderate dimension $d = 4$, $N \ge 2.4 \times 10^7$ is needed for QMC to be theoretically superior to MC. Fortunately, in
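The quoted threshold can be reproduced by finding where $\ln(N)^{d-1}/N$ first drops below $1/\sqrt{N}$, i.e. where $\ln(N)^3 < \sqrt{N}$ for $d = 4$ (a rough grid scan, starting beyond the small-N region where the inequality trivially holds):

```python
import math

d = 4
n = 100
# scan a 1% geometric grid for the first N with ln(N)**(d-1) < sqrt(N)
while math.log(n) ** (d - 1) >= math.sqrt(n):
    n = int(n * 1.01) + 1
# n lands around 2.4e7, matching the threshold quoted in the text
```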
many applications, and especially in the field of financial engineering, QMC
methods produce superior results over MC and the reason for this lies in the
fact that, even though the absolute dimension of the considered problems
could be very high, the effective dimension, i.e. the number of dimensions
that account for the most of the variation in the value of the integrand, is
often quite low. For a formal definition of the term effective dimension
see e.g. Caflisch et al. (1997) and L’Ecuyer (2004). Therefore, obtaining
good approximations for these important dimensions, e.g. using QMC, can
significantly improve the accuracy of the resulting estimates. These effects
will be demonstrated in the next section.
The uniformity properties of QMC point sets deteriorate as a function of
the dimension. In high-dimensional integration the fact that the QMC point
sets are most uniformly distributed in low dimensions can be further utilized:
1) by approximating the first few (1–10) dimensions with QMC and the
remaining dimensions with MC; and
2) by transforming the function h so that its expected value and variance
remain unchanged in the MC setting, but its effective dimension (in
some sense) is reduced so that the first few dimensions account for the
most variability of h.
Detailed descriptions of how to implement 2) in case the risk factors are normally distributed, as is done here, are given in L'Ecuyer (2004), and a procedure based on principal component analysis (PCA) (Acworth et al. 1997) is also outlined and applied in the following.
The fact that QMC point sets are completely deterministic makes error
estimation very difficult, compared to MC. Fortunately, this problem can be
rectified by using randomized QMC (RQMC) methods. To enable practical
error estimation for QMC a number of randomization techniques have been
proposed in the literature; see L’Ecuyer and Lemieux (2002) for an excellent
survey. An easy way of randomizing any QMC point set, suggested by
Cranley and Patterson (1976), is to shift it randomly, modulo 1, with
respect to all of the coordinates. After the randomization each individual
sample point is uniformly distributed over the sample space, but the point
set as a whole still preserves its regular structure. Randomizing QMC point
sets allows one to view them as variance reduction techniques which often
produce significant variance reductions with respect to MC in empirical
applications, see e.g. L’Ecuyer (2004).
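The Cranley-Patterson randomization is only a few lines of code. A sketch with an arbitrary toy point set (the function name is illustrative):

```python
import random

def cranley_patterson_shift(points, seed=None):
    """Randomize a QMC point set in [0,1)^d: add a single uniform draw to
    every point, coordinate by coordinate, modulo 1 (Cranley and
    Patterson 1976). Pairwise structure is preserved, while each shifted
    point is uniformly distributed over [0,1)^d."""
    d = len(points[0])
    rng = random.Random(seed)
    shift = [rng.random() for _ in range(d)]
    return [[(u + s) % 1.0 for u, s in zip(p, shift)] for p in points]

# toy 2-d rank-1 lattice with 8 points
pts = [[i / 8.0, (5 * i % 8) / 8.0] for i in range(8)]
shifted = cranley_patterson_shift(pts, seed=42)
# coordinate differences between points are unchanged modulo 1
```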
The best-suited combination of the described variance reduction techniques for Credit VaR calculations has to be further specified based on
empirical findings.
6.3 Empirical results on variance reduction
This section studies the empirical performance of the different variance
reduction techniques by estimating portfolio level ES figures caused by
credit events, i.e. defaults, using the following simplified assumptions. The portfolio contains $d = 100$ issuers distributed equally across the four rating categories AAA, AA, A and BBB. The issuer PDs are all equal within the rating classes and the applied rating-class-specific default probabilities are presented in Table 10.1. For simplicity, the asset price correlation between every pair of issuers is assumed to be either 0.24 or 0.5. The value of
the portfolio is arbitrarily chosen to be 1000 and it is invested evenly across
all the issuers. The recovery ratio is assumed to be 40 per cent of the
notional amount invested in each issuer. The Credit VaR will be estimated
over an annual horizon at a confidence level of 99 per cent.
The general simulation algorithm can be described as follows:
1. Draw a point set of uniformly distributed random numbers $U_N = \{u_1, \ldots, u_N\} \subset [0, 1)^d$.
2. Decompose the issuer asset correlation matrix as $\Sigma = CC^T$, for some matrix $C$.
For $i = 1$ to $N$:
3. Transform $u_i$, component by component, to a normally distributed random variable $x_i$ through inversion, $x_i^j = \Phi^{-1}(u_i^j)$, where $\Phi^{-1}$ denotes the inverse of the cumulative standard normal distribution and $i = 1, \ldots, N$, $j = 1, \ldots, d$.
4. Set $\tilde{x}_i = C x_i + \hat{\theta}$, where $\hat{\theta}$ is the mean vector of the shifted density $g$. Now $\tilde{x} \sim N(\hat{\theta}, \Sigma)$.
5. Identify defaulted issuers, $Y_j = 1\{\tilde{x}_{j,i} > z_j\}$, $j = 1, \ldots, d$.
6. Calculate the portfolio loss $h(\nu, \tilde{x}_i) = c_1 Y_1 + \cdots + c_d Y_d$.
End
Find the estimate

$$\widehat{\mathrm{ES}}_\alpha(\nu) = \min_{m \in \mathbb{R}} \; m + (1 - \alpha)^{-1} \frac{1}{N} \sum_{i=1}^{N} e^{\left(\frac{1}{2}\hat{\theta} - \tilde{x}_i\right)^T \Sigma^{-1} \hat{\theta}}\, [h(\nu, \tilde{x}_i) - m]^+.$$
In step 1) the point set can be simply an MC sample or a sample generated
through an arbitrary RQMC method. In step 2) the most common choice for
C is the Cholesky factorization which takes C to be a lower triangular matrix.
Another possibility is to select $C$ based on a standard principal component analysis (PCA), which concentrates the variance, as much as possible, in the first coordinates of $x$, with the aim of reducing the effective dimension of the problem. This choice yields $C = QD^{1/2}$, where $D$ is a diagonal matrix containing the eigenvalues of $\Sigma$ in decreasing order, and $Q$ is an orthogonal matrix whose columns are the corresponding unit-length eigenvectors.
Even though this technique completely ignores the function h whose
expectation we are trying to estimate, it has proven empirically to perform
well in combination with (R)QMC techniques; see Acworth et al. (1997),
Moskowitz and Caflisch (1996) and L'Ecuyer (2004). In step 5) $1\{\cdot\}$ is an indicator function, in 6) $c_i$ denotes the loss given default of issuer $i$ in monetary terms, and the final expression for the ES is obtained simply by combining (10.2) and (10.3) with $\theta = 0$.
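Steps 1-6 can be sketched end to end for a small equicorrelated toy portfolio (plain MC points, a hand-rolled Cholesky factor, and no mean shift so that all IS weights equal one; all names and parameter values are illustrative, not the chapter's calibration):

```python
import math
import random
from statistics import NormalDist

def cholesky(a):
    """Lower-triangular C with a = C C^T (step 2)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(c[i][k] * c[j][k] for k in range(j))
            c[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / c[j][j]
    return c

def simulate_losses(pds, rho, exposures, recovery, n_sims, seed=0):
    """Steps 1-6 for an equicorrelated portfolio with plain MC points."""
    d = len(exposures)
    nd = NormalDist()
    thresholds = [nd.inv_cdf(1.0 - p) for p in pds]   # default iff x_j > z_j
    corr = [[1.0 if i == j else rho for j in range(d)] for i in range(d)]
    c = cholesky(corr)                                # step 2
    rng = random.Random(seed)
    losses = []
    for _ in range(n_sims):
        u = [rng.random() for _ in range(d)]          # step 1: MC point in [0,1)^d
        x = [nd.inv_cdf(v) for v in u]                # step 3: inversion
        xt = [sum(c[j][k] * x[k] for k in range(j + 1)) for j in range(d)]  # step 4
        loss = sum(e * (1.0 - recovery)               # steps 5-6
                   for e, xj, zj in zip(exposures, xt, thresholds) if xj > zj)
        losses.append(loss)
    return losses

losses = simulate_losses(pds=[0.01] * 4, rho=0.24,
                         exposures=[10.0] * 4, recovery=0.4, n_sims=2000)
```

The resulting loss sample can then be fed into the ES minimization of (10.2), with unit weights here since no mean shift is applied.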
The simulation experiments are undertaken with a sample size of
$N = 5000$ and the simulation trials are repeated 100 times to enable the
computation of error estimates for RQMC methods. The numerical results
presented below were obtained using a randomized Sobol sequence,16 but
comparable results were also achieved with e.g. Korobov lattice rules. The
accuracy of the different variance reduction techniques is compared by
reporting variance reduction factors with respect to plain MC sampling for
all the considered methods. This factor is computed as the ratio of the
estimator variance with the plain MC method and the variance achieved
with an alternative simulation method.
Figure 10.5 illustrates the variance reduction factors achieved with the
importance sampling approach described in Section 6.1, where the mean
$\theta = 0$ of the d-dimensional standard normally distributed variable $x$ is shifted to $\hat{\theta}$. The results show that the simple importance sampling scheme
can reduce the estimator variance by a factor of 40 and 200 for an asset
correlation of 0.24 and 0.50, respectively. Figure 10.5 also illustrates,
reassuringly, that the variance reduction factors are fairly robust with
respect to the size of the mean shift, but shifting the mean too aggressively
can also substantially increase the estimator variance. The highest variance
reductions are obtained with mean shifts equal to 1.3 and 1.8 in the case of the 0.24 and 0.5 asset correlation assumptions, respectively.
As indicated above, additional variance reduction may be achieved by
combining importance sampling with RQMC and dimension reduction
techniques. Tables 10.3 and 10.4 report the results of such experiments
16
See Sobol (1967).
Figure 10.5 Variance reduction factors for varying values of $\hat{\theta}$ (mean shift, 0 to 3) and asset correlations ($\rho = 0.24$, left-hand scale; $\rho = 0.50$, right-hand scale).
under the asset correlation assumptions of 0.24 and 0.5, respectively. In addition to MC, three different combinations of variance reduction techniques are considered, namely Monte Carlo with importance sampling (MC+IS), a combination of MC and a randomized Sobol sequence with IS (SOB+IS), where the first five dimensions are approximated with Sobol point sets and the remaining ones with MC,17 and finally SOB+IS combined with the PCA dimension reduction technique to pack the variance as much as possible into the first dimensions, which are hopefully well approximated by the Sobol points (SOB+IS+PCA).
The results in Table 10.3 and Table 10.4 show that all the applied techniques produce significant variance reduction factors (VRF) with respect to MC, and the VRFs grow substantially as the confidence level increases from 99 per cent to 99.9 per cent. In all experiments SOB+IS+PCA produces the highest VRFs, and the effectiveness of the PCA decomposition increases with the asset correlation, which reduces the effective dimension of the problem as the asset prices tend to fluctuate more closely together.
The conducted experiments indicate that, among the considered variance reduction techniques, the implementation based on SOB+IS+PCA produces the highest VRFs and therefore this is the simulation approach chosen to derive residual risk estimates for the Eurosystem's credit operations.
17
Empirical tests indicated that extending the application of Sobol point sets to dimensions higher than five generally had a detrimental effect on the accuracy of the results. Tests also showed that applying antithetic variates instead of plain MC does not improve the results further. Therefore, plain MC is used for the other dimensions.
Table 10.3 Comparison of various variance reduction techniques with 0.24 asset correlation

                  ES99%    σ²99%    VRF99%   ES99.9%   σ²99.9%   VRF99.9%
MC                16.629   1.809    1        33.960    41.323    1
MC+IS             16.612   0.045    40       34.300    0.239     173
SOB+IS            16.545   0.038    47       34.157    0.204     202
SOB+IS+PCA        16.575   0.035    51       34.198    0.120     343
Table 10.4 Comparison of various variance reduction techniques with 0.5 asset correlation

                  ES99%    σ²99%    VRF99%   ES99.9%   σ²99.9%   VRF99.9%
MC                29.049   21.539   1        83.544    586.523   1
MC+IS             29.368   0.106    202      85.915    0.816     719
SOB+IS            29.418   0.075    286      85.935    0.381     1540
SOB+IS+PCA        29.409   0.051    422      85.911    0.184     3186
7. Residual risk estimation for the Eurosystem’s credit operations
This section presents the results of the residual risk estimations for the
Eurosystem’s credit operations. The most important data source used for
these risk estimations is a snapshot of disaggregated data on submitted collateral that was taken in November 2006. This data contains information
on the amount of specific assets submitted by each single counterparty as
collateral to the Eurosystem. In total, Eurosystem counterparties submitted
collateral of around EUR 928 billion to the Eurosystem.
For technical reasons, the dimension of the problem needs to be reduced
without impacting the risk calculations. The total collateral amount is
spread over more than 18,000 different counterparty–issuer pairs. To reduce
the dimension of the problem, only those pairs are considered where the
submitted collateral amount is at least EUR 100 million. As a consequence,
the number of issuers is reduced to 445 and the number of counterparties is
reduced to 247. With this approach, only 64 per cent of the total collateral
submitted is taken into account. Therefore, after the risk calculations, the
resulting risks need to be scaled up accordingly.