5.1 Correlation, Co-Integration and Multi-Factor Models




with orthogonal increments dz1 and dz2. The coefficient of linear correlation between x1 and x2, ρ12, is

ρ12 = E[dx1 dx2] / √(E[dx1²] E[dx2²]) = σ11σ21 / √((σ11σ21)² + (σ11σ22)²)

For simplicity set the drift terms to zero, and consider the new variable y = x1 + λx2:

dy = σ11 dz1 + λ(σ21 dz1 + σ22 dz2) = (σ11 + λσ21) dz1 + λσ22 dz2

In general the variable y will be normally distributed, with zero mean and instantaneous variance var(dy) = [(σ11 + λσ21)² + (λσ22)²] dt. After a finite time t, its distribution will be

y ∼ N(0, [(σ11 + λσ21)² + (λσ22)²] t)


Now choose λ such that the coefficient of dz1 in the increment dy is zero:

σ11 + λσ21 = 0  ⟹  λ = −σ11/σ21

It is easy to see that this value of λ gives the lowest possible variance (dispersion) to the variable y:

y ∼ N(0, λ²σ22² t)

If λ were equal to −1 the variable y would simply give the spread between x1 and x2, and in this case

y ∼ N(0, [(σ11 − σ21)² + σ22²] t),   λ = −1    (5.7)


Equation (5.7) tells us that, no matter how strong the correlation between the two variables

might be, as long as it is not 1, the variance of their spread, or, for that matter, of any

linear combination of them, will grow indefinitely over time. Therefore, a linear

correlation coefficient does not provide a mechanism capable of producing long-term

‘cohesion’ between diffusive state variables.
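The linear growth of the spread variance is easy to verify numerically. The sketch below (my own illustration, not from the text; the path count, step size and the value ρ = 0.95 are arbitrary choices) simulates two driftless diffusions with equal volatility σ and a deliberately high instantaneous correlation, and compares the sample variance of their spread with the closed-form value 2σ²(1 − ρ)t:

```python
import numpy as np

# Two driftless diffusions driven by correlated Brownian shocks; the variance
# of their spread grows linearly in time, however high the correlation,
# as long as it is below 1.
rng = np.random.default_rng(42)
n_paths, n_steps, dt = 10_000, 400, 0.01  # total horizon T = 4
sigma, rho = 0.20, 0.95                   # equal volatilities, high correlation

dz1 = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
dz2 = rho * dz1 + np.sqrt(1 - rho**2) * rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)

x1 = sigma * np.cumsum(dz1, axis=1)
x2 = sigma * np.cumsum(dz2, axis=1)
spread = x1 - x2

t = dt * np.arange(1, n_steps + 1)
theory = 2 * sigma**2 * (1 - rho) * t   # Var[x1(t) - x2(t)] = 2 sigma^2 (1 - rho) t
sample = spread.var(axis=0)
print(sample[-1], theory[-1])           # close to each other, and growing with t
```

Even at ρ = 0.95 the spread variance keeps increasing without bound; only ρ = 1 (with equal volatilities) would pin the two paths together.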

Sometimes this is perfectly appropriate. At other times, however, we might believe

that, as two stochastic variables move away from each other, there should be ‘physical’

(financial) mechanisms capable of pulling them together. This might be true, for instance,

for yields or forward rates. Conditional on our knowing that, say, the 9.5-year yield in 10

years’ time is at 5.00%, we would expect the 10-year yield to be ‘not too far apart’, say,

somewhere between 4.50% and 5.50%. In order to achieve this long-term effect by means

of a correlation coefficient we might be forced to impose too strong a correlation between

the two yields for the short-term dynamics between the two variables to be correct. Or,

conversely, a correlation coefficient calibrated to the short-term changes in the two yields



is likely to predict a variance for the difference between the two yields considerably

higher than what we might consider reasonable.

In order to describe a long-term link between two variables (or, indeed, a collection of

variables) we require a different concept, namely co-integration. In general, co-integration

occurs when two time series are each integrated of order b, but some linear combination

of them is integrated of order a < b. Typically, for finance applications, b = 1 and

a = 0. (For a discussion of co-integration see Alexander (2001) for a simple introduction,

or Hamilton (1994) for a more thorough treatment.) In a diffusive context, this means

that the spread(s) between the co-integrated variables is of the mean-reverting type. More

generally, Granger (1986) and Engle and Granger (1987) have shown that if a set of

variables is co-integrated an error-correction model, i.e. a process capable of pulling

them together, must be present among them.
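As a toy illustration (my own, not taken from the references cited above; the symmetric form of the pull-back term and all parameter values are assumptions), an error-correction model can be built by adding to two otherwise independent random walks a drift proportional to their spread. Each series on its own is integrated of order 1, but the spread mean-reverts and its variance stabilizes instead of growing linearly in time:

```python
import numpy as np

# Error-correction sketch: the drift of each variable pulls it towards the
# other, so the spread s = x1 - x2 follows an Ornstein-Uhlenbeck process
# ds = -kappa * s dt + sqrt(2) * sigma dW with stationary variance sigma^2 / kappa.
rng = np.random.default_rng(0)
n_steps, dt, sigma, kappa = 50_000, 0.01, 0.20, 2.0

x1 = np.zeros(n_steps)
x2 = np.zeros(n_steps)
for i in range(1, n_steps):
    spread = x1[i - 1] - x2[i - 1]
    x1[i] = x1[i - 1] - 0.5 * kappa * spread * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    x2[i] = x2[i - 1] + 0.5 * kappa * spread * dt + sigma * np.sqrt(dt) * rng.standard_normal()

# The spread variance settles near sigma^2 / kappa; each series still wanders freely
print(np.var(x1 - x2), sigma**2 / kappa)
```

Setting κ = 0 removes the error-correction term and recovers the purely diffusive case, in which the spread variance grows without bound.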

Why is this relevant in the context of our discussion? Let us lay down the ‘facts’ in sequence:

1. Several studies suggest that forward rates of a given yield curve should be co-integrated. See, for example, Alexander (2001). Let us accept this as a fact.

2. In the real world, if these forward rates are co-integrated, they will not disperse ‘too

much’ relative to each other, even after very long time periods.

3. Just positing a correlation structure among the forward rates is not capable of providing the necessary long-term cohesion mechanism among forward rates in a manner

consistent with their short-term dynamics.

4. In a diffusive setting, in order to describe simultaneously the short-term correlation and the long-term co-integration among forward rates one must introduce error-correction (mean-reverting) terms.

These statements are all true in the real world. They would seem to suggest that, even

if individually the forward rates followed exactly a diffusive process, in order to price a

long-dated option it would be inadequate to describe the nature of their co-dependence

simply by means of a correlation matrix. This statement however is not necessarily correct.

Even if the underlying forward rates are indeed co-integrated, and even if a correlated

diffusion disperses them far too much when compared with the real world, this does not

matter for option pricing because, in the diffusive setting, the error-correction (mean-reverting) term appears in the drift of the state variables. As we have seen in Chapter 4, the real-world drift does not matter (in a perfect Black-and-Scholes world) for option pricing.

This important caveat is the exact counterpart of the statement proved in Section 4.7 that quadratic variation, and not variance, is what matters for single-asset Black-and-Scholes option pricing. In a single-asset case, what matters are the local ‘vibrational’ properties of the underlying, not how much it will disperse after a finite period of time. Similarly, in a multi-factor setting, it is guessing the relationship between the local vibrations of the underlying assets relative to each other that will allow the trader to set up a successful hedging strategy, not estimating their long-term relative dispersion. In a Black-and-Scholes world, option traders do not engage in actuarial pricing (for which variance and long-term relative dispersion do matter), but do engage in local riskless replication.

1 See also the discussion in Sections 18.2 and 18.3.



Therefore in what follows I will focus on the correlation matrix as the only mechanism

necessary to describe the link between a set of underlying variables, even if we know

that in the real world this might not be appropriate. In particular, we will consider time-dependent volatilities and an imperfect correlation as the only mechanisms relevant for

derivatives pricing to produce changes in the shape of a given yield curve. This topic is

addressed in the next section.

5.1.1 The Multi-Factor Debate

Historically, the early response to the need to create mechanisms capable of producing changes in the shape of the yield curve was to invoke multi-factor models. Taking

inspiration from the results of principal component analysis (PCA), the early (one-factor)

models were seen as capable of moving the first eigenvector (the level of the curve), but

ineffectual in achieving a change in slope, curvature, etc. At least two or three independent

modes of deformation were therefore seen as necessary to produce the required degree of

shape change in the yield curve.

The PCA eigenmodes are orthogonal by construction. When the ‘reference axes’ of

the eigenvectors are translated from the principal components back to the forward rates,

however, the latter become imperfectly instantaneously correlated. Therefore the need

to produce a change in shape in the yield curve made early modellers believe that the

introduction of several, imperfectly correlated, Brownian shocks would be the solution to

the problem. Furthermore, Rebonato and Cooper (1995) showed that, in order to model

a financially convincing instantaneous correlation matrix, a surprisingly large number of

factors was required. Therefore, the reasoning went, to produce what we want (changes

in the shape of the yield curve) we require a sufficiently rich and realistic instantaneous

correlation matrix, and this, in turn, requires many Brownian factors. In the early-to-mid-1990s high-dimension yield-curve models became, in the eyes of traders and researchers,

the answer and the panacea to the pricing problems of the day.

To some extent this is correct. Let us not lose sight, however, of what we are trying

to achieve, and let us not confuse our goals with (some of) the means to achieve them.

The logical structure of the problem is schematically as follows.

• The introduction of complex derivatives payoffs highlights the need for models that

allow the yield curves to change shape. This is a true statement, and a model that

allows changes in shape in the yield curve is our goal.

• If one works with instantaneously imperfectly correlated forward rates, the yield

curve will change shape over time. This is also a true statement.

• If we want to recover a financially convincing instantaneous correlation structure,

many factors are needed. This, too, is a true statement.

• Therefore the imperfect correlation among rates created by a many-factor model is

what we require in order to obtain our goal. This does not necessarily follow.

The last step, in fact, implies that an imperfect degree of instantaneous correlation

is the only mechanism capable of producing significant relative moves among rates.

But this is not true. Imposing an instantaneous correlation coefficient less than one can

certainly produce some degree of independent movement in a set of variables that follows



a diffusion. But this is neither the only, nor, very often, the most financially desirable

tool by means of which a change in the shape of the yield curve can be achieved. What

else can produce our goal (the change in shape in the yield curve)? I show below that

introducing time-dependent volatilities can constitute a powerful, and often financially

more desirable, alternative mechanism in order to produce the same effect.

If this is the case, in the absence of independent financial information it is not clear

which mechanism one should prefer. Indeed, the debate about the number of factors

‘really’ needed for yield-curve modelling was still raging as late as 2002: Longstaff et al.

(2000a) argue that a billion dollars are being thrown away in the swaption market by

using low-dimensionality models; Andersen and Andreasen (2001) rebut that even a one-factor model, as long as it is well implemented, is perfectly adequate; Joshi and Theis (2002)

provide an alternative perspective on the topic. Without entering this debate (I express

my views on the matter in Rebonato (2002)), in the context of the present discussion one

can say that the relative effectiveness and the financial realism of the two mechanisms

that can give rise to changes in the shape of the yield curve must be carefully weighed

in each individual application.

It therefore appears that our goal (i.e. the ability to produce a change in the shape

of the yield curve) can be produced by different combinations of the two mechanisms

mentioned so far. If this is the case, perhaps different ‘amounts’ of de-correlation and of

time variation of the volatility can produce similar effects, in so far as changes in the shape

of the yield curve are concerned. Is there a quantitative measure of how successful we

have been in creating the desired effect, whatever means we have employed? Surely, this

indicator cannot be the coefficient of instantaneous correlation, because, as stated above

and proved later in this chapter, a substantial de-coupling between different portions of

the yield curve can occur even when we use a one-factor model. The answer is positive,

and the quantity that takes into account the combined effects of time-dependent volatilities

and of imperfect correlations is the terminal correlation, which I define and discuss in

Section 5.3. In the following sections I will therefore tackle the following topics:

• I will draw a distinction between instantaneous and terminal correlation.

• I will show that it is the terminal and not the instantaneous correlation that matters

for derivatives pricing.

• I will then proceed to show how a non-constant instantaneous volatility can give rise

to substantial terminal de-correlation among the underlying variables even when the

instantaneous correlation is very high (or, indeed, perfect).

• Having explained why we need a judicious combination of time-dependent volatilities and imperfect instantaneous correlation, I will discuss how a naïve Monte

Carlo simulation can be carried out in this setting.

• Finally, I will show how a conceptually more satisfactory and computationally more efficient Monte Carlo valuation can be carried out in this time-dependent setting.

The last point has an intrinsic (i.e. computational) interest. It is even more important,

however, for its conceptual relevance. Since this book deals only in passing with numerical

techniques, the main reason for discussing it in this chapter is that it shows with great



clarity how terminal correlation enters the pricing of derivatives products, and how the

role of instantaneous correlation is, in a way, secondary. Again, instantaneous correlation

is just one of the mechanisms available to produce what we really want (namely, terminal correlation).


5.2 The Stochastic Evolution of Imperfectly Correlated Variables

In order to get a qualitative feel for the problem, let us begin by considering the evolution of two log-normally distributed quantities (rates or prices), that we shall denote by x1 and x2, respectively:

dx1/x1 = µ1 dt + σ1(t) dw1    (5.8)

dx2/x2 = µ2 dt + σ2(t) dw2    (5.9)




The two Brownian increments, dw1 and dw2, are assumed to be imperfectly correlated, and we will therefore write

E[dw1 dw2] = ρ dt    (5.10)


Note that we have explicitly allowed for the possibility of time dependence in the two

volatilities. Also, we have appended an index (1 or 2) to the volatility symbol to emphasize

that by σ1 (t) and σ2 (t) we denote the volatility of different prices or rates at the same

time, t (and not the volatility of the same spot process at different times). Let us then

choose a final time horizon, T , and let us impose that the unconditional variance of each

variable over this time horizon should be exactly the same:




∫₀ᵀ σ1(u)² du = ∫₀ᵀ σ2(u)² du = σ²T    (5.11)



For the sake of simplicity, let us finally assume that both variables start at time zero from

the same value:

x1 (0) = x2 (0) = x0


Let us now run two Monte Carlo simulations of the stock price processes out to time

T . For the first simulation, we will use constant volatilities for σ1 and σ2 (and therefore,

given requirement (5.11), the volatility is the same for x1 and x2 and equal to σ for both

variables), but a coefficient of instantaneous correlation less than one. When running the

second simulation, we will instead use two time-dependent instantaneous volatilities as in

Equations (5.8) and (5.9) (albeit constrained by Equation (5.11)), but perfect correlation.

To be more specific, for the first variable, x1 , we will assign the instantaneous volatility

σ1(t) = σ0 exp(−νt),   0 ≤ t ≤ T




with σ0 = 20%, ν = 0.64 and T = 4 years. The second variable, x2 , will have an

instantaneous volatility given by

σ2(t) = σ0 exp[−ν(T − t)],   0 ≤ t ≤ T


The reader can check that the variance constraint, Equation (5.11), is indeed satisfied.
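The check can also be done numerically; the sketch below (my own midpoint-rule discretization) evaluates both integrals in Equation (5.11) together with the closed form σ0²(1 − e^(−2νT))/(2ν):

```python
import numpy as np

sigma0, nu, T = 0.20, 0.64, 4.0              # parameter values from the text

n = 1_000_000
t = (np.arange(n) + 0.5) * (T / n)           # midpoint grid on [0, T]
var1 = np.sum((sigma0 * np.exp(-nu * t)) ** 2) * (T / n)
var2 = np.sum((sigma0 * np.exp(-nu * (T - t))) ** 2) * (T / n)
closed_form = sigma0**2 * (1 - np.exp(-2 * nu * T)) / (2 * nu)

print(var1, var2, closed_form)               # all three values coincide
```

The equality holds for any ν: the second volatility is the first one run backwards in time, so the two squared-volatility profiles enclose the same area over [0, T].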

Having set up the problem in this manner, we can simulate the joint processes for

x1 and x2 in the two different universes. In the time-dependent case the two variables

were subjected to the same Brownian shocks. In the constant-volatility case two different Brownian shocks, correlated as per Equation (5.10), were allowed to shock the two

variables. After running a large number of simulations, we ignored our knowledge of the

volatility of the two processes and evaluated the correlation between the changes in the

logarithms of the two variables in the two simulations using the expression


ρ1,2 = Σᵢ (ln x1,ᵢ − m1)(ln x2,ᵢ − m2) / √[ Σᵢ (ln x1,ᵢ − m1)² · Σᵢ (ln x2,ᵢ − m2)² ]

where ln xⱼ,ᵢ denotes the i-th change in the logarithm of xⱼ, and mⱼ its sample mean.



The results of these trials are shown in Figures 5.1 and 5.2.

Looking at these two figures, it is not surprising to discover that the same sample

correlation (in this case ρ1,2 = 34.89%) was obtained despite the fact that the two de-correlation-generating mechanisms were very different.
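The experiment is straightforward to reproduce. In the sketch below (the seed, the number of steps and the constant-volatility level implied by Equation (5.11) are my own choices) one path is generated in each universe and the sample correlation of the log-changes is computed; the two numbers come out of comparable size despite the very different generating mechanisms:

```python
import numpy as np

rng = np.random.default_rng(7)
sigma0, nu, T, n_steps = 0.20, 0.64, 4.0, 1000
dt = T / n_steps
t = np.arange(n_steps) * dt

# Universe 1: perfectly correlated shocks, strongly time-dependent volatilities
sig1 = sigma0 * np.exp(-nu * t)
sig2 = sigma0 * np.exp(-nu * (T - t))
dz = rng.standard_normal(n_steps) * np.sqrt(dt)
corr_a = np.corrcoef(sig1 * dz, sig2 * dz)[0, 1]

# Universe 2: constant volatility (chosen to satisfy Equation (5.11)),
# instantaneous correlation below one
rho = 0.35
sigma_bar = np.sqrt(sigma0**2 * (1 - np.exp(-2 * nu * T)) / (2 * nu * T))
dw1 = rng.standard_normal(n_steps) * np.sqrt(dt)
dw2 = rho * dw1 + np.sqrt(1 - rho**2) * rng.standard_normal(n_steps) * np.sqrt(dt)
corr_b = np.corrcoef(sigma_bar * dw1, sigma_bar * dw2)[0, 1]

print(corr_a, corr_b)  # sample correlations of comparable size
```

Note that in the first universe the instantaneous correlation is exactly one, yet the sample correlation of the log-changes is far below one, purely because of the time dependence of the volatilities.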

For clarity of exposition, in this stylized case a very strongly time-varying volatility

was assigned to the two variables. It is therefore easy to tell which figure is produced by which mechanism: note how in Figure 5.1 the changes for Series 1 are large























Figure 5.1 Changes in the variables x1 and x2 . The two variables were subjected to the same

random shocks (instantaneous correlation = 1). The first variable (Series 1) had an instantaneous

volatility given by σ1 (t) = σ0 exp(−νt), 0 ≤ t ≤ T , with σ0 = 20%, ν = 0.64 and T = 4 years.

The second variable (Series 2) had an instantaneous volatility given by σ2 (t) = σ0 exp[−ν(T − t)],

0 ≤ t ≤ T . The empirical sample correlation turned out to be 34.89%.






















Figure 5.2 Changes in the variables x1 and x2 . The two variables were subjected to different

random shocks (instantaneous correlation = 35.00%). Both variables had the same constant instantaneous volatility of σ0 = 20%. The empirical sample correlation turned out to be 34.89%.
























Figure 5.3 Can the reader guess, before looking below, whether this realization was obtained

with constant volatility and a correlation of 85%, or with a correlation of 90% and a decay constant

ν of 0.2? The sample correlation turned out to be 85%.

at the beginning and small at the end (and vice versa for Series 2), while they have

roughly the same magnitude for the two series in Figure 5.2. In a more realistic case,

however, where the correlation is high but not perfect and the decay factor ν not as

pronounced, it can become very difficult to distinguish the two cases ‘by inspection’. See

Figure 5.3.



Table 5.1 The data used to produce Figure 5.4. Note the greater decrease in sample correlation produced by the non-constant volatility when the instantaneous correlation is high. Rows are indexed by the instantaneous correlation (from 1 down to 0) and columns by the decay constant ν (from 0.2 to 0.8); the surviving sample-correlation entries, row by row, are: 0.973944, 0.90183; 0.705876, 0.675758; 0.626561, 0.470971, 0.475828; 0.334509, 0.330294, 0.332757; 0.172208, 0.173178; 0.062323, −0.12066, −0.09665 (the remaining entries are missing from the source).

In principle, it is of course possible to analyse the two time series separately beforehand in order to establish the possible existence of time dependence in the volatility

function. Armed with this information, the trader could, again in principle, analyse the

joint dynamics of the two variables, and estimate an instantaneous correlation coefficient. In practice, however, these statistical studies are fraught with difficulties, and,

especially if the instantaneous volatility is mildly time dependent and the correlation relatively high, the task of disentangling the two effects can be extremely difficult. See

Figure 5.3.

Unfortunately, the case of mildly-varying instantaneous volatilities and of relatively

high instantaneous correlations is the norm rather than the exception when one deals with

the dynamics of forward rates belonging to the same yield curve. The combined effects

of the two de-correlating mechanisms are priced in the relative implied volatilities of caps

and swaptions (see the discussion in Section 10.3), and even relatively ‘stressed’ but still

realistic assumptions for the correlation and volatility produce rather fine differences in

the relative prices (of the order of one to three percentage points – vegas – in implied volatility).


In order to study the relative importance of the two possible mechanisms to produce de-correlation, Table 5.1 and Figure 5.4 show the sample correlation between log-changes in

the two time series obtained by running many times the simulation experiment described

in the captions to Figures 5.1 and 5.2, with the volatility decay constant (ν) and the instantaneous correlation shown in the table. More precisely, the first row displays the sample

correlation obtained for a series of simulations conducted using perfect instantaneous correlation and more and more strongly time-dependent volatilities (decay constants ν of 0.2,

0.4, 0.6 and 0.8); the second row displays the sample correlation obtained with the same

time-dependent volatilities and an instantaneous correlation of 0.8; and so on.
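A sweep of this kind can be sketched as follows (path length and seed are my own choices, so the individual entries will differ from Table 5.1, but they should reproduce the qualitative pattern that the time dependence of the volatility matters most when the instantaneous correlation is high):

```python
import numpy as np

# Sample correlation of log-changes for pairs (instantaneous correlation rho,
# decay constant nu), with the two exponential volatilities used in the text.
rng = np.random.default_rng(1)
sigma0, T, n_steps = 0.20, 4.0, 5000
dt = T / n_steps
t = np.arange(n_steps) * dt

def sample_corr(rho, nu):
    sig1 = sigma0 * np.exp(-nu * t)
    sig2 = sigma0 * np.exp(-nu * (T - t))
    dz1 = rng.standard_normal(n_steps) * np.sqrt(dt)
    dz2 = rho * dz1 + np.sqrt(1 - rho**2) * rng.standard_normal(n_steps) * np.sqrt(dt)
    return np.corrcoef(sig1 * dz1, sig2 * dz2)[0, 1]

for rho in (1.0, 0.8, 0.4, 0.0):
    print(rho, [round(sample_corr(rho, nu), 3) for nu in (0.2, 0.4, 0.6, 0.8)])
```

For rho = 1 the entries fall steeply as ν grows, while for rho = 0 they hover around zero whatever ν is, in line with the conclusion drawn from Table 5.1.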

The important conclusion that one can draw from these data is that a non-constant

instantaneous volatility brings about a relatively more pronounced de-correlation when

the instantaneous correlation is high. In particular, when this latter quantity is zero, a non-constant instantaneous volatility does not bring about any further reduction in the sample

correlation (apart from adding some noise). From these observations one can therefore

conclude that the volatility-based de-correlation mechanism should be of greater relevance

in the case of same-currency forward rates than in the case of equities or FX rates.




Figure 5.4 The sample correlation evaluated along the path for a variety of combinations of

the instantaneous correlation (from 1 to 0) and of the decay constant ν (from 0.2 to 0.8). Note

how the decrease in sample correlation introduced by the time dependence of volatility is more

pronounced the higher the instantaneous correlation. Regimes of high instantaneous correlations

are more commonly found in the same-currency interest-rate case, than for equities or FX.

In this section I have tried to give a qualitative feel for the impact on the sample

correlation of a time-dependent instantaneous volatility, and of a less-than-perfect instantaneous correlation. At this stage I have kept the discussion at a very qualitative level. In

particular, it is not obvious at this point why the terminal, rather than the instantaneous,

correlation should be of relevance for option pricing. The discussions in Chapters 2 and

4 about volatilities and variances should make us rather cautious before deciding on the

basis of ‘intuition’ which quantities matter when it comes to pricing an option. The purpose of the next section is therefore to identify in a more precise manner the quantities that

affect the joint stochastic evolution of correlated financial variables, in so far as option

pricing is concerned.

The analysis will be carried out by considering a ‘thought Monte Carlo experiment’,

but the main focus is more on the conceptual part, rather than on the description of a

numerical technique. In order to carry out and analyse these Monte Carlo experiments we

will have to discretize Ito integrals, and to make use of some basic results in stochastic

integration. These topics are therefore briefly introduced, but in a very non-technical

manner. I have reported some results without proof, and provided very sketchy and hand-waving proofs for others. For the reader who would like to study the matter more deeply,

the references provided below can be of assistance. For a clear treatment intermediate

in length between an article and a slim book, I can recommend the course notes by

Shreve (1997). Standard, book-length, references are then Oksendal (1995), Lamberton

and Lapeyre (1991), Neftci (1996) and Baxter and Rennie (1996). If the reader were to

fall in love with stochastic calculus, Karatzas and Shreve (1991) is probably the bible,

but the amount of work required is substantial. Finally, good and simple treatments of

selected topics can be found in Bjork (1998) (e.g. for stochastic integrals) and Pliska

(1997) (e.g. for filtrations).




5.3 The Role of Terminal Correlation in the Joint Evolution of Stochastic Variables

In what follows we will place ourselves in a perfect Black(-and-Scholes) world. In particular, in addition to the usual assumptions about the market structure, we will require

that the spot or forward underlying variables should be log-normally distributed. As a

consequence, we will ignore at this stage the possibility of any smiles in the implied

volatility. Smile effects are discussed in Part II. Our purpose is to identify what quantities

are essential in order to carry out the stochastic part of the evolution of the underlying

variables. We will obtain the fundamental result that, in addition to volatilities, a quantity that we will call ‘terminal correlation’ will play a crucial role. This quantity will be

shown to be in general distinct from the instantaneous correlation; in particular, it can

assume very small values even if the instantaneous correlation is perfect. In this sense,

the treatment to be found in this section formalizes and justifies the discussion in the

previous sections of this chapter, and in Chapter 2. En route to obtaining these results we

will also indicate how efficient Monte Carlo simulations can be carried out.

In order to obtain these results it is important to recall some definitions regarding

stochastic integrals. This is undertaken in the next section.

5.3.1 Defining Stochastic Integrals

Let us consider a stochastic process, σ , and a standard Wiener process, Z, defined over an

interval [a b].2 Since σ can be a stochastic process, and not just a deterministic function

of time, one cannot simply require that it should be square integrable. A more appropriate

condition is that the expectation of its square should be integrable over the interval [a b]:



∫ₐᵇ E[σ(u)²] du < ∞


The second requirement that should be imposed on the process σ is that it should be

adapted to the filtration generated by the Wiener process, Z.3 Our goal is to give a

meaning to the expression



∫ₐᵇ σ(u) dZ(u)


for all functions σ satisfying the two conditions above. The task is accomplished in two steps.


Step 1: Let us divide the interval [a b] into n subintervals, with t0 = a, . . . , tn = b. Given this partition of the interval [a b] we can associate to σ a new function, σ̄, defined

2 The treatment in this sub-section follows closely Bjork (1998).
3 An intuitive definition of adaptedness is the following. Let S and σ be stochastic processes. If the value of σ at time t, σt, can be fully determined by the realization of the process S up to time t, then the process σ is said to be adapted to S. For instance, let S be a stock price process and σ be a volatility function of the form σ(St, t). Then the value of the volatility σ at time t is completely known if the realization of the stock price, S, is known, and the volatility process is said to be adapted to the stock price process.



to be equal to σ on the initial point of each subinterval:

σ̄(t) = σ(t)   for t = t0, t1, . . . , tn−1

and to be piecewise constant over each interval, [tk tk+1], with k = 0, 1, . . . , n − 1. If this is the case, the function σ̄ can be more simply indexed by k, rather than by a continuous time argument, and can be denoted by the symbol σk: σ̄(t) = σk = σ(tk) for t ∈ [tk tk+1], with k = 0, 1, . . . , n − 1.

The elementary Ito integral between a and b, In(a, b), of the function σ̄n(t) is then defined by the expression

In(a, b) = ∫ₐᵇ σ̄n(u) dz(u) = Σₖ₌₀ⁿ⁻¹ σk [Z(tk+1) − Z(tk)]    (5.19)

A comment is in order. In defining non-stochastic (e.g. Riemann) integrals, in the limit

it does not matter whether the point in the interval where the function is evaluated is chosen

to be at the beginning, at the end or anywhere else. When dealing with stochastic integrals,

however, the choice of the evaluation point for the function does make a difference, and

the limiting process leads to different results depending on its location. In order to ensure

that a sound financial (‘physical’) interpretation can be given to the integral, it is important

that the evaluation point should be made to coincide with the left-most point in the interval.

Step 2: Equation (5.19) defines a series of elementary integrals, each one associated with a different index n, i.e. with the number of subintervals [tk tk+1] into which we have subdivided the interval [a b]. If there exists a limit for the sequence of functions In(a, b) as the integer n tends to infinity, then the Ito integral over the interval [a b], I(t; a, b), is defined as

I(t; a, b) ≡ ∫ₐᵇ σ(u) dz(u) = limₙ→∞ In(a, b) = limₙ→∞ ∫ₐᵇ σ̄n(u) dz(u) = limₙ→∞ Σₖ₌₀ⁿ⁻¹ σk [Z(tk+1) − Z(tk)]


As defined, the Ito integral itself can be regarded as a stochastic variable, I. It is therefore

pertinent to ask questions about it such as its expectation or its variance. One can prove

the following:

• The expectation taken at time a of the Ito integral I(t; a, b) is zero:

Ea[ ∫ₐᵇ σ(u) dz(u) ] = 0


• The variance of the Ito integral is linked to the time integral of the expectation of the square of σ by the following relationship:

Vara[ ∫ₐᵇ σ(u) dz(u) ] = Ea[ ( ∫ₐᵇ σ(u) dz(u) )² ] = ∫ₐᵇ Ea[σ(u)²] du
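Both properties can be verified by brute force on the discretized sum. In the sketch below (my own example; the integrand σ(t) = Z(t) is chosen only because it is adapted and its second moment, E[Z(u)²] = u, is known in closed form) the Ito sum evaluates σ at the left endpoint of each subinterval; its sample mean is close to zero and its sample variance is close to ∫₀¹ u du = 1/2:

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, b = 10_000, 250, 1.0       # integrate over [0, 1]
dt = b / n_steps

dZ = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
Z = np.cumsum(dZ, axis=1)
Z_left = np.hstack([np.zeros((n_paths, 1)), Z[:, :-1]])  # Z(t_k): left endpoints

# I_n(0, 1) = sum_k sigma_k [Z(t_{k+1}) - Z(t_k)], with sigma(t) = Z(t)
ito_sum = np.sum(Z_left * dZ, axis=1)

print(ito_sum.mean())   # close to 0
print(ito_sum.var())    # close to the Ito isometry value 1/2
```

Evaluating σ at the right endpoint instead would shift the sample mean away from zero, which is why the left-endpoint convention singled out in Step 1 is the financially meaningful one.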

