3 Merton Problem with Stochastic Volatility: Model Coefficient Polynomial Expansions
Financial Signal Processing and Machine Learning
of the following SDE:
\[
\begin{aligned}
dS_t &= \mu(Y_t)\,S_t\,dt + \sigma(Y_t)\,S_t\,dB^S_t,\\
dY_t &= c(Y_t)\,dt + \beta(Y_t)\,dB^Y_t,\\
d\langle B^S, B^Y\rangle_t &= \rho\,dt,
\end{aligned}
\]
where BS and BY are standard Brownian motions with correlation 𝜌. Let W denote the wealth
process of an investor who holds 𝜋t units worth of currency in S at time t, and has (Wt − 𝜋t )
units of currency in a bond. For simplicity, we assume the risk-free rate of interest is zero. As
such, the wealth process W satisfies
\[
dW_t = \frac{\pi_t}{S_t}\,dS_t = \pi_t\,\mu(Y_t)\,dt + \pi_t\,\sigma(Y_t)\,dB^S_t .
\]
Observe that S does not appear in the dynamics of the wealth process W.
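Because S drops out, a wealth path can be simulated from Y alone. The following is a minimal Euler–Maruyama sketch, not a method taken from the text: the function name, the argument layout, and the restriction to strategies of the form 𝜋(y, 𝑤) are illustrative assumptions, and the coefficient functions are supplied by the caller.

```python
import math
import random

def simulate_wealth(w0, y0, pi, mu, sigma, c, beta, rho, T, n_steps, rng):
    """Euler-Maruyama path of (Y, W) with dW = pi*mu(Y) dt + pi*sigma(Y) dB^S.

    pi(y, w) is a hypothetical Markov strategy (time argument omitted for brevity);
    mu, sigma, c, beta are user-supplied functions of y.
    """
    dt = T / n_steps
    y, w = y0, w0
    for _ in range(n_steps):
        # Build Brownian increments with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        dBS = math.sqrt(dt) * z1
        dBY = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        hold = pi(y, w)  # currency held in S over this step
        w += hold * mu(y) * dt + hold * sigma(y) * dBS
        y += c(y) * dt + beta(y) * dBY
    return y, w

# Illustrative (assumed) model: constant drift, volatility exp(y), mean-reverting Y.
rng = random.Random(42)
y_T, w_T = simulate_wealth(
    w0=1.0, y0=math.log(0.2), pi=lambda y, w: 0.5 * w,
    mu=lambda y: 0.05, sigma=lambda y: math.exp(y),
    c=lambda y: math.log(0.2) - y, beta=lambda y: 0.3,
    rho=-0.5, T=1.0, n_steps=250, rng=rng)
```

The scheme does not enforce the admissibility constraint Wt ≥ 0; a discretized path can go negative even when the continuous-time strategy would not.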
An investor chooses 𝜋t to maximize his expected utility of wealth at a time T in the future,
where utility is measured by a smooth, increasing, and strictly concave function U ∶ ℝ+ → ℝ,
and the objective to maximize is 𝔼 U(WT ). That U is increasing reflects a preference for more wealth
over less, whereas concavity captures risk aversion: the more concave U is, the more risk averse the investor.
The analysis is illustrated with power utility functions in Section 7.3.3.
We define the investor’s value function u by
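\[
u(t, y, w) := \sup_{\pi\in\Pi}\mathbb{E}\big[\,U(W_T)\,\big|\,Y_t = y,\ W_t = w\,\big],
\]
where Π is the set of admissible strategies:
\[
\Pi := \Big\{\pi \text{ adapted}:\ \mathbb{E}\int_0^T \pi_t^2\,\sigma^2(Y_t)\,dt < \infty \ \text{ and } \ W_t \ge 0 \ \text{a.s.}\Big\},
\]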
where adapted means adapted to the filtration generated by (BS , BY ).
Assuming that $u \in C^{1,2}([0, T], \mathbb{R}, \mathbb{R}_+)$, the value function solves the Hamilton–Jacobi–Bellman partial differential equation (HJB-PDE) problem:
\[
(\partial_t + \mathcal{A}^Y)u + \max_{\pi\in\mathbb{R}}\mathcal{A}^\pi u = 0, \qquad u(T, y, w) = U(w), \tag{7.27}
\]
where (AY + A𝜋 ) is the generator of (Y, W), assuming a Markov investment strategy
𝜋t = 𝜋(t, Yt , Wt ). Specifically, the operators AY and A𝜋 are given by
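\[
\begin{aligned}
\mathcal{A}^Y &= c(y)\,\partial_y + \tfrac12\beta^2(y)\,\partial_y^2,\\
\mathcal{A}^\pi &= \pi(t,y,w)\,\mu(y)\,\partial_w + \tfrac12\pi^2(t,y,w)\,\sigma^2(y)\,\partial_w^2 + \pi(t,y,w)\,\rho\,\sigma(y)\,\beta(y)\,\partial_y\partial_w .
\end{aligned}
\]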
The optimal strategy 𝜋 ∗ is given by
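\[
\pi^* = \arg\max_{\pi\in\mathbb{R}}\mathcal{A}^\pi u = -\frac{\mu\,(\partial_w u) + \rho\beta\sigma\,(\partial_y\partial_w u)}{\sigma^2\,(\partial_w^2 u)}, \tag{7.28}
\]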
where, for simplicity (and from now on), we have omitted the arguments (t, y, 𝑤).
Inserting the optimal strategy 𝜋 ∗ into the HJB-PDE (7.27) yields
\[
(\partial_t + \mathcal{A}^Y)u + \mathcal{N}(u) = 0, \tag{7.29}
\]
Stochastic Volatility
where N (u) is a nonlinear term:
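\[
\mathcal{N}(u) = -\frac12\lambda^2\,\frac{(\partial_w u)^2}{\partial_w^2 u} - \rho\beta\lambda\,\frac{(\partial_w u)(\partial_y\partial_w u)}{\partial_w^2 u} - \frac12\rho^2\beta^2\,\frac{(\partial_y\partial_w u)^2}{\partial_w^2 u}.
\]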
Here, we have introduced the Sharpe ratio 𝜆(y) ∶= 𝜇(y)∕𝜎(y).
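The three terms of the nonlinear term can be recovered by completing the square in 𝜋. As a short worked step, assuming the value function is strictly concave in 𝑤 (so that the quadratic in 𝜋 has a maximum), set

\[
a := \sigma^2\,\partial_w^2 u < 0, \qquad b := \mu\,\partial_w u + \rho\beta\sigma\,\partial_y\partial_w u .
\]

Then the maximization in (7.27) is that of a concave quadratic,

\[
\max_{\pi\in\mathbb{R}}\Big\{\pi\,b + \tfrac12\pi^2 a\Big\} = -\frac{b^2}{2a}, \qquad \text{attained at } \pi^* = -\frac{b}{a},
\]

which is (7.28). Expanding the square and using $\lambda = \mu/\sigma$ gives

\[
-\frac{b^2}{2a} = -\frac{\mu^2(\partial_w u)^2 + 2\mu\rho\beta\sigma(\partial_w u)(\partial_y\partial_w u) + \rho^2\beta^2\sigma^2(\partial_y\partial_w u)^2}{2\sigma^2\,\partial_w^2 u}
= -\frac12\lambda^2\frac{(\partial_w u)^2}{\partial_w^2 u} - \rho\beta\lambda\frac{(\partial_w u)(\partial_y\partial_w u)}{\partial_w^2 u} - \frac12\rho^2\beta^2\frac{(\partial_y\partial_w u)^2}{\partial_w^2 u}.
\]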
7.3.2 Asymptotic Approximation
For general {𝛽, c, 𝜆}, there is no closed-form solution of (7.29). Hence, we seek an asymptotic
approximation for u. To this end, using equation (7.22) from Section 7.2.5.1 as a guide, we
expand the coefficients in (7.29) in a Taylor series about an arbitrary point ȳ . Specifically, for
any function 𝜒 ∶ ℝ → ℝ, we may formally write
\[
\chi(y) = \sum_{n=0}^{\infty}\varepsilon^n\chi_n(y), \qquad \chi_n(y) := \frac{1}{n!}\,\partial_y^n\chi(\bar y)\cdot(y - \bar y)^n, \qquad \varepsilon = 1, \tag{7.30}
\]
where we have once again introduced 𝜀 for purposes of accounting. We also expand the function u as a power series in 𝜀
\[
u = \sum_{n=0}^{\infty}\varepsilon^n u_n, \qquad \varepsilon = 1. \tag{7.31}
\]
Now, for each group of coefficients appearing in (7.29), we insert an expansion of the form
(7.30), and we define
\[
\mathcal{A}_n := c_n\,\partial_y + (\tfrac12\beta^2)_n\,\partial_y^2, \qquad n \in \{0\}\cup\mathbb{N}. \tag{7.32}
\]
We also insert into (7.29) our expansion (7.31) for u.
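To make the coefficient expansion (7.30) concrete, the sketch below evaluates the individual Taylor terms of a coefficient from its derivatives at ȳ. The helper name and the choice of exp(y) as a test coefficient are illustrative assumptions only; the text leaves the coefficients general.

```python
import math

def taylor_terms(derivs_at_ybar, ybar, y):
    """Return [chi_0(y), chi_1(y), ...] with chi_n(y) = chi^(n)(ybar) (y - ybar)^n / n!."""
    return [d * (y - ybar) ** n / math.factorial(n)
            for n, d in enumerate(derivs_at_ybar)]

# Hypothetical coefficient chi(y) = exp(y): every derivative at ybar equals exp(ybar).
ybar, y = 0.0, 0.3
terms = taylor_terms([math.exp(ybar)] * 6, ybar, y)
approx = sum(terms)                  # partial sum of (7.30) with epsilon = 1
error = abs(approx - math.exp(y))    # truncation error of the six-term expansion
```

Only the terms with n = 0 and n = 1 enter the approximations u0 + u1 and 𝜋0 + 𝜋1 developed in this section; the further terms are what higher-order corrections would use.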
Next, collecting terms of like powers of 𝜀, we obtain at lowest order
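\[
(\partial_t + \mathcal{A}_0)u_0 - (\tfrac12\lambda^2)_0\,\frac{(\partial_w u_0)^2}{\partial_w^2 u_0} - (\rho\beta\lambda)_0\,\frac{(\partial_w u_0)(\partial_y\partial_w u_0)}{\partial_w^2 u_0} - (\tfrac12\rho^2\beta^2)_0\,\frac{(\partial_y\partial_w u_0)^2}{\partial_w^2 u_0} = 0,
\]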
with u0 (T, y, 𝑤) = U(𝑤). We can look for a solution u0 = u0 (t, 𝑤) that is independent of y, and
then we have
\[
\partial_t u_0 - (\tfrac12\lambda^2)_0\,\frac{(\partial_w u_0)^2}{\partial_w^2 u_0} = 0, \qquad u_0(T, w) = U(w). \tag{7.33}
\]
We observe that (7.33) is the same nonlinear PDE problem that arises when one considers an
underlying that has a constant drift 𝜇0 = 𝜇(̄y), diffusion coefficient 𝜎0 = 𝜎(̄y), and Sharpe ratio
𝜆0 = 𝜆(̄y) = 𝜇(̄y)∕𝜎(̄y).
It is convenient to define the risk-tolerance function
\[
R_0 := -\frac{\partial_w u_0}{\partial_w^2 u_0},
\]
and the operators
\[
\mathcal{D}_k = R_0^k\,\partial_w^k, \qquad k = 1, 2, \ldots.
\]
We now proceed to the order O(𝜀) terms. Using u0 = u0 (t, 𝑤) and (7.33), we obtain
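\[
(\partial_t + \mathcal{A}_0)u_1 + (\tfrac12\lambda_0^2)\,\mathcal{D}_2 u_1 + \lambda_0^2\,\mathcal{D}_1 u_1 + (\rho\beta\lambda)_0\,\mathcal{D}_1\partial_y u_1 = -(\tfrac12\lambda^2)_1\,\mathcal{D}_1 u_0, \tag{7.34}
\]
\[
u_1(T, y, w) = 0, \tag{7.35}
\]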
which is a linear PDE problem for u1 .
We can rewrite equations (7.34)–(7.35) more compactly as
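\[
(\partial_t + \mathcal{A}_0 + \mathcal{B}_0(t))u_1 + H_1 = 0, \qquad u_1(T, y, w) = 0, \tag{7.36}
\]
where the linear operator $\mathcal{B}_0(t)$ and the source term $H_1$ are given by
\[
\mathcal{B}_0(t) = \tfrac12\lambda_0^2\,\mathcal{D}_2 + \lambda_0^2\,\mathcal{D}_1 + (\rho\beta\lambda)_0\,\mathcal{D}_1\partial_y, \qquad H_1 = (\tfrac12\lambda^2)_1\,R_0\,\partial_w u_0 .
\]
Both $\mathcal{B}_0(t)$ and $H_1$ depend only on the known function $u_0$, so (7.36) is linear in $u_1$.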
The following change of variables (see Fouque et al., 2013) will be useful for solving the
PDE problem (7.36). Define
\[
u_1(t, y, w) = q_1(t, y, z(t, w)), \qquad z(t, w) = -\log\partial_w u_0(t, w) + \tfrac12\lambda_0^2\,(T - t). \tag{7.37}
\]
Inserting (7.37) into (7.36), we find that q1 satisfies
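\[
0 = (\partial_t + \mathcal{A}_0 + \mathcal{C}_0)q_1 + Q_1, \qquad q_1(T, y, z) = 0, \tag{7.38}
\]
where the operator $\mathcal{C}_0$ is given by
\[
\mathcal{C}_0 = \tfrac12\lambda_0^2\,\partial_z^2 + (\rho\beta\lambda)_0\,\partial_y\partial_z, \tag{7.39}
\]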
and the function Q1 satisfies H1 (t, y, 𝑤) = Q1 (t, y, z(t, 𝑤)).
Now, from (7.32) and (7.39), we observe that the operator (A0 + C 0 ) is the infinitesimal
generator of a diffusion in ℝ2 whose drift vector and covariance matrix are constant. The
semigroup P 0 (t, t′ ) generated by (A0 + C 0 ) is given by
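\[
\mathcal{P}_0(t, T)G(y, z) := \int_{\mathbb{R}^2} d\eta\,d\zeta\ \Gamma_0(t, y, z; T, \eta, \zeta)\,G(\eta, \zeta),
\]
where $\Gamma_0$, the fundamental solution corresponding to $(\partial_t + \mathcal{A}_0 + \mathcal{C}_0)$, is a Gaussian kernel:
\[
\Gamma_0(t, y, z; T, \eta, \zeta) = \frac{1}{\sqrt{(2\pi)^2\,|\mathbf{C}|}}\exp\Big(-\frac12\,\mathbf{m}^{\mathsf T}\mathbf{C}^{-1}\mathbf{m}\Big),
\]
with covariance matrix $\mathbf{C}$ and vector $\mathbf{m}$ given by
\[
\mathbf{C} = (T - t)\begin{pmatrix}(\beta^2)_0 & (\rho\beta\lambda)_0\\ (\rho\beta\lambda)_0 & (\lambda^2)_0\end{pmatrix}, \qquad
\mathbf{m} = \begin{pmatrix}\eta - y - (T - t)\,c_0\\ \zeta - z\end{pmatrix}.
\]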
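The kernel is a bivariate normal density in (𝜂, 𝜁), normalized so that it integrates to one over ℝ2. A minimal sketch of its evaluation follows; the function name and argument order are our own, the 2×2 inverse is written out by hand, and the code assumes |𝜌| < 1 with nonzero 𝛽0 and 𝜆0 so that the covariance matrix is invertible.

```python
import math

def gamma0(t, y, z, T, eta, zeta, c0, beta0, lam0, rho):
    """Bivariate Gaussian density in (eta, zeta).

    Mean: (y + (T - t) * c0, z).
    Covariance: (T - t) * [[beta0^2, rho*beta0*lam0], [rho*beta0*lam0, lam0^2]].
    """
    tau = T - t
    c11 = tau * beta0 ** 2
    c12 = tau * rho * beta0 * lam0
    c22 = tau * lam0 ** 2
    det = c11 * c22 - c12 ** 2
    m1 = eta - y - tau * c0
    m2 = zeta - z
    # Quadratic form m^T C^{-1} m using the explicit inverse of a 2x2 matrix.
    quad = (c22 * m1 * m1 - 2.0 * c12 * m1 * m2 + c11 * m2 * m2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

# Illustrative (assumed) parameter values.
density_at_mean = gamma0(0.0, 0.2, -0.1, 1.0, 0.3, -0.1,
                         c0=0.1, beta0=0.5, lam0=0.4, rho=-0.3)
```

Applying the semigroup to a function G then amounts to a two-dimensional quadrature of `gamma0` against G, which is one way the numerical route mentioned later for general utilities could be organized.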
By Duhamel’s principle, the unique classical solution to (7.38) is given by
\[
q_1(t) = \int_t^T ds\ \mathcal{P}_0(t, s)\,Q_1(s).
\]
In the case of a general utility function, (7.33) is easily solved numerically, for instance by
solving the fast diffusion (or Black’s) equation for the risk tolerance function R0 (see Fouque
et al., 2013). Then, u1 can also be computed numerically using the formulas above. In the case
of power utility, there are explicit formulas, as given in Section 7.3.3.
Having obtained an approximation for the value function u ≈ u0 + u1 , we now seek an
expansion for the optimal control 𝜋 ∗ ≈ 𝜋0∗ + 𝜋1∗ . Inserting the expansion (7.30) of the coefficients and the expansion (7.31) for u into (7.28), and collecting terms of like powers of 𝜀, we
obtain
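\[
O(1):\qquad \pi_0 = -\frac{\mu_0\,(\partial_w u_0)}{(\sigma^2)_0\,(\partial_w^2 u_0)}, \tag{7.40}
\]
\[
O(\varepsilon):\qquad \pi_1 = -\pi_0\,\frac{(\sigma^2)_1}{(\sigma^2)_0} - \mu_1\,\frac{(\partial_w u_0)}{(\sigma^2)_0\,(\partial_w^2 u_0)} - \pi_0\,\frac{(\partial_w^2 u_1)}{(\partial_w^2 u_0)} - \mu_0\,\frac{(\partial_w u_1)}{(\sigma^2)_0\,(\partial_w^2 u_0)} - (\rho\beta\sigma)_0\,\frac{(\partial_y\partial_w u_1)}{(\sigma^2)_0\,(\partial_w^2 u_0)}. \tag{7.41}
\]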
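One way to see where the five terms of (7.41) come from is to clear the denominator in (7.28) before expanding. As a short worked step, write (7.28) as

\[
\sigma^2\,(\partial_w^2 u)\,\pi^* = -\mu\,(\partial_w u) - \rho\beta\sigma\,(\partial_y\partial_w u),
\]

insert the expansions (7.30) and (7.31), and use that $u_0$ does not depend on y, so $\partial_y\partial_w u_0 = 0$. Matching powers of $\varepsilon$ gives

\[
\begin{aligned}
O(1):&\quad (\sigma^2)_0\,(\partial_w^2 u_0)\,\pi_0 = -\mu_0\,(\partial_w u_0),\\
O(\varepsilon):&\quad (\sigma^2)_1\,(\partial_w^2 u_0)\,\pi_0 + (\sigma^2)_0\,(\partial_w^2 u_1)\,\pi_0 + (\sigma^2)_0\,(\partial_w^2 u_0)\,\pi_1 = -\mu_1\,(\partial_w u_0) - \mu_0\,(\partial_w u_1) - (\rho\beta\sigma)_0\,(\partial_y\partial_w u_1).
\end{aligned}
\]

Dividing the second line by $(\sigma^2)_0\,(\partial_w^2 u_0)$ and solving for $\pi_1$ reproduces (7.41) term by term.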
Higher order terms for both the value function u and the optimal control 𝜋 ∗ can be obtained
in the same manner as u1 and 𝜋1 . Analysis of the asymptotic formulas for different utility functions and stochastic volatility models is presented in more detail in Lorig and Sircar (2015).
7.3.3 Power Utility
Finally, we consider a utility function U from the constant relative risk aversion (CRRA), or
power family:
\[
\text{CRRA utility:}\qquad U(w) := \frac{w^{1-\gamma}}{1-\gamma}, \qquad w > 0, \quad \gamma > 0, \ \gamma \neq 1,
\]
where 𝛾 is called the risk aversion coefficient. Here, all the quantities above can be computed
explicitly.
The explicit solution u0 to (7.33) is
\[
u_0(t, w) = U(w)\,\exp\Big(\frac12\lambda_0^2\Big(\frac{1-\gamma}{\gamma}\Big)(T - t)\Big).
\]
The risk-tolerance function is $R_0 = w/\gamma$, and the transformation in (7.37) is then
\[
z(t, w) = \gamma\log w + (T - t)\Big(\frac{2\gamma - 1}{\gamma}\Big)\frac12\lambda_0^2 .
\]
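The closed form for u0 can be checked against (7.33) numerically. The sketch below does this with central finite differences and also recovers the risk tolerance; the function names, step size, and parameter values are illustrative assumptions.

```python
import math

def u0(t, w, T, gamma, lam0):
    """Zeroth-order value function for CRRA utility with constant Sharpe ratio lam0."""
    U = w ** (1.0 - gamma) / (1.0 - gamma)
    return U * math.exp(0.5 * lam0 ** 2 * (1.0 - gamma) / gamma * (T - t))

def residual_733(t, w, T, gamma, lam0, h=1e-4):
    """Central-difference residual of (7.33) and the implied risk tolerance R0."""
    f = lambda tt, ww: u0(tt, ww, T, gamma, lam0)
    ut = (f(t + h, w) - f(t - h, w)) / (2 * h)
    uw = (f(t, w + h) - f(t, w - h)) / (2 * h)
    uww = (f(t, w + h) - 2 * f(t, w) + f(t, w - h)) / h ** 2
    risk_tolerance = -uw / uww          # should be close to w / gamma
    return ut - 0.5 * lam0 ** 2 * uw ** 2 / uww, risk_tolerance

res, R0 = residual_733(t=0.2, w=1.5, T=1.0, gamma=3.0, lam0=0.4)
```

With these inputs the residual should vanish up to discretization error and `R0` should be close to 1.5/3.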
An explicit computation reveals that u1 is given by
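\[
u_1(t, y, w) = q_1(t, y, z(t, w)) = \Big(\frac{1-\gamma}{\gamma}\Big)\,u_0(t, w)\,\Big(\frac12\lambda^2\Big)'(\bar y)\,\Big((T - t)(y - \bar y) + \frac12(T - t)^2\Big(c_0 + \Big(\frac{1-\gamma}{\gamma}\Big)\rho\beta_0\lambda_0\Big)\Big).
\]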
For the specific case ȳ = y, the above expression simplifies to
\[
u_1(t, y, w) = \Big(\frac{1-\gamma}{\gamma}\Big)\,u_0(t, w)\,\Big(\frac12\lambda^2\Big)'(y)\,\frac12(T - t)^2\Big(c_0 + \Big(\frac{1-\gamma}{\gamma}\Big)\rho\beta_0\lambda_0\Big).
\]
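As a sanity check on the explicit first-order term, one can verify numerically that it satisfies the linear problem (7.36) with R0 = 𝑤∕𝛾. The sketch below uses central finite differences; the parameter dictionary, its key names (in particular `dhl2`, standing for the derivative of 𝜆2∕2 at ȳ), and the numerical values are all illustrative assumptions.

```python
import math

def u0(t, w, T, gamma, lam0):
    return w ** (1 - gamma) / (1 - gamma) * math.exp(
        0.5 * lam0 ** 2 * (1 - gamma) / gamma * (T - t))

def u1(t, y, w, p):
    """First-order correction for power utility; p['dhl2'] = (lambda^2/2)'(ybar)."""
    k = (1 - p['gamma']) / p['gamma']
    tau = p['T'] - t
    bracket = tau * (y - p['ybar']) + 0.5 * tau ** 2 * (
        p['c0'] + k * p['rho'] * p['beta0'] * p['lam0'])
    return k * u0(t, w, p['T'], p['gamma'], p['lam0']) * p['dhl2'] * bracket

def residual_736(t, y, w, p, h=1e-3):
    """Finite-difference residual of (d_t + A_0 + B_0) u1 + H_1."""
    f = lambda tt, yy, ww: u1(tt, yy, ww, p)
    R0 = w / p['gamma']
    ut = (f(t + h, y, w) - f(t - h, y, w)) / (2 * h)
    uy = (f(t, y + h, w) - f(t, y - h, w)) / (2 * h)
    uyy = (f(t, y + h, w) - 2 * f(t, y, w) + f(t, y - h, w)) / h ** 2
    uw = (f(t, y, w + h) - f(t, y, w - h)) / (2 * h)
    uww = (f(t, y, w + h) - 2 * f(t, y, w) + f(t, y, w - h)) / h ** 2
    uyw = (f(t, y + h, w + h) - f(t, y + h, w - h)
           - f(t, y - h, w + h) + f(t, y - h, w - h)) / (4 * h ** 2)
    lam0, beta0 = p['lam0'], p['beta0']
    A0 = p['c0'] * uy + 0.5 * beta0 ** 2 * uyy
    B0 = (0.5 * lam0 ** 2 * R0 ** 2 * uww + lam0 ** 2 * R0 * uw
          + p['rho'] * beta0 * lam0 * R0 * uyw)
    du0dw = (1 - p['gamma']) * u0(t, w, p['T'], p['gamma'], lam0) / w
    H1 = p['dhl2'] * (y - p['ybar']) * R0 * du0dw   # (lambda^2/2)_1 * R0 * d_w u0
    return ut + A0 + B0 + H1

params = {'T': 1.0, 'gamma': 3.0, 'lam0': 0.4, 'beta0': 0.3,
          'c0': 0.1, 'rho': -0.5, 'ybar': 0.0, 'dhl2': 0.2}
res = residual_736(0.3, 0.25, 1.5, params)
```

The residual should be zero up to discretization error, and `u1` vanishes at t = T as required by the terminal condition.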
Using these explicit representations of u0 and u1 , the expressions (7.40) and (7.41) for the
optimal stockholding approximations become
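\[
\pi_0^* = \frac{\mu_0}{\gamma\sigma_0^2},
\]
\[
\pi_1^*(t, y) = (y - \bar y)\Big(\frac{\mu'(\bar y)}{\gamma\sigma_0^2} - \frac{\mu_0}{\gamma\sigma_0^4}\,(\sigma^2)'(\bar y)\Big) + \frac{(1-\gamma)(T - t)}{\gamma\sigma_0}\,\rho\beta_0\Big(\frac{1}{\gamma}\Big(\frac12\lambda^2\Big)'(\bar y)\Big).
\]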
For the specific case ȳ = y, the formula for 𝜋1∗ simplifies to
\[
\pi_1^*(t, y) = \frac{(1-\gamma)(T - t)}{\gamma\sigma_0}\,\rho\beta_0\Big(\frac{1}{\gamma}\Big(\frac12\lambda^2\Big)'(y)\Big).
\]
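The two formulas are simple enough to evaluate directly. In the sketch below the function and argument names are ours, and the example model (constant drift with volatility exp(y)) is a hypothetical choice. Note that the expressions as printed carry no factor of 𝑤, whereas (7.40) with R0 = 𝑤∕𝛾 is proportional to 𝑤; the sketch therefore reads them as holdings per unit of wealth, which is our interpretation rather than a statement from the text.

```python
import math

def pi0_star(mu0, sigma0, gamma):
    """Zeroth-order term mu0 / (gamma * sigma0^2)."""
    return mu0 / (gamma * sigma0 ** 2)

def pi1_star(t, y, ybar, T, gamma, rho, mu0, sigma0, beta0, dmu, dsig2, dhl2):
    """First-order correction.

    dmu = mu'(ybar), dsig2 = (sigma^2)'(ybar), dhl2 = (lambda^2/2)'(ybar).
    """
    level = (y - ybar) * (dmu / (gamma * sigma0 ** 2)
                          - mu0 * dsig2 / (gamma * sigma0 ** 4))
    hedge = (1 - gamma) * (T - t) / (gamma * sigma0) * rho * beta0 * dhl2 / gamma
    return level + hedge

# Hypothetical model: mu(y) = 0.08, sigma(y) = exp(y), expanded at ybar = y = log(0.2).
mu0, sigma0 = 0.08, 0.2
ybar = math.log(sigma0)
dsig2 = 2.0 * sigma0 ** 2            # derivative of exp(2y) at ybar
dhl2 = -(mu0 / sigma0) ** 2          # derivative of mu0^2 * exp(-2y) / 2 at ybar
first = pi0_star(mu0, sigma0, gamma=3.0)
second = pi1_star(0.0, ybar, ybar, 1.0, 3.0, -0.5, mu0, sigma0, 0.3, 0.0, dsig2, dhl2)
```

With ȳ = y the first bracket drops out and only the term proportional to 𝜌𝛽0 remains, matching the simplified formula above.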
7.4 Conclusions
Asymptotic methods can be used to analyze and simplify pricing and portfolio optimization
problems, and we have presented some examples and methodologies. A key insight is to perturb problems with stochastic volatility around their constant volatility counterparts to obtain
the principal effect of volatility uncertainty.
These approaches reduce the dimension of the effective problems that have to be solved, and
often lead to explicit formulas that can be analyzed for intuition. In the context of portfolio
problems accounting for stochastic volatility, recent progress has been made in cases where
there are transaction costs (Bichuch and Sircar, 2014; Kallsen and Muhle-Karbe, 2013) or
stochastic risk aversion that varies with market conditions (Dong and Sircar, 2014), and under
more complex local-stochastic volatility models (Lorig and Sircar, 2015).
Acknowledgements
RL’s work is partially supported by NSF grant DMS-0739195. RS’s work is partially supported
by NSF grant DMS-1211906.
References
Aït-Sahalia, Y. and Jacod, J. (2014). High-frequency financial econometrics. Princeton: Princeton University Press.
Bakshi, G., Cao, C. and Chen, Z. (1997). Empirical performance of alternative option pricing models. Journal of
Finance, 52 (5), 2003–2049.
Berestycki, H., Busca, J. and Florent, I. (2004). Computing the implied volatility in stochastic volatility models.
Communications on Pure and Applied Mathematics, 57 (10), 1352–1373.
Bichuch, M. and Sircar, R. (2014). Optimal investment with transaction costs and stochastic volatility. Submitted.
Black, F. and Scholes, M. (1972). The valuation of option contracts and a test of market efficiency. Journal of Finance,
27, 399–417.
Black, F. and Scholes, M. (1973). The pricing of options and corporate liabilities. The Journal of Political Economy,
81(3), 637–654.
Dong, Y. and Sircar, R. (2014). Time-inconsistent portfolio investment problems. In Stochastic analysis and applications 2014 – in honour of Terry Lyons (ed. Crisan, D., Hambly, B. and Zariphopoulou, T.). Berlin: Springer.
Dupire, B. (1994). Pricing with a smile. Risk, 7 (1), 18–20.
Engle, R. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom
inflation. Econometrica, 50 (4), 987–1008.
Fouque, J.-P., Papanicolaou, G. and Sircar, R. (2000). Derivatives in financial markets with stochastic volatility. Cambridge: Cambridge University Press.
Fouque, J.-P., Papanicolaou, G., Sircar, R. and Solna, K. (2011). Multiscale stochastic volatility for equity, interest
rate, and credit derivatives. Cambridge: Cambridge University Press.
Fouque, J.-P., Sircar, R. and Zariphopoulou, T. (2013). Portfolio optimization & stochastic volatility asymptotics.
Mathematical Finance, forthcoming.
Gulisashvili, A. (2012). Analytically tractable stochastic stock price models. New York: Springer.
Heston, S. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency
options. Review of Financial Studies, 6 (2), 327–343.
Hull, J. and White, A. (1987). The pricing of options on assets with stochastic volatilities. Journal of Finance, 42 (2),
281–300.
Jonsson, M. and Sircar, R. (2002a). Optimal investment problems and volatility homogenization approximations. In
Modern methods in scientific computing and applications (ed. Bourlioux, A., Gander, M. and Sabidussi, G.), NATO
Science Series II vol. 75. New York: Kluwer, pp. 255–281.
Jonsson, M. and Sircar, R. (2002b). Partial hedging in a stochastic volatility environment. Mathematical Finance,
12 (4), 375–409.
Kallsen, J. and Muhle-Karbe, J. (2013). The general structure of optimal investment and consumption with small
transaction costs. Submitted.
Lee, R.W. (2004). The moment formula for implied volatility at extreme strikes. Mathematical Finance, 14 (3),
469–480.
Lewis, A. (2000). Option valuation under stochastic volatility. Newport Beach, CA: Finance Press.
Lorig, M. and Lozano-Carbassé, O. (2015). Multiscale exponential Lévy models. Quantitative Finance, 15 (1),
91–100.
Lorig, M., Pagliarani, S. and Pascucci, A. (2014). A Taylor series approach to pricing and implied vol for LSV models.
Journal of Risk, 17, 1–17.
Lorig, M., Pagliarani, S. and Pascucci, A. (2015a). Analytical expansions for parabolic equations. SIAM Journal on
Applied Mathematics, 75, 468–491.
Lorig, M., Pagliarani, S. and Pascucci, A. (2015b). Explicit implied volatilities for multifactor local-stochastic volatility models. Mathematical Finance, forthcoming.
Lorig, M., Pagliarani, S. and Pascucci, A. (2015c). A family of density expansions for Lévy-type processes with
default. Annals of Applied Probability, 25 (1), 235–267.
Lorig, M., Pagliarani, S. and Pascucci, A. (2015d). Pricing approximations and error estimates for local Lévy-type
models with default. Computers and Mathematics with Applications, forthcoming.
Lorig, M. and Sircar, R. (2015). Portfolio optimization under local-stochastic volatility: coefficient Taylor series
approximations & implied Sharpe ratio. Submitted.
Merton, R. (1976). Option pricing when underlying stock returns are discontinuous. Journal of Financial Economics,
3 (1), 125–144.
Merton, R. (1992). Continuous-time finance. Oxford: Blackwell.
Merton, R.C. (1969). Lifetime portfolio selection under uncertainty: the continuous-time case. Review of Economics
and Statistics, 51 (3), 247–257.
Merton, R.C. (1971). Optimum consumption and portfolio rules in a continuous-time model. Journal of Economic
Theory, 3 (1–2), 373–413.
Nelson, D. (1990). ARCH models as diffusion approximations. Journal of Econometrics, 45 (1–2), 7–38.
Nelson, D. (1991). Conditional heteroskedasticity in asset returns: a new approach. Econometrica, 59 (2), 347–370.
Scott, L. (1987). Option pricing when the variance changes randomly: theory, estimation, and an application. Journal
of Financial and Quantitative Analysis, 22 (4), 419–438.
Sircar, R. and Zariphopoulou, T. (2005). Bounds and asymptotic approximations for utility prices when volatility is
random. SIAM Journal on Control and Optimization, 43 (4), 1328–1353.
Tehranchi, M.R. (2009). Asymptotics of implied volatility far from maturity. Journal of Applied Probability, 629–650.
Wiggins, J. (1987). Option values under stochastic volatility. Journal of Financial Economics, 19 (2), 351–372.
8 Statistical Measures of Dependence for Financial Data
David S. Matteson, Nicholas A. James and William B. Nicholson
Cornell University, USA
8.1 Introduction
The analysis of financial and econometric data is typified by non-Gaussian multivariate
observations that exhibit complex dependencies: heavy-tailed and skewed marginal distributions are commonly encountered; serial dependence, such as autocorrelation and conditional
heteroscedasticity, appears in time-ordered sequences; and nonlinear, higher-order, and tail
dependence are widespread. Illustrations of serial dependence, nonnormality, and nonlinear
dependence are shown in Figure 8.1.
When data are assumed to be jointly Gaussian, all dependence is linear, and therefore only
pairwise among the variables. In this setting, Pearson’s product-moment correlation coefficient
uniquely characterizes the sign and strength of any such dependence.
Definition 8.1 For random variables X and Y with joint density fXY , Pearson’s correlation
coefficient is defined as
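\[
\rho_P(X, Y) = E\left[\frac{(X - \mu_X)}{\sqrt{\sigma_X^2}}\,\frac{(Y - \mu_Y)}{\sqrt{\sigma_Y^2}}\right] = \int\!\!\int \frac{(x - \mu_X)}{\sqrt{\sigma_X^2}}\,\frac{(y - \mu_Y)}{\sqrt{\sigma_Y^2}}\,f_{XY}(x, y)\,dx\,dy,
\]
where $\mu_X = E(X) = \int\!\!\int x\,f_{XY}(x, y)\,dx\,dy$ and $\sigma_X^2 = E[(X - \mu_X)^2] = \int\!\!\int (x - \mu_X)^2 f_{XY}(x, y)\,dx\,dy$
are the mean and variance of X, respectively, and $\mu_Y$ and $\sigma_Y^2$ are defined similarly.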
This conventional measure of pairwise linear association is well-defined provided $\sigma_X^2$ and $\sigma_Y^2$
are positive and finite, in which case $\rho_P(X, Y) \in [-1, +1]$. A value of −1 or +1 indicates
perfect negative or positive linear dependence, respectively, and in either case X and Y have
an exact linear relationship. Negative and positive values indicate negative and positive linear
associations, respectively, while a value of 0 indicates no linear dependence.
[Figure 8.1: six panels. Axis labels include Time, Density, Sample Quantiles, and Theoretical quantiles; the middle right panel is titled "Normal Q−Q Plot".]
Figure 8.1 Top left: Strong and persistent positive autocorrelation, that is, persistence in local level;
top right: moderate volatility clustering, that is, persistence in local variation. Middle left: Right
tail density estimates of Gaussian versus heavy- or thick-tailed data; middle right: sample quantiles of
heavy-tailed data versus the corresponding quantiles of the Gaussian distribution. Bottom left: Linear
regression line fit to non-Gaussian data; bottom right: corresponding estimated density contours of the normalized sample ranks, which show a positive association that is stronger in the lower left quadrant compared
to the upper right.
Financial Signal Processing and Machine Learning, First Edition.
Edited by Ali N. Akansu, Sanjeev R. Kulkarni and Dmitry Malioutov.
© 2016 John Wiley & Sons, Ltd. Published 2016 by John Wiley & Sons, Ltd.
A sample estimator $\hat\rho_P$ is typically defined by replacing expectations with empirical expectations in the above definition. For a paired random sample of n observations $(X_{1:n}, Y_{1:n}) = \{(x_i, y_i)\}_{i=1}^n$, define
\[
\hat\rho_P(X_{1:n}, Y_{1:n}) = \frac{1}{n-1}\sum_{i=1}^n \frac{(x_i - \hat\mu_X)}{\sqrt{\hat\sigma_X^2}}\,\frac{(y_i - \hat\mu_Y)}{\sqrt{\hat\sigma_Y^2}}, \qquad \text{with} \quad \hat\mu_X = \frac1n\sum_{i=1}^n x_i; \quad \hat\sigma_X^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \hat\mu_X)^2,
\]
where $\hat\mu_Y$ and $\hat\sigma_Y^2$ are defined similarly. For jointly Gaussian variables (X, Y), zero correlation is
equivalent to independence. For an independent and identically distributed (i.i.d.) sample from
the bivariate normal distribution, inference regarding 𝜌P can be conducted using the following
asymptotic approximation:
\[
\sqrt{n}\,(\hat\rho_P - \rho_P) \xrightarrow{D} N[0, (1 - \rho_P^2)^2],
\]
in which $\xrightarrow{D}$ denotes convergence in distribution. Under the null hypothesis of zero correlation, this expression simplifies to $\sqrt{n}\,\hat\rho_P \xrightarrow{D} N(0, 1)$. More generally, for an i.i.d. sample from
an arbitrary distribution with finite fourth moments, $E(X^4)$ and $E(Y^4)$, a variance-stabilizing
transformation may be applied (Fisher's transformation) to obtain the alternative asymptotic
approximation (cf. Ferguson, 1996)
\[
\frac{\sqrt{n}}{2}\left(\log\frac{1 + \hat\rho_P}{1 - \hat\rho_P} - \log\frac{1 + \rho_P}{1 - \rho_P}\right) \xrightarrow{D} N(0, 1).
\]
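The sample estimator and the Fisher-transformed approximation translate directly into code. The following is a minimal sketch in plain Python; the function names are ours, and the interval uses the √n scaling exactly as stated above (finite-sample treatments often use √(n − 3) instead).

```python
import math
from statistics import NormalDist

def pearson(xs, ys):
    """Sample Pearson correlation with (n - 1)-scaled second moments."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return sxy / math.sqrt(sxx * syy)

def fisher_interval(r, n, level=0.95):
    """Approximate interval for rho_P from sqrt(n) * (atanh(r) - atanh(rho)) -> N(0, 1)."""
    zcrit = NormalDist().inv_cdf(0.5 + level / 2.0)
    center = math.atanh(r)              # equals 0.5 * log((1 + r) / (1 - r))
    half = zcrit / math.sqrt(n)
    # Map the symmetric interval on the transformed scale back to (-1, 1).
    return math.tanh(center - half), math.tanh(center + half)

lo, hi = fisher_interval(r=0.5, n=100)
```

Because the interval is built on the transformed scale and mapped back with tanh, it always stays inside (−1, 1) and is asymmetric around the point estimate.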
Although the previous approximation is quite general, assuming finite fourth moments may
be unreasonable in numerous financial applications where extreme events are common. Furthermore, Pearson’s correlation coefficient measures the strength of linear relationships only.
There are many situations in which correlations are zero but a strong nonlinear relationship
exists, such that variables are highly dependent. In Section 8.2, we discuss several robust
measures of correlation and pairwise association, and illustrate their application in measuring
serial dependence in time-ordered data. In Section 8.3, we consider multivariate extensions and
Granger causality, and introduce measures of mutual independence. Finally, in Section 8.4 we
explore copulas and their financial applications.
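As a concrete instance of the earlier remark that zero correlation can coexist with strong dependence, take values symmetric about zero and let the second variable be the square of the first. The data below are a toy example of our own, not from the text.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation with (n - 1)-scaled second moments."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return sxy / math.sqrt(sxx * syy)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]      # y is a deterministic function of x
r = pearson(xs, ys)           # the cross products cancel, so r is zero
```

Here y is completely determined by x, yet the linear measure reports no association, which is the motivation for the rank-based and other robust measures that follow.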
8.2 Robust Measures of Correlation and Autocorrelation
Financial data are often time-ordered, and intertemporal dependence is commonplace. The
autocovariance and autocorrelation functions are extensions of covariance and Pearson’s correlation to a time-dependent setting, respectively.
Definition 8.2 For a univariate ordered sequence of random variables {Xt }, the autocovariance 𝛾 and autocorrelation 𝜌 at indices q and r (Shumway and Stoffer, 2011) are defined as
\[
\gamma(X_q, X_r) = \gamma(q, r) = E[(X_q - \mu_q)(X_r - \mu_r)]
\]
and
\[
\rho_P(X_q, X_r) = \rho_P(q, r) = \frac{\gamma(q, r)}{\sqrt{\gamma(q, q)}\,\sqrt{\gamma(r, r)}},
\]
respectively, in which $\mu_q = E(X_q)$ and $\mu_r = E(X_r)$. The above quantities are well-defined provided $E(X_t^2)$ is finite for all t; however, estimating these quantities from an observed sequence
$X_{1:n} = \{x_t\}_{t=1}^n$ requires either multiple i.i.d. realizations of the entire sequence (uncommon in
finance), or some additional assumptions. The first basic assumption is that the observations are
equally spaced, and t denotes their discrete-time index. We will refer to such sequences generically as time-series. In finance, this assumption may only hold approximately. For example, a
sequence of daily market closing asset prices may only be available for weekdays, with additional gaps on holidays, or intraday asset transaction prices may be reported every hundredth of
a second, but there may be no transaction at many of these times. In either case, the consecutive
observations are commonly regarded as equally spaced, for simplicity.
The next basic assumption is some form of distributional invariance over time, such as stationarity.
Definition 8.3 A univariate sequence of random variables {Xt } is weakly (or covariance)
stationary if and only if
\[
E(X_t) = E(X_{t-h}) = \mu \quad \text{and} \quad \gamma(t, t - h) = \gamma(|h|) = \gamma_h \quad \forall\, t, h, \quad \text{and} \quad \gamma_0 < \infty .
\]
This implies that the means and variances are finite and constant, and the autocovariance is
constant with respect to t, and only depends on the relative time lag h between observations.
For any k-tuple of indices $t_1, t_2, \ldots, t_k$, let $F_{t_1, t_2, \ldots, t_k}(\cdot)$ denote the joint distribution function
of $(X_{t_1}, X_{t_2}, \ldots, X_{t_k})$. Then, the sequence is strictly stationary if and only if
\[
F_{t_1, t_2, \ldots, t_k}(\cdot) = F_{t_1 - h,\, t_2 - h,\, \ldots,\, t_k - h}(\cdot) \quad \forall\, h, k, \ \text{and} \ \forall\, t_1, t_2, \ldots, t_k .
\]
This implies that the joint distributions of all k-tuples are invariant to a common time shift h
such that their relative time lags remain constant.
Strict stationarity implies weak stationarity provided the variance is also finite.
Now, under the weak stationarity assumption, the parameters 𝛾h = 𝛾(h) for h = 0, 1, 2, …
denote the autocovariance function of {Xt } with respect to the time lag h, and the corresponding autocorrelation function (ACF) is defined as 𝜌P (h) = 𝛾h ∕𝛾0 . Under the weak
stationarity assumption, the joint distribution of the random variables (X1 , … , Xn ) has
mean vector 𝝁X = 𝟏𝜇, where 𝟏 denotes a length n vector of ones, and a symmetric
Toeplitz covariance matrix ΣX , with [ΣX ]i,j = 𝛾(|i − j|). Furthermore, both ΣX and the
corresponding correlation matrix ΩX are positive definite for any stationary sequence. For
an observed stationary time-series $X_{1:n} = \{x_t\}_{t=1}^n$, the mean is estimated as before ($\hat\mu_X$),
while autocovariances and autocorrelations are commonly estimated as
\[
\hat\gamma(X_{1:n}; h) = \hat\gamma(h) = \frac1n\sum_{t=h+1}^{n}(x_t - \hat\mu_X)(x_{t-h} - \hat\mu_X) \quad \text{and} \quad \hat\rho_P(X_{1:n}; h) = \hat\rho_P(h) = \hat\gamma(h)/\hat\gamma(0),
\]
respectively. Using the scaling $\frac1n$ as opposed to $\frac{1}{n-h}$ assures that the corresponding estimated covariance and
correlation matrices $[\hat\Sigma_X]_{i,j} = \hat\gamma(|i - j|)$ and $[\hat\Omega_X]_{i,j} = \hat\rho_P(|i - j|)$ are both positive definite
(McLeod and Jiménez, 1984).
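The estimators above can be sketched in a few lines. The function name is ours; the sum runs over the same index range as in the text, shifted to zero-based indexing, and uses the 1/n scaling at every lag.

```python
def sample_acf(xs, max_lag):
    """Sample autocorrelations rho_hat(h) for h = 0..max_lag, using the 1/n scaling."""
    n = len(xs)
    mean = sum(xs) / n

    def gamma_hat(h):
        # Text: t = h+1..n (one-based); here t = h..n-1 (zero-based).
        return sum((xs[t] - mean) * (xs[t - h] - mean) for t in range(h, n)) / n

    g0 = gamma_hat(0)
    return [gamma_hat(h) / g0 for h in range(max_lag + 1)]

acf = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```

For this short trending series the lag-1 value is 0.4: the deviations from the mean are (−2, −1, 0, 1, 2), so the lag-0 sum is 10 and the lag-1 sum is 4, both divided by n = 5.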
8.2.1 Transformations and Rank-Based Methods
Pearson’s correlation and the autocorrelation function are commonly interpreted under an
implicit joint normality assumption on the data. The alternative measures discussed in this subsection offer robustness to outlying and extreme observations and consider nonlinear dependencies, all of which are common in financial data.
8.2.1.1 Huber-type Correlations
Transformations may be applied to define a robust correlation measure between pairs of random variables (X, Y). For example, let 𝜇R and 𝜎R denote robust location and scale parameters,
such as the trimmed mean and trimmed standard deviation. Let 𝜓 denote a bounded monotone function, such as $\psi(x; k) = x\,I_{|x| \le k} + \operatorname{sgn}(x)\,k\,I_{|x| > k}$ (cf. Huber, 1981), where k is some
positive constant and $I_A$ is the indicator function of an event A. Then, a robust covariance
and correlation may be defined as
\[
\gamma_R(X, Y) = \sigma_R(X)\,\sigma_R(Y)\,E\left[\psi\left(\frac{X - \mu_R(X)}{\sigma_R(X)}\right)\psi\left(\frac{Y - \mu_R(Y)}{\sigma_R(Y)}\right)\right]
\]
and
\[
\rho_R(X, Y) = \frac{\gamma_R(X, Y)}{\sqrt{\gamma_R(X, X)}\,\sqrt{\gamma_R(Y, Y)}},
\]
respectively (cf. Maronna et al., 2006). Although these measures are robust to outlying and
extreme observations, they depend on the choice of transformation 𝜓. They provide an intuitive measure of association; however, for an arbitrary joint distribution on (X, Y), 𝜌R (X, Y) ≠
𝜌P (X, Y), in general. Sample versions are obtained by replacing the expectation, 𝜇R and 𝜎R ,
by their sample estimates in the above expressions. Asymptotic sampling distributions can be
derived, but the setting is more complicated than above. Finally, robust pairwise covariances
and correlations such as these may be used to define corresponding covariance and correlation matrices for random vectors, but the result is not positive definite (or affine equivariant),
in general.
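A sample version of the Huber-type correlation can be sketched as follows. The trimmed mean and trimmed standard deviation are the robust location and scale named in the text, but the 10% trimming fraction, the clipping constant k = 1.5, and all function names are illustrative assumptions.

```python
import math

def huber_psi(x, k):
    """psi(x; k): identity on [-k, k], clipped to +/-k outside."""
    return max(-k, min(k, x))

def trimmed_location_scale(xs, trim=0.1):
    """Mean and standard deviation after dropping a fraction `trim` from each tail."""
    s = sorted(xs)
    cut = int(trim * len(s))
    core = s[cut:len(s) - cut] if cut > 0 else s
    m = sum(core) / len(core)
    sd = math.sqrt(sum((v - m) ** 2 for v in core) / (len(core) - 1))
    return m, sd

def huber_correlation(xs, ys, k=1.5, trim=0.1):
    """Sample rho_R; the sigma_R factors of gamma_R cancel in the ratio."""
    mx, sx = trimmed_location_scale(xs, trim)
    my, sy = trimmed_location_scale(ys, trim)
    px = [huber_psi((x - mx) / sx, k) for x in xs]
    py = [huber_psi((y - my) / sy, k) for y in ys]
    n = len(xs)
    gxy = sum(a * b for a, b in zip(px, py)) / n
    gxx = sum(a * a for a in px) / n
    gyy = sum(b * b for b in py) / n
    return gxy / math.sqrt(gxx * gyy)

# Toy data: a perfect increasing relation except for one gross outlier.
xs = [float(i) for i in range(20)]
ys = xs[:-1] + [-100.0]
rho_r = huber_correlation(xs, ys)
```

On this toy sample the single outlier is clipped to −k after standardization, so the estimate stays clearly positive, whereas an unclipped product-moment estimate would be dominated by that one point.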
8.2.1.2 Kendall’s Tau
Rank-based methods measure all monotonic relationships and are resistant to outliers.
Kendall’s tau (Kendall, 1938) is a nonparametric measure of dependence, for which the
sample estimate considers the pairwise agreement between two ranked lists.
Definition 8.4 For random variables X and Y, Kendall’s tau is defined as
\[
\begin{aligned}
\tau(X, Y) &= P[(X - X^*)(Y - Y^*) > 0] - P[(X - X^*)(Y - Y^*) < 0]\\
&= E\{\operatorname{sgn}[(X - X^*)(Y - Y^*)]\},
\end{aligned} \tag{8.1}
\]
in which $(X^*, Y^*)$ denotes an i.i.d. copy of (X, Y).
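A natural sample counterpart of (8.1) replaces the independent copy by every other observation in the sample and averages the sign over all distinct pairs. The sketch below implements this simple form without any correction for ties; the function names are ours, and the choice of this particular variant is an assumption rather than the estimator the chapter goes on to define.

```python
from itertools import combinations

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def kendall_tau(xs, ys):
    """Average of sgn[(x_i - x_j)(y_i - y_j)] over all pairs i < j (no tie correction)."""
    pairs = list(combinations(range(len(xs)), 2))
    total = sum(sign((xs[i] - xs[j]) * (ys[i] - ys[j])) for i, j in pairs)
    return total / len(pairs)

tau_monotone = kendall_tau([1, 2, 3, 4], [1, 4, 9, 16])   # nonlinear but increasing
tau_mixed = kendall_tau([1, 2, 3], [1, 3, 2])             # two concordant pairs, one discordant
```

Because only the ordering of the observations matters, any strictly increasing transformation of either variable leaves the estimate unchanged. The double loop costs O(n²) operations, which is fine for illustration but slow for long series.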