Appendix A All the Math You Need. . . and No More (An Executive Summary)
Obviously, if you can get away with just the adding and subtracting, and occasional other
easy bits of arithmetic, well that’s just ﬁne and dandy. And that’s what the simple option pricing
model, the binomial model, is all about. However, we don’t do an awful lot of that in this book,
because it is rather limiting. You can’t build much of a skyscraper with just a toy hammer and
a bit of string. And I do want us to build skyscrapers in this book, as it were.
Then there’s the abstract stuff; which is fantastic. The problem is that it is abstract probability
theory. Sometimes we have problems that aren’t probabilistic, then what? Probability theory
isn’t much use for that, is it? Sticking with the buildings analogy, I think of martingale theory
not as a tool but as a material. Steel is a material; it’s brilliant, you can build all sorts of things
with it, ships, bridges, etc. But it’s not that great for houses, and while it’s useful as the skeleton
of skyscrapers you will need other material to pad out that skeleton.
Differential calculus, now that’s not a material, that’s a box of tools. With the right tools
and some imagination you can build anything. Calculus doesn’t care whether a problem is
deterministic or probabilistic or something completely different. Calculus is just about how
things change or evolve, in time, space or with stock price. And that’s mostly what we do in
this book. Another advantage of focusing on the tools rather than the materials is that we don’t
have to limit ourselves in our modeling. Getting back to finance, many models at the cutting
edge of finance research are nonlinear. Calculus has no problems with nonlinearity, whereas
martingales do. Concentrating on probabilistic models alone seriously hampers your
scope for creativity. After all, outside of finance most models are nonlinear.
The real-world subject of quantitative ﬁnance uses tools from many branches of mathematics,
and ﬁnancial modeling can be approached in a variety of different ways. For some strange reason
the advocates of different branches of mathematics get quite emotional when discussing the
merits and demerits of their methodologies and those of their ‘opponents.’ Is this a territorial
thing; what are the pros and cons of martingales and differential equations; what is all this fuss
about and will it end in tears before bedtime?
Here’s a list of the various approaches to modeling and a selection of useful tools. The
distinction between a ‘modeling approach’ and a ‘tool’ will start to become clear.
A.2.1 Modeling Approaches
Deterministic
The idea behind this approach is that our model will tell us everything about the future. Given
enough data, and a big enough brain, we can write down some equations or an algorithm for
predicting the future. Interestingly, the subject of dynamical systems and chaos falls into this
category. And, as you know, chaotic systems show such sensitivity to initial conditions that
predictability is in practice impossible. This is the ‘butterfly effect,’ that a butterfly flapping its
wings in Brazil will ‘cause’ rainfall over Manchester. (Like what doesn’t!) A topic popular in
the early 1990s, this has not lived up to its promises in the financial world.
Probabilistic
One of the main assumptions about the ﬁnancial markets, at least as far as quantitative ﬁnance
goes, is that asset prices are random. We tend to think of describing ﬁnancial variables as
following some random path, with parameters describing the growth of the asset and its degree
of randomness. We effectively model the asset path via a speciﬁed rate of growth, on average,
and its deviation from that average. This approach to modeling has had the greatest impact
over the last 30 years, leading to the explosive growth of the derivatives markets.
Discrete/Continuous
Whether probabilistic or deterministic the eventual model you write down can be discrete or
continuous. Discrete means that asset prices and/or time can only be incremented in ﬁnite
chunks, whether a dollar or a cent, a year or a day. Continuous means that no such lower
increment exists. The mathematics of continuous processes is often easier than that of discrete
ones. But then when it comes to number crunching you have to turn a continuous model into
a discrete one anyway.
In discrete models we end up with difference equations. An example of this is the binomial
model for asset pricing. Time progresses in finite amounts, the time step. In continuous models
we end up with differential equations. The continuous counterpart of the discrete binomial model
is the Black–Scholes model, which has continuous asset price and continuous time. Whether
binomial or Black–Scholes, both of these models come from probabilistic assumptions about
the financial world.
A.2.2 The Tools
Now let’s take a look at some of the tools available.
Simulations
If the ﬁnancial world is random then we can experiment with the future by running simulations.
For example, an asset price may be represented by its average growth and its risk, so let’s
simulate what could happen in the future to this random asset. If we were to take such an
approach we would want to run many, many simulations. There’d be little point in running just
the one, we’d like to see a range of possible future scenarios.
Simulations can also be used for non-probabilistic problems. Just because of the similarities
between mathematical equations, a model derived in a deterministic framework may have a
probabilistic interpretation.
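Here is a minimal sketch of what 'running many, many simulations' might look like. The text above only says the asset is described by its average growth and its risk; the lognormal random walk used here, the function name `simulate_final_prices` and all the parameter values are assumptions made purely for illustration.

```python
import math
import random
import statistics

def simulate_final_prices(s0, mu, sigma, horizon, n_steps, n_paths, seed=0):
    """Simulate n_paths asset paths and return the final price of each.

    Assumption for illustration: a lognormal random walk, stepped as
    S -> S * exp((mu - sigma^2 / 2) dt + sigma sqrt(dt) phi), phi ~ N(0, 1).
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    finals = []
    for _ in range(n_paths):
        s = s0
        for _ in range(n_steps):
            phi = rng.gauss(0.0, 1.0)
            s *= math.exp((mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * phi)
        finals.append(s)
    return finals

# Hypothetical inputs: 10% growth, 20% volatility, one year, 5000 scenarios.
finals = simulate_final_prices(s0=100.0, mu=0.1, sigma=0.2,
                               horizon=1.0, n_steps=50, n_paths=5000)
print("mean final price :", statistics.mean(finals))
print("theoretical mean :", 100.0 * math.exp(0.1))
print("range of outcomes:", min(finals), "to", max(finals))
```

One path on its own tells you very little. It is the spread of the 5000 final prices that gives the range of possible future scenarios.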
Discretization methods
The complement to simulation methods, there are many types of these. The best known are the
ﬁnite-difference methods which are discretizations of continuous models such as Black–Scholes.
Depending on the problem you are solving, and unless it’s very simple, you will probably go
down the simulation or ﬁnite-difference routes for your number crunching.
Approximations
In modeling we aim to come up with a solution representing something meaningful and useful,
such as an option price. Unless the model is really simple, we may not be able to solve it easily.
This is where approximations come in. A complicated model may have approximate solutions;
and these approximate solutions might be good enough for our purposes.
Asymptotic analysis
This is an incredibly useful technique, used in most branches of applicable mathematics, but
almost unknown in ﬁnance. The idea is simple, ﬁnd approximate solutions to a complicated
problem by exploiting parameters or variables that are either large or small, or special in some
way. For example, there are simple approximations for vanilla option values close to expiry.
Green’s functions
This is a very special technique that only works in certain situations. The idea is that solutions
to some difficult problems can be built up from special solutions of a similar
problem.
A.3 e
The ﬁrst bit of math you need to know about is e.
e is
• a number, 2.7183 . . .
• a function when written $e^x$; this function is a.k.a. $\exp(x)$
The function $e^x$ is just the number 2.7183 . . . raised to the power $x$; $e^2$ is just $(2.7183\ldots)^2 = 7.3891\ldots$, $e^1$ is 2.7183 . . . and $e^0 = 1$. What about non-integer powers?
The function $e^x$ can be written as an infinite series
$$e^x = 1 + x + \tfrac{1}{2}x^2 + \tfrac{1}{6}x^3 + \cdots = \sum_{i=0}^{\infty} \frac{x^i}{i!}.$$
This gets around the non-integer power problem.
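A quick way to convince yourself is to add up the first few terms of the series and compare with a library value. The sketch below does that; the function name `exp_series` and the cut-off of 30 terms are arbitrary choices for illustration.

```python
import math

def exp_series(x, n_terms=30):
    """Sum the first n_terms of 1 + x + x^2/2! + x^3/3! + ..."""
    total = 0.0
    term = 1.0  # the i = 0 term, x^0 / 0!
    for i in range(n_terms):
        total += term
        term *= x / (i + 1)  # turn x^i / i! into x^(i+1) / (i+1)!
    return total

# Integer and non-integer powers are handled in exactly the same way.
for x in (0.0, 1.0, 2.0, 0.5, -1.3):
    print(x, exp_series(x), math.exp(x))
```

The series needs nothing beyond adding and multiplying, which is why it works just as well for $x = 0.5$ as for $x = 2$.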
A plot of $e^x$ as a function of $x$ is shown in Figure A.1.
The function $e^x$ has the special property that the slope or gradient of the function is also $e^x$.
Plot this slope as a function of $x$ and for $e^x$ you get the same curve again. It follows that the
slope of the slope is also $e^x$, etc. etc.
Figure A.1 The function $e^x$ (horizontal axis: $x$; vertical axis: $\exp(x)$).
Figure A.2 The function $\log x$ (horizontal axis: $x$; vertical axis: $\log(x)$).
A.4 log
Take the plot of $e^x$ in Figure A.1 and rotate it about a 45° line to get Figure A.2. This new
function is $\ln x$, the Naperian logarithm of $x$. The relationship between ln and e is
$$e^{\ln x} = x \quad\text{or}\quad \ln(e^x) = x.$$
So, in a sense, they are inverses of each other.
The function $\ln x$ is also often denoted by $\log x$, as in this book. Sometimes $\log x$ refers to
the function with the properties
$$10^{\log x} = x \quad\text{and}\quad \log(10^x) = x.$$
This function would be called ‘logarithm base ten.’ The most useful logarithm has base $e = 2.7183\ldots$ because of the properties of the gradient of $e^x$.
The slope of the $\log x$ function is $x^{-1}$.
From Figure A.2 you can see that there don’t appear to be any values for log x for negative
x. The function can be deﬁned for these but you’d need to know about complex numbers,
something we won’t be requiring here.
A.5 DIFFERENTIATION AND TAYLOR SERIES
I’ve introduced the idea of a gradient or slope in the sections above. If we have a function
denoted by f (x), then we denote the gradient of this function at the point x by
$$\frac{df}{dx}.$$
Mathematically the slope is defined as
$$\frac{df}{dx} = \lim_{\delta x \to 0} \frac{f(x + \delta x) - f(x)}{\delta x}.$$
The action of ﬁnding the gradient is also called ‘differentiating’ and the slope can also be called
the ‘derivative’ of the function. This use of ‘derivative’ shouldn’t be confused with the use
meaning an option contract.
The slope can also be differentiated, resulting in a second derivative of the function f (x).
This is denoted by
$$\frac{d^2 f}{dx^2}.$$
We can take this differentiation to higher and higher orders.
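The definition can be tried out numerically by taking a small but non-zero $\delta x$ instead of the limit. The sketch below is only an illustration: the helper names `slope` and `second_slope` and the step sizes are assumptions. It checks the two slopes quoted earlier, that of $e^x$ and that of $\log x$.

```python
import math

def slope(f, x, dx=1e-6):
    """Estimate df/dx from the definition, with a small but finite dx."""
    return (f(x + dx) - f(x)) / dx

def second_slope(f, x, dx=1e-4):
    """Estimate d2f/dx2: the difference of two neighbouring slopes over dx."""
    return (f(x + dx) - 2.0 * f(x) + f(x - dx)) / dx**2

print("slope of exp at 1  :", slope(math.exp, 1.0), "vs", math.exp(1.0))
print("slope of log at 2  :", slope(math.log, 2.0), "vs", 1.0 / 2.0)
print("second slope of exp:", second_slope(math.exp, 1.0), "vs", math.exp(1.0))
```

The agreement is close but not exact, because $\delta x$ is small rather than zero.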
Take a look at Figure A.3. In particular, note the two dots marked on the bold curve. The
bold curve is the function f (x). The dot on the left is at the point x on the horizontal axis and
the function value is f (x), the distance up the vertical axis. The dot to the right of this is at
x + δx with function value f (x + δx). What can we say about the vertical distance between
the two dots in terms of the horizontal distance?
Start with a trivial example. If the distance δx is zero then the vertical distance is also zero.
Now consider a very small but non-zero δx.
The straight line tangential to the bold curve f (x) at the point x is shown in the ﬁgure. This
line has slope df/dx evaluated at x. Notice that the right-hand hollow dot is almost on this
bold line. This suggests that a good approximation to the value f (x + δx) is
$$f(x + \delta x) \approx f(x) + \delta x \, \frac{df}{dx}(x).$$
Figure A.3 A schematic diagram of Taylor series, showing the curve $f(x)$ together with the linear and quadratic approximations to the curve between $x$ and $x + \delta x$.
This is a linear relationship between f (x + δx) − f (x) and δx. This makes sense since on
rearranging we get
$$\frac{df}{dx} \approx \frac{f(x + \delta x) - f(x)}{\delta x}$$
which as δx goes to zero becomes our earlier deﬁnition of the gradient.
But the right-hand hollow dot is not exactly on the straight line. It is slightly above it.
Perhaps a quadratic relationship between f (x + δx) − f (x) and δx would be a more accurate
approximation. This is indeed true (provided δx is small enough) and we can write
$$f(x + \delta x) \approx f(x) + \delta x \, \frac{df}{dx}(x) + \tfrac{1}{2} \delta x^2 \, \frac{d^2 f}{dx^2}(x).$$
This approximation, shown on the ﬁgure as the grey dot, is more accurate. One can take this
approximation to cubic, quartic, . . . The Taylor series representation of f (x + δx) is the inﬁnite
sum
$$f(x + \delta x) = f(x) + \sum_{i=1}^{\infty} \frac{1}{i!} \, \delta x^i \, \frac{d^i f}{dx^i}(x).$$
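To see how much the quadratic term buys you, the following sketch compares the linear and quadratic approximations for the one function whose derivatives we already know to all orders, $e^x$. The function name `taylor_exp` and the sample values of $\delta x$ are illustrative assumptions.

```python
import math

def taylor_exp(x, dx, order):
    """Taylor approximation of exp(x + dx) about x, up to the given order.

    Every derivative of exp is exp itself, so the i-th term is
    dx^i / i! * exp(x).
    """
    return sum(dx**i / math.factorial(i) * math.exp(x) for i in range(order + 1))

x = 1.0
for dx in (0.5, 0.1, 0.01):
    exact = math.exp(x + dx)
    linear_error = abs(exact - taylor_exp(x, dx, 1))
    quadratic_error = abs(exact - taylor_exp(x, dx, 2))
    print(f"dx={dx}: linear error={linear_error:.2e}, "
          f"quadratic error={quadratic_error:.2e}")
```

As $\delta x$ shrinks the quadratic error falls away much faster than the linear one, which is the sense in which each extra term makes the approximation 'more accurate'.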
Taylor series is incredibly useful in derivatives theory, where the function that we are interested in, instead of being f , is V , the value of an option. The independent variable is no longer
x but is S, the price of the underlying asset. From day to day the asset price changes by a
small, random amount. This asset price change is just δS (instead of δx). The ﬁrst derivative
of the option value with respect to the asset is known as the delta, and the second derivative is
the gamma.
The value of an option is not only a function of the asset price S but also the time t: V (S, t).
This brings us into the world of partial differentiation.
Think of the function V (S, t) as a surface with coordinates S and t on a horizontal plane.
The partial derivative of V (S, t) with respect to S is written
$$\frac{\partial V}{\partial S}$$
and is defined as
$$\frac{\partial V}{\partial S} = \lim_{\delta S \to 0} \frac{V(S + \delta S, t) - V(S, t)}{\delta S}.$$
Note that in this $V$ is only ever evaluated at time $t$. This is like measuring the gradient of
the function $V(S, t)$ in the $S$ direction along a constant value of $t$. (Or think of it as the slope of a hill going North. The time derivative would be the slope going West.) The partial derivative of
$V(S, t)$ with respect to $t$ is similarly defined as
$$\frac{\partial V}{\partial t} = \lim_{\delta t \to 0} \frac{V(S, t + \delta t) - V(S, t)}{\delta t}.$$
Higher-order derivatives are defined in the obvious manner.
The Taylor series expansion of the value of an option is then
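$$V(S + \delta S, t + \delta t) \approx V(S, t) + \delta t \, \frac{\partial V}{\partial t} + \delta S \, \frac{\partial V}{\partial S} + \tfrac{1}{2} \delta S^2 \, \frac{\partial^2 V}{\partial S^2} + \cdots.$$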
This series goes on for ever, but I’ve only written down the largest and most important terms,
those which are required for the Black–Scholes analysis.
A.6 EXPECTATION AND VARIANCE
Much of the modeling in ﬁnance uses ideas from probability theory. Again you don’t need to
know that much to understand most of the theory.
The first important idea is that of expectation or mean. If you roll a die there is an equal, $\tfrac{1}{6}$,
probability of each number coming up. What is the expected number or the average number if
you roll the die many times? The answer is
$$1 \times \tfrac{1}{6} + 2 \times \tfrac{1}{6} + 3 \times \tfrac{1}{6} + 4 \times \tfrac{1}{6} + 5 \times \tfrac{1}{6} + 6 \times \tfrac{1}{6} = 3\tfrac{1}{2}.$$
Here we just multiply each of the possible numbers that could turn up by the probability of
each, and sum. Although $3\tfrac{1}{2}$ is the expected value, it cannot, of course, be thrown since only
integers are possible.
Generally, if we have a random variable $X$ (the number thrown, say) which can take any of
the values $x_i$ (1, 2, 3, 4, 5, 6 in our example) for $i = 1, \ldots, N$, each of which has a probability
$P(X = x_i)$ (in the example, $\tfrac{1}{6}$), then the expected value is
$$E[X] = \sum_{i=1}^{N} x_i \, P(X = x_i).$$
Expectations have the following properties:
E[cX] = cE[X]
and
E[X + Y ] = E[X] + E[Y ].
If the outcomes of two random events X and Y have no impact on each other they are said
to be independent. If X and Y are independent we have
E[XY ] = E[X]E[Y ].
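The die example and the properties above can be checked exactly with fractions. This is only a sketch: the names `die` and `expectation` are made up for illustration, and the second die is assumed to be fair and independent of the first.

```python
from fractions import Fraction
from itertools import product

# A fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(dist):
    """E[X] = sum of x_i * P(X = x_i) over all possible values x_i."""
    return sum(x * p for x, p in dist.items())

mean = expectation(die)
print(mean)  # 7/2

# Two independent dice X and Y: each joint probability is the product px * py.
pairs = list(product(die.items(), repeat=2))
e_sum = sum((x + y) * px * py for (x, px), (y, py) in pairs)
e_prod = sum((x * y) * px * py for (x, px), (y, py) in pairs)
print(e_sum == mean + mean)   # E[X + Y] = E[X] + E[Y]
print(e_prod == mean * mean)  # E[XY] = E[X]E[Y] for independent X and Y
```

Using `Fraction` rather than floating point keeps the arithmetic exact, so the comparisons are genuine equalities rather than 'close enough'.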
Expectations are important in ﬁnance because we often want to know what we can expect
to make from an investment on average.
The expectation or mean is also known as the ﬁrst moment of the distribution of the random
variable X. It can be thought of as being a typical value for X. The scatter of values around
the mean can be measured by the second moment about the mean, the variance:
$$\mathrm{Var}(X) = E\left[(X - E[X])^2\right].$$
Variances have the following property:
$$\mathrm{Var}(cX) = c^2 \, \mathrm{Var}(X).$$
When X and Y are independent
Var(X + Y ) = Var(X) + Var(Y ).
The standard deviation is the square root of the variance and is perhaps more useful as a
measure of dispersion since it has the same units as the variable X:
$$\text{Standard deviation}(X) = \sqrt{\mathrm{Var}(X)}.$$
If the standard deviation is small then values of X are concentrated around the mean, E[X].
If the standard deviation is large then values of X are more widely scattered.
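Continuing with the fair die as a worked example, the sketch below computes the variance and standard deviation and checks the two variance properties. The helper names and the choice $c = 3$ are illustrative assumptions, and the second die is again assumed independent of the first.

```python
import math
from fractions import Fraction

die = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(dist):
    """E[X] = sum of x_i * P(X = x_i)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E[(X - E[X])^2]."""
    mean = expectation(dist)
    return sum((x - mean) ** 2 * p for x, p in dist.items())

var_die = variance(die)
std_die = math.sqrt(var_die)
print(var_die)  # 35/12
print(std_die)

# Var(cX) = c^2 Var(X), here with c = 3.
tripled = {3 * face: p for face, p in die.items()}
print(variance(tripled) == 9 * var_die)

# Var(X + Y) = Var(X) + Var(Y) for two independent dice.
two_dice = {}
for x, px in die.items():
    for y, py in die.items():
        two_dice[x + y] = two_dice.get(x + y, 0) + px * py
print(variance(two_dice) == 2 * var_die)
```

Note that the standard deviation comes out in the same units as the number thrown, whereas the variance is in those units squared.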
Standard deviations are important in ﬁnance because they are often used as a measure of
risk in an investment. The higher the standard deviation of investment returns the greater the
dispersion of the returns and the greater the risk.
A.7 ANOTHER LOOK AT BLACK–SCHOLES
Now that we understand about differentiation we can take another look at the Black–Scholes
equation:
$$\frac{\partial V}{\partial t} + \tfrac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS \frac{\partial V}{\partial S} - rV = 0.$$
The option value V (S, t) depends on (or ‘is a function of’) the asset price S and the time t.
The ﬁrst derivative of the option value with respect to time is called the theta:
$$\Theta = \frac{\partial V}{\partial t}.$$
Notice that this is a partial derivative and so theta is the gradient of the option value in the
direction of changing time, asset price ﬁxed. It measures the rate of change of the option value
with time if the asset price doesn’t move, hence the other name ‘time decay.’
The ﬁrst derivative of the option value with respect to the asset price is called the delta:
$$\Delta = \frac{\partial V}{\partial S}.$$
This is the slope in the S direction with time ﬁxed. Asset prices change very rapidly and so
the dominant change in the option value from moment to moment is the delta multiplied by
the change in the asset price. This is just a simple application of Taylor series; the difference
between the option price at time t when the asset is at S and a later time t + δt when the asset
price is S + δS is given by
$$V(S + \delta S, t + \delta t) - V(S, t) = \Delta \, \delta S + \cdots.$$
The · · · are terms which are, generally speaking, smaller than the leading term. They are still
important, as we’ll see in a moment.
Because the change in option value and the change in asset price are so closely linked we
can see that by holding a quantity $\Delta$ of the underlying asset short we can eliminate, to leading
order, fluctuations in our net portfolio value. This is the basis of delta hedging.
The second derivative of the option value with respect to the asset price is called the gamma:
$$\Gamma = \frac{\partial^2 V}{\partial S^2}.$$
This is also just the $S$ derivative of the delta. If the asset changes by an amount $\delta S$ then the
delta changes by an amount $\Gamma \, \delta S$. Thus the gamma is a measure of how much one might have
to rehedge, and gives a measure of the amount of transaction costs from delta hedging.
Now we can interpret all the terms in the Black–Scholes equation, but what does the equation
itself mean?
Written in terms of the greeks, the Black–Scholes equation is
$$\Theta + \tfrac{1}{2} \sigma^2 S^2 \Gamma + rS\Delta - rV = 0.$$
Reordering this we have
$$\Theta = rV - rS\Delta - \tfrac{1}{2} \sigma^2 S^2 \Gamma = r(V - S\Delta) - \tfrac{1}{2} \sigma^2 S^2 \Gamma.$$
When we have a delta hedged position we hold the option with value $V$ and are short $\Delta$ of
the underlying asset. Thus our portfolio value is at any time
$$V - S\Delta.$$
So we can write the Black–Scholes equation in words as
Time decay = (interest received on cash equivalent of portfolio value) − $\tfrac{1}{2} \sigma^2 S^2$ × gamma.
The option value grows by an equivalent of interest that would have been received by a
riskless pure cash position. But the delta hedged option is not a cash position. That’s where the
ﬁnal, gamma, term comes in.
Ignoring the interest on the cash equivalent, the theta and gamma terms add up to zero. Of
course, you can’t ignore this interest unless the portfolio has zero value or rates are zero.
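The balance between time decay, interest and gamma can be checked numerically. The sketch below rests on several assumptions that are not part of the text above: it uses the standard closed-form Black–Scholes value of a European call on an asset paying no dividends, made-up parameter values, and finite differences in place of exact greeks.

```python
import math

# Illustrative parameter values, not taken from the text.
STRIKE, EXPIRY, R, SIGMA = 100.0, 1.0, 0.05, 0.2

def norm_cdf(x):
    """Cumulative distribution function of the standard Normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_value(s, t):
    """Assumed closed-form Black-Scholes value of a European call (no dividends)."""
    tau = EXPIRY - t  # time remaining to expiry
    d1 = (math.log(s / STRIKE) + (R + 0.5 * SIGMA**2) * tau) / (SIGMA * math.sqrt(tau))
    d2 = d1 - SIGMA * math.sqrt(tau)
    return s * norm_cdf(d1) - STRIKE * math.exp(-R * tau) * norm_cdf(d2)

def greeks(s, t, ds=0.01, dt=1e-5):
    """Estimate theta, delta and gamma by central finite differences."""
    v = call_value(s, t)
    theta = (call_value(s, t + dt) - call_value(s, t - dt)) / (2.0 * dt)
    delta = (call_value(s + ds, t) - call_value(s - ds, t)) / (2.0 * ds)
    gamma = (call_value(s + ds, t) - 2.0 * v + call_value(s - ds, t)) / ds**2
    return v, theta, delta, gamma

s, t = 105.0, 0.25
v, theta, delta, gamma = greeks(s, t)

# Left-hand side of the Black-Scholes equation; it should be close to zero.
residual = theta + 0.5 * SIGMA**2 * s**2 * gamma + R * s * delta - R * v
# The reordered form: time decay = interest on (V - S * Delta) minus the gamma term.
interest = R * (v - s * delta)
gamma_term = 0.5 * SIGMA**2 * s**2 * gamma
print("theta                :", theta)
print("interest - gamma term:", interest - gamma_term)
print("equation residual    :", residual)
```

The residual is not exactly zero only because the greeks are estimated with small finite steps rather than true derivatives.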
The delta hedge is only accurate to leading order. If one is hedging with ﬁnite time intervals
between rehedges then there is inevitably a little bit of randomness that we can’t hedge away.
We can see this if we go to higher order in the Taylor series expansion of V (S + δS, t + δt):
$$V(S + \delta S, t + \delta t) - V(S, t) = \Delta \, \delta S + \Theta \, \delta t + \tfrac{1}{2} \Gamma \, \delta S^2 + \cdots.$$
The $\Theta$ term is predictable if we know the time $\delta t$ between hedges (and it has already appeared in
the Black–Scholes equation). But the $\Gamma$ term is multiplied by the random $\delta S^2$. We can’t hedge
this away perfectly. It is, in practice, the source of hedging errors. However, if we rehedge
sufficiently frequently (i.e. $\delta t$ is very small) then the combined effect of the gamma terms is
via an average of the $\delta S^2$. And this average is $\sigma^2 S^2 \, \delta t$. Why is it the average that matters? It’s
like betting on the toss of a biased coin. If you have an advantage then you can exploit it by
betting a small amount but very, very often. In the long run you will certainly win. (In terms
of standard deviations, as the time between hedges decreases so does the standard deviation of
the hedging error accumulated over the life of the option.)
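The claim in parentheses can be explored with a small experiment: delta hedge a long call at 10 and then at 100 equally spaced times, over many simulated asset paths, and compare the spread of the final profit and loss. Everything specific here is an assumption for illustration only: the lognormal random walk with growth rate `MU`, the closed-form Black–Scholes call value and delta, the parameter values and the function names.

```python
import math
import random
import statistics

# Illustrative parameter values, not taken from the text.
STRIKE, EXPIRY, R, SIGMA, MU = 100.0, 1.0, 0.05, 0.2, 0.1

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1(s, t):
    tau = EXPIRY - t
    return (math.log(s / STRIKE) + (R + 0.5 * SIGMA**2) * tau) / (SIGMA * math.sqrt(tau))

def call_value(s, t):
    """Assumed closed-form Black-Scholes value of a European call."""
    tau = EXPIRY - t
    d2 = d1(s, t) - SIGMA * math.sqrt(tau)
    return s * norm_cdf(d1(s, t)) - STRIKE * math.exp(-R * tau) * norm_cdf(d2)

def call_delta(s, t):
    return norm_cdf(d1(s, t))

def hedging_error(n_rehedges, rng, s0=100.0):
    """Final value of: long one call, short delta of the asset, plus cash."""
    dt = EXPIRY / n_rehedges
    s = s0
    delta = call_delta(s, 0.0)
    # Choose the initial cash so that the whole portfolio starts worth zero.
    cash = -(call_value(s, 0.0) - delta * s)
    for step in range(1, n_rehedges + 1):
        phi = rng.gauss(0.0, 1.0)
        s *= math.exp((MU - 0.5 * SIGMA**2) * dt + SIGMA * math.sqrt(dt) * phi)
        cash *= math.exp(R * dt)  # interest on the cash position
        if step < n_rehedges:
            new_delta = call_delta(s, step * dt)
            cash += (new_delta - delta) * s  # shorting more asset brings in cash
            delta = new_delta
    payoff = max(s - STRIKE, 0.0)
    return payoff - delta * s + cash

def error_std(n_rehedges, n_paths=2000, seed=1):
    rng = random.Random(seed)
    errors = [hedging_error(n_rehedges, rng) for _ in range(n_paths)]
    return statistics.pstdev(errors)

std_coarse = error_std(10)
std_fine = error_std(100)
print("std of hedging error, 10 rehedges :", std_coarse)
print("std of hedging error, 100 rehedges:", std_fine)
```

With perfect, continuous hedging the final value would be zero on every path. With finite intervals it is not, but its standard deviation should shrink as the rehedging becomes more frequent.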
We can now see that the gamma term in the Black–Scholes equation is there to balance the higher-order fluctuations in the option value. Naturally, it therefore depends on the magnitude of these
fluctuations, the volatility of the underlying asset.
A.8 SUMMARY
That wasn’t hard, was it?