# 7 Empirical Bayesian credibility theory: Model 2 – the Bühlmann–Straub model

Model based pricing – setting premiums

extension and improvement. The risk during one year may relate to cover for a
small business with four shops and three delivery vans on the road – the business may do well, expand, and next year have six shops and four vans on the
road. The increased estate (buildings, vans, stock) and general activity is not
taken account of by Model 1 but is taken account of by Model 2.
With this recognition of changing risk volumes, it is inappropriate now
to assume that, given the risk parameter, the claims variables are identically
distributed. The assumptions we do make for Model 2 are most conveniently
expressed in a manner which makes them less restrictive than was the case
for Model 1 – and these assumptions are made not about the claims variables
themselves, but about the variables representing claims per unit of risk volume.
So, let $Y_1, Y_2, \ldots, Y_n$ represent the aggregate claims in $n$ successive years for a risk, and let $P_1, P_2, \ldots, P_n$ be the corresponding risk volumes. These risk volumes are known numbers (not random variables) and can be quantified in various ways: for example, numbers of policies in a changing portfolio, numbers of shops in a chain, numbers of vehicles in a fleet, etc. A sensible general measure which can be used (perhaps obvious once mentioned) is the annual premium income the insurer has charged to cover the risk over recent years (provided the premiums were set sensibly to reflect the risk).
We introduce $X_i$ to represent the aggregate claims in year $i$ scaled to take account of the volume of business, that is
$$X_i = Y_i / P_i, \qquad i = 1, 2, \ldots, n, \tag{4.36}$$
so $X_i$ is the aggregate claims per unit of risk volume in year $i$.
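As a small illustration of (4.36), the following R sketch scales aggregate claims by risk volumes. The figures are those of risk 1 in Example 4.15 later in this section; the variable names `y`, `p` and `x` are chosen here for illustration only.

```r
# Aggregate claims Y_i and risk volumes P_i for one risk over n = 5 years
# (values taken from risk 1 of Example 4.15)
y <- c(33, 26, 28, 41, 34)
p <- c(4, 4, 5, 5, 5)

# Claims per unit of risk volume, X_i = Y_i / P_i, as in (4.36)
x <- y / p
print(x)
```

Because R divides vectors element by element, no loop over the years is needed.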
The basic structure of this model is that the distribution of each variable $X_i$, $i = 1, 2, \ldots, n$, depends on the value of a risk parameter $\theta$, which is fixed for that risk but unknown, and is regarded as a random variable with unknown distribution function. It is not appropriate to assume that the $X_i$ are identically distributed, either conditionally (given $\theta$), or unconditionally.
Assumptions
(1) Given $\theta$, the $X_i$, $i = 1, 2, \ldots, n$, are independent.
(2) $E[X_i \mid \theta]$ does not depend on $i$.
(3) $P_i \,\mathrm{Var}[X_i \mid \theta]$ does not depend on $i$.
Under these assumptions we define
$$m(\theta) = E[X_i \mid \theta] \quad\text{and}\quad s^2(\theta) = P_i \,\mathrm{Var}[X_i \mid \theta]. \tag{4.37}$$

To motivate assumption (3), consider a risk which consists of a portfolio of independent policies, and suppose the number of policies in force in year $i$ is $P_i$ (a known number). Suppose also that, for each policy, the aggregate claims in any given year have mean $m(\theta)$ and variance $s^2(\theta)$, where $\theta$ is the risk parameter for all policies involved. Let $Y_i$ be the aggregate claims in year $i$, and let $X_i = Y_i/P_i$, as in (4.36). It is clear that
$$E[Y_i \mid \theta] = P_i\, m(\theta) \quad\text{and}\quad \mathrm{Var}[Y_i \mid \theta] = P_i\, s^2(\theta).$$
Hence $\mathrm{Var}[X_i \mid \theta] = s^2(\theta)/P_i$, and so $P_i \,\mathrm{Var}[X_i \mid \theta] = s^2(\theta)$, thus satisfying the stated assumption (3).
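The portfolio argument above can be checked by simulation. The sketch below is illustrative only: it assumes that, for a fixed $\theta$, each policy's annual claims follow a gamma distribution with mean 2 and variance 4 (a choice made here, not taken from the text), and the helper `sim_x` is hypothetical.

```r
set.seed(1)

# Assumption for illustration only: given theta, each policy's annual claims
# are gamma distributed with mean m = 2 and variance s2 = 4
m  <- 2
s2 <- 4
shape <- m^2 / s2   # gamma mean is shape / rate
rate  <- m / s2     # gamma variance is shape / rate^2

# Simulate nsim values of X_i = Y_i / P_i for a portfolio of P policies
sim_x <- function(P, nsim = 20000) {
  claims <- matrix(rgamma(nsim * P, shape = shape, rate = rate), nrow = nsim)
  rowSums(claims) / P
}

for (P in c(5, 20, 80)) {
  x <- sim_x(P)
  # P * Var[X_i | theta] should stay close to s2 whatever the volume P
  cat("P =", P, " mean(X) =", round(mean(x), 3),
      " P * var(X) =", round(P * var(x), 3), "\n")
}
```

The sample mean of the simulated $X_i$ should stay near $m(\theta)$ and $P_i$ times the sample variance near $s^2(\theta)$ for each volume, which is what assumptions (2) and (3) require.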
The pure premium per unit of risk volume is $m(\theta)$, and we want to estimate the expected value of the aggregate claims in the coming year, namely $E[Y_{n+1} \mid \theta] = P_{n+1}\, m(\theta)$ (we assume we know $P_{n+1}$ at the start of year $n + 1$). So our problem is again to estimate $m(\theta)$, given the data $(y_1, P_1), (y_2, P_2), \ldots, (y_n, P_n)$, and of course, derived from these, $x_1, x_2, \ldots, x_n$. As in the case of Model 1 we want to find the estimator of $m(\theta)$ which is the linear function of the observations $X_1, X_2, \ldots, X_n$ with minimum mean square error; we want to choose $a_0, a_1, \ldots, a_n$ to optimise the estimator of $m(\theta)$ given by $a_0 + a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$.
Theorem 4.14 below gives the derivation of the credibility premium. Again, as in §4.6, and as a preliminary to the proof, we give some expectations we will require (the verifications of the first three are the same as the verifications in the proof of §4.6; the fourth is different, and is deferred to Exercise 4.20).
Lemma 4.13 With the set-up and notation of EBCT Model 2, we have
(i) $E[X_i] = E[m(\theta)]$;
(ii) $E[X_i\, m(\theta)] = E[m^2(\theta)]$;
(iii) $E[X_i X_j] = E[m^2(\theta)]$ for $i \neq j$;
(iv) $E[X_i^2] = \dfrac{1}{P_i} E[s^2(\theta)] + E[m^2(\theta)]$.

Unlike the situation in the previous analysis, the constants $a_1, a_2, \ldots, a_n$ in this case are not equal (since the $X_i$ are not identically distributed).
Theorem 4.14 Let $X_1, X_2, \ldots, X_n$ be a sequence of random variables, each of whose distribution depends on a parameter $\theta$, and which, given $\theta$, are independent, with $E[X_i \mid \theta] = m(\theta)$ and $P_i \,\mathrm{Var}[X_i \mid \theta] = s^2(\theta)$, $i = 1, \ldots, n$. Then the estimator $a_0 + \sum_{j=1}^{n} a_j X_j$ of $m(\theta)$ for which
$$E\left[\left\{ m(\theta) - a_0 - \sum_{j=1}^{n} a_j X_j \right\}^2\right]$$
is minimised is given by
$$Z\bar{X} + (1 - Z)E[m(\theta)],$$
where
$$\bar{X} = \frac{\sum_{i=1}^{n} P_i X_i}{\sum_{i=1}^{n} P_i} \quad\text{and}\quad Z = \frac{\sum_{i=1}^{n} P_i}{\sum_{i=1}^{n} P_i + \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}}.$$
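To make the statement of Theorem 4.14 concrete, here is a minimal R sketch that evaluates $\bar{X}$, $Z$ and the resulting estimate for a single risk. The function name `cred_estimate` and its argument names are hypothetical, and the structural parameter values passed to it are made up for illustration (in practice they are unknown and must be estimated).

```r
# Credibility estimate of m(theta) for a single risk under Model 2.
# x: claims per unit of risk volume; p: risk volumes;
# em, es2, vm: E[m(theta)], E[s^2(theta)], Var[m(theta)], assumed known here.
cred_estimate <- function(x, p, em, es2, vm) {
  xbar <- sum(p * x) / sum(p)            # volume-weighted mean of the X_i
  z    <- sum(p) / (sum(p) + es2 / vm)   # credibility factor Z
  list(xbar = xbar, z = z, estimate = z * xbar + (1 - z) * em)
}

# Hypothetical inputs: the three structural parameter values are made up
x <- c(8.25, 6.5, 5.6, 8.2, 6.8)
p <- c(4, 4, 5, 5, 5)
res <- cred_estimate(x, p, em = 7.5, es2 = 5, vm = 1)
print(res)
```

Returning a list keeps the intermediate quantities $\bar{X}$ and $Z$ visible alongside the final estimate.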

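Proof Let
$$S = E\left[\{ m(\theta) - a_0 - a_1 X_1 - a_2 X_2 - \cdots - a_n X_n \}^2\right].$$
Taking the partial derivative of $S$ with respect to $a_0$ gives
$$\frac{\partial S}{\partial a_0} = 0 \;\Rightarrow\; E\left[m(\theta) - a_0 - a_1 X_1 - a_2 X_2 - \cdots - a_n X_n\right] = 0.$$
Noting that $E[X_i] = E[m(\theta)]$ (by Lemma 4.13(i)), this gives
$$a_0 = (1 - a_1 - a_2 - \cdots - a_n)E[m(\theta)].$$
For $j = 1, 2, \ldots, n$ we have
$$\frac{\partial S}{\partial a_j} = 0 \;\Rightarrow\; E\left[X_j \{ m(\theta) - a_0 - a_1 X_1 - a_2 X_2 - \cdots - a_n X_n \}\right] = 0.$$
This gives
$$E[X_j\, m(\theta)] = a_0 E[X_j] + \sum_{i \neq j} a_i E[X_i X_j] + a_j E[X_j^2],$$
from which, using Lemma 4.13, we have
$$E[m^2(\theta)] = a_0 E[m(\theta)] + \sum_{i=1}^{n} a_i E[m^2(\theta)] + a_j E[s^2(\theta)]/P_j.$$
Using the expression for $a_0$ above and Lemma 4.13, and after some algebra, we find
$$a_j = \left(1 - \sum_{i=1}^{n} a_i\right) \frac{\mathrm{Var}[m(\theta)]}{E[s^2(\theta)]}\, P_j.$$
Hence, summing from $j = 1$ to $n$, we have
$$\sum_{j=1}^{n} a_j = \left(1 - \sum_{i=1}^{n} a_i\right) \frac{\mathrm{Var}[m(\theta)]}{E[s^2(\theta)]} \sum_{j=1}^{n} P_j,$$
which gives
$$a_0 = \frac{E[m(\theta)]\, \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}}{\sum_{i=1}^{n} P_i + \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}},$$
and
$$a_j = \frac{P_j}{\sum_{i=1}^{n} P_i + \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}}, \qquad j = 1, 2, \ldots, n.$$
Putting these expressions into $a_0 + a_1 X_1 + \cdots + a_n X_n$, we find that the “best” linear estimator is given by
$$\frac{E[m(\theta)]\, \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]} + \sum_{i=1}^{n} P_i X_i}{\sum_{i=1}^{n} P_i + \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}}.$$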

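Table 4.10. Collective of risks for Model 2

| Risk | Year 1 | Year 2 | $\cdots$ | Year $n$ |
|------|--------|--------|----------|----------|
| 1 | $Y_{11}; P_{11}$ | $Y_{12}; P_{12}$ | $\cdots$ | $Y_{1n}; P_{1n}$ |
| 2 | $Y_{21}; P_{21}$ | $Y_{22}; P_{22}$ | $\cdots$ | $Y_{2n}; P_{2n}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | | $\vdots$ |
| $N$ | $Y_{N1}; P_{N1}$ | $Y_{N2}; P_{N2}$ | $\cdots$ | $Y_{Nn}; P_{Nn}$ |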

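This may be written as
$$Z\bar{X} + (1 - Z)E[m(\theta)],$$
where
$$Z = \frac{\sum_{i=1}^{n} P_i}{\sum_{i=1}^{n} P_i + \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}} \quad\text{and}\quad \bar{X} = \frac{\sum_{i=1}^{n} P_i X_i}{\sum_{i=1}^{n} P_i},$$
and the result is proved.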
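The algebra in the proof can be checked numerically. The sketch below builds the $n + 1$ first-order conditions ($\partial S/\partial a_0 = 0$ and $\partial S/\partial a_j = 0$) as a linear system, using the moments in Lemma 4.13, solves it with `solve()`, and compares the answer with the closed-form coefficients. The risk volumes and the three structural parameter values are hypothetical, chosen only to make the system concrete.

```r
p   <- c(4, 4, 5, 5, 5)   # risk volumes P_1, ..., P_n (illustrative)
em  <- 7.5                # hypothetical E[m(theta)]
es2 <- 5                  # hypothetical E[s^2(theta)]
vm  <- 1                  # hypothetical Var[m(theta)]
n   <- length(p)
em2 <- vm + em^2          # E[m^2(theta)]

# Unknowns are (a_0, a_1, ..., a_n); row 1 is the a_0 condition
A <- matrix(0, n + 1, n + 1)
b <- numeric(n + 1)
A[1, ] <- c(1, rep(em, n))
b[1]   <- em
for (j in 1:n) {
  A[j + 1, 1]     <- em                  # a_0 * E[X_j]
  A[j + 1, -1]    <- em2                 # a_i * E[X_i X_j] for i != j
  A[j + 1, j + 1] <- em2 + es2 / p[j]    # a_j * E[X_j^2], Lemma 4.13(iv)
  b[j + 1]        <- em2                 # E[X_j m(theta)]
}
a_numeric <- solve(A, b)

# Closed-form solution obtained in the proof
k <- es2 / vm
a_closed <- c(em * k / (sum(p) + k), p / (sum(p) + k))
print(cbind(a_numeric, a_closed))
```

The two columns should agree to numerical precision, with the coefficients $a_1, \ldots, a_n$ proportional to the risk volumes.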
Notes
(1) $\bar{X}$ can of course also be written as $\bar{X} = \dfrac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} P_i}$.
(2) The coefficient $a_j$ of $X_j$ in the optimal estimator is proportional to $P_j$, the risk volume for that year. This makes good sense, as the claims data for years with higher risk volumes should have greater influence on the value of the estimator.
(3) In the case that the risk volumes are all equal, Model 2 is, in practice, the same as Model 1; putting all the risk volumes $P_i$ equal to 1 gives
$$Z = \frac{n}{n + \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}},$$
which is exactly the expression we had in the case of Model 1.
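The three notes can be illustrated with a few lines of R. This is only a sketch: the ratio `k` standing for $E[s^2(\theta)]/\mathrm{Var}[m(\theta)]$ is a made-up value, and the helper `z_model2` is hypothetical.

```r
k <- 5   # hypothetical value of E[s^2(theta)] / Var[m(theta)]

# Credibility factor of Model 2 for a vector of risk volumes p
z_model2 <- function(p, k) sum(p) / (sum(p) + k)

# Note (3): with all volumes equal to 1, Z reduces to n / (n + k)
n <- 5
z_equal  <- z_model2(rep(1, n), k)
z_model1 <- n / (n + k)

# Note (1): the weighted mean of the X_i equals sum(Y) / sum(P)
y <- c(33, 26, 28, 41, 34)
p <- c(4, 4, 5, 5, 5)
x <- y / p
xbar_weighted <- sum(p * x) / sum(p)
xbar_ratio    <- sum(y) / sum(p)

# Note (2): the coefficients a_j are proportional to the volumes P_j
a <- p / (sum(p) + k)
print(a / p)   # the same constant for every year
```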
We now consider how to estimate the three structural parameters $E[m(\theta)]$, $E[s^2(\theta)]$ and $\mathrm{Var}[m(\theta)]$ using data from a collective of $N$ (fixed) comparable risks. Our data consist of values $(y_{ij}, P_{ij})$, where $y_{ij}$ is an observation of a random variable $Y_{ij}$ which represents the aggregate claims for risk $i$ in year $j$, $i = 1, 2, \ldots, N$, $j = 1, 2, \ldots, n$. We present the data in the cells in Table 4.10 in the form $Y_{ij}; P_{ij}$.

For each $i$ and $j$ define $X_{ij} = Y_{ij}/P_{ij}$. For each risk, say risk $i$, the distribution of each $X_{ij}$, $j = 1, 2, \ldots, n$, depends on a risk parameter $\theta_i$, which is fixed for that risk, but unknown.
For each risk, say risk $i$, we make the following distributional assumptions.
Assumptions
(1) Given $\theta_i$, the $X_{ij}$, $j = 1, 2, \ldots, n$, are independent.
(2) $E[X_{ij} \mid \theta_i]$ does not depend on $j$.
(3) $P_{ij} \,\mathrm{Var}[X_{ij} \mid \theta_i]$ does not depend on $j$.
Under these assumptions we define
$$m(\theta_i) = E[X_{ij} \mid \theta_i] \quad\text{and}\quad s^2(\theta_i) = P_{ij} \,\mathrm{Var}[X_{ij} \mid \theta_i].$$
Each risk therefore has the same structure as that of the single risk we considered earlier; this gives us the within risk structure we require.
We now make assumptions to give us the appropriate between risk structure:
Assumptions (continued)
(4) For different risks $i \neq j$, the pairs of variables $(\theta_i, X_{il})$ and $(\theta_j, X_{jk})$, $l, k = 1, 2, \ldots, n$, are independent.
(5) The risk parameters $\theta_i$, $i = 1, 2, \ldots, N$, are iid.
Since $\theta_i$, $i = 1, 2, \ldots, N$, are identically distributed, it follows that none of $E[m(\theta_i)]$, $E[s^2(\theta_i)]$ or $\mathrm{Var}[m(\theta_i)]$ depend on $i$, and so we write them as $E[m(\theta)]$, $E[s^2(\theta)]$ and $\mathrm{Var}[m(\theta)]$, respectively.
We now seek estimators of these three structural parameters, and at this point it is helpful to introduce some new notation: we will adopt the statistical convention of using a clear point (•) in place of a subscript to indicate that we have summed over that subscript (so, for example, the sum of $x_{31}, x_{32}, x_{33}, \ldots, x_{3n}$ is denoted $x_{3\bullet}$, the sum of $y_{15}, y_{25}, y_{35}, \ldots, y_{m5}$ is denoted $y_{\bullet 5}$, and the sum of $z_{ij}$ over all values of $i$ and $j$ is denoted $z_{\bullet\bullet}$).
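In R the point notation corresponds to row, column and grand totals of a matrix. The short sketch below uses a made-up $2 \times 3$ array (rows indexed by $i$, columns by $j$) purely to show the correspondence.

```r
# Hypothetical 2 x 3 array z_ij (rows i, columns j)
z <- matrix(c(1, 2, 3,
              4, 5, 6), nrow = 2, byrow = TRUE)

z_i_dot  <- rowSums(z)   # z_{i.}: summed over j, one value per row i
z_dot_j  <- colSums(z)   # z_{.j}: summed over i, one value per column j
z_dotdot <- sum(z)       # z_{..}: summed over both subscripts

print(z_i_dot)
print(z_dot_j)
print(z_dotdot)
```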
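We now have
$$\bar{X}_i = \frac{\sum_{j=1}^{n} P_{ij} X_{ij}}{\sum_{j=1}^{n} P_{ij}} = \frac{\sum_{j=1}^{n} Y_{ij}}{\sum_{j=1}^{n} P_{ij}} = \frac{Y_{i\bullet}}{P_{i\bullet}}, \tag{4.38}$$
$$\bar{X} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{n} P_{ij} X_{ij}}{\sum_{i=1}^{N}\sum_{j=1}^{n} P_{ij}} = \frac{\sum_{i=1}^{N} P_{i\bullet} \bar{X}_i}{\sum_{i=1}^{N}\sum_{j=1}^{n} P_{ij}} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{n} Y_{ij}}{\sum_{i=1}^{N}\sum_{j=1}^{n} P_{ij}} = \frac{Y_{\bullet\bullet}}{P_{\bullet\bullet}} \tag{4.39}$$
and finally
$$P^{*} = \frac{1}{Nn - 1} \sum_{i=1}^{N} P_{i\bullet}\left(1 - \frac{P_{i\bullet}}{P_{\bullet\bullet}}\right). \tag{4.40}$$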
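The quantities in (4.38) to (4.40) are straightforward to compute from matrices of claims and volumes. The sketch below uses the data of Example 4.15 (Table 4.12), entered row by row with one row per risk; the object names are chosen here for illustration.

```r
# Data of Example 4.15 (Table 4.12): rows are risks, columns are years
y <- matrix(c( 33,  26,  28,  41,  34,
               22,  16,  19,  29,  33,
              114, 117, 116, 171, 139,
               77,  74,  59,  86,  98), nrow = 4, byrow = TRUE)
p <- matrix(c( 4,  4,  5,  5,  5,
               3,  2,  3,  4,  5,
              16, 19, 18, 22, 22,
               8,  8,  7, 10, 12), nrow = 4, byrow = TRUE)
N <- nrow(y)
n <- ncol(y)

ptot  <- rowSums(p)                # P_{i.}
xibar <- rowSums(y) / ptot         # (4.38): weighted mean for each risk
xbar  <- sum(y) / sum(p)           # (4.39): overall weighted mean
pstar <- sum(ptot * (1 - ptot / sum(p))) / (N * n - 1)   # (4.40)

print(round(xibar, 4))
print(round(xbar, 4))
print(round(pstar, 4))
```

These values can be compared with the intermediate results quoted in Example 4.15.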

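Table 4.11. Usual estimators of the structural parameters in EBCT Model 2

| Structural parameter | Estimator |
|----------------------|-----------|
| $E[m(\theta)]$ | $\bar{X}$ |
| $E[s^2(\theta)]$ | $\dfrac{1}{N}\sum_{i=1}^{N} \dfrac{1}{n-1} \sum_{j=1}^{n} P_{ij}(X_{ij} - \bar{X}_i)^2$ |
| $\mathrm{Var}[m(\theta)]$ | $\dfrac{1}{P^{*}}\left\{ \dfrac{1}{Nn-1} \sum_{i=1}^{N}\sum_{j=1}^{n} P_{ij}(X_{ij} - \bar{X})^2 - \dfrac{1}{N}\sum_{i=1}^{N} \dfrac{1}{n-1} \sum_{j=1}^{n} P_{ij}(X_{ij} - \bar{X}_i)^2 \right\}$ |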

It is important to note our notation here: the mean of the claims (per unit of risk volume) for an individual risk (risk $i$) is now denoted $\bar{X}_i$ and is a weighted mean, the weights being the risk volumes (see the definition of $\bar{X}_i$ above). The symbol $\bar{X}$ now denotes the overall (weighted) mean claims (per unit of risk volume) for all risks involved.
The credibility premium for risk $i$ now appears as
$$Z_i \bar{X}_i + (1 - Z_i)E[m(\theta)], \tag{4.41}$$
where $\bar{X}_i$ is given by (4.38) and
$$Z_i = \frac{P_{i\bullet}}{P_{i\bullet} + \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}}. \tag{4.42}$$

We give the usual estimators for the structural parameters in Table 4.11. These estimators are unbiased. It is easy to verify that $\bar{X}$ is unbiased for $E[m(\theta)]$:
$$E[X_{ij}] = E[E(X_{ij} \mid \theta_i)] = E[m(\theta_i)] = E[m(\theta)],$$
and it follows immediately that
$$E[\bar{X}] = E\left[\frac{\sum_{i=1}^{N}\sum_{j=1}^{n} P_{ij} X_{ij}}{\sum_{i=1}^{N}\sum_{j=1}^{n} P_{ij}}\right] = E[m(\theta)].$$

We defer the unbiasedness of the estimator of $E[s^2(\theta)]$ to Exercise 4.21.
Notes
(1) The estimators revert to those of Model 1 in the case that the risk volumes are the same for all risks and years. Setting $P_{ij} = 1$ for all $i, j$ we have $\sum_{j=1}^{n} P_{ij} = n$ and $P^{*} = [n(N - 1)]/(Nn - 1)$ (see Exercise 4.22).

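Table 4.12. Aggregate claims/volumes of business for the four risks in Example 4.15

| Risk | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 |
|------|--------|--------|--------|--------|--------|
| 1 | 33 ; 4 | 26 ; 4 | 28 ; 5 | 41 ; 5 | 34 ; 5 |
| 2 | 22 ; 3 | 16 ; 2 | 19 ; 3 | 29 ; 4 | 33 ; 5 |
| 3 | 114 ; 16 | 117 ; 19 | 116 ; 18 | 171 ; 22 | 139 ; 22 |
| 4 | 77 ; 8 | 74 ; 8 | 59 ; 7 | 86 ; 10 | 98 ; 12 |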

(2) While the estimates of $E[m(\theta)]$, $E[s^2(\theta)]$ and $\mathrm{Var}[m(\theta)]$ only have to be calculated once for the collective (and hence the expression $E[s^2(\theta)]/\mathrm{Var}[m(\theta)]$ is the same for all risks), the credibility factors $Z_i$ are different for different risks, since the expression for $Z_i$ involves $P_{i\bullet}$, the total risk volume for that risk.
(3) $Z_i$ is an increasing function of $P_{i\bullet}$: high risk volume for a risk implies high credibility factor for that risk.
(4) As with Model 1, this model can be applied to estimating the expected number of claims in the coming year rather than finding the credibility premium. This is the case if the variables $Y_{ij}$ represent the number of claims for the risk rather than the aggregate claims.
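Two of the notes above lend themselves to a quick numerical check: the reduction of $P^{*}$ when every $P_{ij} = 1$, and the monotonicity of $Z_i$ in $P_{i\bullet}$. The sketch below is illustrative; it takes $N = 4$ and $n = 5$, and uses the ratio 5.1965 and the total volumes quoted in Example 4.15 for the monotonicity check. The helper `z_factor` is hypothetical.

```r
N <- 4
n <- 5

# Note (1): with every P_ij equal to 1, P* reduces to n(N - 1) / (Nn - 1)
p <- matrix(1, N, n)
ptot  <- rowSums(p)
pstar <- sum(ptot * (1 - ptot / sum(p))) / (N * n - 1)
print(c(pstar, n * (N - 1) / (N * n - 1)))

# Note (3): Z_i increases with the total risk volume P_{i.}
k <- 5.1965                         # ratio quoted in Example 4.15
z_factor <- function(ptot, k) ptot / (ptot + k)
vol <- c(17, 23, 45, 97)            # total volumes of the example, sorted
print(round(z_factor(vol, k), 4))
```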
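To sum up, the credibility premium for risk $i$ in the collective is given by
$$Z_i \bar{X}_i + (1 - Z_i)\bar{X},$$
where
$$\bar{X}_i = \frac{\sum_{j=1}^{n} P_{ij} X_{ij}}{\sum_{j=1}^{n} P_{ij}} \quad\text{and}\quad Z_i = \frac{\sum_{j=1}^{n} P_{ij}}{\sum_{j=1}^{n} P_{ij} + \dfrac{E[s^2(\theta)]}{\mathrm{Var}[m(\theta)]}},$$
with the structural parameters estimated as above.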
Example 4.15 Table 4.12 gives the aggregate claims $Y_{ij}$, $i = 1, 2, 3, 4$, $j = 1, 2, 3, 4, 5$, in five successive years from comparable policies covering the estate (buildings, vehicles, stock) of four medium-sized companies. The level of activity for each company has been changing from year to year, and for each company and year loss adjusters have given a quantitative assessment ($P_{ij}$) of the relative volume of business (the exposure covered by the policy). The claims are inflation-adjusted and are in units of £1000. The cells in the table contain the data in the form $Y_{ij}; P_{ij}$.
We will calculate the credibility premium per unit of risk volume to be charged in the coming year for each risk. We are assuming that the conditions which have held for the past five years justify our adopting the structural assumptions which underpin EBCT Model 2 (for example, we are assuming that the claims per unit of risk volume from year to year for a particular company have constant mean).

Table 4.13. The $X_{ij}$ for the four risks in Example 4.15

| Risk | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 |
|------|--------|--------|--------|--------|--------|
| 1 | 8.2500 | 6.5000 | 5.6000 | 8.2000 | 6.8000 |
| 2 | 7.3333 | 8.0000 | 6.3333 | 7.2500 | 6.6000 |
| 3 | 7.1250 | 6.1579 | 6.4444 | 7.7727 | 6.3182 |
| 4 | 9.6250 | 9.2500 | 8.4286 | 8.6000 | 8.1667 |

Table 4.14. Intermediate calculations for Example 4.15

| Risk | $Y_{i\bullet}$ | $P_{i\bullet}$ | $\bar{X}_i$ | $\sum_{j=1}^{5} P_{ij}(X_{ij} - \bar{X}_i)^2$ | $\sum_{j=1}^{5} P_{ij}(X_{ij} - \bar{X})^2$ |
|------|------|------|------|------|------|
| 1 | 162 | 23 | 7.0435 | 24.407 | 26.148 |
| 2 | 119 | 17 | 7.0000 | 4.7167 | 6.4431 |
| 3 | 657 | 97 | 6.7732 | 37.653 | 66.516 |
| 4 | 394 | 45 | 8.7556 | 13.155 | 106.06 |

The calculations for this example were done using R (see the computing
recipes listed after this example), and the intermediate results are quoted to
five significant figures.
First we calculate the $X_{ij}$ (these are given in Table 4.13). Next we calculate the values of $Y_{i\bullet}$, $P_{i\bullet}$, $\bar{X}_i$ (using (4.38)) and $\bar{X}$ (using (4.39)), followed by the sums we require for the calculation of the estimates; these intermediate values are shown in Table 4.14.
For the calculations in the last column of Table 4.14 we require the value
$$\bar{X} = Y_{\bullet\bullet}/P_{\bullet\bullet} = 1332/182 = 7.3187.$$
We will also require $P^{*}$, which, by (4.40), is given by
$$\{23(1 - 23/182) + 17(1 - 17/182) + 97(1 - 97/182) + 45(1 - 45/182)\}/19 = 6.0359.$$
From Table 4.11, the estimate of $E[m(\theta)]$ is 7.3187, and the estimate of $E[s^2(\theta)]$ is given by
$$(24.407 + 4.7167 + 37.653 + 13.155)/(4 \times 4) = 4.9957.$$
Further, the estimate of $\mathrm{Var}[m(\theta)]$ is given by
$$[(26.148 + 6.4431 + 66.516 + 106.06)/19 - 4.9957]/6.0359 = 0.96134$$
(note that the value from R, which is computed without rounding the intermediate results, is 0.96137). The ratio $E[s^2(\theta)]/\mathrm{Var}[m(\theta)]$ from R is 5.1965. The credibility factor for risk 1 is $23/(23 + 5.1965) = 0.81570$ (by (4.42)), and the credibility premium per unit of risk volume is
$$0.81570 \times 7.0435 + 0.18430 \times 7.3187 = 7.094.$$
The credibility factors and premiums for all four companies are given in Table 4.15.

Table 4.15. Credibility factors and premiums for the four risks in Example 4.15

| Risk | $Z_i$ | Credibility premium per unit of risk volume (£) |
|------|-------|------|
| 1 | 0.8157 | 7094 |
| 2 | 0.7659 | 7075 |
| 3 | 0.9492 | 6801 |
| 4 | 0.8965 | 8607 |

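The hand calculation can be retraced from the rounded intermediate values in Table 4.14. The sketch below does so in R; it deliberately uses the five-significant-figure values quoted in the example rather than the raw data, so the estimate of $\mathrm{Var}[m(\theta)]$ it produces is the hand value 0.96134, not the full-precision 0.96137. The object names are chosen here for illustration.

```r
N <- 4
n <- 5

# Rounded intermediate values quoted in Table 4.14 and in the example
within <- c(24.407, 4.7167, 37.653, 13.155)   # sums about each risk mean
about  <- c(26.148, 6.4431, 66.516, 106.06)   # sums about the overall mean
ptot   <- c(23, 17, 97, 45)                   # total volume of each risk
xibar  <- c(7.0435, 7.0000, 6.7732, 8.7556)
xbar   <- 7.3187
pstar  <- 6.0359

e2 <- sum(within) / (N * (n - 1))               # estimate of E[s^2(theta)]
e3 <- (sum(about) / (N * n - 1) - e2) / pstar   # estimate of Var[m(theta)]
z    <- ptot / (ptot + e2 / e3)                 # credibility factors
prem <- z * xibar + (1 - z) * xbar              # premium per unit volume (£1000)

print(round(c(e2 = e2, e3 = e3), 5))
print(round(cbind(z, prem), 4))
```

The premiums agree with Table 4.15 to the nearest pound, since the rounding in the intermediate values has little effect on the final figures.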
If the assessments of the risk volumes for companies 1, 2, 3 and 4 for the
coming year are 5, 6, 24 and 11, respectively, then the insurer’s credibility
premiums will be £35 470, £42 450, £163 220 and £94 680.
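The total premiums follow by multiplying each premium per unit of risk volume by the assessed volume for the coming year. A minimal sketch, using the figures from Table 4.15 and rounding to the nearest £10 as the quoted figures appear to be (the rounding convention is an inference, not stated in the text):

```r
prem_per_unit <- c(7094, 7075, 6801, 8607)   # Table 4.15, in pounds
vol_next      <- c(5, 6, 24, 11)             # assessed volumes, coming year

total <- prem_per_unit * vol_next
print(round(total, -1))   # rounded to the nearest 10 pounds
```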
Computing recipes in R for EBCT Model 2
Recipe 1: using a simple sequence of elementary step-by-step calculations for
Example 4.15
Note that # is the comment symbol in R.
n=5 # number of years’ data for each risk
N=4 # number of risks in the collective
y1=c(33,26,28,41,34) # claims for risk 1
p1=c(4,4,5,5,5) # risk volumes for risk 1
x1=y1/p1 # claims per unit risk volume for risk 1
c12 = sum(y1) # total claims for risk 1
c13 = sum(p1) # total risk volumes for risk 1
x1bar=c12/c13

So x1bar contains $\bar{X}_1$. Carry out similar commands for y2, p2, x2, c22, c23, x2bar (containing $\bar{X}_2$); y3, p3, x3, c32, c33, x3bar (containing $\bar{X}_3$); and y4, p4, x4, c42, c43, x4bar (containing $\bar{X}_4$).
c2=c(c12,c22,c32,c42)
c3=c(c13,c23,c33,c43)

c5=c6=1:N*0 # set up two N-element vectors
xbar=sum(c2)/sum(c3)
c5[1]=sum(p1*(x1-x1bar)^2)
c6[1]=sum(p1*(x1-xbar)^ 2)

So xbar, c5[1] and c6[1] contain
$$\bar{X}, \qquad \sum_{j=1}^{5} P_{1j}(X_{1j} - \bar{X}_1)^2 \qquad\text{and}\qquad \sum_{j=1}^{5} P_{1j}(X_{1j} - \bar{X})^2,$$
respectively. Carry out similar commands for c5[2], c5[3], c5[4], c6[2], c6[3] and c6[4].
pstar=sum(c3*(1-c3/sum(c3)))/(N*n-1)
e1 = xbar
e2 = sum(c5)/(N*(n-1))
e3 = (sum(c6)/(N*n-1) - e2)/pstar
z1 = c13/(c13 + e2/e3)
prem1 = z1*x1bar + (1-z1)*xbar

So e1, e2 and e3 contain the estimates of $E[m(\theta)]$, $E[s^2(\theta)]$ and $\mathrm{Var}[m(\theta)]$, respectively, and z1 and prem1 contain the credibility factor and credibility premium, respectively, for risk 1. Carry out similar commands for z2, prem2, z3, prem3, z4 and prem4.
Recipe 2: via a function which uses matrices and a simple loop
First the claims and risk volumes are entered into matrices (here of dimension
4 × 5 and named mex411y and mex411p). The data are entered column by
column.
mex411y=matrix(c(33,22,114,77,26,...,139,98),4,5)
mex411p=matrix(c(4,3,16,8,4,...,22,12),4,5)

where ... denotes other values to be entered. A vector containing the credibility premiums, here called credpremiums, is created by issuing the command
credpremiums=ebctmodel2(4,5,mex411y,mex411p)

which calls up and executes a function called ebctmodel2, previously stored as a text file as follows:
ebctmodel2=function(N,n,my,mp){
mx=my/mp
c2=apply(my,1,sum)
c3=apply(mp,1,sum)

c5=c6=1:N*0
xibar=c2/c3
xbar=sum(c2)/sum(c3)
for (i in 1:N){
c5[i]=sum(mp[i,]*(mx[i,]-xibar[i])^ 2)
c6[i]=sum(mp[i,]*(mx[i,]-xbar)^2)}
pstar=sum(c3*(1-c3/sum(c3)))/(N*n-1)
e1 = xbar
e2 = sum(c5)/(N*(n-1))
e3 = (sum(c6)/(N*n-1) - e2)/pstar
z=c3/(c3+e2/e3)
prem=z*xibar+(1-z)*xbar
prem}
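The loop in Recipe 2 can be avoided with `rowSums`. The following is a sketch of a vectorised alternative, not the text's own function: the name `ebct2_vec` and the returned list are choices made here, and the data matrices are filled in full from Table 4.12 (entered column by column, as in Recipe 2).

```r
# Vectorised sketch of the Model 2 calculation (hypothetical helper, not the
# text's ebctmodel2); my and mp are N x n matrices of claims and risk volumes
ebct2_vec <- function(my, mp) {
  N <- nrow(my)
  n <- ncol(my)
  mx    <- my / mp
  ptot  <- rowSums(mp)
  xibar <- rowSums(my) / ptot
  xbar  <- sum(my) / sum(mp)
  # xibar has length N, so it is recycled down each column (one value per row)
  within <- rowSums(mp * (mx - xibar)^2)
  about  <- sum(mp * (mx - xbar)^2)
  pstar  <- sum(ptot * (1 - ptot / sum(mp))) / (N * n - 1)
  e2 <- sum(within) / (N * (n - 1))
  e3 <- (about / (N * n - 1) - e2) / pstar
  z  <- ptot / (ptot + e2 / e3)
  list(e1 = xbar, e2 = e2, e3 = e3, z = z,
       prem = z * xibar + (1 - z) * xbar)
}

# Data of Table 4.12, entered column by column (one column per year)
my <- matrix(c(33, 22, 114, 77,  26, 16, 117, 74,  28, 19, 116, 59,
               41, 29, 171, 86,  34, 33, 139, 98), 4, 5)
mp <- matrix(c(4, 3, 16, 8,  4, 2, 19, 8,  5, 3, 18, 7,
               5, 4, 22, 10,  5, 5, 22, 12), 4, 5)

res <- ebct2_vec(my, mp)
print(round(res$z, 4))
print(round(res$prem, 3))
```

Returning the three estimates and the credibility factors along with the premiums makes it easier to compare the output with the figures quoted in Example 4.15.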

Exercises
4.1 Let $P$ be the premium for a risk $S$ calculated using the exponential premium principle, with utility function parameter $\beta$ ($> 0$).
(a) By using Jensen’s inequality (see Appendix A) on $E[-e^{\beta S}]$ show that $P \geq E[S]$.
(b) By expanding $E[e^{\beta S}]$ as far as the term in $\beta^2$, show that, for small $\beta$, the results of using the exponential premium principle can be approximated by using the variance principle.
4.2 Show that
(a) … property for independent risks,
(b) the expected value and the standard deviation premium calculation principles satisfy the scale invariance property, but the variance principle does not.
4.3 Let $u(x)$ be a utility function (with $u'(x) > 0$ and $u''(x) < 0$). The risk aversion of an individual or company using this utility function and with wealth $x$ can be measured by the function
$$r(x) = \frac{-u''(x)}{u'(x)},$$
where higher values of $r(x)$ correspond to greater risk aversion (see Appendix A).
Find the risk aversion $r(x)$ under
(a) the log utility function $u(x) = \beta \log x$,