4.2 Computing −1 i,k (Bu i,k − Cw i,k − g k )
164
P. Kumar
where

Aˆ = T A T + Tˆ ,  Bˆ = T B.   (14)

Here T and Tˆ are defined as follows:

T = diag(t 1 , . . . , t |Nh | ),  with t j = 0 if u i,k ( j) ∈ {−1, 1} and t j = 1 otherwise, j = 1, . . . , |Nh |,
Tˆ = Id − T,  Id ∈ R |Nh |×|Nh | ,   (15)
where u i,k ( j) is the jth component of u i,k . In other words, in (14), Aˆ is the matrix obtained from A by replacing each row and column corresponding to an active node (a zero diagonal entry of T ) by the corresponding unit vector e i . Similarly, Bˆ is the matrix obtained from B by annihilating the corresponding rows, and Bˆ T is the matrix obtained from B T by annihilating the corresponding columns.
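As an illustrative sketch (not the author's implementation), the truncation in (14)–(15) can be formed explicitly with dense NumPy arrays; all names here are assumptions.

```python
import numpy as np

def truncate_system(A, B, u):
    """Build T, T_hat = Id - T, A_hat = T A T + T_hat and B_hat = T B
    from (14)-(15); entries of u equal to -1 or +1 mark active nodes."""
    t = np.where(np.isclose(np.abs(u), 1.0), 0.0, 1.0)  # diagonal of T
    T = np.diag(t)
    T_hat = np.eye(len(u)) - T
    A_hat = T @ A @ T + T_hat   # active rows/columns become unit vectors
    B_hat = T @ B               # active rows of B are annihilated
    return T, T_hat, A_hat, B_hat
```

In a real solver T would be kept as a boolean mask rather than a dense diagonal matrix.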
4.3 Computing the Step Length ρ i,k

The step length ρ i,k is computed using a bisection method; we refer the reader to [9, p. 88]. A correctly chosen step length ensures global convergence of the Uzawa method.
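A minimal generic bisection routine of the kind referenced above (illustrative only; the actual merit function determining ρ i,k is problem-specific and not reproduced in this excerpt):

```python
def bisect(f, a, b, tol=1e-12, maxit=200):
    """Find a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(maxit):
        m = 0.5 * (a + b)
        fm = f(m)
        if fm == 0.0 or 0.5 * (b - a) < tol:
            return m
        if fa * fm < 0.0:
            b = m                 # root lies in [a, m]
        else:
            a, fa = m, fm         # root lies in [m, b]
    return 0.5 * (a + b)
```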
4.4 Algebraic Monotone Multigrid for the Obstacle Problem

To solve the quadratic obstacle problem (11), we may use the truncated monotone multigrid method proposed in [11]. However, here we may use algebraic coarsening to create the initial set of interpolation operators. We describe it briefly, following [14].
4.4.1 Aggregation Based Coarsening
We first discuss coarsening for the two-grid method; the multilevel interpolations are obtained by applying it recursively. In the classical two-grid method, a set of coarse grid unknowns is selected, and the matrix entries are used to build interpolation rules that define the prolongation matrix P; the coarse grid matrix Ac is then computed from the Galerkin formula

Ac = P T A P.

In contrast to the classical two-grid approach, in aggregation based multigrid, first a set of aggregates G i is defined. Let |Nh,c | be the total number of such aggregates,
Fast Preconditioned Solver for Truncated Saddle Point Problem …
165
then the interpolation matrix P is defined as follows:

P i j = 1, if i ∈ G j ,  and P i j = 0, otherwise.

Here, 1 ≤ i ≤ |Nh |, 1 ≤ j ≤ |Nh,c |. Further, we assume that the aggregates G i are such that

G i ∩ G j = ∅ for i ≠ j,  and  ∪ i G i = {i ∈ N : 1 ≤ i ≤ |Nh |}.
The matrix P defined above is a |Nh | × |Nh,c | matrix, but since it has only one non-zero entry (equal to one) per row, it is compactly represented by a single array of length |Nh | storing, for each row, the column index of its non-zero entry.
The coarse grid matrix Ac may be computed as follows:

(Ac )(i, j) = Σ k∈G i Σ l∈G j A(k, l),
where 1 ≤ i, j ≤ |Nh,c |, and A(k, l) is the (k, l)th entry of A.
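Given an aggregate map, the compact storage of P and the Galerkin product above can be sketched as follows (names are illustrative):

```python
import numpy as np

def galerkin_coarse(A, agg):
    """agg[i] = index j of the aggregate G_j containing fine node i; this single
    array of length |N_h| is the compact storage of P (one unit entry per row).
    Returns P and Ac with (Ac)_{ij} = sum_{k in G_i} sum_{l in G_j} A(k, l)."""
    n, nc = len(agg), int(agg.max()) + 1
    P = np.zeros((n, nc))
    P[np.arange(n), agg] = 1.0    # P_{ij} = 1 iff i in G_j
    return P, P.T @ A @ P
```

In practice A and P are sparse and the triple product is computed without forming P densely.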
Numerous aggregation schemes have been proposed in the literature, but in this paper we consider the standard aggregation based on strength of connection [21, Appendix A, p. 413], where one first defines the set S i of nodes to which i is strongly negatively coupled, using the strong/weak coupling threshold β:

S i = { j ≠ i | A(i, j) < −β max k≠i |A(i, k)| }.

Then an unmarked node i is chosen, with priority given to a node with minimal M i , where M i is the number of unmarked nodes that are strongly negatively coupled to i. For a complete aggregation algorithm, the reader is referred to Notay [14, 17].
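A simplified greedy variant of this aggregation (using increasing node index instead of the minimal-M i priority rule; see Notay [14, 17] for the full algorithm) could look like:

```python
import numpy as np

def strong_neg_couplings(A, i, beta):
    """S_i = { j != i : A(i, j) < -beta * max_{k != i} |A(i, k)| }."""
    row = A[i].copy()
    row[i] = 0.0
    thresh = -beta * np.abs(row).max()
    return [j for j in range(A.shape[0]) if j != i and A[i, j] < thresh]

def greedy_aggregate(A, beta=0.25):
    """Merge each unmarked node with its unmarked strongly negatively
    coupled neighbours (simplified priority: smallest node index first)."""
    agg = -np.ones(A.shape[0], dtype=int)
    nc = 0
    for i in range(A.shape[0]):
        if agg[i] != -1:
            continue
        agg[i] = nc
        for j in strong_neg_couplings(A, i, beta):
            if agg[j] == -1:
                agg[j] = nc
        nc += 1
    return agg
```

On a 1D Laplacian this pairs neighbouring nodes, which is the qualitative behaviour expected of pairwise aggregation.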
4.5 Preconditioner for Reduced Linear System
In Bosch et al. [6], a preconditioner is proposed in the framework of a semi-smooth Newton method combined with Moreau–Yosida regularization for the same problem. However, that preconditioner was constructed for a linear system different from the one we consider in (13). For convenience of notation, we rewrite the system in (13) as
Ax = b,
where the scripted block matrix A and the vectors x, b are

A = ( Aˆ  Bˆ T ;  Bˆ  −C ),  x = ( x 1 ; x 2 ),  b = ( b 1 ; b 2 ),   (17)
where
x1 = u˜ i,k , x2 = d i,k , b1 = 0, b2 = g + Cwi,k − Bu i,k .
The preconditioner proposed in [6] has the following block lower triangular form
B = ( Aˆ  0 ;  Bˆ  −S ),

where S = C + Bˆ Aˆ −1 Bˆ T is the negative Schur complement. Before we define the preconditioner, it is essential to know whether S is nonsingular.
In the following, we denote the set of truncated nodes by

Nh• = {i : T (i, i) = 0}.
Theorem 1 The negative Schur complement S = C + Bˆ Aˆ −1 Bˆ T is non-singular, in particular SPD, if and only if |Nh• | < |Nh |.

Proof If |Nh• | = |Nh |, then Bˆ is the zero matrix; consequently S = C = τ K is singular, since K corresponds to the stiffness matrix with pure Neumann boundary conditions. For the other implication, we recall that Bˆ T = Mˆ T = −T M, where T is defined in (15). The (i, j)th entry of the element mass matrix is given as follows:
M K i j = ∫ K φ i φ j d x = (|K |/12)(1 + δ i j ),  i, j = 1, 2, 3,   (18)

where δ i j is the Kronecker symbol, that is, it is equal to 1 if i = j, and 0 if i ≠ j. Here φ 1 , φ 2 , and φ 3 are the hat functions on the triangular element K with local numbering, and |K | is the area of the triangle element K . From (18), it is easy to see that

M K = (|K |/12) ( 2 1 1 ; 1 2 1 ; 1 1 2 ).   (19)
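A quick numerical check of (18)–(19) (illustrative; the element area |K| is passed as a parameter):

```python
import numpy as np

def element_mass_matrix(area):
    """P1 element mass matrix on a triangle K: M^K_{ij} = (|K|/12)(1 + delta_ij)."""
    return (area / 12.0) * (np.ones((3, 3)) + np.eye(3))
```

Since the hat functions sum to one on K, the entries of M K sum to |K |, and the matrix is SPD.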
Evidently, the entries of the global mass matrix M = Σ K M K are also all non-negative; hence all entries of the truncated mass matrix Mˆ remain non-negative. In particular, due to our hypothesis |Nh \ Nh• | > 0, there is at least one untruncated column, hence at least a few positive entries. Consequently, Mˆ 1 ≠ 0, i.e., 1 (or span{1}) is neither in the kernel of M nor in the kernel of Mˆ ; in particular, 1 T Mˆ T 1 > 0. The proof of the theorem then follows since C is SPD except on 1, for which Bˆ T 1 is non-zero, and the fact that Aˆ is SPD yields

⟨ Bˆ Aˆ −1 Bˆ T 1, 1⟩ = ⟨ Aˆ −1 ( Bˆ T 1), ( Bˆ T 1)⟩ = ⟨ Aˆ −1 (− Mˆ T 1), (− Mˆ T 1)⟩ > 0.
Note that such preconditioners are also called inexact or preconditioned Uzawa preconditioners for linear saddle point problems. By the block 2 × 2 inversion formula we have

B −1 = ( Aˆ  0 ;  Bˆ  −S ) −1 = ( Aˆ −1  0 ;  S −1 Bˆ Aˆ −1  −S −1 ).
Let Sˆ be an approximation of the Schur complement S in B. The new preconditioner Bˆ and the corresponding preconditioned operator Bˆ −1 A are given as follows:

Bˆ = ( Aˆ  0 ;  Bˆ  − Sˆ ),  Bˆ −1 A = ( I  Aˆ −1 Bˆ T ;  0  Sˆ −1 S ).   (20)
Using (20) above, we note the following trivial result, which justifies the need for a good preconditioner for the Schur complement.

Theorem 2 Let Bˆ defined in (20) be a preconditioner for A defined in (17). Then there are |Nh | eigenvalues of Bˆ −1 A equal to one, and the rest are the eigenvalues of the preconditioned Schur complement Sˆ −1 S.
Remark 1 When using GMRES [20], right preconditioning is preferred. A result similar to Theorem 2 above, stated for the left preconditioner, also holds for the right preconditioner.
The preconditioned system Bˆ −1 Ax = Bˆ −1 b is given as follows:

( I  Aˆ −1 Bˆ T ;  0  Sˆ −1 S ) ( x 1 ; x 2 ) = ( Aˆ −1  0 ;  Sˆ −1 Bˆ Aˆ −1  − Sˆ −1 ) ( b 1 ; b 2 ),

from which we obtain the following set of equations:

x 1 + Aˆ −1 Bˆ T x 2 = Aˆ −1 b 1 ,
Sˆ −1 S x 2 = Sˆ −1 ( Bˆ Aˆ −1 b 1 − b 2 ).
Algorithm 1 Objective: solve Bˆ −1 Ax = Bˆ −1 b
1. Solve for x 2 : Sˆ −1 S x 2 = Sˆ −1 ( Bˆ Aˆ −1 b 1 − b 2 )
2. Set x 1 = Aˆ −1 (b 1 − Bˆ T x 2 )
Here, if a Krylov subspace method is used to solve for x 2 , then a matrix–vector product with S and a solve with Sˆ are needed in each iteration. However, when the problem size is large, it won't be feasible to do an exact solve with Aˆ , and we need to solve it inexactly, for example, using algebraic multigrid methods. In the latter case, the decoupling of x 1 and x 2 as in Algorithm 1 is not possible, and we shall need a matrix–vector product with A in (17) and a solve (forward sweep) with Bˆ . We discuss at the end of this subsection how to take advantage of the special structure of Aˆ .
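With exact (dense) inner solves, Algorithm 1 reduces to the following sketch; in practice the x 2 -solve would be an Sˆ-preconditioned Krylov iteration and the Aˆ-solves would be inexact (AMG). All names are illustrative.

```python
import numpy as np

def block_solve(A_hat, B_hat, C, b1, b2):
    """Solve [A_hat, B_hat^T; B_hat, -C] [x1; x2] = [b1; b2] by eliminating x1."""
    S = C + B_hat @ np.linalg.solve(A_hat, B_hat.T)   # negative Schur complement
    x2 = np.linalg.solve(S, B_hat @ np.linalg.solve(A_hat, b1) - b2)
    x1 = np.linalg.solve(A_hat, b1 - B_hat.T @ x2)
    return x1, x2
```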
As a preconditioner S˜ of S, we choose the preconditioner first proposed in [6]:

S˜ = S 1 Aˆ −1 S 2 = −( Bˆ − τ 1/2 K ) Aˆ −1 ( Bˆ T − τ 1/2 Aˆ ),
where K is the stiffness matrix from (9). We observe that the preconditioned Schur complement S˜ −1 S is not symmetric; in particular, it is not symmetric positive (semi)definite w.r.t. ⟨·, ·⟩ S or w.r.t. ⟨·, ·⟩ S˜ , thus we may not use the preconditioned conjugate gradient method [20, p. 262]. Consequently, we shall use GMRES as in Saad [20, p. 269], which allows nonsymmetric preconditioners. However, it is easy to see that S 1 and S 2 are SPD provided we use mass lumping, which is possible since we use a non-adaptive uniform grid in space, keeping the mass matrices consistent.
4.5.1 Solve with Aˆ
In step 1 of Algorithm 1, we need a solve with Aˆ when constructing the right hand side, and also in step 2. Let P be a permutation matrix; then solving a system of the form Aˆ h = g is equivalent to solving P T Aˆ h = P T g, as P T is nonsingular. With the change of variable P y := h, we then solve for y in

P T Aˆ P y = P T g,   (21)

and we set h = P y to obtain the desired solution. By choosing P to renumber first the nodes corresponding to the coincidence set, we obtain
P T Aˆ P = ( I  0 ;  0  R T P T Aˆ P R ),
where R is the restriction operator that compresses the matrix P T Aˆ P to the untruncated nodes Nh \ Nh• . Here R is given as follows:
R = ( 0 ;  Id ) ∈ R |Nh |×|Nh \Nh• | ,  R T R = Id,

where the zero block has |Nh• | rows and Id is the identity on the |Nh \ Nh• | untruncated nodes.
Let Kˆ = T K T . We have

R T P T Aˆ P R = R T P T ( Kˆ + mˆ mˆ T )P R = R T P T Kˆ P R + R T P T mˆ mˆ T P R,

where

R T P T Kˆ P R = (P T Kˆ P)| Nh \Nh• ,  Kˆ = T K T,  mˆ = T m,

and m is the rank-one term defined in (10). For convenience of notation, we write

R T P T Aˆ P R = K˜ + m˜ m˜ T ,

where K˜ = R T P T Kˆ P R and m˜ = R T P T mˆ . In the new notation, we have

P T Aˆ P = ( I  0 ;  0  K˜ + m˜ m˜ T ).   (22)
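The permutation-and-compression step leading to (22) can be illustrated as follows (helper names are assumptions):

```python
import numpy as np

def compress_to_untruncated(A_hat, truncated):
    """Permute truncated nodes first and return the lower-right block of
    P^T A_hat P, i.e. R^T P^T A_hat P R acting on the untruncated nodes."""
    truncated = np.asarray(truncated, dtype=bool)
    perm = np.concatenate([np.flatnonzero(truncated), np.flatnonzero(~truncated)])
    Ap = A_hat[np.ix_(perm, perm)]       # P^T A_hat P
    k = int((~truncated).sum())
    return perm, Ap, Ap[len(perm) - k:, len(perm) - k:]
```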
Thus (21) now reads

( I  0 ;  0  K˜ + m˜ m˜ T ) ( y 1 ; y 2 ) = P T g =: ( g 1 ; g 2 ),

which reduces to the two sets of equations

y 1 = g 1 ,  ( K˜ + m˜ m˜ T ) y 2 = g 2 .

To solve the latter, we use the following Sherman–Morrison formula:

( K˜ + m˜ m˜ T ) + = K˜ + − ( K˜ + m˜ m˜ T K˜ + ) / ( 1 + m˜ T K˜ + m˜ ).
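Using only a solver for K˜ (in practice an AMG solve), the rank-one update is handled via the formula above; a sketch with an exact stand-in solver:

```python
import numpy as np

def rank_one_solve(K_solve, m, g):
    """Solve (K + m m^T) y = g given a solver y = K_solve(r) for K y = r."""
    Kg, Km = K_solve(g), K_solve(m)
    return Kg - Km * (m @ Kg) / (1.0 + m @ Km)
```

Only two solves with K are needed, and the dense matrix K + m m^T is never formed.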
The aggregation based AMG discussed before can be used to solve with K˜ ; thus, we avoid constructing the prohibitively dense matrix that would result if the rank-one term were added explicitly. We stress here that, in case there are no truncations, i.e., when the set of truncated nodes Nh• = ∅, then K˜ = K is singular, and some AMG methods may fail. This is the reason why we use the pseudo-inverse notation. However, K˜ + may be replaced by K˜ −1 when |Nh• | ≥ 1, because of the following result.
Lemma 1 (Poincaré separation theorem for eigenvalues) Let Z ∈ R n×n be a symmetric matrix with eigenvalues λ 1 ≤ λ 2 ≤ · · · ≤ λ n , and let P be a semi-orthogonal n × k matrix such that P T P = Id ∈ R k×k . Then the eigenvalues μ 1 ≤ μ 2 ≤ · · · ≤ μ k of P T Z P are separated by the eigenvalues of Z as follows:

λ i ≤ μ i ≤ λ n−k+i .
Proof The theorem is proved in [19, p. 337].
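Lemma 1 can be checked numerically; for a column-selection P, P T Z P is a principal submatrix of Z (illustrative check):

```python
import numpy as np

def poincare_separation_holds(Z, cols):
    """Check lambda_i <= mu_i <= lambda_{n-k+i} for P = Id[:, cols]."""
    n, k = Z.shape[0], len(cols)
    lam = np.sort(np.linalg.eigvalsh(Z))
    mu = np.sort(np.linalg.eigvalsh(Z[np.ix_(cols, cols)]))
    return all(lam[i] <= mu[i] + 1e-10 and mu[i] <= lam[n - k + i] + 1e-10
               for i in range(k))
```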
Lemma 2 (Eigenvalues of the truncated matrix) Let λ 1 ≤ λ 2 ≤ · · · ≤ λ n be the eigenvalues of A, and let λˆ 1 ≤ λˆ 2 ≤ · · · ≤ λˆ n be the eigenvalues of the truncated matrix Aˆ . Let k = Σ i=1..n T (i, i) be the number of untruncated rows in Aˆ , and let λˆ n 1 ≤ λˆ n 2 ≤ · · · ≤ λˆ n k be the eigenvalues of R T P T Aˆ P R. Then the following holds:

λ i ≤ λˆ n i ≤ λ n−k+i .
Proof The proof follows by an application of the Poincaré separation theorem; to this end, we need to reformulate our problem. Let P be a permutation matrix that renumbers the rows such that the truncated rows are numbered first; then we have

P T Aˆ P = ( I  0 ;  0  R T P T Aˆ P R ),
where R ∈ Rn×k is the restriction operator defined as follows
⎞
⎛⎛
⎞
0 0 ... 0
⎟
⎜⎜ ..
⎟
⎜⎝ . . . . . . . 0 ⎠
⎟
⎜
⎟
⎜ 0 0 ... 0
⎟
n−k×k
⎜ ⎛
⎟
⎞
⎟.
1
0
.
.
.
0
R=⎜
⎜
⎟
⎜ ⎜0 1 . . . 0⎟
⎟
⎜ ⎜
⎟
⎟
⎜ ⎜. .
⎟
⎟
⎝ ⎝ .. . . . . . 0⎠
⎠
0 0 ... 1
k×k
Clearly, R T R = Id ∈ R k×k . Since P T Aˆ P and Aˆ are similar (P being a permutation matrix, P P T = Id), the lemma follows from Lemma 1.
Corollary 1 From Lemma 2 and from (22), since λ min ( K˜ + m˜ m˜ T ) = λ min (P T Aˆ P) = λ min ( Aˆ ) ≥ λ min (A) > 0, the matrix K˜ + m˜ m˜ T is SPD, since A = K + mm T is SPD.
Let ‖·‖ denote ‖·‖ 2 , which is submultiplicative and, for symmetric positive semidefinite matrices, equals the maximum eigenvalue. Although S is symmetric, the preconditioner S˜ is not, but for the error S − S˜ we have the following bound.
Theorem 3 (Error in preconditioner) Let E = S − S˜ . There holds

‖E‖ ≤ √τ λ max ( Mˆ ) + √τ λ max (K ) ( λ max (K ) + ‖m m T ‖ ) λ max ( Mˆ ).

In particular, we have

‖E‖ ≤ √τ C h 2 + √τ (C 3 h 2 ) ( λ n−k+1 (K ) + (λ n−k+1 (K )) 2 ( mˆ T mˆ ) / ( 1/( mˆ T mˆ ) + λ k (K ) ) ).
Proof We have

E = S − S˜ = √τ ( Bˆ + K Aˆ −1 Bˆ T ),  hence  ‖E‖ ≤ √τ ( ‖ Bˆ ‖ + ‖K ‖ ‖ Aˆ −1 ‖ ‖ Bˆ T ‖ ).   (23)

Recalling that T is a truncation matrix, ‖T ‖ = λ max (T ) = 1, we observe that

‖ Bˆ ‖ = ‖T M‖ ≤ ‖T ‖ ‖M‖ = 1 · λ max (M) = λ max (M).

The same estimate holds for ‖ Bˆ T ‖. To estimate ‖ Aˆ −1 ‖, we first estimate ‖ Aˆ ‖. We write Aˆ = Kˆ + mˆ mˆ T , where Kˆ = T K T and mˆ = T m. From Lemma 2, we have λ min ( Aˆ ) ≥ λ min (A) and λ max ( Aˆ ) ≤ λ max (A). Consequently, λ min ( Aˆ −1 ) ≥ λ min (A −1 ) and λ max ( Aˆ −1 ) ≤ λ max (A −1 ), with λ max (A) = ‖K + mm T ‖ ≤ λ max (K ) + ‖mm T ‖. We have

‖E‖ ≤ √τ λ max ( Mˆ ) + √τ λ max (K ) ( λ max (K ) + ‖mm T ‖ ) λ max ( Mˆ ).
More insight is obtained by using the Sherman–Morrison inversion

Aˆ −1 = ( Kˆ + mˆ mˆ T ) −1 = Kˆ −1 − ( Kˆ −1 mˆ )( Kˆ −1 mˆ ) T / ( 1 + mˆ T Kˆ −1 mˆ ),

so that

‖ Aˆ −1 ‖ ≤ ‖ Kˆ −1 ‖ + ‖( Kˆ −1 mˆ )( Kˆ −1 mˆ ) T ‖ / ( 1 + ( mˆ T mˆ ) λ min ( Kˆ −1 ) )
 = λ max ( Kˆ −1 ) + ( mˆ T Kˆ −2 mˆ ) / ( 1 + ( mˆ T mˆ ) λ max ( Kˆ ) )
 ≤ λ max ( Kˆ −1 ) + ( λ max ( Kˆ −1 ) ) 2 ( mˆ T mˆ ) / ( 1 + ( mˆ T mˆ ) λ max ( Kˆ ) ),

where we used mˆ T Kˆ −2 mˆ ≤ λ max ( Kˆ −2 ) mˆ T mˆ and λ max ( Kˆ −1 ) = 1/λ min ( Kˆ ).
For convenience of notation, let the number of untruncated rows be denoted by k = |Nh \ Nh• |. From Lemma 1, λ min ( Kˆ ) = λ 1 ( Kˆ ) ≤ λ n−k+1 (K ), and the estimate above becomes

‖ Aˆ −1 ‖ ≤ λ n−k+1 (K ) + ( λ n−k+1 (K ) ) 2 ( mˆ T mˆ ) / ( 1 + ( mˆ T mˆ ) λ max ( Kˆ ) ).
Again from Lemma 1, we have λ max ( Kˆ ) = λ k ( Kˆ ) ≥ λ k (K ). The error bound (23) now becomes

‖E‖ ≤ √τ λ max (M) + √τ ( λ max (K ) λ max (M) ) ( λ n−k+1 (K ) + ( λ n−k+1 (K ) ) 2 ( mˆ T mˆ ) / ( 1 + ( mˆ T mˆ ) λ k (K ) ) ).
To observe the dependence on the mesh size h, we recall that on a (quasi-)uniform grid, the eigenvalues of M and K are known. On each element K , the element mass matrix is given by (19). Let ξ ∈ R n . We write ξ T Mξ as the sum Σ K ∈K ξ| K T M K ξ| K , where M K is proportional to |K |, which is proportional to h K 2 . We then have ch 2 (Id) ≤ M K ≤ Ch 2 (Id), which provides the bound

λ max (M) ≤ Ch 2 .
Also, we have

ξ T K ξ / ξ T ξ ≤ C 2 h −2 ( ξ T Mξ / ξ T ξ ) ≤ C 2 C = C 3 .
The error now becomes

‖E‖ ≤ √τ C h 2 + √τ (C 3 h 2 ) ( λ n−k+1 (K ) + ( λ n−k+1 (K ) ) 2 ( mˆ T mˆ ) / ( 1/( mˆ T mˆ ) + λ k (K ) ) ).
5 Numerical Experiments
All the experiments were performed in double precision arithmetic in MATLAB.
The Krylov solver used was GMRES with subspace dimension of 200, and maximum number of iterations allowed was 300. The GMRES iteration was stopped as
soon as the relative residual was below the tolerance of 10−7 , and for AMG, the
tolerance was set to be 10−8 . The maximum number of iterations for AMG was 100.
The system matrices were scaled (Fig. 1).
The initial random active set is similar to the one considered in [6], where the initial value is set to be random, taking values between −0.3 and 0.5. The subfigures in Fig. 2 show the Cahn–Hilliard evolution after 1, 20, 40, . . . , 200 time steps. For the evolution of the random initial configuration as in Fig. 2, we set ε = 2 × 10 −2 , τ = 10 −5 as in [6]. In Table 1, we compare the iteration counts and times for h = 1/256, 1/400. We note that the iteration counts remain more or less comparable, except for the initial time steps, where roughly double the number of iterations is needed compared to later time steps. We also observe that the number of truncations increases with time, and this helps in bringing down the iteration count, especially during the first few time steps. For the first 80 time steps, the time for h = 1/400 is relatively larger because AMG needs more iterations (Table 2).
Next we consider larger problems with two samples of active set configurations as shown in Fig. 2, and study the effect of the parameter ε and the mesh size on the iteration count for these fixed configurations. The region between the two squares and the circles is the interface between the two bulk phases taking values +1 and −1; we set random values between −0.3 and 0.5 in this interface region. The width of the interface is kept at 10 times the chosen mesh size. The time step τ is chosen to be equal to ε. We compare various mesh sizes leading to a total number of grid points up to just above 1 million. We observe that the number of iterations remains independent of the mesh size; however, it depends on ε. But we observe that for a fixed ε, with a finer mesh, the number of iterations actually decreases significantly. For example, the number of iterations for h = 2 −7 , ε = 10 −6 is 84, but the number of iterations for h = 2 −10 , ε = 10 −6 is 38, a reduction of 46 iterations! It seems that a finer mesh size makes the preconditioner more efficient, as also suggested by the error bound in Theorem 3. We also observe that the time to solve is proportional to the number of iterations; the inexact solve for the (1, 1) block remains optimal because the (1, 1) block is essentially a Laplacian, for which AMG remains very efficient.
[Figure: six panels (a)–(f) on the unit square, color scale from −1 to 1.]

Fig. 1 Evolution of random initial active set configuration