Chapter 3. Backtracking Algorithms for Network Reliability Analysis
M. Ball, R.M. Van Slyke
respectively. Communication can exist between a pair of nodes if they are operative
and if there is a path consisting of operative nodes and arcs connecting them.
The underlying model is not new. An early reference on it is [6]. In this paper and
in the majority of the work done on this problem thus far nodes are assumed to be
perfectly reliable. The reliability measure most often considered is the probability
that a specified pair of nodes can communicate. The reliability problems associated
with many physical systems can be stated in terms of finding the probability that a
specified pair of nodes can communicate in a network with perfectly reliable nodes.
We are interested in the reliability of data communications networks. For this
problem we cannot assume that nodes are perfectly reliable and we require global
measures of reliability. In addition to the probability that a specified pair of nodes
can communicate we consider the probability that all nodes can communicate and
the probability that all operative nodes can communicate. All algorithms can obtain
exact answers; in addition, to allow for the analysis of larger networks we give a
truncation procedure with which approximate answers can be obtained in less time.
There are basically two approaches to network reliability analysis: simulation
and analytic. All known analytic methods for network reliability analysis have
worst case computation time which grows exponentially in the size of the network
considered. Our backtrack methods are analytic methods and are not exceptions to
this trend. Hence, they are not recommended for large networks. However, results
in [8] indicate that network reliability analysis is intrinsically very difficult.
Simulation methods, for which computation time grows only slightly faster than
linearly with network size, have been described in the literature. In our practical
experience we have found that simulation techniques are suitable for large
networks and are generally more flexible than analytic methods. However, they
have the disadvantage that they only give approximate answers; and when a high
degree of accuracy is necessary, the running time can grow quite large.
Analytic methods use basic probabilistic laws to reduce or decompose the
problem. Roughly speaking these methods use some combination of enumerative
and reduction techniques. Enumerative methods enumerate a set of probabilistic
events which are mutually exclusive and collectively exhaustive with respect to the
measure in question. Our algorithms are examples of enumerative algorithms.
Reduction algorithms collapse two or more network components into one network
component. The simplest example of network reduction is collapsing two series arcs
into one arc.
Enumerative algorithms for finding the node pair disconnection probability with
perfectly reliable nodes are given in [2,3,9]. Hansler, McAuliffe and Wilcox
produce as output a polynomial in P, the constant arc failure probability. Using an
APL implementation on an IBM 360/91 computer, their algorithm ran on two
9-node, 12-arc networks in a total of 18 seconds. Fratta and Montanari used a
network reduction technique to reduce a 21-node, 26-arc network to an 8-node,
12-arc network. They used a FORTRAN IV implementation on an IBM 360/67
computer. Once the reduction was accomplished, they used their enumerative
algorithm on the 8-node, 12-arc network to produce the exact disconnection
probability. The total time for the reduction and the enumerative algorithm was 112
seconds. The reduction algorithm most probably took a small percentage of that
time. Segal initially enumerates all paths in the network. He then uses the *
operator ($P_a * P_b = P_a$ iff $a = b$) to convert the probabilities that each path operates
to the probability that the node pair can communicate. This technique is especially
useful when the communication paths between the node pair are restricted.
Reduction techniques have been most successful in finding the probability that a
specified pair of nodes can communicate where parallel and series arcs can be
collapsed into single arcs. Rosenthal [8] gives more sophisticated reduction
techniques for finding other reliability measures. Rosenthal gives no computational
experience; however, it appears that his techniques may be valuable for analyzing
sparse networks. Generally, networks can only be reduced so far, so reduction
techniques must be used in conjunction with other methods. The one exception is in
the case of tree networks.
In [5] a recursive reduction algorithm is given for determining a variety of
reliability measures, including all of those mentioned in this paper, on tree
networks. A 500-node tree was run in If seconds on a PDP-10 computer.
Algorithms for general networks cannot come close to solving problems of this size.
Simulation methods have been given in [11, 12]. They provide a great deal of
flexibility in the measures that can be investigated. In addition, they contain
powerful sensitivity analysis capabilities. For a given number of samples, the
running times increase almost linearly in the number of nodes and arcs. A 9-node,
12-arc network was run using the simulation algorithm with a FORTRAN IV
implementation. The simulation algorithm produced the expected fraction of node
pairs communicating and the probability that all operative nodes can communicate
in 54 seconds on a PDP-10 computer.
We have implemented our algorithms using FORTRAN IV on a PDP-10
computer. The results indicate a reduction in running time relative to the analytic
algorithms described above. In addition, our algorithms produce global reliability
measures of more interest to network designers, whereas most of the previous
work concentrated on the specified node pair problem. Our algorithms also
appear to be much quicker than simulation algorithms for networks with fewer than
20 arcs. A complete summary of computational experience is given in a later
section.
2. Probabilistic backtracking
Suppose we wish to enumerate all subsets of a set with a desired property. We
examine elements of the set in a prescribed order. When an element is examined we
decide whether or not to include it in the subset under construction. When the
subset has the desired property we list it. Afterwards, we change our decision about
the last element and begin adding new elements until the subset again has the
desired property. If changing our decision on an element cannot produce a subset
with the desired property we back up to the previous element. If this element has
been considered both in and out, we back up again. If it has only been considered in
one state, we change our decision on it and proceed as before. When the process
terminates all subsets have been enumerated. Walker [13] has appropriately named
this process “backtracking”. If the enumeration is represented by a tree it can be
thought of as a method for exploring a tree. More recently, it has been generalized
as a method for exploring any graph and in this context it is called “depth first
search” [10].
We have found this process very useful in determining the probability of a
random event E. In the probabilistic context backtracking proceeds by adding
probabilistic events to a stack. When the intersection of the events on the stack
implies the event E, the probability of the stack configuration is added into a
cumulative sum. Afterwards, we complement the top event and begin adding new
events to the stack until it again implies the random event E. If complementing the
top event implies that E cannot occur, we take the event off the stack and consider
the new top event. If both the event and its complement have been considered, we
take it off the stack. If its complement has not been considered, we complement it
and proceed as before. When this process terminates, the cumulative sum will
contain the probability of the event E. This is so because the events whose
probabilities were added into the sum form a partition of the event E.
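The procedure above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' FORTRAN IV code: recursion stands in for the explicit stack, components are assumed to fail independently, and the names `backtrack_probability`, `implies_e` and `excludes_e` are hypothetical helpers introduced only for this sketch.

```python
def backtrack_probability(fail_prob, implies_e, excludes_e):
    """Probability of an event E by probabilistic backtracking (recursive sketch).

    fail_prob  -- dict: component name -> failure probability (independent)
    implies_e  -- function(assigned) -> True if the events so far imply E
    excludes_e -- function(assigned) -> True if E can no longer occur
    `assigned` maps a component to True (failed) or False (working);
    components missing from `assigned` are still undecided.
    """
    names = list(fail_prob)
    total = 0.0

    def explore(index, assigned, prob):
        nonlocal total
        if implies_e(assigned):
            # The events on the (implicit) stack imply E: add their probability.
            total += prob
            return
        if excludes_e(assigned) or index == len(names):
            # E cannot occur below this point: back up.
            return
        name = names[index]
        p = fail_prob[name]
        # Consider the event "name failed", then its complement.
        for failed, weight in ((True, p), (False, 1.0 - p)):
            assigned[name] = failed
            explore(index + 1, assigned, prob * weight)
            del assigned[name]

    explore(0, {}, 1.0)
    return total


# Hypothetical example: E = "at least two of three components fail".
probs = {"a": 0.1, "b": 0.1, "c": 0.1}
at_least_two_failed = lambda s: sum(s.values()) >= 2
at_least_two_working = lambda s: sum(not f for f in s.values()) >= 2
result = backtrack_probability(probs, at_least_two_failed, at_least_two_working)
```

In this example the probabilities that are summed belong to three disjoint partial assignments (a and b failed; a failed, b working, c failed; a working, b and c failed), which together partition E.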
3. Node pair disconnection
We will first consider finding the probability that a specified node pair cannot
communicate. One minus this value will give us the probability that the specified
pair can communicate which is the reliability measure of interest. Henceforth, this
node pair will be denoted as (S, T). For the moment, we will assume that nodes are
perfectly reliable. All algorithms presented use the same basic approach. The
approach is best illustrated through the specified node pair problem which is the
simplest. Our algorithm embodies the general idea of [3] in a backtracking
structure. Their algorithm and ours enumerate a set of “modified cutsets”. A
modified cutset is the assignment of one of the states, operative, inoperative or free,
to all arcs in the network in such a way that the inoperative arcs form a cutset with
respect to the specified node pair. The probability of a modified cutset is the
product of the failure probabilities of all inoperative arcs times the product of one
minus the failure probabilities of all operative arcs. The modified cutsets we
enumerate are mutually exclusive and collectively exhaustive with respect to the
specified node pair being disconnected. Therefore, the sum of their probabilities is
the probability that the specified node pair cannot communicate.
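The probability of a modified cutset can be computed directly from the definition above. The following sketch assumes independent arc failures; the function name, the state strings and the two-arc series network S - M - T are hypothetical and chosen only for illustration.

```python
def modified_cutset_probability(states, fail_prob):
    """Product of failure probabilities over inoperative arcs times the
    product of (1 - failure probability) over operative arcs."""
    prob = 1.0
    for arc, state in states.items():
        if state == "inoperative":
            prob *= fail_prob[arc]
        elif state == "operative":
            prob *= 1.0 - fail_prob[arc]
        # Free arcs contribute a factor of 1.
    return prob


# Hypothetical series network S - M - T with arcs "SM" and "MT".
fail_prob = {"SM": 0.1, "MT": 0.2}
cutsets = [
    {"SM": "inoperative", "MT": "free"},       # SM alone cuts S from T
    {"SM": "operative", "MT": "inoperative"},  # SM works, MT is cut
]
disconnection = sum(modified_cutset_probability(c, fail_prob) for c in cutsets)
```

The two modified cutsets are mutually exclusive (they differ in the state of SM) and together cover every way the pair can be disconnected, so their probabilities sum to one minus the probability that both arcs operate.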
We use probabilistic backtracking to enumerate the desired set of modified
cutsets. The events added to the stack are of the form “A inoperative” or its
complement “A operative” where A is some arc. Inoperative events are added to
the stack until the inoperative arcs include an S-T cut. At this point the arcs on the
stack will form a modified cutset and its probability will be added into a cumulative
sum. The stack configuration corresponds to a modified cutset in the following
manner. Arcs not included in any events on the stack are free. Other arcs are
operative or inoperative depending on the type of event in which they appear.
After updating the cumulative sum, the top event is changed from “A inoperative”
to “A operative”. The algorithm continues to proceed in the backtracking manner
by again adding inoperative arcs to the stack. Two procedures are necessary to
implement the algorithm. The first is a method for choosing which arcs to mark
inoperative and add to the stack to form a modified cutset. In addition, after an
event has been changed from “A inoperative” to “A operative” we must be able
to determine if a cutset can be formed by making free arcs inoperative and adding
them to the stack. If one cannot be formed we do not make A operative but simply
take A off the stack. This will be the case if, when A is made operative, the
operative arcs on the stack would include an S-T path.
Given this basic structure, a number of algorithms could be developed depending
on how the arcs to be made inoperative are chosen. Any such algorithm will fit into
the following general form:
Step 0: (Initialization). Mark all arcs free; create a stack which is initially empty.
Step 1: (Generate modified cutset)
(a) Find a set of free arcs that together with all inoperative arcs will form an S-T
cut.
(b) Mark all the arcs found in 1(a) inoperative and add them to the stack.
(c) The stack now represents a modified cutset; add its probability into a
cumulative sum.
Step 2: (Backtrack)
(a) If the stack is empty, we are done.
(b) Take an arc off the top of the stack.
(c) If the arc is inoperative and if when made operative, a path consisting only of
operative arcs would exist between S and T, then mark it free and go to 2(a).
(d) If the arc is inoperative and the condition tested in 2(c) does not hold, then
mark it operative, put it back on the stack and go to Step 1.
(e) If the arc is operative, then mark it free and go to 2(a).
Example.
[Figure: example network with nodes 1, 2, 3, 4 and arcs 12, 13, 23, 24, 34.]
S = 1, T = 4. $12$ implies arc 12 is inoperative; $\overline{12}$ implies arc 12 is operative.
Examples of possible stack configurations:
$12, 13$: 12, 13 are inoperative. All other arcs are free. This is a modified cutset since 12 and 13 form an S-T cut and they are inoperative. If this were the stack configuration at Step 2, 13 would be marked operative.
$12, \overline{13}, 24, 34$: 12, 24, 34 are inoperative; 13 is operative. All other arcs are free. This is a modified cutset since 24 and 34 form an S-T cut and they are inoperative. If this were the stack configuration at Step 2, 34 would be taken off the stack, since if it were marked operative, 13 and 34 would form an operative S-T path.
$12, 23, \overline{34}$: 12, 23 are inoperative; 34 is operative. All other arcs are free. This is not a modified cutset. If this were the stack configuration at Step 2, 34 would be removed from the stack since it is operative.
The two nontrivial operations contained in this algorithm are Step 1(a) and Step
2(c). In Step 1(a), we choose which arcs to make inoperative and put on the stack
and in Step 2(c), we decide whether an inoperative arc should be complemented or
whether it should be taken off the stack. Of course, the procedure used in one of
these steps is closely related to the procedure used in the other.
We have devised two algorithms based on this general algorithm. Algorithm 1
enumerates a set of modified cutsets similar to the set enumerated by Hansler,
McAuliffe and Wilcox. Algorithm 2 enumerates a set of minimum cardinality
modified cutsets with the use of a mincut algorithm.
In Algorithm 1 operative arcs form a tree rooted at node S. Inoperative arcs are
adjacent to nodes in the tree. Initially, the tree consists only of node S. Node T will
never be in the tree. Step 1(a) chooses all free arcs adjacent to both a node in the
tree and a node not in the tree. These arcs clearly will disconnect the tree from the
rest of the network and consequently, will disconnect S and T. The fact that an
inoperative arc, when added to the stack, is adjacent to a node in the tree and a
node not in the tree ensures that, when it is marked operative, the operative arcs
will continue to form a tree. In Step 2(c), an inoperative arc is taken off the stack if
it is adjacent to node T.
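S = 1, T = 4.
The sequence of modified cutsets generated by Algorithm 1 is:
$12, 13$;
$12, \overline{13}, 32, 34$;
$12, \overline{13}, \overline{32}, 24, 34$;
$\overline{12}, 23, 24, 13$;
$\overline{12}, 23, 24, \overline{13}, 34$;
$\overline{12}, \overline{23}, 24, 34$.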
This algorithm has a very simple structure and all subprocedures take a small
amount of time. The only subprocedure that cannot be done in constant time is
choosing the free arcs to add to the stack (Step 1(a)). We propose that nodes in the
tree be kept on a linked list. Step 1(a) is implemented by searching the set of arcs
incident to nodes on this list. This operation requires no more than $O(N_A)$ time.
Theorem. If $N_M$ = the number of modified cutsets enumerated and $N_A$ = the number
of arcs, then Algorithm 1 is $O(N_A \cdot N_M)$.
Proof. Any time an arc is made operative, a modified cutset is generated. As was
shown earlier, in the worst case, this operation is $O(N_A)$. All operations
performed in Step 2 can be done in constant time. Each operation either results in
an arc being made operative and thus, a new cut being generated, or an arc being
deleted from the stack. □
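Steps 0 to 2, with the Step 1(a) and Step 2(c) rules of Algorithm 1, can be sketched as follows. This is an illustrative Python version written under assumptions, not the authors' FORTRAN IV implementation: arcs are undirected `(u, v)` tuples, failures are independent, the tree is kept as a set rather than a linked list, and arcs are scanned in list order, so the order of arcs within a cutset may differ from the sequence listed above. The function and variable names are hypothetical.

```python
def node_pair_disconnection(arcs, fail_prob, s, t):
    """Return (P{s and t cannot communicate}, list of modified cutsets)."""
    state = {arc: "free" for arc in arcs}   # Step 0
    in_tree = {s}                           # nodes joined to s by operative arcs
    stack = []                              # entries: [arc, node added to tree or None]
    total = 0.0
    cutsets = []
    while True:
        # Step 1(a), 1(b): free arcs with exactly one end in the tree
        # are marked inoperative and pushed.
        for arc in arcs:
            u, v = arc
            if state[arc] == "free" and ((u in in_tree) != (v in in_tree)):
                state[arc] = "inop"
                stack.append([arc, None])
        # Step 1(c): add the probability of the modified cutset.
        prob = 1.0
        for arc, st in state.items():
            if st == "inop":
                prob *= fail_prob[arc]
            elif st == "op":
                prob *= 1.0 - fail_prob[arc]
        total += prob
        cutsets.append(dict(state))
        # Step 2: backtrack.
        resumed = False
        while stack:
            arc, added = stack.pop()
            if state[arc] == "op":          # 2(e): operative arc becomes free
                state[arc] = "free"
                in_tree.discard(added)
            elif t in arc:                  # 2(c): operative arc would reach t
                state[arc] = "free"
            else:                           # 2(d): complement the event
                u, v = arc
                new_node = v if u in in_tree else u
                state[arc] = "op"
                in_tree.add(new_node)
                stack.append([arc, new_node])
                resumed = True
                break
        if not resumed:                     # 2(a): stack empty, done
            return total, cutsets


# The four-node example network, with every arc failing with probability 0.5.
bridge = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
p_disc, found = node_pair_disconnection(bridge, {a: 0.5 for a in bridge}, 1, 4)
```

Each stack entry records the node that an operative arc added to the tree, so that Step 2(e) can undo the tree growth in constant time.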
In Algorithm 2, operative arcs form a forest. Node S and node T are contained
in different components of the forest. Step 1(a) chooses the set of free arcs of
minimum cardinality that together with the inoperative arcs forms an S-T cut. This
minimum set of arcs is found by finding the minimum S-T cut in the network with
free arcs having capacity 1, inoperative arcs deleted and operative arcs having
infinite capacity. The first set of free arcs added to the stack is a minimum
cardinality S-T cut. To implement Step 2(c), nodes in the operative tree containing
S are given the label $L$, where $L$ is the length of the path in the tree from the node
to S. Nodes in the operative tree containing T are given the label $-L$, where $L$ is the
length of the path in the tree from the node to T. All other nodes have $L = 0$. In
Step 2(c), an inoperative arc is taken off the stack if it is adjacent to nodes whose
labels have opposite signs.
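Step 1(a) of Algorithm 2 can be sketched with any max-flow routine. The version below uses breadth-first augmenting paths (an Edmonds-Karp style search), which is a substitute chosen for brevity and not the unit-capacity algorithm of [1]. It assumes undirected arcs given as `(u, v)` tuples, uses a large finite capacity in place of infinity, and assumes S and T are not already joined by operative arcs; `min_free_cut` and the state strings are hypothetical names.

```python
from collections import deque


def min_free_cut(arcs, state, s, t):
    """Minimum-cardinality set of free arcs that, together with the
    inoperative arcs, separates s from t."""
    big = len(arcs) + 1                       # stands in for infinite capacity
    cap = {}
    for arc in arcs:
        if state[arc] == "inop":              # inoperative arcs are deleted
            continue
        u, v = arc
        c = 1 if state[arc] == "free" else big
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v][u] = cap[v].get(u, 0) + c

    def search():
        """Breadth-first search from s over arcs with residual capacity."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    parent = search()
    while t in parent:
        # Recover the augmenting path and push its bottleneck capacity.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        parent = search()

    side = set(parent)                        # nodes still reachable from s
    return [a for a in arcs
            if state[a] == "free" and ((a[0] in side) != (a[1] in side))]


bridge = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
first_cut = min_free_cut(bridge, {a: "free" for a in bridge}, 1, 4)
```

On the four-node example with all arcs free, the routine returns arcs 12 and 13. When several minimum cuts exist, which one is returned depends on the search order.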
Example.
The sequence of modified cutsets generated by Algorithm 2 is:
$12, 13$;
$12, \overline{13}, 32, 34$;
$12, \overline{13}, \overline{32}, 24, 34$;
$\overline{12}, 24, 34$;
$\overline{12}, 24, \overline{34}, 13, 23$.
Note that 1 less cutset was generated than in Algorithm 1.
Algorithm 2 enumerates an entirely different partition of the probability space
than Algorithm 1. The number of events in this partition is smaller than in
Algorithm 1. Algorithm 2 pays for this by the necessity of performing much more
work per modified cutset generated. Again, every time an arc is made operative,
the algorithm produces a modified cutset.
To find this cutset a mincut algorithm must be performed. [1] gives a maxflow
algorithm for networks with unit arc capacities that runs in O(N?) time. This could
easily be converted to a mincut algorithm suitable for our problem with the same
time bound. The only other time consuming operation is the maintenance of the
labels on the trees rooted at S and T. Between the generation of two modified
cutsets at most one operative arc is added to the stack but as many as $N_N - 2$ may
be taken off. When an operative arc is added to the stack, if it is adjacent to a node
with a nonzero label, we must relabel all nodes added to the tree rooted at S or T.
This requires searching the tree that has just been joined to the tree rooted at S or
T. This operation requires at most $O(N_A)$ time. When an operative arc, A, is
changed to free, if the nodes adjacent to it have nonzero labels, we must set the
labels of the nodes that this operation disconnects from S or T to 0. We first find
the node, B, adjacent to A that has the label of higher absolute value. With arc A
changed to free, node B will be the root of a tree not containing S or T whose nodes
have nonzero labels. We search this tree and change all node labels to 0. This
operation requires at most time proportional to the number of arcs adjacent to
nodes whose labels were changed. Changing any set of arcs to free can change the
label of each node at most once. Consequently, label changing operations between
the generation of modified cutsets require at most $O(N_A)$ time.
Theorem. If N M = the number of modified cutsets enumerated and N A = the number
of arcs, then Algorithm 2 is O(NY*N,).
Proof. The proof follows the logic in the equivalent proof for Algorithm 1 using the
facts that the maxflow algorithm is O(N?) and updating the labels is $O(N_A)$ for
each modified cutset. □
The results concerning the computational complexity of Algorithm 2 led us to
believe that it would have a higher running time than Algorithm 1. Consequently,
we did not code Algorithm 2 and all extensions in this paper refer to Algorithm 1.
Algorithm 2 does have many interesting properties which we hope to explore later.
4. Network disconnection
A measure of the reliability of the entire network is the probability that all nodes
can communicate. We chose to compute the probability that the network is
disconnected which is one minus this value. Algorithm 1 extends to this case quite
easily. Each modified cutset will disconnect the graph rather than only the specified
node pair. (Clearly, any modified cutset which disconnects a specified node pair
would also disconnect the graph.) Rather than stopping the growth of the tree when
the specified node pair becomes connected, we stop it when it becomes a spanning
tree. Spanning trees can easily be recognized by a count on the number of operative
arcs.
In Step 2(c), we take an inoperative arc off the stack if the number of operative
arcs equals $N_N - 2$, since making it operative would complete a spanning tree.
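Example.
S = 1.
The sequence of modified cutsets generated by Algorithm 1 with the network
disconnection alteration is:
$12, 13$;
$12, \overline{13}, 32, 34$;
$12, \overline{13}, 32, \overline{34}, 42$;
$12, \overline{13}, \overline{32}, 24, 34$;
$\overline{12}, 23, 24, 13$;
$\overline{12}, 23, 24, \overline{13}, 34$;
$\overline{12}, 23, \overline{24}, 43, 13$;
$\overline{12}, \overline{23}, 24, 34$.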
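As an independent check on a sequence like the one above, the network disconnection probability of a small network can be computed by brute force over all $2^{N_A}$ arc states. The sketch below is not part of the authors' method; it assumes perfectly reliable nodes, independent arc failures and the five-arc example network, and all names are hypothetical. It also sums the eight modified cutsets of the example by their counts of inoperative and operative arcs.

```python
from itertools import product


def brute_force_disconnection(nodes, arcs, fail_prob):
    """Sum the probabilities of all arc states in which the network
    (with perfectly reliable nodes) is not connected."""
    total = 0.0
    for failed in product((True, False), repeat=len(arcs)):
        prob = 1.0
        adjacency = {n: [] for n in nodes}
        for arc, is_failed in zip(arcs, failed):
            p = fail_prob[arc]
            prob *= p if is_failed else 1.0 - p
            if not is_failed:
                u, v = arc
                adjacency[u].append(v)
                adjacency[v].append(u)
        # Depth-first search from an arbitrary node over operative arcs.
        seen = {nodes[0]}
        todo = [nodes[0]]
        while todo:
            for nxt in adjacency[todo.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        if len(seen) < len(nodes):
            total += prob
    return total


nodes = [1, 2, 3, 4]
arcs = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
p = 0.1
exact = brute_force_disconnection(nodes, arcs, {a: p for a in arcs})

# (inoperative, operative) arc counts of the eight modified cutsets listed above.
counts = [(2, 0), (3, 1), (3, 2), (3, 2), (3, 1), (3, 2), (3, 2), (2, 2)]
from_cutsets = sum(p**i * (1.0 - p) ** o for i, o in counts)
```

The two values agree because the modified cutsets partition the disconnection event; the brute-force route is only practical for very small networks.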
5. Truncation
Assuming arcs have constant failure probability, $P$, each configuration with
exactly $k$ arcs inoperative has probability $P^k (1 - P)^{N_A - k}$. An approximation to the
node pair disconnection probability can be obtained by ignoring all network
configurations with more than LIMIT arcs inoperative. If LIMIT is the smallest $L$
such that
$$\sum_{k=0}^{L} C(N_A, k)\, P^k (1 - P)^{N_A - k} \ge 1 - \mathrm{TOL},$$
where $C(N_A, k)$ = $N_A$ things taken $k$ at a time, then the approximation will be
within TOL of the true value.
Given LIMIT, we implement this truncation procedure in our backtracking
algorithm by keeping a count on the number of inoperative arcs. Whenever the
addition of an arc to the stack in Step 1(b) would make the count exceed LIMIT, the
algorithm immediately backtracks (goes to Step 2).
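The choice of LIMIT can be sketched as a short search over binomial terms. The function below is an illustration under assumptions: the sum is taken from $k = 0$ so that the configuration with no inoperative arcs is counted, all arcs share one failure probability, and the name `truncation_limit` is hypothetical.

```python
from math import comb


def truncation_limit(n_arcs, p, tol):
    """Smallest L such that configurations with at most L inoperative arcs
    carry probability of at least 1 - tol."""
    cumulative = 0.0
    for limit in range(n_arcs + 1):
        cumulative += comb(n_arcs, limit) * p**limit * (1.0 - p) ** (n_arcs - limit)
        if cumulative >= 1.0 - tol:
            return limit
    # Rounding can leave the full sum just under 1 - tol; no truncation then.
    return n_arcs


# Hypothetical setting: a 12-arc network, arc failure probability 0.01.
limit = truncation_limit(12, 0.01, 1e-6)
```

Configurations with more than `limit` inoperative arcs then carry total probability of at most `tol`, which bounds the error of the truncated sum.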
6. Node failures
While computations are simpler when only arcs can fail, in reality nodes are also
unreliable. When considering the possibility of node failures a question arises as to
the definition of network disconnection. The most obvious definition would be,
“the network is disconnected any time at least one node cannot communicate with
some other node” (ND1). By this definition, a network would be disconnected any
time at least one node failed. An alternative definition which is much more useful
for the network designer who has no control over node failure rates is, “the
network is disconnected any time an operative node cannot communicate with
another operative node” (ND2). Thus, if a given node is inoperative its ability to
communicate with the rest of the graph is irrelevant.
(A) Probability {ND1}. We will consider ND1 first simply because it is easier to
handle. In fact, it reduces to the problem with perfectly reliable nodes.
Let:
ANO = {all nodes operative},
NANO = {not all nodes operative}.
Then since {ANO} and {NANO} are mutually exclusive, collectively exhaustive
events the law of total probability gives us:
$$P\{\mathrm{ND1}\} = P\{\mathrm{ND1} \mid \mathrm{ANO}\} \cdot P\{\mathrm{ANO}\} + P\{\mathrm{ND1} \mid \mathrm{NANO}\} \cdot P\{\mathrm{NANO}\}.$$
$P\{\mathrm{ND1} \mid \mathrm{ANO}\}$ can be found using Algorithm 1 with the network disconnection
option. The remaining terms are
$$P\{\mathrm{ANO}\} = \prod_{N=1}^{N_N} (1 - P_N(N)),$$
$$P\{\mathrm{ND1} \mid \mathrm{NANO}\} = 1,$$
$$P\{\mathrm{NANO}\} = 1 - P\{\mathrm{ANO}\}.$$
Thus, with one extra straightforward calculation the graph disconnection problem
with node failures reduces to the graph disconnection problem with perfectly
reliable nodes.
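The extra calculation is small enough to write out. The sketch below assumes independent node failures and takes the conditional probability for perfectly reliable nodes as an input; the function name and the numbers in the example are hypothetical.

```python
from math import prod


def nd1_probability(p_nd1_given_ano, node_fail_probs):
    """P{ND1} by the law of total probability over ANO and NANO."""
    p_ano = prod(1.0 - p for p in node_fail_probs)  # all nodes operative
    p_nano = 1.0 - p_ano                            # at least one node failed
    # Given NANO the network is disconnected with probability 1.
    return p_nd1_given_ano * p_ano + 1.0 * p_nano


# Hypothetical input: four nodes, each failing with probability 0.1, and a
# disconnection probability of 0.5625 when all nodes are operative.
p_nd1 = nd1_probability(0.5625, [0.1, 0.1, 0.1, 0.1])
```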
(B) Probability { S , T cannot communicate}. The definition of ND2 presents a
much more difficult problem for which major modifications to the algorithm are
required. First, we will again consider the node pair disconnection problem.
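Let:
$S \to T$ imply S can communicate with T,
$S \not\to T$ imply S cannot communicate with T.
Then
$$P\{S \not\to T\} = P\{S \not\to T \mid S\ \mathrm{inop}\} \cdot P\{S\ \mathrm{inop}\}$$
$$\quad + P\{S \not\to T \mid S\ \mathrm{op},\ T\ \mathrm{inop}\} \cdot P\{S\ \mathrm{op},\ T\ \mathrm{inop}\}$$
$$\quad + P\{S \not\to T \mid S\ \mathrm{op},\ T\ \mathrm{op}\} \cdot P\{S\ \mathrm{op},\ T\ \mathrm{op}\},$$
$$P\{S \not\to T \mid S\ \mathrm{inop}\} = P\{S \not\to T \mid S\ \mathrm{op},\ T\ \mathrm{inop}\} = 1,$$
$$P\{S\ \mathrm{inop}\} = P_N(S),$$
$$P\{S\ \mathrm{op},\ T\ \mathrm{inop}\} = (1 - P_N(S)) \cdot P_N(T),$$
$$P\{S\ \mathrm{op},\ T\ \mathrm{op}\} = (1 - P_N(S)) \cdot (1 - P_N(T)).$$
The new version of the algorithm will compute $P\{S \not\to T \mid S\ \mathrm{op},\ T\ \mathrm{op}\}$; i.e., we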
assume S and T are perfectly reliable and then find the probability that they cannot
communicate.
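Once the conditional probability for operative S and T is available, the decomposition above is a three-term sum. The sketch below assumes S and T fail independently; the function name and the example values are hypothetical.

```python
def pair_disconnection_with_node_failures(p_cond, p_fail_s, p_fail_t):
    """P{S cannot communicate with T} when S and T may themselves fail.

    p_cond -- P{S cannot reach T | S operative, T operative}
    """
    p_s_inop = p_fail_s
    p_s_op_t_inop = (1.0 - p_fail_s) * p_fail_t
    p_both_op = (1.0 - p_fail_s) * (1.0 - p_fail_t)
    # The first two conditional probabilities equal 1.
    return 1.0 * p_s_inop + 1.0 * p_s_op_t_inop + p_cond * p_both_op


# Hypothetical values: P_N(S) = 0.1, P_N(T) = 0.2, conditional probability 0.75.
p_pair = pair_disconnection_with_node_failures(0.75, 0.1, 0.2)
```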
The problem now has been reduced to enumerating a mutually exclusive,
collectively exhaustive set of modified cutsets between S and T where nodes other
than S and T can also “take part” in cuts. The most straightforward modification to
Algorithm 1 that would compute the desired probability would be to put nodes as
well as arcs on the stack. Nodes are now marked either operative, inoperative, or
free. Every time an arc is made operative, the new node added to the tree is placed
on the stack and marked inoperative. To disconnect this tree from the rest of the
network, all free arcs between operative nodes in the tree and free nodes are added
to the stack and marked inoperative. When an inoperative node is encountered in a
backtrack, it is switched to operative and a new modified cutset is found in the same
manner.
Consider the following example:
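S = 1, T = 4. The sequence of modified cutsets generated by the suggested
algorithm for the (1, 4) node pair disconnection probability is:
$12, 13$;
$12, \overline{13}, 3$;
$12, \overline{13}, \overline{3}, 32, 34$;
$12, \overline{13}, \overline{3}, \overline{32}, 2, 34$;
$12, \overline{13}, \overline{3}, \overline{32}, \overline{2}, 24, 34$;
$\overline{12}, 2, 13$;
$\overline{12}, 2, \overline{13}, 3$;
$\overline{12}, 2, \overline{13}, \overline{3}, 34$;
$\overline{12}, \overline{2}, 23, 24, 13$;
$\overline{12}, \overline{2}, 23, 24, \overline{13}, 3$;
$\overline{12}, \overline{2}, 23, 24, \overline{13}, \overline{3}, 34$;
$\overline{12}, \overline{2}, \overline{23}, 3, 24$;
$\overline{12}, \overline{2}, \overline{23}, \overline{3}, 24, 34$;
where $1$ implies node 1 is inoperative and $\overline{1}$ implies node 1 is operative.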
A large saving can be realized by taking advantage of the equivalence between
the following two events:
$E_1$ = node N inoperative,
$E_2$ = node N operative; all arcs between node N and free nodes inoperative.
Notice that in the example the modified cutsets
$\{\overline{12}, 2, 13;\ \overline{12}, 2, \overline{13}, 3;\ \overline{12}, 2, \overline{13}, \overline{3}, 34\}$
are the same as
$\{\overline{12}, \overline{2}, 23, 24, 13;\ \overline{12}, \overline{2}, 23, 24, \overline{13}, 3;\ \overline{12}, \overline{2}, 23, 24, \overline{13}, \overline{3}, 34\}$
except that $2$ in the first set is replaced by $\overline{2}, 23, 24$ in the second set.
$E_1$ and $E_2$ are equivalent in the following sense. Given the current stack
configuration the subsequent enumeration with $E_1$ on the stack is exactly the same
as the enumeration would be with the events in $E_2$ on the stack. Stated probabilistically this relation is:
$$P\{S \not\to T \mid E_1 \cap \text{events on stack}\} = P\{S \not\to T \mid E_2 \cap \text{events on stack}\}.$$
If we let $C_1$ be the value of the cumulative sum when $E_1$ is placed on the stack and
$C_1'$ be the value of the cumulative sum when $E_1$ is changed to operative we have:
$$P\{S \not\to T \mid E_1 \cap \text{events on stack}\} = (C_1' - C_1)/P\{E_1 \cap \text{events on stack}\}.$$
This relation also applies to $E_2$. Thus, when we change $E_1$ to operative we may
update the cumulative sum to $C_2$ to account for all the enumeration that would
have proceeded with the events in $E_2$ on the stack where:
$$C_2 = (C_1' - C_1) \cdot P\{E_2\}/P\{E_1\} + C_1'.$$
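The update can be checked numerically on the four-node example. The sketch below assumes, purely for illustration, that every arc and every intermediate node fails with probability 0.5; with that assumption the partial sums of the example sequence are exact binary fractions. Here $E_1$ is "node 2 inoperative" and $E_2$ is "node 2 operative, arcs 23 and 24 inoperative"; the function name is hypothetical.

```python
def shortcut_update(c1, c1_prime, p_e1, p_e2):
    """Cumulative sum after crediting the enumeration that would have
    followed E2, without carrying that enumeration out."""
    return (c1_prime - c1) * p_e2 / p_e1 + c1_prime


# Partial sums of the example sequence with all failure probabilities 0.5:
c1 = 0.4296875           # sum of the first five modified cutsets
c1_prime = 0.6484375     # sum after the three cutsets containing node 2 inoperative
p_e1 = 0.5               # P{node 2 inoperative}
p_e2 = 0.5 * 0.5 * 0.5   # P{node 2 operative, arcs 23 and 24 inoperative}
c2 = shortcut_update(c1, c1_prime, p_e1, p_e2)
```

Under these assumptions `c2` equals the sum of the first eleven modified cutsets of the example, so the three cutsets that repeat the enumeration with $\overline{2}, 23, 24$ in place of $2$ need not be generated.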