Chapter 12. How Can Specialized Discrete and Convex Optimization Methods be Married


A.M. Geoffrion


programs.” We define this class, survey its applications, describe four promising

approaches to the development of applicable hybrid algorithms, and finally

conclude with an indication of attractive opportunities for further research.

1.1. Definition of discrete/convex programming

By a discrete/convex program we mean an optimization problem of the form

$$\text{(DC)} \qquad \min_{\delta,\,x} \; c_\delta + f_\delta(x) \quad \text{s.t.} \quad \delta \in \Delta, \; x \in X_\delta,$$

where $\Delta$ is a finite set of possible discrete choices or logical designs $\delta$, and $X_\delta$ is a convex set of possible continuous choices or activities $x$ associated with any given $\delta$. The objective function distinguishes the direct cost of $\delta$, $c_\delta$, from the cost $f_\delta(x)$ of the activities carried out under $\delta$. The asymmetry of the notation in $\delta$ and $x$ reflects the fact that, in many of the applications we have in mind, the choice of $x$ is predicated on the choice of $\delta$ but not conversely; that is, the very domain of $x$ may depend on $\delta$ whereas the domain of $\delta$ can always be described independently of $x$.

More specifically, we presume that (DC) satisfies these two properties:

Property 1. For any fixed $\delta$ in $\Delta$, $f_\delta(\cdot)$ is convex on $X_\delta$ and its minimum can be computed with reasonable efficiency by a known convex programming algorithm (e.g., by LP, NLP, a network flow method, etc.).

Property 2. A reasonably efficient discrete or combinatorial optimization algorithm is known for some problem related to (and hopefully a reasonable approximation of)

$$\text{(D)} \qquad \min_{\delta \in \Delta} \; c_\delta + v(\delta), \quad \text{where } v(\delta) \triangleq \inf_{x \in X_\delta} f_\delta(x).$$


Problem (D) obviously is equivalent to (DC): it is infeasible or has unbounded optimal value if and only if (DC) does; and if $\delta^0$ is optimal ($\varepsilon_1$-optimal) in (D) and $x^0$ is optimal ($\varepsilon_2$-optimal) in the “inner” problem defining $v(\delta^0)$, then $(\delta^0, x^0)$ is optimal ($(\varepsilon_1 + \varepsilon_2)$-optimal) in (DC).¹ Notice that Property 1 assures the relatively easy evaluation of $v(\delta)$. Exactly which relative of (D) has a discrete or combinatorial algorithm available is deliberately left unspecified in Property 2. Usually $v(\cdot)$ must be approximated by a much simpler function in such an algorithm, and sometimes $c_\delta$ or even $\Delta$ must also be approximated. The intent of Property 2 is simply to focus on applications where the discrete aspect of the problem is tractable provided suitable approximations are made to submerge the continuous aspect.
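The passage from (DC) to (D) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the two designs, the interval form of $X_\delta$, and the quadratic $f_\delta$ are hypothetical, and plain enumeration of $\Delta$ stands in for whatever discrete algorithm Property 2 provides.

```python
import math

# Hypothetical toy instance: each design delta has a direct cost c_delta,
# an interval X_delta = [lo, hi], and f_delta(x) = (x - target)^2.
designs = {
    "small": {"c": 1.0, "X": (0.0, 2.0), "target": 3.0},
    "large": {"c": 4.0, "X": (0.0, 5.0), "target": 3.0},
}

def inner_min(design):
    """Return (v(delta), x*) for the inner convex problem.

    For a convex quadratic on an interval, the minimizer is the
    unconstrained minimizer (the target) clipped to the interval.
    """
    lo, hi = design["X"]
    x_star = min(max(design["target"], lo), hi)
    return (x_star - design["target"]) ** 2, x_star

def solve_projected(designs):
    """Solve (D): minimize c_delta + v(delta) over the finite set Delta."""
    best = (math.inf, None, None)
    for name, design in designs.items():
        v, x_star = inner_min(design)      # Property 1: evaluate v(delta)
        total = design["c"] + v
        if total < best[0]:
            best = (total, name, x_star)
    return best

print(solve_projected(designs))
```

The pair returned (a design together with the inner minimizer found for it) is an optimal solution of the toy (DC), which is the equivalence stated above.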

One further comment must be made about (DC): although $X_\delta$ will necessarily be a subset of a finite-dimensional vector space, no such restriction need be imposed on $\Delta$. In some applications $\delta$ will be a map of one finite set into another, or some

¹ See, e.g., [15, Theorem I] (where (D) would be called the “projection” of (DC) onto $\delta$).



other combinatorial object, rather than a tuple of real numbers. It is not the structure of the space in which $\Delta$ dwells, but rather the logical structure of $\Delta$ itself (in addition to finiteness) which permits mathematical manipulations involving $\delta$ to be carried out. Of course (DC) could always be reformulated so that $\delta$ is replaced by integer-valued indicator variables. However, in most applications such an artifice serves only to obscure the natural structure of $\Delta$ and to cause an excessive increase in representational complexity or size or both.² It therefore seems wise not to insist that (DC) be stated as a conventional mathematical programming problem in real variables and equality or inequality constraints.

2. Some applications

Here we survey briefly some of the principal types of applications which fall

within the domain of discrete/convex programming as defined above.

2.1. Production scheduling [21, 24, 28, 29]

Setup and sequence-dependent changeover costs, minimum batch sizes, precedence constraints, and crew integrity are some of the factors which remove many production scheduling problems from the realm of ordinary linear or nonlinear programming. The logical design $\delta$ typically determines which jobs are to be done in what order on which machines (or machine configurations), and possibly which crew will handle each setup. The activity vector $x$ then determines, for a given $\delta$, the timing and quantities of each run, the allocation of divisible resources to job activities, and so on.

An algorithm in keeping with Property 1 is likely to be of LP type, possibly with some nonlinear costs, while combinatorial algorithms in keeping with Property 2 abound (but with only limited success) in the literature on machine/job shop scheduling/sequencing [6, 7]. Example 1 describes a case where a successful partnership was achieved between linear programming and a quadratic assignment algorithm (see Section 3.1).

2.2. Network design [1, 2, 3, 4, 5, 12, 13, 32]

Many problems connected with the design or modification of communication networks and transportation networks can be posed as discrete/convex programs. The discrete design $\delta$ may select nodes for the installation of facilities: multiplexers, concentrators, or interface message processors in computer communication networks, junctions in pipeline networks, interchanges in highway networks,

² See Example 1 below, and think of the futility of attempting to express many realistic scheduling and sequencing problems as integer linear programs.



and so on. A design $\delta$ may also select connecting links from a finite list of possibilities, both in terms of which nodes are to be connected and in terms of the capacity of the connection (there are standard transmission speeds for communication lines, standard sizes for gas and oil pipelines, only a few choices for the number of lanes of a highway, etc.). The choice of discrete design requires that due consideration be given to its impact on the flows in the network. Differences in unit flow costs, delays due to congestion, and demand elasticity all tend to render flow prediction a nontrivial problem even when $\delta$ is fixed (see [13] for a discussion of the influence of cost and congestion on the utilization of store-and-forward communication networks, and [9, 33] for a discussion of equilibrium flows in transportation networks). The activity vector $x$ represents, of course, the flows in a network.

Network flow algorithms are obviously the most natural choice for the task posed by Property 1, particularly since their power has increased dramatically during the last few years. Convex cost functions occur when congestion delays are taken as the criterion [13]. A variety of discrete optimization algorithms have potential for Property 2: minimum spanning trees [4] when the network must have a tree structure, set covering [34] for emergency service networks, generalized assignment [31] when peripheral facilities must be linked directly to fixed service facilities, and so on. Example 2 describes an application where a multicommodity flow algorithm can be combined with a knapsack algorithm (see Section 3.2).

2.3. Physical distribution system planning [19, 20, 25]

In distribution system planning problems the discrete design $\delta$ determines the geographic location of plants and/or warehouses, and possibly also the all-or-nothing assignment of customers to these facilities for each integral bundle of products. The activity vector $x$ corresponds to product flows. This class of models is conceptually close to network design as discussed above, but has enough distinguishing characteristics (such as the absence of link capacities and the presence of facility capacities and economies-of-scale) that separate treatment is warranted.

2.4. Facilities layout [10, 23]

Facilities layout problems occur on a hierarchy of scales. On a global scale, in

which cities should the various facilities of a firm be located? Within a given city,

which sub-facility should be located in each available building? Within a given

building, which department or operating unit should be located on each floor and in

each work area? Within a given work area, what should be the layout of the various

pieces of equipment? The problem appears to be a combinatorial one, but flows

and communications can be influenced by locational layout and often need to be

considered jointly. Locational layout would be specified by $\delta$ and $x$ would specify

flows and communications.

Example 3 describes an application where linear programming for



flow/communications is combined with a quadratic assignment algorithm for the

layout choices (see Section 3.3).

2.5. Other applications

There are many other applications which can be modeled as discrete/convex

programs. One interesting class is that of selecting and sequencing interdependent capital investment projects (for hydroelectricity, manufacturing capacity expansion, etc.). The logical design $\delta$ would determine which projects are selected and their sequence of execution, while $x$ would determine the details of project timing and how the system corresponding to a given $\delta$ is operated over time. A particularly nice case is developed in [8], where a dynamic programming approach was derived for (D) itself that can be used for a variety of different “operating cost submodels” specified by $X_\delta$ and $f_\delta(x)$.

Another important class of applications for discrete/convex programming is

transport scheduling. The problem here is different from the transportation

network design problems discussed earlier because the major emphasis is on how

fleet vehicles (planes, ships, trains, pool trucks, etc.) should move over an

established transportation network in response to demands for transport. The

possible sequences of moves for each vehicle comprise the combinatorial aspect of the problem, while the exact timing of the moves and the determination of passenger/cargo patronage comprise the continuous aspect. It is usually essential to consider both aspects together since patronage adjusts to the frequency and timing of transport service. See, for instance, [30] for a treatment of the problem in the context of airline routing; the evaluation of $v(\delta)$ is a linear programming

problem which determines the maximum profit loading of available passengers to


3. Computational approaches

We now describe four promising generic computational approaches to the development of hybrid algorithms for discrete/convex programming. They are: (i) combinatorial seeding with local convex enumeration, (ii) generalized branch-and-bound, (iii) cyclic marginal optimization over $\delta$ and $x$, and (iv) improving approximations to (D).

3.1. Combinatorial seeding with local convex enumeration

By Property 2, a discrete optimization algorithm is available for some relative of (D). Let $\delta^0$ be the resulting approximation to an optimal choice for $\delta$. Now use $\delta^0$ as a “seed” to be improved, if possible, via “low order” changes evaluated by the convex programming algorithm postulated by Property 1. What constitutes a low



order change depends on the structure of $\Delta$; for instance, if $\delta$ were a binary n-tuple the order of change might be measured as the number of components whose values are altered. It is helpful but not necessary for $\Delta$ to be a subset of a metrizable space. Sometimes it is convenient to use the term “neighbor” for any modification of $\delta$ that qualifies as being of acceptably low order. The emphasis on low order changes is designed, of course, to restrict the magnitude of the local enumeration task. Generally one wants the allowable order of change to be sufficiently low that local enumeration is computationally practical, yet sufficiently high that an improved logical design will be found if one exists.

This approach is pictured informally in Fig. 1. It is understood that the seed is not actually replaced as the incumbent until one of its neighbors proves to yield a superior feasible solution of (DC). Termination occurs when no neighbor of the current incumbent is superior; the higher the allowable order of change, the stronger the degree of local optimality at termination.

[Fig. 1. Flowchart. Discrete Problem: solve an approximation to (D) and pass the “seed” $\delta^0$ to the Convex Problem, where candidate designs are evaluated.]
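The seeding scheme can be sketched in Python for the binary n-tuple case mentioned above. This is a minimal sketch under stated assumptions: `total_cost` is a hypothetical stand-in for one call to the convex programming algorithm of Property 1 (it should return $c_\delta + v(\delta)$), the facility data are invented, and the "first improvement" restart rule is one choice among many.

```python
from itertools import combinations

def neighbors(delta, max_order=1):
    """Yield binary tuples that differ from delta in 1..max_order components."""
    n = len(delta)
    for order in range(1, max_order + 1):
        for idx in combinations(range(n), order):
            flipped = list(delta)
            for i in idx:
                flipped[i] = 1 - flipped[i]
            yield tuple(flipped)

def local_enumeration(seed, total_cost, max_order=1):
    """Improve a seed design until no neighbor is strictly better."""
    incumbent, best = seed, total_cost(seed)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(incumbent, max_order):
            value = total_cost(cand)
            if value < best:
                incumbent, best, improved = cand, value, True
                break  # restart the enumeration around the new incumbent
    return incumbent, best

# Hypothetical evaluation: opening facility i costs open_cost[i]; the inner
# problem's optimal value is an invented penalty on unmet demand.
open_cost = (3.0, 2.0, 4.0)
capacity = (5.0, 3.0, 6.0)
demand = 8.0

def total_cost(delta):
    cap = sum(k for k, d in zip(capacity, delta) if d)
    fixed = sum(c for c, d in zip(open_cost, delta) if d)
    return fixed + max(0.0, demand - cap) ** 2

print(local_enumeration((1, 1, 1), total_cost, max_order=1))
print(local_enumeration((1, 1, 1), total_cost, max_order=2))
```

On this toy data the order-1 search stops at a design that the order-2 search still improves, which illustrates how a higher allowable order of change gives a stronger form of local optimality at the price of more evaluations.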

A variant would be to generate several seeds from (D) rather than just one, as by

solving several approximations to (D) or by finding several suboptimal solutions to

a single approximation.

This approach has familiar analogs in the literature on heuristic programming. See [14, Chapter 9] and [27]. See also [32] for a highly successful application to gas pipeline network design that has since been adapted and used extensively for computer communication network design (e.g., [11]).

The author has had very satisfactory experience with this approach in the context

of scheduling parallel chemical reactors with product-dependent changeover costs.

This application is now briefly reviewed.

Example 1. A changeover scheduling problem [21]. Several independent continuous process facilities or flow shop production lines are arranged in parallel. Each

can make (process) some subset of products with production rates that may vary

from line to line, but that are reasonably proportional from line to line (as would be

the case when lines are similar except for their scale of implementation or their

basic cycle time). Each line has a linear production cost for each product it can

make, and a possibly different changeover cost between each pair of products. The



changeover cost matrices are reasonably proportional across lines. A number of

independent production orders are given, each of which specifies a minimum and

maximum production quantity, an earliest start date, and a due date. Violation of

either date incurs a per diem cost penalty. Splitting production orders is allowed. It is desired to find a production schedule (which line produces how much of what and when) that fills the production orders at minimum total cost over a scheduling horizon of fixed (but somewhat flexible) length.

In this application, $\delta$ gives the sequence of production runs specified as to product and line but not fully specified as to duration. Durations are given by $x$. Property 1 holds because, when $\delta$ is fixed, the optimal choice of $x$ may be determined by solving a linear program. The LP balances production costs (exclusive of changeover charges) against penalties associated with any violations of earliest start and due dates. Property 2 holds because (D) can be approximated quite well by a quadratic assignment problem of reasonable size.

An LP code and quadratic assignment code were combined in the manner of

Figure 1. The definition used for “neighbor” was that any single production run

may be moved to another position on the same or another line, and any two

production runs may be interchanged.

A real application was made to the monthly scheduling of a complex of six

chemical reactors. A three-month independent parallel test showed that the

program was able to achieve considerably better solutions than (experienced)

manual schedulers. The program has since been installed on the firm’s computer

and is being used routinely [21].

3.2. Generalized branch-and-bound

The essential concepts of branch-and-bound, currently the dominant approach to

integer programming, require very little mathematical structure and are quite

broad enough to encompass discrete/convex programming. The framework of [22]

will serve nicely with only the obvious notational changes to phrase it in terms of

(DC) rather than in terms of mixed integer linear programming. It is also advisable

to generalize the notion of “relaxation,” whence nearly all bounds are obtained in

branch-and-bound methods, to the following: a minimizing problem (PR) is said to

be a relaxation of a minimizing problem (P) if the feasible region of (PR) contains that of (P) and if the objective function of (PR) is less than or equal to that of (P) everywhere on the latter’s feasible region. This generalized definition requires an obvious modification to property R3 and fathoming criterion FC3 in [22] in order to reflect the fact that an optimal solution of (PR) is not optimal in (P) unless it is feasible in (P) and yields the same objective function value for both problems (although an $\varepsilon$-optimality statement can still be made if the very last condition fails). [22] will be sufficiently accessible to most readers that the algorithmic framework of Section II therein, as generalized to (DC), need not be given in detail.
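The bounding logic behind this generalized notion of relaxation can be written out as a short derivation. The shorthand is ours and is introduced only for this sketch: $F(\cdot)$ denotes a feasible region, and $g$ and $g_R$ denote the objective functions of (P) and (PR); none of this is the notation of [22].

```latex
% Assumed shorthand: F(.) = feasible region; g, g_R = objectives of (P), (PR).
\begin{align*}
F(P) \subseteq F(PR), \quad g_R(z) \le g(z) \ \ \forall z \in F(P)
  \;&\Longrightarrow\;
  \inf_{z \in F(PR)} g_R(z) \;\le\; \inf_{z \in F(P)} g_R(z) \;\le\; \inf_{z \in F(P)} g(z), \\
z^R \text{ optimal in (PR)}, \ z^R \in F(P)
  \;&\Longrightarrow\;
  0 \;\le\; g(z^R) - \inf_{z \in F(P)} g(z) \;\le\; g(z^R) - g_R(z^R) \;=:\; \varepsilon .
\end{align*}
```

The first line gives the lower bound used for fathoming. The second shows that a relaxed optimum which happens to be feasible in (P) is $\varepsilon$-optimal there, and optimal exactly when the two objective values coincide.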




So far, no use whatever has been made of Properties 1 and 2. The principal way of doing so is to select a type of relaxation which permits advantage to be taken of one or the other or both of these properties when trying to fathom the candidate problems (alias node- or sub-problems). There are two major types of relaxations used in mixed integer linear programming, both of which can be generalized to apply to candidate problems derived from (DC) provided certain conditions hold: relaxations based on direct convexification of the decision domain of the candidate problem (as by allowing integer variables to take on continuous values), and Lagrangean relaxation of selected constraints [18]. Suppose that candidate problems are derived from (DC) by partially specifying certain components of $\delta$ (we presume, as seems permissible for most potential applications, that the structure of $\Delta$ renders this prescription meaningful). An obvious difficulty with such candidate problems is that the very notion of convexification in the domain of $\delta$ is not meaningful unless $\delta$ inhabits a vector space, which definitely is not the case in many applications of interest (e.g., Example 1). Moreover, the mathematical operation of Lagrangean relaxation requires $X_\delta$ to be expressible at least partially in terms of conventional real-valued equality or inequality constraints. The first difficulty can be skirted if necessary by convexifying not in the domain of $\delta$, but rather in the range spaces associated with $\delta$: the range of the real-valued function $c_{(\cdot)}$ and of the point-to-set map $X_{(\cdot)}$. The second difficulty apparently cannot be skirted.

There is a striking relationship between the two types of relaxation just

discussed. It was shown in [18] that, for mixed integer linear programs, the best

possible Lagrangean relaxation is equivalent in a natural sense to a corresponding

convexification in the domain of the decision variables and also to a corresponding

convexification in the range space of the objective function and Lagrangeanized

constraints. The analysis can be generalized. Dropping the assumption that all functions are linear invalidates the equivalence to convexification in domain space but does not invalidate the equivalence to convexification in range space. The latter equivalence even remains true when $\delta$ is no longer taken to dwell in a finite vector space, and when the constraining conditions other than those being Lagrangeanized are no longer expressible as conventional real-valued constraints. This is a consequence of the fact that many basic results of Lagrangean duality theory require virtually no assumptions at all on the domains of the functions (e.g., [16, Lemmas 3, 4 and 5]). A formal proof of the basic equivalence between the best Lagrangean relaxation and problem convexification in range space can be found in [26, Lemma 2.2].

In particular applications one seeks to apply the convexification or Lagrangean

relaxation devices just discussed or possibly some other device, in order to obtain

candidate problem relaxations which Properties 1 and/or 2 render tractable. The

following example illustrates a situation in which this can be done.

Example 2. Network expansion with a budget constraint. This problem is a

capacitated version of the one treated in [2]. A conventional multicommodity



network is given with capacitated links, a known flow requirements matrix, and

linear flow costs. A number of possible new links have been proposed, each with a

given flow capacity, linear flow cost, and fixed capital cost. What is the optimal

subset of new links which reduces the total cost of the optimal flow as much as

possible without exceeding a given maximum authorized capital expenditure?

The problem can be stated mathematically as follows in an obvious notation where $ij$ refers to the particular commodity which flows from the $i$th to the $j$th node, $A$ is the set of existing links, $B$ is the set of possible new links, and $D$ is the capital budget.

$$\text{(P)} \qquad \dots \qquad \text{for all } ij \text{ and } kl \in A \cup B,$$

$$\delta_{kl} = 0 \text{ or } 1 \quad \text{for all } kl \in B. \tag{6}$$


This is a mixed integer linear programming problem which, for reasonable numbers of potential new links (not much more than a hundred, say), should be tractable by branch-and-bound if the main candidate problem relaxation is chosen suitably. The usual LP relaxation, obtained by allowing the free binary variables to be fractional, is not a multicommodity flow problem; efficient specialized multicommodity flow algorithms cannot be used and one must fall back to general linear programming algorithms. An attractive alternative to the usual LP relaxation is to employ a “tandem” Lagrangean relaxation. This will be illustrated on the full problem (P) as stated above since the candidate problems are of the same mathematical form so long as conventional dichotomous branching is used.

Let $p^0 > 0$ be the analyst’s best guess concerning the marginal value to (P) of increasing the budget $D$ by one dollar. Solve the relaxation of (P) which results when (4) is Lagrangeanized using $p^0$ and (6) is convexified in the usual way. This is equivalent to an ordinary classical multicommodity flow problem because the $\delta_{kl}$ variables can be eliminated analytically (solve for $\delta_{kl}$ from (3), which must hold with equality in an optimal solution):



Let $x^0$ be an optimal solution, and let $\xi^0_{kl}$ be the optimal multipliers corresponding to (3)'. It can be shown that

$$\lambda^0_{kl} = \dots \quad \text{for all } kl \in B \tag{7}$$

is a set of optimal multipliers corresponding to (3) in the relaxed version of (P) prior to analytic reduction to $(\mathrm{MF}_{p^0})$. Now solve a second relaxed version of (P) in which (3)' is appended and $\lambda^0$ from (7) is used to Lagrangeanize (3):

Evidently this problem can be solved independently for $x$ and for $\delta$. It is easy to show that $x^0$ from $(\mathrm{MF}_{p^0})$ is also optimal here, leaving just the binary knapsack

$$(\mathrm{K}_{\lambda^0}) \qquad \dots \quad \text{subject to (4) and (6)}$$

as the only work necessary to solve the second relaxation $(\mathrm{PR}_{\lambda^0})$. Methods are available which can solve $(\mathrm{K}_{\lambda^0})$ very efficiently even with several hundred binary variables.
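The binary knapsack step can be sketched in Python. This is only an illustration under assumptions that go beyond the surviving text: we take the knapsack to maximize $\sum_{kl} \lambda_{kl} b_{kl} \delta_{kl}$ subject to the budget constraint $\sum_{kl} d_{kl} \delta_{kl} \le D$, with $b_{kl}$ a link capacity and $d_{kl}$ a capital cost, and we assume integer costs so that a textbook dynamic program applies. The data are hypothetical, and the dynamic program is a stand-in for the specialized methods the text alludes to.

```python
def binary_knapsack(values, costs, budget):
    """Maximize sum(values[i]*delta[i]) s.t. sum(costs[i]*delta[i]) <= budget.

    Classic dynamic program over integer budget levels; costs and budget
    are assumed to be non-negative integers. Returns (best_value, delta).
    """
    n = len(values)
    # best[i][w] = best value using only the first i items with budget w
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, c = values[i - 1], costs[i - 1]
        for w in range(budget + 1):
            best[i][w] = best[i - 1][w]
            if c <= w and best[i - 1][w - c] + v > best[i][w]:
                best[i][w] = best[i - 1][w - c] + v
    # Trace back which items were selected.
    delta, w = [0] * n, budget
    for i in range(n, 0, -1):
        if best[i][w] != best[i - 1][w]:
            delta[i - 1] = 1
            w -= costs[i - 1]
    return best[n][budget], tuple(delta)

# Hypothetical data for three candidate links kl in B.
lam = [2.0, 1.5, 3.0]   # assumed multipliers for the capacity constraint
b = [10, 8, 5]          # assumed link capacities b_kl
d = [4, 3, 5]           # assumed capital costs d_kl (integer units)
D = 7                   # capital budget
values = [l * cap for l, cap in zip(lam, b)]

print(binary_knapsack(values, d, D))
```

The table has (number of links + 1) x (budget + 1) entries, so for several hundred candidate links the budget should be expressed in coarse units to keep the table small.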


In summary, a tandem relaxation of (P) has been proposed which requires the solution of one ordinary multicommodity flow problem (cf. Property 1) and one binary knapsack problem over the possible new links (cf. Property 2). Both Properties 1 and 2 are exploited. An otherwise conventional branch-and-bound procedure can be built around this tandem relaxation. How well such a procedure would function depends on how good the resulting bounds are. This has not been tested experimentally, but it can be observed from the known theory of Lagrangean relaxation [18] that the lower bound produced by this tandem relaxation has the potential of being superior to that provided by the usual LP relaxation (in which (6) is convexified). It all depends on the choice of $p^0$. If $p^0$ happens to have the same value as an optimal multiplier of (4) in the usual LP relaxation, then the bound produced by $(\mathrm{MF}_{p^0})$ will coincide with that of the usual LP relaxation and the second bound obtained with the help of $(\mathrm{K}_{\lambda^0})$ will usually be still better (it cannot be worse).




It may be worthwhile to iterate on the choice of $p^0$. There are at least two conspicuous ways to do this. One is to perform a one-dimensional (unimodal) search for the value of $p$ which leads to the highest optimal value of $(\mathrm{MF}_p)$. This is particularly easy to do if a parametric multicommodity flow algorithm is available which accommodates a single linear parameter in the objective function (the cost coefficients of the links in set $B$ are $c_{kl} + p\,d_{kl}/b_{kl}$). This search is equivalent to solving the partial dual of the usual LP relaxation in which only the budget constraint (4) is dualized. The second way to find an improved $p$ is to feed back the budget constraint multiplier from $(\mathrm{K}_{\lambda^0})$ with (6) convexified.
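The one-dimensional search over the budget multiplier can be sketched with a golden-section search in Python. This is a sketch under assumptions: `mf_value` is a hypothetical stand-in for the optimal value of the multicommodity flow relaxation as a function of $p$ (here an invented concave piecewise-linear function, the shape a Lagrangean dual function has), and golden-section search is our choice of unimodal search, not one prescribed by the text.

```python
import math

def golden_section_max(phi, lo, hi, tol=1e-6):
    """Maximize a unimodal function phi on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:
            # The maximum lies in [c, b]; reuse d as the new left probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = phi(d)
        else:
            # The maximum lies in [a, d]; reuse c as the new right probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = phi(c)
    p = (a + b) / 2
    return p, phi(p)

def mf_value(p):
    """Hypothetical relaxation bound: minimum of three linear pieces in p."""
    return min(10 + 3 * p, 16 + p, 30 - 2 * p)

p_best, bound = golden_section_max(mf_value, 0.0, 10.0)
print(p_best, bound)
```

Each iteration costs one evaluation of `phi`, that is, one solution of the relaxed flow problem, which is why a parametric algorithm that reuses work across values of $p$ is attractive.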

3.3. Cyclic marginal optimization over $\delta$ and $x$

In some applications, Property 2 permits (DC) to be optimized with any fixed $x$. Then it is natural to think of seeking an optimum of (DC) by first optimizing over $x$ with some fixed $\delta$, then optimizing over $\delta$ with the resulting $x$, then optimizing over $x$ again with the new resulting $\delta$, and so on. A monotonically improving succession of feasible solutions will be found by such a cyclic marginal optimization approach until a “marginally optimal” solution is found, after which the marginal solutions in $x$ and $\delta$ begin to repeat. Marginal optimality is an obvious necessary condition for global optimality, but whether it is sufficient depends upon the structure of the problem.
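A minimal Python sketch of the cycle follows. It rests on assumptions made purely for illustration: the three designs, the cost function $c_\delta + (x - t_\delta)^2$, and the common domain of $x$ for all designs are hypothetical (so that any $x$ stays feasible when the design changes), and enumeration stands in for the discrete algorithm of Property 2.

```python
def cyclic_marginal(delta0, designs, best_x, cost, max_cycles=100):
    """Alternate between optimizing x for fixed delta and delta for fixed x.

    Stops at a marginally optimal pair: neither marginal step can strictly
    reduce the cost any further.
    """
    delta = delta0
    x = best_x(delta)                      # convex step (Property 1)
    for _ in range(max_cycles):
        new_delta = min(designs, key=lambda d: cost(d, x))   # discrete step
        if cost(new_delta, x) >= cost(delta, x):
            break                          # no strict improvement: stop
        delta = new_delta
        x = best_x(delta)
    return delta, x, cost(delta, x)

# Hypothetical toy instance with cost(delta, x) = c[delta] + (x - t[delta])^2.
designs = [0, 1, 2]
c = [5.0, 3.0, 0.0]
t = [0.0, 1.0, 4.0]

def best_x(delta):
    return t[delta]                        # unconstrained quadratic minimizer

def cost(delta, x):
    return c[delta] + (x - t[delta]) ** 2

print(cyclic_marginal(0, designs, best_x, cost))
print(cyclic_marginal(2, designs, best_x, cost))
```

Started from design 0 the cycle stalls at design 1 with cost 3.0, although design 2 attains cost 0.0. This small instance shows why marginal optimality is necessary but not, in general, sufficient.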

This general approach is, of course, far from novel (e.g., [35, p. 111]).

The following example illustrates a plausible application of this approach in

which the discrete and convex marginal optimization problems are, respectively, a

quadratic assignment problem and a linear program.

Example 3. A facility assignment problem. A firm has a number of indivisible

facilities and a number of distinct locations to which they could be assigned. The

firm carries on a number of different activities, each of which imposes its own

requirements for “traffic” between the facilities. These requirements are sufficiently dissimilar, and the traffic costs are sufficiently high, that the assignment of

facilities to locations materially influences the most profitable mix of activities. It is

therefore appropriate to optimize jointly the facility location assignments and

activity mix.

We adopt the following notations and assumptions:

$x_k$: the level of the $k$th activity of the firm,

$Ax \le b$, $x \ge 0$: the constraints specifying the set of possible activities (independent of the facility location assignments),

net profit per unit of activity $k$ exclusive of traffic costs,

the amount of traffic between facilities $i$ and $j$ incurred for each unit of activity $k$,

the cost per unit of traffic between locations $l$ and $m$,

the cost associated with assigning facility $i$ to location $l$ (can be $\infty$ to indicate an impossible assignment),

$\delta$: a mapping of facilities into locations; $\delta(i) = l$ means that $\delta$ assigns facility $i$ to location $l$.

Then the problem can be written:


$$\dots \quad \text{s.t.} \quad Ax \le b, \quad x \ge 0, \quad \delta \text{ a 1:1 mapping.}$$


It is evident that this is an ordinary quadratic assignment problem for fixed $x$ and an ordinary linear program for fixed $\delta$, and hence a plausible candidate for cyclic marginal optimization. This approach has not been tested computationally.

3.4. Improving approximations to (D)

The essential idea of this computational approach is to generate a sequence of approximations to (D) which are improving in the sense that their solutions tend to converge to an optimal solution of (D) itself. Property 1 comes into play in the course of evaluating the performance $v(\delta^K)$ of the solution $\delta^K$ of the $K$th approximation $(\hat{D})^K$. Of course, the form of $(\hat{D})^K$ must be compatible with the scope of Property 2. A rule must be specified to prescribe how $(\hat{D})^K$ is to be generated based on knowledge of $\delta^k$ and $x^k$ obtained from previous ($k = 1, \dots, K-1$) evaluations of $v(\delta^k)$ and $(\hat{D})^k$. See Fig. 2.



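[Fig. 2. Flowchart: (1) select $\delta^1 \in \Delta$ and set $K = 1$; (2) evaluate $v(\delta^K)$ and call the solution $x^K$; (3) generate an approximation $(\hat{D})^{K+1}$ to (D) based on knowledge of $\delta^k$ and $x^k$ for $k = 1, \dots, K$; (4) solve $(\hat{D})^{K+1}$, call the solution $\delta^{K+1}$, increment $K$ by 1, and return to (2).]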
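The loop of Fig. 2 can be sketched in Python for one particularly simple updating rule. The rule is our assumption, not one given in the text: each approximation uses the exact value $v(\delta)$ wherever it has already been evaluated and a known lower bound `v_lower` elsewhere, and enumeration stands in for the discrete algorithm of Property 2. All data are hypothetical.

```python
def improving_approximations(designs, c, v_exact, v_lower, delta1):
    """Iterate: evaluate v at the current design, refine the approximation
    of (D), and re-solve it, until the approximation's solution has
    already been evaluated exactly.

    Because the approximation never overestimates c + v, a solution whose
    value is exact is optimal in (D) itself.
    """
    known = {}                                   # delta -> exact v(delta)
    delta = delta1
    while delta not in known:
        known[delta] = v_exact(delta)            # Property 1 step
        # Approximate objective: exact where known, optimistic elsewhere.
        approx = lambda d: c[d] + known.get(d, v_lower(d))
        delta = min(designs, key=approx)         # Property 2 step
    return delta, c[delta] + known[delta], len(known)

# Hypothetical toy data.
designs = ["a", "b", "c"]
c = {"a": 1.0, "b": 2.0, "c": 5.0}
v_table = {"a": 6.0, "b": 3.0, "c": 4.0}

result = improving_approximations(
    designs, c,
    v_exact=lambda d: v_table[d],
    v_lower=lambda d: 0.0,                       # trivial lower bound on v
    delta1="a",
)
print(result)
```

On this toy data the loop stops after two exact evaluations without ever evaluating design "c", whose direct cost alone already matches the best total found. Sharper lower bounds would prune more designs in the same way.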
