Wednesday, November 4, 2015

Discrete optimization @ Oaxaca - II

Some snippets from days 2 and 3 at the BIRS-CMO Workshop on Discrete Optimization in Oaxaca.

Thomas Rothvoss presented his work on constructive discrepancy minimization for convex sets (slides are available). The basic problem is that of assigning one of two colors \(\chi(i) \in \{-1,+1\}\) to each of \(n\) items \(i \in [n]:=\{1,\dots,n\}\) such that, for a system of sets \(\mathcal{S} = \{S_1,\dots,S_m\}\) with \(S_j \subseteq [n]\), we minimize the maximum "mismatch", defined as the discrepancy:

\( \mathrm{disc}(\mathcal{S}) = \min\limits_{\chi : [n] \to \{\pm 1\}} \max\limits_{S \in \mathcal{S}} \left| \sum_{i \in S} \chi(i) \right| \).
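To make the definition concrete, here is a minimal brute-force sketch (my own toy code, not from the talk) that evaluates \(\mathrm{disc}(\mathcal{S})\) by enumerating all \(2^n\) colorings; it is only sensible for very small \(n\):

```python
from itertools import product

def disc(n, sets):
    """Brute-force disc(S): minimize, over all 2^n colorings chi in {-1,+1}^n,
    the maximum |sum_{i in S} chi(i)| over the sets S in the system."""
    best = float("inf")
    for chi in product((-1, 1), repeat=n):
        worst = max(abs(sum(chi[i] for i in S)) for S in sets)
        best = min(best, worst)
    return best

# A tiny made-up system on n = 4 items (0-indexed):
print(disc(4, [{0, 1}, {1, 2, 3}, {0, 2}]))  # prints 1: the odd-size set forces discrepancy >= 1
```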

I found the techniques developed fairly deep, and I expect they will find use in many applications (e.g., in proving results on other optimization problems; Thomas talked about an application to bin packing). We had previously looked at the somewhat related problem of number partitioning. There, we assign a set of integers \(\{a_1, \dots, a_n\}\) to two sets such that the sums of the numbers over the two sets are as close to each other as possible (the difference between these two sums is the discrepancy in that setting). At the same time, the corresponding alternative definition of discrepancy, given as
\( \mathrm{disc}'(\mathcal{S}) = \min\limits_{\chi : [n] \to \{\pm 1\}} \sum\limits_{S \in \mathcal{S}} \left| \sum_{i \in S} \chi(i) \right|, \)

would make the problem sort of "easy" here: with that objective, one can argue that a random assignment of \(\pm 1\) likely does about as well as anything else. Nonetheless, an appropriately defined notion of "weighted" discrepancy, where each element carries, say, a nonnegative weight, would be interesting to consider. A generalization to more than two colors also appears interesting, but establishing the basic building-block results there could be tricky.
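To get a feel for the claim that random colorings are already good for the summed objective, here is a quick simulation sketch (again my own, with made-up instance sizes), reporting the best and average \(\mathrm{disc}'\) value over many random colorings:

```python
import random

def disc_prime(chi, sets):
    """The alternative objective: sum over the sets of |sum_{i in S} chi(i)|."""
    return sum(abs(sum(chi[i] for i in S)) for S in sets)

random.seed(0)
n, m = 100, 20
sets = [random.sample(range(n), k=30) for _ in range(m)]  # m random sets of size 30
trials = [disc_prime([random.choice((-1, 1)) for _ in range(n)], sets)
          for _ in range(1000)]
print(min(trials), sum(trials) / len(trials))  # best vs. average random coloring
```

This is only a sanity check of the intuition (for a random coloring, each inner sum is of order \(\sqrt{|S|}\) in absolute value), not a proof.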

Tamon Stephen talked about a variant of the Hirsch conjecture using the circuit diameter of polyhedra instead of the usual graph diameter. See the preprint for an illustration of the circuit distance between vertices of a polyhedron - unlike the graph distance, it is not symmetric. The idea is that one is allowed to take "shortcuts" through the interior of the polyhedron in addition to walks along the edges; in that sense, it mixes ideas from the simplex method and interior point methods for solving linear programs (LPs). The authors show that the most basic counterexample to the Hirsch conjecture, the Klee-Walkup polyhedron, does satisfy the Hirsch bound when distances are measured by circuits. It would be interesting for LP software to be able to seamlessly and intelligently switch back and forth between interior point and simplex methods (a basic ability to do so is already provided by some state-of-the-art solvers).

Juan Pablo Vielma talked about when Minkowski sums are good/bad in the context of formulations for unions of polyhedra, and for unions of convex sets in general (the slides should be up soon here; other versions are already available). For unions of polyhedra, aggregated formulations are often "short", but are not as tight as disaggregated ones (also termed extended formulations). The latter formulations are sharp/ideal, but are often too large in size (see the excellent review on MIP formulation techniques by JP Vielma, or lectures 5-7 from my IP class for a shorter overview). This interesting line of work tries to find a better middle ground: the sharpest formulations that are not extended, i.e., that do not add extra variables. Things get very interesting when one considers unions of convex sets (in place of polyhedra)!
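For concreteness, recall the classical disaggregated (extended) formulation (standard material going back to Balas's disjunctive programming, not specific to this talk): for nonempty polytopes \(P_k = \{x : A^k x \leq b^k\}\), \(k = 1, 2\), the set of \(x\) satisfying

\( x = x^1 + x^2, \quad A^k x^k \leq \lambda_k b^k \;\; (k = 1, 2), \quad \lambda_1 + \lambda_2 = 1, \quad \lambda \geq 0 \)

for some \(x^1, x^2, \lambda\) is exactly \(\mathrm{conv}(P_1 \cup P_2)\), and requiring \(\lambda_k \in \{0,1\}\) gives an ideal formulation of the union itself. The copies \(x^k\) are precisely the extra variables that a "short" aggregated formulation tries to avoid, typically at the cost of tightness.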

Monday, November 2, 2015

Discrete optimization @ Oaxaca - I

I'm at the BIRS-CMO Workshop on Discrete Optimization in beautiful Oaxaca (in Mexico). Unlike typical workshops, the organizers have tried hard to encourage lots of discussion and interaction among the participants - big props to Jesus De Loera and Jon Lee! I'm hoping to write snippets on the talks/discussions that I found particularly interesting (yes, it'll be a biased view :-).

The meeting started with an apt talk by Dan Bienstock on LP formulations for polynomial optimization problems (based on this paper; Dan has also posted the slides). He opened with a motivating application, the optimal power flow (OPF) problem, which justifies using the treewidth of the intersection graph of the constraints as the parameter controlling the complexity of the proposed reformulation operator - real-life power grids often have small treewidth. The reformulation operator produces linear programming approximations that attain provable bounds for mixed integer polynomial problems in which each variable is either binary or satisfies \(0 \leq x_j \leq 1\).

Informally, the intersection graph of a system of constraints has one vertex for each variable \(x_j\), and an edge \((x_i,x_j)\) is present when both \(x_i\) and \(x_j\) appear in some constraint. The simple example of a subset sum (or knapsack) problem was insightful: with the single constraint \(x_1+\dots+x_n \leq \beta\), a constraint graph could have \(n+1\) vertices \(\{0,1,\dots,n\}\), with the \(n\) edges \((0,j)\), one for each \(j\) corresponding to \(x_j\) (node \(0\) is a "dummy" node here). The treewidth of this "star" graph is \(1\). The other key trick employed is the use of binary variables to approximate a continuous variable \(0 \leq x \leq 1\) (attributed originally to Glover). For a given error tolerance \(0 < \gamma < 1\), we can approximate \(x \approx \sum_{h=1}^L \left(1/2^h\right) y_h\), where the \(y_h\) are binary variables. With \(L = \lceil \log_2 (1/\gamma) \rceil\), we can get \(x \leq \sum_{h=1}^L \left(1/2^h\right) y_h \leq x + \gamma\). This step helps reduce the mixed integer problems to pure binary ones. The treewidth gets blown up by \(L\), but things still work out nicely. The paper seems to have lots of nice and deep "tricks" (I hope to study it in detail).
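To make the two ingredients concrete, here is a small sketch (my own code with made-up names, not from the paper): it builds the intersection graph from a list of constraints, each given as the set of variable indices it involves, and carries out the Glover-style binary approximation of a continuous \(x \in [0,1]\):

```python
import math
from itertools import combinations

def intersection_graph(constraints):
    """Edge (i, j) is present iff variables i and j co-occur in some constraint."""
    edges = set()
    for con in constraints:
        edges |= {tuple(sorted(p)) for p in combinations(con, 2)}
    return edges

def binary_approx(x, gamma):
    """Approximate x in [0,1] by L = ceil(log2(1/gamma)) binary digits y_1..y_L.
    This variant truncates, so value <= x <= value + gamma; the talk's bound
    rounds in the other direction."""
    L = math.ceil(math.log2(1 / gamma))
    y, residual = [], x
    for h in range(1, L + 1):
        bit = 1 if residual >= 1 / 2**h else 0
        y.append(bit)
        residual -= bit / 2**h
    return y, sum(b / 2**(h + 1) for h, b in enumerate(y))

# The knapsack trick: pairing each x_j with the dummy node 0 yields a star
# (treewidth 1), instead of the clique the raw constraint would induce.
print(intersection_graph([{0, j} for j in range(1, 5)]))
print(binary_approx(0.3, 0.01))  # 7 digits; value within 0.01 of 0.3
```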

There were several other interesting talks, and a very interactive problem session to conclude the day. Oh, and I learned a new term used in the power industry from Shabbir Ahmed's talk: a prosumer is someone who both produces and consumes power. I'm wondering why that is more apt than a conducer...

There was some lively discussion over dinner about how the optimization and operations research communities have failed to sell themselves as well as the CS community (as a whole, or even the CS theory community by itself). The large membership of ACM vs. INFORMS, and the similarly larger NSF budget for CS vs. OR, were cited as indicators. There was also an anecdote about how, back in the 1980s when NSF funding for algorithms/CS theory was on the decline, a group of top names from that field submitted a memo/petition to lawmakers in DC and convinced them in person that "algorithms/theory is as fundamental as cosmology" (needless to say, I'm paraphrasing to a huge extent here!). And yes, they managed to restore the funding flow. The optimization community tried a bit of the same trick with Karmarkar's interior point algorithm for LP. Maybe we optimizers should try harder - not just from the point of view of securing funding, but also from the point of view of our students getting better industry jobs!