Planar multi-flows, L_1 embeddings, and differentiation

In this post, I’ll discuss the relationship between multi-flows and sparse cuts in graphs, bi-lipschitz embeddings into L_1, and the weak differentiation of L_1-valued mappings. It revolves around one of my favorite open questions in this area, the planar multi-flow conjecture.

Table of contents:

  1. The max-flow/min-cut theorem
  2. Multi-commodity flows and sparse cuts
  3. The planar multi-flow conjecture
  4. Bi-Lipschitz embeddings into L_1
  5. The planar embedding conjecture
  6. Parallel geodesics and the slums of geometry
  7. Local rigidity and coarse differentiation
  8. Beyond planar graphs

The max-flow/min-cut theorem

Let G=(V,E) be a finite, undirected graph, with a mapping \mathsf{cap} : E \to \mathbb R_+ assigning a capacity to every edge. If \mathcal P_{s,t} is the set of all paths from s to t, then an s\!-\!t flow is a mapping F : \mathcal P_{s,t} \to \mathbb R_+ which doesn’t overload the edges beyond their capacities: For every edge e \in E,

\displaystyle \sum_{\gamma \in \mathcal P_{s,t} : e \in \gamma} F(\gamma) \leq \mathsf{cap}(e).

The value of the flow F is the total amount of flow sent: \mathsf{val}(F) = \sum_{\gamma \in \mathcal P_{s,t}} F(\gamma). A cut in G is a partition V = S \cup \bar S which we will usually write as (S, \bar S). Naturally, one defines the capacity across (S, \bar S) by \mathsf{cap}(S,\bar S) = \sum_{xy \in E} \mathsf{cap}(x,y) |\mathbf{1}_S(x)-\mathbf{1}_S(y)|, where \mathbf{1}_S is the characteristic function of S. Since one can only send as much flow across a cut as there is capacity to support it, it is easy to see that for any valid flow F and any cut (S, \bar S) with s \in S and t \in \bar S, we have \mathsf{val}(F) \leq \mathsf{cap}(S, \bar S). The classical max-flow/min-cut theorem says that in fact these upper bounds are achieved:

\max_{F} \mathsf{val}(F)=\min_{(S, \bar S)} \mathsf{cap}(S, \bar S)

where the maximum is over all s\!-\! t flows, and the minimum is over all cuts (S, \bar S) with s \in S and t \in \bar S.

Multi-commodity flows and sparse cuts

We will presently be interested in multi-commodity flows (multi-flows for short), where we are given a demand function \mathsf{dem} : V \times V \to \mathbb R_+ which requests that we send \mathsf{dem}(x,y) units of flow from x to y for every x,y \in V. In this case, we’ll write the value of a valid flow (i.e. one which doesn’t try to send more total flow than an edge can carry) as

\displaystyle \mathsf{val}(F) = \min_{x,y \in V} \frac{\sum_{\gamma \in \mathcal P_{x,y}} F(\gamma)}{\mathsf{dem}(x,y)}

where the minimum is over pairs with \mathsf{dem}(x,y) \neq 0. So if the value of the flow is \frac12, it means that every pair gets at least half the flow it requested (well, demanded).

Again, cuts give a natural obstruction to flows. If we define \Phi(S, \bar S) = \displaystyle \frac{\mathsf{cap}(S, \bar S)}{\mathsf{dem}(S, \bar S)}, where \mathsf{dem}(S, \bar S) = \sum_{x,y \in V} \mathsf{dem}(x,y) |\mathbf{1}_S(x)-\mathbf{1}_S(y)| is the total demand requested across (S, \bar S), then \mathsf{val}(F) \leq \Phi(S, \bar S) for any valid flow F. Unfortunately, it is no longer true in general that \max_F \mathsf{val}(F) = \min_{(S, \bar S)} \Phi(S). (It is true as long as the demand is supported on a set of size at most 4, while it stops being true in general when the demand is supported on a set of size 5 or larger.)

In fact, the gap between the two can be arbitrarily large, as witnessed by expander graphs (this rather fruitful connection will be discussed in a future post). For now, we’ll concentrate on a setting where the gap is conjectured to be at most O(1).

The planar multi-flow conjecture

It has been conjectured that there exists a universal constant C \geq 1 such that if G=(V,E) is a planar graph (i.e. can be drawn in the plane without edge crossings), there is a C-approximate multi-commodity max-flow/min-cut theorem: For any choice of \mathsf{dem} and \mathsf{cap}, one has a

\displaystyle \frac{1}{C} \min_{(S, \bar S)} \Phi(S, \bar S) \leq \max_F \mathsf{val}(F) \leq \min_{(S, \bar S)} \Phi(S, \bar S) (1)

The conjecture first appeared in print here, but was tossed around since the publications of Linial, London, and Rabinovich and Aumann and Rabani, which recast these multi-flow/cut gaps as questions about bi-lipschitz mappings into L_1 (discussed next). Perhaps the most compelling reason to to believe the conjecture is the beautiful result of Klein, Plotkin, and Rao which shows that C = O(1) for any planar instance where \mathsf{dem}(x,y) = 1\,\,\forall x,y \in V (this is called a uniform multi-flow instance).

It is relatively easy to see that we cannot take C=1, as the following example of Okamura and Seymour shows.

Okamura-Seymour graph

In the example, the non-zero demands are \mathsf{dem}(s_i,t_i) = 1 for i \in \{1,2,3,4\}. It is easy to check that every cut has \Phi(S,\bar S) \geq 1, but the value of the maximum flow is only 3/4, implying that C \geq 4/3. If we use the same pattern on the complete bipartite graph K_{2,n} (instead of K_{2,3}), taking n \to \infty shows that we must have C \geq 3/2 in (1). (Later a differentiation argument applied to a different family of graphs will show that we need C \geq 2 in (1).)

For any graph G, define \mathsf{gap}(G) as the smallest value C such that (1) holds for every choice of capacities and demands on G. We will now see how \mathsf{gap}(G) is determined by the geometric properties of path metrics defined on G.

Bi-Lipschitz embeddings into L_1

Consider a metric space (X,d), and the space of absolutely integrable functions L_1 = L_1([0,1]), equipped with the \|\cdot\|_1 norm. A mapping f : X \to L_1 is said to be C-bi-lipschitz if there exist constants A,B > 0 such that for all x,y \in X,

\frac{1}{B} d(x,y) \leq \|f(x)-f(y)\|_1 \leq A\,d(x,y)

and C = A \cdot B (so if C=1, then f is an isometry up to scaling). The infimal value of C for which f is C-bi-lipschitz is called the distortion of f. Define c_1(X) as least distortion with which (X,d) maps into L_1.

We will now relate \mathsf{gap}(G) to the L_1-distortion of the geometries that G supports. Any set of non-negative weights w : E \to \mathbb R_+ on the edges of G gives rise to a distance \mathrm{dist}_w on V defined by taking shortest-paths. Define c_1(G) to be the maximum of c_1(V, \mathrm{dist}_w) as w ranges over all such weightings. (Strictly speaking, \mathrm{dist}_w is only a pseudometric.)

Here is the beautiful connection referred to previously (see this for a complete proof).

Theorem (Linial-London-Rabinovich, Aumann-Rabani): For any graph G, \mathsf{gap}(G) = c_1(G).

Metric obstructions. To give an idea of why the theorem is true, we mention two facts. It is rather obvious how a cut obstructs a flow. A more general type of obstruction is given by a metric \mathrm{dist} on V. For any valid flow F,

\displaystyle \mathsf{val}(F) \leq \frac{\sum_{xy \in E} \mathsf{cap}(x,y)\,\mathrm{dist}(x,y)}{\sum_{x,y \in V} \mathsf{dem}(x,y)\,\mathrm{dist}(x,y)} (2)

To see why this holds, think of giving every edge (x,y) \in E a “length” of \mathrm{dist}(x,y), and having a cross-sectional area of \mathsf{cap}(x,y). Then the numerator in (2) is precisely the “volume” of the network, while the denominator is the total “volume” required to satisfy all the demands: By the triangle inequality, sending one unit of flow from x to y requires volume at least \mathrm{dist}(x,y). Observe that the bound (2) is a generalization of the cut upper bound when we define the pseudometric \mathrm{dist}(x,y) = |\mathbf{1}_S(x)-\mathbf{1}_S(y)| for some S \subseteq V (this is called a cut pseudometric).

It turns out (by linear programming duality) that in fact metrics are the correct dual objects to flows, and maximizing the left hand side of (2) over valid flows, and minimizing the right-hand side over metrics yields equality (it is also easy to see that the minimal metric is a shortest-path metric). One might ask why one continues to study cuts if they are the “wrong” dual objects. I hope to address this extensively in future posts, but the basic idea is to flip the correspondence around: minimizing \Phi(S,\bar S) is NP-hard, while maximizing \mathsf{val}(F) can be done by linear programming. Thus we are trying to see how close we can get to the very complex object \Phi, by something which is efficiently computable.

The cut decomposition of L_1. Now that we see why metrics come into the picture, let’s see how L_1 presents itself. Define a cut measure on a finite set X as a mapping \mu : 2^X \to \mathbb R_+ which satisfies \mu(S) = \mu(\bar S) for every S \subseteq X.

Fact: Given F : X \to L_1, there exists a cut measure \mu on X such that \|F(x)-F(y)\|_1 = \sum_S \mu(S) |\mathbf{1}_S(x)-\mathbf{1}_S(y)| for every x,y \in X. Conversely, for every cut measure \mu, there exists a mapping F : X \to L_1 for which the same equality holds.

Thus every L_1-distance on a finite set is precisely a weighted sum of cut pseudometrics (and, of course, cuts are where this story began). Proving the fact is straightforward; first check that it holds for F : X \to \mathbb R, and then integrate. As an example, consider these two isometric embeddings, presented via their cut measures (the graphs are unit-weighted):

4-cycle 5-cycle embedding

In the 4-cycle, each cut has unit weight. The corresponding embedding into (\mathbb R^2, \|\cdot\|_1) maps the four vertices to \{(0,0),(0,1),(1,0),(1,1)\}. In the second example, the dashed cuts have weight 1/2, and the dotted cut has weight 1. (Exercise: Verify that every tree embeds into L_1 isometrically.)

Two comments are in order; first, when X is finite, one can equivalently take the target space to be \mathbb R^{|X| \choose 2} equipped with the \|\cdot\|_1 norm (exercise: prove this using Caratheodory’s theorem). Thus one might ask why we would introduce a function space L_1 in the first place. One reason is that the dimension is irrelevant; a deeper reason is that when X is infinite (in which case an appropriately stated version of the fact holds), L_1 is the proper setting (and not, e.g. \ell_1). This arises, for instance, in arguments involving ultralimits in the passage between the finite and infinite settings.

The planar embedding conjecture

Using the connection with L_1-embeddings, we can now state the planar multi-flow conjecture in its dual formulation.

Planar embedding conjecture: There exists a constant C \geq 1 such that every shortest-path metric on a planar graph G=(V,E) admits a C-bi-lipschitz embedding into L_1.

By a relatively simple approximation argument in one direction, and a compactness argument in the other, one has the following equivalent conjecture.

Riemannian version: There exists a constant C \geq 1 such that every Riemannian metric on the 2-sphere admits a C-bi-lipschitz embedding into L_1.

Recently, Indyk and Sidiropoulos proved that if it’s true for the 2-sphere, then it’s true for every compact surface of genus g \geq 1, where the constant C depends on g (as it must).

We can now state some known results in the dual setting of embeddings. Let G=(V,E) be a planar graph, and let \mathrm{dist} be a shortest-path metric on V. The previously mentioned theorem of Klein, Plotkin, and Rao can be stated as follows: There exists a non-expansive mapping f : V \to \mathbb R such that

O(1) \sum_{x,y \in V} |f(x)-f(y)| \geq \sum_{x,y \in V} \mathrm{dist}(x,y).

In Gromov’s language, the observable diameter of planar graph metrics is almost as large as possible (i.e. no forced concentration of Lispchitz mappings).

A classical theorem of Okamura and Seymour is stated in the embedding setting as: Let F \subseteq V be a face in some drawing of G in the plane. Then there exists a non-expansive mapping f : V \to L_1 such that f|_F is an isometry. In the flow setting, this says that if the demand function is supported on the pairs belonging to a single face, then (1) holds with C=1.

Finally, we mention the bound of Rao which shows that c_1(V,d) = O(\sqrt{\log |V|}) for planar metrics. A slightly more general version says that whenever the demand is supported on at most k pairs, the flow/cut gap can be at most O(\sqrt{\log k}).

Read about the relationship with coarse differentiation after the jump.

Parallel geodesics and the slums of geometry

Of course there is an obvious parallel between taking a fixed graph G and studying all the shortest-path metrics it supports, and considering the family of Riemannian metrics on a fixed manifold. Since graph theory is the “slums of topology,” I suppose this makes metric-graph theory the slums of differential geometry. (Fortuntately, in areas like coarse geometry, graphs are elevated to the level of equal partners.)

One fascinating question is how the structure of the geometries evolves as we allow the underlying graph to become more complex. It’s easy to understand the metrics supported on a path; they are all isometric to a subset of the real line. What if we allow more than one path between a pair of points?

Parallels

(Consider the edges as having unit weights.) As we’ve seen, the first space embeds isometrically. It turns out that the middle graph requires distortion 4/3 (the same as the flow gap we saw earlier), and if we put more and more parallel 2-paths, the distortion tends to 3/2 (look here for a short proof).

Is it possible that we can amplify this to unbounded distortion by being more clever? Consider the following operation on edges.

Edge splitting

Certainly the operation preserves planarity. Starting with an initial edge, we could repeat this over and over again to get a graph like

Series-parallel graph

The metrics supported on such graphs are precisely the class of shortest-path metrics on series-parallel (SP) graphs. (Although these graphs themselves do not constitute all SP graphs, by appropriately weighting them, we do exhaust all metrics supported on SP graphs.)

Gupta, Newman, Rabinovich, and Sinclair showed that all such graphs embed into L_1 with distortion at most 14. A first approximation to their embedding can be described as follows: Follow the sequence of edge splittings backwards by mapping a “child” vertex onto one of the two parents at random (not necessarily uniformly). At the end, every vertex is mapped onto either the left or right endpoint of the initial edge. This yields a cut; the weight of the cut is proportional to the probability of producing it. This doesn’t quite work; one has to occasionally isolate a child from her parents.

It turns out (as we showed recently with Chakrabarti, Jaffe, and Vincent) that the right answer for this class of graphs is 2. For now, we move to the lower bound.

Local rigidity and coarse differentiation

The following argument arose in joint work with P. Raghavendra.

Consider P_k, the path of length k. It is easy to check that every isometric embedding of the path into L_1 induces exactly the same cut measure:

Isometric embedding

where the dotted lines represent cuts in the obvious way. The question now becomes: how stable is this property? What happens if F : P_k \to L_1 is a mapping with distortion C? Obviously we cannot hope for a global conclusion (you can start adding all kind of crazy cuts to the isometry without affecting the distortion too much).

Local rigidity. Instead, think of P_k = \{1,2,\ldots,k\}. Given any N, C \geq 1 and \epsilon > 0, I claim that there exists a large enough k = k(N,\epsilon,C) such that in any distortion-C embedding F : P_k \to L_1, the following happens. There exist points Q = \{A,A+B,A+2B,\ldots,A+NB\} \subseteq P_k such that F|_Q has distortion at most 1+\epsilon. Moreover, if we look at the cut measure induced on Q by F, all but an \epsilon fraction of the weight is concentrated on cuts which occur in the isometric embedding, and the weight of each cut is right up to a factor of 1 + \epsilon. In other words, “locally” any low-distortion L_1 embedding has to act almost like an isometry. In fact, this holds “almost everywhere,” in the sense that it can be made to hold for most values of A and B. I’ll sketch the proof of this after showing how it can be used to prove a better max-flow/min-cut gap in planar graphs.

This is not very surprising if you’ve seen e.g. Lebesgue’s differentiation theorem or Rademacher’s theorem. But the crucial fact here is that the conclusion of Rademacher’s theorem does not hold for mappings f : \mathbb R \to L_1. In the language of geometric functional analysis, L_1 fails to have the Radon-Nikodym property. (The usual example is to take the map F(t) = \mathbf{1}_{[0,t]}. Observe that this map is an isometry, but is not differentiable anywhere, since \lim_{h \to 0} h^{-1}[F(t+h)-F(t)]=\lim_{h \to 0} h^{-1} \mathbf{1}_{[t,t+h]}.)

The lower bound graphs. Recall that K_{2,n} is the complete 2 \times n bipartite graph; we will think of K_{2,n} as having n disjoint 2-paths between a pair of points. Now replace every edge of K_{2,n} by a copy of K_{2,n}. If you did this with K_{2,3}, you would get:

K_{3,3} construction

Let’s call the n-fold repetition of this process K_{2,n}^{\oslash n}. Observe that the preceding graph has 6 small copies of K_{2,3}, but it also has a large metrical copy (consisting of the white nodes). In terms of offsets from the left and the notation above, we can describe the white copy by A=0,B=2 and e.g. one of the small copies as A=2,B=1. The graphs K_{2,2}^{\oslash n} are known as diamond graphs.

The point now is to take F : K_{2,n}^{\oslash n} \to L_1 and use the preceding local rigidity statement to find a “copy” of K_{2,n} so that F acts nearly isometrically on every 2-path in the copy. By setting \epsilon \approx \frac{1}{n} (with N=2), we can ensure that most of the cuts restricted to our copy of K_{2,n} are of the following form:

s Montone cut t

In other words, they separate s from t (stated differently, restricted to any shortest s\!-\!t path, the cuts look like those coming from an isometry). Although c_1(K_{2,n}) \leq \frac32, it is fairly easy to see that any embedding consisting solely of cuts of this form must have distortion \to 2 as n \to \infty, showing that if (1) holds, then it requires C \geq 2. To see this, notice that every such cut has half the edges crossing it, but can separate at most half of the {n \choose 2} middle pairs (well, slightly more than half which is why we have to take n \to \infty). But edges have length 1, and the middle pairs are all separated by distance 2. The fact that these cuts treats edges and middle pairs the same (in terms of the fraction cut), implies that any embedding consisting of such cuts incurs distortion at least 2.

Coarse differentiation. Finally, we move to the proof of the local rigidity statement. This is based on the coarse differentiation technique of Eskin, Fisher, and Whyte. Given a mapping F : P_N \to (X,\mathrm{dist}) of the path of length N into an arbitrary metric space X, we say that F is \epsilon-efficient if

\mathrm{dist}(F(1),F(N)) \leq \sum_{i=1}^{N-1} \mathrm{dist}(F(i), F(i+1)) \leq (1+\epsilon)\mathrm{dist}(F(1),F(N))

This is a discrete sort of bounded variation. The left-hand side holds trivially from the triangle inequality, and the right-hand side says that the triangle inequality isn’t too far off, i.e. the image F(P_N) is “almost straight.”

Let’s assume that k = N^r for some r \in \mathbb N, and we’ll break P_k into scales: Scale i involves points which are multiples of N^{r-i}. The level-i segments are those portions of precisely N+1 level-(i-1) points between two level-i points. See the following picture, where N = 3, and the level 1, 2, and 3 segments are shown.

Multi-scale path

We would like to find a segment at some scale on which F is \epsilon-efficient. The differentiation argument is now this: If F is \epsilon-efficient at level 1, stop (we found a good scale). Otherwise, F must be fairly non-straight at the first scale approximation:

Scale 1

Now if F is \epsilon-efficient on one of the level-2 segments, stop (actually, we want this to hold for most of the segments, but the same argument works). If not, F must be fairly non-straight on the second scale segments:

Scale 2

And if we fail again to find a good segment at the next scale:

Scale 3

As the pictures expresses pretty clearly, every time we reveal that F behaves badly on the next scale, our estimate on the length of F(P_k) goes up. But F has bounded distortion, and thus the length of F(P_k) cannot be too much bigger than the length of P_k. If there are enough scales, we eventually find a good one or reach a contradiction (and averaging shows, in fact, that at most scales, almost all the segments must be good lest we encounter a contradiction). It’s important to note that if we only used the fact that F is Lipschitz, we would not be able to get a definite increment in our estimate on the length of F(P_k) at every step; this requires that F be bi-Lipschitz.

Specializing to L_1 embeddings. The argument of Eskin, Fisher, and Whyte works for any target metric space X. To get information about cuts, we specialize the preceding discussion to the case X = L_1.

Again consider the path P_N = \{1,2,\ldots,N\}. Call a cut S \subseteq P_N monotone if S is either a prefix or a suffix of the path P_N (in other words, at most one edge of P_N crosses (S,\bar S)). Observe that monotone cuts are precisely those which have non-zero weight in the isometric embedding of P_N.

If (S,\bar S) is not a monotone cut, then it cuts at least two edges, hence we have the inequality

\displaystyle\sum_{i=1}^{N-1} |\mathbf{1}_S(i)-\mathbf{1}_S(i+1)| \geq 2\,|\mathbf{1}_S(1)-\mathbf{1}_S(N)|.

This shows that the mapping \mathbf{1}_S : P_N \to \mathbb R is at best 1-efficient. But now consider a mapping F : P_N \to L_1. We know that F decomposes as \|F(x)-F(y)\|_1 = \sum_{S \subseteq P_N} \mu(S) |\mathbf{1}_S(x)-\mathbf{1}_S(y)| for some cut measure \mu. Thus if F is \epsilon-efficient, then it must be that \mu has at most an \epsilon-fraction of its weight supported on non-monotone cuts.

So once we find a scale at which F is \epsilon-efficient, we have strong control over the cut measure restricted to that scale, as was required for the distortion lower bound earlier.

Beyond planar graphs

The planar embedding conjecture is still open. It seems that if differentiation arguments are going to disprove it, they will have to use more than just local rigidity for paths (with some work, one can get local rigidity for grids, for instance, but then it’s hard to play around with full-blown grids without violating the topology).

If one is willing to work with a significantly more sophisticated (though still “finite-dimensional,” e.g. doubling) metric space, then a different sort of weak differentiation argument is able to show that there exists no bi-Lipschitz embedding into L_1 at all. Assaf Naor and I conjectured that such a lower bound holds for the 3-dimensional Heisenberg group \mathbb H^3(\mathbb R) equipped with its Carnot-Caratheodory metric, in relation to the Goemans-Linial conjecture (this is a—now disproven—conjecture about the max-flow/min-cut gap for a significantly weaker notion of multi-commodity “flows”). The conjecture was proved by Cheeger and Kleiner, using local rigidity results of Franchi, Serapioni, and Serra Cassano for sets of finite perimeter in the Heisenberg group.

4 thoughts on “Planar multi-flows, L_1 embeddings, and differentiation

  1. Amazing. Here’s a question, are there families of graphs with good separators but bad L_1-distortion?

  2. Yes, for instance the 3-dimensional Heisenberg group mentioned at the end of the article. The results of Cheeger and Kleiner imply that grids in this group have unbounded distortion (as the size of the grid goes to infinity), but they have separators of size ~ sqrt{N}, where N is the number of points.

Leave a comment