# Lecture 5: Uniformizing graphs, multi-flows, and eigenvalues

In the previous lecture, we gave an upper bound on the second eigenvalue of the Laplacian of (bounded degree) planar graphs in order to analyze a simple spectral partitioning algorithm.  A natural question is whether these bounds extend to more general families of graphs.  Well-known generalizations of planar graphs are those which can be embedded on a surface of fixed genus, and, more generally, families of graphs that arise by forbidding minors.  In fact, Spielman and Teng conjectured that for any graph excluding $K_h$ as a minor, one should have $\lambda_2 \lesssim \mathrm{poly}(h) d_{\max}/n$.   Of course planar graphs have genus 0, and by Wagner’s theorem, are precisely the graphs which exclude $K_5$ and $K_{3,3}$ as minors.  In this lecture, we will follow an intrinsic approach of Biswal, myself, and Rao which, in particular, is able to resolve the conjecture of Spielman and Teng.  First, we see why even pushing the conformal approach to bounded genus graphs is difficult.

### Bounded genus graphs

For graphs of bounded genus, there is hope to use an approach based on conformal mappings.  In 1980, Yang and Yau proved that

$\displaystyle \lambda_2(M) \lesssim \frac{g+1}{\mathrm{vol}(M)}$

for any compact Riemannian surface $M$ of genus $g$.  (Note that for the Laplace-Beltrami operator, one usually writes $\lambda_1$ as the first non-zero eigenvalue, rather than $\lambda_2$.)  In analog with Hersch’s proof of the genus 0 case, they use Riemann-Roch to obtain a degree-$(g+1)$ conformal mapping to the Riemann sphere, then try to pull back a second eigenfunction.  A factor of the degree is lost in the Rayleigh quotient (hence the $g+1$ factor in the preceding bound), and Hersch’s Möbius trick is still required.

An analogous proof for graphs $G$ of bounded genus would proceed by constructing a circle packing of $G$ on the sphere $S^2$, but instead of the circles having disjoint interiors, we would be assured that every point of $S^2$ is contained in at most $g$ circles.  Unfortunately, such a result is impossible (this has to do with the handling of branch points in the discrete setting).  Kelner has to take a different approach in his proof that $\lambda_2(G) \leq d_{\max}^{O(1)} (g+1)/n$ for graphs $G$ of genus at most $g$.

He starts with a circle packing of $G$ on a compact surface $\mathbb S_0$ of genus $g$ (whose existence follows from results of Beardon and Stephenon and He and Schramm).  Then Kelner randomly subdivides $G$ repeatedly, and these subdivisions give progressively better approximations to some sequence of surfaces $\{\mathbb S_i\}$.  Once the approximation is of high enough quality, one applies Riemann-Roch to $\mathbb S_k$, and infers something about a subdivision of $G$.  The final element is to track how the second eigenvalue of $G$ changes (in expectation) under random subdivision.

Needless to say, this approach is already quite delicate, and for graphs that can’t be equipped with some kind of conformal structure, we seem to have reached a dead end.  In this lecture, we’ll see how to use intrinsic deformations of the geometry of $G$ in order to bound its eigenvalues.  Eventually, this will reduce to the study of certain kinds of multi-commodity flows.

### Metrics on graphs

Let $G=(V,E)$ be an arbitrary n-vertex graph with maximum degree $d_{\max}$.  Recall that we can write

$\displaystyle \lambda_2 = \min_{f \neq 0 : \sum_{x \in V} f(x)=0} \frac{\sum_{xy \in E} |f(x)-f(y)|^2}{\sum_{x \in V} f(x)^2}.$

where $f : V \to \mathbb R$.  (Also recall that we can replace $\mathbb R$ by any Hilbert space, and the same formula holds.)  The first step is to prepare this equality for “non-linearization” by getting rid of the linear condition $\sum_{x \in V} f(x)=0$ and the sum $\sum_{x \in V} f(x)^2$.  (This is a popular sort of passage in the non-linear geometry of Banach spaces, which also plays a rather important role in applications to the theoretical CS.)  The goal is to get only terms that look like $|f(x)-f(y)|$.  Fortunately, there is a well-known way to do this:

$\displaystyle \lambda_2 = 2 n \cdot \min_{f : V \to \mathbb R} \frac{\sum_{xy \in E} |f(x)-f(y)|^2}{\sum_{x,y \in V} |f(x)-f(y)|^2},$

which follows easily from the equality $\sum_{x,y \in V} |f(x)-f(y)|^2 = 2n \sum_{x \in V} f(x)^2$ when $\sum_{x \in V} f(x)=0$.

Thus if we want to bound $\lambda_2 = O(1/n)$, we need to find an $f : V \to \mathbb R$ for which the latter ratio (without the $2n$) is $O(1/n^2)$.  Now, for someone who works a lot with linear programming relaxations, it’s very natural to consider a “relaxation”

$\displaystyle \gamma(G) = \min_{d} \frac{\sum_{xy \in E} d(x,y)^2}{\sum_{x,y \in V} d(x,y)^2},$

where the minimization is over all pseudo-metrics d, i.e. symmetric non-negative functions $d : V \times V \to \mathbb R$ which satisfy the triangle inequality, but might have $d(x,y)=0$ even for $x \neq y$.  Certainly $\gamma(G) \leq \lambda_2/2n$, but Bourgain’s embedding theorem (which states that every n-point metric space embeds into a Hilbert space with distortion at most $O(\log n)$) also assures us that $\lambda_2(G) \leq O(n \log^2 n) \gamma(G)$.  Since we are trying to show that $\gamma(G) = O(1/n^2)$, this $O(\log^2 n)$ term is morally negligible.  One can see the paper for a more advanced embedding argument that doesn’t lose this factor, but for now we concentrate on proving that $\gamma(G) = O(1/n^2)$.  The embedding theorems allow us to concentrate on finding an intrinsic metric on the graph with small “Rayleigh quotient,” without having to worry about an eventual geometric representation.

As a brief preview… we are going to find a good metric $d$ by taking a certain kind of all-pairs multi-commodity flow at optimality, and weighting the edges by their congestion in the optimal flow.  Thus as the flow spreads out on the graph, it has the effect of “uniformizing” its geometry.

### Discrete Riemannian metrics, convexification, and duality

Let’s now assume that $G$ is planar.  We want to show that $\gamma(G) = O(d_{\max}/n^2)$.  First, let’s restrict ourselves to vertex weighted metrics on $G$.  Given any non-negative weight function $\omega : V \to \mathbb R$, we can define the length of a path $P$ in $G$ by summing the weights of vertices along it:  $\mathsf{len}_{\omega}(P) = \sum_{x \in P} \omega(x)$.  Then we can define a vertex-weighted shortest-path pseudo-metric on $G$ in the natural way

$\displaystyle \mathsf{dist}_{\omega}(x,y) = \min \left\{ \mathsf{len}_{\omega}(P) : P \in \mathcal P_{xy}\right\},$

where $\mathcal P_{uv}$ is the set of all u-v paths in $G$.  We also have the nice relationship

$\displaystyle \sum_{xy \in E} \mathsf{dist}_{\omega}(x,y)^2 \leq 2 d_{\max} \sum_{x \in V} \omega(x)^2,\qquad(1)$

since $\mathsf{dist}_{\omega}(x,y) \leq [\omega(x)+\omega(y)]^2$.

So if we define

$\displaystyle \Lambda_0(\omega) = \frac{\displaystyle \sum_{x \in V} \omega(x)^2}{\displaystyle \sum_{x,y \in V} \mathsf{dist}_{\omega}(x,y)^2}$

then by (1), we have $\gamma(G) \leq 2 d_{\max} \min_{\omega} \Lambda_0(\omega)$.

Examples. Let’s try to exhibit weights $\omega$ for two well-known examples:  the grid, and the complete binary tree.

For the $\sqrt{n} \times \sqrt{n}$ grid, we can simply take $\omega(x)=1$ for all $x \in V$.  Clearly $\sum_{x \in V} \omega(x)^2 = n$.  On the other hand, a random pair of points in the grid is $\Omega(\sqrt{n})$ apart, hence $\sum_{x,y \in V} \mathsf{dist}_{\omega}(x,y)^2 \approx n^2 \cdot (\sqrt{n})^2 = n^3$.  It follows that $\Lambda_0(\omega) = O(1/n^2)$, as desired.

For the complete binary tree with root $r$, we can simply put $\omega(r)=1$ and $\omega(x)=0$ for $x \neq r$.  (Astute readers will guess the geometrically decreasing weights are actually the optimal choice.)  In this case, $\sum_{x \in V} \omega(x)^2 = 1$, while all the pairs $x,y$ on opposite sides of the root have $\mathsf{dist}_{\omega}(x,y)=1$.  It again follows that $\Lambda_0(\omega) = O(1/n^2)$.  Our goal is to provide such a weight $\omega$ for any planar graph.

# Lecture 4: Conformal mappings, circle packings, and spectral geometry

In Lecture 2, we used spectral partitioning to rule out the existence of a strong parallel repetition theorem for unique games.  In practice, spectral methods are a very successful heuristic for graph partitioning, and in the present lecture we’ll see how to analyze these partitioning algorithms for some common families of graphs.

### Balanced separators, eigenvalues, and Cheeger’s inequality

Lipton and Tarjan proved that every planar graph has a negligibly small set of nodes whose remval splits the graph into two roughly equal pieces.  More specifically, every n-node planar graph can be partitioned into three disjoint sets $A,B,S$ such that there are no edges from $A$ to $B$, the separator $S$ has at most $O(\sqrt{n})$ nodes, and $|A|,|B| \geq n/3$.  This allows one to do all sorts of things, e.g. a simple divide-and-conquer algorithm gives a linear time $(1+\epsilon)$-approximation for the maximum independent set problem in such graphs, for any $\epsilon > 0$.

So there is a natural question of how well spectral methods do, for example, on planar graphs.  Spielman and Teng showed that for bounded-degree planar graphs, a simple recursive spectral algorithm recovers a partition $V=A \cup B$ of the vertex set so that $|E(A,B)| = O(\sqrt{n})$.  In other words, for bounded-degree planar graphs, spectral methods recover the Lipton-Tarjan separator theorem!  This is proved by combining Cheeger’s inequality with their main theorem.

Theorem [Spielman-Teng]: Every n-node planar graph with maximum degree $d_{\max}$ has $\displaystyle \lambda_2(G) = O\left(\frac{d_{\max}}{n}\right)$, where $\lambda_2(G)$ is the second eigenvalue of the combinatorial Laplacian on $G$.

Recall that we introduced the combinatorial Laplacian in Lecture 2.  If $G=(V,E)$ is an arbitrary finite graph, in this lecture it will make more sense to think about the Laplacian $\Delta$ as an operator on functions $f : V \to \mathbb R$ given by

$\displaystyle \Delta f(x) = \mathrm{deg}(x) f(x) - \sum_{y : xy \in E} f(y).$

If we define the standard inner product $\langle f,g\rangle = \sum_{x \in V} f(x)g(x)$, then one can easily check that for any such $f$, we have $\langle f, \Delta f\rangle = \sum_{xy \in E} |f(x)-f(y)|^2$.  In particular, this implies that $\Delta$ is a positive semi-definite operator.  If we denote its eigenvalues by $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$, then it is also easy to check that $\lambda_1 = 0$, with corresponding eigenfunction $f(x)=1$ for every $x\in V$.

Thus by standard variational principles, we have

$\displaystyle \lambda_2 = \min_{f \neq 0 : \sum_{x \in V} f(x)=0} \frac{\sum_{xy \in E} |f(x)-f(y)|^2}{\sum_{x \in V} f(x)^2}.$

Let us also define the Cheeger constant $h_G$.  For an arbitrary subset $S \subseteq V$, let

$\displaystyle h(S) = \frac{|E(S, \bar S)|}{\min(|S|,|\bar S|)}$,

note that this definition varies from the $h$ we defined in Lecture 2, because we will be discussing eigenfunctions without boundary conditions.  Now one defines $h_G = \min_{S \subseteq V} h(S)$.

Finally, we have the version of Cheeger’s inequality (proved by Alon and Milman in the discrete setting) for graphs without boundary.

Cheeger’s inequality: If $G=(V,E)$ is any graph with maximum degree $d_{\max}$, then

$\displaystyle \lambda_2(G) \geq \frac{h(G)^2}{2d_{\max}}.$

This follows fairly easy from the Dirichlet version of Cheeger’s inequality presented in Lecture 2.  Here’s a sketch:  Let $f : V \to \mathbb R$ satisfy $\Delta f = \lambda_2 f$, and suppose, without loss of generality, that $V_+ = \{ x : f(x) > 0 \}$ has $|V_+| \geq n/2$.  Define $f_+(x)=f(x)$ for $f(x) > 0$ and $f_+(x)=0$ otherwise.  Then $f_+|_B = 0$ for $B = V \setminus V_+$, so we can plug $f_+$ into the Dirichlet version of Cheeger’s inequality with boundary conditions on $B$.  For the full analysis, see this note which essentially follows this approach.  By examining the proof, note that one can find a subset $S \subseteq V$ with $h(S) \leq \sqrt{2 d_{\max} \lambda_2}$ by a simple “sweep” algorithm:  Arrange the vertices $V = \{v_1, v_2, \ldots, v_n\}$ so that $f(v_1) \leq f(v_2) \leq \cdots \leq f(v_n)$, and output the best of the $n-1$ cuts $\{v_1, \ldots, v_i\}, \{v_{i+1}, \ldots, v_n\}$.

So using the eigenvalue theorem of Spielman and Teng, along with Cheeger’s inequality, we can find a set $S \subseteq V$ with $h(S) \lesssim \sqrt{d_{\max}/n}$.  While this cut has the right Cheeger constant, it is not necessarily balanced (i.e. $\min(S, \bar S)$ could be very small).  But one can apply this algorithm recursively, perhaps continually cutting small chunks off of the graph until a balanced cut is collected.  Refer to the Spielman-Teng paper for details.  A great open question is how one might use spectral information about $G$ to recover a balanced cut immediately, without the need for recursion.

### Conformal mappings and circle packings

Now we focus on proving the bound $\lambda_2(G) \lesssim d_{\max}/n$ for any planar graph $G$.  A natural analog is to look at what happens for the Laplace-Beltrami operator for a Riemannian metric on the 2-sphere.  In fact, Hersch considered this problem almost 40 years ago and proved that $\lambda_2(M) \lesssim 1/\mathrm{vol}(M)$, for any such Riemannian manifold $M$.  His approach was to first use the uniformization theorem to get a conformal mapping from $M$ onto $S^2$, and then try to pull-back the standard second eigenfunctions on $S^2 \subseteq \mathbb R^3$ (which are just the three coordinate projections).  Since the Dirichlet energy is conformally invariant in dimension 2, this almost works, except that the pulled-back map might not be orthogonal to the constant function.  To fix this, he has to post-process the initial conformal mapping with an appropriate Möbius transformation.

Unaware of Hersch’s work, Spielman and Teng derived eigenvalue bounds for planar graphs using the discrete analog of this approach:  Circle packings replace conformal mappings, and one still has to show the existence of an appropriate post-processing Möbius transformation.

# Planar multi-flows, L_1 embeddings, and differentiation

In this post, I’ll discuss the relationship between multi-flows and sparse cuts in graphs, bi-lipschitz embeddings into $L_1$, and the weak differentiation of $L_1$-valued mappings. It revolves around one of my favorite open questions in this area, the planar multi-flow conjecture.

### The max-flow/min-cut theorem

Let $G=(V,E)$ be a finite, undirected graph, with a mapping $\mathsf{cap} : E \to \mathbb R_+$ assigning a capacity to every edge. If $\mathcal P_{s,t}$ is the set of all paths from $s$ to $t$, then an $s\!-\!t$ flow is a mapping $F : \mathcal P_{s,t} \to \mathbb R_+$ which doesn’t overload the edges beyond their capacities: For every edge $e \in E$,

$\displaystyle \sum_{\gamma \in \mathcal P_{s,t} : e \in \gamma} F(\gamma) \leq \mathsf{cap}(e).$

The value of the flow F is the total amount of flow sent: $\mathsf{val}(F) = \sum_{\gamma \in \mathcal P_{s,t}} F(\gamma)$. A cut in $G$ is a partition $V = S \cup \bar S$ which we will usually write as $(S, \bar S)$. Naturally, one defines the capacity across $(S, \bar S)$ by $\mathsf{cap}(S,\bar S) = \sum_{xy \in E} \mathsf{cap}(x,y) |\mathbf{1}_S(x)-\mathbf{1}_S(y)|$, where $\mathbf{1}_S$ is the characteristic function of $S$. Since one can only send as much flow across a cut as there is capacity to support it, it is easy to see that for any valid flow $F$ and any cut $(S, \bar S)$ with $s \in S$ and $t \in \bar S$, we have $\mathsf{val}(F) \leq \mathsf{cap}(S, \bar S)$. The classical max-flow/min-cut theorem says that in fact these upper bounds are achieved:

$\max_{F} \mathsf{val}(F)=\min_{(S, \bar S)} \mathsf{cap}(S, \bar S)$

where the maximum is over all $s\!-\! t$ flows, and the minimum is over all cuts $(S, \bar S)$ with $s \in S$ and $t \in \bar S$.

### Multi-commodity flows and sparse cuts

We will presently be interested in multi-commodity flows (multi-flows for short), where we are given a demand function $\mathsf{dem} : V \times V \to \mathbb R_+$ which requests that we send $\mathsf{dem}(x,y)$ units of flow from $x$ to $y$ for every $x,y \in V$. In this case, we’ll write the value of a valid flow (i.e. one which doesn’t try to send more total flow than an edge can carry) as

$\displaystyle \mathsf{val}(F) = \min_{x,y \in V} \frac{\sum_{\gamma \in \mathcal P_{x,y}} F(\gamma)}{\mathsf{dem}(x,y)}$

where the minimum is over pairs with $\mathsf{dem}(x,y) \neq 0$. So if the value of the flow is $\frac12$, it means that every pair gets at least half the flow it requested (well, demanded).

Again, cuts give a natural obstruction to flows. If we define $\Phi(S, \bar S) = \displaystyle \frac{\mathsf{cap}(S, \bar S)}{\mathsf{dem}(S, \bar S)}$, where $\mathsf{dem}(S, \bar S) = \sum_{x,y \in V} \mathsf{dem}(x,y) |\mathbf{1}_S(x)-\mathbf{1}_S(y)|$ is the total demand requested across $(S, \bar S)$, then $\mathsf{val}(F) \leq \Phi(S, \bar S)$ for any valid flow $F$. Unfortunately, it is no longer true in general that $\max_F \mathsf{val}(F) = \min_{(S, \bar S)} \Phi(S)$. (It is true as long as the demand is supported on a set of size at most 4, while it stops being true in general when the demand is supported on a set of size 5 or larger.)

In fact, the gap between the two can be arbitrarily large, as witnessed by expander graphs (this rather fruitful connection will be discussed in a future post). For now, we’ll concentrate on a setting where the gap is conjectured to be at most $O(1)$.

### The planar multi-flow conjecture

It has been conjectured that there exists a universal constant $C \geq 1$ such that if $G=(V,E)$ is a planar graph (i.e. can be drawn in the plane without edge crossings), there is a $C$-approximate multi-commodity max-flow/min-cut theorem: For any choice of $\mathsf{dem}$ and $\mathsf{cap}$, one has a

$\displaystyle \frac{1}{C} \min_{(S, \bar S)} \Phi(S, \bar S) \leq \max_F \mathsf{val}(F) \leq \min_{(S, \bar S)} \Phi(S, \bar S)$ (1)

The conjecture first appeared in print here, but was tossed around since the publications of Linial, London, and Rabinovich and Aumann and Rabani, which recast these multi-flow/cut gaps as questions about bi-lipschitz mappings into $L_1$ (discussed next). Perhaps the most compelling reason to to believe the conjecture is the beautiful result of Klein, Plotkin, and Rao which shows that $C = O(1)$ for any planar instance where $\mathsf{dem}(x,y) = 1\,\,\forall x,y \in V$ (this is called a uniform multi-flow instance).

It is relatively easy to see that we cannot take $C=1$, as the following example of Okamura and Seymour shows.

In the example, the non-zero demands are $\mathsf{dem}(s_i,t_i) = 1$ for $i \in \{1,2,3,4\}$. It is easy to check that every cut has $\Phi(S,\bar S) \geq 1$, but the value of the maximum flow is only $3/4$, implying that $C \geq 4/3$. If we use the same pattern on the complete bipartite graph $K_{2,n}$ (instead of $K_{2,3}$), taking $n \to \infty$ shows that we must have $C \geq 3/2$ in (1). (Later a differentiation argument applied to a different family of graphs will show that we need $C \geq 2$ in (1).)

For any graph $G$, define $\mathsf{gap}(G)$ as the smallest value $C$ such that (1) holds for every choice of capacities and demands on $G.$ We will now see how $\mathsf{gap}(G)$ is determined by the geometric properties of path metrics defined on $G.$

### Bi-Lipschitz embeddings into $L_1$

Consider a metric space $(X,d)$, and the space of absolutely integrable functions $L_1 = L_1([0,1])$, equipped with the $\|\cdot\|_1$ norm. A mapping $f : X \to L_1$ is said to be $C$-bi-lipschitz if there exist constants $A,B > 0$ such that for all $x,y \in X$,

$\frac{1}{B} d(x,y) \leq \|f(x)-f(y)\|_1 \leq A\,d(x,y)$

and $C = A \cdot B$ (so if $C=1$, then $f$ is an isometry up to scaling). The infimal value of $C$ for which $f$ is $C$-bi-lipschitz is called the distortion of $f$. Define $c_1(X)$ as least distortion with which $(X,d)$ maps into $L_1$.

We will now relate $\mathsf{gap}(G)$ to the $L_1$-distortion of the geometries that $G$ supports. Any set of non-negative weights $w : E \to \mathbb R_+$ on the edges of $G$ gives rise to a distance $\mathrm{dist}_w$ on $V$ defined by taking shortest-paths. Define $c_1(G)$ to be the maximum of $c_1(V, \mathrm{dist}_w)$ as $w$ ranges over all such weightings. (Strictly speaking, $\mathrm{dist}_w$ is only a pseudometric.)

Here is the beautiful connection referred to previously (see this for a complete proof).

Theorem (Linial-London-Rabinovich, Aumann-Rabani): For any graph $G$, $\mathsf{gap}(G) = c_1(G)$.

Metric obstructions. To give an idea of why the theorem is true, we mention two facts. It is rather obvious how a cut obstructs a flow. A more general type of obstruction is given by a metric $\mathrm{dist}$ on $V$. For any valid flow $F$,

$\displaystyle \mathsf{val}(F) \leq \frac{\sum_{xy \in E} \mathsf{cap}(x,y)\,\mathrm{dist}(x,y)}{\sum_{x,y \in V} \mathsf{dem}(x,y)\,\mathrm{dist}(x,y)}$ (2)

To see why this holds, think of giving every edge $(x,y) \in E$ a “length” of $\mathrm{dist}(x,y)$, and having a cross-sectional area of $\mathsf{cap}(x,y)$. Then the numerator in (2) is precisely the “volume” of the network, while the denominator is the total “volume” required to satisfy all the demands: By the triangle inequality, sending one unit of flow from $x$ to $y$ requires volume at least $\mathrm{dist}(x,y)$. Observe that the bound (2) is a generalization of the cut upper bound when we define the pseudometric $\mathrm{dist}(x,y) = |\mathbf{1}_S(x)-\mathbf{1}_S(y)|$ for some $S \subseteq V$ (this is called a cut pseudometric).

It turns out (by linear programming duality) that in fact metrics are the correct dual objects to flows, and maximizing the left hand side of (2) over valid flows, and minimizing the right-hand side over metrics yields equality (it is also easy to see that the minimal metric is a shortest-path metric). One might ask why one continues to study cuts if they are the “wrong” dual objects. I hope to address this extensively in future posts, but the basic idea is to flip the correspondence around: minimizing $\Phi(S,\bar S)$ is NP-hard, while maximizing $\mathsf{val}(F)$ can be done by linear programming. Thus we are trying to see how close we can get to the very complex object $\Phi$, by something which is efficiently computable.

The cut decomposition of $L_1$. Now that we see why metrics come into the picture, let’s see how $L_1$ presents itself. Define a cut measure on a finite set $X$ as a mapping $\mu : 2^X \to \mathbb R_+$ which satisfies $\mu(S) = \mu(\bar S)$ for every $S \subseteq X$.

Fact: Given $F : X \to L_1$, there exists a cut measure $\mu$ on $X$ such that $\|F(x)-F(y)\|_1 = \sum_S \mu(S) |\mathbf{1}_S(x)-\mathbf{1}_S(y)|$ for every $x,y \in X$. Conversely, for every cut measure $\mu$, there exists a mapping $F : X \to L_1$ for which the same equality holds.

Thus every $L_1$-distance on a finite set is precisely a weighted sum of cut pseudometrics (and, of course, cuts are where this story began). Proving the fact is straightforward; first check that it holds for $F : X \to \mathbb R$, and then integrate. As an example, consider these two isometric embeddings, presented via their cut measures (the graphs are unit-weighted):

In the 4-cycle, each cut has unit weight. The corresponding embedding into $(\mathbb R^2, \|\cdot\|_1)$ maps the four vertices to $\{(0,0),(0,1),(1,0),(1,1)\}$. In the second example, the dashed cuts have weight 1/2, and the dotted cut has weight 1. (Exercise: Verify that every tree embeds into $L_1$ isometrically.)

Two comments are in order; first, when $X$ is finite, one can equivalently take the target space to be $\mathbb R^{|X| \choose 2}$ equipped with the $\|\cdot\|_1$ norm (exercise: prove this using Caratheodory’s theorem). Thus one might ask why we would introduce a function space $L_1$ in the first place. One reason is that the dimension is irrelevant; a deeper reason is that when $X$ is infinite (in which case an appropriately stated version of the fact holds), $L_1$ is the proper setting (and not, e.g. $\ell_1$). This arises, for instance, in arguments involving ultralimits in the passage between the finite and infinite settings.

### The planar embedding conjecture

Using the connection with $L_1$-embeddings, we can now state the planar multi-flow conjecture in its dual formulation.

Planar embedding conjecture: There exists a constant $C \geq 1$ such that every shortest-path metric on a planar graph $G=(V,E)$ admits a $C$-bi-lipschitz embedding into $L_1$.

By a relatively simple approximation argument in one direction, and a compactness argument in the other, one has the following equivalent conjecture.

Riemannian version: There exists a constant $C \geq 1$ such that every Riemannian metric on the 2-sphere admits a $C$-bi-lipschitz embedding into $L_1$.

Recently, Indyk and Sidiropoulos proved that if it’s true for the 2-sphere, then it’s true for every compact surface of genus $g \geq 1$, where the constant $C$ depends on $g$ (as it must).

We can now state some known results in the dual setting of embeddings. Let $G=(V,E)$ be a planar graph, and let $\mathrm{dist}$ be a shortest-path metric on $V$. The previously mentioned theorem of Klein, Plotkin, and Rao can be stated as follows: There exists a non-expansive mapping $f : V \to \mathbb R$ such that

$O(1) \sum_{x,y \in V} |f(x)-f(y)| \geq \sum_{x,y \in V} \mathrm{dist}(x,y)$.

In Gromov’s language, the observable diameter of planar graph metrics is almost as large as possible (i.e. no forced concentration of Lispchitz mappings).

A classical theorem of Okamura and Seymour is stated in the embedding setting as: Let $F \subseteq V$ be a face in some drawing of $G$ in the plane. Then there exists a non-expansive mapping $f : V \to L_1$ such that $f|_F$ is an isometry. In the flow setting, this says that if the demand function is supported on the pairs belonging to a single face, then (1) holds with $C=1$.

Finally, we mention the bound of Rao which shows that $c_1(V,d) = O(\sqrt{\log |V|})$ for planar metrics. A slightly more general version says that whenever the demand is supported on at most $k$ pairs, the flow/cut gap can be at most $O(\sqrt{\log k})$.