Lecture 5: Uniformizing graphs, multi-flows, and eigenvalues

In the previous lecture, we gave an upper bound on the second eigenvalue of the Laplacian of (bounded degree) planar graphs in order to analyze a simple spectral partitioning algorithm.  A natural question is whether these bounds extend to more general families of graphs.  Well-known generalizations of planar graphs are those which can be embedded on a surface of fixed genus, and, more generally, families of graphs that arise by forbidding minors.  In fact, Spielman and Teng conjectured that for any graph excluding K_h as a minor, one should have \lambda_2 \lesssim \mathrm{poly}(h) d_{\max}/n.   Of course planar graphs have genus 0, and by Wagner’s theorem, are precisely the graphs which exclude K_5 and K_{3,3} as minors.  In this lecture, we will follow an intrinsic approach of Biswal, myself, and Rao which, in particular, is able to resolve the conjecture of Spielman and Teng.  First, we see why even pushing the conformal approach to bounded genus graphs is difficult.

Bounded genus graphs

For graphs of bounded genus, there is hope to use an approach based on conformal mappings.  In 1980, Yang and Yau proved that

\displaystyle \lambda_2(M) \lesssim \frac{g+1}{\mathrm{vol}(M)}

for any compact Riemannian surface M of genus g.  (Note that for the Laplace-Beltrami operator, one usually writes \lambda_1 as the first non-zero eigenvalue, rather than \lambda_2.)  In analog with Hersch’s proof of the genus 0 case, they use Riemann-Roch to obtain a degree-(g+1) conformal mapping to the Riemann sphere, then try to pull back a second eigenfunction.  A factor of the degree is lost in the Rayleigh quotient (hence the g+1 factor in the preceding bound), and Hersch’s Möbius trick is still required.

An analogous proof for graphs G of bounded genus would proceed by constructing a circle packing of G on the sphere S^2, but instead of the circles having disjoint interiors, we would be assured that every point of S^2 is contained in at most g circles.  Unfortunately, such a result is impossible (this has to do with the handling of branch points in the discrete setting).  Kelner has to take a different approach in his proof that \lambda_2(G) \leq d_{\max}^{O(1)} (g+1)/n for graphs G of genus at most g.

He starts with a circle packing of G on a compact surface \mathbb S_0 of genus g (whose existence follows from results of Beardon and Stephenon and He and Schramm).  Then Kelner randomly subdivides G repeatedly, and these subdivisions give progressively better approximations to some sequence of surfaces \{\mathbb S_i\}.  Once the approximation is of high enough quality, one applies Riemann-Roch to \mathbb S_k, and infers something about a subdivision of G.  The final element is to track how the second eigenvalue of G changes (in expectation) under random subdivision.

Needless to say, this approach is already quite delicate, and for graphs that can’t be equipped with some kind of conformal structure, we seem to have reached a dead end.  In this lecture, we’ll see how to use intrinsic deformations of the geometry of G in order to bound its eigenvalues.  Eventually, this will reduce to the study of certain kinds of multi-commodity flows.

Metrics on graphs

Let G=(V,E) be an arbitrary n-vertex graph with maximum degree d_{\max}.  Recall that we can write

\displaystyle \lambda_2 = \min_{f \neq 0 : \sum_{x \in V} f(x)=0} \frac{\sum_{xy \in E} |f(x)-f(y)|^2}{\sum_{x \in V} f(x)^2}.

where f : V \to \mathbb R.  (Also recall that we can replace \mathbb R by any Hilbert space, and the same formula holds.)  The first step is to prepare this equality for “non-linearization” by getting rid of the linear condition \sum_{x \in V} f(x)=0 and the sum \sum_{x \in V} f(x)^2.  (This is a popular sort of passage in the non-linear geometry of Banach spaces, which also plays a rather important role in applications to the theoretical CS.)  The goal is to get only terms that look like |f(x)-f(y)|.  Fortunately, there is a well-known way to do this:

\displaystyle \lambda_2 = 2 n \cdot \min_{f : V \to \mathbb R} \frac{\sum_{xy \in E} |f(x)-f(y)|^2}{\sum_{x,y \in V} |f(x)-f(y)|^2},

which follows easily from the equality \sum_{x,y \in V} |f(x)-f(y)|^2 = 2n \sum_{x \in V} f(x)^2 when \sum_{x \in V} f(x)=0.

Thus if we want to bound \lambda_2 = O(1/n), we need to find an f : V \to \mathbb R for which the latter ratio (without the 2n) is O(1/n^2).  Now, for someone who works a lot with linear programming relaxations, it’s very natural to consider a “relaxation”

\displaystyle \gamma(G) = \min_{d} \frac{\sum_{xy \in E} d(x,y)^2}{\sum_{x,y \in V} d(x,y)^2},

where the minimization is over all pseudo-metrics d, i.e. symmetric non-negative functions d : V \times V \to \mathbb R which satisfy the triangle inequality, but might have d(x,y)=0 even for x \neq y.  Certainly \gamma(G) \leq \lambda_2/2n, but Bourgain’s embedding theorem (which states that every n-point metric space embeds into a Hilbert space with distortion at most O(\log n)) also assures us that \lambda_2(G) \leq O(n \log^2 n) \gamma(G).  Since we are trying to show that \gamma(G) = O(1/n^2), this O(\log^2 n) term is morally negligible.  One can see the paper for a more advanced embedding argument that doesn’t lose this factor, but for now we concentrate on proving that \gamma(G) = O(1/n^2).  The embedding theorems allow us to concentrate on finding an intrinsic metric on the graph with small “Rayleigh quotient,” without having to worry about an eventual geometric representation.

As a brief preview… we are going to find a good metric d by taking a certain kind of all-pairs multi-commodity flow at optimality, and weighting the edges by their congestion in the optimal flow.  Thus as the flow spreads out on the graph, it has the effect of “uniformizing” its geometry.

Discrete Riemannian metrics, convexification, and duality

Let’s now assume that G is planar.  We want to show that \gamma(G) = O(d_{\max}/n^2).  First, let’s restrict ourselves to vertex weighted metrics on G.  Given any non-negative weight function \omega : V \to \mathbb R, we can define the length of a path P in G by summing the weights of vertices along it:  \mathsf{len}_{\omega}(P) = \sum_{x \in P} \omega(x).  Then we can define a vertex-weighted shortest-path pseudo-metric on G in the natural way

\displaystyle \mathsf{dist}_{\omega}(x,y) = \min \left\{ \mathsf{len}_{\omega}(P) : P \in \mathcal P_{xy}\right\},

where \mathcal P_{uv} is the set of all u-v paths in G.  We also have the nice relationship

\displaystyle \sum_{xy \in E} \mathsf{dist}_{\omega}(x,y)^2 \leq 2 d_{\max} \sum_{x \in V} \omega(x)^2,\qquad(1)

since \mathsf{dist}_{\omega}(x,y) \leq [\omega(x)+\omega(y)]^2.

So if we define

\displaystyle \Lambda_0(\omega) = \frac{\displaystyle \sum_{x \in V} \omega(x)^2}{\displaystyle \sum_{x,y \in V} \mathsf{dist}_{\omega}(x,y)^2}

then by (1), we have \gamma(G) \leq 2 d_{\max} \min_{\omega} \Lambda_0(\omega).

Examples. Let’s try to exhibit weights \omega for two well-known examples:  the grid, and the complete binary tree.

For the \sqrt{n} \times \sqrt{n} grid, we can simply take \omega(x)=1 for all x \in V.  Clearly \sum_{x \in V} \omega(x)^2 = n.  On the other hand, a random pair of points in the grid is \Omega(\sqrt{n}) apart, hence \sum_{x,y \in V} \mathsf{dist}_{\omega}(x,y)^2 \approx n^2 \cdot (\sqrt{n})^2 = n^3.  It follows that \Lambda_0(\omega) = O(1/n^2), as desired.

For the complete binary tree with root r, we can simply put \omega(r)=1 and \omega(x)=0 for x \neq r.  (Astute readers will guess the geometrically decreasing weights are actually the optimal choice.)  In this case, \sum_{x \in V} \omega(x)^2 = 1, while all the pairs x,y on opposite sides of the root have \mathsf{dist}_{\omega}(x,y)=1.  It again follows that \Lambda_0(\omega) = O(1/n^2).  Our goal is to provide such a weight \omega for any planar graph.

Continue reading

Eigenvalue multiplicity and growth of groups

This post is less about mathematics in TCS as it is about mathematics around TCS–specifically spectral graph theory and the structure of finite groups. Earlier this year at an IPAM conference on expander graphs, Terry Tao presented Bruce Kleiner’s new proof of Gromov’s theorem. After the talk, Luca Trevisan asked whether there exists an analog of certain steps in the proof for finite groups. Recently, Yury Makarychev and I gave a partial answer to Luca’s question in our paper Eigenvalue multiplicity and volume growth.

Gromov’s theorem

Let G be an infinite, finitely-generated group with a finite, symmetric generating set S = S^{-1}. One defines the Cayley graph \mathsf{Cay}(G;S) as an undirected |S|-regular graph with vertex set G, and which has an edge \{u,v\} whenever u = vs for some s \in S.

We let B(R) denote the set of all elements in G that can be written as a product of at most R generators (B(R) is a ball radius R about the identity, in the word metric). G is said to have polynomial growth if there exists a number m \in \mathbb N such that

|B(R)| = O(R^m)

as R \to \infty. Polynomial growth is a property of the group, and does not depend on the choice of finite generating set S (because one can express any two fixed generating sets in terms of each other with words of length O(1)).

It is straightforward, for instance, that every finitely generated abelian group has polynomial growth, since |B(R)| \leq {R \choose d}, where d = |S|. Wolf proved a generalization of this: In fact, it holds for every nilpotent group. On the other hand, the free group on two generators does not have polynomial growth, since |B(R)| = 2^R.

Notice also that every finite group has polynomial growth trivially. This fact extends a bit: If G is an arbitrary group and N is a subgroup of index O(1), then polynomial growth for N implies polynomial growth for G. Combining this with the result of Wolf, we see that: A group has polynomial growth if it has a nilpotent subgroup of finite index. In a stunning work, Gromov proved the conjecture of Milnor that this sufficient condition is also necessary: Every finitely generated group of polynomial growth has a nilpotent subgroup of finite index.

Gromov’s proof

Imagine starting with the integer lattice \mathbb Z^2, and slowly zooming out so that the gaps in the grid become smaller and smaller. As you move far enough away, the grid seems to morph into the continuum \mathbb R^2. Gromov defines this process abstractly, and shows that every group of polynomial growth “converges” to a finite-dimensional limit object on which the group acts by isometries (just as \mathbb Z^2 acts on \mathbb R^2 by translation). Finally, the Gleason-Montgomery-Zippin-Yamabe structure theory of locally compact groups is used to classify the limit object. The jump from geometry (polynomial growth of balls) to algebra is encapsulated in the following result.

Theorem 1: If G is a finitely generated infinite group of polynomial growth, then either

1. There exists a sequence of finite-dimensional linear representations \rho_i : G \to GL_k(\mathbb C) with |\rho_i(G)| \to \infty, or

2. There exists a single finite-dimensional linear representation \rho : G \to GL_k(\mathbb C), with |\rho(G)| infinite.

Here, GL_k(\mathbb C) is the general linear group. (Kleiner’s proof, which I’ll discuss momentarily, shows that actually (2) always holds.)

From Theorem 1, and work of Jordan and Tits, Gromov is able to conclude the following.

Theorem 2: Let G be a finitely generated infinite group of polynomial growth. Then G has a finite index subgroup which admits a homomorphism onto \mathbb Z.

From this fact, and a theorem of Milnor on solvable groups, an induction on the degree of growth finishes the argument (see Tits’ appendix in Gromov’s paper for a 2-page version of this argument).

What about finite groups?

Luca’s question concerned a (quantitative) version of Theorem 1 for finite groups. In this case, it’s not even clear how one defines “polynomial growth.” A possible definition is that there exists a generating set S such that in \mathsf{Cay}(G;S), one has |B(R)| \leq C R^m for some numbers C,m. Unfortunately, this property seems quite unwieldy in the finite case. We make a stronger assumption, using the doubling constant

c_{G;S} = \displaystyle \max_{R \geq 0} \frac{|B(2R)|}{|B(R)|}.

Observe that in the infinite case, c_G < \infty implies polynomial growth. It turns out (though it requires Gromov’s theorem to prove!) that if an infinite Cayley graph has polynomial growth, then it also has c_G < \infty, in fact it must satisfy R^m/C \leq |B(R)| \leq C R^m for some C, m. It is unclear whether a similar phenomenon holds in the finite case.

We prove the following quantitative analogs of Theorems 1 and 2 above, for finite groups.

Theorem 1 (finite): Let G be a finite group with symmetric generating set S. Then there exist constants k and \delta, depending only on the doubling constant c_{G;S}, such that G has a linear representation \rho : G \to GL_k(\mathbb R) with |\rho(G)| \geq |G|^{\delta}.

Theorem 2 (finite): Let G be a finite group with symmetric generating set S. Then there exist constants \alpha and \varepsilon, depending only on the doubling constant c_{G;S}, such that G has a normal subgroup N having index at most \alpha, and N admits a homomorphism onto the cyclic group \mathbb Z_M, where M \geq |G|^{\varepsilon}.

In fact, if c = c_{G;S}, then one can take k \approx e^{\log^2 c} and \delta \approx 1/\log(c) in the first theorem, and \alpha \approx k^{k^2} and \varepsilon \approx 1/k in the second.

Eigenvalue multiplicity and the Laplacian

It seems hopeless to use Gromov’s approach for finite groups; indeed, quite literally, zooming out from a finite group converges to a single point. Kleiner’s remarkable new proof is discussed in detail in Terry Tao’s blog entry. He completely avoids Gromov’s limiting process, and the difficult classification of the resulting limit objects. Instead, his proof is based on estimating the dimension of the space of harmonic functions of fixed polynomial growth on the Cayley graph of G. Kleiner’s approach is inspired by similar work of Colding and Minicozzi in the setting of non-negatively curved manifolds.

Define the discrete Laplacian on functions f : G \to \mathbb R by

\displaystyle \Delta f(x) = f(x) - \frac{1}{|S|} \sum_{s \in S} f(xs)

A harmonic function f on G is one for which \Delta f \equiv 0. It is straightforward to verify that every harmonic function on a connected, finite graph is constant, so again we seem stuck for finite groups.

Fortunately, though, the Laplacian is very nice on finite graphs. In particular, it is a self-adjoint operator on the |G|-dimensional space of functions L^2(G) = \{ f : G \to \mathbb R \}, with eigenvalues 0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_{|G|}. The second eigenspace of \Delta is the subspace W_2 \subseteq L^2(G) given by W_2 = \{ f \in L^2(G) : \Delta f = \lambda_2 f \}, and the (geometric) multiplicity of \lambda_2 is defined to be \mathsf{dim}(W_2). In some sense, W_2 contains the most “harmonic-like” functions on G which are orthogonal to the constant functions. The basis of our strategy is the following theorem, which is proved by “scaling down” the approach of Colding-Minicozzi and Kleiner. In order to prove that these functions are “harmonic enough,” we need precise bounds on \lambda_2 in terms of c_{G;S}, which we obtain in the paper. This yields the following theorem.

Theorem: For G finite, the multiplicity of the 2nd eigenvalue of the Laplacian on \mathsf{Cay}(G;S) is at most \displaystyle \exp\left(\log^2(c_{G;S})\right).

(In fact, we prove more general bounds on the multiplicity of higher eigenvalues, and more general graphs than Cayley graphs.)

To pass from this to Theorem 1 (finite) above, we use the fact that G acts on L^2(G) via the action \rho(g) f(x) = f(g^{-1} x). It is easy to see that this action commutes with the Laplacian, i.e. \rho(g) \Delta f = \Delta (\rho(g) f), so that G in fact acts by linear transformations on the second eigenspace W_2. Thus in order to finish the proof of Theorem 1 (finite), we need only show that |\rho(G)| is large.

It turns out that if |\rho(G)| is too small, then we can pass to a small quotient group, and every second eigenfunction f pushes down to an eigenfunction on the quotient. This allows us to bound \lambda_2 on the quotient group in terms of \lambda_2 on G. But \lambda_2 on a small, connected graph cannot be too close to zero by the discrete Cheeger inequality. In this way, we arrive at a contradiction if the image of the action is too small. Theorem 2 (finite) is then a simple corollary of Theorem 1 (finite), using a theorem of Jordan on finite linear groups.

Finally, note that the ideal algebraic conclusion of such a study is a statement of the form: There exists a normal subgroup N \leq G of index O(1), and such that N is an O(1)-step nilpotent group, where the O(1) notation hides a constant that depends only on the growth data of G. It is not clear that such a strong property can hold under only an assumption on c_{G;S} for some fixed generating set S. One might need to make assumptions on every generating set of G, or even geometric assumptions on families of subgroups in G. Defining a simple condition on G and its generators that can achieve the full algebraic conclusion is an intriguing open problem.

The Cheeger-Alon-Milman inequality

[See comments on the post and the update below for the corrected argument.]

Recently, Luca posted on Cheeger’s inequality. Whenever I try to reconstruct the proof, I start with the coarea formula and then play around with Cauchy-Schwarz (essentially the way that Cheeger proved it). The proof below turned out to be a bit more complicated than I thought. Oh well…

Let G = (V,E) be a graph with maximum degree d_{\max} and n = |V| . Let’s work directly with the sparsity \displaystyle \alpha_G = \min_{S \subseteq V} \frac{e(S,\bar S)}{|S| |\bar S|} which is a bit nicer. Notice that \alpha_G n is within factor 2 of the Cheeger constant h_G . We start with a very natural lemma:

Coarea lemma: For any f : V \to \mathbb R , we have \sum_{ij \in E} |f(i)-f(j)| \geq \alpha_G \sum_{i,j \in V} |f(i)-f(j)| .

Proof: Let V_r = \{ i : f(i) \leq r \} , then we can write \sum_{ij \in E} |f(i)-f(j)| as the integral

\displaystyle \int_{-\infty}^{\infty} e(V_r, \bar V_r) dr \geq \alpha_G \int_{-\infty}^{\infty} |V_r| |\bar V_r|dr = \alpha_G \sum_{i,j \in V} |f(i)-f(j)|

Now a bit of spectral graph theory: L = D - A is the Laplacian of G , where D is the diagonal degree matrix and A is the adjacency matrix. The first eigenvalue is \lambda_1 = 0 and the first eigenvector is (1,1,\ldots,1) . If \phi : V \to \mathbb R is the second eigenvector, then the second eigenvalue can be written

\displaystyle \lambda_2 = \frac{\sum_{ij \in E} |\phi(i)-\phi(j)|^2}{\sum_{i \in V} \phi(i)^2} \quad(1)

Given (1), it is natural to apply the coarea formula with f(i) = \phi(i)^2 and then play around.

Theorem: \displaystyle \lambda_2 \geq \frac{\alpha_G^2 n^2}{32 d_{\max}}

Proof: Unfortunately, if we try f(i) = \phi(i)^2 , it doesn’t quite work (think of the case when \phi takes some value v and -v equally often). Instead, let’s eliminate this case by setting \phi_0(i) = \max(m,\phi(i)) , where m = \mathrm{med}(\phi) . Now applying the coarea lemma with f(i) = \phi_0(i)^2 yields:

\displaystyle \alpha_G \sum_{i,j \in V} |\phi_0(i)^2-\phi_0(j)^2| \leq \sum_{ij \in E} |\phi_0(i)^2-\phi_0(j)^2|

\displaystyle = \sum_{ij \in E} |\phi_0(i)+\phi_0(j)||\phi_0(i)-\phi_0(j)| \leq \sqrt{\sum_{ij \in E} [\phi_0(i)+\phi_0(j)]^2} \sqrt{\sum_{ij \in E} |\phi_0(i)-\phi_0(j)|^2}

\displaystyle \leq \sqrt{2 d_{\max}} \sqrt{n m^2 + \sum_{i \in V} \phi(i)^2} \sqrt{\sum_{ij \in E} |\phi(i)-\phi(j)|^2}\quad\quad (2)

Notice that the same inequality holds for \phi_1(i) = \min(m, \phi(i)) . Also, observe that [the next inequality is wrong]

\displaystyle \sum_{i,j \in V} |\phi_0(i)^2-\phi_0(j)^2| + |\phi_1(i)^2-\phi_1(j)^2|

\displaystyle \geq \frac{n}{2} \sum_{i : \phi(i) \geq m} [\phi(i)-m]^2 + \frac{n}{2} \sum_{i : \phi(i) \leq m} [\phi(i)-m]^2

\displaystyle = \frac{n}{2} \sum_{i \in V} [\phi(i)-m]^2 = \frac{n}{2} \left[n m^2 +\sum_{i \in V} \phi(i)^2\right] ,

where in the final line we have used the fact that \sum_{i \in V} \phi(i) = 0 since \phi is orthogonal to the first eigenvector.

So we can assume that \displaystyle \sum_{i,j \in V} |\phi_0(i)-\phi_0(j)|^2 \geq \frac{n}{4} \left[n m^2 +\sum_{i \in V} \phi(i)^2\right] . Plugging this into (2), rearranging, and using (1) yields the claim.

A better (and even correct) way to think about the argument [4-Nov-2015]

Define {\displaystyle h_G(S) = \frac{e(S,\bar S)}{|S|}} and for {f : V \rightarrow \mathbb R} define {\displaystyle  R_G(f)=\frac{\sum_{ij \in E} (f(i)-f(j))^2}{\sum_{i \in V} f(i)^2}} and {\textrm{supp}(f) = \{ i \in V : f(i) \neq 0 \}} .

Lemma (Cheeger with boundary conditions) For any {f : V \rightarrow \mathbb R} , there is a subset {S \subseteq \textrm{supp}(f)} with

\displaystyle  h_G(S) \leq \sqrt{2 d_{\max} R_G(f)}\,.

Proof: We may assume that {\textrm{supp}(f) \neq V} , else taking {S=V} finishes the argument. For {t \in [0,\infty)} , define a subset {S_t = \{ i \in V : f(i)^2 > t \}} . Observe that for every {t \geq 0} , the inclusion {S_t \subseteq \textrm{supp}(f)} holds by construction.

Then we have the estimate,

\displaystyle  \int_0^\infty |S_t|\,dt= \sum_{i \in V} f(i)^2

as well as,

\displaystyle  \begin{array}{rcl}  \int_0^\infty e(S_t,\overline{S_t})\,dt &=& \sum_{ij \in E} \left|f(u)^2-f(v)^2\right| \\ &= & \sum_{ij \in E} |f(i)-f(j)| \cdot |f(i)+f(j)| \\ &\leq & \sqrt{\sum_{ij \in E} (f(i)-f(j))^2} \sqrt{\sum_{ij \in E} (f(i)+f(j))^2} \\ &\leq & \sqrt{\sum_{ij \in E} (f(i)-f(j))^2} \sqrt{2 d_{\max} \sum_{i \in V} f(i)^2} \end{array}

Combining these two inequalities yields,

\displaystyle  \frac{\int_0^\infty e(S_t,\overline{S_t})\,dt}{ \int_0^\infty |S_t|\,dt} \leq \sqrt{2 d_{\max} R_G(f)},

implying there exists a {t \in [0,\infty]} for which {S_t} satisfies the statement of the lemma. \Box

Now to prove the Cheeger inequality, consider any map {\phi : V \rightarrow \mathbb R} with {\sum_{i \in V} \phi(i)=0} . Let {m} be some median of the set {\phi(V)} and consider the two functions {\phi_0(i)=\max(0,\phi(i)-m)} and {\phi_1(i)=\max(0,m-\phi(i))} . By the definition of {m} , applying the lemma to either {\phi_0} or {\phi_1} yields a set with {|S| \leq n/2} .

But we also have

\displaystyle  \sum_{i \in V} \phi_0(i)^2+\phi_1(i)^2 =\sum_{i \in V} (\phi(i)-m)^2 = nm^2 +\sum_{i \in V} \phi(i)^2 \geq \sum_{i \in V} \phi(i)^2\,,

where we have used the fact that {\sum_{i \in V} \phi(i)=0} . Thus we have at least one of {R_G(\phi_0) \leq 2 \,R_G(\phi)} or {R_G(\phi_1) \leq 2 \, R_G(\phi)} , completing the proof.