# tcs math – some mathematics of theoretical computer science

## February 20, 2013

### No frills proof of higher-order Cheeger inequality

Filed under: Expository, Math — Tags: , , , — James Lee @ 11:41 pm

Following some recent applications by Mamoru Tanaka and Laurent Miclo, I was asked where there is a short, no-frills, self-contained, and (possibly) quantitatively non-optimal proof of the higher-order Cheeger inequalities from our paper with Shayan Oveis-Gharan and Luca Trevisan. I thought I would post it here. (If you’re hungering for something new, see this recently posted preprint of my coauthors relating higher eigenvalues to graph expansion.)

[Update: The application of Miclo can also be done using the higher-order Cheeger inequalities of Louis, Raghavendra, Tetali, and Vempala.]

The main simplification comes from doing the random partitioning non-optimally with axis-parallel cubes. For ease of notation, we will deal only with regular graphs, but there will be no quantitative dependence on the degree and this assumption can be removed (see the full paper).

Suppose that ${G=(V,E)}$ is a connected, ${n}$-vertex, ${d}$-regular graph. Define the Laplacian by ${\mathcal L = I - (1/d)A}$, where ${A}$ is the adjacency matrix of ${G}$. We will think of ${\mathcal L}$ as acting on ${\ell^2(V)}$, the space of functions ${f : V \rightarrow \mathbb R}$ equipped with the ${\ell^2}$ norm. ${\mathcal L}$ is positive semi-definite with spectrum

$\displaystyle 0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_{|V|} \leq 2\,.$

We we define the Rayleigh quotient of a function ${f \in \ell^2(V)}$ by

$\displaystyle \mathcal R(f) = \frac{\sum_{u \sim v} (f(u)-f(v))^2}{d \sum_{u \in V} f(u)^2}\,,$

where the numerator is summed over edges of ${G}$. By the variational principle for eigenvalues, we have

$\displaystyle \lambda_k = \min_{\stackrel{W \subseteq \ell^2(V)}{\dim(W)=k}} \max_{0 \neq f \in W} \mathcal R(f)\,. \ \ \ \ \ (1)$

For a subset ${S \subseteq V}$, we define the expansion of ${S}$ by ${\phi(S) = \mathcal R(\mathbf{1}_S)}$, where ${\mathbf{1}_S}$ is the indicator function of ${S}$.

Finally, for ${k \in \{1,2,\ldots,n\}}$, we define the ${k}$-way expansion constant of ${G}$ by

$\displaystyle \rho_G(k) = \min \left\{ \max_i \phi(S_i) : S_1, S_2, \ldots, S_k \subseteq V \right\}\,,$

where the minimum is over collections of ${k}$ disjoint, non-empty subsets of ${V}$.

The classical discrete Cheeger inequality asserts that

$\displaystyle \frac{\lambda_2}{2} \leq \rho_G(2) \leq \sqrt{2\lambda_2}\,.$

We will now prove the following generalization. See the full paper for a discussion of the surrounding issues and better quantitative bounds.

Theorem 1 For every ${k \in \{1,2,\ldots,n\}}$,

$\displaystyle \frac{\lambda_k}{2} \leq \rho_G(k) \leq 30 k^{3.5} \sqrt{\lambda_k}\,. \ \ \ \ \ (2)$

First, let’s prove the (easy) LHS of (2). Suppose we have ${S_1, S_2, \ldots, S_k \subseteq V}$ which are disjoint and non-empty and which satisfy ${\phi(S_i) = \mathcal R(\mathbf{1}_{S_i}) \leq \rho}$. Then certainly ${W = \mathrm{span}(\mathbf{1}_{S_1}, \ldots, \mathbf{1}_{S_k})}$ is a ${k}$-dimensional subspace of ${\ell^2(V)}$. On the other hand, every ${f \in W}$ satisfies ${\mathcal R(f) \leq 2 \rho}$ because if ${f = \alpha_1 \mathbf{1}_{S_1} + \cdots + \alpha_k \mathbf{1}_{S_k}}$, then

$\displaystyle \sum_{u \sim v} (f(u)-f(v))^2 \leq 2 \sum_{i=1}^k \alpha_i^2 \sum_{u \sim v} |\mathbf{1}_{S_i}(u)-\mathbf{1}_{S_i}(v)|^2\,,$

where we have used the fact that if ${u \in S_i}$ and ${v \in S_j}$, then

$\begin{array}{rcl} |\alpha_i \mathbf{1}_{S_i}(u) - \alpha_j \mathbf{1}_{S_j}(u)|^2 &\leq & 2(\alpha_i^2 + \alpha_j^2) \\ &=& 2\alpha_i^2 |\mathbf{1}_{S_i}(u) - \mathbf{1}_{S_i}(v)|^2 + 2\alpha_j^2 |\mathbf{1}_{S_j}(u)-\mathbf{1}_{S_j}(v)|^2\,.\end{array}$

But now using (1), the subspace ${W}$ witnesses the fact that ${\lambda_k \leq 2\rho}$.

To prove the more difficult RHS of (2), we will use the following discrete Cheeger inequality with boundary conditions.

Lemma 2 For any ${f : V\rightarrow \mathbb R}$, there is a subset ${U \subseteq \{ v \in V : f(v) \neq 0\}}$ such that

$\displaystyle \phi(U) \leq \sqrt{2 \mathcal R(f)}\,.$

Proof: Let ${U_t = \{ v \in V : f(v)^2 \geq t \}}$. Observe that for each ${t > 0}$, one has ${U_t \subseteq \{ v \in V : f(v) \neq 0\}}$. For ${S \subseteq V}$, let ${E(S,\bar S)}$ denote the edges of ${G}$ with exactly one endpoint in ${S}$. Then we have

$\displaystyle \begin{array}{rcl} \int_0^{\infty} |E(U_t, \bar U_t)|\,dt &=& \sum_{\{u,v\} \in E} \left|f(u)^2 - f(v)^2\right| \\ &= & \sum_{\{u,v\} \in E} |f(u) + f(v)| |f(u)-f(v)| \\ &\leq & \sqrt{\sum_{\{u,v\} \in E} (|f(u)|+|f(v)|)^2} \sqrt{\sum_{\{u,v\} \in E} |f(u)-f(v)|^2} \\ &\leq & \sqrt{2 d \sum_{u \in V} f(u)^2}\sqrt{\sum_{\{u,v\} \in E} |f(u)-f(v)|^2}. \end{array}$

On the other hand, ${\int_0^{\infty} d|U_t|\,dt = d \sum_{u \in V} |f(u)|^2,}$ thus

$\displaystyle \int_0^{\infty} |E(U_t, \bar U_t)|\,dt \leq \sqrt{2 \mathcal R(f)} \int_0^{\infty} d|U_t|\,dt\,,$

implying there exists a ${t > 0}$ such that ${\phi(U_t) = \frac{|E(U_t,\bar U_t)|}{d |U_t|} \leq \sqrt{2 \mathcal R(f)}}$. $\Box$

In light of the preceding lemma, to prove the RHS of (2), it suffices to find ${k}$ disjointly supported functions ${\psi_1, \psi_2, \ldots, \psi_k : V \rightarrow \mathbb R}$ such that ${\mathcal R(\psi_i) \leq (30)^2 k^6 \lambda_k}$ for each ${i=1,2,\ldots,k}$. Then Lemma 2 guarantees the existence of disjoint subsets of vertices satisfying our desired conclusion.

Toward this end, let ${f_1, f_2, \ldots, f_k : V \rightarrow \mathbb R}$ denote ${k}$ orthonormal functions satisfying ${\mathcal L f_i = \lambda_i f_i}$ for each ${i=1,2,\ldots,k}$, i.e. consider the first ${k}$ eigenfunctions of the Laplacian. Define the map ${F : V \rightarrow \mathbb R^k}$ via

$\displaystyle F(v) = \left(f_1(v), f_2(v), \ldots, f_k(v)\right)\,.$

We also put ${\hat F(v) = F(v)/\|F(v)\|}$. (Since ${\lambda_1=0}$, the function ${f_1}$ takes value ${1/\sqrt{n}}$ everywhere, hence ${F}$ is never zero and this is well-defined.)

The next lemma shows that, in order to find disjointly supported functions, it suffices to find “separated sets.”

Lemma 3 Suppose that for some ${\varepsilon > 0}$ and ${0 < \delta \leq 1}$, there exist ${k}$ subsets ${U_1, U_2, \ldots, U_k \subseteq V}$ satisfying the conditions:

1. For every ${i=1,2,\ldots,k}$, we have ${\sum_{v \in U_i} \|F(v)\|^2 \geq \varepsilon}$, and
2. For every ${i \neq j}$, if ${u \in U_i}$ and ${v \in U_j}$, then

$\displaystyle \|\hat F(u)-\hat F(v)\| \geq \delta$

Then there are disjointly supported functions ${\psi_1, \psi_2, \ldots, \psi_k : V \rightarrow \mathbb R}$ such that ${\mathcal R(\psi_i) \leq 25k\lambda_k/(\varepsilon\delta^2)}$.

Proof: For each ${i=1,2,\ldots,k}$, we define maps ${\theta_i : V \rightarrow \mathbb R}$ by

$\displaystyle \theta_i(v) = \max\left(0, 1 - \frac{2}{\delta} \min_{u \in U_i} \|\hat F(u)-\hat F(v)\|\right)\,.$

Observe that, by the triangle inequality, for ${u,v \in V}$, we have

$\displaystyle |\theta_i(u)-\theta_i(v)| \leq \frac{2}{\delta} \|\hat F(u)-\hat F(v)\|\,. \ \ \ \ \ (3)$

Next, we define

$\displaystyle \psi_i(v) = \theta_i(v) \|F(v)\|\,.$

Observe that since ${\theta_i}$ is identically ${1}$ on ${U_i}$, we have

$\displaystyle \sum_{v \in V} \psi_i(v)^2 \geq \sum_{v \in U_i} \|F(v)\|^2 \geq \varepsilon \ \ \ \ \ (4)$

by condition (i).

Next, we argue that the functions ${\{\theta_i\}}$ are disjointly supported. This immediately implies that the ${\{\psi_i\}}$ are disjointly supported. If ${u \in \mathrm{supp}(\theta_i) \cap \mathrm{supp}(\theta_j)}$ for some ${i \neq j}$, then there are points ${v \in U_i}$ and ${v' \in U_j}$ such that ${\|\hat F(u)-\hat F(v)\| < \delta/2}$ and ${\|\hat F(u)-\hat F(v')\| < \delta/2}$. But then by the triangle inequality, ${\|\hat F(v)-\hat F(v')\| < \delta}$, violating condition (ii).

Finally, we bound ${\mathcal R(\psi_i)}$. For any ${u,v \in V}$ and ${i=1,2,\ldots,k}$, we have

$\displaystyle |\psi_i(u)-\psi_i(v)| \leq \theta_i(u) \left|\vphantom{\bigoplus}\|F(u)\|-\|F(v)\|\right| + \|F(v)\| \cdot |\theta_i(u)-\theta_i(v)|\,. \ \ \ \ \ (5)$

Now using (3),

$\displaystyle \|F(v)\| \cdot |\theta_i(u)-\theta_i(v)| \leq \frac{2}{\delta} \|F(v)\| \cdot \|\hat F(u)-\hat F(v)\| \leq \frac{4}{\delta} \|F(u)-F(v)\|\,, \ \ \ \ \ (6)$

where we have used the fact that for any non-zero vectors ${x,y \in \mathbb R^k}$, we have

$\displaystyle \|x\| \left\|\frac{x}{\|x\|} - \frac{y}{\|y\|}\right\| = \left\|x - \frac{\|x\|}{\|y\|} y\right\| \leq \|x-y\| + \left\|y - \frac{\|x\|}{\|y\|} y\right\| \leq 2 \,\|x-y\|\,.$

Using (6) and the fact that ${\theta_i(u) \leq 1}$ in (5) yields

$\displaystyle |\psi_i(u)-\psi_i(v)| \leq \left(1+\frac{4}{\delta}\right) \|F(u)-F(v)\|\,. \ \ \ \ \ (7)$

Finally, observe that

$\displaystyle \sum_{u \sim v} \|F(u)-F(v)\|^2 = \sum_{i=1}^k \sum_{u \sim v} |f_i(u)-f_i(v)|^2 = d (\lambda_1 + \cdots + \lambda_k) \leq dk\lambda_k\,.$

Combining this with (7) and (4) shows that ${\mathcal R(\psi_i) \leq \lambda_k \frac{k}{\varepsilon}(1+\frac{4}{\delta})^2 \leq 25k\lambda_k/(\varepsilon \delta^2).}$ $\Box$

We will first make the task of finding separated sets slightly easier.

Lemma 4 Suppose that for some ${0 < \delta \leq 1}$ and ${m \geq 1}$, there are subsets ${T_1, T_2, \ldots, T_m \subseteq V}$ which satisfy:

1. ${\displaystyle \sum_{i=1}^m \sum_{v \in T_i} \|F(v)\|^2 \geq k - \frac14}$.
2. For every ${i \neq j}$, if ${u \in T_i}$ and ${v \in T_j}$, then ${ \|\hat F(u)-\hat F(v)\| \geq \delta }$
3. For every ${i=1,2,\ldots, m}$,

$\displaystyle \sum_{v \in T_i} \|F(v)\|^2 \leq 1 + \frac{1}{2k}\,.$

Then there are ${k}$ sets ${U_1, U_2, \ldots, U_k \subseteq V}$ satisfying the assumption of Lemma 3 with ${\varepsilon = \frac{1}{4}}$.

Proof: We can form the desired sets ${\{U_i\}}$ by taking disjoint unions of the sets ${\{T_i\}}$. Begin with the family ${\{T_1, T_2, \ldots, T_m\}}$ and repeatedly replace two sets ${T}$ and ${T'}$ by their union if they each satisfy ${\sum_{v \in T} \|F(v)\|^2 < \frac12}$.

When we finish, we are left with a collection of sets each member of which satisfies ${\sum_{v \in T} \|F(v)\|^2 \in [\frac12, 1+\frac{1}{2k}]}$, and possibly one set for which ${\sum_{v \in T} \|F(v)\|^2 < 1/2}$. By (i), we must end with at least ${k}$ sets which each satisfy ${\sum_{v \in T} \|F(v)\|^2 \geq 1/4}$. $\Box$

Our final lemma, which finishes the proof of Theorem 1, simply asserts that such sets exist. We will no longer need the fact that ${F}$ comes from eigenfunctions.

Lemma 5 Suppose that ${g_1, g_2, \ldots, g_k : V \rightarrow \mathbb R}$ are orthornormal in ${\ell^2(V)}$. Let

$\displaystyle G(v) = (g_1(v), g_2(v), \ldots, g_k(v))$

and ${\hat G(v) = G(v)/\|G(v)\|}$. Then there is an ${m \geq 1}$ and subsets ${T_1, T_2, \ldots, T_m \subseteq V}$ such that

1. ${\displaystyle\sum_{i=1}^m \sum_{v \in T_i} \|G(v)\|^2 \geq k - \frac14}$.
2. For every ${i \neq j}$, if ${u \in T_i}$ and ${v \in T_j}$, then

$\displaystyle \|\hat G(u)-\hat G(v)\| \geq \frac{1}{2\sqrt{2} k^{3}}\,.$

3. For every ${i = 1,2,\ldots,m}$,

$\displaystyle \sum_{v \in T_i} \|G(v)\|^2 \leq 1 + \frac{1}{2k}\,.$

Proof: Consider the ${n \times k}$ matrix ${A}$ which has columns ${g_1, g_2, \ldots, g_k}$. For any ${x \in \mathbb R^k}$, we have

$\displaystyle \sum_{v \in V} \langle x, G(v)\rangle^2 = \|Ax\|^2 = x^T A^T A x = x^T x = \|x\|^2\,, \ \ \ \ \ (8)$

since the columns of ${A}$ are orthornormal

For a subset ${X \subseteq S^{k-1}}$ of the unit sphere in ${\mathbb R^k}$, we put ${V(X) = \{ v \in V : \hat G(v) \in X \}}$. Fix some ${x \in X}$ and let ${\mathsf{diam}(X)}$ denote the diameter of ${X}$, then by (8), we have

$\displaystyle 1 = \sum_{v \in V(X)} \langle x, G(v)\rangle^2 = \sum_{v \in V(X)} \|G(v)\|^2 \left(1-\frac{\|\hat G(v)-x\|^2}{2}\right)^2 \geq \sum_{v \in V(X)} \|G(v)\|^2 \left(1-\frac{\mathrm{diam}(X)^2}{2}\right)^2\,.$

We conclude that if ${\mathrm{diam}(X) \leq 1/\sqrt{2k}}$, then

$\displaystyle \sum_{v \in V(X)} \|G(v)\|^2 \leq 1 + \frac{1}{2k}\,. \ \ \ \ \ (9)$

Now, let ${\mathcal{P}}$ be a partition of ${\mathbb R^k}$ into axis-parallel squares of side length ${L = \frac{1}{k \sqrt{2}}}$. For any such square ${Q \in \mathcal P}$, we let ${\tilde Q \subseteq Q}$ denote the set of points which are at Euclidean distance at least ${\frac{L}{4k^2}}$ from every side of ${Q}$. Observe that

$\displaystyle \mathrm{vol}(\tilde Q) \geq \left(1-\frac{1}{4k^2}\right)^k \mathrm{vol}(Q) \geq \left(1-\frac{1}{4k}\right) \mathrm{vol}(Q)\,. \ \ \ \ \ (10)$

Consider now the collection of sets ${\{V(\tilde Q) : Q \in \mathcal P\}}$. Since ${\mathrm{diam}(\tilde Q) \leq L \sqrt{k} = \frac{1}{\sqrt{2k}}}$, (9) implies that ${\sum_{v \in V(\tilde Q)} \|G(v)\|^2 \leq 1 + \frac{1}{2k}}$. Furthermore, by construction, for any ${u \in V(\tilde Q)}$ and ${v \in V(\tilde Q')}$ with ${Q \neq Q'}$, we have

$\displaystyle \|\hat G(u)-\hat G(v)\| \geq 2 \frac{L}{4k^2} \geq \frac{1}{2\sqrt{2} k^{3}}\,.$

Thus the collection of sets ${\{V(\tilde Q) : Q \in \mathcal P\}}$ satisfy both conditions (ii) and (iii) of the lemma. We are left to verify (i).

Note that ${\sum_{v \in V} \|G(v)\|^2 = \sum_{i=1}^k \sum_{v \in V} g_i(v)^2 = k}$. If we choose a uniformly random axis-parallel translation ${\mathcal P'}$ of the partition ${\mathcal P}$, then (10) implies

$\displaystyle \mathbb{E} \sum_{Q \in \mathcal P'} \sum_{v \in V(\tilde Q)} \|G(v)\|^2 \geq \left(1-\frac{1}{4k}\right) \sum_{v \in V(Q)} \|G(v)\|^2 \geq k - \frac14\,.$

In particular, there exists some fixed partition that achieves this bound. For this partition ${\mathcal P}$, the sets ${\{V(\tilde Q) : Q \in \mathcal P\}}$ satisfy all three conditions of the lemma, completing the proof.

## 8 Comments »

1. how does this proof relate to the original proof of gharan and trevisan in http://arxiv.org/pdf/1107.2686v1.pdf?

Comment by s.e. — February 25, 2013 @ 6:40 am

2. I do not understand the formula (3), please board can make more explicit. thank you.

Comment by Nguyen Van Phuc _ Viet Nam — April 17, 2013 @ 6:20 pm

3. I agree, there are a number of steps contained in that inequality. First, observe that if S is a set in a metric space and d(x,S) is the distance from x to the closest point in S, then $|d(x,S)-d(y,S)| \leq d(x,y)$. In other words, the map f(x)=d(x,S) is 1-Lipschitz. Now observe that scaling a 1-Lipschitz map by C, it becomes C-Lipschitz. Adding a constant, it remains C-Lipschitz. Finally, if f is C-Lipschitz, then the map g(x)=max(0,f(x)) is also C-Lipschitz. Combine all these together for the distance $d(u,v) = \|\hat F(u)-\hat F(v)\|$ and you get (3).

Comment by James Lee — April 17, 2013 @ 6:32 pm

4. Thanks to Professor Lee very much. I am very interested in this problem, because I am learning, learn about it in earnest.

Comment by Nguyen Van Phuc _ Viet Nam — April 17, 2013 @ 8:47 pm

5. Dear Mr. Lee.
I think, there is a typographical error in Lemma 3 does not.
$\mathcal{R}(\psi_i) \leq 25k / (\varepsilon \delta^2)$
to be $\mathcal{R}(\psi_i) \leq 25k \lambda_k / (\varepsilon \delta^2)$.
Because at the end of the line when we demonstrated that the combination of the three that still \lambda_k.

Comment by Nguyen Van Phuc_Viet Nam — May 11, 2013 @ 1:26 am

6. and if we choose $$\delta = \frac{1}{2 \sqrt{2} k^{5/2}}$$ as in 2. of Lemma 5 it is:
$$\mathcal{R}(\psi_i) \leq 800 k^6 \lambda_k$$
better than $$\mathcal{R}(\psi_i) \leq 30^2 k^6 \lambda_k$$
this true?
Thank you much.

Comment by Nguyen Van Phuc_Viet Nam — May 11, 2013 @ 1:56 am

7. Dear Mr. Lee.
I think, there is a typographical error in Lemma 3 does not.
$\mathcal{R}(\psi_i) \leq 25k / (\varepsilon \delta^2)$
to be $\mathcal{R}(\psi_i) \leq 25k \lambda_k / (\varepsilon \delta^2)$.
Because at the end of the line when we demonstrated that the combination of the three that still $\lambda_k$.

[Thanks. I fixed it. --j]

Comment by Nguyen Van Phuc — May 12, 2013 @ 12:01 am

8. and if we choose $\delta = \frac{1}{2 \sqrt{2} k^{5/2}}$ as in 2. of Lemma 5 it is:
$\mathcal{R}(\psi_i) \leq 800 k^6 \lambda_k$
better than $\mathcal{R}(\psi_i) \leq 30^2 k^6 \lambda_k$
this true?
Thank you much.

Comment by Nguyen Van Phuc — May 12, 2013 @ 12:03 am