# tcs math – some mathematics of theoretical computer science

## July 4, 2013

### What does Kadison-Singer have to do with Quantum Mechanics?

Filed under: Expository, Math — James Lee @ 11:05 am

Most accounts of the Kadison-Singer problem mention that it arose from considerations in the foundations of quantum mechanics. Specifically, a question about “extensions of pure states.” See, for instance, Gil’s blog. But trying to reconcile this question with my (limited) knowledge of QM proved frustrating. There is also an intuitive explanation posted at soulphysics, along with mathematical notes. But I couldn’t figure that one out either, because it only deals with “normal” quantum states (those corresponding to countably additive measures; see below). I think the KS problem for normal states is trivial. For the meat of KS, one has to think not only about infinite-dimensional spaces, but about “pathological” (non-constructible) measures. In particular, one has to think about “quantum states” that don’t have density matrices.

What follows are my notes on the matter, all of which are either well-known or wrong. Skip to the end for the very simple formulation with no mention of quantum states. Even better advice is to just read Nick Harvey’s survey.

#### How I understand quantum measurement

A (mixed) quantum state ${\rho}$ on the space ${\mathbb C^n}$ is an ${n \times n}$ Hermitian matrix satisfying ${\rho \succeq 0}$ (${\rho}$ is positive semi-definite) and ${\mathrm{Tr}(\rho)=1}$. Suppose we also have a resolution of the identity: A collection ${\mathcal M = \{M_1, M_2, \ldots, M_k\}}$ of PSD matrices with ${\sum_{j=1}^k M_j = I}$. Then when we perform the measurement ${\mathcal M}$, the probability of obtaining outcome ${j}$ is ${\mathrm{Tr}(\rho M_j)}$ for ${j=1,2,\ldots,k}$.
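For concreteness, here is a tiny numerical illustration (my own, with made-up matrices; the specific ${\rho}$ and measurement below are not from any of the sources above):

```python
import numpy as np

# A mixed state on C^2: Hermitian, PSD, trace one (values chosen for illustration).
rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)

# A resolution of the identity: PSD matrices summing to I.
# Here, simply the projections onto the two standard basis vectors.
M = [np.diag([1.0, 0.0]).astype(complex), np.diag([0.0, 1.0]).astype(complex)]

# Outcome probabilities Tr(rho M_j): nonnegative and summing to 1.
probs = [np.trace(rho @ Mj).real for Mj in M]
print(probs)  # [0.75, 0.25]
```

Any other resolution of the identity (e.g. projections onto a rotated basis) would work the same way.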

A pure state is given by a density matrix of the form ${\rho = \psi \psi^*}$ for some unit vector ${\psi \in \mathbb C^n}$. Equivalently, a state is pure if and only if it cannot be written as a non-trivial convex combination of other states, i.e. if ${\rho = \lambda \rho_1 + (1-\lambda) \rho_2}$ for ${\lambda \in (0,1)}$, then ${\rho=\rho_1=\rho_2}$. The pure states are the extreme points of the convex set of all states.

#### Quantum logic

There is another way to think about quantum measurement. Let ${\mathcal H}$ be a (separable) Hilbert space and consider a projection ${P : \mathcal H \rightarrow S}$ onto some closed subspace. A measure ${\mu}$ on projections assigns to every such projection a value in ${[0,1]}$. We say that ${\mu}$ is ${\sigma}$-additive if whenever ${Q = \sum_{j=1}^{\infty} P_j}$ and the projections ${\{P_j\}}$ are mutually orthogonal, we have ${\mu(Q) = \sum_{j=1}^{\infty} \mu(P_j)}$. If this only holds for finite sums of projections, we say that ${\mu}$ is finitely additive. We should also have ${\mu(I)=1}$.

Note that we have allowed here the possibility that ${\mathcal H}$ is infinite-dimensional. In the case where the measure ${\mu}$ is ${\sigma}$-additive, it is completely specified by its value at one-dimensional projections, i.e. the values ${\mu(P_x)}$ where ${P_x}$ denotes projection onto ${x \in \mathcal H}$. Now we can state Gleason’s theorem: Our two notions of quantum measurement coincide.

Theorem 1 Let ${\mathcal H}$ be a Hilbert space of dimension at least 3 and let ${\mu}$ be a ${\sigma}$-additive measure on projections. Then there exists a unique mixed state ${\rho}$ (a positive semi-definite, trace-one operator) such that for any projection ${P}$, we have ${\mu(P) = \mathrm{Tr}(\rho P)}$.

In other words, every ${\sigma}$-additive measure on projections arises from a mixed state.
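The easy converse direction is also worth checking numerically: any mixed state ${\rho}$ gives a ${\sigma}$-additive measure via ${\mu(P) = \mathrm{Tr}(\rho P)}$. A quick sketch (the random ${\rho}$ and basis are my own choices, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random mixed state on C^4: PSD with trace one.
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = B @ B.conj().T
rho /= np.trace(rho).real

# Mutually orthogonal rank-one projections from a random orthonormal basis.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
P = [np.outer(Q[:, j], Q[:, j].conj()) for j in range(4)]

mu = lambda proj: np.trace(rho @ proj).real

# Additivity on orthogonal projections, and mu(I) = 1.
print(mu(P[0] + P[1]), mu(P[0]) + mu(P[1]))  # equal
print(mu(sum(P)))                            # ~ 1
```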

#### Restricted purity

Note that a measure on projections corresponds to a pure state if and only if it cannot be written as a non-trivial convex combination of measures on projections. In this way, we can also define purity with respect to a restricted family of projections.

Consider, for example, the Hilbert space ${\ell^2(\mathbb N)}$ of complex sequences, and let ${\{e_j : j \in \mathbb N\}}$ denote the standard basis. Suppose we consider only the projections ${\mathcal P = \{P_j\}}$ onto each of the basis elements. Say that a ${\mathcal P}$-measure is a measure on all projections of the form ${\sum_{i \in A} P_i}$ for ${A \subseteq \mathbb N}$. Such a measure only gives us probabilities for measurements performed in the standard basis. (The algebra generated by ${\mathcal P}$ is a canonical example of a “maximal abelian subalgebra,” i.e. MASA.) We can say that a ${\mathcal P}$-measure is pure if it cannot be written as a non-trivial convex combination of other ${\mathcal P}$-measures. What are the pure ${\mathcal P}$-measures? Well, the only ${\sigma}$-additive pure ${\mathcal P}$-measures are the point measures ${\mu(P_j)=1}$ for some ${j}$. If ${\mu(P_j), \mu(P_{j'}) > 0}$ for ${j \neq j'}$, then by ${\sigma}$-additivity, we can split ${\mu}$ into a convex combination of two measures (one supported only on ${P_j}$ and the other supported off ${P_j}$).

But when ${\sigma}$-additivity is dropped, some tricky things can happen (see the next section). It turns out that, in this case, there are more pure states than just ${e_j e_j^T}$ (in fact, the cardinality of the finitely-additive pure states is gigantic, ${2^{2^{\aleph_0}}}$). Now we are in position to state the Kadison-Singer problem.

Kadison-Singer Problem: Is it the case that every pure, finitely-additive ${\mathcal P}$-measure has a unique extension to a finitely-additive measure on all the projections of ${\mathcal H}$?

The answer is clearly positive for ${\sigma}$-additive measures (in which case the unique extension corresponds to ${e_j e_j^T}$). The answer is also positive for finitely-additive ${\mathcal P}$-measures, as recently proved by Marcus, Spielman, and Srivastava.

In their original work, Kadison and Singer showed that the answer is negative if one constructs the MASA using the continuous Hilbert space ${L^2([0,1])}$ instead of ${\ell^2(\mathbb N)}$. One can take the subalgebra ${\mathcal C}$ of multiplication operators ${T_g (f)=gf}$ for ${\|g\|_{\infty} < \infty}$. Here the corresponding family of projections are the maps ${P_S f = \mathbf{1}_S f}$ for ${S \subseteq [0,1]}$. Unlike in the case above, this subalgebra is not “discrete.” It turns out that every maximal abelian subalgebra of bounded operators on a separable Hilbert space is either finite-dimensional, the discrete one, the continuous one, or a direct sum of algebras isomorphic to these.

Note that states corresponding to finitely additive (but not ${\sigma}$-additive) measures do not have density matrices and are often considered to be “unphysical” because they would take an infinite amount of energy to prepare.

#### A ${\beta}$-world teeming with pure states

In the above setting, we said there are finitely-additive pure ${\mathcal P}$-measures ${\mu}$ other than point measures. To see what one of these must look like, consider the following construction. Let ${P_{\mathrm{even}}}$ denote the projection onto ${\mathrm{span}\{e_j : j \textrm{ even}\}}$ and define ${P_{\mathrm{odd}}}$ similarly. If ${\mu(P_{\mathrm{even}}), \mu(P_{\mathrm{odd}}) > 0}$, define the normalized measures ${\mu_{\mathrm{even}}(P) = \mu(P P_{\mathrm{even}})/\mu(P_{\mathrm{even}})}$ and ${\mu_{\mathrm{odd}}(P) = \mu(P P_{\mathrm{odd}})/\mu(P_{\mathrm{odd}})}$. Then we can write ${\mu = \mu(P_{\mathrm{even}}) \mu_{\mathrm{even}} + \mu(P_{\mathrm{odd}}) \mu_{\mathrm{odd}}}$, a non-trivial convex combination. So purity forces either ${\mu(P_{\mathrm{even}})=0}$ or ${\mu(P_{\mathrm{odd}})=0}$.

Indeed, for every partition ${\mathbb N = S \cup \bar S}$, a pure ${\mathcal P}$-measure ${\mu}$ can only put non-zero weight on projections involving coordinates of exactly one of ${S}$ or ${\bar S}$. Furthermore, if ${\mu(\sum_{i \in {\bar S}} P_i) = 0}$ then ${\mu(\sum_{i \in S} P_i)=1}$. In other words, on every projection in the family, the measure ${\mu}$ takes only the values 0 or 1. Such finitely-additive 0/1-valued measures are exactly given by ultrafilters on ${\mathbb N}$. The finitely-additive pure ${\mathcal P}$-measures are in one-to-one correspondence with such ultrafilters, the set of which can be identified with ${\beta \mathbb N}$, the Stone-Cech compactification of the naturals. See Terry Tao’s notes on ultrafilters and Stone-Cech compactification.

#### A simple formulation

Take the Hilbert space ${\ell^2 = \ell^2(\mathbb N)}$ with standard basis ${\{e_j : j \in \mathbb N\}}$, and let ${\mathcal S}$ denote the set of all its closed subspaces. Consider a map ${\mu : \mathcal S \rightarrow [0,1]}$. We say that ${\mu}$ is a finitely-additive measure if ${\mu(\ell^2)=1}$ and whenever ${S_1, S_2, \ldots, S_k \in \mathcal S}$ is a finite collection of mutually orthogonal subspaces, we have ${\mu(\mathrm{span}(S_1, \ldots, S_k)) = \sum_{i=1}^k \mu(S_i)}$. Say that such a measure ${\mu}$ is pure if it cannot be written as a non-trivial convex combination of two finitely-additive measures. A canonical example of a pure measure involves fixing a unit vector ${x \in \ell^2}$ and setting ${\mu(S) = \langle x, P_S x\rangle}$ where $P_S$ denotes the orthogonal projection onto $S$.
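Here is a finite-dimensional sanity check of the canonical example (a sketch of my own; the dimension and the vector are arbitrary stand-ins for ${\ell^2}$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# A unit vector x in a finite-dimensional stand-in for l^2.
x = rng.normal(size=n)
x /= np.linalg.norm(x)

def mu(basis_vectors):
    """mu(S) = <x, P_S x> = ||P_S x||^2 for S spanned by the given orthonormal vectors."""
    return sum(np.dot(x, b) ** 2 for b in basis_vectors)

# Two mutually orthogonal coordinate subspaces and their span.
S1 = [np.eye(n)[i] for i in (0, 1)]
S2 = [np.eye(n)[i] for i in (2, 3, 4)]
print(mu(S1) + mu(S2), mu(S1 + S2))         # equal: finite additivity
print(mu([np.eye(n)[i] for i in range(n)]))  # mu of the whole space = 1
```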

Now let ${\mathcal S_{\mathrm{diag}}}$ be the set of all the subspaces of ${\ell^2}$ of the form ${\mathrm{span}(e_i : i \in A)}$ for some subset ${A \subseteq \mathbb N}$, i.e. all the coordinate subspaces. We can define “finitely-additive” and “pure” for measures on ${\mathcal S_{\mathrm{diag}}}$ in the analogous way.

Kadison-Singer problem: Can every finitely-additive pure measure on ${\mathcal S_{\mathrm{diag}}}$ be extended uniquely to a finitely-additive measure on ${\mathcal S}$?

Note that the real question is about uniqueness of the extension. There always exists a pure extension; see the comments.

## June 26, 2013

### Separated sets in unions of frames

Filed under: Math, Open question — James Lee @ 11:46 pm

In celebration of the recent resolution of the Kadison-Singer problem by Marcus, Spielman, and Srivastava, here is a question on isotropic point sets on which Kadison-Singer does not (seem to) shed any light. A positive resolution would likely have strong implications for the Sparsest Cut problem and SDP hierarchies. The question arose in discussions with Shayan Oveis Gharan, Prasad Raghavendra, and David Steurer.

Open Question: Do there exist constants $\varepsilon, \delta > 0$ such that for any $k \in \mathbb N$, the following holds? Let $\mathcal B_1, \mathcal B_2, \ldots, \mathcal B_k \subseteq \mathbb R^k$ be a collection of $k$ orthonormal bases and define $W = \mathcal B_1 \cup \mathcal B_2 \cup \cdots \cup \mathcal B_k$. Then there are subsets $A,B \subseteq W$ with $|A|,|B| \geq \varepsilon |W|$ and $\min_{x \in A, y \in B} \|x-y\|_2 \geq \delta$.

[Some additional notes: One piece of intuition for why the question should have a positive resolution is that these $k$ orthonormal bases which together comprise at most $k^2$ vectors cannot possibly “fill” $k$-dimensional space in a way that achieves $k$-dimensional isoperimetry. One would seem to need $\exp(O(k))$ points for this.

One can state an equivalent question in terms of vertex expansion. Say that a graph $G=(V,E)$ on $k^2$ vertices is a vertex expander if $|N(S)| \geq 1.1 |S|$ for all subsets $S \subseteq V$ with $|S| \leq |V|/2$. Here, $N(S)$ denotes all the nodes that are in $S$ or are adjacent to $S$. Then one can ask whether there exists a 1-1 mapping from $V$ to $\mathcal B_1 \cup \cdots \cup \mathcal B_k$ for some orthonormal bases $\{\mathcal B_i\}$ such that the endpoints of every edge are mapped at most $o(1)$ apart (as $k \to \infty$).]

## February 25, 2013

### Open question recap

Filed under: Math, Open question — James Lee @ 12:28 pm

In light of the nearly immediate failure of the last open question, here is a recap of the progress on the few somewhat more legitimate questions appearing here over the past few years. (Note that most of these questions are not original and appeared in other places as well.)

## February 22, 2013

### Open question: Separated sets in isotropic measures

Filed under: Math, Open question — James Lee @ 8:26 pm

[Note: This question is trivially impossible, though I leave the original form below. If $\mu$ is the uniform measure on $S^{k-1}$, then any two sets of $\Omega(1)$ measure are at distance $O(1/\sqrt{k})$ by concentration of measure. Thus it is impossible to find $k$ sets satisfying the desired bound. I will try to think of the right question and repost. Unfortunately, I now recall having gone down this path a few times before, and I fear there may not be a relevant elementary open question after all.]

Here is an interesting and elementary (to state) open question whose resolution would give an optimal higher-order Cheeger estimate. The goal is to replace the factor ${k^{O(1)}}$ in Theorem 1 of the previous post with a ${\mathrm{polylog}(k)}$ factor, or even an optimal bound of ${\sqrt{\log k}}$.

Fix ${k \in \mathbb N}$ and let ${\mu}$ denote a probability measure on the ${(k-1)}$-dimensional sphere ${S^{k-1}}$. For the purpose of this question, one can assume that ${\mu}$ is supported on a finite number of points. Suppose also that ${\mu}$ is isotropic in the sense that, for every ${\theta \in S^{k-1}}$,

$\displaystyle \int_{S^{k-1}} \langle x, \theta\rangle^2 \,d\mu(x)= \frac{1}{k}\,.$

Now our goal is to find, for some ${\varepsilon, \delta > 0}$, a collection of (measurable) subsets ${U_1, U_2, \ldots, U_k \subseteq S^{k-1}}$ such that ${\mu(U_i) \geq \varepsilon/k}$ for each ${i=1,2,\ldots,k}$, and such that for ${i \neq j}$, we have

$\displaystyle \min_{x \in U_i, y \in U_j} \|x-y\|_2 \geq \delta\,.$

In Lemma 5 of the previous post, we proved that this is possible for ${\varepsilon \gtrsim 1}$ and ${\delta \gtrsim k^{-5/2}}$. In this paper, we improve the estimate on ${\delta}$ somewhat, though the best-known is still ${\delta \gtrsim k^{-c}}$ for some ${c > 0}$.

Question:
Is it possible to achieve ${\varepsilon, \delta \gtrsim 1/\mathrm{poly}(\log k)}$? One could even hope for ${\varepsilon \gtrsim 1}$ and ${\delta \gtrsim 1/(\log k)}$.

Two extreme cases are when (1) ${\mu}$ is uniform on ${S^{k-1}}$, or (2) ${\mu}$ is uniform on the standard basis ${\{e_1,e_2,\ldots,e_k\}}$. In the first case, one can easily achieve ${\varepsilon,\delta \gtrsim 1}$ (in fact, one could even get ${100k}$ sets satisfying these bounds). In the second case, taking each set to contain a single basis vector achieves ${\varepsilon = 1}$ and ${\delta = \sqrt{2}}$.
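The second extreme case is easy to verify directly (a small script of my own; the dimension ${k}$ is arbitrary). The integral over the uniform measure on the basis is just an average over the ${k}$ basis vectors:

```python
import numpy as np

k = 5
basis = np.eye(k)  # the standard basis e_1, ..., e_k as rows

# Isotropy: for any unit theta, the average of <x, theta>^2 over the basis
# is ||theta||^2 / k = 1/k, exactly.
rng = np.random.default_rng(2)
for _ in range(3):
    theta = rng.normal(size=k)
    theta /= np.linalg.norm(theta)
    integral = np.mean([np.dot(x, theta) ** 2 for x in basis])
    print(integral)  # ~ 1/k each time

# Pairwise distances between distinct basis vectors are sqrt(2).
print(np.linalg.norm(basis[0] - basis[1]))
```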

If one only wants to find, say, ${\lceil 3k/4\rceil}$ sets instead of ${k}$ sets, then it is indeed possible to achieve ${\varepsilon \gtrsim 1}$ and ${\delta \gtrsim 1/O(\log k)}$. [Update: This is actually impossible for the reasons stated above. To get the corresponding bounds, we do a random projection into $O(\log k)$ dimensions. This only preserves the original Euclidean distances on average.] Furthermore, for ${\varepsilon \gtrsim 1}$, this bound on ${\delta}$ is asymptotically the best possible.

## February 20, 2013

### No frills proof of higher-order Cheeger inequality

Filed under: Expository, Math — James Lee @ 11:41 pm

Following some recent applications by Mamoru Tanaka and Laurent Miclo, I was asked whether there is a short, no-frills, self-contained, and (possibly) quantitatively non-optimal proof of the higher-order Cheeger inequalities from our paper with Shayan Oveis Gharan and Luca Trevisan. I thought I would post it here. (If you’re hungering for something new, see this recently posted preprint of my coauthors relating higher eigenvalues to graph expansion.)

[Update: The application of Miclo can also be done using the higher-order Cheeger inequalities of Louis, Raghavendra, Tetali, and Vempala.]

The main simplification comes from doing the random partitioning non-optimally with axis-parallel cubes. For ease of notation, we will deal only with regular graphs, but there will be no quantitative dependence on the degree and this assumption can be removed (see the full paper).

Suppose that ${G=(V,E)}$ is a connected, ${n}$-vertex, ${d}$-regular graph. Define the Laplacian by ${\mathcal L = I - (1/d)A}$, where ${A}$ is the adjacency matrix of ${G}$. We will think of ${\mathcal L}$ as acting on ${\ell^2(V)}$, the space of functions ${f : V \rightarrow \mathbb R}$ equipped with the ${\ell^2}$ norm. ${\mathcal L}$ is positive semi-definite with spectrum

$\displaystyle 0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_{|V|} \leq 2\,.$

We define the Rayleigh quotient of a function ${f \in \ell^2(V)}$ by

$\displaystyle \mathcal R(f) = \frac{\sum_{u \sim v} (f(u)-f(v))^2}{d \sum_{u \in V} f(u)^2}\,,$

where the numerator is summed over edges of ${G}$. By the variational principle for eigenvalues, we have

$\displaystyle \lambda_k = \min_{\stackrel{W \subseteq \ell^2(V)}{\dim(W)=k}} \max_{0 \neq f \in W} \mathcal R(f)\,. \ \ \ \ \ (1)$

For a subset ${S \subseteq V}$, we define the expansion of ${S}$ by ${\phi(S) = \mathcal R(\mathbf{1}_S)}$, where ${\mathbf{1}_S}$ is the indicator function of ${S}$.
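These definitions are easy to play with numerically. A sketch on the cycle ${C_n}$ (my own illustration), checking that ${\phi(S) = \mathcal R(\mathbf{1}_S)}$ agrees with the cut-counting formula ${|E(S,\bar S)|/(d|S|)}$:

```python
import numpy as np

# d-regular example: the cycle C_n (d = 2).
n, d = 10, 2
edges = [(i, (i + 1) % n) for i in range(n)]

def rayleigh(f):
    # R(f) = sum over edges of (f(u)-f(v))^2, divided by d * ||f||^2.
    num = sum((f[u] - f[v]) ** 2 for u, v in edges)
    return num / (d * np.dot(f, f))

# For S an arc of 5 vertices, phi(S) = R(1_S) = |E(S, S-bar)| / (d |S|).
S = set(range(5))
ind = np.array([1.0 if v in S else 0.0 for v in range(n)])
boundary = sum(1 for u, v in edges if (u in S) != (v in S))
print(rayleigh(ind), boundary / (d * len(S)))  # both 0.2
```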

Finally, for ${k \in \{1,2,\ldots,n\}}$, we define the ${k}$-way expansion constant of ${G}$ by

$\displaystyle \rho_G(k) = \min \left\{ \max_i \phi(S_i) : S_1, S_2, \ldots, S_k \subseteq V \right\}\,,$

where the minimum is over collections of ${k}$ disjoint, non-empty subsets of ${V}$.

The classical discrete Cheeger inequality asserts that

$\displaystyle \frac{\lambda_2}{2} \leq \rho_G(2) \leq \sqrt{2\lambda_2}\,.$
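Both sides of this inequality can be checked by brute force on a small graph (an illustration of mine, not part of the proof):

```python
import itertools
import numpy as np

n, d = 8, 2
edges = [(i, (i + 1) % n) for i in range(n)]

# Laplacian spectrum of the cycle C_8.
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] += 1
    A[v, u] += 1
lam = np.sort(np.linalg.eigvalsh(np.eye(n) - A / d))
lam2 = lam[1]

def phi(S):
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / (d * len(S))

# rho_G(2): minimize max(phi(S1), phi(S2)) over disjoint non-empty S1, S2,
# by enumerating all assignments of vertices to {unused, S1, S2}.
rho2 = min(
    max(phi({v for v in range(n) if a[v] == 1}),
        phi({v for v in range(n) if a[v] == 2}))
    for a in itertools.product((0, 1, 2), repeat=n)
    if 1 in a and 2 in a
)
print(lam2 / 2, rho2, np.sqrt(2 * lam2))  # lam2/2 <= rho2 <= sqrt(2*lam2)
```

For ${C_8}$ the optimum is two opposite arcs of four vertices each, giving ${\rho_G(2) = 1/4}$.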

We will now prove the following generalization. See the full paper for a discussion of the surrounding issues and better quantitative bounds.

Theorem 1 For every ${k \in \{1,2,\ldots,n\}}$,

$\displaystyle \frac{\lambda_k}{2} \leq \rho_G(k) \leq 40 k^{3.5} \sqrt{\lambda_k}\,. \ \ \ \ \ (2)$

First, let’s prove the (easy) LHS of (2). Suppose we have ${S_1, S_2, \ldots, S_k \subseteq V}$ which are disjoint and non-empty and which satisfy ${\phi(S_i) = \mathcal R(\mathbf{1}_{S_i}) \leq \rho}$. Then certainly ${W = \mathrm{span}(\mathbf{1}_{S_1}, \ldots, \mathbf{1}_{S_k})}$ is a ${k}$-dimensional subspace of ${\ell^2(V)}$. On the other hand, every ${f \in W}$ satisfies ${\mathcal R(f) \leq 2 \rho}$ because if ${f = \alpha_1 \mathbf{1}_{S_1} + \cdots + \alpha_k \mathbf{1}_{S_k}}$, then

$\displaystyle \sum_{u \sim v} (f(u)-f(v))^2 \leq 2 \sum_{i=1}^k \alpha_i^2 \sum_{u \sim v} |\mathbf{1}_{S_i}(u)-\mathbf{1}_{S_i}(v)|^2\,,$

where we have used the fact that if ${u \in S_i}$ and ${v \in S_j}$ with ${i \neq j}$, then

$\begin{array}{rcl} |f(u)-f(v)|^2 = |\alpha_i - \alpha_j|^2 &\leq & 2(\alpha_i^2 + \alpha_j^2) \\ &=& 2\alpha_i^2 |\mathbf{1}_{S_i}(u) - \mathbf{1}_{S_i}(v)|^2 + 2\alpha_j^2 |\mathbf{1}_{S_j}(u)-\mathbf{1}_{S_j}(v)|^2\,.\end{array}$

But now using (1), the subspace ${W}$ witnesses the fact that ${\lambda_k \leq 2\rho}$.

To prove the more difficult RHS of (2), we will use the following discrete Cheeger inequality with boundary conditions.

Lemma 2 For any ${f : V\rightarrow \mathbb R}$, there is a subset ${U \subseteq \{ v \in V : f(v) \neq 0\}}$ such that

$\displaystyle \phi(U) \leq \sqrt{2 \mathcal R(f)}\,.$

Proof: Let ${U_t = \{ v \in V : f(v)^2 \geq t \}}$. Observe that for each ${t > 0}$, one has ${U_t \subseteq \{ v \in V : f(v) \neq 0\}}$. For ${S \subseteq V}$, let ${E(S,\bar S)}$ denote the edges of ${G}$ with exactly one endpoint in ${S}$. Then we have

$\displaystyle \begin{array}{rcl} \int_0^{\infty} |E(U_t, \bar U_t)|\,dt &=& \sum_{\{u,v\} \in E} \left|f(u)^2 - f(v)^2\right| \\ &= & \sum_{\{u,v\} \in E} |f(u) + f(v)| |f(u)-f(v)| \\ &\leq & \sqrt{\sum_{\{u,v\} \in E} (|f(u)|+|f(v)|)^2} \sqrt{\sum_{\{u,v\} \in E} |f(u)-f(v)|^2} \\ &\leq & \sqrt{2 d \sum_{u \in V} f(u)^2}\sqrt{\sum_{\{u,v\} \in E} |f(u)-f(v)|^2}. \end{array}$

On the other hand, ${\int_0^{\infty} d \cdot |U_t|\,dt = d \sum_{u \in V} |f(u)|^2}$ (here ${d}$ is the degree of ${G}$), thus

$\displaystyle \int_0^{\infty} |E(U_t, \bar U_t)|\,dt \leq \sqrt{2 \mathcal R(f)} \int_0^{\infty} d \cdot |U_t|\,dt\,,$

implying there exists a ${t > 0}$ such that ${\phi(U_t) = \frac{|E(U_t,\bar U_t)|}{d |U_t|} \leq \sqrt{2 \mathcal R(f)}}$. $\Box$
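The proof of Lemma 2 is constructive: sweep over the level sets ${U_t}$ and take the best one. A sketch (my own, on a cycle with a bump function; the choice of ${f}$ is arbitrary):

```python
import numpy as np

n, d = 12, 2
edges = [(i, (i + 1) % n) for i in range(n)]

def rayleigh(f):
    return sum((f[u] - f[v]) ** 2 for u, v in edges) / (d * np.dot(f, f))

def phi(S):
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / (d * len(S))

def sweep_cut(f):
    """Lemma 2: among the level sets U_t = {v : f(v)^2 >= t} with t > 0,
    return the one of smallest expansion; it avoids the zero set of f."""
    best = None
    for t in sorted(set(x * x for x in f if x != 0)):
        U = {v for v in range(n) if f[v] ** 2 >= t}
        if best is None or phi(U) < phi(best):
            best = U
    return best

# A "bump" supported on half the cycle.
f = np.array([max(0.0, np.sin(2 * np.pi * v / n)) for v in range(n)])
U = sweep_cut(f)
print(phi(U), np.sqrt(2 * rayleigh(f)))  # phi(U) <= sqrt(2 R(f))
```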

In light of the preceding lemma, to prove the RHS of (2), it suffices to find ${k}$ disjointly supported functions ${\psi_1, \psi_2, \ldots, \psi_k : V \rightarrow \mathbb R}$ such that ${\mathcal R(\psi_i) \leq 800\, k^{7} \lambda_k}$ for each ${i=1,2,\ldots,k}$ (this is what the lemmas below deliver). Then Lemma 2 guarantees the existence of disjoint subsets of vertices with ${\phi(U_i) \leq \sqrt{2 \cdot 800\, k^7 \lambda_k} = 40 k^{3.5} \sqrt{\lambda_k}}$, satisfying our desired conclusion.

Toward this end, let ${f_1, f_2, \ldots, f_k : V \rightarrow \mathbb R}$ denote ${k}$ orthonormal functions satisfying ${\mathcal L f_i = \lambda_i f_i}$ for each ${i=1,2,\ldots,k}$, i.e. consider the first ${k}$ eigenfunctions of the Laplacian. Define the map ${F : V \rightarrow \mathbb R^k}$ via

$\displaystyle F(v) = \left(f_1(v), f_2(v), \ldots, f_k(v)\right)\,.$

We also put ${\hat F(v) = F(v)/\|F(v)\|}$. (Since ${\lambda_1=0}$, the function ${f_1}$ takes value ${1/\sqrt{n}}$ everywhere, hence ${F}$ is never zero and this is well-defined.)

The next lemma shows that, in order to find disjointly supported functions, it suffices to find “separated sets.”

Lemma 3 Suppose that for some ${\varepsilon > 0}$ and ${0 < \delta \leq 1}$, there exist ${k}$ subsets ${U_1, U_2, \ldots, U_k \subseteq V}$ satisfying the conditions:

1. For every ${i=1,2,\ldots,k}$, we have ${\sum_{v \in U_i} \|F(v)\|^2 \geq \varepsilon}$, and
2. For every ${i \neq j}$, if ${u \in U_i}$ and ${v \in U_j}$, then

$\displaystyle \|\hat F(u)-\hat F(v)\| \geq \delta$

Then there are disjointly supported functions ${\psi_1, \psi_2, \ldots, \psi_k : V \rightarrow \mathbb R}$ such that ${\mathcal R(\psi_i) \leq 25k\lambda_k/(\varepsilon\delta^2)}$.

Proof: For each ${i=1,2,\ldots,k}$, we define maps ${\theta_i : V \rightarrow \mathbb R}$ by

$\displaystyle \theta_i(v) = \max\left(0, 1 - \frac{2}{\delta} \min_{u \in U_i} \|\hat F(u)-\hat F(v)\|\right)\,.$

Observe that, by the triangle inequality, for ${u,v \in V}$, we have

$\displaystyle |\theta_i(u)-\theta_i(v)| \leq \frac{2}{\delta} \|\hat F(u)-\hat F(v)\|\,. \ \ \ \ \ (3)$

Next, we define

$\displaystyle \psi_i(v) = \theta_i(v) \|F(v)\|\,.$

Observe that since ${\theta_i}$ is identically ${1}$ on ${U_i}$, we have

$\displaystyle \sum_{v \in V} \psi_i(v)^2 \geq \sum_{v \in U_i} \|F(v)\|^2 \geq \varepsilon \ \ \ \ \ (4)$

by condition (i).

Next, we argue that the functions ${\{\theta_i\}}$ are disjointly supported. This immediately implies that the ${\{\psi_i\}}$ are disjointly supported. If ${u \in \mathrm{supp}(\theta_i) \cap \mathrm{supp}(\theta_j)}$ for some ${i \neq j}$, then there are points ${v \in U_i}$ and ${v' \in U_j}$ such that ${\|\hat F(u)-\hat F(v)\| < \delta/2}$ and ${\|\hat F(u)-\hat F(v')\| < \delta/2}$. But then by the triangle inequality, ${\|\hat F(v)-\hat F(v')\| < \delta}$, violating condition (ii).

Finally, we bound ${\mathcal R(\psi_i)}$. For any ${u,v \in V}$ and ${i=1,2,\ldots,k}$, we have

$\displaystyle |\psi_i(u)-\psi_i(v)| \leq \theta_i(u) \left|\vphantom{\bigoplus}\|F(u)\|-\|F(v)\|\right| + \|F(v)\| \cdot |\theta_i(u)-\theta_i(v)|\,. \ \ \ \ \ (5)$

Now using (3),

$\displaystyle \|F(v)\| \cdot |\theta_i(u)-\theta_i(v)| \leq \frac{2}{\delta} \|F(v)\| \cdot \|\hat F(u)-\hat F(v)\| \leq \frac{4}{\delta} \|F(u)-F(v)\|\,, \ \ \ \ \ (6)$

where we have used the fact that for any non-zero vectors ${x,y \in \mathbb R^k}$, we have

$\displaystyle \|x\| \left\|\frac{x}{\|x\|} - \frac{y}{\|y\|}\right\| = \left\|x - \frac{\|x\|}{\|y\|} y\right\| \leq \|x-y\| + \left\|y - \frac{\|x\|}{\|y\|} y\right\| \leq 2 \,\|x-y\|\,.$

Using (6) and the fact that ${\theta_i(u) \leq 1}$ in (5) yields

$\displaystyle |\psi_i(u)-\psi_i(v)| \leq \left(1+\frac{4}{\delta}\right) \|F(u)-F(v)\|\,. \ \ \ \ \ (7)$

Finally, observe that

$\displaystyle \sum_{u \sim v} \|F(u)-F(v)\|^2 = \sum_{i=1}^k \sum_{u \sim v} |f_i(u)-f_i(v)|^2 = d (\lambda_1 + \cdots + \lambda_k) \leq dk\lambda_k\,.$

Combining this with (7) and (4) shows that ${\mathcal R(\psi_i) \leq \lambda_k \frac{k}{\varepsilon}(1+\frac{4}{\delta})^2 \leq 25k\lambda_k/(\varepsilon \delta^2).}$ $\Box$

We will first make the task of finding separated sets slightly easier.

Lemma 4 Suppose that for some ${0 < \delta \leq 1}$ and ${m \geq 1}$, there are subsets ${T_1, T_2, \ldots, T_m \subseteq V}$ which satisfy:

1. ${\displaystyle \sum_{i=1}^m \sum_{v \in T_i} \|F(v)\|^2 \geq k - \frac14}$.
2. For every ${i \neq j}$, if ${u \in T_i}$ and ${v \in T_j}$, then ${ \|\hat F(u)-\hat F(v)\| \geq \delta }$
3. For every ${i=1,2,\ldots, m}$,

$\displaystyle \sum_{v \in T_i} \|F(v)\|^2 \leq 1 + \frac{1}{2k}\,.$

Then there are ${k}$ sets ${U_1, U_2, \ldots, U_k \subseteq V}$ satisfying the assumption of Lemma 3 with ${\varepsilon = \frac{1}{4}}$.

Proof: We can form the desired sets ${\{U_i\}}$ by taking disjoint unions of the sets ${\{T_i\}}$. Begin with the family ${\{T_1, T_2, \ldots, T_m\}}$ and repeatedly replace two sets ${T}$ and ${T'}$ by their union if they each satisfy ${\sum_{v \in T} \|F(v)\|^2 < \frac12}$.

When we finish, we are left with a collection of sets each member of which satisfies ${\sum_{v \in T} \|F(v)\|^2 \in [\frac12, 1+\frac{1}{2k}]}$, and possibly one set for which ${\sum_{v \in T} \|F(v)\|^2 < 1/2}$. By (i), we must end with at least ${k}$ sets which each satisfy ${\sum_{v \in T} \|F(v)\|^2 \geq 1/4}$. $\Box$
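The merging procedure in this proof is simple enough to simulate on a list of masses (the numbers below are made up, subject only to the hypotheses of the lemma with ${k=4}$):

```python
# Sketch of the merging step in Lemma 4: repeatedly merge two parts of
# mass < 1/2, then count the parts of mass >= 1/4.
k = 4
masses = [0.3, 0.2, 1.1, 0.45, 0.4, 0.6, 0.15, 0.7]  # total 3.9 >= k - 1/4

parts = list(masses)
while True:
    small = [i for i, m in enumerate(parts) if m < 0.5]
    if len(small) < 2:
        break
    i, j = small[0], small[1]
    merged = parts[i] + parts[j]  # both < 1/2, so the union has mass < 1
    parts = [m for idx, m in enumerate(parts) if idx not in (i, j)] + [merged]

big = [m for m in parts if m >= 0.25]
print(sorted(parts), len(big))  # at least k parts of mass >= 1/4 survive
```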

Our final lemma, which finishes the proof of Theorem 1, simply asserts that such sets exist. We will no longer need the fact that ${F}$ comes from eigenfunctions.

Lemma 5 Suppose that ${g_1, g_2, \ldots, g_k : V \rightarrow \mathbb R}$ are orthonormal in ${\ell^2(V)}$. Let

$\displaystyle G(v) = (g_1(v), g_2(v), \ldots, g_k(v))$

and ${\hat G(v) = G(v)/\|G(v)\|}$. Then there is an ${m \geq 1}$ and subsets ${T_1, T_2, \ldots, T_m \subseteq V}$ such that

1. ${\displaystyle\sum_{i=1}^m \sum_{v \in T_i} \|G(v)\|^2 \geq k - \frac14}$.
2. For every ${i \neq j}$, if ${u \in T_i}$ and ${v \in T_j}$, then

$\displaystyle \|\hat G(u)-\hat G(v)\| \geq \frac{1}{2\sqrt{2} k^{3}}\,.$

3. For every ${i = 1,2,\ldots,m}$,

$\displaystyle \sum_{v \in T_i} \|G(v)\|^2 \leq 1 + \frac{1}{2k}\,.$

Proof: Consider the ${n \times k}$ matrix ${A}$ which has columns ${g_1, g_2, \ldots, g_k}$. For any ${x \in \mathbb R^k}$, we have

$\displaystyle \sum_{v \in V} \langle x, G(v)\rangle^2 = \|Ax\|^2 = x^T A^T A x = x^T x = \|x\|^2\,, \ \ \ \ \ (8)$

since the columns of ${A}$ are orthonormal.

For a subset ${X \subseteq S^{k-1}}$ of the unit sphere in ${\mathbb R^k}$, we put ${V(X) = \{ v \in V : \hat G(v) \in X \}}$. Fix some ${x \in X}$ and let ${\mathsf{diam}(X)}$ denote the diameter of ${X}$, then by (8), we have

$\displaystyle 1 = \sum_{v \in V} \langle x, G(v)\rangle^2 \geq \sum_{v \in V(X)} \langle x, G(v)\rangle^2 = \sum_{v \in V(X)} \|G(v)\|^2 \left(1-\frac{\|\hat G(v)-x\|^2}{2}\right)^2 \geq \sum_{v \in V(X)} \|G(v)\|^2 \left(1-\frac{\mathrm{diam}(X)^2}{2}\right)^2\,.$

We conclude that if ${\mathrm{diam}(X) \leq 1/\sqrt{2k}}$, then

$\displaystyle \sum_{v \in V(X)} \|G(v)\|^2 \leq 1 + \frac{1}{2k}\,. \ \ \ \ \ (9)$

Now, let ${\mathcal{P}}$ be a partition of ${\mathbb R^k}$ into axis-parallel cubes of side length ${L = \frac{1}{k \sqrt{2}}}$. For any such cube ${Q \in \mathcal P}$, we let ${\tilde Q \subseteq Q}$ denote the set of points which are at Euclidean distance at least ${\frac{L}{4k^2}}$ from every face of ${Q}$. Observe that

$\displaystyle \mathrm{vol}(\tilde Q) \geq \left(1-\frac{1}{4k^2}\right)^k \mathrm{vol}(Q) \geq \left(1-\frac{1}{4k}\right) \mathrm{vol}(Q)\,. \ \ \ \ \ (10)$

Consider now the collection of sets ${\{V(\tilde Q) : Q \in \mathcal P\}}$. Since ${\mathrm{diam}(\tilde Q) \leq L \sqrt{k} = \frac{1}{\sqrt{2k}}}$, (9) implies that ${\sum_{v \in V(\tilde Q)} \|G(v)\|^2 \leq 1 + \frac{1}{2k}}$. Furthermore, by construction, for any ${u \in V(\tilde Q)}$ and ${v \in V(\tilde Q')}$ with ${Q \neq Q'}$, we have

$\displaystyle \|\hat G(u)-\hat G(v)\| \geq 2 \frac{L}{4k^2} \geq \frac{1}{2\sqrt{2} k^{3}}\,.$

Thus the collection of sets ${\{V(\tilde Q) : Q \in \mathcal P\}}$ satisfy both conditions (ii) and (iii) of the lemma. We are left to verify (i).

Note that ${\sum_{v \in V} \|G(v)\|^2 = \sum_{i=1}^k \sum_{v \in V} g_i(v)^2 = k}$. If we choose a uniformly random axis-parallel translation ${\mathcal P'}$ of the partition ${\mathcal P}$, then (10) implies

$\displaystyle \mathbb{E} \sum_{Q \in \mathcal P'} \sum_{v \in V(\tilde Q)} \|G(v)\|^2 \geq \left(1-\frac{1}{4k}\right) \sum_{v \in V} \|G(v)\|^2 \geq k - \frac14\,.$

In particular, there exists some fixed partition that achieves this bound. For this partition ${\mathcal P}$, the sets ${\{V(\tilde Q) : Q \in \mathcal P\}}$ satisfy all three conditions of the lemma, completing the proof.
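The random shifted-grid construction in this proof translates directly into code. Here is a sketch (my own; the random point set stands in for the image of ${\hat G}$), checking that points kept in distinct shrunken cubes are at distance at least twice the margin:

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
k = 3
L = 1.0 / (k * np.sqrt(2))   # cube side length, as in the proof
margin = L / (4 * k ** 2)    # distance kept from each cube face

# Points on the unit sphere standing in for the normalized embedding.
pts = rng.normal(size=(40, k))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

shift = rng.uniform(0, L, size=k)  # random translation of the grid

def cell(p):
    """Index of the cube containing p, or None if p is within `margin` of a face."""
    coords = (p - shift) / L
    frac = coords - np.floor(coords)
    if np.any(frac * L < margin) or np.any((1 - frac) * L < margin):
        return None
    return tuple(np.floor(coords).astype(int))

kept = [(p, cell(p)) for p in pts if cell(p) is not None]

# Points retained in different shrunken cubes are >= 2*margin apart:
# they differ in some coordinate's cube index, and each sits at least
# `margin` inside its own interval along that axis.
sep = min(
    (np.linalg.norm(p - q) for (p, c1), (q, c2) in
     itertools.combinations(kept, 2) if c1 != c2),
    default=np.inf,
)
print(sep >= 2 * margin)  # True
```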

## February 18, 2013

### Talagrand’s Bernoulli Conjecture, resolved.

Filed under: Math — James Lee @ 5:34 pm

Apparently, Bednorz and Latała have solved Talagrand’s \$5,000 Bernoulli Conjecture. The conjecture concerns the supremum of a Bernoulli process.

Consider a finite subset ${T \subseteq \ell^2}$ and define the value

$\displaystyle b(T) = \mathbb{E} \max_{t \in T} \sum_{i \geq 1} \varepsilon_i t_i\,,$

where ${\varepsilon_1, \varepsilon_2, \ldots}$ are i.i.d. random ${\pm 1}$. This looks somewhat similar to the corresponding value

$\displaystyle g(T) = \mathbb{E} \max_{t \in T} \sum_{i \geq 1} g_i t_i\,,$

where ${g_1, g_2, \ldots}$ are i.i.d. standard normal random variables. But while ${g(T)}$ can be characterized (up to universal constant factors) by the Fernique-Talagrand majorizing measures theory, no similar control was known for ${b(T)}$. One stark difference between the two cases is that ${g(T)}$ depends only on the distance geometry of ${T}$, i.e. ${g(A(T))=g(T)}$ whenever ${A}$ is an affine isometry. On the other hand, ${b(T)}$ can depend heavily on the coordinate structure of ${T}$.

There are two basic ways to prove an upper bound on ${b(T)}$. One is via the trivial bound

$\displaystyle b(T) \leq \max_{t \in T} \|t\|_1\,. \ \ \ \ \ (1)$

The other uses the fact that the tails of Gaussians are “fatter” than those of Bernoullis.

$\displaystyle b(T) \leq \sqrt{\frac{\pi}{2}} g(T)\,. \ \ \ \ \ (2)$

This can be proved easily using Jensen’s inequality.
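For a small finite ${T \subseteq \mathbb R^d}$, the value ${b(T)}$ can be computed exactly by averaging over all ${2^d}$ sign patterns, which makes the trivial bound (1) easy to check (a toy example of my own):

```python
import itertools
import numpy as np

# A small T in R^4 (values chosen arbitrarily).
T = [np.array([1.0, 0.5, -0.25, 0.0]),
     np.array([0.2, -0.7, 0.3, 0.9]),
     np.array([-0.5, 0.5, 0.5, 0.5])]
d = 4

# b(T) = E max_{t in T} <eps, t>, computed exactly over all 2^d sign vectors.
b = np.mean([max(np.dot(eps, t) for t in T)
             for eps in itertools.product((-1.0, 1.0), repeat=d)])

# Trivial bound (1): b(T) <= max_t ||t||_1.
print(b, max(np.linalg.norm(t, 1) for t in T))
```

(The Gaussian comparison (2) could be checked the same way, but only approximately, by Monte Carlo over the ${g_i}$.)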

Talagrand’s Bernoulli conjecture is that ${b(T)}$ can be characterized by these two upper bounds.

Bernoulli conjecture: There exists a constant ${C > 0}$ such that for every ${T \subseteq \ell^2}$, there are two subsets ${T_1, T_2 \subseteq \ell^2}$ such that

$\displaystyle T \subseteq T_1 + T_2 = \{ t_1 + t_2 : t_1 \in T_1, t_2 \in T_2 \}\,,$

and

$\displaystyle g(T_1) + \sup_{t \in T_2} \|t\|_1 \leq C b(T)\,.$

Note that this is a “characterization” because given such sets ${T_1}$ and ${T_2}$, equations (1) and (2) imply

$\displaystyle b(T) \leq \sqrt{\frac{\pi}{2}} g(T_1) + \sup_{t \in T_2} \|t\|_1\,.$

The set ${T_1}$ controls the “Gaussian” part of the Bernoulli process, while the set ${T_2}$ controls the part that is heavily dependent on the coordinate structure.

This beautiful problem finally appears to have met a solution. While I don’t know of any applications yet in TCS, it does feel like something powerful and relevant.

## February 7, 2013

### A bit more on expanders…

Filed under: Math — James Lee @ 11:56 am

So the lecture notes I posted a year ago giving a “simpler proof” of expansion for the Margulis-Gabber-Galil graphs turn out to be very similar to arguments of M. Burger (1991) and Y. Shalom (1999) for proving that $SL_2(\mathbb Z) \ltimes \mathbb Z^2$ has property (T). See Section 4.2 of the book of Bekka, de la Harpe, and Valette.

Anyone interested in expander graphs should take a look at Amir Yehudayoff’s recent article in the SIGACT newsletter: Proving expansion in three steps. Amir uses a ping-pong lemma argument (which is essentially the same as the “combinatorial lemma” in the preceding post) to exemplify the “opening” of the three-part strategy of Bourgain and Gamburd for proving expansion in Cayley graphs.

## January 27, 2013

### Expanders from the action of GL(2,Z)

Filed under: Math — Tags: , , , — James Lee @ 8:14 pm

Many months (and only one post) ago, I wrote about an analysis of the Gabber-Galil graphs using a simple combinatorial lemma and the discrete Cheeger inequality. Here is a note I posted recently carrying the study a bit further.

Given any two invertible, integral matrices ${S,T \in GL_2(\mathbb Z)}$, one can consider the family of graphs ${G_n^{S,T} = (V_n, E_n^{S,T})}$, where ${V_n = \mathbb Z_n \oplus \mathbb Z_n}$ (with all arithmetic modulo ${n}$) and ${E_n^{S,T}}$ contains edges from every ${(x,y) \in V_n}$ to each of

$\displaystyle (x \pm 1,y), (x,y\pm 1), S(x,y), S^{-1}(x,y), T(x,y), T^{-1}(x,y)\,.$

The Gabber-Galil graphs correspond to the choice ${S = \left(\begin{smallmatrix} 1 & 1 \\ 0 & 1 \end{smallmatrix}\right)}$ and ${T = \left(\begin{smallmatrix} 1 & 0 \\ 1 & 1 \end{smallmatrix}\right)}$.
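As a concrete illustration (a minimal Python sketch of my own, not part of the original note), one can build the multigraph ${G_n^{S,T}}$ directly from this definition, with all arithmetic modulo ${n}$:

```python
def build_graph(n, S, T):
    """Adjacency lists (as multisets) for G_n^{S,T} on Z_n x Z_n.

    S and T are 2x2 integer matrices given as ((a, b), (c, d)).
    Edges: the four grid moves (x±1, y), (x, y±1), plus
    S, S^{-1}, T, T^{-1} applied to (x, y), all modulo n."""
    def apply(M, x, y):
        (a, b), (c, d) = M
        return ((a * x + b * y) % n, (c * x + d * y) % n)

    def inverse(M):
        (a, b), (c, d) = M
        det = a * d - b * c          # det = ±1 since M is in GL_2(Z)
        return ((det * d, -det * b), (-det * c, det * a))

    adj = {(x, y): [] for x in range(n) for y in range(n)}
    for (x, y) in adj:
        nbrs = [((x + 1) % n, y), ((x - 1) % n, y),
                (x, (y + 1) % n), (x, (y - 1) % n)]
        for M in (S, inverse(S), T, inverse(T)):
            nbrs.append(apply(M, x, y))
        adj[(x, y)] = nbrs
    return adj

# Gabber-Galil choice: S = [[1,1],[0,1]], T = [[1,0],[1,1]]
adj = build_graph(5, ((1, 1), (0, 1)), ((1, 0), (1, 1)))
print(len(adj), len(adj[(1, 2)]))   # 25 vertices, each of degree 8
```

The neighbor lists are kept as multisets, so parallel edges (and self-loops at points fixed by ${S}$ or ${T}$) are retained, matching the degree-8 multigraph described below.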

Consider also the countably infinite graph ${G^{S,T}}$ with vertex set ${\mathbb Z^2 \setminus \{0\}}$ and edges

$\displaystyle E^{S,T} = \left\{ \{z,S z\}, \{z, T z\} : z \in \mathbb Z^2 \setminus \{0\} \right\}\,.$

Using some elementary Fourier analysis and a discrete Cheeger inequality, one can prove the following relationship (we use $S'$ to represent the transpose of a matrix $S$).

Theorem 1 For any ${S,T \in GL_2(\mathbb Z)}$, if ${G^{S',T'}}$ has positive Cheeger constant, then ${\{G_n^{S,T}\}}$ is a family of expander graphs.

We recall that an infinite graph ${G=(V,E)}$ with uniformly bounded degrees has positive Cheeger constant if there is a number ${\epsilon > 0}$ such that every finite subset ${U \subseteq V}$ has at least ${\epsilon |U|}$ edges with exactly one endpoint in ${U}$.

While Theorem 1 may not seem particularly powerful, it turns out that in many interesting cases, proving a non-trivial lower bound on the Cheeger constant of ${G^{S,T}}$ is elementary. For the Gabber-Galil graphs, the argument is especially simple. The following argument is, in fact, significantly simpler than the analysis of the previous post.

In the next lemma, let $G=G^{S,T}$ where $S(x,y)=(x,x+y)$ and $T(x,y)=(x+y,y)$ (these are the transposes of the Gabber-Galil generators, in accordance with Theorem 1), and let $E=E^{S,T}$ be the corresponding edge set. We will prove that $G$ has positive Cheeger constant.

Lemma 2 For any finite subset ${A \subseteq \mathbb Z^2 \setminus \{0\}}$, we have ${ |E(A, \bar A)| \geq |A| }$.

Proof: Define ${Q_1 = \{ (x,y) \in \mathbb Z^2 : x > 0, y \geq 0 \}}$. This is the first quadrant, without the ${y}$-axis and the origin. Define ${Q_2, Q_3, Q_4}$ similarly by rotating ${Q_1}$ by ${90}$, ${180}$, and ${270}$ degrees, respectively, and note that we have a partition ${\mathbb Z^2 \setminus \{0\} = Q_1 \cup Q_2 \cup Q_3 \cup Q_4}$.

Let ${A_i = A \cap Q_i}$. We will show that ${|E(A_1, \bar A \cap Q_1)| \geq |A_1|}$. Since our graph is invariant under rotations of the plane by ${90^{\circ}}$, this will imply our goal:

$\displaystyle |E(A, \bar A)| \geq \sum_{i=1}^4 |E(A_i, \bar A \cap Q_i)| \geq \sum_{i=1}^4 |A_i| = |A|\,.$

It is immediate that ${S(A_1), T(A_1) \subseteq Q_1}$. Furthermore, we have ${S(A_1) \cap T(A_1) = \emptyset}$ because ${S}$ maps points of ${Q_1}$ above (or onto) the line ${y=x}$, while ${T}$ maps points of ${Q_1}$ strictly below it. Since ${S}$ and ${T}$ are bijections, ${|S(A_1) \cup T(A_1)| = |S(A_1)| + |T(A_1)| = 2|A_1|\,.}$ These ${2|A_1|}$ points all lie in ${Q_1}$, and at most ${|A_1|}$ of them belong to ${A}$, so at least ${|A_1|}$ of them lie in ${\bar A \cap Q_1}$, each contributing a distinct edge from ${A_1}$. This yields ${|E(A_1, \bar A \cap Q_1)| \geq |A_1|}$, as desired. $\Box$
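One can sanity-check the lemma numerically. The sketch below (my own illustration, using the maps ${S(x,y)=(x,x+y)}$ and ${T(x,y)=(x+y,y)}$ from the proof) counts boundary edges of random finite sets ${A \subseteq \mathbb Z^2 \setminus \{0\}}$:

```python
import random

# The maps generating the infinite graph, and their inverses.
S     = lambda z: (z[0], z[0] + z[1])
T     = lambda z: (z[0] + z[1], z[1])
S_inv = lambda z: (z[0], z[1] - z[0])
T_inv = lambda z: (z[0] - z[1], z[1])

def boundary_edges(A):
    """Count edges {z, Sz} and {z, Tz} with exactly one endpoint in A.
    Each boundary edge is counted from its endpoint inside A, scanning
    the four maps; self-loops at fixed points are never counted."""
    A = set(A)
    return sum(1 for z in A
                 for w in (S(z), T(z), S_inv(z), T_inv(z))
                 if w not in A)

rng = random.Random(0)
for trial in range(200):
    k = rng.randint(1, 40)
    A = {(rng.randint(-6, 6), rng.randint(-6, 6)) for _ in range(k)}
    A.discard((0, 0))
    if A:
        assert boundary_edges(A) >= len(A)   # the bound of Lemma 2
print("all trials passed")
```

Of course a random test proves nothing, but it is a cheap way to catch a sign error in the maps or in the quadrant bookkeeping.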

One can generalize the Gabber-Galil graphs in a few different ways. As a prototypical example, consider the family ${\{G_n^{S,S'}\}}$ for any ${S \in GL_2(\mathbb Z)}$. An elementary analysis yields the following characterization.

Theorem 3 For any ${S =\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right) \in GL_2(\mathbb Z)}$, it holds that ${\{G_n^{S,S'}\}}$ is an expander family if and only if ${(a+d)(b-c) \neq 0}$.

For instance, the preceding theorem implies that if ${S}$ has order dividing ${4}$ then ${\{G_n^{S,S'}\}}$ is not a family of expander graphs, but if ${S}$ has order ${3}$ or ${6}$, or infinite order (the only other possibilities in ${GL_2(\mathbb Z)}$), and ${S \neq S'}$, then the graphs are expanders.

Here is a different sort of generalization. Let ${R = \left(\begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix}\right)}$ be the reflection across the line ${y=x}$. The Gabber-Galil graphs can also be seen as ${G_n^{S,T}}$ where ${S = \left(\begin{smallmatrix} 1 & 1 \\ 0 & 1 \end{smallmatrix}\right)}$ and ${T = RSR}$. We can characterize expansion of these graphs as well.

Theorem 4 For any ${S =\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right) \in GL_2(\mathbb Z)}$, it holds that ${\{G_n^{S,RSR}\}}$ is an expander family if and only if ${(a+d)(b+c) \neq 0}$.

It is an interesting open problem to find a complete characterization of the pairs ${S,T \in GL_2(\mathbb Z)}$ such that ${\{G_n^{S,T}\}}$ is a family of expanders.

## May 2, 2012

### Solved problems, open problems, and re-solved problems

Filed under: Math, Reading — James Lee @ 12:15 am

Two open problems from this blog were solved recently. First, there is a solution by Bartal, Gottlieb, and Krauthgamer of the Doubling TSP problem. Next, Raghu Meka gave a deterministic procedure to compute the expected supremum of a Gaussian process.

If you’re thirsting for more open problems, I suggest this list on analysis of Boolean functions, compiled by Ryan O’Donnell in conjunction with the 2012 Simons Symposium.

Finally, in the realm of re-proving theorems, let me mention this absolutely gorgeous proof by Lovett and Meka of Spencer’s six standard deviations suffice theorem in discrepancy theory. The proof is constructive (i.e., there exists a polynomial-time algorithm to construct the 2-coloring) as in recent groundbreaking work of Bansal, but unlike Bansal’s proof is independent of Spencer’s bound. It also happens to be elementary and beautiful; my thanks to Aravind Srinivasan for pointing it out.

## April 18, 2012

### Gabber-Galil analysis of Margulis’ expanders

Filed under: Math — Tags: , , — James Lee @ 8:11 pm

I’m currently teaching a course on spectral graph theory, and I decided to lecture on Margulis’ expander construction and the analysis of its spectral gap by Gabber and Galil. I had never realized how intuitive the analysis could be; the lecture notes I had seen didn’t quite reflect this. In particular, we won’t do any unsightly calculations. Everything I’m about to say is probably well-known, but I thought I would write it down anyway.

The idea is to first start with an initial “expanding object,” and then try to construct a family of graphs out of it. First, consider the infinite graph ${G}$ with vertex set ${\mathbb Z^2}$. The edges are given by two maps ${S}$ and ${T}$, where ${S(x,y) = (x,x+y)}$ and ${T(x,y) = (x+y,y)}$. So the edges are ${\{(x,y), S(x,y)\}}$ and ${\{(x,y), T(x,y)\}}$. Clearly ${(0,0)}$ is not adjacent to anything. Except for the origin, this graph is an expander in the following sense.

Lemma 1 For any finite subset ${A \subseteq \mathbb Z^2 \setminus \{0\}}$, we have

$\displaystyle |E(A)| \geq \frac{1}{3} |A|\,,$

where ${E(A)}$ denotes the set of edges leaving ${A}$.

The following simple proof is inspired by a paper of Linial and London.

Proof: First consider a subset ${A}$ that does not intersect the coordinate axes. Let ${Q_1, Q_2, Q_3, Q_4}$ represent the four quadrants of ${\mathbb Z^2}$, and let ${A_i = A \cap Q_i}$. Consider that ${S(A_1), T(A_1) \subseteq Q_1}$ and ${S(A_1) \cap T(A_1) = \emptyset}$. The latter fact follows because if ${(x,x+y)=(x'+y',y')}$ then ${x'=-y}$. Similarly, ${S^{-1}(A_2), T^{-1}(A_2) \subseteq Q_2}$ and ${S(A_3), T(A_3) \subseteq Q_3}$ and ${S^{-1}(A_4), T^{-1}(A_4) \subseteq Q_4}$, while all the respective pairs ${S^{-1}(A_2), T^{-1}(A_2)}$ and ${S(A_3), T(A_3)}$ and ${S^{-1}(A_4), T^{-1}(A_4)}$ are disjoint.

Combining this with the fact that ${S}$ and ${T}$ are bijections on ${\mathbb Z^2}$ immediately shows that ${|S(A) \cup S^{-1}(A) \cup T(A) \cup T^{-1}(A)|}$ is at least

$\displaystyle |S(A_1)| + |T(A_1)| + |S^{-1}(A_2)| + |T^{-1}(A_2)|$

$\displaystyle + |S(A_3)| + |T(A_3)| + |S^{-1}(A_4)| + |T^{-1}(A_4)|$

$\displaystyle = 2(|A_1| + |A_2| + |A_3| + |A_4|)$

$\displaystyle = 2 |A|\,.$

Dealing with the coordinate axes is not too hard. Suppose now that ${A \subseteq \mathbb Z^2 \setminus \{0\}}$ is arbitrary. Let ${A_{+} = A \cap \{ (x,0), (0,x) : x \in \mathbb Z \}}$. The preceding argument shows ${|E(A)| \geq |A\setminus A_+|-|A_+| = |A|-2|A_+|}$: the set ${A \setminus A_+}$ avoids the axes, so it has at least ${|A \setminus A_+|}$ neighbors outside itself, and at most ${|A_+|}$ of those can lie in ${A}$. We will show that ${|E(A)| \geq |A_+|}$. Averaging these two inequalities with weights ${1/3}$ and ${2/3}$ completes the proof.

Let ${A_\times = A \cap \{ (x,x), (x,-x) : x \in \mathbb Z \}}$, and ${A_\circ = A \setminus (A_+ \cup A_\times)}$. Then we have the bounds:

$\displaystyle |E(A_+, \bar A)| \geq |A_+| - |A_\times|$

$\displaystyle |E(A_\times, \bar A)| \geq 2|A_\times| - |A_\circ|$

$\displaystyle |E(A_\circ, \bar A)| \geq |A_\circ| - |A_\times|\,.$

The first inequality follows since each element of ${A_+}$ is connected to exactly two elements of ${A_{\times}}$, and each element of ${A_{\times}}$ is connected to exactly two elements of ${A_+}$. For instance, ${(x,0)}$ is connected to ${(x,x)}$ and ${(x,-x)}$, while ${(x,x)}$ is connected to ${(x,0)}$ and ${(0,x)}$.

The second follows because every point of ${A_\times}$ is connected to two distinct points lying off the axes and diagonals, e.g. ${(x,x)}$ is connected to ${(x,2x)}$ and ${(2x,x)}$, and at most ${|A_\circ|}$ of these neighbors can lie in ${A}$. The final inequality follows from the fact that ${|E(A_\circ)| \geq |A_\circ|}$ from our first argument (since ${A_{\circ}}$ does not touch the axes), and because ${A_\circ}$ has no edges to ${A_+}$. Summing these three inequalities yields ${|E(A)| \geq |A_+|}$. $\Box$

Of course, ${\mathbb Z^2}$ is not a finite graph, so for a parameter ${n}$, we define the graph ${G_n=(V_n,E_n)}$ with vertex set ${V_n = \mathbb Z_n \oplus \mathbb Z_n}$, where ${\mathbb Z_n = \mathbb Z/(n \mathbb Z)}$. There are four types of edges in ${E_n}$: A vertex ${(x,y)}$ is connected to the vertices

$\displaystyle \{(x,y+1), (x+1, y), (x,x+y), (x+y,y)\}\,.$

This yields a graph of degree at most 8.

Theorem 2 There is a constant ${c > 0}$ such that for every ${n \in \mathbb N}$,

$\displaystyle \lambda_2(G_n) \geq c\,,$

where ${\lambda_2}$ is the second-smallest eigenvalue of the Laplacian on ${G_n}$.
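Here is a quick numerical illustration of Theorem 2 (my own sketch, using numpy): building the Laplacian of ${G_n}$ from the edge list above and computing ${\lambda_2}$ for a few values of ${n}$ exhibits a gap bounded away from zero.

```python
import numpy as np

def laplacian(n):
    """Unnormalized Laplacian of G_n on Z_n x Z_n: vertex (x, y) is joined
    to (x, y+1), (x+1, y), (x, x+y), (x+y, y); since the graph is
    undirected, the inverse moves appear automatically.  Self-loops
    (at points fixed by a map) cancel in L and can be ignored."""
    N = n * n
    idx = lambda x, y: (x % n) * n + (y % n)
    L = np.zeros((N, N))
    for x in range(n):
        for y in range(n):
            u = idx(x, y)
            for vx, vy in ((x, y + 1), (x + 1, y), (x, x + y), (x + y, y)):
                v = idx(vx, vy)
                # add each undirected edge once
                L[u, u] += 1
                L[v, v] += 1
                L[u, v] -= 1
                L[v, u] -= 1
    return L

# second-smallest Laplacian eigenvalue for a few n (eigvalsh sorts ascending)
lam2 = {n: float(np.linalg.eigvalsh(laplacian(n))[1]) for n in (3, 5, 8, 13)}
for n, lam in lam2.items():
    print(n, round(lam, 3))
```

This is only a finite sample, of course; the content of Theorem 2 is that the gap persists for all ${n}$.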

Of course, the graphs ${\{G_n\}}$ now have torsion, and thus our expansion result for ${\mathbb Z^2}$ is not, a priori, very useful. The non-trivial idea of the proof is that the torsion doesn’t matter, making Lemma 1 applicable. This is not difficult to show using some Fourier analysis. It turns out to be better, though, if we first move to the continuous torus.

Recall that

$\displaystyle \lambda_2(G_n) = \min_{f : V_n \rightarrow \mathbb R} \left\{\frac{\sum_{uv \in E_n} (f(u)-f(v))^2}{\sum_{u \in V_n} f(u)^2} : \sum_{u \in V_n} f(u) = 0\right\}\,.$

Let ${\mathbb T^2 = \mathbb R^2/\mathbb Z^2}$ be the 2-dimensional torus equipped with the Lebesgue measure, and consider the complex Hilbert space

$\displaystyle L^2(\mathbb T^2) = \left\{ f : \mathbb T^2 \rightarrow \mathbb C : \int_{\mathbb T^2} |f|^2 < \infty \right\}\,,$

equipped with the inner product ${\langle f,g\rangle_{L^2} = \int_{\mathbb T^2} f \bar g}$.

We might also define a related value,

$\displaystyle \lambda_2(\mathbb T^2) = \min_{f \in L^2(\mathbb T^2)} \left\{\frac{\|f-f\circ S\|_{L^2}^2 + \|f-f \circ T\|_{L^2}^2}{\|f\|_{L^2}^2} : \int_{\mathbb T^2} f = 0\right\}\,. \ \ \ \ \ (1)$

Note that this is just defined as a number; the eigenvalue notation is merely suggestive, but we have not introduced an operator on the torus.

Claim 1 There is some ${\varepsilon > 0}$ such that for any ${n \in \mathbb N}$, we have ${\lambda_2(G_n) \geq \varepsilon \lambda_2(\mathbb T^2)}$.

Proof: We simply sketch a proof, which is rather intuitive. Suppose we are given some map ${f : V_n \rightarrow \mathbb R}$ such that ${\sum_{u \in V_n} f(u)=0}$. Define its continuous extension ${\tilde f : \mathbb T^2 \rightarrow \mathbb R}$ as follows: Under the natural embedding of ${\mathbb Z_n \oplus \mathbb Z_n}$ into ${\mathbb T^2}$, every point ${z \in \mathbb T^2}$ sits inside a grid square with four corners ${u_1,u_2,u_3,u_4 \in \mathbb Z_n \oplus \mathbb Z_n}$. Writing a local convex combination ${z = \lambda_1 u_1 + \lambda_2 u_2 + \lambda_3 u_3 + \lambda_4 u_4}$, we define

$\displaystyle \tilde f(z) = \lambda_1 f(u_1) + \lambda_2 f(u_2) + \lambda_3 f(u_3) + \lambda_4 f(u_4)\,.$

Now it is elementary to verify that ${\int_{\mathbb T^2} \tilde f = 0}$. It is also easy to verify that ${\|\tilde f\|^2_{L^2} \geq \frac{c}{n^2} \sum_{v \in V_n} f(v)^2}$ for some ${c > 0}$, since each of the ${n^2}$ grid squares has area ${1/n^2}$. (Even if ${f(u_1)+f(u_2)+f(u_3)+f(u_4)=0}$, we still get a contribution from ${f(u_1)^2 + f(u_2)^2 + f(u_3)^2 + f(u_4)^2}$ to ${\|\tilde f\|_{L^2}}$ on this square because we are taking a weighted average.)

Finally, for any ${z \in \mathbb T^2}$, there is a path of length ${O(1)}$ in ${G_n}$ connecting each of the corners of ${z}$’s square to the corners of ${S(z)}$’s square. A similar fact holds for ${T(z)}$. In fact, this is the only place where we need to use the fact that edges of the form ${(x,y) \leftrightarrow (x,y+1)}$ and ${(x,y) \leftrightarrow (x+1,y)}$ are present in ${G_n}$. Thus any contribution ${|\tilde f(z)-\tilde f(S(z))|^2}$ to ${\|\tilde f-\tilde f\circ S\|_{L^2}^2}$ can be charged against a term in ${\sum_{uv \in E_n} (f(u)-f(v))^2}$, and similarly for ${T}$. $\Box$

Now our goal is to show that ${\lambda_2(\mathbb T^2) > 0}$. We will use the Fourier transform to “remove the torsion.” The point is that ${S}$ and ${T}$, being shift operators, will act rather nicely on the Fourier basis.

We recall that if ${m,n \in \mathbb Z}$ and we define ${\chi_{m,n} \in L^2(\mathbb T^2)}$ by ${\chi_{m,n}(x,y) = \exp(2\pi i(mx+ny))}$, then ${\{\chi_{m,n}\}}$ forms an orthonormal Hilbert basis for ${L^2(\mathbb T^2)}$. In particular, every ${f \in L^2(\mathbb T^2)}$ can be written uniquely as

$\displaystyle f = \sum_{m,n \in \mathbb Z} \hat f_{m,n} \chi_{m,n}\,,$

where ${\hat f_{m,n} = \langle f, \chi_{m,n}\rangle_{L^2}}$. The Fourier transform is defined as a map ${\mathcal F~:~L^2(\mathbb T^2) \rightarrow \ell^2(\mathbb Z^2)}$, where ${\mathcal F f(m,n) = \hat f_{m,n}}$. In particular, ${\mathcal F}$ is a linear isometry.

Define ${S^*(f) = f \circ S}$ and ${T^*(f) = f \circ T}$. Then for any ${m,n \in \mathbb Z}$, we have

$\displaystyle S^*(\chi_{m,n}) = \chi_{m+n,n} \quad\textrm{ and }\quad T^*(\chi_{m,n}) = \chi_{m,n+m}.$
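These two identities are easy to confirm numerically; the following sketch (my own, using the maps ${S(x,y)=(x,x+y)}$ and ${T(x,y)=(x+y,y)}$ from the construction) checks them at random points of the torus:

```python
import cmath
import random

def chi(m, n):
    """The character chi_{m,n}(x, y) = exp(2*pi*i*(m*x + n*y))."""
    return lambda x, y: cmath.exp(2j * cmath.pi * (m * x + n * y))

S = lambda x, y: (x, x + y)   # the shear maps from the construction
T = lambda x, y: (x + y, y)

rng = random.Random(1)
for _ in range(100):
    m, n = rng.randint(-5, 5), rng.randint(-5, 5)
    x, y = rng.random(), rng.random()
    # S^*(chi_{m,n}) = chi_{m+n,n}  and  T^*(chi_{m,n}) = chi_{m,n+m}
    assert abs(chi(m, n)(*S(x, y)) - chi(m + n, n)(x, y)) < 1e-9
    assert abs(chi(m, n)(*T(x, y)) - chi(m, n + m)(x, y)) < 1e-9
print("character identities verified")
```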

In particular, for any ${f \in L^2(\mathbb T^2)}$, we have ${S^*(f) = \sum_{m,n} \hat f_{m-n,n} \chi_{m,n}}$ and ${T^*(f) = \sum_{m,n} \hat f_{m,n-m} \chi_{m,n}}$. The final thing to note is that ${\hat f_{0,0} = \langle f, \chi_{0,0} \rangle = \int_{\mathbb T^2} f}$. So now if we simply apply the Fourier transform (a linear isometry) to the expression in (1), we get a reformulation that ${\lambda_2(\mathbb T^2)}$ is precisely

$\displaystyle \min_{g \in \ell^2(\mathbb Z^2)} \left\{ \frac{\sum_{z \in \mathbb Z^2} |g(z)-g(S(z))|^2 + |g(z)-g(T(z))|^2}{\|g\|^2_{\ell^2}}: g(0,0)=0 \right\}\,.$

Here we have simply replaced ${f}$ by ${\mathcal F f}$ in (1), and then written ${g = \mathcal F f}$.

But now recall our initial infinite graph ${G}$ on ${\mathbb Z^2}$. If we denote by ${L_G}$ the Laplacian on ${G}$, then we can rewrite this as,

$\displaystyle \lambda_2(\mathbb T^2) = \min_{g \in \ell^2(\mathbb Z^2)} \left\{ \frac{\langle g, L_G g\rangle_{\ell^2}}{\|g\|^2_{\ell^2}}: g(0,0)=0 \right\}\,.$

In other words, it is precisely the first Dirichlet eigenvalue on ${G}$, subject to the boundary condition ${g(0,0)=0}$.

But now the discrete Cheeger inequality tells us that

$\displaystyle \lambda_2(\mathbb T^2) \geq \frac1{2 d_{\max}} h^2,$

where ${h}$ is the minimal expansion of any finite set not containing the origin. Thus we have indeed unwrapped the torsion and returned to our initial question. Lemma 1 shows that ${h \geq 1/3}$, yielding the desired lower bound on ${\lambda_2(\mathbb T^2)}$.
