Hypercontractivity and its Applications

My student Punya Biswal just completed this great survey on hypercontractivity and its application in computer science. There is a PDF version from his home page, and accompanying slides.

Hypercontractive inequalities are a useful tool in dealing with extremal questions in the geometry of high-dimensional discrete and continuous spaces. In this survey we trace a few connections between different manifestations of hypercontractivity, and also present some relatively recent applications of these techniques in computer science.

1. Preliminaries and notation

Fourier analysis on the hypercube. We define the inner product {\langle f,g \rangle = \mathop{\mathbb E}_{x}f(x)g(x)} on functions {f,g \colon \{-1,1\}^{n} \rightarrow {\mathbb R}}, where the expectation is taken over the uniform (counting) measure on {\{-1,1\}^n}. The multilinear polynomials {\chi_{S}(x)=\prod_{i\in S}x_{i}} (where {S} ranges over subsets of {[n]}) form an orthogonal basis under this inner product; they are called the Fourier basis. Thus, for any function {f \colon \{-1,1\}^{n}\rightarrow{\mathbb R}}, we have {f = \sum_{S\subseteq[n]}\hat{f}(S)\chi_{S}(x)}, where the Fourier coefficients {\hat{f}(S)=\langle f,\chi_{S}\rangle} obey Plancherel’s relation {\sum\hat{f}(S)^{2}=1}. It is easy to verify that {\mathop{\mathbb E}_{x}f(x)=\hat{f}(0)} and {\textsf{Var}_{x}f(x)=\sum_{S\neq\emptyset}\hat{f}(S)^{2}}.

Norms. For {1\leq p<\infty}, define the {\ell_{p}} norm {\|f\|_{p}=(\mathop{\mathbb E}_{x}|f(x)|^{p})^{1/p}}. These norms are monotone in {p}: for every function {f}, {p\geq q} implies {\|f\|_{p}\geq\|f\|_{q}}. For a linear operator {M} carrying functions {f \colon \{-1,1\}^{n}\rightarrow{\mathbb R}} to functions {Mf=g \colon \{-1,1\}^{n}\rightarrow{\mathbb R}}, we define the {p}-to-{q} operator norm {\|M\|_{p\rightarrow q}=\sup_{f}\|Mf\|_{q}/\|f\|_{p}}. {M} is said to be a contraction from {\ell_{p}} to {\ell_{q}} when {\|M\|_{p\rightarrow q}\leq1}. Because of the monotonicity of norms, a contraction from {\ell_{p}} to {\ell_{p}} is automatically a contraction from {\ell_{p}} to {\ell_{q}} for any {q<p}. When {q>p} and {\|M\|_{p\rightarrow q}\leq1}, then {M} is said to be hypercontractive.

Convolution operators. Letting {xy} represent the coordinatewise product of {x, y \in \{-1,1\}^n}, we define the convolution {(f*g)(x)=\mathop{\mathbb E}_{y}f(y)g(xy)} of two functions {f,g \colon \{-1,1\}^{n}\rightarrow{\mathbb R}}, and note that it is a linear operator {f\mapsto f*g} for every fixed {g}. Convolution is commutative and associative, and the Fourier coefficients of a convolution satisfy the useful property {\widehat{f*g}=\hat{f}\hat{g}}. We shall be particularly interested in the convolution properties of the following functions

  • The Dirac delta {\delta \colon \{-1,1\}^{n}\rightarrow{\mathbb R}}, given by {\delta(1,\dotsc,1)=1} and {\delta(x)=0} otherwise. It is the identity for convolution and has {\hat{\delta}(S)=1} for all {S\subseteq[n]}.
  • The edge functions {h_{i} \colon \{-1,1\}^{n}\rightarrow{\mathbb R}} given by

    \displaystyle h_{i}(x)= \begin{cases} \phantom{-}1/2 & x=(1,\dotsc,1)\\ -1/2 & x_{i}=-1,x_{[n]\setminus\{i\}}=(1,\dotsc,1)\\ \phantom{-}0 & \text{otherwise.} \end{cases}

    {\hat{h}_{i}(S)} is {1} or {0} according as {S} contains or does not contain {i}, respectively. For any function {f \colon \{-1,1\}^{n}\rightarrow{\mathbb R}}, {(f*h_{i})(x)=(f(x)-f(y))/2}, where {y} is obtained from {x} by flipping just the {i}th bit. Convolution with {h_{i}} acts as an orthogonal projection (as we can easily see in the Fourier domain), so for any functions {f,g \colon \{-1,1\}^{n}\rightarrow{\mathbb R}}, we have {\langle f*h_{i},g\rangle=\langle f,h_{i}*g\rangle=\langle f*h_{i},g*h_{i}\rangle}

  • The Bonami-Gross-Beckner noise functions {\textsf{BG}_{\rho} \colon \{-1,1\}^{n}\rightarrow{\mathbb R}} for {0\leq\rho\leq1}, where {\widehat{\textsf{BG}}_{\rho}(S)=\rho^{|S|}} and we define {0^{0}=1}. These operators form a semigroup, because {\textsf{BG}_{\sigma}*\textsf{BG}_{\rho}=\textsf{BG}_{\sigma\rho}} and {\textsf{BG}_{1}=\delta}. Note that {\textsf{BG}_{\rho}(x)=\sum_{S}\rho^{|S|}\chi_{S}(x)=\prod_{i}(1+\rho x_{i})}. We define the noise operator {T_{\rho}} acting on functions on the discrete cube by {T_{\rho}f=\textsf{BG}_{\rho}*f}. In combinatorial terms, {(T_{\rho}f)(x)} is the expected value of {f(y)}, where {y} is obtained from {x} by independently flipping each bit of {x} with probability {1-\rho}.

Lemma 1 {\frac{d}{d\rho}\textsf{BG}_{\rho}=\frac{1}{\rho}\textsf{BG}_{\rho}*\sum h_{i}}

Proof: This is easy in the Fourier basis:

\displaystyle \widehat{\textsf{BG}}_{\rho}' = (\rho^{|S|})' = |S|\rho^{|S|-1} = \sum_{i\in[n]}\hat{h}_{i}\frac{\widehat{\textsf{BG}}_{\rho}}{\rho}.

Continue reading