Aram Harrow, Quantum Information, and Dvoretzky’s theorem

I’m delighted to announce that Aram Harrow will be joining us at UW as a visiting professor for the coming academic year. Aram is one of the young leaders in quantum information and computation. During a recent visit, Aram started to explain to me how ideas from asymptotic geometric analysis have started to become important in quantum information.

The capacity of quantum channels

A mixed quantum state on {\mathbb C^d} is represented by a density matrix {\rho}, which is {d \times d} Hermitian, positive semi-definite matrix with {\mathrm{tr}(\rho) = 1}. The von Neumann entropy of {\rho} is given by

\displaystyle  S(\rho) = - \mathrm{tr}(\rho \log \rho).

If {\mathcal M_d} is the space of {d \times d} matrices with complex entries, then a quantum channel is a mapping

\displaystyle  \Phi : \mathcal M_d \rightarrow \mathcal M_k

for some {d,k \geq 1}, which sends PSD matrices to PSD matrices and preserves the trace. Finally, the minimal output entropy of a quantum channel is

\displaystyle  S_{\min}(\Phi) = \min_{\rho} S(\Phi(\rho)),

where {\rho} ranges over all mixed states on {\mathbb C^d}. Since the entropy is a concave function, this minimum is achieved at a pure state.

Work of Shor showed that the following conjecture is equivalent to a number of long-standing problems in quantum information theory.

Additivity conjecture for minimal output von Neumann entropy: Is is true that, for every two quantum channels {\Phi_1} and {\Phi_2}, we have:

\displaystyle  S_{\min}(\Phi_1 \otimes \Phi_2) = S_{\min}(\Phi_1) + S_{\min}(\Phi_2).

Observe that we always have {\leq} above, by considering the product state {\rho_1 \otimes \rho_2} where {\rho_1} and {\rho_2} are minimizers for {\Phi_1} and {\Phi_2}, respectively. The conjecture is about whether we can get a smaller output entropy by entangling the the inputs to {\Phi_1 \otimes \Phi_2}.

Hastings’ counterexample and Dvoretzky’s theorem

Hastings proved that the conjecture is false: By appropriately choosing randomly constructed quantum channels, it is possible to achieve

\displaystyle  S_{\min}(\Phi_1 \otimes \Phi_2) < S_{\min}(\Phi_1) + S_{\min}(\Phi_2).

Remarkably, it has recently been shown by Aubrun, Szarek, and Werner that this counterexample follows from an appropriate version of Dvoretzky’s theorem on almost-Euclidean subspaces of normed spaces (for a refresher, see this earlier post on pseudorandom subspaces).

Let {\mathcal M_{k,d}} be the space of {k \times d} matrices with complex entries. We will consider two norms on matrices {M \in \mathcal M_{k,d}}: The Hilbert-Schmidt norm {\|M\|_2 = \sqrt{\mathrm{tr} |M|^2} = \sqrt{\sum_{i,j} M_{ij}^2}}, and the Schatten {4}-norm {\|M\|_4 = \left(\mathrm{tr} |M|^4\right)^{1/4}}, where {|M| = \sqrt{M^* M}}. This latter quantity is precisely the {\ell^4} norm of the singular values of {M}. Dvoretzky’s theorem tells us (qualitatively) that an appropriate random subspace of the normed space {(\mathcal M_{k,d}, \|\cdot\|_4)} will be nearly Euclidean.

More precisely, for every {k \geq 1}, a uniformly random subspace {\mathcal W \subseteq \mathcal M_{k,O(k^2)}} of dimension {\Omega(k^2)} will, with high probability, satisfy: For every {M \in \mathcal W},

\displaystyle  k^{-1/4} \|M\|_2 \leq \|M\|_4 \leq \left(1+\frac{O(1)}{k}\right) k^{-1/4} \|M\|_2,

i.e. it will be {\left(1+O(1/k)\right)}-Euclidean. This strong quantitative version follows along the lines of enhancements to Milman’s proof of Dvoretzky’s theorem.

Oscillations and chaining

The proof of the preceding quantitative bound is based on a version of Dvoretzky’s theorem for Lipschitz functions, which I will state here in the real case for simplicity. We will use {S^{n-1} \subseteq \mathbb R^n} for the {(n-1)}-dimensional unit sphere, and then for a function {f : S^{n-1} \rightarrow \mathbb R} and a subset {A \subseteq S^{n-1}}, we write

\displaystyle  \mathrm{osc}(f,A) = \sup_{x \in A} |f(x)-\mu(f)|,

where {\mu(f)} is the average value of {f} on {S^{n-1}}.

Theorem 1 If {f : S^{n-1} \rightarrow \mathbb R} is a 1-Lipschitz function, then for every {\varepsilon > 0}, if {E \subseteq \mathbb R^n} is a random subspace of dimension {\lfloor \varepsilon^2 n\rfloor}, then with high probability,

\displaystyle  \mathrm{osc}(f,E \cap S^{n-1}) \leq O(\varepsilon). \ \ \ \ \ (1)

A standard way to prove this lemma would be to use the fact that every {1}-Lipschitz function is highly concentrated on {S^{n-1}} (Levy’s lemma) and then to take a union bound over an {\varepsilon}-net on {S^{n-1}} whose size is bounded by {(1+2/\varepsilon)^n}. Unfortunately, this leads to an extra {\log(1/\varepsilon)} term on the right-hand side of (1) which, remarkably, would lead to a version of Dvoretzky’s theorem that is too weak to imply Hastings’ counterexample to the additivity conjecture.

Gordon originally proved Theorem 1 using comparison results for Gaussian processes. (In fact, we will see a main technical step called Slepian’s Lemma in our next post on majorizing measures.) Alternately, Schechtman showed that one can use concentration of measure and a more sophisticated chaining argument, of the type discussed in our previous post.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s