In the last post, we began proving that a random sign matrix almost surely has a "spread" kernel. More precisely, let $A$ be an $m \times n$ matrix with i.i.d. uniform $\pm 1$ entries, with, say, $m = n/2$. Then almost surely, for every $x \in \ker(A)$, we have $\|x\|_1 \geq c\sqrt{n}\,\|x\|_2$, where $c > 0$ is a universal constant.
The proof follows roughly the lines of the proof that a uniformly random matrix $A \in \{0,1\}^{m \times n}$ with i.i.d. uniform entries is almost surely the check matrix of a good linear code over $\mathbb{F}_2$. In other words, with high probability, there does not exist a non-zero vector $x \in \{0,1\}^n$ with Hamming weight at most $\delta n$, and which satisfies $Ax = 0$ (this equation is considered over $\mathbb{F}_2$). The proof of this is simple: Let $A_1, \ldots, A_m$ be the rows of $A$. If $x \neq 0$, then $\Pr[\langle A_i, x\rangle = 0] = \frac{1}{2}$. Therefore, for any non-zero $x$, we have

$\Pr[Ax = 0] = 2^{-m}. \qquad (**)$
On the other hand, for $\delta > 0$ small enough, there are fewer than $2^{m/2}$ non-zero vectors of Hamming weight at most $\delta n$, so a union bound finishes the argument.
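The union bound can be sanity-checked numerically; the parameters below ($n = 200$, $m = n/2$, $\delta = 0.05$) are arbitrary choices for illustration:

```python
from math import comb

n, m = 200, 100      # n coordinates, m = n/2 parity checks
delta = 0.05         # allowed fraction of non-zero coordinates

# Number of non-zero vectors of Hamming weight at most delta * n.
bad = sum(comb(n, k) for k in range(1, int(delta * n) + 1))

# Each one lies in the kernel with probability exactly 2^{-m} by (**),
# so the union bound gives:
fail_prob = bad * 2.0 ** (-m)
print(bad, fail_prob)
```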
The proof in the real setting proceeds along similar lines, except that we can no longer take a naive union bound, since there are now infinitely many "bad" vectors, i.e. those $x \neq 0$ with $\|x\|_1 < c\sqrt{n}\,\|x\|_2$. Of course the solution is to take a sufficiently fine discretization of the set of bad vectors. This proceeds in three steps:
1. If $\|Ax\|_2$ is large, and $y$ is close to $x$, then $\|Ay\|_2$ is also large.
2. Any fixed vector $x$ with $\|x\|_2 = 1$ has large $\|Ax\|_2$ with high probability. This is the analog of (**) in the coding setting.
3. There exists a small set of vectors that well-approximates the set of bad vectors.
Combining (1), (2), and (3), we will conclude that every bad vector $x$ has large $\|Ax\|_2$ (and, in particular, is not contained in the kernel).
To verify (1), we need to show that almost surely the operator norm $\|A\| = \max_{\|x\|_2 = 1} \|Ax\|_2$ is small.
From the last post, we have the following two statements.
Lemma 1: If $x \in \mathbb{R}^n$ has $\|x\|_2 = 1$ and $\varepsilon \in \{-1,1\}^n$ is chosen uniformly at random, then for every $t \geq 0$,

$\Pr\left[|\langle \varepsilon, x\rangle| \geq t\right] \leq 2 e^{-t^2/2}.$
The next theorem (see the proof from the last post) verifies (1) above.
Theorem 1: If $A$ is an $m \times n$ random sign matrix with $m \leq n$, then

$\Pr\left[\|A\| \leq C\sqrt{n}\right] \geq 1 - e^{-n},$

where $C$ is a universal constant.
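As a numerical sanity check (not a proof), one can sample a sign matrix and compare its operator norm to $\sqrt{n}$; the dimensions and the constant $3$ below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 400, 200
A = rng.choice([-1.0, 1.0], size=(m, n))

op_norm = np.linalg.norm(A, 2)   # largest singular value of A
print(op_norm, 3 * np.sqrt(n))
```

Typically the norm lands near $\sqrt{m} + \sqrt{n}$, well below $3\sqrt{n}$.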
Now we turn to proving (2), which is the analog of (**). We need to upper bound the small ball probability, i.e. the probability that $Ax$ falls into a small ball (around $0$) for any fixed $x$. To start, we would like to say that for some fixed $x$ with $\|x\|_2 = 1$, and a uniformly random sign vector $\varepsilon \in \{-1,1\}^n$, we have, for example, $\Pr\left[|\langle \varepsilon, x\rangle| \geq \frac{1}{2}\right] \geq c_0$ for a universal constant $c_0 > 0$.
These kinds of estimates fall under the name of Littlewood-Offord type problems; see the work of Tao and Vu and of Rudelson and Vershynin for a detailed study of such random sums, and the beautiful connections with additive combinatorics. For the above estimate, we will need only the simple Paley-Zygmund inequality:
Lemma 2: If $Z$ is a non-negative random variable and $\theta \in (0,1)$, then

$\Pr\left[Z \geq \theta\, \mathbb{E}\,Z\right] \geq (1-\theta)^2\, \frac{(\mathbb{E}\,Z)^2}{\mathbb{E}\,Z^2}.$
Proof: By Cauchy–Schwarz,

$\mathbb{E}\,Z = \mathbb{E}\left[Z\,\mathbf{1}_{\{Z < \theta \mathbb{E} Z\}}\right] + \mathbb{E}\left[Z\,\mathbf{1}_{\{Z \geq \theta \mathbb{E} Z\}}\right] \leq \theta\,\mathbb{E}\,Z + \sqrt{\mathbb{E}\,Z^2}\,\sqrt{\Pr[Z \geq \theta\,\mathbb{E}\,Z]}.$

Now observe that $\mathbb{E}\,Z - \theta\,\mathbb{E}\,Z = (1-\theta)\,\mathbb{E}\,Z$ and rearrange.
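A quick empirical illustration of the inequality, using an exponential random variable (any non-negative distribution would do):

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.exponential(scale=1.0, size=200_000)   # a non-negative random variable
theta = 0.5

lhs = np.mean(Z >= theta * Z.mean())                      # Pr[Z >= theta * E Z]
rhs = (1 - theta) ** 2 * Z.mean() ** 2 / np.mean(Z ** 2)  # Paley-Zygmund bound
print(lhs, rhs)
```

Here the left side is about $e^{-1/2} \approx 0.61$ while the bound is about $1/8$.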
We can use this to prove our estimate on random sums.
Lemma 3: For any $x$ with $\|x\|_2 = 1$, if $\varepsilon \in \{-1,1\}^n$ is chosen uniformly at random, then for every $\theta \in (0,1)$,

$\Pr\left[\langle \varepsilon, x\rangle^2 \geq \theta\right] \geq \frac{(1-\theta)^2}{16}.$

Proof: Apply Lemma 2 with $Z = \langle \varepsilon, x\rangle^2$. Note that by independence, we have $\mathbb{E}\,Z = \sum_{i=1}^n x_i^2 = 1$. Furthermore, for any non-negative random variable $Y$ and $p \geq 1$, we have $\mathbb{E}\,Y^p = p \int_0^\infty t^{p-1}\, \Pr[Y \geq t]\, dt$, thus

$\mathbb{E}\,Z^2 = \mathbb{E}\,\langle \varepsilon, x\rangle^4 = \int_0^\infty 4 t^3\, \Pr\left[|\langle \varepsilon, x\rangle| \geq t\right] dt \leq \int_0^\infty 8 t^3 e^{-t^2/2}\, dt,$

where the final inequality follows from Lemma 1. It is easy to see that the final bound is at most $16$.
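Lemma 3 is easy to test empirically for a particular unit vector; the vector and the value of $\theta$ below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
x = np.ones(n) / np.sqrt(n)          # a fixed unit vector
eps = rng.choice([-1.0, 1.0], size=(100_000, n))

theta = 0.25
prob = np.mean((eps @ x) ** 2 >= theta)   # Pr[<eps, x>^2 >= theta]
print(prob, (1 - theta) ** 2 / 16)
```

For this $x$ the inner product is nearly Gaussian, so the empirical probability is far above the (deliberately loose) bound.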
Observing that $\|Ax\|_2^2 = \sum_{i=1}^m \langle A_i, x\rangle^2$, where $A_1, \ldots, A_m$ are the rows of $A$, yields:

Corollary 1: For any $x$ with $\|x\|_2 = 1$,

$\Pr\left[\|Ax\|_2 \geq c_1 \sqrt{m}\right] \geq 1 - e^{-c_2 m},$

where $c_1, c_2 > 0$ are universal constants.
Proof: Apply Lemma 3 with $\theta = \frac{1}{2}$, say, so that each row independently satisfies $\Pr\left[\langle A_i, x\rangle^2 \geq \frac{1}{2}\right] \geq \frac{1}{64}$, and then use a Chernoff bound to show that this conclusion holds for a constant fraction of the rows with overwhelming probability.
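A quick Monte Carlo check of Corollary 1 for one fixed unit vector; the constant $1/16$ is an arbitrary stand-in for $c_1$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, trials = 100, 50, 1000
x = rng.standard_normal(n)
x /= np.linalg.norm(x)               # a fixed unit vector

c1 = 1.0 / 16                        # stand-in for the constant c_1
successes = sum(
    np.linalg.norm(rng.choice([-1.0, 1.0], size=(m, n)) @ x) >= c1 * np.sqrt(m)
    for _ in range(trials)
)
print(successes / trials)
```

Since $\mathbb{E}\,\|Ax\|_2^2 = m$, the event $\|Ax\|_2 \geq \sqrt{m}/16$ should hold in essentially every trial.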
Using Theorem 1 and Corollary 1, we can verify (2):
Theorem 2: For any $x$ with $\|x\|_2 = 1$ and any $t \geq 0$, we have

$\Pr\left[\|Ay\|_2 \geq c_1\sqrt{m} - C t \sqrt{n} \textrm{ for every } y \textrm{ with } \|x - y\|_2 \leq t\right] \geq 1 - e^{-n} - e^{-c_2 m}.$

Proof: By Theorem 1 and Corollary 1, we may assume that $\|A\| \leq C\sqrt{n}$ and $\|Ax\|_2 \geq c_1\sqrt{m}$. Now consider any $y$ with $\|x - y\|_2 \leq t$. Then $\|Ax - Ay\|_2 \leq \|A\|\,\|x-y\|_2$, so

$\|Ay\|_2 \geq \|Ax\|_2 - \|A\|\,\|x - y\|_2 \geq c_1\sqrt{m} - C t \sqrt{n}.$

In particular, since $m \leq n$, taking $t \leq \frac{c_1}{2C}\sqrt{m/n}$, we must have $\|Ay\|_2 \geq \frac{c_1}{2}\sqrt{m}$.
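The perturbation step is just the triangle inequality, which can also be checked numerically; the dimensions and the perturbation size below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 100, 50
A = rng.choice([-1.0, 1.0], size=(m, n))
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
y = x + 0.01 * rng.standard_normal(n)   # a small perturbation of x

op_norm = np.linalg.norm(A, 2)
# |  ||Ax|| - ||Ay||  |  is at most  ||A|| * ||x - y||.
gap = abs(np.linalg.norm(A @ x) - np.linalg.norm(A @ y))
print(gap, op_norm * np.linalg.norm(x - y))
```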
Let $B_1^n$ and $B_2^n$ be the unit balls of the $\ell_1$ and $\ell_2$ norms, respectively, in $\mathbb{R}^n$. Property (3) above follows from the well-known fact that $\sqrt{n}\,B_1^n$ can be covered by at most $e^{C' n/t^2}$ copies of $t B_2^n$, for every $t > 0$ (see, e.g. Pisier’s book). We use this to finish the proof.
Theorem 3: There exists a constant $c > 0$ such that almost surely the following holds. Every $x \in \ker(A)$ satisfies

$\|x\|_1 \geq c \sqrt{n}\, \|x\|_2.$

In other words, on $\ker(A)$ the $\ell_1$ and $\ell_2$ norms are equivalent up to the scale factor $\sqrt{n}$.
Proof: Let $B_\delta = \{x \in \mathbb{R}^n : \|x\|_2 = 1,\ \|x\|_1 \leq \delta\sqrt{n}\}$ denote the set of bad vectors at scale $\delta$. Since $B_\delta \subseteq \delta\sqrt{n}\,B_1^n$, the covering fact gives, for any $t > 0$, a set $N \subseteq B_\delta$ with $|N| \leq e^{C'\delta^2 n/t^2}$ such that every point of $B_\delta$ lies within $\ell_2$ distance $2t$ of $N$ (moving each center to a nearest point of $B_\delta$ at most doubles the radius). Choose $\delta > 0$ small enough so that $C'\delta^2 n/t^2 \leq \frac{c_2 m}{2}$; since $m = n/2$ and $t$ will be a universal constant, $\delta$ can be taken to be a universal constant as well.

By a union bound, we can assume that $\|A\| \leq C\sqrt{n}$, and for every $y \in N$,

$\|Ay\|_2 \geq c_1 \sqrt{m}. \qquad (4)$

Now suppose there exists an $x \in \ker(A)$ with $\|x\|_1 < \delta\sqrt{n}\,\|x\|_2$. We may assume that $\|x\|_2 = 1$, so that $x \in B_\delta$. But then $x$ lies close to the net, hence there exists a $y \in N$ such that $\|x - y\|_2 \leq 2t$. It follows that

$\|Ax\|_2 \geq \|Ay\|_2 - \|A\|\,\|x - y\|_2 \geq c_1\sqrt{m} - 2Ct\sqrt{n}.$

Choosing $t = \frac{c_1}{4C}\sqrt{m/n}$, we have $2Ct\sqrt{n} = \frac{c_1}{2}\sqrt{m}$. But then (4) implies $\|Ax\|_2 \geq \frac{c_1}{2}\sqrt{m} > 0$, contradicting $x \in \ker(A)$. The theorem follows with $c = \delta$.
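Finally, the conclusion of Theorem 3 can be sanity-checked on random vectors drawn from the kernel of one sampled sign matrix (this probes typical kernel vectors, not the worst case; the constant $0.1$ is an arbitrary stand-in for $c$):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 200, 100
A = rng.choice([-1.0, 1.0], size=(m, n))

# Orthonormal basis for ker(A): the last n - m right singular vectors.
_, _, Vt = np.linalg.svd(A)
K = Vt[m:]

c = 0.1                              # stand-in for the constant c
# ||x||_1 / (sqrt(n) ||x||_2) for 200 random vectors x in the kernel.
ratios = [
    np.linalg.norm(x, 1) / (np.sqrt(n) * np.linalg.norm(x, 2))
    for x in (K.T @ rng.standard_normal((n - m, 200))).T
]
print(min(ratios))
```

Typical kernel vectors look like generic vectors in a random subspace, so the ratio concentrates near $\sqrt{2/\pi}$, comfortably above $0.1$.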
In the next post, we’ll see how such subspaces are related to error-correcting codes over the reals.