Recall from the previous post that denotes the cut polytope on the -vertex complete graph, and is the smallest number of facets in any polytope that linearly projects to . Our goal is to prove a lower bound on this quantity, but first we should mention that a nearly tight lower bound is known.
Theorem 1 (Fiorini, Massar, Pokutta, Tiwari, de Wolf 2012) There is a constant such that for every , .
We will present a somewhat weaker lower bound using entropy maximization, following our joint works with Chan, Raghavendra, and Steurer (2013) and with Raghavendra and Steurer (2015). This method is only currently capable of proving that , but it has the advantage of being generalizable—it extends well to the setting of approximate lifts and spectrahedral lifts (we’ll come to the latter in a few posts).
1.1. The entropy maximization framework
To use entropy optimality, we proceed as follows. For the sake of contradiction, we assume that is small.
First, we will identify the space of potential lifts of small -value with the elements of a convex set of probability measures. (This is where the connection to non-negative matrix factorization (NMF) will come into play.) Then we will choose a family of “tests” intended to capture the difficult aspects of being a valid lift of . This step is important as having more freedom (corresponding to weaker tests) will allow the entropy maximization to do more “simplification.” The idea is that the family of tests should still be sufficiently powerful to prove a lower bound on the entropy-optimal hypothesis.
Finally, we will maximize the entropy of our lift over all elements of our convex set, subject to performing well on the tests. Our hope is that the resulting lift is simple enough that we can prove it couldn’t possibly pass all the tests, leading to a contradiction.
In order to find the right set of tests, we will identify a family of canonical (approximate) lifts. This is family of polytopes so that and which we expect to give the “best approximation” to among all lifts with similar -value. We can identify this family precisely because these will be lifts that obey the natural symmetries of the cut polytope (observe that the symmetric group acts on in the natural way).
1.2. NMF and positivity certificates
Recall the matrix given by , where is the set of all quadratic multi-linear functions that are non-negative on . In the previous post, we argued that .
for some functions and . (Here we are using a factorization where and .)
Thus the low-rank factorization gives us a “proof system” for . Every can be written as a conic combination of the functions , thereby certifying its positivity (since the ‘s are positive functions).
If we think about natural families of “axioms,” then since is invariant under the natural action of , we might expect that our family should share this invariance. Once we entertain this expectation, there are natural small families of axioms to consider: The space of non-negative -juntas for some .
A -junta is a function whose value only depends on of its input coordinates. For a subset with and an element , let denote the function given by if and only if .
We let . Observe that . Let us also define as the set of all conic combinations of functions in . It is easy to see that contains precisely the conic combinations of non-negative -juntas.
If it were true that for some , we could immediately conclude that by writing in the form (1) where now ranges over the elements of and gives the corresponding non-negative coefficients that follow from .
1.3. No -junta factorization for
We will now see that juntas cannot provide a small set of axioms for .
Toward the proof, let’s introduce a few definitions. First, for , define the junta degree of to be
Since every is an -junta, we have .
Now because is a cone, there is a universal way of proving that . Say that a functional is -locally positive if for all and , we have
These are precisely the linear functionals separating a function from : We have if and only if there is a -locally positive functional such that . Now we are ready to prove Theorem 2.
Proof: We will prove this using an appropriate -locally positive functional. Define
where denotes the hamming weight of .
First, observe that
Now recall the the function from the statement of the theorem and observe that by opening up the square, we have
Finally, consider some with and . If , then
If , then the sum is 0. If , then the sum is non-negative because in that case is only supported on non-negative values of . We conclude that is -locally positive for . Combined with (2), this yields the statement of the theorem.
1.4. From juntas to general factorizations
So far we have seen that we cannot achieve a low non-negative rank factorization of using -juntas for .
Brief aside: If one translates this back into the setting of lifts, it says that the -round Sherali-Adams lift of the polytope
does not capture for .
In the next post, we will use entropy maximization to show that a non-negative factorization of would lead to a -junta factorization with small (which we just saw is impossible).
For now, let us state the theorem we will prove. We first define a submatrix of . Fix some integer and a function . Now define the matrix given by
The matrix is indexed by subsets with and elements . Here, represents the (ordered) restriction of to the coordinates in .
Theorem 3 (Chan-L-Raghavendra-Steurer 2013) For every and , there is a constant such that for all ,
Note that if then is a submatrix of . Since Theorem 2 furnishes a sequence of quadratic multi-linear functions with , the preceding theorem tells us that cannot be bounded by any polynomial in . A more technical version of the theorem is able to achieve a lower bound of (see Section 7 here).
4 thoughts on “Entropy optimality: Lifts of polytopes”
The suspense is killing me!
Thomas- I can’t tell if you’re being sarcastic. But I will take your comment at face value. Actually, in trying to write the next part in the correct way (and looking ahead to quantum entropy), I ran into some things I don’t understand (around factorization and tensor norms) and made a note to talk to you. But I think I can finish the next post before that.
I’m certainly not being sarcastic – in fact I found it surprising no-one had explicitly expressed support so far and wanted to jump in. I’ve been immensely enjoying your posts, and just wanted to express how much I’m waiting for the next one every day :-) They’ve been extremely clear and helpful. And, having attempted something similar in the past I do realize the amount of work must go into them.
Thanks :) I’m enjoying writing them, except possibly for the most recent (the definition of is not great—but you can see the connection to factorization norms). I’m glad you mentioned your blog since I’ve been wondering about quantum hypercontractivity. Going to read now. After the polytope stuff, the next set of posts here is about (classical) hypercontractivity using–what else–entropy optimization.