# tcs math – some mathematics of theoretical computer science

## February 22, 2013

### Open question: Separated sets in isotropic measures

Filed under: Math, Open question — Tags: , , — James Lee @ 8:26 pm

[Note: This question is trivially impossible, though I leave the original form below. If $\mu$ is the uniform measure on $S^{k-1}$, then any two sets of $\Omega(1)$ measure are at distance $O(1/\sqrt{k})$ by concentration of measure. Thus it is impossible to find $k$ sets satisfying the desired bound. I will try to think of the right question and repost. Unfortunately, I now recall having gone down this path a few times before, and I fear there may not be a relevant elementary open question after all.]

Here is an interesting and elementary (to state) open question whose resolution would give an optimal higher-order Cheeger esimate. The goal is to replace the factor ${k^{O(1)}}$ in Theorem 1 of the previous post with a ${\mathrm{polylog}(k)}$ factor, or even an optimal bound of ${\sqrt{\log k}}$.

Fix ${k \in \mathbb N}$ and let ${\mu}$ denote a probability measure on the ${(k-1)}$-dimensional sphere ${S^{k-1}}$. For the purpose of this question, one can assume that ${\mu}$ is supported on a finite number of points. Suppose also that ${\mu}$ is isotropic in the sense that, for every ${\theta \in S^{k-1}}$,

$\displaystyle \int_{S^{k-1}} \langle x, \theta\rangle^2 \,d\mu(x)= \frac{1}{k}\,.$

Now our goal is to find, for some ${\varepsilon, \delta > 0}$, a collection of (measurable) subsets ${U_1, U_2, \ldots, U_k \subseteq S^{k-1}}$ such that ${\mu(U_i) \geq \varepsilon/k}$ for each ${i=1,2,\ldots,k}$, and such that for ${i \neq j}$, we have

$\displaystyle \min_{x \in U_i, y \in U_j} \|x-y\|_2 \geq \delta\,.$

In Lemma 5 of the previous post, we proved that this is possible for ${\varepsilon \gtrsim 1}$ and ${\delta \gtrsim k^{-5/2}}$. In this paper, we improve the estimate on ${\delta}$ somewhat, though the best-known is still ${\delta \gtrsim k^{-c}}$ for some ${c > 0}$.

Question:
Is it possible to achieve ${\varepsilon, \delta \gtrsim 1/\mathrm{poly}(\log k)}$? One could even hope for ${\varepsilon \gtrsim 1}$ and ${\delta \gtrsim 1/(\log k)}$.

Two extreme cases are when (1) ${\mu}$ is uniform on ${S^{k-1}}$, or (2) ${\mu}$ is uniform on the standard basis ${\{e_1,e_2,\ldots,e_k\}}$. In the first case, one can easily achieve ${\varepsilon,\delta \gtrsim 1}$ (in fact, one could even get ${100k}$ sets satisfying these bounds). In the second case, taking each set to contain a single basis vector achieves ${\varepsilon = 1}$ and ${\delta = \sqrt{2}}$.

If one only wants to find, say, ${\lceil 3k/4\rceil}$ sets instead of ${k}$ sets, then it is indeed possible to achieve ${\varepsilon \gtrsim 1}$ and ${\delta \gtrsim 1/O(\log k)}$. [Update: This is actually impossible for the reasons stated above. To get the corresponding bounds, we do a random projection into $O(\log k)$ dimensions. This only preserves the original Euclidean distances on average.] Furthermore, for ${\varepsilon \gtrsim 1}$, this bound on ${\delta}$ is asymptotically the best possible.

1. How much slack is allowed in the isotropic condition? Suppose $\mu$ is supported on $x_1,\ldots,x_n$ and we don’t have $\sum_i x_ix_i^T = k^{-1}\cdot I$, but we do have, say, $\|\sum_i x_ix_i^T – k^{-1}\cdot I\| \le \gamma$. Is there some value of $\gamma>0$ that is still acceptable for your application?
2. Oh right $\mu$ is part of the input …my above question is probably nonsense.