Breaking The Dimension Dependence in Sparse Distribution Estimation under Communication Constraints

Wei-Ning Chen , Peter Kairouz , Ayfer Ozgur

[Proceedings link] [PDF]

Session: Minimally Supervised Learning (B)

Session Chair: Aryeh Kontorovich

Poster: Poster Session 1

Abstract: We consider the problem of estimating a $d$-dimensional $s$-sparse discrete distribution from its samples observed under a $b$-bit communication constraint. The best-known previous result on $\ell_2$ estimation error for this problem is $O\lp \frac{s\log\lp {d}/{s}\rp}{n2^b}\rp$. Surprisingly, we show that when sample size $n$ exceeds a minimum threshold $n^*(s, d, b)$, we can achieve an $\ell_2$ estimation error of $O\lp \frac{s}{n2^b}\rp$. This implies that when $n>n^*(s, d, b)$ the convergence rate does not depend on the ambient dimension $d$ and is the same as knowing the support of the distribution beforehand. We next ask the question: ``what is the minimum $n^*(s, d, b)$ that allows dimension-free convergence?''. To upper bound $n^*(s, d, b)$, we develop novel localization schemes to accurately and efficiently localize the unknown support. For the non-interactive setting, we show that $n^*(s, d, b) = O\lp \min \lp {d^2\log^2 d}/{2^b}, {s^4\log^2 d}/{2^b}\rp \rp$. Moreover, we connect the problem with non-adaptive group testing and obtain a polynomial-time estimation scheme when $n = \tilde{\Omega}\lp{s^4\log^4 d}/{2^b}\rp$. This group testing based scheme is adaptive to the sparsity parameter $s$, and hence can be applied without knowing it. For the interactive setting, we propose a novel tree-based estimation scheme and show that the minimum sample-size needed to achieve dimension-free convergence can be further reduced to $n^*(s, d, b) = \tilde{O}\lp {s^2\log^2 d}/{2^b} \rp$.

Summary presentation

Full presentation