Group testing and local search: is there a computational-statistical gap?

Fotis Iliopoulos , Ilias Zadik

[Proceedings link] [PDF]

Session: Regression and High-Dimensional Statistics (A)

Session Chair: Yuting Wei

Poster: Poster Session 3

Abstract: Group testing is a fundamental problem in statistical inference with many real-world applications, including the need for massive group testing during the ongoing COVID-19 pandemic. In this paper we study the task of approximate recovery, in which we tolerate having a small number of incorrectly classified items. One of the most well-known, optimal, and easy to implement testing procedures is the non-adaptive Bernoulli group testing, where all tests are conducted in parallel, and each item is chosen to be part of any certain test independently with some fixed probability. In this setting, there is an observed gap between the number of tests above which recovery is information theoretically possible, and the number of tests required by the currently best known efficient algorithms to succeed. In this paper we seek to understand whether this computational-statistical gap can be closed. Our main contributions are the following: 1.Often times such gaps are explained by a phase transition in the landscape of the solution space of the problem (an Overlap Gap Property (OGP) phase transition). We provide first moment evidence that, perhaps surprisingly, such a phase transition does not take place throughout the regime for which recovery is information theoretically possible. This fact suggests that the model is in fact amenable to local search algorithms. 2. We prove the complete absence of “bad” local minima for a part of the “hard” regime, a fact which implies an improvement over known theoretical results on the performance of efficient algorithms for approximate recovery without false-negatives. 3. Finally, motivated by the evidence for the abscence for the OGP, we present extensive simulations that strongly suggest that a very simple local algorithm known as Glauber Dynamics does indeed succeed, and can be used to efficiently implement the well-known (theoretically optimal) Smallest Satisfying Set (SSS) estimator. Given that practical algorithms for this task utilize Branch and Bound and Linear Programming relaxation techniques, our finding could potentially be of practical interest.

Summary presentation

Full presentation