News for December 2024

Happy New Year to you and your loved ones! Welcome to the first post of 2025. This month, we feature four papers: one on property testing in the huge object model, two on distribution testing, and a fourth that, while not strictly a property testing paper, was too intriguing to ignore. The last paper settles a fascinating bet between Peter Sarnak and Noga Alon. With that teaser, let’s dive in!

Testing vs Estimation for Index-Invariant Properties in the Huge Object Model by Sourav Chakraborty, Eldar Fischer, Arijit Ghosh, Amit Levi, Gopinath Mishra, and Sayantan Sen (arXiv) Let me begin by reminding you of the setup for distribution testing in the huge object model, introduced by Goldreich and Ron. Suppose you want to test whether a distribution \(\mathcal{D}\), supported over the Boolean hypercube, satisfies a certain property \(\mathcal{P}\). To do this, you sample a string \(x \sim \mathcal{D}\), where \(x\) is of length \(n\). The huge object model considers situations where \(n\) is so large that you cannot afford to read a sampled string in full; instead, you assume query access to the bits of the sampled strings. In this model, the distribution \(\mathcal{D}\) is considered \(\varepsilon\)-far from \(\mathcal{P}\) if the earthmover distance (EMD) between \(\mathcal{D}\) and the closest distribution satisfying \(\mathcal{P}\), measured with respect to the relative Hamming distance between bitstrings, is at least \(\varepsilon\). In our July 2022 post, we covered a paper by a subset of the authors which presented efficient testers in this model for index-invariant properties whose support has bounded VC dimension.
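To make the access model concrete, here is a minimal Python sketch of the sample-and-query interface a tester gets in this model; the oracle class and its method names are illustrative choices of mine, not anything from the paper.

```python
import random

# A minimal sketch of query access in the huge object model (names are
# illustrative, not from the paper). The tester never reads a sampled
# string in full; it may only probe individual bits of strings it sampled.

class HugeObjectOracle:
    def __init__(self, distribution):
        # `distribution` is a list of (string, probability) pairs over {0,1}^n.
        self.strings, self.weights = zip(*distribution)
        self.samples = []

    def sample(self):
        """Draw a fresh string x ~ D; return a handle, not the string itself."""
        x = random.choices(self.strings, weights=self.weights, k=1)[0]
        self.samples.append(x)
        return len(self.samples) - 1  # handle to the sampled string

    def query(self, handle, i):
        """Return bit i of the previously sampled string with this handle."""
        return self.samples[handle][i]


# Example interaction: samples and bit queries only, never whole strings.
oracle = HugeObjectOracle([("0" * 8, 0.5), ("1" * 8, 0.5)])
h = oracle.sample()
print(oracle.query(h, 0), oracle.query(h, 7))
```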

The main result of the featured paper shows that, for an index-invariant property, you can essentially upgrade a plain tester into a tolerant tester. Thus, the ability to efficiently \(\varepsilon\)-test an index-invariant distribution property in the huge object model translates into the ability to estimate the distance to the property.

Optimal Algorithms for Augmented Testing of Discrete Distributions by Maryam Aliakbarpour, Piotr Indyk, Ronitt Rubinfeld, and Sandeep Silwal (arXiv) Let us take the standard setup of testing discrete distributions supported over \([n]\), with a slight twist. Suppose the underlying distribution \(\boldsymbol{p}\) is not entirely unknown. As the paper argues, this is a realistic assumption for distributions arising from, say, network traffic data or search engine queries. Among other results, the paper shows that, given a good proxy \(\boldsymbol{p}'\) for the input distribution \(\boldsymbol{p}\), i.e., a situation where, say, \(\|\boldsymbol{p}'-\boldsymbol{u}\|_1 \gg \|\boldsymbol{p}' - \boldsymbol{p}\|_1\) and \(\|\boldsymbol{p}' - \boldsymbol{p}\|_1\) is small enough, you get testers for uniformity testing with sample complexity \(O(1)\). (Here, \(\boldsymbol{u}\) denotes the uniform distribution over \([n]\).) In this framework, the authors also present algorithms with improved sample complexity for identity testing and closeness testing.
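To get a feel for the "good proxy" promise, here is a toy numerical illustration (not the authors' algorithm): it constructs a distribution \(\boldsymbol{p}\) that is far from uniform together with a noisy proxy \(\boldsymbol{p}'\), and checks that \(\|\boldsymbol{p}'-\boldsymbol{u}\|_1 \gg \|\boldsymbol{p}' - \boldsymbol{p}\|_1\) in this constructed instance.

```python
import numpy as np

# Toy illustration of the "good proxy" promise; the numbers here are
# arbitrary choices, not anything from the paper.

n = 1000
u = np.full(n, 1.0 / n)                      # uniform distribution over [n]

p = np.full(n, 0.5 / n)                      # p: half the mass spread evenly...
p[: n // 10] += 0.5 / (n // 10)              # ...half concentrated on 10% of the domain

rng = np.random.default_rng(0)
noise = rng.normal(0, 0.05 / n, size=n)      # proxy p' = p plus a little noise
p_prime = np.clip(p + noise, 0, None)
p_prime /= p_prime.sum()

dist_to_uniform = np.abs(p_prime - u).sum()  # ||p' - u||_1
dist_to_p = np.abs(p_prime - p).sum()        # ||p' - p||_1
print(f"||p' - u||_1 = {dist_to_uniform:.3f}, ||p' - p||_1 = {dist_to_p:.3f}")
```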

Settling the complexity of testing grainedness of distributions, and application to uniformity testing in the Huge Object model by Clement Canonne, Sayantan Sen, Qiping Yang (ECCC) A discrete distribution is called \(m\)-grained if all of its probabilities are integer multiples of \(1/m\). If you are a regular PTReview reader, you might recall our News for September 2021, which featured a paper by Goldreich and Ron proving an \(\Omega(n^c)\) lower bound for testing \(m\)-grainedness for any \(c < 1\). Goldreich and Ron also conjectured that the true lower bound is actually \(\Omega(n/\log n)\) (when \(m = \Theta(n)\)). The current work resolves this conjecture, settling the complexity of this problem.
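Just to illustrate the definition (the hardness, of course, comes from only having sample access to the distribution), here is a tiny sketch that checks \(m\)-grainedness for an explicitly given distribution; the function name is mine.

```python
from fractions import Fraction

# Checking m-grainedness is trivial given an explicit description of the
# distribution; the testing problem only gets samples.

def is_m_grained(probabilities, m):
    """Return True iff every probability is an integer multiple of 1/m."""
    return all((Fraction(p) * m).denominator == 1 for p in probabilities)

# A 6-grained distribution over {1, 2, 3}: probabilities 1/2, 1/3, 1/6.
print(is_m_grained([Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)], 6))  # True
print(is_m_grained([Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)], 4))  # False
```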

Ramanujan Property and Edge Universality of Random Regular Graphs by Jiaoyang Huang, Theo Mckenzie, and Horng-Tzer Yau (arXiv) So, yes, here is the (non-property-testing) paper I wanted to tell you about. Let me start with the juicy bit. Peter Sarnak and Noga Alon had a bet about the following situation: fix \(d \in \mathbb{N}\) and take a random \(d\)-regular graph on \(n\) vertices. Sarnak conjectured that the probability that this graph is a Ramanujan expander goes to zero as \(n \to \infty\), whereas Alon conjectured that this probability tends to one as \(n \to \infty\). The featured paper shows that while this probability decreases with increasing \(n\), it approaches a limiting value which is around \(0.69\). You can watch this juicy bit here.
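If you want to poke at the quantity behind the bet numerically, here is a rough sketch using networkx and numpy (an illustration only, nothing to do with the paper's analysis). It assumes the sampled graph is connected and non-bipartite, which holds with high probability, so that \(d\) is the unique trivial adjacency eigenvalue, and checks whether every other eigenvalue has absolute value at most \(2\sqrt{d-1}\).

```python
import numpy as np
import networkx as nx

# A connected, non-bipartite d-regular graph is Ramanujan if every adjacency
# eigenvalue other than d has absolute value at most 2*sqrt(d-1).

def is_ramanujan(G, d):
    eigs = np.sort(np.linalg.eigvalsh(nx.to_numpy_array(G)))
    nontrivial = max(abs(eigs[0]), abs(eigs[-2]))  # largest eigenvalue besides d
    return nontrivial <= 2 * np.sqrt(d - 1)

# Estimate the fraction of random d-regular graphs that are Ramanujan.
d, n, trials = 3, 200, 50
count = sum(is_ramanujan(nx.random_regular_graph(d, n), d) for _ in range(trials))
print(f"{count}/{trials} random {d}-regular graphs on {n} vertices were Ramanujan")
```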