October | 2023 | Property Testing Review

Sorry for the delay in getting this month’s post out. This time we had ~~seven (EDIT) eight~~ (LATER EDIT) nine papers. Thanks to our readers for pointing out a paper we missed. Do let us know if we missed any (EDIT) others. Alright, without further delay, let us look at this month’s spread.

Mildly Exponential Lower Bounds on Tolerant Testers for Monotonicity, Unateness, and Juntas by Xi Chen, Anindya De, Yuhao Li, Shivam Nadimpalli, and Rocco A. Servedio (arXiv) Let us begin with a paper on property testing of boolean functions over the binary hypercube $\{0,1\}^n$. Quoting from the abstract.

[This paper gives] the first superpolynomial (in fact, mildly exponential) lower bounds for tolerant testing of monotonicity, unateness, and juntas with a constant separation between yes and no cases.

Let us take a little dip to get a superficial feel for the techniques the paper uses for boolean monotonicity. The key lies in an adaptation of the lower bound techniques pioneered in the paper of Pallavoor-Raskhodnikova-Waingarten (PRW), which considers distributions over “yes functions” (which are $\varepsilon/\sqrt{n}$-close to monotone) and “no functions” (which are $\varepsilon$-far from being monotone). PRW split the $n$ variables into $n/2$ control variables and $n/2$ action variables, and consider the subcubes indexed by bit-strings where the control variables are balanced. The function values are chosen carefully on these subcubes, which ensures that any tester that reliably distinguishes yes and no functions needs to query bit-strings from the same balanced subcube that differ in many action variables. As PRW argue, this event occurs with low probability if the allowed number of queries is small. The key insight of the featured paper is to modify the PRW construction by using random monotone DNF formulas due to Talagrand. If this whets your appetite, go read the paper!

Testing Junta Truncation by William He and Shivam Nadimpalli (arXiv)

Let $f \colon \{0,1\}^n \to \{0,1\}$ be a $k$-junta. Consider the set $\mathcal{U}_{\text{yes}} = \bigl\{0,1\bigr\}^n$ and the set $\mathcal{U}_{\text{no}} = \bigl\{x \in \{0,1\}^n \colon f(x) = 1 \bigr\}.$ Let $\mathcal{D}_{\text{yes}}$ and $\mathcal{D}_{\text{no}}$ denote the uniform distributions on these two sets. Finally, we define a distribution $\mathcal{D} = \begin{cases} \mathcal{D}_{\text{yes}} & \text{w.p. } 1/2 \\ \mathcal{D}_{\text{no}} & \text{w.p. } 1/2 \end{cases}.$

You are given $t$ samples from the distribution $\mathcal{D}$. The task is to decide whether $\mathcal{D}$ is the yes distribution or the no distribution. The featured paper shows you can reliably solve this task with $t \leq \min(2^k + \log{n \choose k }, 2^{k/2} \log^{1/2}{n \choose k})$ samples. The paper also supplements this result with a lower bound: with fewer than $\log{n \choose k}$ samples, you cannot reliably distinguish these two distributions. The results suggest that this “testing junta truncation” problem requires learning the set of relevant variables for the junta.
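To make the setup concrete, here is a minimal Python sketch of the sampling process described above. It is only an illustration, not anything from the paper: the helper names (`make_junta`, `sample_yes`, `sample_no`, `draw_samples`) are hypothetical, the junta is given by an explicit truth table on its relevant coordinates, and the no distribution is simulated by rejection sampling, which assumes $f$ is not identically zero.

```python
import random

def make_junta(relevant, table):
    """Return f: {0,1}^n -> {0,1} that depends only on the coordinates in
    `relevant`. `table` maps each tuple of relevant bits to 0 or 1."""
    def f(x):
        return table[tuple(x[i] for i in relevant)]
    return f

def sample_yes(n, rng):
    """One uniform sample from {0,1}^n."""
    return tuple(rng.randint(0, 1) for _ in range(n))

def sample_no(n, f, rng):
    """One uniform sample from {x : f(x) = 1}, via rejection sampling.
    Assumes f is satisfiable; otherwise this loops forever."""
    while True:
        x = sample_yes(n, rng)
        if f(x) == 1:
            return x

def draw_samples(n, f, t, rng):
    """Flip one fair coin to pick the yes or no distribution, then draw
    t independent samples from it. Returns (is_truncated, samples)."""
    truncated = rng.random() < 0.5
    if truncated:
        samples = [sample_no(n, f, rng) for _ in range(t)]
    else:
        samples = [sample_yes(n, rng) for _ in range(t)]
    return truncated, samples

if __name__ == "__main__":
    rng = random.Random(0)
    # A 2-junta on n = 8 variables: the AND of coordinates 1 and 4.
    f = make_junta((1, 4), {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1})
    truncated, samples = draw_samples(8, f, 5, rng)
    print("truncated:", truncated)
    for x in samples:
        print(x)
```

A tester only sees `samples` and has to guess the hidden bit `truncated`; in the truncated case every sample satisfies $f$, but the tester does not know which $k$ coordinates $f$ depends on.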

Longest Common Substring and Longest Palindromic Substring in $\mathbf{\widetilde{O}(\sqrt n)}$ Time by Domenico Cantone, Simone Faro, Arianna Pavone, and Caterina Viola (arXiv) I paraphrase from the abstract of this paper. Longest common substring and longest palindromic substring are classic problems in computer science, both of which can be solved in linear time using suffix trees. Recently, quantum algorithms were proposed for both of these problems in the query model, each issuing only $o(n)$ quantum queries. The featured paper notes that this query model has a shortcoming when it comes to real-life implementation on actual hardware. The paper addresses this shortcoming by presenting $o(n)$ quantum-query algorithms in the circuit model of computation.

Testing properties of distributions in the streaming model by Sampriti Roy and Yadu Vasudev (arXiv) Alright, now let us consider a different twist on distribution testing. Suppose you have a small memory. You obtain a bunch of samples to solve some standard distribution testing task but, of course, you cannot store all the samples. What can you say about how sample complexity trades off against space complexity? The featured paper studies this trade-off in the standard access model and the conditional access model. One of the results of the paper asserts that in the conditional access model, you can do identity testing with $\widetilde{O}\bigl(\frac{\log^4 n}{\varepsilon^4}\bigr)$ samples while using only $O\bigl(\frac{\log^2 n}{\varepsilon^2} \bigr)$ bits of memory.

Testing Spreading Behavior in Networks with Arbitrary Topologies by Augusto Modanese and Yuichi Yoshida (arXiv) We covered the problem of testing dynamic environments in this March 2014 post and that May 2021 post earlier. The goal here is to check whether a dynamically evolving system evolves according to some fixed rule or whether the system is far from systems that evolve according to that fixed rule. The May 2021 post covered a paper which shows you can test dynamically evolving systems that evolve according to what is called the threshold rule. The featured paper considers rules motivated by models of infection spreading. One of the results in the paper presents one-sided and two-sided testers (with $O(1/\varepsilon)$ query complexity) for testing a single step of evolution (on bounded degree graphs) with these rules.

A Tight Lower Bound of Ω(log n) for the Estimation of the Number of Defective Items by Nader Bshouty and Gergely Harcos (ECCC) The featured paper considers a problem in group testing. Let us quickly review the setup for group testing. You are given some ground set $X$ of $|X| = n$ items. Suppose the items in some set $I \subseteq X$ are defective. A test is specified by a set $Q \subseteq X$, and the test is said to be successful iff $Q \cap I \neq \emptyset$. This paper presents lower bounds for non-adaptive algorithms for group testing. And as the title says, if your algorithm wishes to estimate the number of defective items to within a constant factor, you better pay up $\Omega(\log n)$ tests.
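The query model above can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not the construction from the paper: `group_test`, `random_queries`, and `positive_probability` are hypothetical helper names, and the random design (each item joins each query independently with probability `p`) is just one simple way to fix all queries in advance.

```python
import random

def group_test(query, defective):
    """One group test: True iff the queried set Q contains a defective item,
    i.e. Q and I intersect."""
    return not query.isdisjoint(defective)

def random_queries(n, num_queries, p, rng):
    """A non-adaptive design: all queries are chosen before any answer is
    seen. Each item in {0, ..., n-1} joins each query independently w.p. p."""
    return [{i for i in range(n) if rng.random() < p}
            for _ in range(num_queries)]

def positive_probability(p, d):
    """Probability that a single random query (inclusion probability p)
    hits at least one of d defective items: 1 - (1 - p)^d."""
    return 1.0 - (1.0 - p) ** d

if __name__ == "__main__":
    rng = random.Random(1)
    n = 1000
    defective = set(rng.sample(range(n), 30))   # hidden set I with |I| = 30
    queries = random_queries(n, 200, 1 / 32, rng)
    answers = [group_test(q, defective) for q in queries]
    print("fraction of positive tests:", sum(answers) / len(answers))
    print("expected for d = 30:", positive_probability(1 / 32, 30))
```

Because the fraction of positive tests depends on $|I|$ through $1 - (1-p)^{|I|}$, the answers carry information about the number of defective items; the lower bound says that non-adaptive designs need $\Omega(\log n)$ tests to pin $|I|$ down to a constant factor.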

A tight lower bound on non-adaptive group testing estimation by Tsun-Ming Cheung, Hamed Hatami, and Anthony Ostuni (arXiv) As our readers pointed out, this paper is concurrent with the paper above and achieves the same lower bound. Indeed, this holds for both the one-sided and the two-sided variants. Furthermore, as this paper shows, if one knows that the set $I$ satisfies $L \leq |I| \leq U$, then both one-sided and two-sided lower bounds of $\Omega(U/L)$ non-adaptive queries hold if you want a constant approximation to $|I|$.

On Testing Isomorphism to a Fixed Graph in the Bounded-Degree Graph Model by Oded Goldreich and Laliv Tauber (ECCC) This paper looks like a fantastic read for your students, especially since it is written in such an engaging style and spans (only) 18 highly readable pages. As the title indicates, this paper considers the challenge of testing isomorphism to a fixed graph in the bounded degree model. The main result of this paper asserts that for almost all $d$-regular graphs $H$ on $|V(H)| = n$ vertices, testing isomorphism to $H$ can be done in about $\sqrt n$ queries. The paper also presents an almost matching (query) lower bound which also holds for almost all graphs $H$.

Tolerant Testing of High-Dimensional Samplers with Subcube Conditioning by Gunjan Kumar, Kuldeep S. Meel, and Yash Pote (arXiv) Let us consider the distribution testing setup with a twist: Suppose you are given some unknown distribution $\mathcal{Q}$ supported over $\{0,1\}^n$ and you want to sample from it conditioned on some predicate $\varphi$. The question is whether you can efficiently check that these are legit samples (satisfying the predicate) which are taken from the distribution $\mathcal{Q}$. Our October 2020 news covered a tolerant tester on this problem which involved some subset of the authors of the current paper. The featured paper considers what additional leverage you gain if you are given access to a sampling oracle which can sample from “conditioned subcubes.” In this model, you can query some subcube and, after issuing a query, you will receive an element $x$ from this subcube with probability proportional to the original probability weight of $x$. The paper provides a tolerant tester in this setup which makes at most $\widetilde{O}(n^3/(\varepsilon_2 - \varepsilon_1)^5)$ queries to this sampling oracle.
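As a rough illustration of what a subcube conditioning oracle does, here is a toy Python sketch. It assumes the distribution is given as an explicit table of weights over $\{0,1\}^n$, which is only feasible for tiny $n$ and is not how the samplers in the paper are represented; the function name `subcube_sample` and its arguments are hypothetical.

```python
import random

def subcube_sample(weights, fixed, rng):
    """Toy subcube conditioning oracle.

    weights: dict mapping bit-tuples in {0,1}^n to nonnegative weights.
    fixed:   dict {coordinate index: bit} describing the queried subcube
             (coordinates not listed are free).
    Returns a point x of the subcube with probability proportional to
    weights[x]. Raises ValueError if the subcube has zero mass.
    """
    support = [x for x in weights
               if all(x[i] == b for i, b in fixed.items())]
    total = sum(weights[x] for x in support)
    if total == 0:
        raise ValueError("subcube has zero probability mass")
    return rng.choices(support, weights=[weights[x] for x in support], k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    # A small distribution over {0,1}^2.
    weights = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
    # Query the subcube where coordinate 0 is fixed to 1.
    draws = [subcube_sample(weights, {0: 1}, rng) for _ in range(10)]
    print(draws)
```

In this example the oracle returns $(1,0)$ and $(1,1)$ with probabilities $3/7$ and $4/7$, the original weights renormalized within the queried subcube.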
