September | 2020 | Property Testing Review

Last month saw action in property testing across the board: graphs, distributions and functions were all objects of study in papers that came out last month. Also included is a separation result between quantum and classical query complexities resolving a conjecture of Aaronson and Ambainis. Here is a brief report.

Testing asymmetry in bounded degree graphs, by Oded Goldreich (ECCC). This paper studies a natural graph property hitherto not considered in the property testing literature. Namely, the question of testing whether a graph is asymmetric or whether it is far from being asymmetric. A graph is said to be asymmetric if its automorphism group is trivial (that is, it only contains the identity permutation). One of the results in the paper says that this problem is easy in the dense graph model – which is a side result of the paper. This is because all dense graphs are \(O(\log n/n)\)-close to being asymmetric. To see this, the paper points out that a simple randomized process which takes \(G\) as input and returns an asymmetric graph by changing very few edges. This process asks you to do the following: Take a set \(S \subseteq V\) with \(|S| = O(\log n)\) nodes and replace the subgraph they induce with a random graph. Moreover, randomize all the edges between \(S\) and \(V \setminus S\). What you can show is that in this modified graph, any automorphism (whp) will map \(S\) to itself. And all the remaining vertices behave (whp) in a unique manner which is peculiar to it. In particular, this means that any automorphism better not map a vertex \(v\) in \(V \setminus S\) to any other vertex in \(V \setminus S\). And this finishes the argument. The main result explores the bounded degree model. By a simple argument, you can show that testing asymmetry is easy if all the connected components have size \(s(n) \leq o\left(\frac{\log n}{\log {\log n}}\right)\). Thus, the challenging case is when you have some connected components of a larger size. In this case, what Goldreich shows is the following: If all the components have size at most \(s(n)\) where \(s(n) \geq \Omega\left(\frac{\log n}{\log {\log n}}\right)\), then you can test asymmetry (with one-sided-error) in \(O(n^{1/2} \cdot s(n)/\epsilon)\) queries. Moreover, the paper also shows a two-sided-lower bound of \(\Omega\left(n/s(n)\right)^{1/2}\) queries which holds as long as \(\epsilon \leq O(1/s(n)\). This leaves open the bounded degree question of determining the query complexity of testing asymmetry in the general case as the paper also points out.

On testability of first order properties in Bounded degree graphs, by Isolde Adler, Noleen Köhler and Pan Peng (arXiv). One of the well understood motivations for property testing begins in the following manner. Decision problems are typically hard to solve if they involve a universal quantifier — either \(\exists\) or \(\forall\). One way around this hardness is to do the following ritual: relax the problem by dropping the universal quantifier, define a notion of distance between objects in your universe and ask a promise problem instead. Indeed, if you take your favorite property testing problem, you will note that it precisely fits in the template above. How about making this business more rigorous in bounded degree graph property model? This is precisely the content of this work which considers the face off between (First Order) logic and property testing in the bounded degree model. The authors show some interesting results. They begin by showing that in the bounded degree model, you can show that all first order graph properties which involve a single quantifier \(Q \in \{\forall, \exists\}\) are testable with constant query complexity.
If you find this baffling, it would be good to remind yourself that not all graph properties can be expressed in the language of first order logic with a single quantifier! So, you can rest easy. The graph properties you know are not constant time testable are most assuredly not expressible with a single quantifier in First Order Logic. However, this work shows more. It turns out, that “any FO property that is defined by a formula with quantifier prefix \(\exists^* \forall^*\) is testable”. Moreover, there do exist FO graph properties defined by the quantifier prefix \(\forall^* \exists^*\) which are not testable. Thus, this work achieves results in the bounded degree model which are kind of analogous to the results in the dense graph model by Alon et al [1]. On a final note, I find the following one sentence summary of the techniques used to prove the lower bound rather intriguing: “[the paper] obtains the lower bound by a first-order formula that defines a class of bounded-degree expanders, based on zig-zag products of graphs.”

On graphs of bounded degree that are far from being Hamiltonian, by Isolde Adler and Noleen Köhler (arXiv). This paper explores the question of testing Hamiltonicity in the bounded degree model. The main result of the paper is that Hamiltonicity is not testable with one-sided error with \(o(n)\) queries. PTReview readers might recall from our July Post a concurrent paper by Goldriech [2] which achieves the same lower bound on query complexity in the two-sided error model (the authors call attention to [2] as well). One of the interesting feature of this result is that the lower bounds are obtained by an explicit deterministic reduction as opposed to the usual randomized reduction. Like the authors point out, this offers more insights into structural complexity of instances that are far from being Hamiltonian. We point out that this also differs from how the lower bound is derived in [2] — which is via local hardness reductions to a promise problem of 3 CNF satisfiability.

An optimal tester for \(k\)-linear, by Nader Bshouty (ECCC). This paper explores two related questions. We call a function \(f \colon \{0,1\}^n \to \{0,1\}\) \(k\)-linear if it equals the \(\sum_{i \in S} x_i\) for some \(S \subseteq [n]\) of size exactly \(k\). A boolean function is said to be \(k\)-linear* if it is \(j\) linear for a fixed \(j\) where \(j \in \{0,1,2, \cdots, k\}\). The paper proves the following theorems.

There exists a non-adaptive one-sided distribution free tester for \(k\)-linear* with query complexity being \(O\left(k \log k + \frac{1}{\varepsilon}\right)\). This matches the two-sided lower bound (where the underlying distribution is uniform) by Blais et al [3].
Using a reduction from \(k\)-linear* to \(k\)-linear, the paper shows one can obtain a non-adpative two-sided distribution free tester for \(k\)-linear with same query complexity as the above result. The lower bound from Blais et al applies here also (in fact, they prove a lower bound on \(k\)-linearity).
Next up, the paper has a couple of lower bound results to accompany this. One of these results reveals the price you pay for being one-sided and exact (that is, you insist on the function being exactly \(k\)-linear). Turns out, now you have a non-adaptive one-sided uniform distribution lower bound of \(\widetilde{\Omega}(k) \log n + \Omega(1/\varepsilon)\). If you allow adaptivity instead, the paper shows a lower bound of \(\widetilde{\Omega}(\sqrt k)\log n + \Omega(1/\varepsilon)\).

Amortized Edge Sampling, by Talya Eden, Saleet Mossel and Ronitt Rubinfeld (arXiv). Consider the following setup. You are given query access to adjacency list of a graph \(G\) with \(n\) vertices and \(m\) edges. You can make degree queries and neighbor queries. Suppose I ask you to sample a single edge from this graph from a distribution that is pointwise \(\varepsilon\) close to the uniform distribution. Eden and Rosenbaum already showed how you can achieve this with a budget of \(\widetilde{\Theta}(n/\sqrt m)\) queries. Starting from this jump off point, the authors ask whether you can circumvent this lower bound if you want to return multiple samples from a distribution which is again pointwise close to uniform. The paper answers this question in the affirmative and shows that if you know the number of samples, \(q\), in advance you can get away with an amortized bound of \(O*(\sqrt q n/\sqrt m)\) on the total number of queries needed.

On the High Accuracy Limitation of Adaptive Property Estimation, by Yanjun Han (arXiv). Take a discrete distribution \(\mathcal{P}\) with support size \(k\) and consider the task of estimating some symmetric property of \(\mathcal{P}\) to a small \(\pm \varepsilon\) additive error. Here, a symmetric property refers to a “nice” functional defined over the probability simplex, i.e., it refers to functions \(F \colon \Delta_k \to \mathbb{R}\) where \(F(p) = \sum_{i=1}^{k} f(p_i)\) where \(f \colon (0,1) \to \mathbb{R}\). A naive attack to these estimation tasks goes through the following ritual: you get your hands on the empirical distribution, you plug it in \(F\) and you hope for the best. Turns out, you are out of luck if the function \(f\) is non-smooth and in these cases you end up with a suboptimal estimator. Previous works have also looked at more sophisticated estimators (like the local moment matching or LMM and profile maximum likelihood or PML estimator). Turns out, using the LMM or PML estimator leads to optimal sample complexity for a handful of symmetric properties (as long as \(\varepsilon \geq n^{-1/3}\)). This paper considers the question of what can you say for supersmall values of \(\varepsilon\) where \(n^{-1/2} \leq \varepsilon \leq n^{-1/3}\). (The \(n^{-1/2}\) appears because there are estimators that use the knowledge of \(f\) and \(\varepsilon\) can be driven all the way down to \(n^{-1/2}\) for these estimators). The paper focuses on estimators which do not exploit any structure in \(f\). In particular, the paper specializes this question to PML and shows a fundamental limitation on PML which means that the PML approach fails to be sample optimal for the entire range of \(\varepsilon\) and is sample optimal only for \(\varepsilon >> n^{-1/3}\) — which also confirms a conjecture of Han and Shiragur (and refutes a conjecture of Acharya et al. who postulated this is sample optimal for the entire range of \(\varepsilon\)).

\(k\)-Forrelation Optimally Separates Quantum and Classical Query
Complexity, by Nikhil Bansal and Makrand Sinha (arXiv). Understanding the power of quantum query over classical queries is a well motivated problem with a rich history. One of the biggest questions in this area asks for the largest separation between classical and quantum query complexities. In a breakthrough, Aaronson and Ambainis [4] showed a fundamental simulation result which confirmed that you can simulate \(q\) quantum queries with \(O(N^{1 – 1/2q})\) classical queries in the randomized decision tree model of computation as long as \(q = O(1)\). In the same paper, the authors also showed that the standard forrelation* problem exhibits a \(1\) versus \(\widetilde{\Omega}(\sqrt n)\) separation. This means that for \(q = 1\), you essentially have optimal separation. But what about \(q > 1\)? To this end, Aaronson and Ambainis conjectured that a certain problem which they called \(k\)-forrelation — which can be computed with \(q = k/2\) queries requires at least \(\widetilde{\Omega}(n^{1-1/k})\) classical queries. The current work precisely confirms this conjecture.

(*) The forrelation problem asks you to decide whether one Boolean function is highly correlated with the Fourier transform of a second function.

(Edit: Added Later) Simon Apers points out a paper by Shrestov, Storozhenko and Wu that we missed. (Thanks Simon)! Here is a brief report on that paper.

An optimal separation of randomized and quantum query complexity (by Alexander Shrestov, Andrey Storozhenko and Pei Wu)(arXiv) Similar to the paper by Bansal and Sinha [BS20] mentioned above, this paper also resolves the conjecture by Aaronson and Ambainis proving the same result. Like the paper also notes, the techniques in both of these works are completely different and incomparable. On the one hand [BS20] proves the separation for an explicit function as opposed to a function chosen uniformly at random from a certain set as considered in this work. On the other hand, the separation result shown in [BS20] only applies when the query algorithm returns the correct answer with probability at least \(1/2 + 1/poly(\log n)\) — in contrast the results in this paper apply even when the query algorithm is required to have probability of correctness be a constant at least \(1/2\). In addition, this work also proves the \(\ell\)-Fourier weight conjecture of Tal which is of independent interest beyond quantum computing.

So, it looks like all in all we had a great month with two concurrent papers both resolving Aaronson Ambainis conjecture (yet again after two concurrent papers on testing Hamiltonicity)!

References:

[1] Noga Alon, Eldar Fischer, Michael Krivelevich, and Mario Szegedy. Efficient testing of
large graphs. Combinatorica, 20(4):451–476, 2000.

[2] Oded Goldreich. On testing hamiltonicity in the bounded degree graph model. Electronic Colloquium on Computational Complexity (ECCC), (18), 2020

[3] Eric Blais, Joshua Brody, and Kevin Matulef. Property testing lower bounds via communication complexity. CCC 2011

[4] Scott Aaronson and Andris Ambainis. Forrelation: A problem that optimally separates quantum from classical computing. SIAM J. Comput., 47(3):982–1038, 2018

Property Testing Review

The latest in property testing and sublinear time algorithms

Monthly Archives: September 2020

News for August 2020