# News for July 2023

We’re now back to our regular cadence of 4+ monthly papers on property testing and sublinear time algorithms. July brings us a new sublinear PageRank computation and numerous results on our trending topics of distribution and monotonicity testing. Without further delay, let’s see the July spread.

Estimating Single-Node PageRank in $$\widetilde{O}(\min(d_t,\sqrt{m}))$$ Time by Hanzhi Wang and Zhewei Wei (arXiv). PageRank is one of the most important quantities computed on graphs, especially in today’s networked world. Consider a graph $$G = (V,E)$$ which may be directed. The PageRank value of vertex $$t$$ is the probability that a short random walk, starting from the uniform distribution, reaches the vertex $$t$$. (Ok ok, I’m fudging details, but this is close enough to the truth.) The aim is to estimate this probability to within additive error $$O(1/n)$$, where $$n$$ is the number of vertices. A standard simulation would give an $$O(n)$$ time algorithm; can we do sublinear in graph size? Previous work (which actually has implementations!) has led to $$O(\sqrt{n\cdot d_t})$$ for undirected graphs [LBGS15] and (roughly) $$O(n^{2/3} d^{1/3}_{max})$$ for all graphs [BPP18]. Here, $$d_t$$ is the degree of vertex $$t$$ and $$d_{max}$$ is the maximum degree. This paper gets a bound of $$\widetilde{O}(\min(d_t,\sqrt{m}))$$, even for directed graphs (quite an achievement). A lower bound is still open for the fundamental problem! (A nice problem for any students reading…?)
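To make the random-walk picture concrete, here is a minimal Monte Carlo sketch of the standard simulation baseline (emphatically not the paper's sublinear algorithm): estimate the PageRank of $$t$$ as the fraction of geometric-length walks, started at a uniformly random vertex, that end at $$t$$. The function name and the adjacency-list representation are my own.

```python
import random

def estimate_pagerank(adj, t, num_walks=20000, alpha=0.15, seed=0):
    """Monte Carlo estimate of the PageRank of vertex t.

    Each walk starts at a uniformly random vertex and, at every step,
    terminates with probability alpha or moves to a uniformly random
    out-neighbor. PageRank(t) is (roughly) the probability that such a
    walk ends at t; we estimate it by the empirical fraction of walks.
    """
    rng = random.Random(seed)
    n = len(adj)
    hits = 0
    for _ in range(num_walks):
        v = rng.randrange(n)
        # continue the walk with probability 1 - alpha (stop at dangling nodes)
        while rng.random() > alpha and adj[v]:
            v = rng.choice(adj[v])
        hits += (v == t)
    return hits / num_walks
```

On a directed 4-cycle every vertex has PageRank $$1/4$$ by symmetry, and the estimate recovers this up to sampling error. Note the cost: $$\Theta(n)$$ walks are needed for additive error $$O(1/n)$$, which is exactly the barrier the paper's $$\widetilde{O}(\min(d_t,\sqrt{m}))$$ bound breaks.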

Directed Poincare Inequalities and $$L_1$$ Monotonicity Testing of Lipschitz Functions by Renato Ferreira Pinto Jr. (arXiv, ECCC). We all (or at least I) love monotonicity testing. The paper studies the continuous version, where $$f:[0,1]^n \to \mathbb{R}$$. To have a reasonable notion of distance and testers, we require functions to be differentiable and $$L$$-Lipschitz. We measure distance using $$\ell_1$$ distance, so the distance between $$f,g$$ is $$\int_{[0,1]^n} |f-g| d\nu$$ (where we integrate over the uniform measure) [BRY14]. To access $$f$$, we are also provided a “derivative oracle”: given a point $$x$$ and a direction $$\vec{v}$$, we can query the directional derivative along $$\vec{v}$$ at $$x$$. One of the key insights in (discrete, Boolean) monotonicity testing is the connection to directed isoperimetric inequalities. These inequalities are directed versions of classic inequalities that relate boundaries (or gradients) to volumes. For $$L$$-Lipschitz functions, the classic Poincare inequality states that $$E[\|\nabla f\|_1] \geq \textrm{var}(f)$$, where $$\nabla f$$ is the gradient and $$\textrm{var}(f)$$ is (up to constant factors) the distance to being constant. This paper proves the directed version $$E[\|\nabla^- f\|_1] \geq \varepsilon(f)$$. Roughly speaking, $$\nabla^-$$ is the “negative gradient” ($$\min(\nabla f, 0)$$) and $$\varepsilon(f)$$ is the $$L_1$$-distance to monotonicity. This result directly yields an $$O(n/\varepsilon)$$ query tester for monotonicity. The paper asks the tantalizing question: can we do better?
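To get a feel for the quantity on the left side of the directed inequality, here is a toy sketch (my own, not from the paper) that estimates $$E[\|\nabla^- f\|_1]$$ over uniform points, using finite differences in place of the paper's directional-derivative oracle:

```python
import random

def expected_negative_gradient_l1(f, n, num_samples=2000, h=1e-5, seed=0):
    """Estimate E[||grad^- f||_1] over uniform x in [0,1]^n, where grad^-
    keeps only the negative part of each partial derivative,
    min(partial_i f, 0). Forward finite differences stand in for the
    paper's derivative oracle; f maps a list of n floats to a float."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = [rng.random() for _ in range(n)]
        for i in range(n):
            xp = list(x)
            xp[i] = min(1.0, x[i] + h)
            d = (f(xp) - f(x)) / (xp[i] - x[i])
            total += -min(d, 0.0)  # contributes |min(partial_i f, 0)|
    return total / num_samples
```

For a monotone function like $$\sum_i x_i$$ the estimate is zero (no negative partials), while for $$-x_1$$ it is $$1$$: exactly the sense in which $$E[\|\nabla^- f\|_1]$$ lower-bounds distance to monotonicity.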

A Strong Composition Theorem for Junta Complexity and the Boosting of Property Testers by Guy Blanc, Caleb Koch, Carmen Strassle, and Li-Yang Tan (arXiv). It is best to go backwards in discussing this paper, starting from the implications and going to the core results. The problem at hand is the classic one of junta testing. Given $$f:\{0,1\}^n \to \{0,1\}$$, distinguish if $$f$$ only depends on $$r$$ variables (an $$r$$-junta) or is $$\varepsilon$$-far from having that property. This problem is well studied, has optimal (efficient) testers, and even has results in the distribution-free setting. In the latter setting, there exists some (unknown) distribution $$\mathcal{D}$$ on the domain according to which distance is defined. The tester can draw samples according to $$\mathcal{D}$$. The clarity of junta testing disappears when we consider tolerant testing: can we distinguish whether $$f$$ is (say) $$1/4$$-close to an $$r$$-junta or $$1/3$$-far (where distances are measured according to $$\mathcal{D}$$)? A remarkable consequence of this paper is that this tolerant testing version is unlikely to have a $$poly(n)$$ time algorithm. (Note that the sample complexity might still be small.) The main tool is a composition theorem that gives reductions between low-$$\varepsilon$$ testers and constant-$$\varepsilon$$ testers. The details are intricate, but here’s the gist. Suppose we can construct composing functions $$g$$ such that the distance to “junta-ness” of $$g \circ f$$ is much larger than the distance of $$f$$. Then a tester (that can only deal with large $$\varepsilon$$) on $$g \circ f$$ can effectively property test $$f$$. (Obviously, I’m greatly oversimplifying and mischaracterizing. So go read the paper!)
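As a toy illustration of the basic (non-tolerant) object being tested, and not anything from the paper: one can hunt for evidence that $$f$$ depends on a variable outside a candidate set $$S$$ by rerandomizing the coordinates outside $$S$$. Names and parameters below are my own.

```python
import random

def violates_junta(f, n, S, num_trials=500, seed=0):
    """Sanity check: look for a witness that f is NOT a junta on S.

    Draw a random x, then rerandomize the coordinates outside S to get y.
    If f(x) != f(y), then f provably depends on some variable outside S.
    Never finding a witness is only (strong) statistical evidence of
    junta-ness, in the spirit of a one-sided tester."""
    rng = random.Random(seed)
    for _ in range(num_trials):
        x = [rng.randrange(2) for _ in range(n)]
        y = [x[i] if i in S else rng.randrange(2) for i in range(n)]
        if f(x) != f(y):
            return True  # witness found: f depends on a variable outside S
    return False
```

For example, $$f(x) = x_1 \oplus x_2$$ is a $$2$$-junta on $$\{x_1, x_2\}$$ but visibly depends on $$x_2$$ when we guess $$S = \{x_1\}$$. The tolerant version, distinguishing close-to-junta from far, is where the paper shows the computational hardness kicks in.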

The Full Landscape of Robust Mean Testing: Sharp Separations between Oblivious and Adaptive Contamination by Clément L. Canonne, Samuel B. Hopkins, Jerry Li, Allen Liu, and Shyam Narayanan (arXiv). Consider the fundamental hypothesis testing problem of Gaussian testing. We wish to distinguish the $$d$$-dimensional, zero-mean, unit covariance Gaussian $$\mathcal{N}(0,I)$$, from the “shifted” version $$\mathcal{N}(\mu,I)$$, where $$\mu$$ is a vector of norm at least $$\alpha$$. Existing results give a simple algorithm using $$\Theta(\sqrt{d}/\alpha^2)$$ samples. Suppose there is an adversary who can corrupt samples. The adaptive adversary can look at the samples generated, and arbitrarily change an $$\varepsilon$$ fraction of entries. The weaker, oblivious adversary can also arbitrarily change an $$\varepsilon$$ fraction of entries, but cannot look at the samples generated. Can we perform Gaussian testing in this setting? A previous result gave an optimal bound for adaptive adversaries [N22]. This paper gives the optimal sample complexity bound for oblivious adversaries. The expression is somewhat complicated. But the punchline is that for many regimes of the parameters $$d, \alpha, \varepsilon$$, the oblivious adversary is strictly weaker. Meaning, there is a testing algorithm (against an oblivious adversary) whose sample complexity is strictly less than the optimal bound against adaptive adversaries. This comes as a surprise, because in typical distribution testing settings, these adversaries are basically equivalent.

Uniformity Testing over Hypergrids with Subcube Conditioning by Xi Chen and Cassandra Marcussen (arXiv). A problem about generalizing from the hypercube to hypergrids, an issue I have much obsessed over. Consider the problem of uniformity testing, where the domain is the hypergrid $$[m]^n$$. We wish to test if a distribution $$\mathcal{D}$$ over the hypergrid is uniform. Unlike the standard distribution testing setting, the domain size is exponentially large. To aid us, we can perform conditional sampling: we can select any sub-hypergrid, and get samples from $$\mathcal{D}$$ restricted to this sub-hypergrid. Previous work solved this problem for the hypercube domain (when $$m=2$$), by providing a tester with sample complexity $$O(\sqrt{n}/\varepsilon^2)$$ [CCKLW22]. That approach did not extend to any other hypergrid (even $$m=3$$). This paper gives the first solution for uniformity testing on hypergrids, with a tester of sample complexity $$O(poly(m)n/\varepsilon^2)$$. The dependence on $$m$$ has to be at least $$\sqrt{m}$$, by existing results. One of the important tools is getting robust versions of classic isoperimetric inequalities over the hypergrid. An important open question is to determine the optimal dependence on $$m$$.

Learning and Testing Latent-Tree Ising Models Efficiently by Davin Choo, Yuval Dagan, Constantinos Daskalakis, and Anthimos Vardis Kandiros (arXiv). Depending on your standpoint, one may or may not consider this a “sublinear time” paper (it does not show that testing $$\ll$$ learning). But given the interest in distribution testing, I think it’s germane to our blog. We have a high-dimensional distribution $$\mathcal{D}$$ over $$(x_1, x_2, \ldots, x_n)$$. Without any assumptions, learning (or even identity testing) requires exponentially (in $$n$$) many samples. This paper assumes that $$(x_1, x_2, \ldots, x_n)$$ is generated from a latent-tree Ising model. There is a tree where each node is associated with a number (think $$x_i$$), and the joint distribution satisfies some dependencies according to the edges. We only observe the values of the leaves; hence, the term “latent” model. The main result shows that identity testing can be done with polynomially (in $$n$$) many samples. The authors also prove that one can learn the distribution in (albeit larger) $$poly(n)$$ samples.

# News for March 2023

I never thought this day would come.

For the first time in PTReview history, there is no paper to report. Nada. Zilch.

The calm before the storm…?

# News for December 2022

Happy New Year! Apologies for the late post; I was stuck for far too long in holiday mode. We haven’t missed much. There were only two papers in the month of December, since, I’m assuming, many of us were entering holiday mode.

Testing in the bounded-degree graph model with degree bound two by Oded Goldreich and Laliv Tauber (ECCC). One of the great, central results in graph property testing is that all monotone properties are testable (with query complexity independent of the graph size) on dense graphs. The sparse graph universe is far, far more complicated and interesting. Even for graphs with degree bound 3, natural graph properties can have anywhere from constant to linear (in $$n$$) query complexity. This note shows that when considering graphs with degree bound at most 2, the landscape is quite plain. The paper shows that all properties are testable with $$poly(\varepsilon^{-1})$$ queries. Any graph with degree at most 2 is a collection of paths and cycles. In $$poly(\varepsilon^{-1})$$ queries, one can approximately learn the graph. (After which the testing problem is trivial.) The paper gives a simple $$O(\varepsilon^{-4})$$ query algorithm, which is improved to the nearly optimal $$\widetilde{O}(\varepsilon^{-2})$$ bound.
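The structural fact doing the work is easy to see in code: in a graph of maximum degree 2, the component of any vertex is a path or a cycle, and it can be traced out greedily, neighbor by neighbor. A sketch of that trace (function name mine; this is the structural observation, not the paper's tester):

```python
def trace_component(adj, v):
    """In a graph of maximum degree 2, walk outward from v in both
    directions to recover its whole component (a path or a cycle).
    adj is an adjacency list; each vertex has at most two neighbors."""
    comp = [v]
    for start in adj[v]:
        prev, cur = v, start
        while cur != v:              # stop if we wrap around a cycle
            comp.append(cur)
            nxt = [u for u in adj[cur] if u != prev]
            if not nxt:
                break                # reached an endpoint of a path
            prev, cur = cur, nxt[0]
        if cur == v:
            break                    # cycle: one direction suffices
    return comp
```

Sampling a few vertices and tracing short stretches of their components is the flavor of how one “approximately learns” such a graph in $$poly(\varepsilon^{-1})$$ queries.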

On the power of nonstandard quantum oracles by Roozbeh Bassirian, Bill Fefferman, and Kunal Marwaha (arXiv). This paper is on the power of oracles in quantum computation. An important question in quantum complexity theory is whether $$QCMA$$ is a strict subset of $$QMA$$. The former consists of languages decided by Merlin-Arthur quantum protocols with a classical witness (the string that Merlin provides). The latter class allows Merlin to send a quantum witness. This paper gives a property testing problem that exhibits such a separation. The property is essentially graph non-expansion (does there exist a set of low conductance?). The input graph should be thought of as having even, bounded degree and “exponentially many” vertices. So it has $$N = 2^n$$ vertices. The graph is represented through a special “graph-coded” function. The paper shows that there is a $$poly(n)$$-sized quantum witness for non-expansion that can be verified in $$poly(n)$$ time, which includes queries to the graph-coded function. On the other hand, there is no classical $$poly(n)$$-sized witness that can be verified with $$poly(n)$$ queries to the graph-coded function. (Informally speaking, any $$QCMA$$ protocol needs exponentially many queries to the graph.)

# News for August 2022

An eerily quiet month, this August of ’22. We only found a single paper to cover!

Training Overparametrized Neural Networks in Sublinear Time by Hang Hu, Zhao Song, Omri Weinstein, and Danyang Zhuo (arXiv). Think of a classification problem where the inputs are in $$\mathbb{R}^d$$. We have $$n$$ such points (with their true labels, as training data) and wish to train a Neural Network. A two layer Rectified Linear Unit (ReLU) Neural Network (NN) works as follows. The first layer has $$m$$ vertices, where each vertex has a vector weight $$\vec{w}_i \in \mathbb{R}^d$$. The second “hidden layer” has $$m$$ vertices, with scalar weights $$a_1, a_2, \ldots, a_m$$. This network is called overparametrized when $$m \gg n$$. The output of this NN on input vector $$\vec{x}$$ is (up to scaling) $$\sum_{i \leq m} a_i \phi(\vec{w_i} \cdot \vec{x})$$ (where $$\phi$$ is a thresholded linear function). Observe that computing the value on a single input takes $$O(md)$$ time, so computing all values on the $$n$$ training inputs takes $$O(mnd)$$ time. The training is done by gradient descent methods; given a particular setting of weights, we compute the total loss, and then modify the weights along the gradient. Previous work showed how a single iteration can be done in time $$O(mnd + n^3)$$. When $$m \gg n^2$$, this can be thought of as linear in the cost of computing the loss function (which requires evaluating the NN on all $$n$$ points). This paper shows how to implement a single iteration in $$O(m^{1-\alpha}nd + n^3)$$ time, for some $$\alpha > 0$$. Hence, the time for an iteration is sublinear in the trivial computation. The techniques used are sparse recovery methods and random projections.
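For concreteness, here is the forward pass being discussed, taking $$\phi$$ to be the ReLU and omitting the scaling, as a plain-Python sketch (function name mine):

```python
def two_layer_relu(W, a, x):
    """Forward pass of the two-layer ReLU network from the setup above:
    output = sum_i a_i * relu(<w_i, x>).

    W is a list of m weight vectors w_i (each of dimension d), a is the
    list of m scalar second-layer weights. Each inner product costs O(d),
    so one evaluation costs O(m d) and all n training points cost O(mnd),
    which is the baseline the paper's sublinear iteration beats."""
    return sum(ai * max(0.0, sum(w_j * x_j for w_j, x_j in zip(wi, x)))
               for wi, ai in zip(W, a))
```

The point of the paper is that a gradient iteration need not pay this full $$O(mnd)$$: only a sparse subset of the $$m$$ neurons matters per step, which sparse-recovery tools can identify.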

# News for April 2022

We have…I don’t know, I’ve lost count of the number of papers this month. It’s a big bonanza. Sublinear algorithms for edit distance, planar graphs, distributions, bipartite graphs, groups, error correcting codes, Bayesian nets, polynomials…

Improved Sublinear-Time Edit Distance for Preprocessed Strings by Karl Bringmann, Alejandro Cassis, Nick Fischer, and Vasileios Nakos (arXiv). The edit distance between strings is a classic and important problem in algorithms. You might recall the classic $$O(n^2)$$ algorithm to compute the edit distance between strings of length $$n$$. It has been shown that getting an $$O(n^{2-\delta})$$ time algorithm is SETH-hard. But what can be done in sublinear time? This paper considers the preprocessed version: suppose we can perform near-linear preprocessing on the strings. We now want to distinguish between edit distance at most $$k$$ and at least $$k\cdot n^{o(1)}$$. This paper shows that with near-linear preprocessing on the strings, one can solve this problem in $$k \cdot n^{o(1)}$$ time.
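The classic quadratic-time baseline referenced above is the textbook dynamic program:

```python
def edit_distance(s, t):
    """Classic O(n^2) dynamic program: dp[i][j] is the edit distance
    (insertions, deletions, substitutions) between the first i characters
    of s and the first j characters of t."""
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                      # delete all of s[:i]
    for j in range(n + 1):
        dp[0][j] = j                      # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(dp[i - 1][j] + 1,                        # delete s[i-1]
                           dp[i][j - 1] + 1,                        # insert t[j-1]
                           dp[i - 1][j - 1] + (s[i - 1] != t[j - 1]))  # substitute
    return dp[m][n]
```

Even reading the input takes linear time, which is why the sublinear regime needs both the gap version of the problem and the preprocessing assumption.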

Optimal Closeness Testing of Discrete Distributions Made Complex Simple by (our own) Clément L. Canonne and Yucheng Sun (arXiv). Given two distributions $$p, q$$ over support $$[k]$$, the aim is to distinguish between (i) the distributions being equal, and (ii) the total variation distance between $$p, q$$ being at least $$\epsilon$$. The tester should have a failure probability of at most $$\delta$$. A recent work nails down the sample complexity with respect to all parameters. This paper gives a simpler proof of the main result. Earlier proofs used Poissonization tricks and fairly clever arguments about Poisson random variables. This proof is much more transparent, and uses an identity that relates the expectation of a random variable to its characteristic function. A nice feature of this proof is that it works directly with the multinomial distribution, which means a fixed number of samples (rather than choosing the number of samples from a distribution).

Tolerant Bipartiteness Testing in Dense Graphs by Arijit Ghosh, Gopinath Mishra, Rahul Raychaudhury, and Sayantan Sen (arXiv). Testing bipartiteness of dense graphs is about as a classic as it gets. We wish to distinguish a bipartite graph from one that requires $$\varepsilon n^2$$ edge removals to make it bipartite. Readers of this blog should know that there is a $$\widetilde{O}(\varepsilon^{-2})$$-query property tester for this problem. (Ok, so now you know.) This paper studies the tolerant version of bipartiteness testing. Note that this is equivalent to approximating the maxcut, up to additive error $$\varepsilon n^2$$. Classic approximation algorithms show that the latter can be done in $$\widetilde{O}(\varepsilon^{-6})$$ queries and $$\exp(\widetilde{O}(\varepsilon^{-2}))$$ time. This paper considers the easier problem of distinguishing whether the distance to bipartiteness is at most $$\varepsilon$$ or at least $$2 \varepsilon$$. This problem is solved in $$\widetilde{O}(\varepsilon^{-3})$$ queries and $$\exp(\widetilde{O}(\varepsilon^{-1}))$$.

Properly learning monotone functions via local reconstruction by Jane Lange, Ronitt Rubinfeld, Arsen Vasilyan (arXiv). Ah yes, monotone functions. An ongoing love (obsession? interest?) for property testing people. This paper studies the problem of proper learning of Boolean valued monotone functions over the Boolean hypercube. Given access to uniform random evaluations of a monotone function $$f:\{0,1\}^n \to \{0,1\}$$, we wish to compute a monotone function $$g$$ that approximates the original function. Classic results from Fourier analysis show that an approximation can be learned using $$\exp(\sqrt{n}/\varepsilon)$$ queries. But this approximation function might not be monotone, and only yields improper learning. This paper gives a proper learner that outputs a monotone approximation, in roughly the same query complexity. This result directly gives a constant tolerance monotonicity tester for Boolean functions. The paper uses recent results from distributed algorithms and local computation. It also leads to tolerant testers for monotonicity over posets with small diameter.

Massively Parallel Computation and Sublinear-Time Algorithms for Embedded Planar Graphs by Jacob Holm and Jakub Tětek (arXiv). Sublinear algorithms for planar graphs is another ongoing love (at least for me). This paper considers a new take on this problem: suppose we have access to a geometric embedding of a planar graph $$G$$. Can we get sublinear algorithms for a variety of problems? This paper first shows how to construct a convenient decomposition, called an $$r$$-division, in sublinear time. This division can be used to approximate Lipschitz graph parameters, such as maximum matching sizes, maximum independent set, etc. The paper also shows how to compute an $$r$$-division in the MPC model, which solves many classic graph problems (connected components, matchings, etc.) in $$O(1)$$ rounds. There is a (conditional) lower bound showing that, without an embedding, it is not possible to solve such problems in $$O(1)$$ rounds (and sublinear space per processor).

Independence Testing for Bounded Degree Bayesian Network by Arnab Bhattacharyya, Clément L. Canonne (again, our own), and Joy Qiping Yang (arXiv). Given a distribution $$\mathcal{P}$$ on the Boolean hypercube $$\{0,1\}^n$$, the problem is to determine whether $$\mathcal{P}$$ is a product distribution. In general, this problem requires $$\Omega(2^n)$$ samples. Suppose $$\mathcal{P}$$ has a sparse, “efficient” description. Can we do better? This paper shows that when $$\mathcal{P}$$ is generated by a Bayesian network (with bounded indegree), then the independence testing problem can be solved with $$\widetilde{O}(n/\varepsilon^2)$$ samples. Think of a Bayesian network as a DAG, where each vertex generates a Bernoulli random variable. The variable at a vertex depends only on the outcomes at its parents.

Low Degree Testing over the Reals by Vipul Arora, Arnab Bhattacharyya, Noah Fleming, Esty Kelman, and Yuichi Yoshida (arXiv, ECCC). The problem of testing low degree polynomials goes back to the birth of property testing. This paper studies real valued polynomials, in the distribution free setting. Formally, we have query access to a function $$f: \mathbb{R}^d \to \mathbb{R}$$. The distance is measured with respect to an unknown distribution $$\mathcal{D}$$ over the domain. This paper shows that the real low degree testing problem can be solved in $$poly(d\varepsilon^{-1})$$ queries (under some reasonableness conditions on the distribution). The approach goes via “self-correct and test”: try to compute a low degree polynomial that fits some sampled data, and then check how far the self-corrected version is from another sample.

Testing distributional assumptions of learning algorithms by Ronitt Rubinfeld and Arsen Vasilyan (arXiv). Consider the problem of learning a halfspace over $$\mathbb{R}^n$$. If the underlying distribution is Gaussian, then this class can be learned in $$n^{poly(\varepsilon^{-1})}$$ samples. If the distribution is arbitrary, no $$2^{o(n)}$$ algorithm is known despite much research. This paper introduces the notion of having a tester-learner pair. The tester first checks if the input distribution is “well-behaved” (Gaussian-like). If the tester passes, then we run the learner. Indeed, this perspective goes back to some of the original motivations for property testing (when is testing faster than learning). The intriguing aspect of this problem is that we do not have efficient testers for determining if an input distribution is Gaussian. This paper circumvents that problem by estimating certain moments of the distribution. If these moments agree with the moments of a Gaussian, then the learner is guaranteed to succeed. We get the best of both worlds: if the input distribution is Gaussian, the learning is done correctly. If the learner succeeds, then the output (hypothesis) is guaranteed to be correct, regardless of the input distribution.

Testability in group theory by Oren Becker, Alexander Lubotzky, and Jonathan Mosheiff (arXiv). This paper is the journal version of a result of the authors, and it gives a group theoretic presentation of a property testing result. Consider the following problem. The input is a pair of permutations $$(\sigma_1, \sigma_2)$$ over $$[n]$$. The aim is to test whether they commute: $$\sigma_1 \sigma_2 = \sigma_2 \sigma_1$$. Another result of the authors gives a tester that makes $$O(\varepsilon^{-1})$$ queries. They refer to this problem as “testing the relation” $$XY = YX$$. This paper gives a grand generalization of that result, best explained by another example. Consider another relation/property denoted $$\{XZ = ZX, YZ = ZY\}$$. This property consists of all triples of permutations $$(\sigma_1, \sigma_2, \sigma_3)$$, where $$\sigma_3$$ commutes with the other two. A consequence of the main theorem is that this property is not testable with query complexity independent of $$n$$. The main result of this paper is a characterization of testable relations, which goes via studying the expansion of an infinite graph associated with the relation.

Testing Positive Semidefiniteness Using Linear Measurements by Deanna Needell, William Swartworth, and David P. Woodruff (arXiv). The input is a $$d \times d$$ real, symmetric matrix $$M$$ and we wish to determine if it is positive semidefinite (all eigenvalues are nonnegative). For the testing problem, we reject when the minimum eigenvalue is at most $$-\varepsilon \|M\|_2$$. (The paper also considers general Schatten $$p$$-norms.) This paper gives a list of results for non-adaptive vs adaptive, and one-sided vs two-sided testers. There are two access models considered: a single query consists of either a (i) matrix-vector product $$Mx$$ or (ii) vector-matrix-vector product $$y^TMx$$. Typical models that query entries of the matrix require strong bounds on the entries, which is less reasonable in practical situations. An interesting discovery is that the non-adaptive, one-sided complexity is $$\Theta(\sqrt{d}\varepsilon^{-1})$$ while the two-sided bound is independent of $$d$$.
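To see how far matrix-vector access alone can take you, here is a textbook power-iteration sketch (a standard technique, not the paper's tester) that recovers the minimum eigenvalue through products alone, assuming a known upper bound `shift` on $$\|M\|_2$$. Names and parameters are my own.

```python
import random

def min_eigenvalue_estimate(matvec, d, shift, iters=200, seed=0):
    """Estimate the minimum eigenvalue of a symmetric d x d matrix M,
    accessed only via matrix-vector products (the paper's access model).

    Power iteration on shift*I - M converges to that matrix's top
    eigenvalue, which equals shift - lambda_min(M) when shift >= ||M||_2.
    Each iteration costs exactly one matvec query."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        Mv = matvec(v)
        w = [shift * vi - mvi for vi, mvi in zip(v, Mv)]  # (shift*I - M) v
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    Mv = matvec(v)
    rayleigh = sum(vi * (shift * vi - mvi) for vi, mvi in zip(v, Mv))
    return shift - rayleigh  # approximately lambda_min(M)
```

Of course, this is adaptive and its query count depends on spectral gaps; the paper's point is pinning down the exact adaptive/non-adaptive and one-sided/two-sided complexities of the gap (testing) version.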

Relaxed Locally Decodable and Correctable Codes: Beyond Tensoring by Gil Cohen and Tal Yankovitz (ECCC). Locally decodable and correctable codes are a fundamental object of study in property testing (and TCS in general). Consider a locally correctable code (LCC). Given a string $$x$$, the decoder/corrector makes $$q$$ queries to $$x$$, and outputs a symbol. We can think of the output collectively as a string $$y$$. If $$x$$ is a codeword, then $$y = x$$. Otherwise, $$dist(y,z) \leq \varepsilon$$, where $$z$$ is some codeword close to $$x$$. In the relaxed version, the corrector is allowed to output $$\bot$$, denoting that it has discovered corruption. The distance is only measured in the coordinates where the corrector does not output $$\bot$$. Thus, the corrector gets a “free pass” if it outputs $$\bot$$. But note that when $$x$$ is a codeword, the output must be exactly $$x$$. This paper gives a Relaxed LCC with query complexity $$(\log n)^{O(\log\log\log n)}$$, a significant improvement over the previous best $$(\log n)^{O(\log\log n)}$$. It is known from previous work that the query complexity must be $$\Omega(\sqrt{\log n})$$.

Verifying The Unseen: Interactive Proofs for Label-Invariant Distribution Properties by Tal Herman and Guy Rothblum (arXiv). This paper considers the distribution testing problem in the context of interactive proofs. The verifier, who wishes to test a property of a distribution $$\mathcal{P}$$, interacts with a prover who knows the distribution. The guarantee required is the standard one for interactive proof systems: in the YES case, an honest prover should be able to convince the verifier. In the NO case, no prover can convince the verifier with high probability. There are two important parameters of interest: the sample complexity of the verifier, and the communication complexity of the messages. It is useful to consider the two extremes. In one extreme, the verifier can simply solve the problem herself, ignoring the prover. This could require $$\Theta(n/\log n)$$ samples (for the hardest properties like entropy and support size). Another extreme is for the honest prover to simply send an approximate description of the distribution, which takes $$O(n)$$ bits. The verifier can then just test equality to the prover’s message, which only takes $$\Theta(\sqrt{n})$$ samples. This paper shows a 2-round protocol for any (label-invariant) property where both the communication and the sample complexity can be made $$\Theta(\sqrt{n})$$. This result shows the power of interaction for distribution testing problems.

# News for December 2021

Happy 2022 everyone! We now cover the last papers of 2021. Our readers have pointed out a number of papers from previous months that we missed (sorry). To adequately cover (and publicize) these papers, we’ve decided to just incorporate those papers on this post rather than update previous posts.

We have a breakthrough on Locally Testable Codes, sublinear algorithms for testing pattern freeness, new thoughts on convexity testing, and a perspective on sampling community structure. And here’s the final list for 2021.

Locally Testable Codes with constant rate, distance, and locality by Irit Dinur, Shai Evra, Ron Livne, Alexander Lubotzky, and Shahar Mozes (ECCC, posted Nov 2021). A Locally Testable Code (LTC) is one where the set of codewords has a property tester. The study of LTCs started with the PCP theorem, but blossomed into a deep research agenda. The holy grail is a $$c^3$$-LTC: one with constant rate, constant distance, and constant query property tester. Just to recall, the rate $$r \in (0,1)$$ is the ratio of the message length to the encoding length. The distance $$\delta \in (0,1)$$ is the (normalized) Hamming distance between codewords (we typically expect this to be constant for an error correcting code). And, most familiar to us, $$q$$ is the number of queries required to property test codewords. The aim is to get all these parameters to be constants independent of the message length. The classic linearity test of BLR basically proves that the Hadamard code is 3-query testable. But the Hadamard code has poor (exponentially small) rate. Low-degree tests prove that Reed-Muller codes are constant-query testable, but they have inverse polynomial rate. Further work constructed LTCs with inverse polylogarithmic rate, meaning that the encoding has length $$n\cdot poly(\log n)$$ for message length $$n$$. There was even some belief that any LTC must have a rate going to zero. This paper invalidates that belief, and gives a truly $$c^3$$-construction. The construction is based on a 2-dimensional simplicial complex based on Cayley graphs. This is basically a graph with hyperedges corresponding to “squares” of vertices. Concurrent and independent work of Breuckmann-Eberhardt and Panteleev-Kalachev also have similar constructions, though their perspective is of quantum codes.

On the Locally Testable Code of Dinur et al. (2021) by Oded Goldreich (ECCC). As the title says, this short paper gives a high level discussion of the previous result. It abstracts out some of the graph theoretic aspects, to explain connections to more combinatorial constructions. The construction and analysis are based on expander codes. These codes (e.g. low-density parity check, LDPC codes) have constant rate and distance, but we do not know how to test them. A string far from a codeword might only fail on (say) one parity check constraint. Interpreting the Dinur et al. construction as an expander code, we essentially get a collection of correlated parity constraints. Hence, a string that is far from a codeword violates many constraints, leading to a tester. (Of course, this is an extremely watered down version of the analysis idea.) The construction takes two Cayley graphs on a non-Abelian group (which form the vertices). Specific 4-cycles in this graph (together with a constant sized tensor code) are used to make the final codewords.

Strongly Sublinear Algorithms for Testing Pattern Freeness by Ilan Newman and Nithin Varma (arXiv, posted Jun 2021). Pattern freeness is a major generalization of monotonicity testing. The input is (query access to) a function $$f: [n] \mapsto \mathbb{R}$$. Let $$\pi: [k] \mapsto [k]$$ be a permutation. The input is $$\pi$$-free if no restriction of $$f$$ to $$k$$ elements induces the permutation $$\pi$$. Newman-Rabinovich-Rajendraprasad-Sohler introduced this problem. When $$k=2$$, this property is precisely monotonicity, for which (many) $$O(\varepsilon^{-1} \log n)$$ property testers are known. Is such a bound achievable for $$k \geq 3$$? Interestingly, for non-adaptive algorithms, an optimal bound of $$\Theta(n^{1-1/(k-1)})$$ (ignoring $$\varepsilon, k$$ dependencies) was achieved by Ben-Eliezer and Canonne. This paper shows that there is an adaptive algorithm that makes $$n^{o(1)}$$ queries, thereby proving a separation between adaptive and non-adaptive algorithms. A key idea is to visualize the input as a set of points in the plane, and construct a coarse grid that partitions the points. There is a highly non-trivial case analysis on top of this that leads to an $$O(\sqrt{n})$$-query algorithm (this already beats the non-adaptive lower bound).

Parameterized Convexity Testing by Abhiruk Lahiri, Ilan Newman, and Nithin Varma (arXiv, posted Oct 2021). Given a function $$f:[n] \mapsto \mathbb{R}$$, the property of interest is (discrete) convexity. The discrete derivatives should be monotonically non-decreasing. It is known that there is a $$O(\varepsilon^{-1}\log n)$$ query non-adaptive tester for this property. This paper studies this problem from the perspective of parameterized property testing, where the range is restricted. The main result is that if the number of distinct discrete derivatives is at most $$s$$, there is a $$O(\varepsilon^{-1} \log s)$$ query non-adaptive tester. Moreover, this bound is optimal even for adaptive algorithms. This picture parallels that of monotonicity. Interestingly, when there are only two distinct discrete derivatives, the tester can be made deterministic.
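The property itself is one line of code. Here is the exact linear-time check (the tester's job is to approximate this answer with exponentially fewer queries); function name mine:

```python
def is_discretely_convex(f_vals):
    """A function on [n], given as the list of its values, is discretely
    convex iff its discrete derivatives f(i+1) - f(i) are non-decreasing
    in i. This is the exact O(n) check underlying the property."""
    d = [b - a for a, b in zip(f_vals, f_vals[1:])]  # discrete derivatives
    return all(x <= y for x, y in zip(d, d[1:]))     # non-decreasing?
```

The parameter $$s$$ in the paper is exactly the number of distinct values appearing in the derivative list `d` above.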

Modularity and edge sampling by Colin McDiarmid and Fiona Skerman (arXiv). Not your typical sublinear algorithms paper, but certainly of interest to us. The main problem is to understand when a random sample of edges in a graph can reveal information about the “community structure”. The modularity of a graph is a measure of this structure. Essentially, a partition has low modularity if the number of edges crossing the partition is close to that of a random graph with the same degree distribution. A partition of high modularity has significantly fewer crossing edges, and is thus a valid “clustering” of the graph. The modularity of the graph is obtained by maximizing over partitions. (The normalization of modularity is chosen so that the maximizing partition generally has a constant number of sets.) This paper proves a concentration inequality for the modularity of a constant-size sample of uniform random edges. The modularity of a sample of $$\Theta(\varepsilon^{-5})$$ edges is an additive $$\varepsilon$$-approximation of the overall modularity, whp.
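For the definition-minded, here is a sketch of the (Newman-Girvan) modularity of a single fixed partition; the graph quantity studied in the paper maximizes this over all partitions. Function names are mine.

```python
from collections import defaultdict

def modularity(edges, labels):
    """Modularity of a fixed partition: for each community, the fraction
    of edges inside it minus the expected such fraction in a random graph
    with the same degree sequence, summed over communities.

    edges is a list of (u, v) pairs; labels maps each vertex to its
    community."""
    m = len(edges)
    internal = defaultdict(int)  # edges with both endpoints in community c
    degree = defaultdict(int)    # total degree of vertices in community c
    for u, v in edges:
        degree[labels[u]] += 1
        degree[labels[v]] += 1
        if labels[u] == labels[v]:
            internal[labels[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)
```

Two disjoint triangles, split into their natural two communities, have modularity $$1/2$$: all edges are internal, and each community holds half the degree mass. The paper's result says this kind of number is already visible from a constant-size edge sample.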

# News for September 2021

A nice array of papers for this month, spanning distribution testing, dense graph property testing, sublinear graph algorithms, and various mixtures of them. Let us now sample our spread! (Adding another semistreaming paper that achieved similar results to another posted paper. -Ed)

A Lower Bound on the Complexity of Testing Grained Distributions by Oded Goldreich and Dana Ron (ECCC). A discrete distribution is called $$m$$-grained if all probabilities are integer multiples of $$1/m$$. This paper studies the complexity of testing this property of distributions. For simplicity, consider the property of being $$n/2$$-grained, where the support size is $$n$$. The classic lower bound for testing uniformity shows that $$\Omega(\sqrt{n})$$ samples are required to distinguish the uniform distribution from a distribution uniform on $$n/2$$ elements. Thus, we get a lower bound of $$\Omega(\sqrt{n})$$ for testing $$n/2$$-grainedness (if I am permitted to use that word). This paper proves a lower bound of $$\Omega(n^c)$$, for all constant $$c < 1$$. It is conjectured that the lower bound is actually $$\Omega(n/\log n)$$, which would match the upper bound (for any label-invariant property).

Testing Distributions of Huge Objects by Oded Goldreich and Dana Ron (ECCC). This paper introduces a new model that marries distribution testing with property testing on strings. The “object” of interest is a distribution $$\mathcal{D}$$ over strings of length $$n$$. We wish to test if $$\mathcal{D}$$ possesses some property. The tester can get a random string $$x$$ from the distribution, and can query any desired index of $$x$$. The distance between distributions is defined using the earthmover distance (where we use the Hamming distance between strings). This model is called the DoHO (Distributions of Huge Objects) model. There are many questions posed and connections drawn to classical property testing and distribution testing. What I find interesting is a compelling application: the distribution $$\mathcal{D}$$ may represent noisy or perturbed versions of a single object. The DoHO model gives a natural generalization of standard property testing to noisy objects. This paper considers problems such as testing if $$\mathcal{D}$$ is: a random perturbation of a string, or a random cyclic shift, or a random isomorphism of a graph.

Sublinear Time and Space Algorithms for Correlation Clustering via Sparse-Dense Decompositions by Sepehr Assadi and Chen Wang (arXiv). Correlation clustering is a classic problem where edges in a graph are labeled ‘+’ or ‘-’, denoting whether these edges should be uncut or cut. The aim is to cluster the graph minimizing the total “disagreements” (cut ‘+’ edges or uncut ‘-’ edges). This paper gives an $$O(1)$$-approximation algorithm that runs in $$O(n\log^2n)$$ time; this is the first sublinear time approximation algorithm for this problem. Correlation clustering has seen results from the property testing/sublinear algorithms community, first by Bonchi, Garcia Soriano, and Kutzkov. But previous results were essentially in the dense graph model, giving $$O(\epsilon n^2)$$ error assuming adjacency matrix input. This paper considers access to the adjacency list of ‘+’ edges. Interestingly (from a technical standpoint), the key tool is a new sparse-dense decomposition. Such decompositions emerged from the seminal work of Assadi-Chen-Khanna on sublinear $$(\Delta+1)$$-coloring, and it is great to see applications beyond coloring.

Sublinear-Time Computation in the Presence of Online Erasures by Iden Kalemaj, Sofya Raskhodnikova, and Nithin Varma (arXiv). Can property testing be done when portions of the input are hidden? This question was first raised by Dixit-Raskhodnikova-Thakurta-Varma, who gave a model of erasure-resilient testing. There is an adversary who hides (erases) part of the input function; queries to those parts just yield a dummy symbol. This paper defines an online version of this model, with an erasure parameter $$t$$: after each query by the property tester, the adversary can erase $$t$$ values of the function. Consider the classic property of linearity of functions $$f:\{0,1\}^d \rightarrow \{0,1\}$$. The BLR tester queries triples $$(x, y, x \oplus y)$$. Observe how this tester is easily defeated by our adversary, who simply erases the value $$f(x\oplus y)$$. One of the main results of this paper is a $$O(1/\varepsilon)$$-query tester for linearity that works for any constant erasure parameter $$t$$. Note that this matches the bound for the standard setting. There are a number of results for other classic properties, such as monotonicity (sortedness) and the Lipschitz property.
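For reference, the standard (erasure-free) BLR test that the adversary defeats looks like this (a minimal sketch; the paper's erasure-resilient tester replaces the naive query pattern with a cleverer sampling strategy):

```python
import random

def blr_test(f, d, trials=200, seed=0):
    """Classic BLR linearity test for f: {0,1}^d -> {0,1}, with bitstrings
    encoded as integers. Linear (parity) functions always pass; functions
    far from linear fail with high probability."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randrange(2 ** d)
        y = rng.randrange(2 ** d)
        # an online adversary that erases f(x ^ y) after seeing x, y
        # renders this check useless
        if f(x) ^ f(y) != f(x ^ y):
            return False
    return True
```

A parity function like $$f(x) = x_1 \oplus x_3$$ passes every trial, while AND of two bits fails a constant fraction of trials.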

Sublinear Time Eigenvalue Approximation via Random Sampling by Rajarshi Bhattacharjee, Cameron Musco, and Archan Ray (arXiv). Consider the problem of estimating all the eigenvalues of a real, symmetric $$n \times n$$ matrix $$M$$ with bounded entries, in sublinear time. The main result shows that the eigenvalues of a uniformly random principal submatrix of size $$O(\epsilon^{-4}\log n)$$ can be used to approximate all eigenvalues of $$M$$ up to additive error $$\epsilon n$$. One can think of this as a sort of concentration inequality for eigenvalues. This result follows (and builds upon) work of Bakshi-Chepurko-Jayaram on property testing semidefiniteness. The key idea is that eigenvectors corresponding to large eigenvalues have small infinity norm: intuitively, since all entries in $$M$$ are bounded, such an eigenvector must have its mass spread out among many coordinates. Hence, we can get information about it by randomly sampling a few coordinates. The paper also shows that the approach of taking principal submatrices requires $$\Omega(\epsilon^{-2})$$ columns/rows.
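The flavor of the result is visible on a rank-one example, where the single large eigenvalue belongs to an eigenvector with mass spread over all coordinates (our own toy sketch; the sizes and the $$n/k$$ rescaling follow the natural heuristic, not the paper's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 400, 80                       # ambient size, submatrix size (arbitrary)
v = rng.choice([-1.0, 1.0], size=n)  # spread-out eigenvector, bounded entries
M = np.outer(v, v)                   # symmetric, entries in {-1, 1}, top eigenvalue n

S = rng.choice(n, size=k, replace=False)      # uniform random principal submatrix
sub_eigs = np.linalg.eigvalsh(M[np.ix_(S, S)])
estimates = (n / k) * sub_eigs                # rescale submatrix eigenvalues by n/k

# The submatrix is v_S v_S^T, so its top eigenvalue is exactly k = ||v_S||^2,
# and rescaling recovers the true top eigenvalue n; the rest stay near zero.
```

Of course this example is exact only because the matrix is rank one; the paper's analysis controls the error for general bounded-entry matrices.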

Deterministic Graph Coloring in the Streaming Model by Sepehr Assadi, Andrew Chen, and Glenn Sun (arXiv). This is technically not a sublinear algorithms paper (well ok, streaming is sublinear, but we tend not to cover the streaming literature. Maybe we should? – Ed.) But, I think that the connections are of interest to our community of readers. The main tool of the sublinear $$(\Delta+1)$$-coloring algorithm of Assadi-Chen-Khanna is the palette sparsification lemma ($$\Delta$$ is the maximum degree). This lemma shows that vertices can randomly shorten their “palette” of colors, after which all colorings from these palettes lead to very few monochromatic edges. This is an immensely powerful tool, since one immediately gets sublinear complexity algorithms in many models: adjacency list, streaming, distributed. Is the randomness necessary? Note that these algorithms run in $$\Omega(n)$$ time/space, so it is conceivable that deterministic sublinear algorithms exist. This paper shows that randomization is necessary in the semi-streaming model (space $$O(n \, poly(\log n))$$). Indeed, no deterministic semi-streaming algorithm can achieve a coloring with even $$\exp(\Delta^{o(1)})$$ colors.

Adversarially Robust Coloring for Graph Stream by Amit Chakrabarti, Prantar Ghosh, and Manuel Stoeckl (arXiv). This paper studies the same problem as the above, but presents the results in a different way. For a randomized algorithm, we normally think of an adversary that fixes a (hard) input, after which the algorithm makes its random decisions. An adaptive adversary is one that changes the input (stream) based on the decisions of the algorithm. In this definition, a robust algorithm is one that gives correct answers even on adversarially generated input. A deterministic algorithm is automatically robust. This paper shows that there do not exist robust semi-streaming algorithms that achieve $$(\Delta+1)$$-colorings. The quantitative lower bound is weaker ($$\Omega(\Delta^2)$$ colors), but it holds against a stronger adversary.

# News for June 2021

A quieter month after last month’s bonanza. One (applied!) paper on distribution testing, a paper on tolerant distribution testing, and a compendium of open problems. (Ed: alas, I missed the paper on tolerant distribution testing, authored by one of our editors. Sorry, Clément!)

Learning-based Support Estimation in Sublinear Time by Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, and Tal Wagner (arXiv). A classic problem in distribution testing is that of estimating the support size $$n$$ of an unknown distribution $$\mathcal{D}$$. (Assume that all elements in the support have probability at least $$1/n$$.) A fundamental result of Valiant-Valiant (2011) proves that the sample complexity of this problem is $$\Theta(n/\log n)$$. A line of work has emerged in trying to reduce this complexity with additional sources of information. Canonne-Rubinfeld (2014) showed that, if one can query the exact probabilities of elements, then the complexity can be made independent of $$n$$. This paper studies a robust version of this assumption: suppose we can only get constant factor approximations to the probabilities. Then, the main result is that we can get a query complexity of $$n^{1-1/\log(\varepsilon^{-1})} \ll n/\log n$$ (where the constant $$\varepsilon$$ denotes the additive approximation to the support size). This paper also presents empirical experiments showing that the new algorithm is indeed better in practice. Moreover, it shows that existing methods degrade rapidly with poorer probability estimates, while the new algorithm maintains its accuracy even with such estimates.
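The exact-probability baseline of Canonne-Rubinfeld rests on a one-line importance-sampling identity: $$E_{x \sim \mathcal{D}}[1/p(x)]$$ equals the support size. A sketch of that baseline (names are ours; the paper's contribution is making this robust to approximate probabilities, which this toy code does not attempt):

```python
import random

def estimate_support(sample, prob, s, seed=0):
    """Average of 1/p(x) over s samples x ~ D, where `sample` draws from D
    and `prob` answers exact probability queries. Unbiased for the support
    size, since E[1/p(x)] = sum over the support of p(x) * (1/p(x)) = |support|."""
    rng = random.Random(seed)
    return sum(1.0 / prob(sample(rng)) for _ in range(s)) / s
```

For the uniform distribution on $$\{0,\dots,9\}$$, every term is exactly $$10$$, so the estimate is $$10$$ with zero variance; the variance (and hence sample cost) grows as probabilities become less uniform or less accurately known.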

The Price of Tolerance in Distribution Testing by Clément L. Canonne, Ayush Jain, Gautam Kamath, and Jerry Li (arXiv). While we have seen many results in distribution testing, the subject of tolerance is one that hasn’t received as much attention. Consider the problem of testing if an unknown distribution $$p$$ (over domain $$[n]$$) is the same as a known distribution $$q$$. We wish to distinguish $$\varepsilon_1$$-close from $$\varepsilon_2$$-far, under total variation distance. When $$\varepsilon_1$$ is zero, this is the standard property testing setting, and classic results yield $$\Theta(\sqrt{n})$$ sample complexity. If $$\varepsilon_1 = \varepsilon_2/2$$, then we are looking for a constant factor approximation to the distance, and the complexity is $$\Theta(n/\log n)$$. Surprisingly, nothing better was known. Until this paper, that is. The main result gives a complete characterization of the sample complexity (up to log factors) for all values of $$\varepsilon_1, \varepsilon_2$$. Remarkably, the sample complexity has an additive term $$(n/\log n) \cdot (\varepsilon_1/\varepsilon^2_2)$$. Thus, when $$\varepsilon_1 > \sqrt{\varepsilon_2}$$, the sample complexity is $$\Theta(n/\log n)$$. When $$\varepsilon_1$$ is smaller, the main result shows how the sample complexity smoothly interpolates between the two regimes. One of the main challenges is that existing results use two very different techniques for the property testing vs constant-factor approximation regimes: the former uses simpler $$\ell_2$$-statistics (e.g. collision counting), while the latter is based on polynomial approximations (estimating moments). The upper bound in this paper shows that simple statistics based on just the first two moments suffice to get results for all regimes of $$\varepsilon_1, \varepsilon_2$$.
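For the curious, the simplest of those $$\ell_2$$-style statistics is collision counting: the fraction of colliding sample pairs is an unbiased estimate of $$\|p\|_2^2$$, which equals $$1/n$$ exactly when $$p$$ is uniform (a standard textbook statistic, sketched here; it is not the paper's actual estimator):

```python
from itertools import combinations

def collision_statistic(samples):
    """Fraction of pairs (i, j), i < j, with samples[i] == samples[j].
    Its expectation is sum_x p(x)^2 = ||p||_2^2; an excess over 1/n
    witnesses distance from the uniform distribution on [n]."""
    s = len(samples)
    collisions = sum(1 for a, b in combinations(samples, 2) if a == b)
    return collisions / (s * (s - 1) / 2)
```

For example, the samples $$(1,1,2,2)$$ have $$2$$ collisions among $$6$$ pairs, giving statistic $$1/3$$, far above the $$1/n$$ one would see from a uniform distribution over a large domain.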

Open Problems in Property Testing of Graphs by Oded Goldreich (ECCC). As the title clearly states, this is a survey covering a number of open problems in graph property testing. The broad division is based on the query model: dense graphs, bounded degree graphs, and general graphs. A reader will see statements of various classic open problems, such as the complexity of testing triangle freeness for dense graphs and characterizing properties that can be tested in $$poly(\varepsilon^{-1})$$ queries. Arguably, there are more open problems (and fewer results) for testing in bounded degree graphs, where we lack broad characterizations of testable properties. An important, though less famous (?), open problem is that of the complexity of testing isomorphism. It would appear that the setting of general graphs, where we know the least, may be the next frontier for graph property testing. A problem that really caught my eye: can we transform testers that work for bounded degree graphs into those that work for bounded arboricity graphs? The latter is a generalization of bounded degree that has appeared in a number of recent results on sublinear graph algorithms.

# News for March 2021

A somewhat quieter month by recent standards. ~~Three~~ Two papers: graph property testing and quantum distribution testing. (Ed: The distribution testing paper was a revision of a paper we already covered in Sept 2020.)

Robust Self-Ordering versus Local Self-Ordering by Oded Goldreich (ECCC). In Nov 2020, we covered a paper that uses a tool called self-ordered graphs, which transfers bit-string function lower bounds to graph property testing. Consider a labeled graph. A graph is self-ordered if its automorphism group only contains the identity element (it has no non-trivial automorphisms). A graph is robustly self-ordered if every permutation of the vertices leads to a (labeled) graph that is sufficiently “far” according to edit distance. Given a self-ordered graph $$G$$, a local self-ordering procedure is the following: given access to a copy $$G'$$ of $$G$$ and a vertex $$v \in V(G')$$, the procedure determines the (unique) vertex in $$V(G)$$ that corresponds to $$v$$ with sublinear queries to $$G$$. In other words, it can locally “label” the graph. Intuitively, one would think that more robustly self-ordered graphs are easier to locally label. This paper studies the relation between robust and local self-ordering. Curiously, it refutes the above intuition for bounded-degree graphs, and (weakly) confirms it for dense graphs. Roughly speaking, there are bounded-degree graphs that are highly robustly self-ordered, for which any local self-ordering procedure requires $$\omega(\sqrt{n})$$ queries. Moreover, there are bounded-degree graphs with $$O(\log n)$$-query local self-ordering procedures that are nonetheless not robustly self-ordered, even for weak parameters. For dense graphs, the existence of fast non-adaptive local self-ordering procedures implies robust self-ordering.

Testing identity of collections of quantum states: sample complexity analysis by Marco Fanizza, Raffaele Salvia, and Vittorio Giovannetti (arXiv). This paper takes identity testing to the quantum setting. One should think of a $$d$$-dimensional quantum state as a $$d \times d$$ density matrix (with some special properties). Learning the state entirely up to error $$\varepsilon$$ would require $$O(\varepsilon^{-2} d^2)$$ samples/measurements. A recent result of Badescu-O’Donnell-Wright proves that identity testing to a known state can be done significantly faster, using $$O(\varepsilon^{-2} d)$$ measurements. This paper takes the result a step further by considering a collection of $$N$$ quantum states. A “sample” is like a classical sample, where one gets a draw from a distribution over quantum states. The YES (“uniform”) case is when all the states are identical. The NO (“far from uniform”) case is when they are “far” from all being the same state. This paper proves that $$O(\varepsilon^{-2}\sqrt{N}d)$$ samples suffice for distinguishing these cases.

# News for December 2020

Happy 2021! We now cover the last papers of 2020: a (smaller) spread with graph property testing, distribution property testing, and circuit complexity.

Almost Optimal Bounds for Sublinear-Time Sampling of k-Cliques: Sampling Cliques is Harder Than Counting by Talya Eden, Dana Ron, and Will Rosenbaum (arXiv). In many settings, sampling and counting are tightly coupled; the classic example being permanent estimation and sampling random bipartite matchings. This paper investigates the problem of sampling uniform random cliques, and unearths an interesting separation between approximately uniform sampling and approximate counting. Let’s consider the fundamental problems of edge and triangle counting in graphs of constant arboricity. This class, which contains all minor-closed families, has deep connections to clique counting: clique counting can be done in linear time for bounded arboricity graphs. The input connected graph $$G$$ has $$n$$ vertices, $$m$$ edges, and $$t$$ triangles. We assume the standard query model, with uniform random vertex, neighbor, and degree queries. Our aim is to get $$(1+\varepsilon)$$-approximations for $$m$$ and $$t$$ (in general, any $$k$$-clique count). It is known from previous work that edge estimation can be done in $$O(1)$$ time and triangle estimation can be done in $$O(n/t)$$ time (ignoring log and $$\varepsilon$$ dependencies). Moreover, these bounds are optimal. Can one get similar bounds for approximately sampling a uniform random edge/triangle? Previous work of Eden-Rosenbaum achieves such a bound for sampling edges. Surprisingly, this paper shows that triangle sampling is fundamentally harder, and requires $$\Omega(n^{3/4}/\sqrt{t})$$ queries. The main result gives a precise and optimal bound for uniformly sampling $$k$$-cliques in graphs of arboricity $$\alpha$$.

Testing and reconstruction via decision trees by Guy Blanc, Jane Lange, and Li-Yang Tan (arXiv). A property tester is a (sublinear) algorithm that distinguishes an object with a property from those that are far. Reconstruction is a much stronger sublinear primitive, which actually provides query access to the closest object. Suppose we have query access to an $$n$$-variate Boolean function $$f$$, and the property $$\mathcal{P}$$ of interest is size-$$s$$ decision trees. A reconstructor would give query access to a function $$g \in \mathcal{P}$$, such that $$dist(f,g) = O(OPT_f)$$, where $$OPT_f$$ is the distance of $$f$$ to $$\mathcal{P}$$. Observe that a reconstructor automatically gives a distance estimator (just estimate $$dist(f,g)$$ by random sampling). One of the main results of this paper is a (weaker, bicriteria) reconstructor, where $$g$$ is guaranteed to have decision-tree complexity $$s^{O(\log^2s)}$$. Each query to $$g$$ is computed in $$poly(\log s, 1/\varepsilon)\cdot n$$ time. This reconstructor leads to a number of testing results, and to reconstructors for other properties like low Fourier degree. Another interesting result proven: if there exists an efficient reconstructor that gets $$g$$ of size $$s$$, then there exists an efficient proper learner for decision trees (a big open problem).

Testing Product Distributions: A Closer Look by Arnab Bhattacharyya, Sutanu Gayen, Saravanan Kandasamy, and N. V. Vinodchandran (arXiv). This paper is on distribution testing, where all distributions involved are promised to be product distributions. Let $$P, Q$$ be distributions over $$\Sigma^n$$, where $$\Sigma$$ is some (small) finite set. The problems studied are identity testing ($$Q$$ is known/fixed, and one gets samples from unknown $$P$$) and closeness testing (both unknown). It is known from previous work that these problems can basically be solved with $$O(n|\Sigma|\varepsilon^{-2})$$ samples. Note that the domain size is exponential in $$n$$; thus, these bounds give an exponential improvement over identity/closeness testing in the vanilla (non-product) setting. Tolerant testing is of special significance here. Because the domain is so large, it makes sense to allow for some distance even in the “yes” case. The main result of this paper is to completely map out the complexities for identity/closeness testing where we wish to distinguish $$d_1(P,Q) < \varepsilon_1$$ from $$d_2(P,Q) > \varepsilon_2$$, where $$d_1, d_2$$ are potentially different distance measures. For example, the paper considers total variation (TV), KL, Hellinger, and $$\chi^2$$ distances. An important improvement achieved is for identity testing where $$d_1$$ is Hellinger and $$d_2$$ is TV distance: for certain values of $$\varepsilon_1, \varepsilon_2$$, one can get a tester of optimal sample complexity $$O(\sqrt{n|\Sigma|})$$. Previous bounds only gave a $$O(n|\Sigma|)$$ sample complexity. There are a number of new lower bounds for different combinations of $$d_1, d_2$$ distance pairs.