Author Archives: Seshadhri

News for August 2022

A eerily quiet month, this August of ’22. We only found a single paper to cover!

Training Overparametrized Neural Networks in Sublinear Time by Hang Hu, Zhao Song, Omri Weinstein, Danyang Zhuo (arXiv). Think of a classification problem where the inputs are in \(\mathbb{R}^d\). We have \(n\) such points (with their true labels, as training data) and wish to train a Neural Network. A two layer Rectified Linear Unit (ReLU) Neural Network (NN) works as follows. The first layer has \(m\) vertices, where each vertex has vector weight \(\vec{w}_i \in \mathbb{R}^d\). The second “hidden layer” has \(m\) vertices, each with a scalar weight \(a_1, a_2, \ldots, a_m\). This network is called overparametrized when \(m \gg n\). The output of this NN on input vector \(\vec{x}\) is (up to scaling) \(\sum_{i \leq m} a_i \phi(\vec{w_i} \cdot \vec{x})\) (where \(\phi\) is a thresholded linear function). Observe that to compute the value on a single input takes \(O(md)\) time, so the total time to compute all values on \(n\) training inputs takes \(O(mnd)\) time. The training is done by gradient descent methods; given a particular setting of weights, we compute the total loss, and then modify the weights along the gradient. Previous work showed how a single iteration can be done in time \(O(mnd + n^3)\). When \(m \gg n^2\), this can be thought of as linear in computing the loss function (which requires evaluating the NN on all the \(n\) points). This paper shows how to implement a single iteration in \(O(m^{1-\alpha}nd + n^3)\) time, for some \(\alpha > 0\). Hence, the time for an iteration is sublinear in the trivial computation. The techniques used are sparse recovery methods and random projections.

News for April 2022

We have…I don’t know, I’ve lost count of the number of papers this month. It’s a big bonanza. Sublinear algorithms for edit distance, planar graphs, distributions, bipartite graphs, groups, error correcting codes, Bayesian nets, polynomials…

Let’s proceed with the spread.

Improved Sublinear-Time Edit Distance for Preprocessed Strings by Karl Bringmann, Alejandro Cassis, Nick Fischer, and Vasileios Nakos (arXiv). The edit distance between strings is a classic and important problem in algorithms. You might recall that classic \(O(n^2)\) algorithm to compute the edit distance between strings of length \(n\). It has been show that getting a \(O(n^{2-\delta})\) time algorithm is SETH-hard. But what can be done in sublinear time? This paper considers the preprocessed version: suppose we can perform near-linear preprocessing on the strings. We now want to distinguish between edit distance between \(\leq k\) and \(\geq k\cdot n^{o(1)}\). This paper shows that with near-linear preprocessing on strings, one can solve this problem in \(k \cdot n^{o(1)}\) time.

Optimal Closeness Testing of Discrete Distributions Made Complex Simple by (our own) Clément L. Canonne and Yucheng Sun (arXiv). Given two distributions \(p, q\) over support \([k]\), the aim is to distinguish between (i) the distributions being equal, and (ii) the total variation distance between \(p, q\) being at least \(\epsilon\). The tester should has a failure probability of at most \(\delta\). A recent work nails down the sample complexity with respect to all parameters. This paper gives a simpler proof of the main result. Earlier proofs used Poissonization tricks and fairly clever arguments about Poisson random variables. This proof is much more transparent, and uses an identity that relates the expectation of a random variable to its characteristic function. A nice feature of this proof is that it works directly with the multinomial distribution, which means a fixed number of samples (rather than choosing the number of samples from a distribution).

Tolerant Bipartiteness Testing in Dense Graphs by Arijit Ghosh, Gopinath Mishra, Rahul Raychaudhury, and Sayantan Sen (arXiv). Testing bipartiteness of dense graphs is about as a classic as it gets. We wish to distinguish a bipartite graph from one that requires \(\varepsilon n^2\) edge removals to make it bipartite. Readers of this blog should know that there is a \(\widetilde{O}(\varepsilon^{-2})\)-query property tester for this problem. (Ok, so now you know.) This paper studies the tolerant version of bipartiteness testing. Note that this is equivalent to approximating the maxcut, up to additive error \(\varepsilon n^2\). Classic approximation algorithms show that the latter can be done in \(\widetilde{O}(\varepsilon^{-6})\) queries and \(\exp(\widetilde{O}(\varepsilon^{-2}))\) time. This paper considers the easier problem of distinguishing whether the distance to bipartiteness is at most \(\varepsilon\) or at least \(2 \varepsilon\). This problem is solved in \(\widetilde{O}(\varepsilon^{-3})\) queries and \(\exp(\widetilde{O}(\varepsilon^{-1}))\).

Properly learning monotone functions via local reconstruction by Jane Lange, Ronitt Rubinfeld, Arsen Vasilyan (arXiv). Ah yes, monotone functions. An ongoing love (obsession? interest?) for property testing people. This paper studies the problem of proper learning of Boolean valued monotone functions over the Boolean hypercube. Given access to uniform random evaluations of a monotone function \(f:\{0,1\}^n \to \{0,1\}\), we wish to compute a monotone function \(g\) that approximates the original function. Classic results from Fourier analysis show that an approximation can be learned using \(\exp(\sqrt{n}/\varepsilon)\) queries. But this approximation function might not be monotone, and only yields improper learning. This paper gives a proper learner that outputs a monotone approximation, in roughly the same query complexity. This result directly gives a constant tolerance monotonicity tester for Boolean functions. The paper uses recent results from distributed algorithms and local computation. It also leads to tolerant testers for monotonicity over posets with small diameter.

Massively Parallel Computation and Sublinear-Time Algorithms for Embedded Planar Graphs by Jacob Holm and Jakub Tětek (arXiv). Sublinear algorithms for planar graphs is another ongoing love (at least for me). This paper considers a new take of this problem: suppose we have access to a geometric embedding of a planar graph \(G\). Can we get sublinear algorithms for a variety of problems? This paper first shows how to construct a convenient decomposition, called an \(r\)-division, in sublinear time. This division can be used to approximate Lipschitz graph parameters, such as maximum matching sizes, maximum independent set, etc. The paper also shows how to compute an \(r\)-division in the MPC model, which solves many classic graph problems (connected components, matchings, etc.) in \(O(1)\) rounds. There is a (conditional) lower bound showing that, without an embedding, it is not possible to solve such problems in \(O(1)\) rounds (and sublinear space per processor).

Independence Testing for Bounded Degree Bayesian Network by Arnab Bhattacharyya, Clément L. Canonne (again, our own), and Joy Qiping Yang (arXiv). Given a distribution \(\mathcal{P}\) on the Boolean hypercube \(\{0,1\}^n\), the problem is to determine whether \(\mathcal{P}\) is a product distribution. In general, this problem requires \(\Omega(2^n)\) samples. Suppose \(\mathcal{P}\) has a sparse, “efficient” description. Can we do better? This paper shows that when \(\mathcal{P}\) is generated by a Bayesian network (with bounded indegree), then the independence testing problem can be solved with a \(\widetilde{O}(n/\varepsilon^2)\) samples. Think of a Bayesian network as a DAG, where each vertex generates a Bernoulli random variable. The variable at a vertex depends only the outcomes at its neighborhood.

Low Degree Testing over the Reals by Vipul Arora, Arnab Bhattacharyya, Noah Fleming, Esty Kelman, and Yuichi Yoshida (arXiv, ECCC). The problem testing low degree polynomials goes back to the birth of property testing. This paper studies real valued polynomials, in the distribution free setting. Formally, we have query access to a function \(f: \mathbb{R}^d \to \mathbb{R}\). The distance is measured with respect to an unknown distribution \(\mathcal{D}\) over the domain. This paper shows that the real low degree testing problem can be solved in \(poly(d\varepsilon^{-1})\) queries (under some reasonableness conditions on the distribution). The approach is go to via the “self-correct and test” approach: try to compute a low degree polynomial that fits some sampled data, and then check how far the self-corrected version is from another sample.

Testing distributional assumptions of learning algorithms by Ronitt Rubinfeld and Arsen Vasilyan (arXiv). Consider the problem of learning a halfspace over \(\mathbb{R}^n\). If the underlying distribution is Gaussian, then this class can be learned in \(n^{poly(\varepsilon^{-1})}\) samples. If the distribution is arbitrary, no \(2^{o(n)}\) algorithm is known despite much research. This paper introduces the notion of having a tester-learner pair. The tester first checks if the input distribution is “well-behaved” (Gaussian-like). If the tester passes, then we run the learner. Indeed, this perspective goes back to some of the original motivations for property testing (when is testing faster than learning). The intriguing aspect of this problem is that we do not have efficient testers for determining if an input distribution is Gaussian. This paper circumvents that problem by estimating certain moments of the distribution. If these moments agree with the moments of a Gaussian, then the learner is guaranteed to succeed. We get the best of both worlds: if the input distribution is Gaussian, the learning is done correctly. If the learner succeeds, then then output (hypothesis) is guaranteed to be correct, regardless of the input distribution.

Testability in group theory by Oren Becker, Alexander Lubotzky, and Jonathan Mosheiff (arXiv). This paper is the journal version of a result of the authors, and it gives a group theoretic presentation of a property testing result. Consider the following problem. The input is a pair permutations \((\sigma_1, \sigma_2)\) over \([n]\). The aim is to test whether they commute: \(\sigma_1 \sigma_2 = \sigma_2 \sigma_1\). Another result of the authors gives a tester that makes \(O(\varepsilon^{-1})\) queries. They refer to this problem as “testing the relation” \(XY = YX\). This paper gives a grand generalization of that result, best explained by another example. Consider another relation/property denoted \(\{XZ = ZX, YZ = ZY\}\). This property consists of all triples of permutations \((\sigma_1, \sigma_2, \sigma_3)\), where \(\sigma_3\) commutes with the other two. A consequence of the main theorem is that this property is not testable with query complexity independent of \(n\). The main result of this paper is a characterization of testable relations, which goes via studying the expansion of an infinite graph associated with the relation.

Testing Positive Semidefiniteness Using Linear Measurements by Deanna Needell, William Swartworth, and David P. Woodruff (arXiv). The input is a \(d \times d\) real, symmetric matrix \(M\) and we wish to determine if it is positive semidefinite (all eigenvalues are positive). For the testing problem, we reject when the minimum eigenvalue is at most \(-\varepsilon \|M\|_2\). (The paper also considers general Schatten \(p\)-norms.) This paper gives a list of results for non-adaptive vs adaptive, and one-sided vs two-sided testers. There are two access models considered: a single query consists of either a (i) matrix-vector product \(Mx\) or (ii) vector-matrix-vector product \(y^TMx\). Typical models that query entries of the matrix require strong bounds on the entries, which is less reasonable in practical situations. An interesting discovery is that the non-adaptive, one-sided complexity is \(\Theta(\sqrt{d}\varepsilon^{-1})\) while the two-sided bound is independent of \(d\).

Relaxed Locally Decodable and Correctable Codes: Beyond Tensoring by Gil Cohen and Tal Yankovitz (ECCC). Locally decodable and correctable codes are a fundamental object of study in property testing (and TCS in general). Consider a locally correctable code (LCC). Given a string \(x\), the decoder/corrector makes \(q\) queries to \(x\), and outputs a symbol. We can think of the output collectively as a string \(y\). If \(x\) is a codeword, then \(y = x\). Otherwise, \(dist(y,z) \leq \varepsilon\), where \(z\) is some codeword close to \(x\). In the relaxed version, the corrector is allowed to output \(\bot\), denoting that it has discovered corruption. The distance is only measured in the coordinates where the corrector does not output \(\bot\). Thus, the corrector gets a “free pass” if it outputs \(\bot\). But note that when \(x\) is a codeword, the output must be exactly \(x\). This paper gives a Relaxed LCC with query complexity \((\log n)^{O(\log\log\log n)}\), a significant improvement over the previous best \((\log n)^{O(\log\log n)}\). It is know from previous work that the query complexity must be \(\Omega(\sqrt{\log n})\).

Verifying The Unseen: Interactive Proofs for Label-Invariant Distribution Properties by Tal Herman and Guy Rothblum (arXiv). This paper considers the distribution testing problem in the context of interactive proofs. The verifier, who wishes to test a property of a distribution \(\mathcal{P}\), interacts with a prover who knows the distribution. The guarantee required is the standard one for interactive proof systems: in the YES case, an honest prover should be able to convince the verifier. In the NO case, no prover can convince the verifier with high probability. There are two important parameters of interest: the sample complexity of the verifier, and the communication complexity of the messages. It is useful to consider the two extremes. In one extreme, the verifier can simply solve the problem herself, ignoring the prover. This could require \(\Theta(n/\log n)\) queries (for the hardest properties like entropy and support size). Another extreme is for the honest prover to simply send an approximate description of the distribution, which takes \(O(n)\) bits. The prover can just test equality to the prover message, which only takes \(\Theta(\sqrt{n})\) queries. This paper shows a 2-round protocol for any (label-invariant) property where both the communication and the sample complexity can be made \(\Theta(\sqrt{n})\). This result shows the power of interaction for distribution testing problems.

News for December 2021

Happy 2022 everyone! We now cover the last papers of 2021. Our readers have pointed out a number of papers from previous months that we missed (sorry). To adequately cover (and publicize) these papers, we’ve decided to just incorporate those papers on this post rather than update previous posts.

We have a breakthrough on Locally Testable Codes, sublinear algorithms for testing pattern freeness, new thoughts on convexity testing, and a perspective on sampling community structure. And here’s the final list for 2021.

Locally Testable Codes with constant rate, distance, and locality by Irit Dinur, Shai Evra, Ron Livne, Alexander Lubotzky, and Shahar Mozes (ECCC, posted Nov 2021). A Locally Testable Code (LTC) is one where the set of codewords has a property tester. The study of LTCs started with the PCP theorem, but blossomed into a deep research agenda. The holy grail is a \(c^3\)-LTC: one with constant rate, constant distance, and constant query property tester. Just to recall, the rate \(r \in (0,1)\) is the ratio of the message length to the encoding length. The distance \(\delta \in (0,1)\) is the (normalized) Hamming distance between codewords (we typically expect this to be constant for an error correcting code). And, most familiar to us, \(q\) is the number of queries required to property test codewords. The aim is to get all these parameters to be constants independent of the message length. The classic linearity test of BLR basically proves that the Hadamard code is 3-query testable. But the Hadamard code has poor (exponentially small) rate. Low-degree tests prove that Reed-Muller codes are constant-query testable, but they have inverse polynomial rate. Further work constructed LTCs with inverse polylogarithmic rate, meaning that the encoding has length \(n\cdot poly(\log n)\) for message length \(n\). There was even some belief that any LTC must have a rate going to zero. This paper invalidates that belief, and gives a truly \(c^3\)-construction. The construction is based on a 2-dimensional simplicial complex based on Cayley graphs. This is basically a graph with hyperedges corresponding to “squares” of vertices. Concurrent and independent work of Breuckmann-Ebedhardt and Panteleev-Kalachev also have similar constructions, though their perspective is of quantum codes.

On the Locally Testable Code of Dinur et al. (2021) by Oded Goldreich (ECCC). As the title says, this short paper gives a high level discussion of the previous result. It abstracts out some of the graph theoretic aspects, to explain connections to more combinatorial constructions. The construction and analysis is based on expander codes. These codes (e.g. low-density parity check, LDPC codes) have constant rate and distance, but we do not know how to test them. A string far from a codeword might only fail on (say) one parity check constraint. Interpreting the Dinur et al as an expander code, we essentially get a collection of correlated parity constraints. Hence, a string that is far from a codeword violates many constraints, leading to a tester. (Of course, this is an extremely watered down version of the analysis idea.) The construction takes two Cayley graphs on a non-Abelian group (which form the vertices). Specific 4-cycles in this graph (together with a constant sized tensor code) are used to make the final codewords.

Strongly Sublinear Algorithms for Testing Pattern Freeness by Ilan Newman and Nithin Varma (arXiv, posted Jun 2021). Pattern freeness is a major generalization of monotonicity testing. The input is (query access to) a function \(f: [n] \mapsto \mathbb{R}\). Let \(\pi: [k] \mapsto [k]\) be a permutation. The input is \(pi\)-free if no restriction of \(f\) to \(k\) elements induces the permutation \(pi\). Newman-Rabinovich-Rajendraprasad-Sohler introduced this problem. When \(k=2\), this property is precisely monotonicity, for which (many) \(O(\varepsilon^{-1} \log n)\) property testers are known. Is such a bound achievable for \(k \geq 3\)? Interestingly, for non-adaptive algorithms, an optimal bound of \(\Theta(n^{1-1/(k-1)})\) (ignoring, \(\varepsilon, k\) dependencies) was achieved by Ben-Eliezer and Canonne. This papers shows that there is an adaptive algorithm that makes \(n^{o(1)}\) queries, thereby proving a separation between adaptive and non-adaptive algorithms. A key idea is to visualize the input as a set of points in the plane, and construct a coarse grid that partitions the points. There is a highly non-trivial case analysis on top of this that leads to a \(O(\sqrt{n})\)-query algorithm (this already beats the non-adaptive lower bound).

Parameterized Convexity Testing by Abhiruk Lahiri, Ilan Newman, and Nithin Varma (arXiv, posted Oct 2021). Given a function \(f:[n] \mapsto \mathbb{R}\), the property of interest is (discrete) convexity. The discrete derivatives should be monotonically non-decreasing. It is known that there is a \(O(\varepsilon^{-1}\log n)\) query non-adaptive tester for this property. This paper studies this problem from the perspective of parameterized property testing, where the range is restricted. The main result is that if the number of distinct discrete derivates is at most \(s\), there is a \(O(\varepsilon^{-1} \log s)\) query non-adaptive tester. Moreover, this bound is optimal even for adaptive algorithms. This picture parallels that of monotonicity. Interestingly, when there are only two distinct discrete derivatives, the tester can be made deterministic.

Modularity and edge sampling by Colin McDiarmid and Fiona Skerman (arXiv). Not your typical sublinear algorithms paper, but certainly of interest to us. The main problem is to understand when a random samples of edges in a graph can reveal information about the “community structure”. The modularity of a graph is a measure of this structure. Essentially, a partition has low modularity if the number of edges crossing the partition is close to that of a random graph with the same degree distribution. A partition of high modularity has significant fewer edges, and is thus a valid “clustering” of the graph. The modularity of the graph is obtained by maximizing over partitions. (The normalization of modularity is chosen so that the partition generally has a constant number of sets.) This paper proves a concentration inequality for the modularity of a constant sample of uniform random edges. The modularity of a sample of \(\Theta(\varepsilon^{-5})\) edges is an additive \(\varepsilon\)-approximation of the overall modularity, whp.

News for September 2021

A nice array of papers for this month, ranging from distribution testing, dense graph property testing, sublinear graph algorithms, and various mixtures of them. Let us now sample our spread! (Adding another semistreaming paper who achieved similar results to another posted paper. -Ed)

A Lower Bound on the Complexity of Testing Grained Distributions by Oded Goldreich and Dana Ron (ECCC). A discrete distribution is called \(m\)-grained if all probabilities are integer multiples of \(1/m\). This paper studies the complexity of testing this property of distributions. For simplicity, consider the property of being \(n/2\)-grained, where the support size is \(n\). The classic lower bound for testing uniformity shows that \(\Omega(\sqrt{n})\) samples are required to distinguish the uniform distribution from a distribution uniform on \(n/2\) elements. Thus, we get a lower bound of \(\Omega(\sqrt{n})\) for testing \(n/2\)-grainedness (if I am permitted to use that word). This paper proves a lower bound of \(\Omega(n^c)\), for all constant \(c < 1\). It is conjectured that the lower bound is actually \(\Omega(n/\log n)\), which would match the upper bound (for any label-invariant property).

Testing Distributions of Huge Objects by Oded Goldreich and Dana Ron (ECCC). This paper introduced a new model that marries distribution testing with property testing on strings. The “object” of interest is a distribution \(\mathcal{D}\) over strings of length \(n\). We wish to test if \(\mathcal{D}\) possesses some property. The tester can get a random string \(x\) from the distribution, and can query any desired index of \(x\). The distance between distributions is defined using the earthmover distance (where we use the Hamming distance between strings). This model is called the DoHO (Distributions of Huge Objects) model. There are many questions posed and connections drawn to classical property testing and distribution testing. What I find interesting is a compelling application: the distribution \(\mathcal{D}\) may represent noisy or perturbed versions of a single object. The DoHO model gives a natural generalization of standard property testing to noisy objects. This paper considers problems such as testing if \(\mathcal{D}\) is: a random perturbation of a string, or a random cyclic shift, or a random isomorphism of a graph.

Sublinear Time and Space Algorithms for Correlation Clustering via Sparse-Dense Decompositions by Sepehr Assadi and Chen Wang (arXiv). Correlation clustering is a classic problem where edges in a graph are labeled ‘+’ or ‘-‘, denoting whether these edges should be uncut or cut. The aim is to cluster the graph minimizing the total “disagreements” (cut ‘+’ edges or uncut ‘-‘ edges). This paper gives an \(O(1)\)-approximation algorithm that runs in \(O(n\log^2n)\) time; this is the first sublinear time approximation algorithm for this problem. Correlation clustering has seen results for the property testing/sublinear algorithms community, first by Bonchi, Garcia Soriano, and Kutzkov. But previous results were essentially on the dense graph model, giving \(O(\epsilon n^2)\) error assuming adjacency matrix input. This paper considers access to the adjacency list of ‘+’ edges. Interestingly (from a technical standpoint), the key tool is a new sparse-dense decomposition. Such decompositions emerged from the seminal work of Assadi-Chen-Khanna for sublinear \((\Delta+1)\)-colorings, and it is great to see applications beyond coloring.

Sublinear-Time Computation in the Presence of Online Erasures by Iden Kalemaj, Sofya Raskhodnikova, and Nithin Varma (arXiv). Can property testing be done when portions of the input are hidden? This question was first raised by Dixit-Raskhodnikova-Thakurta-Varma, who gave a model of erasure-resilient testing. There is an adversary who hides (erases) part of the input function; queries to those parts just yield a dummy symbol. This paper defines an online version of this model. There is an erasure parameter \(t\). On each query by the property tester, the adversary can erase \(t\) values of the function. Consider the property of classic linearity of functions \(f:\{0,1\}^d \rightarrow \{0,1\}\). The BLR tester queries triples of pairs \((x,y, x \oplus y)\). Observe how this tester is easily defeated by our adversary, by erasing the value \(f(x\oplus y)\). One of the main results of this paper is a \(O(1/\varepsilon)\)-query tester for linearity, that works for any constant erasure parameter \(t\). Note that this matches the bound for the standard setting. There are a number of results for other classic properties, such as monotonicity (sortedness) and Lipschitz.

Sublinear Time Eigenvalue Approximation via Random Sampling by Rajarshi Bhattacharjee, Cameron Musco, and Archan Ray (arXiv). Consider the problem of estimating all the eigenvalues of a real, symmetric \(n \times n\) matrix \(M\) with bounded entries, in sublinear time. The main result shows that the eigenvalues of a uniform random \(O(\epsilon^{-4}\log n)\) principal submatrix can be used to approximate all eigenvalues of \(M\) up to additive error \(\epsilon n\). One can think of this as a sort of concentration inequality for eigenvalues. This result follows (and builds upon) work of Bakshi-Chepurko-Jayaram on property testing semidefiniteness. The key idea is that eigenvectors corresponding to large eigenvalues have small infinity norm: intuitively, since all entries in \(M\) are bounded, such an eigenvector must have its mass spread out among many coordinates. Hence, we can get information about it by randomly sampling a few coordinates. The paper also shows that approach of taking principal submatrices requires taking \(\Omega(\epsilon^{-2})\) columns/rows.

Deterministic Graph Coloring in the Streaming Model by Sepehr Assadi, Andrew Chen, and Glenn Sun (arXiv). This is technically not a sublinear algorithms paper (well ok, streaming is sublinear, but we tend not to cover the streaming literature. Maybe we should? – Ed.) But, I think that the connections are of interest to our community of readers. The main tool of sublinear \((\Delta+1)\)-coloring algorithm of Assadi-Chen-Khanna is the palette sparsification lemma (\(\Delta\) is the maximum degree). This lemma shows that vertices can randomly shorten their ”palette” of colors, after which all colorings from these palettes lead to very few monochromatic edges. This is an immensely powerful tool, since one can get immediately sublinear complexity algorithms in many models: adjacency list, streaming, distributed. Is the randomness necessary? Note that these algorithms run in \(\Omega(n)\) time/space, so it is conceivable that deterministic sublinear algorithms exists. This paper shows that randomization is necessary in the semi-streaming model (space \(O(n poly(\log n))\)). Indeed, there exist no deterministic semi-streaming algorithms that can achieve even \(\exp(\Delta^{o(1)})\) colorings.

Adversarially Robust Coloring for Graph Stream by Amit Chakrabarti, Prantar Ghosh, and Manuel Stoeckl (arXiv). This paper studies the same problem as the above, but presents the results in a different way. In a randomized algorithm, we normally think of an adversary that fixes a (hard) input, and the algorithm then makes its random decisions. An adaptive adversary is one that changes the input (stream) based on the decisions of an algorithm. In this definition, a robust algorithm is one that can give correct answers, even for adversarially generated output. A deterministic algorithm is automatically robust. This paper show that there do not exist semi-streaming algorithms that can achieve \((\Delta+1)\)-colorings. The quantitative lower bound is weaker (\(\Omega(\Delta^2)\) colors), but it is against a stronger adversary.

News for June 2021

A quieter month after last month’s bonanza. One (applied!) paper on distribution testing, a paper on tolerant distribution testing, and a compendium of open problems. (Ed: alas, I missed the paper on tolerant distribution testing, authored by one of our editors. Sorry, Clément!)

Learning-based Support Estimation in Sublinear Time by Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, and Tal Wagner (arXiv). A classic problem in distribution testing is that of estimating the support size \(n\) of an unknown distribution \(\mathcal{D}\). (Assume that all elements in the support have probability at least \(1/n\).) A fundamental result of Valiant-Valiant (2011) proves that the sample complexity of this problem is \(\Theta(n/\log n)\). A line of work has emerged in trying to reduce this complexity, with additional sources of information. Canonne-Rubinfeld (2014) showed that, if one can query the exact probabilities of elements, then the complexity can be made independent of \(n\). This paper studies a robust version of this assumption: suppose, we can get constant factor approximations to the probabilities. Then, the main result is that we can get a query complexity of \(n^{1-1/\log(\varepsilon^{-1})} \ll n/\log n\) (where the constant \(\varepsilon\) denotes the additive approximation to the support size). This paper also does empirical experiments to show that the new algorithm is indeed better in practice. Moreover, it shows that existing methods degraded rapidly with poorer probability estimates, while the new algorithm maintains its accuracy even with such estimates.

The Price of Tolerance in Distribution Testing by Clément L. Canonne, Ayush Jain, Gautam Kamath, and Jerry Li (arXiv). While we have seen many results in distribution testing, the subject of tolerance is one that hasn’t received as much attention. Consider the problem of testing if unknown distribution \(\mathcal{p}\) (over domain \([n]\)) is the same as known distribution \(\mathcal{q}\). We wish to distinguish \(\varepsilon_1\)-close from \(\varepsilon_2\)-far, under total variation distance. When \(\varepsilon_1\) is zero, this is the standard property testing setting, and classic results yield \(\Theta(\sqrt{n})\) sample complexity. If \(\varepsilon_1 = \varepsilon_2/2\), then we are looking for a constant factor approximation to the distance. And the complexity is \(\Theta(n/\log n)\). Surprisingly, nothing was known in better. Until this paper, that is. The main result gives a complete characterization of sample complexity (up to log factors), for all values of \(\varepsilon_1, \varepsilon_2\). Remarkably, the sample complexity has an additive term \((n/\log n) \cdot (\varepsilon_1/\varepsilon^2_2)\). Thus, when \(\varepsilon_1 > \sqrt{\varepsilon_2}\), the sample complexity is \(\Theta(n/\log n)\). When \(\varepsilon_1\) is smaller, the main result gives a smooth dependence on the sample complexity. One the main challenges is that existing results use two very different techniques for the property testing vs constant-factor approximation regimes. The former uses simpler \(\ell_2\)-statistics (e.g. collision counting), while the latter is based on polynomial approximations (estimating moments). The upper bound in this paper shows that simpler statistics based on just the first two moments suffice to getting results for all regimes of \(\varepsilon_1, \varepsilon_2\).

Open Problems in Property Testing of Graphs by Oded Goldreich (ECCC). As the title clearly states, this is a survey covering a number of open problems in graph property testing. The broad division is based on the query model: dense graphs, bounded degree graphs, and general graphs. A reader will see statements of various classic open problems, such as the complexity of testing triangle freeness for dense graphs and characterizing properties that can be tested in \(poly(\varepsilon^{-1})\) queries. Arguably, there are more open problems (and fewer results) for testing in bounded degree graphs, where we lack broad characterizations of testable properties. An important, though less famous (?), open problem is that of the complexity of testing isomorphism. It would appear that the setting of general graphs, where we know the least, may be the next frontier for graph property testing. A problem that really caught my eye: can we transform testers that work for bounded degree graphs into those that work for bounded arboricity graphs? The latter is a generalization of bounded degree that has appeared in a number of recent results on sublinear graph algorithms.

News for March 2021

A somewhat quieter month by recent standards. Three Two papers: graph property testing and quantum distribution testing. (Ed: The distribution testing paper was a revision of a paper we already covered in Sept 2020.)

Robust Self-Ordering versus Local Self-Ordering by Oded Goldreich (ECCC). In Nov 2020, we covered a paper that uses a tool called self-ordered graphs, that transferred bit string function lower bounds to graph property testing. Consider a labeled graph. A graph is self-ordered if its automorphism group only contains the identity element (it has no non-trivial isomorphisms). A graph is robustly self-ordered, if every permutation of the vertices leads to a (labeled) graph that is sufficiently “far” according to edit distance. Given a self-ordered graph \(G\), a local self-ordering procedure is the following. Given access to a copy \(G’\) of \(G\) and a vertex \(v \in V(G’)\), this procedure determines the (unique) vertex in \(V(G)\) that corresponds to \(v\) with sublinear queries to \(G\). In other words, it can locally “label” the graph. Intuitively, one would think that more robustly self-ordered graphs will be easier to locally label. This paper studies the relation between robust and local self-ordering. Curiously, this paper refutes the above intuition for bounded-degree graphs, and (weakly) confirms it for dense graphs. Roughly speaking, there are bounded degree graphs that are highly robustly self-ordered, for which any local self-ordering procedure requires \(\omega(\sqrt{n})\) queries. Moreover, there are bounded degree graphs with \(O(\log n)\)-query local self-ordering procedures, yet are not robustly self-ordered even for weak parameters. For dense graphs, the existence of fast non-adaptive local self-ordering procedures implies robust self-ordering.

Testing identity of collections of quantum states: sample complexity analysis by Marco Fanizza, Raffaele Salvia, and Vittorio Giovannetti (arXiv). This paper takes identity testing to the quantum setting. One should think of a \(d\)-dimensional quantum state as a \(d \times d\) density matrix (with some special properties). To learn the state entirely up to error \(\varepsilon\) would require \(O(\varepsilon^{-2} d^2)\) samples/measurements. A recent result of Badescu-O’Donnell-Wright proves that identity testing to a known state can be done significantly faster using \(O(\varepsilon^{-2} d)\) measurements. This paper takes this result a step further by consider a set of \(N\) quantum states. A “sample” is like a classical sample, where one gets a sample from a distribution of quantum states. The YES (“uniform”) case is when all the states are identical. The NO (“far from uniform”) case is when they are “far” from being the same state. This paper proves that \(O(\varepsilon^{-2}\sqrt{N}d)\) samples suffices for distinguishing these cases.

News for December 2020

Happy 2021! We now cover the last papers of 2020: a (smaller) spread with graph property testing, distribution property testing, and circuit complexity.

Almost Optimal Bounds for Sublinear-Time Sampling of k-Cliques: Sampling Cliques is Harder Than Counting by Talya Eden, Dana Ron, and Will Rosenbaum (arXiv). In many settings, sampling and counting are tightly coupled; the classic example being permanent estimation and sampling random bipartite matchings.This paper investigates the problem of sampling uniform random cliques, and unearths an interesting separation of approximately uniform sampling from approximate counting. Let’s consider the fundamental problem of edge and triangle counting, in graphs of constant arboricity. This class, which contains all minor-closed families, has deep connections to clique counting; clique counting can be done in linear time for bounded arboricity graphs.) The input connected graph \(G\) has \(n\) vertices, \(m\) edges, and \(t\) triangles. We assume the standard query model, with uniform random vertices, neighbor, and degree queries. Our aim is to get \((1+\varepsilon)\)-approximations for \(m\) and \(t\) (in general, any \(k\)-clique count). It is known from previous work that edge estimation can be done in \(O(1)\) time and triangle estimation can be done in \(O(n/t)\) time (ignoring log and \(\varepsilon\) dependencies). Moreover, these bounds is optimal. Can one get similar bounds for approximately sampling a uniform random edge/triangle? Previous work of Eden-Rosenbaum achieve such a bound for sampling edges. Surprisingly, this paper shows that triangle sampling is fundamentally harder, and requires \(\Omega(n^{3/4}/\sqrt{t})\) queries. The main result gives a precise and optimal bound for uniform sampling \(k\)-cliques in graphs of arboricity \(\alpha\).

Testing and reconstruction via decision trees by Guy Blanc, Jane Lange, and Li-Yang Tan (arXiv). A property tester is an (sublinear) algorithm that distinguishes an object with a property from those that are far. Reconstruction is a much stronger sublinear algorithm, which actually provides query access to the closest object. Suppose we have query access to an \(n\)-variate Boolean function \(f\), and the property \(\mathcal{P}\) of interest is size-\(s\) decision trees. A reconstructor would give query access to a function \(g \in \mathcal{P}\), such that \(dist(f,g) = O(OPT_f)\), where \(OPT_f\) is the distance of \(f\) to \(\mathcal{P}\). Observe that a reconstructor automatically gives a distance estimator (just estimate \(dist(f,g)\) by random sampling). One of the main results of this paper is a (weaker, bicriteria) reconstructor, where \(g\) is guaranteed to have decision-tree complexity \(s^{O(\log^2s)}\). Each query to \(g\) is computed in \(poly(\log s, 1/\varepsilon)\cdot n\) time. This reconstructor leads to a number of testing results, and reconstructors for other properties like low Fourier degree. Another interesting result proven: if there exists an efficient reconstructor that gets \(g\) of size \(s\), then there exists an efficient proper learner for decision trees (a big open problem).

Testing Product Distributions: A Closer Look by Arnab Bhattacharyya, Sutanu Gayen, Saravanan Kandasamy, and N. V. Vinodchandran (arXiv). This paper is on distribution testing, where all distributions involved are promised to be product distributions. Let \(P, Q\) be distributions over \(\Sigma^n\), where \(\Sigma\) is some (small) finite set. The problems studied are identity testing (\(Q\) is known/fixed, and one gets samples from unknown \(P\)) and closeness testing (both unknown). It is known from previous work that these problems can basically be solved with \(O(n|\Sigma|\varepsilon^{-2})\) samples. Note that the domain size is exponential in \(n\); thus, these bounds give an exponential improvement over identity/closeness testing in the vanilla (non-product) setting. Tolerant testing is of special significance here. Because the domain is so large, it makes sense to allow for some distance even in the “yes” case. The main result of this paper is to completely map out the complexities for identity/closeness testing where we wish to distinguish \(d_1(P,Q) < \varepsilon_1\) from \(d_2(P,Q) > \varepsilon_2\), where \(d_1, d_2\) are potentially different distance measures. For example, the paper considers total variation (TV), KL, Hellinger, and \(\chi^2\) distances. An important improvement achieved is for identity testing where \(d_1\) is Hellinger and \(d_2\) is TV distance: for certain values of \(\varepsilon_1, \varepsilon_2\), one can get a tester of optimal sample complexity \(O(\sqrt{n|\Sigma|})\). Previous bounds only gave a \(O(n|\Sigma|)\) sample complexity. There are a number of new lower bounds for different combinations of \(d_1, d_2\) distance pairs.

News for September 2020

Apologies dear readers for the late posting. The beginning of the school year is always frenzied, and the pandemic has only added to that frenzy. We have an exciting September, with four papers on graph property testing, one two papers on distribution testing, and one paper that connects both topics.

(Ed: we normally scan through ECCC and arXiv, but are happy to post about papers that appear elsewhere. Thanks to the reader who pointed out a relevant COLT 2020 paper.)

Estimation of Graph Isomorphism Distance in the Query World by Sourav Chakraborty, Arijit Ghosh, Gopinath Mishra, and Sayantan Sen (ECCC). Graph isomorphism is about as fundamental as it gets, and this papers studies approximating the graph isomorphism distance for dense graphs. There is a known graph \(G_k\) (with \(n\) vertices). The algorithm is given query access to an input graph \(G_u\) and needs to approximate the number of edge inserts/deletes required to make the graphs isomorphic. This is the tolerant testing version; the property testing version is known to be doable in \(\widetilde{O}(\sqrt{n})\) queries (Matsliah-Fischer). The main insight of this paper is to relate the tolerant testing complexity to a distribution testing problem. Consider distributions over the \(\{0,1\}^n\) defined by multisets of \(n\) hypercube points. Our aim is to estimate the earthmover distance between a known distribution and an unknown distribution. Interestingly, the query model is different: one can sample the underlying multisets without replacement. It turns out that the optimal complexity of this problem is (upto polylog factors) is the same as the optimal complexity of tolerant testing of graph isomorphism. A direct corollary is that the isomorphism distance can be approximated upto additive \(\epsilon n^2\) using \(\widetilde{O}(n)\) samples. This equivalence also gives an alternate proof for lower bounds for property testing graph isomorphism.

Robustly Self-Ordered Graphs: Constructions and Applications to Property Testing by Oded Goldreich and Avi Wigderson (ECCC). Let’s start from the application. The aim is to prove the following property testing lower bounds for the bounded-degree graph setting: an exponential separation between tolerant and vanilla testing, and finding an efficiently decidable property (in polynomial time) that cannot be property tested in sublinear time. For binary strings, results of this form are known. Can these be “ported” to the bounded-degree graph world? Can we construct graphs such that adjacency queries reduce to bit queries in strings? Naturally, one can simply represent the adjacency list as a string and treat graph queries as bit queries. But the problem is that of isomorphisms: different bit strings could represent the same graph and therefore, the different bit strings must have the same status with respect to the underlying property. The key insight in this paper is to introduce robustly self-ordered graphs, as a tool to port bit string property testing lower bounds to bounded-degree graphs. Such graphs essentially have a unique (identity) automorphism, even after a few edge insert/deletes. The actual definition is more technical, but that is the essence. The main result is an explicit construction of such graphs, from which the lower bound can be ported directly through a convenient lemma.

Modifying a Graph’s Degree Sequence and the Testablity of Degree Sequence Properties by Lior Gishboliner (arXiv). A sequence of numbers \(D = (d_1, d_2, \ldots, d_n)\) is graphic if there exists an undirected graph on \(n\) vertices whose degrees are precisely the numbers of the sequence. Graphical sequences have been characterized by classic results of Erdös-Gállai and Havel-Hakimi. This paper first proves the following theorem. Suppose a graphic sequence \(D’\) has \(l_1\)-distance at most \(\delta\) from the degree sequence \(D\) of a graph \(G\). Then, there exists a graph \(G’\) with degree sequence \(D’\) such that the (dense graph) distance between \(G\) and \(G’\) is \(O(\sqrt{\delta})\). This theorem is used to prove an interesting property testing result. Let \(\mathcal{D}\) be a subset of graphic sequences that are closed under permutation. Let \(\mathcal{G}\) be the set of graphs that have a degree sequence in \(\mathcal{D}\). Then \(\mathcal{G}\) can be tested in \(poly(1/\epsilon)\) queries.

Sampling an Edge Uniformly in Sublinear Time by Jakub Têtek (arXiv). In the general model for sublinear algorithms on graphs, an important choice is whether one allows uniform random edge queries. A natural question is whether such queries can simulated efficiently, using only random vertex, degree, and neighbor queries. This problem appears somewhat implicitly in previous sublinear subgraph counting algorithms, and Eden-Ron-Rosenbaum study it explicitly. They prove that one can sample from an \(\epsilon\)-approximate uniform distribution (over edges) using \(O(n/\sqrt{\epsilon m})\) samples. The problem of sampling from exactly the uniform distribution is left open. Until this paper. The main result shows that by modifying the Eden-Ron-Rosenbaum algorithm parameters, one can generate edge samples from an \(\epsilon\)-approximate uniform distribution using \(O((n/\sqrt{m})\log \epsilon^{-1})\) samples. The exact uniform distribution is achieved by setting \(\epsilon = 1/n\), to get a sample complexity of \(O((n\log n)/\sqrt{m})\).

Faster Property Testers in a Variation of the Bounded Degree Model by Isolde Adler and Polly Fahey (arXiv). The setting of bounded-degree graph property testing naturally extends to bounded-degree relational databases, which can be thought of as “directed” hypergraphs. This is an interesting new direction of research, that combines property testing with database theory (see Adler-Harwath and Chen-Yoshida). One of the main contributions of this work is to consider another notion of distance: edge and vertex inserts/deletes. This is a natural extension, and we can now compare distances between graphs/databases with different numbers of vertices. The main result is that, under this notion of distance, a large class of properties can be tested in constant running time on databases with bounded degree and treewidth. Specifically, any property expressible in Counting Monadic Second-Order Logic (CMSO) can be tested in constant time. Previous results by Alder-Harwath showed that such properties can be tested (under the standard distance notion) in constant queries, but polylogarithmic time.

Optimal Testing of Discrete Distributions with High Probability by Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (arXiv, ECCC). The focus of this paper is distribution testing in the “high probability” regime, where we wish the error of the tester to be \(< \delta\). Typically, most results just get an error of at most \(1/3\), from which standard probabilistic boosting would tack on an extra \(O(\log 1/\delta)\) factor. In standard TCS settings, one doesn’t focus on optimizing this dependence, but in statistics, there is significant focus on the optimal sample complexity. And indeed, for practical applications, it is crucial to have sharp bounds on the right number of samples required for hypothesis testing. The paper also argues that getting the optimal sample complexity requires new algorithms, even for uniformity testing. There are optimal results given for closeness and independence testing. The optimal sample complexity only pays a multiplicative factor of \(\log^{1/3} (1/\delta)\) or \(\log^{1/2}(1/\delta)\) over the optimal bound for constant error (with other additive terms depending on \(\log(1/\delta)\)).

Bessel Smoothing and Multi-Distribution Property Estimation by Yi Hao and Ping Li (COLT 2020). Let us consider some standard (tolerant) distribution testing questions, phrases as approximation algorithms. Given sample access to two distributions \(p\) and \(q\) over \([n]\), we may wish to estimate the \(l_1\)-distance, \(l_2\)-distance, relative entropy, etc. between these distributions. One can phrases this problem abstractly as estimating \(\sum_{i \in [n]} f(p_i, q_i)\), where \(f\) is some explicit function. This papers shows that for any 1-Lipschitz function \(f\) that satisfies some “regularity” property, the sum \(\sum_{i \in [n]} f(p_i, q_i)\) can be \(\epsilon\)-approximated with \(O(\epsilon^{-3}n/\sqrt{\log n})\) samples (apologies to the authors to replacing their \(k\) with the more familiar \(n\) for our readers). Thus, we can get sublinear sampling complexity for a very general class of estimation problems. Moreover, this was actually the simplest setting consider in the paper. One can deal with such functions of \(d\) distributions, not just two distributions. One of the corollaries of the theorems is a sublinear tolerant tester for the property of being a mixture of distributions.

Policy on reporting papers

While we at PTReview always look through the posted papers, we do not check for correctness. We make a serious attempt to make sure the paper is reasonable. In a few instances, we have decided not to post a (topically relevant) paper, because it looks absolutely wrong. Our position is: the benefit of doubt goes to the author, and a borderline paper should be posted. We are only curating relevant tech reports, not passing judgment on results.

In some borderline cases, readers familiar with the subject complained to us that the paper should be not be considered a scientific contribution (because of, say, unspecified algorithms, blatantly incorrect or unverifiable central claims). These are cases where we were also unsure of the paper. We have usually removed/not posted such papers.

If the paper author(s) feels that his/her paper should nonetheless be posted, then they should email us at little.oh.of.n@gmail.com. As long as the paper is not complete nonsense and appears to cite relevant history, we will defer to the authors’ wishes.

News for June 2020

Sublinear algorithms in times of social distancing…always something exciting. This month we have a slew of results on sublinear algorithms for classic graph problems.

(Ed: We have removed a previously posted paper due to correctness concerns raised by our readers. Please look at the post on our paper policy.)

Palette Sparsification Beyond (∆ + 1) Vertex Coloring by Noga Alon and Sepehr Assadi (arXiv). A basic fact from graph theory is that any graph has a \((\Delta+1)\)-coloring, where \(\Delta\) is the maximum degree. Followers of property testing are likely familiar with a fantastic result of Assadi-Chen-Khanna (ACK) on sublinear algorithms, that gives a sublinear algorithm for \((\Delta+1)\)-coloring. (The running time is \(\widetilde{O}(n^{3/2})\), where \(n\) is the number of vertices.) The key tool is a palette sparsification theorem: suppose each vertex is given a “palette” of \((\Delta+1)\) colors. Each vertex randomly sparsifies its palette by sampling \(O(\log n)\) colors, and is constrained to only use these colors. Remarkably, whp the graph can still be properly colored. This tool is at the heart of sublinear time/space algorithms for coloring. This paper gives numerous extensions to this theorem, where one can tradeoff a larger initially palette for a smaller final sample. Another extension is for triangle-free graphs, where the initial palette is of size \(O(\Delta/\ln \Delta)\) and the sample is of size \(O(\Delta^\gamma + \sqrt{\ln n})\) (for parameter \(\gamma < 1\). This leads to an \(O(n^{3/2 + \gamma})\) time algorithm for \(O(\Delta/\ln \Delta)\) coloring of triangle-free graphs.

When Algorithms for Maximal Independent Set and Maximal Matching Run in Sublinear-Time by Sepehr Assadi and Shay Solomon (arXiv). Taking off from sublinear coloring algorithms, one can ask if there are sublinear time algorithms for Maximal Independent Set (MIS) and Maximal Matching (MM). Alas, ACK prove that this is impossible. This paper investigates when one can get a sublinear time algorithm for these problems. For graph \(G\), let \(\beta(G)\) be the “neighborhood independence number”, the size of the largest independent set contained in a vertex neighborhood. This papers shows that both problems can be solved in \(\widetilde{O}(n \beta(G))\) time. Examples of natural classes of graphs where \(\beta(G)\) is constant: line graphs and unit-disk graphs. An interesting aspect is that MIS algorithm is actually deterministic! It’s the simple marking algorithm that rules out neighborhoods of chosen vertices; the analysis shows that not much time is wasted in remarking the same vertex.

Sublinear Algorithms and Lower Bounds for Metric TSP Cost Estimation by Yu Chen, Sampath Kannan, and Sanjeev Khanna (arXiv). This paper studies sublinear algorithms for the metric TSP problem. The input is an \(n \times n\) distance matrix. One can 2-approximate the TSP by computing the MST, and a result of Czumaj-Sohler gives a \((1+\varepsilon)\)-approximation algorithm for the latter, running in \(O(n\varepsilon^{-O(1)})\) time. The main question is: can one beat the 2-factor approximation in sublinear time? This paper considers the graphic TSP setting, where the distance matrix corresponds to the shortest path metric of an unweighted graph. One result is a \((2-\varepsilon_0)\)-approximation algorithm (for an explicit constant \(\varepsilon_0\)) that runs in \(\widetilde{O}(n)\) time. For the important \((1,2)\) TSP setting (all distances are either 1 or 2), the paper gives a \(O(n^{1.5})\) time 1.63-approximation algorithm. Interestingly, there is a lower bound showing that \((1+\varepsilon)\)-approximations, for arbitrarily small \(\varepsilon\), cannot be achieved in \(o(n^2)\) time. One of the key tools is sublinear algorithms for estimating the maximum matching size, itself a well-studied problem in the community.