Category Archives: Monthly digest

News for June 2016

This month isn’t the most exciting for property testing. Just one paper to report (thanks to Clément for catching!), of which only a minor part is actually related to property testing.

Approximating the Spectral Sums of Large-scale Matrices using Chebyshev Approximation, by Insu Han, Dmitry Malioutov, Haim Avron, and Jinwoo Shin (arXiv). The main results of the paper deal with approximating the trace of matrix functions. Consider symmetric matrix \(M \in \mathbb{R}^{d\times d}\), with eigenvalues \(\lambda_1, \lambda_2, \ldots, \lambda_d\). The paper gives new algorithms to approximate \(\sum_i f(\lambda_i)\). Each “operation” of the algorithm is a matrix-vector product, and the aim is to minimize the number of such operations. The property testing application is as follows. Suppose we wish to test if \(M\) is positive-definite (all \(\lambda_i > 0\)). We consider a matrix \(\epsilon\)-far from being positive-definite if the smallest eigenvalue is less than \(-\epsilon \|M\|_F = -\epsilon \sum_i \lambda^2_i\). The main theorems in this paper yield a testing algorithm (under this definition of distance) that makes \(o(d)\) matrix-vector products. While this is not exactly sublinear under a standard query access model, it is relevant when \(M\) is not explicitly represented and the only access is through such matrix-vector product queries.

News for May 2016

Last month witnessed two new property testing papers go online, which May be of interest to the community.

Reducing testing affine spaces to testing linearity, by Oded Goldreich (ECCC). In his recent guest post, the author outlined a reduction between testing if a Boolean function is an \((n-k)\)-dimensional affine subspace of \(\{0,1\}^n\), and testing linearity of functions \(f\colon\{0,1\}^n\to \{0,1\}\). This preprint encompasses details of this reduction, as part as a general approach to testing monomials (resolving along the way one of the open problems from April). Namely, it shows that testing if \(f\) is a \(k\)-monomial can be reduced to testing two properties, of \(f\) and of a (related) function \(g\colon\{0,1\}^n\to \{0,1\}^k\). Moreover, it establishes that the general case (\(k\geq 1\)) of the second part  can itself be reduced to the simpler case (\(k=1\)), giving a unified argument.

Testing Equality in Communication Graphs, by Noga Alon, Klim Efremenko, Benny Sudakov (ECCC). Given a connected graph on \(k\) nodes, where each node has an \(n\)-bit string, how many bits of communication are needed to test if all strings are equal? This paper investigates this problem for many interesting graphs, resolving the complexity up to lower order terms. For example, if the graph is Hamiltonian, they show that \(\frac{kn}{2} + o(n)\) bits are sufficient, while at least \(\frac{kn}{2}\) bits are required.

Edit (06/16): updated the description of the first preprint, which was not entirely accurate.

News for April 2016

April has been kind to us, providing us with a trio of new papers in sublinear algorithms. Also, in case you missed them, be sure to check out Oded Goldreich’s guest post and the associated open problem.

Sparse Fourier Transform in Any Constant Dimension with Nearly-Optimal Sample Complexity in Sublinear Time, by Michael Kapralov (ArXiv). There is a rich line of work focused on going beyond the celebrated Fast Fourier Transform algorithm by exploiting sparsity in the signal’s Fourier representation. While the one-dimensional case is very well understood, results in higher dimensions have either been lacking with respect to time complexity, sample complexity, or the approximation guarantee. This work is the first to give an algorithm which runs in sublinear time, has near-optimal sample complexity, and provides an arbitrarily accurate approximation guarantee for the multivariate case.

Sublinear Time Estimation of Degree Distribution Moments: The Arboricity Connection, by Talya Eden, Dana Ron, C. Seshadhri (ArXiv). This work revisits the problem of approximating the moments of the degree distribution of a graph. Prior work has shown that the \(s\)-th moment of the degree distribution can be computed with \(\tilde O(n^{1 – 1/(s+1)})\) queries, which is optimal up to \(\mathrm{poly}(\log n)\) factors. This work provides a new algorithm which requires only \(\tilde O(n^{1 – 1/s})\) queries on graphs of bounded arboricity, while still matching previous near-optimal bounds in the worst case. Impressively, the algorithm is incredibly simple, and can be stated in just a few lines of pseudocode.

A Local Algorithm for Constructing Spanners in Minor-Free Graphs, by Reut Levi, Dana Ron, Ronitt Rubinfeld (ArXiv). This paper addresses the problem of locally constructing a sparse spanning graph. For an edge \(e\) in a graph \(G\), one must determine whether or not \(e\) is in a sparse spanning graph \(G’\) in a manner that is consistent with previous answers. In general graphs, the complexity of this problem is \(\Omega(\sqrt{n})\), leading to the study of restricted families of graphs. In minor-free graphs (which include, for example, planar graphs), existing algorithms require \((d/\varepsilon)^{\mathrm{poly}(h)\log(1/\varepsilon)}\) queries and induce a stretch of \(\mathrm{poly}(d, h, 1/\varepsilon)\), where \(d\) is the maximum degree, \(h\) is the size of the minor, and \(G’\) is of size at most \((1 + \varepsilon)n\). The algorithm in this paper shows the complexity of this problem to be polynomial, requiring only \(\mathrm{poly}(d, h, 1/\varepsilon)\) queries and inducing a stretch of \(\tilde O(h \log (d)/\varepsilon)\).

News for March 2016

While March 2016 has been rather low in terms of property testing, we did see a new paper appear:

A Note on Tolerant Testing with One-Sided Error, by Roei Tell (ECCC). A natural generalization of property testing is that of tolerant testing, as introduced by Parnas, Ron, and Rubinfeld [PRR06]: where the tester still must reject all objects that are far from satisfying the property, but now also has to accept those that are sufficiently close (all that with constant probability). In this work is considered the question of one-sidedness of tolerant testers: namely, is it possible to only err on the farness side, but accept close output with probability one? As it turns out, it is not — the author shows that any such one-sided tolerant tester, for basically any property of interest, must essentially query the whole input…

Universal Locally Testable Codes, by Oded Goldreich and Tom Gur (ECCC). In this work, the authors introduce and initiate the study of an extension of locally testable codes they name universal locally testable codes (universal-LTC). At a high-level, a universal-LTC (with regard to a family of functions \(\cal F\)) is a locally testable code \(C\) “for which the restrictions (subcodes) of \(C\) by functions in \(\cal F\) are also locally testable.” In other terms, one is then able to test efficiently, given an encoded string \(w\), if (i) \(w=C(x)\) for some \(x\); but also, for any \(f\in \cal F\), if (ii) \(w=C(x)\) for some \(x\) that satisfies \(f(x)=1\).

Edit (04/06): added the work of Goldreich and Gur, which was overlooked in our first version of the article.

News For February 2016

News for February 2016! Another take on distribution testing, exciting stuff about generating random graphs, and distributed property testing.

Sublinear Random Access Generators for Preferential Attachment Graphs by Guy Even, Reut Levi, Moti Medina, and Adi Rosén (ArXiv). A major aspect of research in “network science” and applied large graph analysis is random graph models. A classic example is the Barabasi-Albert preferential attachment (PA) model, that attempts to model graphs that appear in the real-world. The model is inherently sequential, in that vertices “arrive” in order and connect to other vertices according to a specified distribution. It was not clear how to generate a PA graph in parallel, or how one can construct a local view of a PA graph without constructing all of it. Not clear, that is, until this paper. This really cool result shows the following. Suppose we want to construct an \(n\) vertex PA graph. One can basically query the adjacency list of any vertex in \(poly(\log n)\) time, without constructing the graph! The PA graph is implicitly constructed through the queries. This is really interesting because PA graphs (at least conceptually) are created sequentially, and the order of vertex arrival fundamentally affects the graph structure.

 

The Uniform Distribution is Complete with Respect to Testing Identity to a Fixed Distribution by Oded Goldreich (ECCC) . This is a follow up to a recent result by Diakonikolas and Kane (covered last month!). This paper reduces testing equality (of a distribution) to a fixed distribution, to the special case to when the fixed distribution is uniform. The notions of reduction in distribution testing were exploited by Diakonikolas and Kane to get optimal testers. While this new paper does not give new bounds, it gives a clean and simplified approach for testing distribution equality. This paper came out of Oded’s lecture notes, and has a nice expository feel to it.

 

Fast Distributed Algorithms for Testing Graph Properties by Keren Censor-Hillel, Eldar Fischer, Gregory Schwartzman, and Yadu Vasudev (ArXiv) . Connections between distributed algorithms and property testing have been unearthed in the past (most explicitly by Parnas and Ron). This paper directly solves graph property testing problems in the distributed setting. For dense graph testing, it is known that any (constant-query) testing algorithm algorithm can be made non-adaptive, and thus can be simulated in the distributed setting. The results for testing bipartiteness in the sparse graph model are quite interesting, because existing property testers use random walks. This paper gives a distributed implementation of multiple random walks from all vertices, and controls the total congestion of the implementation. This leads to a \(O(\log n)\)-round bipartiteness tester.

News for January 2016

In the first month of 2016 we had a couple of interesting papers in property testing.

Sublinear-Time Algorithms for Counting Star Subgraphs with Applications to Join Selectivity Estimation by Maryam Aliakbarpour, Amartya Shankha Biswas, Themistoklis Gouleakis, John Peebles, Ronitt Rubinfeld and Anak Yodpinyanee (ArXiv) . Estimating the number of subgraphs of a fixed form in a given graph has been a very well studied problem in the subject of property testing.  The classical example is counting the number of triangles in a graph. Usually we assume we can sample the vertices uniformly at random. In this paper a new model of query has been considered where one can sample the edges uniformly at random. Under this model it is shown that the query complexity can be reduced when the goal is to estimate the number of p-stars in a graph. It would be interesting to see if this model helps to get better query complexity for other problems also. Depending on the application different query models can be relevant and hence understanding these different query models are essential.

 

A New Approach for Testing Properties of Discrete Distributions by Ilias Diakonikolas and Daniel M. Kane (ArXiv) . Testing properties of distribution has been a central topic in our field. Not only are these problems interesting they also has been applied as subroutines to many other testing algorithms. In the literature a number of different properties like testing whether a distribution is uniform or testing whether two distributions are identical or testing if two distributions are independent and many more has been studied. In most of the properties the distance measure is the L_1 measure. In this paper a unifying technique to obtain testing algorithms for properties of distributions in the L_1 distance has been given by converting them to the problem of testing in the L_2 distance and applying standard testing algorithms in the L_2 distance. This approach would hopefully help us simplify the various algorithms in distribution testing.

News for November 2015

(Updating post with an additional paper that we missed in our first posting. Sorry! Feel free to email us at little.oh.of.n@gmail.com to inform us of papers we should mention.)

We’ve got exciting new in November! Optimal results for testing of monotone conjunctions, a new lower bound for monotonicity testing (yay!), and new lower bounds for Locally Testable Codes.

Tight Bounds for the Distribution-Free Testing of Monotone Conjunctions by Xi Chen and Jinyu Xie (arxiv). Consider functions \(f: \{0,1\}^n \rightarrow \{0,1\}\) (amen), and the basic property of being a monotone conjunction. Our notion of distance is with respect to a distribution \(\mathcal{D}\), so the distance between functions \(f\) and \(g\) is \(\mathop{Pr}_{x \sim \mathcal{D}} [f(x) \neq g(x)]\). In the distribution-free testing model, the tester does not know the distribution \(\mathcal{D}\), but is allowed samples from \(\mathcal{D}\). When \(\mathcal{D}\) is uniform, this coincides with standard property testing. There can be significant gaps between standard and distribution-free testing, evidenced by conjunctions. Parnas, Ron, and Samorodnitsky proved that monotone conjunctions can be tested in the vanilla setting in time independent of \(\mathcal{n}\), while Glasner-Servedio prove a \(\Omega(n^{1/5})\) lower bound for distribution-free testing. This paper provides a one-sided, adaptive distribution-free tester that makes \(\widetilde{O}(n^{1/3})\) queries, and they also prove a matching (up to polylog terms and \(\epsilon\)-dependencies) two-sided, adaptive lower bound. This is a significant improvement on the previous upper bound of \(\widetilde{O}(\sqrt{n})\) of Dolev-Ron, as well as over the previous lower bound. Furthermore, the results of this paper hold for general conjunctions.

A Polynomial Lower Bound for Testing Monotonicity by Aleksandrs Belovs and Eric Blais (arxiv). Surely the reader needs little introduction to testing monotonicity of Boolean functions, and a previous Open Problem post should help. A quick summary: we want to test monotonicity of \(f:\{0,1\}^n \rightarrow \{0,1\}\). The best upper bound is the \(\widetilde{O}(\sqrt{n})\) non-adaptive, one-sided tester of Khot et al. There is a matching, non-adaptive lower bound (up to polylog terms) by Chen et al, implying the best-known adaptive lower bound of \(\Omega(\log n)\). Can adaptivity help? This paper proves an adaptive lower bound of \(\Omega(n^{1/4})\). Exciting! The approach of Chen et al is to focus on monotonicity of linear threshold functions (technically, regular LTFS, where the weights are bounded). The authors show that this property can be tested in \(\mathop{poly}(\log n)\) time, shooting down hopes of extending the LTF approach for adaptive lower bounds. The key insight is to work with the distribution of Talagrand’s random DNF instead, which is the most noise sensitive DNF. (Talagrand strikes again. He helps the upper bound, he helps the lower bound.) Perturbations of this DNF lead to non-monotone functions, the paper proves that these distributions cannot be distinguished in \(o(n^{1/4})\) queries.

Lower bounds for constant query affine-invariant LCCs and LTCs by Arnab Bhattacharyya and Sivakanth Gopi (arxiv). Think of any code as a set/property of codewords in the domain \(\Sigma^N\). Such a code is an \(r\)-LTC if it has an \(r\)-query property tester. An \(r\)-LCC is has the property that, given any \(x \in \Sigma^n\) sufficiently close to a codeword, one can determine any coordinate of the codeword using \(r\)-queries. A fundamental question is to understand the length of LCCs and LTCS, or alternately (fixing \(\Sigma^N\)) determining the largest possible set of codewords. Existing constructions typically have much symmetry, either linear or affine invariance. (Check out Arnab Bhattacharyya’s survey on affine invariant properties for more details.) It is convenient to think of any codeword as a function \(f: [N] \rightarrow \Sigma\), and furthermore, think of \([N]\) as a vector space \(\mathbb{K}^n\). The best known construction of Guo, Kopparty, and Sudan yields (affine-invariant) LCCs of size \(\exp(\Theta(n^{r-1}))\) and LTCs of size \(\exp(\Theta(n^{r-2}))\) (many dependences on \(\mathbb{K}\), the rate, etc. are hidden in the big-Oh). This paper show that these bounds are actually the best achievable by any affine-invariant code. (Previous lower bounds of Ben-Sasson and Sudan only held for linear codes.) The intriguing and wide-open question is to prove such lower bounds without affine invariance.

News for September 2015

We have a couple of papers this month.

Are Few Bins Enough: Testing Histogram Distributions by Clement Canonne (ECCC). Testing whether a distribution is uniform is a very important, well studied and almost completely understood problem. A generalization of this question is whether a distribution on the set {1,…,n} is a k-histogram, that is, whether the distribution can be represented as a piece-wise constant function over at most k contiguous intervals. This very important generalization has been much less understood and huge gap between the upper and lower bound for the query complexity existed. In this paper the gap has been almost closed by improving both the upper and lower bounds.

Testing Properties of Functions on Finite Groups by Kenta Oono and Yuichi Yoshida (arXiv). Testing algebraic properties of functions have taken a central place in property testing right from the time of its inception.  Different kinds of functions have been studied in literature. In this paper the authors study the functions that map elements of a finite group to square matrices over the complex field with Frobenius norm 1. In this paper various properties of these functions are proved to be testable.  Some of the properties considered in this paper are important from the point of view of representation theory. This is the first paper making the connection between property testing and representation theory. With representation theory taking a prominent role in computer science lately, this paper is expected to be the first of a long list of related results to follow.

 

News for August

This month saw more development on testing properties of distributions and a result with intriguing connections to property testing. And for readers who may have missed it, Clément Canonne and Gautam Kamath wrote an engaging survey of some recent work on testing properties of distribution here.

Optimal algorithms and lower bounds for testing closeness of structured distributions by Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin (arXiv). One of the fundamental results in testing properties of distributions is that if we want to estimate the (\(L_1\)) distance between two unknown distributions on a domain of size \(n\) up to some constant additive factor,  we need to draw \(O(n^{2/3})\) samples from these distributions, and this sample complexity is tight in general. But what if we consider the same problem in the setting where we are promised that the distributions come from some (known) class of distribution? This paper shows that, for many natural classes of distributions, we can obtain much more efficient algorithms for testing the closeness of distributions in this setting.

Smooth Boolean functions are easy: efficient algorithms for low-sensitivity functions by Parikshit Gopalan, Noam Nisan, Rocco A. Servedio, Kunal Talwar, and Avi Wigderson (arXiv). The (maximum) sensitivity of a Boolean function \(f : \{0,1\}^n \to \{0,1\}\) is the size of the largest set \(S \subseteq [n]\) such that there is an input \(x \in \{0,1\}^n\) for which \(f(x) = f(y)\) for every input \(y\) obtained by flipping the value of the \(i\)th coordinate of \(x\) for some \(i \in S\). One of the results in this paper shows that functions with low sensitivity can be locally self-corrected: given query access to a function \(r : \{0,1\}^n \to \{0,1\}\) that is close to a function \(f\) with low sensitivity (think of \(r\) as a corrupted version of \(f\)), there is an algorithm that for any input $x \in \{0,1\}^n$, queries \(r\) on a few inputs and outputs with high probability the value \(f(x)\). This notion of local self-correction is of course closely related to locally-decodable codes; it is also one of the fundamental techniques used to obtain many results in property testing as well (see for example Chapter 3 of Dana Ron’s survey). Can this result, or the techniques used to obtain it, also yield new results in property testing?