News for January 2016

In the first month of 2016 we had a couple of interesting papers in property testing.

Sublinear-Time Algorithms for Counting Star Subgraphs with Applications to Join Selectivity Estimation by Maryam Aliakbarpour, Amartya Shankha Biswas, Themistoklis Gouleakis, John Peebles, Ronitt Rubinfeld and Anak Yodpinyanee (ArXiv) . Estimating the number of subgraphs of a fixed form in a given graph has been a very well studied problem in the subject of property testing. The classical example is counting the number of triangles in a graph. Usually we assume we can sample the vertices uniformly at random. In this paper a new model of query has been considered where one can sample the edges uniformly at random. Under this model it is shown that the query complexity can be reduced when the goal is to estimate the number of p-stars in a graph. It would be interesting to see if this model helps to get better query complexity for other problems also. Depending on the application different query models can be relevant and hence understanding these different query models are essential.

A New Approach for Testing Properties of Discrete Distributions by Ilias Diakonikolas and Daniel M. Kane (ArXiv) . Testing properties of distribution has been a central topic in our field. Not only are these problems interesting they also has been applied as subroutines to many other testing algorithms. In the literature a number of different properties like testing whether a distribution is uniform or testing whether two distributions are identical or testing if two distributions are independent and many more has been studied. In most of the properties the distance measure is the L_1 measure. In this paper a unifying technique to obtain testing algorithms for properties of distributions in the L_1 distance has been given by converting them to the problem of testing in the L_2 distance and applying standard testing algorithms in the L_2 distance. This approach would hopefully help us simplify the various algorithms in distribution testing.

News for December 2015

Leave a reply

Greetings from the exciting Workshop on Sublinear Algorithms at John Hopkins University! As this workshop and the upcoming SODA and ITCS conferences get 2016 to a roaring start, let us take one last look back at property testing news from last year. In December, one work in particular caught my eye:

Non-Local Probes Do Not Help with Graph Problems by Mika Göös, Juho Hirvonen, Reut Levi, Moti Medina, and Jukka Suomela (arXiv). A generalization of property testing that has recently seen some fascinating developments in the past few years is the local computation algorithms (LCA) model, in which the algorithm is asked to answer some local query (such as “what is the color of this vertex in some fixed, legal coloring of the graph?”) in sublinear-time. This paper relates the LCA model to message-passing models and in the process gives a powerful new tool for establishing lower bounds in LCAs.

News for November 2015

Leave a reply

(Updating post with an additional paper that we missed in our first posting. Sorry! Feel free to email us at little.oh.of.n@gmail.com to inform us of papers we should mention.)

We’ve got exciting new in November! Optimal results for testing of monotone conjunctions, a new lower bound for monotonicity testing (yay!), and new lower bounds for Locally Testable Codes.

Tight Bounds for the Distribution-Free Testing of Monotone Conjunctions by Xi Chen and Jinyu Xie (arxiv). Consider functions $f: \{0,1\}^n \rightarrow \{0,1\}$ (amen), and the basic property of being a monotone conjunction. Our notion of distance is with respect to a distribution $\mathcal{D}$, so the distance between functions $f$ and $g$ is $\mathop{Pr}_{x \sim \mathcal{D}} [f(x) \neq g(x)]$. In the distribution-free testing model, the tester does not know the distribution $\mathcal{D}$, but is allowed samples from $\mathcal{D}$. When $\mathcal{D}$ is uniform, this coincides with standard property testing. There can be significant gaps between standard and distribution-free testing, evidenced by conjunctions. Parnas, Ron, and Samorodnitsky proved that monotone conjunctions can be tested in the vanilla setting in time independent of $\mathcal{n}$, while Glasner-Servedio prove a $\Omega(n^{1/5})$ lower bound for distribution-free testing. This paper provides a one-sided, adaptive distribution-free tester that makes $\widetilde{O}(n^{1/3})$ queries, and they also prove a matching (up to polylog terms and $\epsilon$-dependencies) two-sided, adaptive lower bound. This is a significant improvement on the previous upper bound of $\widetilde{O}(\sqrt{n})$ of Dolev-Ron, as well as over the previous lower bound. Furthermore, the results of this paper hold for general conjunctions.

A Polynomial Lower Bound for Testing Monotonicity by Aleksandrs Belovs and Eric Blais (arxiv). Surely the reader needs little introduction to testing monotonicity of Boolean functions, and a previous Open Problem post should help. A quick summary: we want to test monotonicity of $f:\{0,1\}^n \rightarrow \{0,1\}$. The best upper bound is the $\widetilde{O}(\sqrt{n})$ non-adaptive, one-sided tester of Khot et al. There is a matching, non-adaptive lower bound (up to polylog terms) by Chen et al, implying the best-known adaptive lower bound of $\Omega(\log n)$. Can adaptivity help? This paper proves an adaptive lower bound of $\Omega(n^{1/4})$. Exciting! The approach of Chen et al is to focus on monotonicity of linear threshold functions (technically, regular LTFS, where the weights are bounded). The authors show that this property can be tested in $\mathop{poly}(\log n)$ time, shooting down hopes of extending the LTF approach for adaptive lower bounds. The key insight is to work with the distribution of Talagrand’s random DNF instead, which is the most noise sensitive DNF. (Talagrand strikes again. He helps the upper bound, he helps the lower bound.) Perturbations of this DNF lead to non-monotone functions, the paper proves that these distributions cannot be distinguished in $o(n^{1/4})$ queries.

Lower bounds for constant query affine-invariant LCCs and LTCs by Arnab Bhattacharyya and Sivakanth Gopi (arxiv). Think of any code as a set/property of codewords in the domain $\Sigma^N$. Such a code is an $r$-LTC if it has an $r$-query property tester. An $r$-LCC is has the property that, given any $x \in \Sigma^n$ sufficiently close to a codeword, one can determine any coordinate of the codeword using $r$-queries. A fundamental question is to understand the length of LCCs and LTCS, or alternately (fixing $\Sigma^N$) determining the largest possible set of codewords. Existing constructions typically have much symmetry, either linear or affine invariance. (Check out Arnab Bhattacharyya’s survey on affine invariant properties for more details.) It is convenient to think of any codeword as a function $f: [N] \rightarrow \Sigma$, and furthermore, think of $[N]$ as a vector space $\mathbb{K}^n$. The best known construction of Guo, Kopparty, and Sudan yields (affine-invariant) LCCs of size $\exp(\Theta(n^{r-1}))$ and LTCs of size $\exp(\Theta(n^{r-2}))$ (many dependences on $\mathbb{K}$, the rate, etc. are hidden in the big-Oh). This paper show that these bounds are actually the best achievable by any affine-invariant code. (Previous lower bounds of Ben-Sasson and Sudan only held for linear codes.) The intriguing and wide-open question is to prove such lower bounds without affine invariance.

News (of lack thereof) for October 2015

Leave a reply

Alas! A dry month, with no property testing or sublinear papers. Maybe next month…

Sublinear Algorithms Workshop at Johns Hopkins University

Leave a reply

(Posting an announcement for a workshop on sublinear algorithms.)

We are organizing a Sublinear Algorithms workshop that will take place at Johns Hopkins University, January 7-9, 2016. The workshop will bring together researchers interested in sublinear algorithms, including sublinear-time algorithms (e.g., property testing and distribution testing), sublinear-space algorithms (e.g., sketching and streaming) and sublinear measurements (e.g., sparse recovery and compressive sensing).

The workshop will be held right before SODA’16, which starts on January 10 in Arlington, VA (about 50 miles from JHU).

Participation in this workshop is open to all, with free registration. In addition to 20+ excellent invited talks, the program will include short contributed talks by graduating students and postdocs, as well as a poster session. To participate in the contributed talk session and/or the poster session, apply by December 1.

For further details and registration, please visit

http://www.cs.jhu.edu/~vova/sublinear2016/main.html .

Best,

Vladimir Braverman, Johns Hopkins University
Piotr Indyk, MIT
Robert Krauthgamer, Weizmann Institute of Science
Sofya Raskhodnikova, Pennsylvania State University

News for September 2015

Leave a reply

We have a couple of papers this month.

Are Few Bins Enough: Testing Histogram Distributions by Clement Canonne (ECCC). Testing whether a distribution is uniform is a very important, well studied and almost completely understood problem. A generalization of this question is whether a distribution on the set {1,…,n} is a k-histogram, that is, whether the distribution can be represented as a piece-wise constant function over at most k contiguous intervals. This very important generalization has been much less understood and huge gap between the upper and lower bound for the query complexity existed. In this paper the gap has been almost closed by improving both the upper and lower bounds.

Testing Properties of Functions on Finite Groups by Kenta Oono and Yuichi Yoshida (arXiv). Testing algebraic properties of functions have taken a central place in property testing right from the time of its inception. Different kinds of functions have been studied in literature. In this paper the authors study the functions that map elements of a finite group to square matrices over the complex field with Frobenius norm 1. In this paper various properties of these functions are proved to be testable. Some of the properties considered in this paper are important from the point of view of representation theory. This is the first paper making the connection between property testing and representation theory. With representation theory taking a prominent role in computer science lately, this paper is expected to be the first of a long list of related results to follow.

News for August

Leave a reply

This month saw more development on testing properties of distributions and a result with intriguing connections to property testing. And for readers who may have missed it, Clément Canonne and Gautam Kamath wrote an engaging survey of some recent work on testing properties of distribution here.

Optimal algorithms and lower bounds for testing closeness of structured distributions by Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin (arXiv). One of the fundamental results in testing properties of distributions is that if we want to estimate the ($L_1$) distance between two unknown distributions on a domain of size $n$ up to some constant additive factor, we need to draw $O(n^{2/3})$ samples from these distributions, and this sample complexity is tight in general. But what if we consider the same problem in the setting where we are promised that the distributions come from some (known) class of distribution? This paper shows that, for many natural classes of distributions, we can obtain much more efficient algorithms for testing the closeness of distributions in this setting.

Smooth Boolean functions are easy: efficient algorithms for low-sensitivity functions by Parikshit Gopalan, Noam Nisan, Rocco A. Servedio, Kunal Talwar, and Avi Wigderson (arXiv). The (maximum) sensitivity of a Boolean function $f : \{0,1\}^n \to \{0,1\}$ is the size of the largest set $S \subseteq [n]$ such that there is an input $x \in \{0,1\}^n$ for which $f(x) = f(y)$ for every input $y$ obtained by flipping the value of the $i$th coordinate of $x$ for some $i \in S$. One of the results in this paper shows that functions with low sensitivity can be locally self-corrected: given query access to a function $r : \{0,1\}^n \to \{0,1\}$ that is close to a function $f$ with low sensitivity (think of $r$ as a corrupted version of $f$), there is an algorithm that for any input $x \in \{0,1\}^n$, queries $r$ on a few inputs and outputs with high probability the value $f(x)$. This notion of local self-correction is of course closely related to locally-decodable codes; it is also one of the fundamental techniques used to obtain many results in property testing as well (see for example Chapter 3 of Dana Ron’s survey). Can this result, or the techniques used to obtain it, also yield new results in property testing?

News for July 2015

Leave a reply

The summer month of July brings to us exciting work on distribution testing, scale-free graph testing, personal PageRank estimation, and (hold your breath!) cake cutting. Let’s hear more…

Avid followers of PTReview need little introduction to distribution testing, since we’ve seen much work on this topic over the past year. We have two concurrent results that really push the state-of-the-art of distribution testing.
Testing Shape Restrictions of Discrete Distributions by Clément Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld (arxiv). Most (sublinear) distribution testing results work for structured classes, like monotone, k-modal, Poisson, multinomial etc. distributions. This paper gives an overarching framework that unifies these concepts in terms of succinctness. Consider a class $\mathbb{C}$ of distributions over $[n]$. This class is succinct if for any distribution $\mathcal{D} \in \mathbb{C}$, one can partition $[n]$ into a “small” number of intervals such that $\mathcal{D}$ is “approximately constant” with each interval. The main punchline is that any such class has a sublinear sample distribution tester. Amazingly, the generic tester given is optimal (up to poly log factors and the distance parameter)! Much previous work is subsumed as a corollary of this main theorem, and the paper also gives a lower bound framework to get matching lower bounds.

Optimal Testing for Properties of Distributions by Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath (arxiv). This paper concurrently gets many of the results of the previous paper using completely different techniques. The focus is on monotone, product, and log-concave distributions, where the paper gives completely optimal testers (both in terms on the domain size and the distance parameter). The key insight is to focus on the $\chi^2$-distance between distributions, though distribution testing typically uses the total-variation (or $\ell_1$) distance. The tester is also robust/tolerant, in that yes instances include distributions close to the class considered.

Bidirectional PageRank Estimation: From Average-Case to Worst-Case by Peter Lofgren, Siddhartha Banerjee, and Ashish Goel (arxiv). Not your typical sublinear algorithms paper, but that never stopped us. Personalized PageRank (PPR) is a fundamental similarity notion between vertices in a graph, and is (sort of) the probability that a short random walk from $u$ ends at $v$. Naturally, estimating the PPR value from $u$ to $v$ can be done in time $O(1/p)$, where $p$ is the PPR value. (Simply take many random walks from $u$, and see how often you hit $v$.) Can you do sublinear in this value? The answer is yes, when the graph is undirected and has bounded degree. One can do this in $O(1/\sqrt{p})$ time, a significant improvement. It basically uses bidirectional search, by doing forward walks from $u$ and backward walks from $v$. These ideas were developed in earlier work of Lofgren et al, based on true-blooded sublinear graph algorithms for expander reconstruction by Kale, Peres, and Seshadhri, which are further based on truer-blooded property testing questions by Goldreich and Ron on expansion testing. The best thing, these algorithms work in practice!

Every Property is Testable on a Natural Class of Scale-Free Multigraphs – ver. 2 by Hiro Ito (arxiv). Alas, PTRreview missed ver. 1, which appeared in April. One of the weaknesses of property testing on sparse graphs is the restriction to bounded degree. This paper considers arbitrary sparse graphs, and focuses on graphs with a power-law degree distribution. There are classic results of Barabási and Albert showing the significance of such graphs (also called scale-free networks) in the real-world. The main result is that special classes of scale-free graphs defined by clustering properties are hyperfinite. This means that one can remove a small fraction of edges to split the graph into constant-sized pieces. An important result of Newman and Sohler showed (in the bounded degree case) that any property defined by a subset of hyperfinite graphs graphs is testable. This paper leverages that result to show that properties defined by clustered classes of scale-free graphs are also testable.

How to solve the cake-cutting problem in sublinear time – ver. 2 by Hiro Ito and Takahiro Ueda (arxiv). Again, ver. 1 appeared in April and was missed by us. Apologies! Cake cutting is a classic resource allocation problem. Consider a cake, as the unit interval $[0,1]$. There are $n$ players, each of whom has a different utility of the cake, each specified by a continuous, real function on $[0,1]$. We wish to give each player a slice (subinterval of $[0,1]$) such that they all have utility $1/n$. An old result of Even and Paz give $O(n \log n)$ divide and conquer algorithms. What can be said in sublinear time? This paper shows that one can give fair assignments to $o(n)$ players in $o(n)$ time, such that the assignment can be extended to almost all remaining players.

News for June 2015

Leave a reply

This month we have a number of results that are related to query complexity though not directly related to property testing.

Diamond Sampling for Approximate Maximum All-pairs Dot-product (MAD) Search by Grey Ballard, Tamara G. Kolda, Ali Pinar and C. Seshadhri (arXiv). For a long time we have been hoping to use our tools and techniques to solve problems that are useful in the real world One such problem which has importance in the real life applications is, given two sets of vectors the problem is to find the t pairs of vectors with the highest dot product. In this paper they use clever sampling techniques to design algorithms which are better than the state-of-the-art algorithms. They not only give theoretical guarantee but also validate their results empirically. Bridging the gap between theory and practice is an extremely important at the same time a very challenging job. We hope more work will be done in this direction.

Sub-linear Upper Bounds on Fourier dimension of Boolean Functions in terms of Fourier sparsity by Swagato Sanyal (arXiv). Understanding the relationship between Fourier dimension of a Boolean function and sparsity is an important problem. In this paper a better bound on the Fourier dimension in terms of sparsity is obtained. The main technique is to use the fact that the Fourier dimension is equivalent to the the non-adaptive parity decision tree and then bounding the parity decision tree in terms of sparsity.

Relationship between Deterministic Query Complexity, Randomized Query Complexity and Quantum Query Complexity. In the world of query complexity understanding the exact relationship between the the various models of computation is the main problem. It is known that Deterministic Query Complexity, Randomized Query Complexity and Quantum Query complexity are all polynomially related. But the exact polynomial relation between them is not known. Last month there was a sudden burst of activity in this area with three papers addressing this problem coming out is a span of two weeks. In the papers Separations in Query Complexity Based on Pointer Functions by Andris Ambainis, Kaspars Balodis, Aleksandrs Belovs, Troy Lee, Miklos Santha and Juris Smotrovs (arXiv) and Towards Better Separation between Deterministic and Randomized Query Complexity by Sagnik Mukhopadhyay Swagato Sanyal (arXiv) it is proved that the Randomized Query Complexity and the Deterministic Query Complexity are quadratically related and that this bound is tight up to logarithmic factors. Very soon after in A Super-Grover Separation Between Randomized and Quantum Query Complexities by Shalev Ben-David (arXiv) it was proved that the separation between the Quantum Query Complexity and Randomized query Complexity is super quadratic. With these three results our knowledge about the query complexity is slightly more clearer.

News for May 2015

1 Reply

Given the outstanding performance of last month it is not surprising (though disappointing) that this month performance has not matched up to last months performance. We have only three paper on property testing this month. Please do let us know if we have missed anything. Hopefully next month will be more exciting.

A simpler sublinear algorithm for approximating the triangle count by C. Seshadhri (arXiv). Approximating number of triangles in a graph is a very important problem that also has applications in real life. Using a model (that is possibly more relevant for real life application), in which the queries are more powerful than those allowed in sparse graph and dense graph model, Eden, Levi and Ron gave a sublinear algorithm for approximating the number of triangles in an undirected graph. Seshadri has given an alternate algorithm with similar complexity but much simpler analysis.

Using higher-order Fourier analysis over general fields by Arnab Bhattacharyya and Abhishek Bhowmick (arXiv). The holy grail of property testing is to characterize properties that are testable. Many classes of properties have been proved to the testable. For most of results in this direction the main tool has been Fourier analysis. But classical Fourier analysis has its limitations. Recently higher-oder Fourier analysis has been used as a tool in many fields of theory. In this paper also higher-order Fourier analysis has been used to prove that a large class of property is testable.

Streaming Property Testing of Visibly Pushdown Languages by Nathanaël François, Frédéric Magniez, Michel de Rougemont, Olivier Serre (arXiv). The subject of streaming property testing deals with the problem of distinguishing if a stream of data satisfies a property or is “far” from satisfying the property. The goal is to minimize the amount of memory space used (number of queries made is not relevant in this model. In this paper they give a streaming property tester for Visibly Pushdown Languages using poly logarithmic space.

Property Testing Review

The latest in property testing and sublinear time algorithms

News for January 2016

News for December 2015

News for November 2015

News (of lack thereof) for October 2015

Sublinear Algorithms Workshop at Johns Hopkins University

News for September 2015

News for August

News for July 2015

News for June 2015

News for May 2015