Category Archives: Monthly digest

News for July 2015

The summer month of July brings to us exciting work on distribution testing, scale-free graph testing, personal PageRank estimation, and (hold your breath!) cake cutting. Let’s hear more…

Avid followers of PTReview need little introduction to distribution testing, since we’ve seen much work on this topic over the past year. We have two concurrent results that really push the state-of-the-art of distribution testing.
Testing Shape Restrictions of Discrete Distributions by Clément Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld (arxiv). Most (sublinear) distribution testing results work for structured classes, like monotone, k-modal, Poisson, multinomial etc. distributions. This paper gives an overarching framework that unifies these concepts in terms of succinctness. Consider a class \(\mathbb{C}\) of distributions over \([n]\). This class is succinct if for any distribution \(\mathcal{D} \in \mathbb{C}\), one can partition \([n]\) into a “small” number of intervals such that \(\mathcal{D}\) is “approximately constant” with each interval. The main punchline is that any such class has a sublinear sample distribution tester. Amazingly, the generic tester given is optimal (up to poly log factors and the distance parameter)! Much previous work is subsumed as a corollary of this main theorem, and the paper also gives a lower bound framework to get matching lower bounds.

Optimal Testing for Properties of Distributions by Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath (arxiv). This paper concurrently gets many of the results of the previous paper using completely different techniques. The focus is on monotone, product, and log-concave distributions, where the paper gives completely optimal testers (both in terms on the domain size and the distance parameter). The key insight is to focus on the \(\chi^2\)-distance between distributions, though distribution testing typically uses the total-variation (or \(\ell_1\)) distance. The tester is also robust/tolerant, in that yes instances include distributions close to the class considered.

Bidirectional PageRank Estimation: From Average-Case to Worst-Case by Peter Lofgren, Siddhartha Banerjee, and Ashish Goel (arxiv). Not your typical sublinear algorithms paper, but that never stopped us. Personalized PageRank (PPR) is a fundamental similarity notion between vertices in a graph, and is (sort of) the probability that a short random walk from \(u\) ends at \(v\). Naturally, estimating the PPR value from \(u\) to \(v\) can be done in time \(O(1/p)\), where \(p\) is the PPR value. (Simply take many random walks from \(u\), and see how often you hit \(v\).) Can you do sublinear in this value? The answer is yes, when the graph is undirected and has bounded degree. One can do this in \(O(1/\sqrt{p})\) time, a significant improvement. It basically uses bidirectional search, by doing forward walks from \(u\) and backward walks from \(v\). These ideas were developed in earlier work of Lofgren et al, based on true-blooded sublinear graph algorithms for expander reconstruction by Kale, Peres, and Seshadhri, which are further based on truer-blooded property testing questions by Goldreich and Ron on expansion testing. The best thing, these algorithms work in practice!

Every Property is Testable on a Natural Class of Scale-Free Multigraphs – ver. 2 by Hiro Ito (arxiv). Alas, PTRreview missed ver. 1, which appeared in April. One of the weaknesses of property testing on sparse graphs is the restriction to bounded degree. This paper considers arbitrary sparse graphs, and focuses on graphs with a power-law degree distribution. There are classic results of Barabási and Albert showing the significance of such graphs (also called scale-free networks) in the real-world. The main result is that special classes of scale-free graphs defined by clustering properties are hyperfinite. This means that one can remove a small fraction of edges to split the graph into constant-sized pieces. An important result of Newman and Sohler showed (in the bounded degree case) that any property defined by a subset of hyperfinite graphs graphs is testable. This paper leverages that result to show that properties defined by clustered classes of scale-free graphs are also testable.

How to solve the cake-cutting problem in sublinear time – ver. 2 by Hiro Ito and Takahiro Ueda (arxiv). Again, ver. 1 appeared in April and was missed by us. Apologies! Cake cutting is a classic resource allocation problem. Consider a cake, as the unit interval \([0,1]\). There are \(n\) players, each of whom has a different utility of the cake, each specified by a continuous, real function on \([0,1]\). We wish to give each player a slice (subinterval of \([0,1]\)) such that they all have utility \(1/n\). An old result of Even and Paz give \(O(n \log n)\) divide and conquer algorithms. What can be said in sublinear time? This paper shows that one can give fair assignments to \(o(n)\) players in \(o(n)\) time, such that the assignment can be extended to almost all remaining players.

News for June 2015

This month we have a number of results that are related to query complexity though not directly related to property testing.

Diamond Sampling for Approximate Maximum All-pairs Dot-product (MAD) Search by Grey Ballard, Tamara G. Kolda, Ali Pinar and C. Seshadhri (arXiv). For a long time we have been hoping to use our tools and techniques to solve problems that are useful in the real world One such problem which has importance in the real life applications is, given two sets of vectors the problem is to find the t pairs of vectors with the highest dot product. In this paper they use clever sampling techniques to design algorithms which are better than the state-of-the-art algorithms. They not only give theoretical guarantee but also validate their results empirically.  Bridging the gap between theory and practice is an extremely important at the same time a very challenging job. We hope more work will be done in this direction.

Sub-linear Upper Bounds on Fourier dimension of Boolean Functions in terms of Fourier sparsity by Swagato Sanyal (arXiv). Understanding the relationship between Fourier dimension of a Boolean function and sparsity is an important problem. In this paper a better bound on the Fourier dimension in terms of sparsity is obtained. The main technique is to use the fact that the Fourier dimension is equivalent to the the non-adaptive parity decision tree and then bounding the parity decision tree in terms of sparsity.

Relationship between Deterministic Query Complexity, Randomized Query Complexity and Quantum Query Complexity.   In the world of query complexity understanding the exact relationship between the the various models of computation is the main problem. It is known that Deterministic Query Complexity, Randomized Query Complexity and Quantum Query complexity are all polynomially related. But the exact polynomial relation between them is not known. Last month there was a sudden burst of activity in this area with three papers addressing this problem coming out is a span of two weeks.  In the papers Separations in Query Complexity Based on Pointer Functions by Andris Ambainis, Kaspars Balodis, Aleksandrs Belovs, Troy Lee, Miklos Santha and Juris Smotrovs (arXiv) and Towards Better Separation between Deterministic and Randomized Query Complexity by Sagnik Mukhopadhyay Swagato Sanyal (arXiv) it is proved that the Randomized Query Complexity and the Deterministic Query Complexity are quadratically related and that this bound is tight up to logarithmic factors. Very soon after in A Super-Grover Separation Between Randomized and Quantum Query Complexities by Shalev Ben-David (arXiv) it was proved that the separation between the Quantum Query Complexity and Randomized query Complexity is super quadratic. With these three results our knowledge about the query complexity is slightly more clearer.

News for May 2015

Given the outstanding performance of last month it is not surprising (though disappointing) that this month performance has not matched up to last months performance. We have only three paper on property testing this month. Please do let us know if we have missed anything. Hopefully next month will be more exciting.

A simpler sublinear algorithm for approximating the triangle count by C. Seshadhri (arXiv). Approximating number of triangles in a graph is a very important problem that also has applications in real life. Using a model (that is possibly more relevant for real life application), in which the queries are more powerful than those allowed in sparse graph and dense graph model, Eden, Levi and Ron gave a sublinear algorithm for approximating the number of triangles in an undirected graph. Seshadri has given an alternate algorithm with similar complexity but much simpler analysis.

Using higher-order Fourier analysis over general fields by Arnab Bhattacharyya and Abhishek Bhowmick (arXiv). The holy grail of property testing is to characterize properties that are testable. Many classes of properties have been proved to the testable. For most of results in this direction the main tool has been Fourier analysis. But classical Fourier analysis has its limitations. Recently higher-oder Fourier analysis has been used as a tool in many fields of theory. In this paper also higher-order Fourier analysis has been used to prove that a large class of property is testable.

Streaming Property Testing of Visibly Pushdown Languages by Nathanaël François, Frédéric Magniez, Michel de Rougemont, Olivier Serre (arXiv).  The subject of streaming property testing deals with the problem of distinguishing if a stream of data satisfies a property or is “far” from satisfying the property. The goal is to minimize the amount of memory space used (number of queries made is not relevant in this model. In this paper they give a streaming property tester for  Visibly Pushdown Languages using poly logarithmic space.

 

News for April 2015

April was a busy time in property testing with 9 (!) papers posted online this month. So let’s jump right in.

Testing properties of graphs

Approximately Counting Triangles in Sublinear Time by Talya Eden, Amit Levi, and Dana Ron (ECCC). Can we estimate the number of triangles in a sparse graph in sublinear time? When we can only query a vertex to learn its degree or its i-th neighbor, then the answer is no. But this paper shows that if we can also ask whether a pair of vertices are connected by an edge, then the answer is yes!

Testing Cluster Structure of Graphs by Artur Czumaj, Pan Peng, and Christian Sohler (arXiv). Another fundamental problem in the analysis of graphs is that of determining if the vertices of a graph can be partitioned into a small number of good clusters. This paper shows that the appropriate formalization of this problem can be solved in sublinear-time, and that in fact \(O(\sqrt{n})\) queries suffice to test whether a graph with \(n\) vertices can be partitioned into \(k = O(1)\) clusters.

Testing properties of distributions

A Survey on Distribution Testing: Your Data is Big. But is it Blue? by Clément Canonne (ECCC). As we have seen on this blog, there has been a lot of recent development in testing properties of distributions. This latest survey provides an up-to-date introduction to this research.

Faster Algorithms for Testing under Conditional Sampling by Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapathi, and Ananda Theertha Suresh (arXiv). One of the recent developments in distribution testing has been the discovery that natural conditional sampling models offer a surprising amount of algorithmic power. This paper gives yet more evidence of this fact, notably showing that testing the equivalence of two distributions with support size \(n\) can be done with roughly \(O(\log \log n)\) queries—a doubly-exponential improvement over the best possible algorithm for the same problem in the standard sampling model!

Sampling Correctors by Clément Canonne, Themis Gouleakis, and Ronitt Rubinfeld (arXiv). This paper initiates the study of a research direction related, but not exactly equivalent to distribution testing: given that we assume that the true distribution has some structural property (like monotonicity, for example), but that the samples we draw from this distribution may contain errors, can we somehow use these noisy samples to generate “corrected” samples from the original underlying distribution?

Locally-testable codes

High rate locally-correctable and locally-testable codes with sub-polynomial query complexity by Swastik Kopparty, Or Meir, Noga Ron-Zewi, and Shubhangi Saraf (ECCC). The central question in locally-testable error-correcting codes is to determine the tradeoff (if any!) that we must have between the number of queries require to test if an input is a codeword, the rate of the code, and the distance between the codewords. This paper shows that there are codes with constant rate and distance that can also be tested and decoded with sub-polynomial query complexity \(\exp(\tilde{O}(\sqrt{\log n}))\).

Robust testing of lifted codes with applications to low-degree testing by Alan Guo, Elad Haramaty, and Madhu Sudan (ECCC). Typically, we require a local test to reject inputs that are far from codewords with a reasonably large probability (that is: we want the testers to have good soundness properties). A stronger requirement might be to want the local view of a tester on those inputs to be far from the local view of any codeword with good probability. Testers that satisfy this property are called robust, and have many useful properties. This paper shows that natural testers of low-degree polynomials and their generalizations known as lifted codes are indeed robust testers.

General property testing results

On Being Far from Far and on Dual Problems in Property Testing by Roei Tell (ECCC). If I can efficiently test that an input has property \(\Pi\), can I also efficiently test if an input is far from having the property \(\Pi\)? At first glance, it seems that both these problems (that is, the direct and dual version of the problem of testing \(\Pi\)) are equivalent, but this is not always the case. This paper explores this subtle question and provides initial results for dual problems in testing properties of functions, graphs, and distributions.

Trading query complexity for sample-based testing and multi-testing scalability by Eldar Fischer, Oded Lachish, Yadu Vasudev (arXiv). What properties of combinatorial objects can we test in sublinear time when we have only sample access to it (instead of full query access)? This paper provides a general and powerful result showing that every property that can be tested non-adaptively with a constant number of queries can also be tested with a sublinear number of samples.

 

News for March 2015

This month’s post will liberally link to previous posts, because it turns out we have written enough in the past to motivate this month’s work. Our news this March has more monotonicity testing, non-deterministic testing, and image testing. Let me test your patience no more…

On the Complexity of Nondeterministically Testable Hypergraph Parameters

by Marek Karpinski and Ronald Markó (arXiv). The notion of nondeterministic graph property testing was introduced by Lovász and Vestergombi and further developed by Gishboliner and Shapira, and the authors. Our August 2014 news post explains these results, rather (ahem) beautifully, so check that out. This result generalizes nondeterministic property testing to hypergraphs, and proves the equivalence to deterministic (standard) property testing. Unlike their previous result, this does not give an explicit function for the complexity of a deterministic tester for a specified nondeterministically testable property. (We only know that it is “some” constant.) This is analogous to the original work of Lovász and Vestergombi, which was non-constructive. These non-constructive results use limit theorems for dense (hyper)graphs. This certainly leaves the open question of getting explicit complexity bounds.

Quantum Algorithm for Monotonicity Testing on the Hypercube by Aleksandrs Belovs and (PTReview’s very own) Eric Blais (arXiv). Much has been said about Boolean monotonicity testing in our reports, and you can refresh everything by checking out the open problem for Feb 2015. This result gives a \(O(n^{1/4}\epsilon^{-1/2})\) in the quantum testing model (check Ashley Montanaro’s survey on PTReview). Standard quantum amplification of the classic edge tester and the recent Khot-Minzer-Safra monotonicity tester yield \(O(\sqrt{n/\epsilon})\) and \((n^{1/4}/\epsilon)\) testers respectively. The point of this result is to get the best dependence on both \(n\) and \(\epsilon\). For \(\epsilon = 1/\sqrt{n}\), the new quantum algorithm gives a cubic speedup over existing classical testers. From a quantum computing perspective, there are few problems with polynomial super-quadratic speedup, and Boolean monotonicity testing now enters that list.

 

Constant-Time Testing and Learning of Image Properties by Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova (arXiv). The notion of testing image properties was first introduced by Raskhodnikova and further developed by Tsur and Ron. There have been some really interesting results on applying these methods in practice for computer vision. This paper studies different query models for the property tester. Typically, the input is a grid of (black and white) pixels, and we wish to determine if the black region is connected, convex, etc. Previous testers were allowed to query any pixel of choice and could be adaptive. This paper focuses on the restricted sample based models, where the tester only sees uniform random pixels, and variants thereof. Interestingly, this limited power suffices to test and even tolerant test the properties of being a half-plane, connectedness, and convexity. There are numerous results for the different models and tolerant vs standard testers. Having richer models would likely have more impact in practice, where query models may not be feasible. Maybe this result will lead to more work in computer vision!

News for February 2015

The following three very interesting papers came out in the month of February.

A polynomial regularity lemma for semi-algebraic hypergraphs and its applications in geometry and property testing by Jacob Fox, Janos Pach and Andrew Suk (arXiv). Testability of graph or hyper-graph properties is all about regularity. Alon and Shapira used regularity lemma for graphs to prove that every hereditary property of graphs is testable. Generalization of this result for hyper-graphs has been obtain recently. The upper bound on the query complexity for testing hereditary properties for graphs (or hyper-graphs) is a function that is independent of the size of the input but dependent on the promise parameter. The query complexity as a function of the promise parameter, ε, is the same as the function that appears in the regularity lemma (where ε is the approximation parameter), which is usually a tower of ε. In this paper it has been shown that for special hereditary graph properties (and for other generalized objects) the constant as a function of the approximation parameter is polynomial. Thus for these special classes of hereditary properties (for example H-freeness, where H is k-uniform) the query complexity for testing is a polynomial in the promise parameter.  This is an important step towards understanding the dependence of the query complexity for testing hereditary graph (and hyper-graph) properties on the promise parameter.

Proofs of Proximity for Context-Free Languages and Read-Once Branching Programs by Oded Goldreich, Tom Gur and Ron Rothblum (ECCC). In a recent paper of Gur and Rothblum, they studied the property testing model where the tester/verifier is given free access to a short proof in addition to the query access to the input. A related model called, interactive proof of proximity, by proposed by Rothblum, Vadhan and Wigderson where the verifier is allowed to interact with an all powerful prover.  There is a tradeoff between the size of the short proof or the amount of interaction and the amount of queries made to the input. In both these models the verifier is more powerful than in the traditional property testing setting  So for properties that are not testable in the standard property testing setting, understanding the query complexity in the stronger models is a natural question. In this paper context-free languages and small read-once branching programs has been studied.

Sample Complexity for Winner Prediction in Elections by Arnab Bhattacharyya and Palash Dey (arXiv). Number of random samples necessary to predict the winner in an election, where there are a number of contestants, is not only an interesting theoretical problem but also has lots of real life applications. Different voting rules may have different complexity in terms of number of samples required. In this paper a number of different standard voting rules have been considered and bounds for number of samples required has been presented. It is possibly the first time sampling has been analyzed in the context of social choice theory. This is indeed a area where a lot more theoretical works are expected to follow.

News for January 2015

We ended our last post discussing results from 2014 by discussing the monotonicity testing conjecture that \(O(\sqrt{n})\) queries suffice to test whether a Boolean function is monotone.  As it turns out, a proof of this conjecture was just around the corner…

On Monotonicity Testing and Boolean Isoperimetric type Theorems by Subhash Khot, Dor Minzer, and Muli Safra (ECCC). This remarkable paper shows that, indeed, we can test whether \(f : \{0,1\}^n \to \{0,1\}\) is monotone with \(O(\sqrt{n})\) queries using a natural tester. The authors establish this result by building on the connection between monotonicity testers and directed analogues of isoperimetric inequalities on the Boolean hypercube first established by Chakrabarty and Seshadhri. Specifically, the authors establish a directed analogue of a classic inequality of Talagrand and combine it with a number of other interesting innovations to analyze their monotonicity tester.

Big Data on the Rise: Testing monotonicity of distributions by Clément Canonne (arXiv). Another setting in which monotonicity testing has received a lot of attention recently is in the setting of testing properties of distributions. A distribution \(D\) on \(\{1,2,\ldots,n\}\) is monotone (non-increasing) if its probability mass function is non-increasing: \(D(1) \ge D(2) \ge \cdots \ge D(n)\). When we must test whether a function is monotone by drawing samples from the distribution, \(\tilde{\Theta}(\sqrt{n})\) samples are both necessary and sufficient. This work shows that, in contrast, testers with stronger query access (such as conditional sampling) to the distribution can test monotonicity much more efficiently.

News for December 2014

Happy new year! I just looked back at our archives, and saw that PTReview has been on from July 2013. I’m happy (and maybe mildly surprised) that it’s still going strong. And so the chronicling of \(o(n)\) continues…

Much coolness we have to report: permutation testing, linearity testing, distribution testing, and monotonicity testing. Without further ado:

Large permutations and parameter testing by Roman Glebov, Carlos Hoppen, Tereza Klimosova, Yoshiharu Kohayakawa, Daniel Kral, and Hong Liu (arXiv). Just as typical dense graph testing involves checking properties on a random, constant-sized induced subgraph, we can look at permutation testers that test properties of permutations by sampling sub-permutations. The theory of dense graph testing is closely tied to the Szemeredi regularity lemma, the notion of graph limits, and the behavior of subgraph densities. (Check out this survey by Borgs et al.) An analogous theory for permutations has been built by a subset of the authors in a series of papers (survey). There is a notion of permutation properties/parameters that are forcible, meaning that two permutations that have similar frequencies of (a fixed, finite set of) subpermutations have close values of the parameter. This seems naturally connected to testing. Indeed, a permutation parameter that is forcible can be approximated in constant-time by simply approximating the frequencies of the subpermutations. This paper shows that converse is false: there is an constant-time approximable property that is not forcible.

A Self-Tester for Linear Functions over the Integers with an Elementary Proof of Correctness by Sheela Devadas and Ronitt Rubinfeld (arXiv). The venerable property of linearity testing needs no introduction, and has enough history to fill up this month’s post (and the next and the next). But what if the domain is n-bit integers and we care about running time in terms of bit operations? Suppose we wish to check if a program supposed to compute \(f(x) = b\cdot x\) (for fixed, known \(b\)). Inputs like \(2^k\) are easy to evaluate by the checker, and could be used to get faster checkers. This paper gives testers for linearity, multivariate linear functions, and multilinear functions, where computation time is always linear in sample complexity.

\(\ell_p\) Testing and Learning of Discrete Distributions by Bo Waggoner (arXiv). Distribution testing needs little introduction, and we have seen much progress over our time in PTReview. Let’s start with the basics. The seminal work of Batu et al. and later Paninksi showed that the sample complexity of testing uniformity of a distribution \(\mathcal{D}\) over universe \([n]\) is \(\Theta(\sqrt{n}/\epsilon^2)\). Meaning, the sample complexity of checking \(\|\mathcal{D} – \mathcal{U}\|_1 > \epsilon\) (where \(\mathcal{U}\) is uniform on \([n]\)) is \(\Theta(\sqrt{n}/\epsilon^2)\). But what if we had \(\ll \sqrt{n}/\epsilon^2\) samples? From prior work, nothing could be inferred. (Update: As Ilias Diakonikolas pointed out to me, the \(\ell_2\) results were previously known, both by the Batu et al. work and a recent paper by Chan et al. that settles the \(\ell_2\) question.) This paper changes that, and says that we can still infer something about other \(\ell_p\) norms, \(\|\mathcal{D} – \mathcal{U}\|_p\). What impressed me about this paper is the detailed understand of the interplay between \(n, \epsilon, p\). For example, the sample complexity of uniformity testing over \(\ell_p\)-norm for any \(p > 1\) is independent of \(n\). There are many, many results in this paper with an intriguing threshold phenomenon at \(p=2\). For distribution testing in practice, I would think that this result would be of much significance.

New algorithms and lower bounds for monotonicity testing by Xi Chen, Rocco A. Servedio, and Li-Yang Tan (arXiv). Ah yes, Boolean monotonicity testing. Consider the standard coordinate wise partial order on \(\{0,1\}^n\), given by \(\prec\). A function \(f:\{0,1\}^n \rightarrow \{0,1\}\) is monotone if \(\forall x \prec y, f(x) \leq f(y)\). The complexity of property testing (Boolean) monotonicity is one of those tantalizing, simple-to-state questions that is still quite open. I’ll spare you the full story and the epsilons, but here’s the status. The best upper bound is a non-adaptive, one-sided \(O(n^{7/8})\) tester by Chakrabarty and Seshadhri. The best lower bound is a non-adaptive, one-sided lower bound of \(\Omega(\sqrt{n})\) by Fischer et al. This implies an \(\Omega(\log\log n)\) lower bound for general testers. This paper changes all of this. The authors prove a \(\Omega(n^{1/5})\) lower bound for two-sided non-adaptive testers, leading to an exponentially better \(\Omega(\log n)\) lower bound for general testers. The main insight is to focus on monotone vs non-monotone families of linear threshold functions, and show that procedures making few (non-adaptive) queries cannot distinguish between such families. The main hammer is to use recent Central Limit Theorems. As an added bonus, this paper improves the upper bound for monotonicity testing to \(O(n^{5/6})\), with a modified tester (and better analysis) of Chakrabarty and Seshadhri. But can the lower bound be improved?

Boolean function monotonicity testing requires (almost) \(n^{1/2}\) non-adaptive queries by Xi Chen, Anindya De, Rocco A. Servedio, and Li-Yang Tan (arXiv). Yes, the lower bound can be improved. This paper gets the non-adaptive two-sided bound to (almost) \(\Omega(\sqrt{n})\), matching the one-sided Fischer et al bound. The paper proves improved Central Limit Theorems, tailored for this application. The authors, and I with them, believe that this is the true complexity. At that intriguing note, we end 2014!

News for November 2014

With the submission deadline for many conferences in November it is no surprise that we have good a number of nice articles related to property testing submitted in ArXiv and ECCC this month. Here they are:

Sunflowers and Testing Triangle-Freeness of Functions by Ishay Haviv and Ning Xie (arXiv). Testing triangle freeness is one of the central problems in property testing. The upper bound on the query complexity is obtained using the regularity lemma. The upper bound is constant that comes from the regularity lemma. The lower bound (in terms of the distance parameter, ε) is a polynomial in ε. In another line of recent works different variations of sunflower lemma has been studied.  Some important problems in complexity theory and algorithms, like matrix multiplication, has been connected to different variations of the sunflower lemma. In this work a new approach, using variations of sunflower lemma, is given to obtain lower bounds for triangle freeness. This work gives a new lower bound technique. It also reconfirms the need to understand the combinatorial problem of sunflower lemma and its variations.

Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts by Madhav Jha, C. Seshadhri and Ali Pinar (arXiv). Counting or approximately counting the number of occurrences of a particular subgraph in a given graph is a very important problem – not only as a mathematically interesting problem but also from the application point of view. The hard part is to design an algorithm that is provably good as well as empirically satisfactory. There is a huge line of research on this topic that solves this problem for a particular subgraph.  In this paper a 3-path sampling based algorithm is presented that can approximate the number of occurrences of any 4-vertex subgraph. Not only is the correctness and the running time of the algorithm is proved mathematically but the speed of the algorithm has been tested empirically also. This work should motivate us to design sub-linear algorithms that are both provably good as well as can be implementable.

Forrelation: A Problem that Optimally Separates Quantum from Classical Computing by Scott Aaronson and Andris Ambainis (arXiv). Understanding the power of quantum query over classical queries is well studied subject and has a long line of interesting results. A long standing open problem posed by Buhrman et al is related to the problem of what is the largest separation between classical query complexity and quantum query complexity.  In this paper it is shown that the Forrelation problem, where one needs to decide whether one Boolean function is highly correlated with the Fourier transform of a second function, can be solved using 1 quantum query where as classically one needs Ω(√n/log n) number of queries. They also prove this is almost tight. This improves the previous know gap of 1 vs. Θ(n^{1/4}).

Query complexity in expectation by Jedrzej Kaniewski, Troy Lee and Ronald de Wolf (arXiv). In query complexity usually we consider the number of queries required for computing a particular given function f(x) exactly when x has to be queried. But if we are required to only output a value whose expected value (expectation taken over random coin tosses of the algorithm) is f(x), then what happens to the query complexity.  This natural question has been address in this paper for the first time. They prove that in this model both the randomized and quantum query complexity is equal to two natural definition of degree for the function f. This is very natural model of computing and seems like a very interesting direction to pursue. it should yield some nice results and applications in near future.

A Chasm Between Identity and Equivalence Testing with Conditional Queries by Jayadev Acharya, Clément L. Canonne and Gautam Kamath (arXiv). Recently a number of works have studied the power of conditional sampling. Conditional sampling in general gives lot more power compared to standard sampling model. For example testing identity of an unknown distribution, with support size n, can be done using constant number of conditional sampling, while in the standard sampling model Θ(√n) number of samples are required. For a related problem of testing equivalence of two unknown distributions, in the standard sampling model Θ(n^{2/3}) number of queries are required while only polylog(n) conditional samples are sufficient.  It was not known if this bound for conditional samples is tight for this problem of testing equivalence of unknown distribution. In this paper a lower bound of Ω(√(loglog n)) is proved.

News for October 2014

Halloween came early for the property testing community this year, with lots of treats throughout the month of October.

There was FOCS 2014, with a number of relevant contributions including On Learning and Testing Dynamic Environments by Oded Goldreich and Dana Ron (discussed previously here), New algorithms and lower bounds for monotonicity testing by Xi Chen, Rocco Servedio, and Li-Yang Tan and An Automatic Inequality Prover and Instance Optimal Identity Testing by Greg and Paul Valiant.

There were also two workshops on the first day of FOCS with very close connections to property testing. The first, Sparse Fourier Transform: Theory and Applications, was centered around recent developments in sublinear-time algorithms for computing the Fourier transform of signals that have a sparse spectrum (or are close to it). The second, Higher-order Fourier Analysis, discussed recent applications of this area of research in computer science; one of which is in testing algebraic properties of functions. The slides for the talks at these workshops are available online.

Another treat (or a promise of future treats) came in the form of the list of accepted papers for ITCS 2015. As we can see from the list, there will be many presentations at the conference on property testing topics, and we can look forward to lots of interesting reading when these papers are posted online.

Finally, we have this month’s contributions:

Gowers Norm, Function Limits, and Parameter Estimation by Yuichi Yoshida. (arXiv) There has been a lot of recent progress on the problem of characterizing the set of algebraic properties of functions that can be tested with a constant number of queries. This line of work has been notable not just for its success, but also for the connections it has established between property testing and other areas of mathematics (such as higher-order Fourier analysis, for example). In this work, this problem (and generalizations of it) is revisited from a different angle, by considering a new notion of function limits. As the results in the paper show, this notion leads to alternative proofs of the constant-query testability of many classic algebraic properties of functions, and promises to offer a fruitful new line of inquiry.

Testing Identity of Structured Distributions by Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. (arXiv) The most fundamental problem in distribution testing is that of testing identity: given sample access to some unknown distribution q, is it identical (or far from) some known distribution p? This problem has been studied extensively and is well-understood: \(\Theta(\sqrt{n})\) queries are both necessary and sufficient to test identity of distributions with support size \(n\). The current paper considers a natural follow-up question: if our distributions p and q are not arbitrary distributions but instead come from some (potentially large but structured) class of distributions, can we test identity more efficiently in this setting? The results of the paper give an emphatic affirmative answer to this question, showing that for many natural classes of distributions, identity can be tested quite efficiently.

Testing Poisson Binomial Distributions by Jayadev Acharya and Constantinos Daskalakis. (arXiv) This paper considers another fundamental problem in distribution testing: how many samples must we draw from some unknown distribution q on \([n]\) to test whether it is a sum of Bernoulli distributions or not? Such distributions are known as Poisson Binomial distributions, and the current paper gives tight upper and lower bounds of \(\Theta(n^{1/4})\) samples for this testing task.