Author Archives: Sourav Chakraborty

News for January 2016

In the first month of 2016 we had a couple of interesting papers in property testing.

Sublinear-Time Algorithms for Counting Star Subgraphs with Applications to Join Selectivity Estimation by Maryam Aliakbarpour, Amartya Shankha Biswas, Themistoklis Gouleakis, John Peebles, Ronitt Rubinfeld and Anak Yodpinyanee (ArXiv) . Estimating the number of subgraphs of a fixed form in a given graph has been a very well studied problem in the subject of property testing.  The classical example is counting the number of triangles in a graph. Usually we assume we can sample the vertices uniformly at random. In this paper a new model of query has been considered where one can sample the edges uniformly at random. Under this model it is shown that the query complexity can be reduced when the goal is to estimate the number of p-stars in a graph. It would be interesting to see if this model helps to get better query complexity for other problems also. Depending on the application different query models can be relevant and hence understanding these different query models are essential.

 

A New Approach for Testing Properties of Discrete Distributions by Ilias Diakonikolas and Daniel M. Kane (ArXiv) . Testing properties of distribution has been a central topic in our field. Not only are these problems interesting they also has been applied as subroutines to many other testing algorithms. In the literature a number of different properties like testing whether a distribution is uniform or testing whether two distributions are identical or testing if two distributions are independent and many more has been studied. In most of the properties the distance measure is the L_1 measure. In this paper a unifying technique to obtain testing algorithms for properties of distributions in the L_1 distance has been given by converting them to the problem of testing in the L_2 distance and applying standard testing algorithms in the L_2 distance. This approach would hopefully help us simplify the various algorithms in distribution testing.

News for September 2015

We have a couple of papers this month.

Are Few Bins Enough: Testing Histogram Distributions by Clement Canonne (ECCC). Testing whether a distribution is uniform is a very important, well studied and almost completely understood problem. A generalization of this question is whether a distribution on the set {1,…,n} is a k-histogram, that is, whether the distribution can be represented as a piece-wise constant function over at most k contiguous intervals. This very important generalization has been much less understood and huge gap between the upper and lower bound for the query complexity existed. In this paper the gap has been almost closed by improving both the upper and lower bounds.

Testing Properties of Functions on Finite Groups by Kenta Oono and Yuichi Yoshida (arXiv). Testing algebraic properties of functions have taken a central place in property testing right from the time of its inception.  Different kinds of functions have been studied in literature. In this paper the authors study the functions that map elements of a finite group to square matrices over the complex field with Frobenius norm 1. In this paper various properties of these functions are proved to be testable.  Some of the properties considered in this paper are important from the point of view of representation theory. This is the first paper making the connection between property testing and representation theory. With representation theory taking a prominent role in computer science lately, this paper is expected to be the first of a long list of related results to follow.

 

News for June 2015

This month we have a number of results that are related to query complexity though not directly related to property testing.

Diamond Sampling for Approximate Maximum All-pairs Dot-product (MAD) Search by Grey Ballard, Tamara G. Kolda, Ali Pinar and C. Seshadhri (arXiv). For a long time we have been hoping to use our tools and techniques to solve problems that are useful in the real world One such problem which has importance in the real life applications is, given two sets of vectors the problem is to find the t pairs of vectors with the highest dot product. In this paper they use clever sampling techniques to design algorithms which are better than the state-of-the-art algorithms. They not only give theoretical guarantee but also validate their results empirically.  Bridging the gap between theory and practice is an extremely important at the same time a very challenging job. We hope more work will be done in this direction.

Sub-linear Upper Bounds on Fourier dimension of Boolean Functions in terms of Fourier sparsity by Swagato Sanyal (arXiv). Understanding the relationship between Fourier dimension of a Boolean function and sparsity is an important problem. In this paper a better bound on the Fourier dimension in terms of sparsity is obtained. The main technique is to use the fact that the Fourier dimension is equivalent to the the non-adaptive parity decision tree and then bounding the parity decision tree in terms of sparsity.

Relationship between Deterministic Query Complexity, Randomized Query Complexity and Quantum Query Complexity.   In the world of query complexity understanding the exact relationship between the the various models of computation is the main problem. It is known that Deterministic Query Complexity, Randomized Query Complexity and Quantum Query complexity are all polynomially related. But the exact polynomial relation between them is not known. Last month there was a sudden burst of activity in this area with three papers addressing this problem coming out is a span of two weeks.  In the papers Separations in Query Complexity Based on Pointer Functions by Andris Ambainis, Kaspars Balodis, Aleksandrs Belovs, Troy Lee, Miklos Santha and Juris Smotrovs (arXiv) and Towards Better Separation between Deterministic and Randomized Query Complexity by Sagnik Mukhopadhyay Swagato Sanyal (arXiv) it is proved that the Randomized Query Complexity and the Deterministic Query Complexity are quadratically related and that this bound is tight up to logarithmic factors. Very soon after in A Super-Grover Separation Between Randomized and Quantum Query Complexities by Shalev Ben-David (arXiv) it was proved that the separation between the Quantum Query Complexity and Randomized query Complexity is super quadratic. With these three results our knowledge about the query complexity is slightly more clearer.

News for May 2015

Given the outstanding performance of last month it is not surprising (though disappointing) that this month performance has not matched up to last months performance. We have only three paper on property testing this month. Please do let us know if we have missed anything. Hopefully next month will be more exciting.

A simpler sublinear algorithm for approximating the triangle count by C. Seshadhri (arXiv). Approximating number of triangles in a graph is a very important problem that also has applications in real life. Using a model (that is possibly more relevant for real life application), in which the queries are more powerful than those allowed in sparse graph and dense graph model, Eden, Levi and Ron gave a sublinear algorithm for approximating the number of triangles in an undirected graph. Seshadri has given an alternate algorithm with similar complexity but much simpler analysis.

Using higher-order Fourier analysis over general fields by Arnab Bhattacharyya and Abhishek Bhowmick (arXiv). The holy grail of property testing is to characterize properties that are testable. Many classes of properties have been proved to the testable. For most of results in this direction the main tool has been Fourier analysis. But classical Fourier analysis has its limitations. Recently higher-oder Fourier analysis has been used as a tool in many fields of theory. In this paper also higher-order Fourier analysis has been used to prove that a large class of property is testable.

Streaming Property Testing of Visibly Pushdown Languages by Nathanaël François, Frédéric Magniez, Michel de Rougemont, Olivier Serre (arXiv).  The subject of streaming property testing deals with the problem of distinguishing if a stream of data satisfies a property or is “far” from satisfying the property. The goal is to minimize the amount of memory space used (number of queries made is not relevant in this model. In this paper they give a streaming property tester for  Visibly Pushdown Languages using poly logarithmic space.

 

News for February 2015

The following three very interesting papers came out in the month of February.

A polynomial regularity lemma for semi-algebraic hypergraphs and its applications in geometry and property testing by Jacob Fox, Janos Pach and Andrew Suk (arXiv). Testability of graph or hyper-graph properties is all about regularity. Alon and Shapira used regularity lemma for graphs to prove that every hereditary property of graphs is testable. Generalization of this result for hyper-graphs has been obtain recently. The upper bound on the query complexity for testing hereditary properties for graphs (or hyper-graphs) is a function that is independent of the size of the input but dependent on the promise parameter. The query complexity as a function of the promise parameter, ε, is the same as the function that appears in the regularity lemma (where ε is the approximation parameter), which is usually a tower of ε. In this paper it has been shown that for special hereditary graph properties (and for other generalized objects) the constant as a function of the approximation parameter is polynomial. Thus for these special classes of hereditary properties (for example H-freeness, where H is k-uniform) the query complexity for testing is a polynomial in the promise parameter.  This is an important step towards understanding the dependence of the query complexity for testing hereditary graph (and hyper-graph) properties on the promise parameter.

Proofs of Proximity for Context-Free Languages and Read-Once Branching Programs by Oded Goldreich, Tom Gur and Ron Rothblum (ECCC). In a recent paper of Gur and Rothblum, they studied the property testing model where the tester/verifier is given free access to a short proof in addition to the query access to the input. A related model called, interactive proof of proximity, by proposed by Rothblum, Vadhan and Wigderson where the verifier is allowed to interact with an all powerful prover.  There is a tradeoff between the size of the short proof or the amount of interaction and the amount of queries made to the input. In both these models the verifier is more powerful than in the traditional property testing setting  So for properties that are not testable in the standard property testing setting, understanding the query complexity in the stronger models is a natural question. In this paper context-free languages and small read-once branching programs has been studied.

Sample Complexity for Winner Prediction in Elections by Arnab Bhattacharyya and Palash Dey (arXiv). Number of random samples necessary to predict the winner in an election, where there are a number of contestants, is not only an interesting theoretical problem but also has lots of real life applications. Different voting rules may have different complexity in terms of number of samples required. In this paper a number of different standard voting rules have been considered and bounds for number of samples required has been presented. It is possibly the first time sampling has been analyzed in the context of social choice theory. This is indeed a area where a lot more theoretical works are expected to follow.

News for November 2014

With the submission deadline for many conferences in November it is no surprise that we have good a number of nice articles related to property testing submitted in ArXiv and ECCC this month. Here they are:

Sunflowers and Testing Triangle-Freeness of Functions by Ishay Haviv and Ning Xie (arXiv). Testing triangle freeness is one of the central problems in property testing. The upper bound on the query complexity is obtained using the regularity lemma. The upper bound is constant that comes from the regularity lemma. The lower bound (in terms of the distance parameter, ε) is a polynomial in ε. In another line of recent works different variations of sunflower lemma has been studied.  Some important problems in complexity theory and algorithms, like matrix multiplication, has been connected to different variations of the sunflower lemma. In this work a new approach, using variations of sunflower lemma, is given to obtain lower bounds for triangle freeness. This work gives a new lower bound technique. It also reconfirms the need to understand the combinatorial problem of sunflower lemma and its variations.

Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts by Madhav Jha, C. Seshadhri and Ali Pinar (arXiv). Counting or approximately counting the number of occurrences of a particular subgraph in a given graph is a very important problem – not only as a mathematically interesting problem but also from the application point of view. The hard part is to design an algorithm that is provably good as well as empirically satisfactory. There is a huge line of research on this topic that solves this problem for a particular subgraph.  In this paper a 3-path sampling based algorithm is presented that can approximate the number of occurrences of any 4-vertex subgraph. Not only is the correctness and the running time of the algorithm is proved mathematically but the speed of the algorithm has been tested empirically also. This work should motivate us to design sub-linear algorithms that are both provably good as well as can be implementable.

Forrelation: A Problem that Optimally Separates Quantum from Classical Computing by Scott Aaronson and Andris Ambainis (arXiv). Understanding the power of quantum query over classical queries is well studied subject and has a long line of interesting results. A long standing open problem posed by Buhrman et al is related to the problem of what is the largest separation between classical query complexity and quantum query complexity.  In this paper it is shown that the Forrelation problem, where one needs to decide whether one Boolean function is highly correlated with the Fourier transform of a second function, can be solved using 1 quantum query where as classically one needs Ω(√n/log n) number of queries. They also prove this is almost tight. This improves the previous know gap of 1 vs. Θ(n^{1/4}).

Query complexity in expectation by Jedrzej Kaniewski, Troy Lee and Ronald de Wolf (arXiv). In query complexity usually we consider the number of queries required for computing a particular given function f(x) exactly when x has to be queried. But if we are required to only output a value whose expected value (expectation taken over random coin tosses of the algorithm) is f(x), then what happens to the query complexity.  This natural question has been address in this paper for the first time. They prove that in this model both the randomized and quantum query complexity is equal to two natural definition of degree for the function f. This is very natural model of computing and seems like a very interesting direction to pursue. it should yield some nice results and applications in near future.

A Chasm Between Identity and Equivalence Testing with Conditional Queries by Jayadev Acharya, Clément L. Canonne and Gautam Kamath (arXiv). Recently a number of works have studied the power of conditional sampling. Conditional sampling in general gives lot more power compared to standard sampling model. For example testing identity of an unknown distribution, with support size n, can be done using constant number of conditional sampling, while in the standard sampling model Θ(√n) number of samples are required. For a related problem of testing equivalence of two unknown distributions, in the standard sampling model Θ(n^{2/3}) number of queries are required while only polylog(n) conditional samples are sufficient.  It was not known if this bound for conditional samples is tight for this problem of testing equivalence of unknown distribution. In this paper a lower bound of Ω(√(loglog n)) is proved.

News for September 2014

It is unfortunate that in the month of september we had very few papers in theory that appeared in ECCC or Arxiv. And of them I could not find any paper that falls in the category of property testing / query complexity or sublinear algorithms.  It is possible I have overlooked some papers. Please do point out any paper that you may think relevant that appeared in the month of September.

 

News for March 2014

On Learning and Testing Dynamic Environments by Oded Goldreich and Dana Ron (ECCC). Given a dynamic environment that is evolving according to a local rule, the goal is to query some of the states in the system at some point of time and determine if the system is evolving according to some fixed rule or is far from it. A harder problem is to learn the evolution of the environment. They give some upper and some lower bounds for various versions of these problems. In particular they show that using sub-linear queries a lot can be achieved even in this setting. This is the first of its kind result and possibly many more results in this area will follow soon.