Author Archives: Gautam Kamath

News for August 2017

This month has been comparatively slow, with only five papers. I suppose we’re lucky that five papers is considered to be a slow month!

Local Testing and Decoding of High-Rate Error-Correcting Codes, by Swastik Kopparty and Shubhangi Saraf (ECCC). This is a survey article, covering recent results in locally testable, correctable, and decodable codes. A good place to start for any who have fallen behind on the recent literature.

Sample-Optimal Identity Testing with High Probability, by Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price (arXiv). Distribution testing results are often stated in the regime where the probability of failure is some constant, i.e. \(\delta = 1/3\). These can be boosted to arbitrarily high probability of success at a multiplicative cost of \(\log (1/\delta)\) using the “median trick” — repeat the test \(\log (1/\delta)\) times and choose the majority result. This paper shows that this method is suboptimal for identity testing with small values of \(\delta\). In particular, they give upper and lower bounds for the entire tradeoff curve between \(n, \varepsilon\), and \(\delta\).

Sample Efficient Estimation and Recovery in Sparse FFT via Isolation on Average, by Michael Kapralov (arXiv). The Sparse Fourier Transform problem has seen a wealth of study in the past few years, with various tradeoffs between sample and time complexity. This paper gives the first result achieving the optimal sample complexity of \(O(k \log n)\) while maintaining a time complexity which is sublinear in the length of the signal, \(n\). In order to obtain this, the author introduces a new technique for analyzing noisy hashing schemes.

Generalized Uniformity Testing, by Tuğkan Batu and Clément L. Canonne (arXiv). Uniformity testing is one of the most studied problems in distribution testing, and it is well known that the complexity is \(\Theta(\sqrt{n})\). This paper studies a slightly different question: given samples from a distribution \(p\), is it uniform over some (unknown) subset of its domain? The authors give upper and lower bounds for this question, showing that the complexity is \(\Theta(1/\|p\|_3)\). Interestingly, when \(p\) is uniform over a set of size \(\Omega(n)\), the complexity is \(\Theta(n^{2/3})\), which is more than the \(\Theta(\sqrt{n})\) cost of vanilla uniformity testing.

Boolean Unateness Testing with \(\tilde O(n^{3/4})\) Adaptive Queries, by Xi Chen, Erik Waingarten, and Jinyu Xie (arXiv). At this point, we have a fairly mature understanding of testing monotonicity over the Boolean hypercube: for non-adaptive algorithms, the complexity is \(\tilde \Theta(\sqrt{n})\). There exists a gap for adaptive algorithms — the best known lower bound is \(\tilde \Omega(n^{1/3})\), while the best upper bound is the same as for the non-adaptive case \(\tilde O(n^{1/2})\). However, many conjecture that adaptivity does not help for monotonicity testing. More recently, there has been study of testing unateness — a function is unate if it is monotone non-decreasing or non-increasing with respect to each coordinate. Interestingly, this work proves an adaptive upper bound of \(\tilde O(n^{2/3})\) which beats the lower bound of \(\tilde \Omega(n)\) for non-adaptive algorithms, thus showing that adaptivity does help for unateness testing.

News for May 2017

News for February 2017

News for November 2016

1 Reply

November was quite eventful for property testing, with six exciting new results for you to peruse.

Alice and Bob Show Distribution Testing Lower Bounds (They don’t talk to each other anymore.), by Eric Blais, Clément L. Canonne, and Tom Gur (ECCC). The authors examine distribution testing lower bounds through the lens of communication complexity, a-la Blais, Brody, and Matulef, who previously showed such a connection for property testing lower bounds in the Boolean function setting. In this work, the authors’ main result involves testing identity to a specific distribution \(p\). While Valiant and Valiant showed tight bounds involving the \(\ell_{2/3}\)-quasinorm of \(p\), this paper gives tight bounds using a different quantity, namely Peetre’s \(K\)-functional. Their techniques also give lower bounds for several other properties (some old and some new), including monotonicity, sparse symmetric support, and \(k\)-juntas in the PAIRCOND model.

Fast property testing and metrics for permutations, by Jacob Fox and Fan Wei (arXiv). This paper proves a general testing result for permutations. In particular, it shows that any hereditary property of permutations is two-sided testable with respect to the rectangular distance with a constant number of queries. While in many such testing results on combinatorial objects (such as graphs), a “constant number of queries” may be exorbitantly large (due to complexities arising from an application of the strong regularity lemma), surprisingly, the complexity obtained in this paper is polynomial in \(1/\varepsilon\).

A Unified Maximum Likelihood Approach for Optimal Distribution Property Estimation, by Jayadev Acharya, Hirakendu Das, Alon Orlitsky, and Ananda Theertha Suresh (arXiv, ECCC). There has been a considerable deal of work recently on estimating several symmetric distribution properties, namely support size, support coverage, entropy, and distance to uniformity. One drawback of these results is that, despite the similarities between these properties, seemingly different techniques are required to obtain optimal rates for each. This paper shows that one concept, pattern maximum likelihood (PML), unifies them all. A PML distribution of a multiset of samples is any distribution which maximizes the likelihood of observing the multiplicities of the multiset, after discarding the labels on the elements. This can behave quite differently from the sequence maximum likelihood (SML), or empirical distribution. In particular, if a multiset of samples on support \(\{x, y\}\) is \(\{x, y, x\}\), then the SML is \((2/3, 1/3)\), while the PML is \((1/2, 1/2)\). The main result of this paper is, if one can approximate PML, then applying the plug-in estimator gives the optimal sample complexity for all of the aforementioned properties. The one catch is that efficient approximation of the PML is currently open. Consider the gauntlet thrown to all our readers!

Statistical Query Lower Bounds for Robust Estimation of High-dimensional Gaussians and Gaussian Mixtures, by Ilias Diakonikolas, Daniel M. Kane, and Alistair Stewart (arXiv, ECCC). While the main focus of this work is on lower bounds for distribution estimation in the statistical query (SQ) model, this paper also has some interesting lower bounds for multivariate testing problems. Namely, they show that it is impossible to achieve a sample complexity which is significantly sublinear in the dimension for either of the following two problems:

Given samples from an \(n\)-dimensional distribution \(D\), distinguish whether \(D = \mathcal{N}(0,I)\) or \(D\) is \(\varepsilon/100\)-close to any \(\mathcal{N}(\mu, I)\) where \(\|\mu\|_2 \geq \varepsilon\).
Given samples from an \(n\)-dimensional distribution \(D\), distinguish whether \(D = \mathcal{N}(0,I)\) or \(D\) is a mixture of \(k\) Gaussians with almost non-overlapping components.

Collision-based Testers are Optimal for Uniformity and Closeness, by Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price (arXiv, ECCC). In the TCS community, the seminal results in the field of distribution testing include the papers of Goldreich and Ron and Batu et al., which study uniformity testing and \(\ell_2\)-closeness testing (respectively) using collision based testers. While these testers appeared to be lossy, subsequent results have attained tight upper and lower bounds for these problems. As suggested by the title, this paper shows that collision-based testers actually achieve the optimal sample complexities for uniformity and \(\ell_2\)-closeness testing.

Testing submodularity and other properties of valuation functions, by Eric Blais, and Abhinav Bommireddi (arXiv). This paper studies the query complexity of several properties which have been studied in the context of valuation functions in algorithmic game theory. These properties are real-valued functions over the Boolean hypercube, and include submodularity, additivity, unit-demand, and much more. The authors show that, for constant \(\varepsilon\) and any \(p \geq 1\), these properties are constant-sample testable. Their results are obtained via an extension of the testing by implicit learning method of Diakonikolas et al.

News for August 2016

3 Replies

This summer is a very exciting one for property testing, with seven (!) papers uploaded during the month of August. In particular, there have been a number of results on testing unateness.

An \(\tilde O(n)\) Queries Adaptive Tester for Unateness, by Subhash Khot and Igor Shinkar (arXiv, ECCC). A Boolean function is unate if, for each \(i \in [n]\), it is monotone non-increasing or monotone non-decreasing in coordinate \(i\). This is a generalization of monotonicity, one of the most studied problems in the property testing literature. This paper gives an adaptive algorithm for testing if a Boolean function is unate. The algorithm requires \(\tilde O(n)\) queries, improving on the \(O(n^{1.5})\) query tester of Goldreich et al. from 2000.

A \(\tilde O(n)\) Non-Adaptive Tester for Unateness, by Deeparnab Chakrabarty and C. Seshadhri (arXiv, ECCC). The unateness testing continues! This paper also gives a \(\tilde O(n)\) algorithm for this problem, but improves upon the previous work by making non-adaptive queries. The algorithm and analysis are very clean and elegant — with a theorem by Chakrabarty et al. in place, they take up just a single page!

Testing Unateness of Real-Valued Functions, by Roksana Baleshzar, Meiram Murzabulatov, Ramesh Krishnan S. Pallavoor, and Sofya Raskhodnikova (arXiv). The final paper this month on testing unateness, this work studies real-valued functions on the hypergrid \([n]^d\) (as opposed to Boolean functions on the hypercube). They give an adaptive algorithm requiring \(O\left(\frac{d \log (\max (d,n))}{\varepsilon}\right)\) queries, generalizing the above result by Khot and Shinkar, which works only for Boolean functions on the hypercube and requires \(O\left(\frac{d \log d}{\varepsilon}\right)\) queries. Additionally, they prove lower bounds for this setting (of \(\Omega(\min(d, |R|^2))\), where \(R\) is the range of the function) and the Boolean hypercube setting (of \(\Omega(\sqrt{d}/\varepsilon)\)). The latter lower bound leaves open a tantalizing question: does a \(O(\sqrt{d})\) query algorithm exist? In other words, is testing unateness no harder than testing monotonicity?

Testing k-Monotonicity, by Clément L. Canonne, Elena Grigorescu, Siyao Guo, Akash Kumar, Karl Wimmer (arXiv, ECCC). And now, a different generalization of monotonicity! This paper studies testing \(k\)-monotonicity, where a Boolean function over a poset \(\mathcal{D}\) is \(k\)-monotone if it alternates between the values \(0\) and \(1\) at most \(k\) times on any ascending chain in \(\mathcal{D}\). Classical monotonicity is then \(1\)-monotone in this definition. On the hypercube domain, the authors prove a separation between testing monotonicity and testing \(k\)-monotonicity and a separation between testing and learning. They also present a tolerant test for \(k\)-monotonicity over the hypergrid \([n]^d\) with complexity independent of \(n\).

The Dictionary Testing Problem, by Siddharth Barman, Arnab Bhattacharyya, and Suprovat Ghoshal (arXiv). The dictionary learning problem has enjoyed extensive study in computer science. In this problem, given a matrix \(Y\), one wishes to construct a decomposition \(Y = AX\) such that each column of \(X\) is \(k\)-sparse. In the settings typically studied, such a decomposition is guaranteed to exist. This paper initiates the dictionary testing problem, in which one wishes to test whether \(Y\) admits such a decomposition, or is far from it. The authors provide a very elegant characterization — it admits a decomposition iff the columns of \(Y\) have a sufficiently narrow Gaussian width.

The Sparse Awakens: Streaming Algorithms for Matching Size Estimation in Sparse Graphs, by Graham Cormode, Hossein Jowhari, Morteza Monemizadeh, and S. Muthukrishnan (arXiv). This paper provides streaming algorithms for estimating the size of the maximum matching in graphs with some underlying sparsity. They give a one-pass algorithm for insert-only and dynamic streams with bounded arboricity, and a two-pass algorithm for random order streams with bounded degree. Interestingly, for the latter result, they use local exploration techniques, which have seen previous use in the property testing literature.

Testing Assignments to Constraint Satisfaction Problems, by Hubie Chen, Matt Valeriote, and Yuichi Yoshida (arXiv). Given a CSP from a finite relational structure \(A\) and query access to an assignment, is the CSP satisfied? This paper shows a dichotomy theorem which characterizes which \(A\) admit a constant-query test. They go on to show a trichotomy theorem for when instances may include existentially quantified variables, classifying which \(A\) admit a constant-query test, which admit only a sublinear-query test, and which require a linear number of queries.

Minimizing Quadratic Functions in Constant Time, by Kohei Hayashi and Yuichi Yoshida (arXiv). This paper investigates the problem of (approximately) minimizing quadratic functions in high-dimensions. Prior approaches to this problem scale poorly with the dimension — at least linearly, if not worse. This work uses a sampling-based method to avoid paying this cost: the resulting time complexity is completely independent of the dimension!

Property Testing Review

The latest in property testing and sublinear time algorithms