Author Archives: Clement Canonne

FOCS 2017: Workshop on Distribution Testing

As it may have transpired by now, FOCS 2017 (a.k.a. the 58^th Annual Symposium on Foundations of Computer Science) is around the corner.

I am not going to go into the details of the talks on or related to property testing at the conference itself,${}^{(\ast)}$ but instead focus on (one of) the workshops.

This FOCS will include 3 workshops,${}^{(\dagger)}$ on Saturday 14^th (gird your loins — they last the full day!):

Linear Sketching as a Tool for Everything
Frontiers in Distribution Testing
Hardness Escalation in Communication Complexity and Query Complexity

The second, Frontiers in Distribution Testing, is specifically about recent advances, new questions, and novel directions in property testing of distributions. Co-organized by Gautam Kamath and me, its official objective is to “catch the community up in recent developments, and highlight some of the most interesting frontiers in distribution testing.”

It’ll feature a stellar lineup of speakers${}^{(\ddagger)}$ (namely, Ilias Diakonikolas, Jiantao Jiao, Alon Orlitsky, Constantinos Daskalakis, Ryan O’Donnell, Ronitt Rubinfeld, and Tom Gur), as well as an open problem session — and, very possibly, pictures of eggs. The program is available, with all relevant details, on the workshop webpage.

If you are around next Saturday, consider attending — and submit your favorite open problems!

${}^{(\ast)}$ Hint: look at Sessions 7A and 10B.
${}^{(\dagger)}$ Thanks to the FOCS workshop chairs James Lee and Aleksander Mądry.
${}^{(\ddagger)}$ A lineup of stellar speakers.

News for September 2017

Leave a reply

Summer came and went, and Fall begins at full throttle with a (metric) ton of papers. Eight that we counted — if any was missed, please mention it in the comments!

Efficient Removal without Efficient Regularity, by Lior Gishboliner, Asaf Shapira (arXiv). Obtaining efficient removal lemmata for graphs pattern (such as triangle, to name the most famous), that is removal results with bounds on the number of copies of the pattern that is not mind-blowingly huge like a tower of $\varepsilon$, is a classic and longstanding problem. This work makes significant progress for the last remaining case, i.e. for the pattern $C_4$: providing bounds that are merely exponential in $\varepsilon$.

Local decoding and testing of polynomials over grids, by Srikanth Srinivasan, Madhu Sudan (arXiv, ECCC). In this work, the authors study the local decodability and local testability of error-correcting codes corresponding to low-degree polynomials on the grid $\{0,1\}^n$ (over a field $\mathbb{F}\supseteq \{0,1\}$). Obtaining both positive and negative results on these, a consequence of their results is a separation between local testability and local decodability for a natural family of codes.

Lower Bounds for Approximating Graph Parameters via Communication Complexity, by Talya Eden and Will Rosenbaum (arXiv). This paper establishes an analogue of the framework of Blais, Brody, and Matulef (2012), which enabled one to obtain property testing lower bounds by a reduction from communication complexity, for the setting of graph parameter estimation. The authors then leverage this technique to give new and simpler proofs of lower bounds for several such estimation tasks.

A Note on Property Testing Sum of Squares and Multivariate Polynomial Interpolation, by Aaron Potechin and Liu Yang (arXiv). The authors introduce and study the question of testing “sum-of-square-ness,” i.e. the property of a degree-$d$-polynomial being a sum of squares. Specifically, they show that one-sided sample-based testers cannot do much better than the trivial approach, that is that they require sample complexity $n^{\Omega(d)}$ — while learning the polynomial can be done with $n^{O(d)}$ samples.

Sharp Bounds for Generalized Uniformity Testing, by Ilias Diakonikolas, Daniel Kane, and Alistair Stewart (arXiv, ECCC). Remember the post from last month, which included a paper on “Generalized Uniformity Testing”? Well, this paper more or less settles the question, establishing tight bounds on the sample complexity of testing whether an (unknown) probability distribution over an (unknown) discrete domain is uniform on its support, or far from every uniform distribution. Specifically, the authors significantly strengthen the previous upper bound, by getting the right dependence on $\varepsilon$ for all regimes; and complement it by a matching worst-case lower bound.

Sample-Optimal Identity Testing with High Probability, by Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price (ECCC). Usually, in property testing we do not care too much about the error probability $\delta$: if one can achieve $1/3$, then simple repetition can bring it down to $\delta$ at the mild price of a $\log(1/\delta)$ factor in the query/sample complexity. Is that necessary, though? This paper shows that for uniformity and identity testing of distributions, the answer is “no”: for some regimes, this repetition trick is strictly suboptimal, as one can pay instead only a multiplicative $\sqrt{\log(1/\delta)}$. And quite interestingly, this improvement is achieved with the simplest algorithm one can think of: by considering the empirical distribution obtained from the samples.

A Family of Dictatorship Tests with Perfect Completeness for 2-to-2 Label Cover, by Joshua Brakensiek and Venkatesan Guruswami (ECCC). While I tried to paraphrase the original abstract, but my attempts only succeeded in making it less clear; and, for fear of botching the job, decided to instead quote said abstract: “[the authors] give a family of dictatorship tests with perfect completeness [that is, one-sided] and low-soundness for 2-to-2 constraints. The associated 2-to-2 conjecture has been the basis of some previous inapproximability results with perfect completeness. However, evidence towards the conjecture in the form of integrality gaps even against weak semidefinite programs has been elusive. [Their] result provides some indication of the expressiveness and non-triviality of 2-to-2 constraints, given the close connections between dictatorship tests and satisfiability and approximability of CSPs.”

A polynomial bound for the arithmetic $k$-cycle removal lemma in vector spaces, by Jacob Fox, László Miklós Lovász, and Lisa Sauermann (arXiv). And back to removal lemmata! This work proves a generalization of Green’s arithmetic $k$-cycle removal lemma, which held for any $k\geq 3$ and abelian group $G$; however, the bounds in this lemma were quite large — i.e., tower-ype. Here, the authors establish an efficient lemma (with polynomial bounds) for the case of the group $\mathbb{F}_p^n$ (where $p\geq 2$ is any fixed prime, and $k\geq 3$).

Update (10/04): Finally, a paper we covered last summer, The Dictionary Testing Problem, by Siddharth Barman, Arnab Bhattacharyya, and Suprovat Ghoshal, went under signficant changes. Now titled Testing Sparsity over Known and Unknown Bases, it now includes (in addition to the previous results) a testing algorithm for sparsity with regard to a specific basis: given a matrix $A \in \mathbb{R}^{d \times m}$ and unknown input vector $y \in \mathbb{R}^d$, does $y$ equal $Ax$ for some $k$-sparse vector $x$, or is it far from all such representations?

Update (10/5): we missed a recent paper of Benjamin Fish, Lev Reyzin, and Benjamin Rubinstein on Sublinear-Time Adaptive Data Analysis (arXiv). While not directly falling into the umbrella of property testing, this work considers sublinear-time algorithms for adaptive data analysis — similar in goal and spirit to property testing.

News for June 2017

Leave a reply

Happy Fourth of July to everyone! Last month was prolific for property testing, as we counted no less than nine papers making their way online!*

Parameterized Property Testing of Functions, by Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, and Nithin Varma. (ECCC) (A preliminary version was published in ITCS’17). In property testing, here of functions, it is standard to parameterize the query complexity by the two “obvious” parameters — namely, the domain size parameter $n$, and the proximity parameter $\varepsilon$. While this often leads to a good understanding of the landscape, sometime a more fine-grained analysis may be useful, to capture the complexity of the question in terms of the setting-specific “right” parameters. This work initiates this line of inquiry for functions $f\colon [n] \to \mathbb{R}$, showing examples where classic lower bounds can be circumvented by focusing on a more relevant parameters of the problem.

A Lower Bound for Nonadaptive, One-Sided Error Testing of Unateness of Boolean Functions over the Hypercube, by Roksana Baleshzar, Deeparnab Chakrabarty, Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, C. Seshadhri (arXiv). A Boolean function $f\colon \{0,1\}^n\to \{0,1\}$ is said to be unate if it is monotone (either increasing or decreasing) in each coordinate; or, equivalently, if there exists $\sigma\in\{0,1\}^n$ such that $x\mapsto f(x\oplus\sigma)$ is monotone. As reported in the previous months, there has been a significant amount of work lately on testing unateness of functions (including real-valued). In this short paper, the authors improve a lower bound of Chen et al. (2017) on non-adaptive, one-sided unateness testing of Boolean functions, bringing it from $\Omega(\frac{n}{\log^2 n})$ to $\Omega(\frac{n}{\log n})$ — leaving only a gap of $\log^2 n$ between known upper and lower bounds.

Adaptivity is exponentially powerful for testing monotonicity of halfspaces, by Xi Chen, Rocco A. Servedio, Li-Yang Tan, and Erik Waingarten (arXiv:). While monotonicity testing of general Boolean functions has been widely studied these past years, testing monotonicity of functions promised to belong to a specific class (here, halfspaces — linear threshold functions) is much less understood. In this work, the authors show that testing monotonicity of halfspaces can done adaptively with $\mathrm{poly}(\log n, 1/\varepsilon)$ queries. Since the $\Omega(n^{1/2})$ non-adaptive lower bound for general monotonicity testing relied on instances that were halfspaces, it applies here as well — showing an exponential gap between adaptive and non-adaptive testing in this case. “It’s as extreme as it gets!”

Testing Piecewise Functions, by Steve Hanneke and Liu Yang (arXiv). Generalizing the concept of “unions of intervals,” the authors consider here the following general question: fix a class of functions $\mathcal{H}\subseteq \mathcal{Y}^{\mathcal{X}}$, where $\mathcal{X},\mathcal{Y}$ are “nice” spaces (typically $\mathcal{X}=\mathbb{R}$) and parameter $k$ and $\varepsilon$. The goal is to test whether (i) one can partition $\mathcal{X}$ into $k$ pieces and find $h_1,\dots,h_k\in\mathcal{H}$, such that $f=h_i$ on the $i$-th piece; or (ii) $f$ is $\varepsilon$-far (for a notion of distance depending on the measure on $\mathcal{X}$) from every such “$k$-piecewise function.”
They give upper bounds (as well as a lower bound) for the query complexity of this question in both the active testing and the sample-based testing settings (for the latter, considering the class $\mathcal{H}$ of constant functions).

Sample-based high-dimensional convexity testing, by Xi Chen, Adam Freilich, Rocco A. Servedio, and Timothy Sun (arXiv). In the setting considered in this paper, there exists an unknown set $S\subseteq \mathbb{R}^n$, and the algorithm is provided with samples $\langle x, S(x)\rangle$, where $x$ is distributed according to a standard Gaussian distribution and the label $S(x)$ is $1$ iff $x\in S$. The goal is to test whether $S$ is convex (or $\varepsilon$-far from convex under the standard Gaussian distribution). The authors then show near-matching upper and lower bounds: one-sided testing of convexity in this testing has sample complexity $2^{\tilde{\Theta}(n)}$, while two-sided testing has sample complexity $2^{\tilde{\Theta}(\sqrt{n})}$.

Distributed Detection of Cycles, by Pierre Fraigniaud and Dennis Olivetti (arXiv); and Distributed Subgraph Detection, by Pierre Fraigniaud, Pedro Montealegre, Dennis Olivetti, Ivan Rapaport, and Ioan Todinca (arXiv). Improving the understanding of distributed property testing of graphs in the CONGEST model (already touched upon last month), these two works tackle the question of distributed testing of subgraph freeness. The first considers testing cycle-freeness, showing that $C_k$-freeness can be tested in the CONGEST model in a constant number of rounds (and with one-sided error). The second considers the more general question testing $H$-freeness of graphs, for a larger class of subgraphs that includes cycles. Specifically, they show that for any pattern $H$ consisting of a couple of nodes, connected in an arbitrary manner to a tree, $H$-freeness can be tested (with one-sided error) in $O(1)$ rounds.

Hypothesis Testing For Densities and High-Dimensional Multinomials: Sharp Local Minimax Rates, by Sivaraman Balakrishnan and Larry Wasserman (arXiv). In distribution testing, the question of testing identity to a known discrete distribution $p$ is almost fully settled, with near tight “instance-optimal” bounds. But what about the related question of testing identity to $p$ (not necessarily discrete) under the additional promise that both $p$ and the unknown distribution $q$ both belong to a specific class $C$ of distributions, instead of arbitrary? This is the problem this work tackles, for the specific case of $C$ being (i) the class of multinomial distributions, and (ii) the class of (continuous) Lipschitz distributions. In both cases, the authors establish some near-tight bounds, both of an “instance-optimal” flavor.

On Axis-Parallel Tests for Tensor Product Codes, by Alessandro Chiesa, Peter Manohar, and Igor Shinkar (ECCC). Low-degree testing has its roots in the origins of property testing, and its connection to the PCP theorem; and has been a fruitful and rich line of work since. Here, the authors analyze a type of low-degree test introduced in 2006 by Ben-Sasson and Sudan, which constrains the tester to only considers restrictions along axis-parallel hyperplanes; and establish new results on these class of (weaker but more general) testers.

* As usual, if you find a paper we forgot or misrepresented, please signal it in the comments below.

News for March 2017

Leave a reply

March was a relatively good month for property testing, with 3 papers — including a foray in differential privacy.

Optimal Unateness Testers for Real-Valued Functions: Adaptivity Helps, by Roksana Baleshzar, Deeparnab Chakrabarty, Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, and C. Seshadhri (arXiv). Testing unateness (that is, whether a function is monotone after a rotation of the hypercube) has recently received significant attention. Understanding the power of adaptivity in property testing, on the other hand, has been a long-standing question, which has sparked much interest over the past decade. In this paper, the authors obtain optimal results for testing unateness of real-valued functions $f\colon \{0,1\}^d\to \mathbb{R}$ and $f\colon [n]^d\to \mathbb{R}$ (up to the dependence on the proximty parameter $\varepsilon$), and show that — unlike what happens for monotonicity testing of real-valued functions — adaptivity helps. Namely, adaptive testing of unatess of functions $f\colon \{0,1\}^d\to \mathbb{R}$ has query complexity $O(d/\varepsilon)$, but non-adaptive testers require $\Omega(d\log d)$ queries.
This work is a merged, and significantly improved version of two papers we covered a few months back, by the same authors.

Switching from function to distribution testing, last month saw two independent samples appear online:

Near-Optimal Closeness Testing of Discrete Histogram Distributions, by Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin (arXiv). Closeness testing of discrete distributions is the question of testing, given sample access to two unknown distributions $p,q$ over a known discrete domain $[n] \stackrel{\rm def}{=} \{1,\dots,n\}$, whether $p=q$ or $d_{\rm TV}(p,q)> \varepsilon$ (are the two distributions the same, or differ significantly in total variation distance?). This quite fundamental question is by now fully understood in the general case, where $p,q$ are allowed to be arbitrary distributions over ; but what if one has some additional knowledge about them? Specifically, what if we were guaranteed that both distributions have some “nice structure” — e.g., that they are piecewise constant with $k$ pieces ($k$-histograms)? Leveraging machinery the authors developed in a series of previous work (for the continuous case), the authors show that in this case, the sample complexity of “testing closeness under structural assumptions” is quite different than in the general case; and that the resulting (near tight) bound is a rather intricate tradeoff between the three parameters $n,k,\varepsilon$.

Priv’IT: Private and Sample Efficient Identity Testing, by Bryan Cai, Constantinos Daskalakis, and Gautam Kamath (arXiv). Distribution testing is a vibrant field; differential privacy (DP) is an incredibly active and topical area. What about differentially private distribution testing — a.k.a. testing the data without learning too much about any single sample? In this paper, the authors address the task of performing identity testing of distributions (“given samples from an unknown arbitrary distribution, decide whether it matches a fixed known probability distribution/model”) in the DP setting, focusing on getting the best of both worlds: how to guarantee privacy without sacrificing efficiency. Not only do they obtain asymptotically near-optimal bounds in both the testing and differential privacy parameters — they also run some experiments to validate their approach empirically. (Plus, the title is rather cute.)

Usual disclaimer: if we forgot or misrepresented your paper, please signal it in the comments — we’ll address it as quickly as we can.

News for December 2016

Leave a reply

December was indeed a merry month for property testing, with seven new papers appearing online.* Let’s hope the trend continues!

Cube vs. Cube Low Degree Test, by Amey Bhangale, Irit Dinur, and Inbal Livni Navon (ECCC). This work provides a new and improved analysis of the “low-degree test” of Raz and Safra, tightening the dependence on the alphabet size of the soundness parameter. Specifically, the goal is as follows: given query access to the encoding of a function $f\colon \mathbb{F}^m\to \mathbb{F}$ as a “cube table,” decide whether $f$ is a low-degree polynomial (i.e., of degree at most $d$). With direct applications to PCP theorems, this question has a long history; here, the authors focus on a very simple and natural test, introduced by Raz–Safra and Arora–Sudan. In particular, they improve the soundness guarantee, which previously required the error to be at least $\textrm{poly}(d)/\mathbb{F}^{1/8}$, to obtain a dependence on the field size which is only $\mathbb{F}^{-1/2}$.

Robust Multiplication-based Tests for Reed-Muller Codes, by Prahladh Harsha and Srikanth Srinivasan (arXiv). Given a function $f\colon \mathbb{F}_q^n\to\mathbb{F}_q$ purported to be a degree-$d$ polynomial, deciding whether this is indeed the case is a question with applications to hardness of approximation and — as the title strongly hints— to testing of Reed—Muller codes. Here, the authors generalize and improve on a test originally due to Dinur and Guruswami, improving on the soundness: that is, they show a robust version of the soundness guarantee of this multiplication test, answering a question left open by Dinur and Guruswami.

A Note on Testing Intersection of Convex Sets in Sublinear Time, by Israela Solomon (arXiv). In this note, the author addresses the following testing question: given $n$ convex sets in $\mathbb{R}^d$, distinguish between the case where (i) the intersection is non-empty, and (ii) even after removing any $\varepsilon n$ sets, the intersection is still empty. Through the use of a generalization of Helly’s Theorem due Katchalski and Liu (1979), the author then provides and analyzes an algorithm for this question, with query complexity $O(\log 1/(d\varepsilon))$.

A Characterization of Constant-Sample Testable Properties, by Eric Blais and Yuichi Yoshida (arXiv). A lot of work in property testing has been to understand which properties can be locally tested, i.e. admit testers with constant query complexity. Here, the authors tackle — and answer — the variant of this question in the sample-based testing model, that is restricting the ability of the algorithm to only obtain the value of the function on independently and uniformly distributed locations. Namely, they provide a full characterization of the properties of Boolean-valued function which are testable with constant sample complexity, resolving the natural question of whether most properties are easily tested from samples only. As it turns out, only those properties which are (essentially) $O(1)$-part-symmetric have this nice testability — where a property $\mathcal{P}$ is $k$-part-symmetric if one can divide the input variables in $k$ blocks, such that membership to $\mathcal{P}$ is invariant by permutations inside each block.

The last three are all works on distribution testing, and all concerned with the high-dimensional case. Indeed, most of the distribution testing literature so far has been focusing in probability distributions over arbitrary discrete sets or the one-dimension ordered line; however, when trying to use these results in the (arbitrary) high-dimensional case on is subjected to the curse of dimensionality. Can one leverage (presumed) structure to reduce the cost, and obtain computationally and sample-efficient testers?

Testing Ising Models, by Costis Daskalakis, Nishanth Dikkala, and Gautam Kamath (arXiv). In this paper, the authors take on the above question in the context of Markov Random Fields, and more specifically in the Ising model. Modeling the (unknown) high-dimensional distributions as Ising models with some promise on the parameters, they tackle the two questions of identity and independence testing; respectively, “is the unknown Ising model equal to a fixed, known reference distribution?” and “is the high-dimensional, a priori complex distribution a product distribution?”. They obtain both testing algorithms and lower bounds for these two problems, where the soundness guarantee sought is in terms of the symmetrized Kullback—Leibler divergence (a notion of distance more stringent that the statistical distance).

Square Hellinger Subadditivity for Bayesian Networks and its Applications to Identity Testing, by Costis Daskalakis and Qinxuan Pan (arXiv). Another natural model to study structured high-dimensional distribution is that of Bayesian networks, that is directed graphs providing a succinct description of the dependencies between coordinates. This paper studies the closeness testing problem (testing if two unknown distributions are equal or far, with regard to the usual statistical distance) for Bayesian networks, parameterized by the dimension, alphabet size, and (promise on the) maximum in-degree of the unknown Bayes net. At its core is a new inequality relating the (squared) Hellinger distance of two Bayesian networks to the sum of the (squared) Hellinger distances between their marginals.

Testing Bayesian Networks, by Clément Canonne, Ilias Diakonikolas, Daniel Kane, and Alistair Stewart (arXiv). Tackling the testing of Bayesian networks as well, this second paper considers two of the most standard testing questions — identity (one-unknown) and closeness (two-unknown) testing — under various assumptions, in order to pinpoint the exact sample complexity in each case. Specifically the goal is to see when (and under which natural restrictions) does testing become easier than learning for Bayesian networks, focusing on the dimension and maximum in-degree as parameters.

* As usual, if we forgot one or you find imprecisions in our review of a paper, please let us now in the comments below.

News for September 2016

Leave a reply

September has just concluded, and with it the rise of Fall in property testing: no less than 5 papers* this month.

Removal Lemmas for Matrices, by Noga Alon and Omri Ben-Eliezer (arXiv). The graph removal lemma, and its many variants and generalizations, is a cornerstone in graph property testing (among others), and is a fundamental ingredient in many graph testing results. In its simplest form, it states that for any fixed $h$-vertex pattern $H$ and $\varepsilon > 0$, there is some constant $\delta=\delta(\varepsilon)$ such that any graph $G$ on $n$ vertices containing at most $\delta n^h$ copies of $H$ can be “fixed” (made $H$-free) by removing at most $\varepsilon n^2$ edges.
The authors introduce and establish several analogues of this result, in the context of matrices. These matrix removal lemmas have direct implications in property testing (discussed in Section 1.2); for instance, they imply constant-query testing of $F$-freeness of (binary) matrices, where $F$ is any family of binary matrices closed under row permutations.

Improving and extending the testing of distributions for shape-restricted properties, by Eldar Fischer, Oded Lachish, and Yadu Vasudev (arXiv). Testing membership to a class (family) is a very natural question in distribution testing, and one which has received significant attention lately. In this paper, the authors build on, improve, and generalize the techniques of Canonne, Diakonikolas, Gouleakis, and Rubinfeld (2016) to obtain a generic, one-size-fits-all algorithm for this question. They also consider the same question in the “conditional oracle” setting, for which their ideas yield significantly more sample-efficient algorithms. At the core of their approach is a better and simpler learning procedure for families of distributions that enjoy some decomposability property.

The Bradley―Terry condition is $L_1$–testable, by Agelos Georgakopoulos and Konstantinos Tyros (arXiv). An incursion in probability models: say that the directed weighted graph $((V,E),w)$ on $n$ vertices, with $w\colon E_n\to[0,1]$ such that $w(x,y)=w(y,x)$ for all $(x,y)\in V^2$, satisfies the Bradley—Terry condition if there exists an assignment of probabilities $(a_x)_{x\in V}$ such that

$\qquad\displaystyle \frac{w(x,y)}{w(y,x)} = \frac{a_x}{a_y}$

for all $(x,y)\in V^2$. This condition captures whether such a graph corresponds to a Bradley—Terry tournament.
This paper considers this characterization from a property testing point of view: that is, it addresses the task of testing, in the $L_1$-sense of Berman, Raskhodnikova, and Yaroslavtsev (2012), if a $((V,E),w)$ satisfies the Bradley—Terry condition (or is far from any such graph).

On SZK and PP, by Adam Bouland, Lijie Chen, Dhiraj Holden, Justin Thaler, and Prashant Nalini Vasudevan (arXiv, ECCC). Another (unexpected) connection! This work proves query and communication complexity separations between two complexity classes, $\textsf{NISZK}$ (non-interactive statistical zero knowledge) and $\textsf{UPP}$ (a generalization of $\textsf{PP}$ with unbounded coin flips, and thus arbitrary gap in the error probability). While I probably cannot even begin to list all that goes over my head in this paper, the authors show in Section 2.3.2 some implications of their results for property testing, for instance in distribution testing. In more detail, their results yield some lower bounds on the sample complexity of (tolerant) property testers with success probability arbitrarily close to $1/2$ (as opposed to the “usual” setting of $1/2+\Omega(1)$).

Quantum conditional query complexity, by Imdad S. B. Sardharwalla, Sergii Strelchuk, and Richard Jozsa (arXiv). Finally, our last paper of the list deals with both quantum algorithms and distribution testing, by introducing a new distribution testing model in a quantum setting, the quantum conditional oracle. This model, the quantum analogue of the conditional query oracle of Chakraborty et al. and Canonne et al., allows the (quantum) testing algorithm to get samples from the probability distribution to test, conditioning on chosen subsets of the domain.
In this setting, they obtain speedups over both (i) quantum “standard” oracle and (ii) “classical” conditional query testing for many distribution testing problems; and also consider applications to quantum testing of Boolean functions, namely balancedness.

* As usual, in case we missed or misrepresented any, please signal it in the comments.

Distribution Testing: a short non-survey

Leave a reply

The past years have witnessed a great deal of activity in the field of distribution testing— so much, in fact, that it has become a bit of a challenge to even keep track of what’s happening and what has already happened. While this blog post is not going to be a comprehensive (comprehensible, maybe?) survey or summary, it hopefully will at least shed some light on what distribution testing is — and where to look at if interested in it.

Obligatory XKCD comic.

Recall that in the “vanilla” property testing setting, one is provided with access (typically query access) to an object $\mathcal{O}$ from some universe $\Omega$ (typically the space of Boolean functions over $\{0,1\}^n$ or the set of $n$-vertex graphs), a distance measure $\textrm{d}(\cdot,\cdot)$ between such objects (usually the Hamming distance), a property $\mathcal{P}\subseteq \Omega$ (the subset of “good objects”) and a parameter $\varepsilon \in (0,1]$. The goal is to design a randomized algorithm that (i) accepts with high probability if $\mathcal{O}\in\mathcal{P}$; and (ii) rejects with high probability if $d(\mathcal{O},\mathcal{O}’)>\varepsilon$ for every single object $\mathcal{O}’\in\mathcal{P}$.

The distribution testing setting is very much like this one, with of course crucial twists. The object is now a probability distribution $D$ over a (discrete) set, typically $[n]=\{1,\dots,n\}$; the access model is usually a sample oracle, providing independent samples from $D$ on demand; the distance is the total variation distance between distributions (or, equivalently, the $\ell_1$ norm between the probability mass functions).

$\textrm{d}_{\rm TV}(D,D’) = \sup_{S\subseteq [n]} \left( D(S) – D'(S) \right) = \frac{1}{2} \sum_{i\in[n]} \left|D(i)-D'(i) \right|$

This brings in some quite important distinctions: for instance, randomness is now inherent: even taking a number of samples beyond any reasonable bound will not bring the probability of failure of the algorithm to zero, this time.

A Brief History

The field made its debut in TCS with a work of Goldreich and Ron [5], who considered the question of testing if an expander graph had mixed — that is, if the distribution induced on its nodes was uniform. However, it is only a tad later, in a work of Batu, Fortnow, Rubinfeld, Smith, and White [6] that the setting was formally defined and studied, for the specific problem of testing whether two unknown random variables have the same distribution (closeness, or equivalence testing).

This started a line of work, which analyzed the sample complexity of testing uniformity (is my random variable uniformly distributed?), identity (is it distributed according to this nice distribution I fully know in advance?), independence (are my random variables independent of each other?), monotonicity (is the probability density function increasing?), and many more. Over the course of the past 16 years, many breakthroughs, new techniques, new questions, and new answers to old problems were made and found. With surprising twists (if your domain size is n, then surely $\Theta(n)$ samples are enough and necessary to learn an arbitrary distribution; but who would have thought that $\Theta(n/\log n)$ was the right answer to anything? [7]). And game changers (general theorems that apply to many questions or blackbox lemmas are really nice to have in one’s toolbox).

And that’s barely the tip of the iceberg! For more in-depth or better written prose, the interested Internet wanderer may want to consult one of the following (usual non-exhaustivity caveats apply):

Ronitt Rubinfeld’s introduction [1]
Dana Ron’s survey on Property testing [2] (Section 11.4)
Oded Goldreich’s upcoming book [3] (Chapter 11)
or my own survey [4] on distribution testing (but I cannot promise better written prose for this one).

For the more video-inclined, there are also quite a few survey talks available online, the most recent I know of being this talk by Ronitt Rubinfeld at COLT’16.

A Brief Comparison

Note: this section is taken verbatim from my survey [4] (Section 1.2).

It is natural to wonder how the above approach to distribution testing compares to classic methods and formulations, as studied in Statistics. While the following will not be a thorough comparison, it may help shed some light on the difference.

Null and alternative hypotheses. The standard take on hypothesis testing, simple hypothesis testing, relies on defining two classes of distributions, the null and alternative hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$. A test statistic is then tailored specifically to these two classes, in order to optimally distinguish between $\mathcal{H}_0$ and $\mathcal{H}_1$— that is, under the underlying assumption that the unknown distribution $D\in\mathcal{H}_0\cup\mathcal{H}_1$.
Something like that.

The test then rejects the null hypothesis $\mathcal{H}_0$ if statistical evidence is obtained that $D\notin\mathcal{H}_0$. In this view, the distribution testing formulation would be to set $\mathcal{H}_0$ to be the property $\mathcal{P}$ to be tested, and define the alternative hypothesis as “everything far from $\mathcal{P}$.” In this sense, the latter captures a much more adversarial setting, where almost no structure is assumed on the alternative hypothesis— setting known in Statistics as composite hypothesis testing.
Small-sample regime. A second and fundamental difference resides in the emphasis given to the question. Traditionally, statisticians tend to focus on asymptotic analysis, characterizing— often exactly— the rate of convergence of the statistical tests under the alternative hypothesis, as the number of samples $m$ grows to infinity. Specifically, the goal is to pinpoint the error exponent $\rho$ such that the probability of error (failing to reject the null hypothesis) asymptotically decays as $e^{-\rho m}$. However, this asymptotic behavior will generally only hold for values of $m$ greater than the size of the domain (“alphabet”). In contrast, the computer science focus is on the small-sample regime, where the number of samples available is small with regard to the domain size, and one aims at achieving a fixed probability of error.
Algorithmic flavor. At a more practical level, a third point on which the two approaches deviate is the set of techniques used in order to tackle the question. Namely, the Statistics literature very often relies on relatively simple-looking and “natural” tests and estimators, which need not be computationally efficient. (This is for instance the case for the generalized likelihood ratio test that requires to compute the maximum likelihood of the sequence of samples obtained under the two hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$; which is not in general tractable.) On the other hand, works in distribution testing are predominantly algorithmic, with a computational emphasis on the testing algorithms thus obtained.

Exciting times!

And while the above has hinted at all that has been done so far in the area, there is still plenty ahead — many new questions to be posed, answered, revisited under another light, tackled with new insights and techniques, and connections to be made. As a small sample (sic), we list below a few of the recent developments or trends that herald these exciting times.

New tools

Appeared in the 1800’s thanks to Pearson, testers based on (some tailored variant of) a $\chi^2$-based statistic are now aplenty. Yielding simple, concise, very often optimal testing algorithms… [8], [9], [10], [11]…
Information theory. Techniques and results from this related area have started to permeate distribution testing— mostly to prove lower bounds in an easier or conceptually simpler way. [12], [13,14]
They were there from the beginning. Now, the $\ell_2$-testing subroutines strike back. (For neat, elegant testing algorithms that use $\ell_2$ as a proxy “in the right way.”) [5,6,15], [14]

New models

What if modeling the access to the data as a source of i.i.d. samples was not enough? Because we can— the situation allows it— or because we want— what we are trying to achieve is best modeled a bit differently? If we get more ways to query it, can we do more, and by how much?

This recent line of work [20,21,22,23,…] offers many challenges, ranging from the right way to ask (what to model, and how?) to the right way to answer (intuition if often off the hook, in these uncharted waters?). New tools to design, intuition to build, and problems to consider…

New questions

And of course, even if many of the “classic” problems have by now been fully understood, there are many more that await… from class testing [16,11,17,18] to twists on old questions (uneven sample size for closeness?) to new questions entirely [19], there is no shortage. Not to forget there is a stronger connection with practice to build, and asking the “real-worlders” what makes sense to consider is also unlikely to stop producing challenges.

[1] Ronitt Rubinfeld. Taming Big Probability Distributions. XRDS, 19(1):24-28, 2012. http://xrds.acm.org/article.cfm?aid=2331052
[2] Dana Ron. Algorithmic and Analysis Techniques in Property Testing. Foundations and Trends in Theoretical Computer Science: Vol. 5: No. 2, pp 73-205. 2010. http://dx.doi.org/10.1561/0400000029
[3] Oded Goldreich. Introduction to Property Testing. (In preparation) http://www.wisdom.weizmann.ac.il/~oded/pt-intro.html
[4] Clément Canonne. A Survey on Distribution Testing: Your Data is Big. But is it Blue? 2015. http://eccc.hpi-web.de/report/2015/063/
[5] Oded Goldreich and Dana Ron. On testing expansion in bounded-degree graphs. Electronic Colloquium on Computational Complexity (ECCC), 7:20, 2000.
[6] Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing that distributions are close. In Proceedings of FOCS, pages 189-197, 2000.
[7] Gregory Valiant and Paul Valiant. The power of linear estimators. In Proceedings of FOCS, pages 403-412, October 2011.
[8] Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant. Optimal algorithms for testing closeness of discrete distributions. In Proceedings of SODA, pages 1193-1203. Society for Industrial and Applied Mathematics (SIAM), 2014.
[9] Gregory Valiant and Paul Valiant. An automatic inequality prover and instance optimal identity testing. In Proceedings of FOCS, 2014.
[10] Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. Testing Identity of Structured Distributions. In Proceedings of SODA, pages 1841-1854. Society for Industrial and Applied Mathematics (SIAM), 2015.
[11] Jayadev Acharya, Constantinos Daskalakis, Gautam Kamath. Optimal Testing for Properties of Distributions. In Advances in Neural Information Processing Systems (NIPS), 2015.
[12] Yihong Wu and Pengkun Yang. Minimax rates of entropy estimation on large alphabets via best polynomial approximation. ArXiV, abs/1407.0381, 2014.
[13] Yihong Wu and Pengkun Yang. Chebyshev polynomials, moment matching, and optimal estimation of the unseen. ArXiV, abs/1504.01227, 2015.
[14] Ilias Diakonikolas and Daniel Kane. A New Approach for Testing Properties of Discrete Distributions. In Proceedings of FOCS, 2016. (To appear. Also available as abs/1601.05557)
[15] Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. Testing random variables for independence and identity. InProceedings of FOCS, pages 442-451, 2001.
[16] Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld. Sublinear algorithms for testing monotone and unimodal distributions. In Proceedings of STOC, pages 381-390, New York, NY, USA, 2004. ACM.
[17] Clément L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld. Testing Shape Restrictions of Discrete Distributions, In Proceedings of STACS, pages 25:1–25:14, 2016.
[18] MIT Theory Blog. Distribution Testing: Do It With Class! https://mittheory.wordpress.com/2015/08/09/distribution-testing-do-it-with-class/
[19] Maryam Aliakbarpour, Eric Blais, Ronitt Rubinfeld. Learning and Testing Junta Distributions. In Proceedings of COLT, 2016. pp. 19–46, 2016
[20] Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing properties of collections of distributions. Theory of Computing, 9:295-347, 2013.
[21] Sourav Chakraborty, Eldar Fischer, Yonatan Goldhirsh, and Arie Matsliah. On the power of conditional samples in distribution testing. In Proceedings of ITCS, pages 561-580, New York, NY, USA, 2013. ACM.
[22] Clément L. Canonne, Dana Ron, and Rocco A. Servedio. Testing probability distributions using conditional samples. SIAM Journal on Computing, 2015.
[23] Clément L. Canonne and Ronitt Rubinfeld. Testing probability distributions underlying aggregated data. In Proceedings of ICALP, pages 283-295, 2014.

News for May 2016

Leave a reply

Last month witnessed two new property testing papers go online, which May be of interest to the community.

Reducing testing affine spaces to testing linearity, by Oded Goldreich (ECCC). In his recent guest post, the author outlined a reduction between testing if a Boolean function is an $(n-k)$-dimensional affine subspace of $\{0,1\}^n$, and testing linearity of functions $f\colon\{0,1\}^n\to \{0,1\}$. This preprint encompasses details of this reduction, as part as a general approach to testing monomials (resolving along the way one of the open problems from April). Namely, it shows that testing if $f$ is a $k$-monomial can be reduced to testing two properties, of $f$ and of a (related) function $g\colon\{0,1\}^n\to \{0,1\}^k$. Moreover, it establishes that the general case ($k\geq 1$) of the second part can itself be reduced to the simpler case ($k=1$), giving a unified argument.

Testing Equality in Communication Graphs, by Noga Alon, Klim Efremenko, Benny Sudakov (ECCC). Given a connected graph on $k$ nodes, where each node has an $n$-bit string, how many bits of communication are needed to test if all strings are equal? This paper investigates this problem for many interesting graphs, resolving the complexity up to lower order terms. For example, if the graph is Hamiltonian, they show that $\frac{kn}{2} + o(n)$ bits are sufficient, while at least $\frac{kn}{2}$ bits are required.

Edit (06/16): updated the description of the first preprint, which was not entirely accurate.

Call for feedback: upcoming “Introduction to Property Testing”

4 Replies

Forwarding an announcement by Oded Goldreich:

I would like to seek the help of this community regarding my forthcoming book Introduction to Property Testing (see http://www.wisdom.weizmann.ac.il/~oded/pt-intro.html). Specifically, I would be grateful for feedback regarding any aspect and part of the current text, although detailed comments are likely to be more helpful at this time than suggestions for drastic and/or structural changes. Please do not shy from indicating relevant works that may be mentioned (rather than described in detail), since it is quite possible that I was not aware of them or forgot to mention them (rather than chose not to include a detailed description due to various considerations).

Since I am likely to continue revising the text and posting the revisions on the website, please indicate the text-locations to which you refer in a manner that is most robust to revision (e.g., indicate relative location with respect to sectioning and/or theorem-environment numbers rather than with respect to page numbers).

Open problem for April 2016

1 Reply

As open problems of the month, we state here the two questions discussed in our recent guest post by Oded Goldreich:

Open Problem 1 (obtaining a one-sided error reduction): The reduction from testing affinity of a subspace to that of testing affinity of a function has two-sided error probability. We wonder whether a one-sided error reduction of similar complexity can be found.

Note that this will yield a one-sided error tester for affinity, which we believe is not known. Actually, we would also welcome a two-sided error reduction that, when combined with the linearity tester, yields a tester of complexity $O(1/\epsilon)$ rather than $\tilde{O}(1/\epsilon)$.

Turning to the task of testing monomials, we recall that the solution of [PRS02] is based on employing self-correction to the following test that refers to the case that $H=\{x:AX=b\}$ is an $(\ell-k)$-dimensional affine space. Basically, the test selects a random $x\in H$ and a random $y\in\{0,1\}^\ell$, and checks whether it holds that $y\in H$ if and only if $x\wedge y \in H$, where $\wedge$ denotes the bit-by-bit product of vectors. It is painfully shown in [5] that if $H$ is not a translation by $1^\ell$ of an axis-aligned linear space, then the check fails with probability $\Omega(2^{-k})$. Furthermore, it is shown that for a constant fraction of the $x$’s (in $H$), the check fails on a $\Omega(2^{-k})$ fraction of the $y$’s (in $\{0,1\}^\ell$). This strengthening is important, since selecting $x\in H$ uniformly requires $\Theta(2^k)$ trials. Recall that proving the foregoing assertion for $k=1$ is quite easy (cf. [5]), which leads us to ask

Open Problem 2 (can a simpler proof be found for the case of $k>1$): Is there a relatively simple reduction of the foregoing claim for general $k$ to the special case of $k=1$?

[PRS02] M. Parnas, D. Ron, and A. Samorodnitsky. Testing Basic Boolean Formulae. SIAM Journal on Disc. Math. and Alg., Vol. 16 (1), pages 20–46, 2002.

Property Testing Review

The latest in property testing and sublinear time algorithms

Author Archives: Clement Canonne

FOCS 2017: Workshop on Distribution Testing

News for September 2017

News for June 2017

News for March 2017

News for December 2016

News for September 2016

Distribution Testing: a short non-survey

A Brief History

A Brief Comparison

Exciting times!

New tools

New models

New questions

News for May 2016

Call for feedback: upcoming “Introduction to Property Testing”

Open problem for April 2016