# News for May 2020

Last month saw activity across a diverse collection of topics in sublinear algorithms. In particular, we had the following ~~five~~ six papers. (Sorry, I missed one.)

One-Sided Error Testing of Monomials and Affine Subspaces by Oded Goldreich and Dana Ron (ECCC). This work focuses on one-sided testing of two kinds of problems (and their variants):
1. Testing Monomials: Suppose you are given a function $$f \colon \{0,1\}^n \to \{0,1\}$$. Is $$f = \wedge_{i \in I} x_i$$ for some $$I \subseteq [n]$$ (that is, is $$f$$ a monotone monomial)?
2. Testing Affine Subspaces: Consider the task of testing whether a function $$f \colon \mathcal{F}^n \to \{0,1\}$$ is the indicator of an $$(n-k)$$-dimensional affine subspace for some $$k$$ (where $$\mathcal{F}$$ is a finite field).

The paper shows that the general problem (the one in which the arity of the monomial, resp. the co-dimension of the subspace, is not specified) has one-sided query complexity $$\widetilde{O}(1/\varepsilon)$$. The same holds for testing whether the arity of the monomial is at most $$k$$ (resp. the co-dimension of the subspace is at most $$k$$). Finally, the exact problem, which seeks to test whether the arity of the monomial is exactly $$k$$ (resp. the co-dimension of the subspace is exactly $$k$$), has query complexity $$\Omega(\log n)$$. For the exact version, however, two-sided size-oblivious testers are known. Thus, as the authors remark, two-sided error is inherent in the case of the exact version of the problem.

Sampling Arbitrary Subgraphs Exactly Uniformly in Sublinear Time by Hendrik Fichtenberger, Mingze Gao, Pan Peng (arXiv). Readers of PT Review are no strangers to the problem of counting cliques in sublinear time (with a certain query model). Building on tools from [1], Eden-Ron-Seshadhri [2] gave the first algorithms for counting the number of copies of $$K_r$$ in a graph $$G$$ to within a $$(1 \pm \varepsilon)$$ multiplicative factor. En route to this result, they also gave a procedure to sample cliques incident to some special set $$S \subseteq V(G)$$. The query model in [2] allows the following queries: a u.a.r. vertex query, a degree query, an $$i^{th}$$ neighbor query, and a pair query which answers whether a pair $$(u,v)$$ forms an edge. The work under consideration shows a result which I personally find remarkable: given the additional ability to get a u.a.r. edge sample, for any graph $$H$$ we can obtain a uniformly random subgraph of $$G$$ isomorphic to $$H$$. Let that sink in: this work shows that you can sample copies of $$H$$ exactly uniformly from the graph $$G$$.
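
For concreteness, here is a minimal Python mock of this query model (the class name and the adjacency-list representation are my own; a real sublinear algorithm would of course not precompute the full edge list, this is only an interface sketch):

```python
import random

class GraphOracle:
    """Mock of the query model above: u.a.r. vertex, degree, i-th neighbor,
    and pair queries, plus the u.a.r. edge sample this paper adds."""

    def __init__(self, adj):  # adj: dict mapping vertex -> sorted list of neighbors
        self.adj = adj
        self.vertices = list(adj)
        self.edges = [(u, v) for u in adj for v in adj[u] if u < v]

    def random_vertex(self):
        return random.choice(self.vertices)

    def degree(self, v):
        return len(self.adj[v])

    def neighbor(self, v, i):  # i-th neighbor query
        return self.adj[v][i]

    def pair(self, u, v):  # does (u, v) form an edge?
        return v in self.adj[u]

    def random_edge(self):  # the additional query enabling exact uniform sampling
        return random.choice(self.edges)
```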

Finding Planted Cliques in Sublinear Time by Jay Mardia, Hilal Asi, Kabir Aladin Chandrasekher (arXiv). Planted Clique is a time-honored problem in average case complexity. This classic problem asks the following: you are given a graph $$G \sim \mathcal{G}(n, 1/2)$$. Suppose I select a subset of $$k$$ vertices in this graph and put a clique on the subgraph they induce. In principle it is possible to recover the clique I planted whenever $$k > (2 + \varepsilon) \log n$$. But it seems you get polynomial time algorithms only when $$k \geq \Omega(\sqrt n)$$, even after you throw SDPs at the problem. Moreover, so far, the algorithms which recover the planted $$k$$-clique were known to take $$\widetilde{O}(n^2)$$ time. This work shows that you actually get algorithms which take time $$\widetilde{O}(n^{3/2})$$ if $$k \geq \Omega(\sqrt{n \log n})$$. The key idea is to first obtain a “core” of the clique of size $$O(\log n)$$ in time $$\widetilde{O}(n^2/k)$$. This is followed up with a clique completion routine where you mark all vertices connected to the entire core as being potentially in the clique. The paper also shows a conditional lower bound: given query access to the adjacency matrix of the graph, a natural family of non-adaptive algorithms cannot recover a planted $$k$$-clique in time $$o\left(\left(\frac{n}{k}\right)^3\right)$$ (for $$k \geq \widetilde{\Omega}(\sqrt n)$$).
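
A toy sketch of the completion phase (assuming the core has already been found; `complete_clique` and the dictionary-of-sets graph representation are my own illustration, not the paper's code):

```python
def complete_clique(adj, core):
    """Given a small 'core' of planted-clique vertices, mark every vertex
    adjacent to the entire core. A vertex outside the clique survives with
    probability about 2^-|core|, so almost only clique vertices remain."""
    return set(core) | {v for v in adj if all(u in adj[v] for u in core)}
```

With a core of size $$c$$, the expected number of false positives is roughly $$n/2^c$$, which is why a core of logarithmic size suffices.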

A robust multi-dimensional sparse Fourier transform in the continuous setting by Yaonan Jin, Daogao Liu and Zhao Song (arXiv). Suppose you are given an unknown signal whose Fourier spectrum is $$k$$-sparse (that is, there are at most $$k$$ dominant Fourier coefficients and all the others are zero or close to zero). Significant research effort has been devoted to learning such signals, leading to works which study this problem in the multi-dimensional discrete setting and in the one-dimensional continuous setting. The $$d$$-dimensional continuous case $$(d = \Theta(1))$$ was largely unexplored. This work makes progress on this frontier by making some natural assumptions on the unknown signal. In particular, the paper assumes that the frequencies (which are vectors $$f_i \in \mathbb{R}^d$$) are well separated, satisfying $$\|f_i - f_j\|_2 \geq \eta$$ for $$i \neq j$$, and that all of $$\{f_i\}_{i \in [k]} \subseteq [-F, F]^d$$ sit inside a bounded box.
The authors assume sample access to the signal, in the sense that the algorithm can sample the signal’s value at any desired timestep $$\tau$$. With this setup, the authors show that all the dominant frequencies can be recovered with $$O_d(k \cdot \log(F/\eta))$$ samples by considering a relatively small time horizon.
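
To build intuition, here is the much easier discrete, on-grid analogue of the recovery task (the continuous, off-grid setting is precisely what makes the paper's problem hard; this toy helper is mine):

```python
import numpy as np

def topk_frequencies(samples, k):
    """On-grid toy version of sparse frequency recovery: FFT the equispaced
    samples and report the k indices with the largest magnitudes."""
    coeffs = np.fft.fft(samples)
    return set(np.argsort(np.abs(coeffs))[-k:].tolist())
```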

Extrapolating the profile of a finite population by Soham Jana, Yury Polyanskiy, Yihong Wu (arXiv). Consider the following setup. You are given a universe of balls, which come in up to $$k$$ different colors; say there are $$\theta_j$$ balls of color $$j$$ for each $$j \in [k]$$. One of the fundamental problems in statistics considers taking a sample of $$m$$ balls from the universe and attempting to estimate the “population profile” (that is, the number of balls of each color). Historically, it is known that unless an overwhelming majority of the universe has been seen, one cannot estimate the empirical distribution of colors. This paper shows that in the sublinear regime, with $$m = \omega(k/\log k)$$, it is possible to consistently estimate the population profile in total variation distance. And once you have a handle on the empirical distribution of the population, you can go ahead and learn lots of interesting label-invariant properties of your universe (things like entropy, number of distinct elements, etc.).

Testing Positive Semi-Definiteness via Random Submatrices by Ainesh Bakshi, Nadiia Chepurko, Rajesh Jayaram (arXiv). Suppose I give you a PSD matrix $$A \in \mathbb{R}^{n \times n}$$. You know that all of its principal submatrices are also PSD. What if $$A$$ were $$\varepsilon$$-far from the PSD cone (in a sense I will define soon)? What can you say about the eigenvalues of principal submatrices of $$A$$ now? In this paper, the authors tackle precisely this question. The paper defines a matrix $$A$$ to be $$\varepsilon$$-far in $$\ell_2^2$$ distance from the PSD cone if $$\min_{B \succeq 0, B \in \mathbb{R}^{n \times n}}\|A - B\|_F^2 \geq \varepsilon n^2$$. You are allowed to randomly sample a bunch of principal submatrices (of order roughly $$O(1/\varepsilon)$$ by $$O(1/\varepsilon)$$) and check whether they are PSD. Armed with this setup, the paper gives a non-adaptive one-sided tester for this problem which makes $$\widetilde{O}(1/\varepsilon^4)$$ queries. The paper also supplements this result with a lower bound of $$\widetilde{\Omega}(1/\varepsilon^2)$$ queries.
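
The shape of such a tester is easy to sketch (my own simplified version, with hypothetical parameter choices; the paper's submatrix sizes and eigenvalue thresholds are more carefully tuned):

```python
import numpy as np

def psd_submatrix_tester(A, eps, trials=10, rng=None):
    """One-sided tester sketch: sample random principal submatrices of side
    ~O(1/eps) and reject iff one has a negative eigenvalue.
    A genuinely PSD input is never rejected (one-sided error)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = A.shape[0]
    side = min(n, max(2, int(np.ceil(1.0 / eps))))
    for _ in range(trials):
        S = rng.choice(n, size=side, replace=False)
        if np.linalg.eigvalsh(A[np.ix_(S, S)]).min() < -1e-9:
            return False  # witness: a non-PSD principal submatrix
    return True
```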

If I missed something, please let me know. This is my first post on PT Review and I might have botched up a few things.

References

[1] Talya Eden, Amit Levi, Dana Ron and C. Seshadhri. Approximately Counting Triangles in Sublinear Time. 56th Annual Symposium on Foundations of Computer Science, 2015.

[2] Talya Eden, Dana Ron and C. Seshadhri. On approximating the number of k-cliques in sublinear time. Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, 2018.

# News for April 2020

April is now behind us, and we hope you and your families are all staying safe and healthy. We saw ~~six~~ seven property testing papers appear online last month, so at least there is some reading ahead of us! A mixture of privacy, quantum, high-dimensional distributions, and juntas (juntæ?). A lot of distribution testing, overall.

Connecting Robust Shuffle Privacy and Pan-Privacy, by Victor Balcer, Albert Cheu, Matthew Joseph, and Jieming Mao (arXiv). This paper considers a recent notion of differential privacy called shuffle privacy, where users have sensitive data, a central untrusted server wants to do something with that data (for instance, say… testing its distribution), and a trusted middle-man/entity shuffles the users’ messages u.a.r. to bring in a bit more anonymity. As it turns out, testing uniformity (or identity) of distributions in the shuffle privacy model is (i) much harder than without privacy constraints; (ii) much harder than with ‘usual’ (weaker) differential privacy; (iii) much easier than with local privacy; and (iv) related to the sample complexity under another privacy notion, pan-privacy. It’s a brand new, exciting world out there!

(Note: for the reader interested in keeping track of identity/uniformity testing of probability distributions under various privacy models, I wrote a very short summary of the current results here.)

Entanglement is Necessary for Optimal Quantum Property Testing, by Sebastien Bubeck, Sitan Chen, and Jerry Li (arXiv). The analogue of uniformity testing, in the quantum world, is testing whether a quantum state is equal to (or far from) the maximally mixed state. It’s known that this task has “quantum sample complexity” (number of measurements) $$\Theta(d/\varepsilon^2)$$ (i.e., square-root dependence on the dimension $$d^2$$ of the state, viewed as a density matrix). But this requires entangled measurements, which may be tricky to get (or, in my case, understand): what happens if the measurements can be adaptive, but not entangled? In this work, the authors show that, under this weaker access model, $$\Omega(d^{4/3}/\varepsilon^2)$$ measurements are necessary: adaptivity alone won’t cut it. It may still help, though: with neither entanglement nor adaptivity, the authors also show an $$\Omega(d^{3/2}/\varepsilon^2)$$ lower bound on the number of measurements.

Testing Data Binnings, by Clément Canonne and Karl Wimmer (ECCC). More identity testing! Not private and not quantum for this one, but… not quite identity testing either. To paraphrase the abstract: “this paper introduces (and gives near-matching bounds for) the related question of identity up to binning, where the reference distribution $$q$$ is over $$k \ll n$$ elements: the question is then whether there exists a suitable binning of the domain $$[n]$$ into $$k$$ intervals such that, once binned, $$p$$ is equal to $$q$$.”
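
Ignoring the sampling aspect entirely, the existence question for a known pair $$(p, q)$$ reduces to a greedy scan of prefix sums, since contiguous intervals force the bin boundaries. A sketch (my own helper, assuming exact access to $$p$$; the paper, of course, only gets samples from $$p$$):

```python
def binning_exists(p, q, tol=1e-9):
    """Greedy check: can the domain of p be split into len(q) consecutive
    intervals so that the binned masses of p match q? Because intervals are
    contiguous, each boundary is forced by prefix sums, so greedy is exact."""
    j, acc = 0, 0.0
    for mass in p:
        if j == len(q):
            if mass > tol:
                return False  # leftover mass beyond the last bin
            continue
        acc += mass
        if acc > q[j] + tol:
            return False      # no interval boundary can realize q[j]
        if abs(acc - q[j]) <= tol:
            j, acc = j + 1, 0.0  # close bin j, start the next one
    return j == len(q)
```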

Hardness of Identity Testing for Restricted Boltzmann Machines and Potts models, by Antonio Blanca, Zongchen Chen, Daniel Štefankovič, and Eric Vigoda (arXiv). Back to identity testing of distributions, but for high-dimensional structured ones this time. Specifically, this paper focuses on the undirected graphical models known as restricted Boltzmann machines, and provides efficient algorithms for identity testing as well as conditional hardness lower bounds, depending on the type of correlations allowed in the graphical models.

Robust testing of low-dimensional functions, by Anindya De, Elchanan Mossel, and Joe Neeman (arXiv). Junta testing is a classical, central problem in property testing, with motivations and applications in machine learning and complexity. The related (and equally well-motivated) question of junta testing of functions on $$\mathbb{R}^d$$ (instead of the Boolean hypercube) was recently studied by the same authors; and the related (and, again, equally well-motivated) question of tolerant junta testing on the Boolean hypercube was also recently studied (among other works) by the same authors. Well, this paper does it all, and tackles the challenging (and, for a change, equally well-motivated!) question of tolerant testing of juntas on $$\mathbb{R}^d$$.

Differentially Private Assouad, Fano, and Le Cam, by Jayadev Acharya, Ziteng Sun, and Huanyu Zhang (arXiv). Back to probability distributions and privacy. This paper provides differentially private analogues of the classical eponymous statistical inference results (Assouad’s lemma, Fano’s inequality, and Le Cam’s method). In particular, it gives ready-to-use, blackbox tools to prove testing and learning lower bounds for distributions in the differentially private setting, and shows how to use them to easily derive, and rederive, several lower bounds.

Edit: We missed one!

Learning and Testing Junta Distributions with Subcube Conditioning, by Xi Chen, Rajesh Jayaram, Amit Levi, Erik Waingarten (arXiv). This paper focuses on the subcube conditioning model of (high-dimensional) distribution testing, where the algorithm can fix some variables to values of its choosing and get samples conditioned on those variables. Extending and refining techniques from a previous work by a (sub+super)set of the authors, the paper shows how to optimally learn and test junta distributions in this framework—with exponential savings with respect to the usual i.i.d. sampling model.

# News for March 2020

I hope all of you are keeping safe and healthy in these difficult times. Thank heavens we have our research and our math to keep our sanity…

This month has seen two papers, one on testing variable partitions and one on distributed isomorphism testing.

Learning and Testing Variable Partitions by Andrej Bogdanov and Baoxiang Wang (arXiv). Consider a function $$f:\Sigma^n \to G$$, where $$G$$ is an Abelian group. Let $$V$$ denote the set of variables. The function $$f$$ is $$k$$-separable if there is a partition $$V_1, V_2, \ldots, V_k$$ of $$V$$ such that $$f(V)$$ can be expressed as the sum $$f_1(V_1) + f_2(V_2) + \ldots + f_k(V_k)$$. This is an obviously natural property to study, though the specific application mentioned in the paper is high-dimensional reinforcement learning control. There are a number of learning results, but we’ll focus on the main testing result. The property of $$k$$-separability can be tested with $$O(kn^3/\varepsilon)$$ queries, for $$\Sigma = \mathbb{Z}_q$$ (and the distance between functions is the usual Hamming distance). There is an analogous result (with a different query complexity) for $$\Sigma = \mathbb{R}$$. It is also shown that testing 2-separability requires $$\Omega(n)$$ queries, even with two-sided error. The paper, to its credit, also has empirical studies of the learning algorithm with applications to reinforcement learning.
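
To see what separability buys you concretely: for a *fixed* partition $$(I, \bar I)$$, the function $$f$$ splits as $$f_1(x_I) + f_2(x_{\bar I})$$ exactly when mixing coordinates across two inputs never changes the sum $$f(x) + f(y)$$. Here is a randomized one-sided check of that identity over $$\mathbb{Z}_q$$ (my own sketch for a given partition; the paper's harder task is testing whether *some* $$k$$-way partition works):

```python
import random

def check_separable(f, n, idx_set, trials=200, q=5):
    """One-sided randomized check: does f(x) split as f1(x_I) + f2(x_J) over
    Z_q, where I = idx_set and J is its complement? Separable functions
    always pass; a violation is a certificate of non-separability."""
    I = set(idx_set)
    for _ in range(trials):
        x = [random.randrange(q) for _ in range(n)]
        y = [random.randrange(q) for _ in range(n)]
        # mix: I-coordinates from x with the rest from y, and vice versa
        xy = [x[i] if i in I else y[i] for i in range(n)]
        yx = [y[i] if i in I else x[i] for i in range(n)]
        if (f(x) + f(y) - f(xy) - f(yx)) % q != 0:
            return False
    return True
```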

Distributed Testing of Graph Isomorphism in the CONGEST model by Reut Levi and Moti Medina (arXiv). This result follows a recent line of research in distributed property testing algorithms. The main aim is to minimize the number of rounds of (synchronous) communication for a property testing problem. Let $$G_U$$ denote the graph representing the distributed network. The aim is to test whether an input graph $$G_K$$ is isomorphic to $$G_U$$. The main property testing results are as follows. For the dense graph case, isomorphism can be property tested (with two-sided error) in $$O(D + (\varepsilon^{-1}\log n)^2)$$ rounds, where $$D$$ is the diameter of the graph and $$n$$ is the number of nodes. (And, as a reader of this blog, you probably know what $$\varepsilon$$ is already…). There is a standard $$\Omega(D)$$ lower bound for distributed testing problems. For various classes of sparse graphs (like bounded-degree minor-free classes), constant-query (standard) property testers for isomorphism are known. This paper provides a simulation argument showing that standard/centralized $$q$$-query property testers can be implemented in the distributed model, in $$O(Dq)$$ rounds (this holds for any property, not just isomorphism). Thus, these simulations imply $$O(D)$$-round property testers for isomorphism for bounded-degree minor-free classes.

# News for February 2020

Despite a wealth of February conference deadlines, papers were fairly sparse. We found two EDIT: three EDIT EDIT: four papers for the month of February, please share if we missed any relevant papers.

Monotone probability distributions over the Boolean cube can be learned with sublinear samples, by Ronitt Rubinfeld and Arsen Vasilyan (arXiv). By now, it is well known that assuming an (unknown) distribution enjoys some sort of structure can lead to more efficient algorithms for learning and testing. Often one proves that the structure permits a convenient representation, and exploits this representation to solve the problem at hand. This paper studies the learning of monotone distributions over the Boolean hypercube. The authors exploit and extend a structural statement about monotone Boolean functions by Blais, Håstad, Servedio, and Tan, using it to provide sublinear algorithms for estimating the support size, distance to uniformity, and the distribution itself.

Locally Private Hypothesis Selection, by Sivakanth Gopi, Gautam Kamath, Janardhan Kulkarni, Aleksandar Nikolov, Zhiwei Steven Wu, and Huanyu Zhang (arXiv). Given a collection of $$k$$ distributions and a set of samples from one of them, can we identify which distribution it is? This paper studies this problem (and an agnostic generalization of it) under the constraint of local differential privacy. The authors show that this problem requires $$\Omega(k)$$ samples, in contrast to the $$O(\log k)$$ complexity in the non-private model. Furthermore, they give $$\tilde O(k)$$-sample upper bounds in various interactivity models.

Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning, by Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, and N. V. Vinodchandran (arXiv). Given samples from two distributions, can you estimate the total variation distance between them? This paper gives a framework for solving this problem for structured distribution classes, including Ising models, Bayesian networks, Gaussians, and causal models. The approach can be decomposed into properly learning the distributions, followed by estimating the distance between the two hypotheses. Challenges arise when densities are hard to compute exactly.

Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Discrete Distributions, by Yi Hao and Alon Orlitsky (arXiv). The histogram of a dataset is the collection of frequency counts of domain elements. The profile of a dataset can be succinctly described as the histogram of the histogram. Recent works have shown that, in some sense, discarding information about your dataset by looking solely at the profile can be beneficial for certain problems in which it is “universal”. This work explores two new quantities, the entropy and dimension of the profile, which turn out to play a key role in quantifying the performance of estimators based on the profile.
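
Since the "histogram of the histogram" phrasing can take a moment to parse, here is the computation spelled out (a sketch of the standard definition, not code from the paper):

```python
from collections import Counter

def profile(sample):
    """Profile of a dataset: the histogram of its histogram. Maps each
    multiplicity i to the number of distinct elements appearing i times,
    discarding the elements' identities (label invariance)."""
    hist = Counter(sample)               # element -> frequency
    return dict(Counter(hist.values()))  # frequency -> number of elements
```

For instance, `profile("abracadabra")` records that one symbol appears 5 times, two appear twice, and two appear once.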

# News for January 2020

The first month of 2020 is behind us, and it’s already looking good! Four papers last month, spanning quantum, graphs, and probability distributions.

On Efficient Distance Approximation for Graph Properties, by Nimrod Fiat and Dana Ron (arXiv). In the dense graph (i.e., adjacency-matrix) model, one is given a distance parameter $$\varepsilon$$ and granted query access to the adjacency matrix of a graph $$G$$, and seeks to determine something about the distance of $$G$$ to a prespecified property $$\mathcal{P}$$ of interest (i.e., the fraction of the matrix that needs to be changed for $$G$$ to satisfy the property). Testing requires distinguishing whether that distance is zero, or at least $$\varepsilon$$; many results over the past years have shown that many properties can be tested with a number of queries depending only on $$\varepsilon$$ (and not on $$n=|G|$$). This work focuses on the harder problem of distance estimation, or, equivalently, tolerant testing: that is, estimating this distance up to $$\pm\varepsilon$$. The authors introduce a general framework to get distance approximation algorithms for a “complicated” property by decomposing it into simpler properties and using algorithms for those. Applying this framework to a few flagship properties, they then show that $$P_3$$-freeness, induced $$P_4$$-freeness, and cordiality are properties which have efficient distance estimation algorithms (independent of $$n$$, and polynomial in $$1/\varepsilon$$).

Minimax Optimal Conditional Independence Testing, by Matey Neykov, Sivaraman Balakrishnan, and Larry Wasserman (arXiv). Given i.i.d. draws from a triple $$(X,Y,Z)$$, how hard is it to check whether $$X \perp Y \mid Z$$, that is, “$$X$$ and $$Y$$ are independent conditioned on $$Z$$” (or far from it)? This is the problem of conditional independence testing, which was covered back in the day for the case where $$X,Y,Z$$ are discrete. Well, this new work takes the fight out of the discrete world: extending the results and techniques from the discrete case, it provides optimal bounds for the continuous case: first where $$Z$$ is on $$[0,1]$$, and then where all three of $$X,Y,Z$$ are continuous.

How symmetric is too symmetric for large quantum speedups?, by Shalev Ben-David and Supartha Podder (arXiv); and Can graph properties have exponential quantum speedup?, by Andrew M. Childs and Daochen Wang (arXiv). Both these works independently study the relation between the (bounded-error) randomized and quantum query complexities of any graph property $$f$$, in the dense graph (adjacency-matrix) model. In particular, how much advantage do quantum algorithms provide for those?
As it turns out, not so much: for those functions, both papers show the two quantities are always polynomially related ($$Q(f) \leq R(f) \leq O(Q(f)^6)$$) in the dense-graph model. As a corollary, this implies that testing any such property won’t benefit much from quantum (that is, at most polynomially)… at least in this model. In the adjacency list model (also known as the bounded-degree graph model), the first paper conjectures that exponential query complexity improvements are possible, and the second paper provides an example, establishing it. Altogether, this fully settles an open problem of Ambainis, Childs, and Liu, and of Montanaro and de Wolf.

# News for December 2019

Happy new year! And now for the last post of 2019 papers. We have found a diverse collection of six papers, ranging from classic property testing topics to new perspectives on sublinear computation.

Sublinear Optimal Policy Value Estimation in Contextual Bandits by Weihao Kong, Gregory Valiant, Emma Brunskill (arXiv). This isn’t our usual sublinear paper, but it is definitely of interest to us sublinear folk. Let’s start with a stripped-down definition of the problem (or rather, game). There are $$K$$ “arms”, where the $$i$$th arm is represented by an unknown vector $$\beta_i \in \mathbb{R}^d$$. We are presented with a “context”, which is a vector $$x \in \mathbb{R}^d$$. Our job is to choose an arm $$i \in [K]$$. We get the reward $$x \cdot \beta_i$$ (with some noise added). The contexts appear from a known distribution. To aid us, we observe the rewards of $$N$$ i.i.d. contexts per arm, so we observe a total of $$T = KN$$ rewards. There has been much work on figuring out the minimum value of $$T$$ required to learn the optimal policy. One requires at least $$d$$ (the dimension) samples to estimate any of the arm vectors. This paper shows that one can actually estimate the expected reward of the optimal policy, without being able to describe it, with sublinear in $$d$$ (technically, $$\widetilde{O}(\sqrt{d})$$) samples. We see this a lot in property testing, where producing the “optimal” solution for a problem requires linear-in-dimension samples, but estimating the optimal value is much cheaper (consider, for example, the situation of linearity testing, where we wish to find the closest linear function).
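
The setup can be pinned down in a few lines (function names mine; this is only the observation model and the target quantity, not the paper's $$\widetilde{O}(\sqrt{d})$$-sample estimator, which is far subtler):

```python
import random

def sample_reward(beta, x, noise=0.1):
    """Observation model: playing arm beta on context x pays <x, beta>
    plus Gaussian noise."""
    return sum(b * xi for b, xi in zip(beta, x)) + random.gauss(0.0, noise)

def optimal_policy_value(betas, contexts):
    """The quantity being estimated: expected reward of the policy that
    plays the best arm for each context (averaged over a finite sample
    of contexts standing in for the context distribution)."""
    return sum(max(sum(b * xi for b, xi in zip(beta, x)) for beta in betas)
               for x in contexts) / len(contexts)
```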

Sublinear Time Numerical Linear Algebra for Structured Matrices by Xiaofei Shi and David P. Woodruff (arXiv). This follows the recent line of advances in sublinear time linear algebra. Given a matrix $$A \in \mathbb{R}^{n \times d}$$, the aim is to get algorithms that only look at $$o(nnz(A))$$ entries (where $$nnz(A)$$ is the number of non-zeroes, or the support size). Consider the classic task of low-rank approximation. Unfortunately, suppose one entry is extremely large, and the others are extremely small. One has to find this large entry for any reasonable approximation, which (in the worst case) requires $$nnz(A)$$ queries into $$A$$. Thus, previous papers make structural assumptions (such as $$A$$ being a Vandermonde matrix) to get sublinear bounds. This paper gives a clean black-box method to get a variety of such results. Basically, one can replace the usual $$nnz(A)$$ term in many algorithms by $$T(A)$$, the time to compute the matrix-vector product $$Ay$$ for $$y \in \mathbb{R}^d$$. In many cases $$T(A) = \widetilde{O}(n)$$, which can be significantly smaller than $$nnz(A)$$. This paper gives such results for low-rank approximation and many regression problems.
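
The point about $$T(A)$$ is easy to see with a concrete structured family. Taking low-rank factored matrices as a stand-in for "structured" (my own example, not one from the paper):

```python
import numpy as np

def fast_matvec(U, V, y):
    """A @ y for the structured matrix A = U @ V.T, computed in
    O((n + d) k) time for rank k, without ever materializing the
    nnz(A) ~ n*d entries of A itself."""
    return U @ (V.T @ y)
```

Here $$T(A) = O((n+d)k)$$ even though $$A$$ itself may be fully dense.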

Robust and Sample Optimal Algorithms for PSD Low-Rank Approximation by Ainesh Bakshi, Nadiia Chepurko, and David P. Woodruff. Consider the low-rank approximation problem discussed above. As mentioned in the previous paragraph, we need structural assumptions on $$A$$. Previous results gave sublinear time low-rank approximations assuming that $$A$$ is positive semidefinite (PSD). The aim is to get a rank-$$k$$ matrix $$B$$ such that $$\|A-B\|^2_2$$ is at most $$(1+\epsilon)$$ times the optimal such approximation. The previous algorithm of Musco-Woodruff makes $$\widetilde{O}(nk/\epsilon^{2.5})$$ queries into $$A$$, while there is a lower bound of $$\Omega(nk/\epsilon)$$. This gap between the complexities is resolved in this paper with an upper bound of $$\widetilde{O}(nk/\epsilon)$$ queries.

Constructive derandomization of query algorithms by Guy Blanc, Jane Lange, and Li-Yang Tan (arXiv). This paper discusses an intriguing angle on sublinear algorithms: when can they be derandomized? Abstractly, consider a randomized algorithm $$R$$ that makes $$q$$ queries. Think of $$R$$ as a function $$R(x,r)$$, where $$x$$ is the input and $$r$$ is the randomness. We would like to design a deterministic algorithm $$D$$ making, ideally, close to $$q$$ queries, that approximates $$\mathbb{E}_r[R(x,r)]$$. For starters, consider some distribution over $$x$$, and suppose we want $$\mathbb{E}_x\big[|D(x) - \mathbb{E}_r[R(x,r)]|\big] < \epsilon$$. By (the easy direction of) Yao’s minimax lemma, one can show the existence of such an algorithm $$D$$ that makes $$O(q/\epsilon)$$ queries. But how does one explicitly construct it? Indeed, the first result of this paper gives a “meta-algorithm” that takes as input the description of $$R$$ (which is of size $$N$$), has running time $$\mathrm{poly}(N) \cdot 2^{O(q/\epsilon)}$$, and outputs a description of $$D$$. When $$R$$ satisfies the stronger property of “bounded error”, one can get a $$O(q^3)$$-query algorithm $$D$$ that approximates $$\mathbb{E}_r[R(x,r)]$$ for all $$x$$ (again, the existence is proven by a classic theorem of Nisan). Overall, this paper gives a method to derandomize sublinear time algorithms, and I wonder if there could be some applications of this method for proving lower bounds. After all, Yao’s minimax theorem is the tool for property testing lower bounds, and any new perspective on Yao’s theorem is likely of relevance.
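
The shape of the deterministic object is simply "average $$R$$ over a fixed multiset of hard-wired seeds" (a sketch; Yao's argument only guarantees a good seed set exists, and constructing one explicitly is exactly this paper's contribution):

```python
def derandomize(R, seeds):
    """Hard-wire a fixed multiset of seeds into the randomized algorithm R
    and average over them. The resulting D is deterministic; the whole
    question is how to find a small seed set on which the average is
    close to E_r[R(x, r)]."""
    def D(x):
        return sum(R(x, r) for r in seeds) / len(seeds)
    return D
```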

Testing Membership for Timed Automata by Richard Lassaigne and Michel de Rougemont (arXiv). Property testing for regular languages is a classic result in sublinear algorithms. This paper focuses on the more complex notion of timed automata. The technical definition is quite complicated, but here’s an overview. There is a finite automaton and a collection of “clocks”. Imagine a string being processed, where each alphabet symbol appears with a new timestamp. Thus, the input word is called a “timed word”. The transitions of the automaton involve the new symbol read, as well as constraints involving the clock times and the timestamp. Thus, we can enforce conditions like “only transition if another symbol is read within a single time unit”. In general, deciding whether a timed word is accepted by a timed automaton is NP-complete. This paper studies the property testing viewpoint. The paper gives a new definition of “timed edit distance” between timed words. The main result shows that one can distinguish timed words accepted by a timed automaton from timed words that are far (according to timed edit distance), by querying a constant number of word positions.

On the query complexity of estimating the distance to hereditary graph properties by Carlos Hoppen, Yoshiharu Kohayakawa, Richard Lang, Hanno Lefmann, and Henrique Stagni (arXiv). This paper concerns the classic setting of property testing of dense graphs. It is well-known that all hereditary graph properties are testable, and moreover, that one can estimate the distance to the property in time that only depends on $$\varepsilon$$. Unfortunately, the query complexities have large tower dependencies on $$\varepsilon$$, arising from the use of the Szemeredi regularity lemma. The question of property testing in dense graphs can be reduced to finding “removal” lemmas (such as the classic triangle removal lemma). Such a lemma states that if at least $$\varepsilon n^2$$ edges need to be removed from $$G$$ to destroy all “forbidden subgraphs”, then there must be “many” forbidden subgraphs in $$G$$. There is much recent research on finding families of forbidden subgraphs for which the “many” (in the above statement) is at least $$poly(\varepsilon)$$ times the trivial upper bound. This paper shows that one can estimate the distance to any hereditary property with a query complexity that depends directly on the corresponding removal lemma parameters. As a compelling example, one can estimate the distance to being chordal with $$\exp(1/\varepsilon)$$ queries, a significant improvement over standard tower bounds.

# News for November 2019

We hit the mother-lode of property testing papers this month. Stick with us, as we cover 10 (!) papers that appeared online in November.

EDIT: We actually have 11 papers, check out Optimal Adaptive Detection of Monotone Patterns at the bottom.

Testing noisy linear functions for sparsity, by Xue Chen, Anindya De, and Rocco A. Servedio (arXiv). Given samples from a noisy linear model $$y = w\cdot x + \mathrm{noise}$$, test whether $$w$$ is $$k$$-sparse, or far from being $$k$$-sparse. This is a property testing version of the celebrated sparse recovery problem, whose sample complexity is well-known to be $$O(k\log n)$$, where the data lies in $$\mathbb{R}^n$$. This paper shows that the testing version of the problem can be solved (tolerantly) with a number of samples independent of $$n$$, assuming technical conditions: the coordinates of $$x$$ are i.i.d. and non-Gaussian, and the noise distribution is known to the algorithm. Surprisingly, all these conditions are needed; otherwise, the dependence on $$n$$ is $$\tilde \Omega(\log n)$$, essentially the same as for the recovery problem.

Pan-Private Uniformity Testing, by Kareem Amin, Matthew Joseph, Jieming Mao (arXiv). Differentially private distribution testing has now seen significant study, in both the local and central models of privacy. This paper studies distribution testing in the pan-private model, which is intermediate: the algorithm receives samples one by one in the clear, but it must maintain a differentially private internal state at all time steps. The sample complexity turns out to be qualitatively intermediate to the two other models: testing uniformity over $$[k]$$ requires $$\Theta(\sqrt{k})$$ samples in the central model, $$\Theta(k)$$ samples in the local model, and this paper shows that $$\Theta(k^{2/3})$$ samples are necessary and sufficient in the pan-private model.

Almost Optimal Testers for Concise Representations, by Nader Bshouty (ECCC). This work gives a unified approach for testing for a plethora of different classes which possess some sort of sparsity. These classes include $$k$$-juntas, $$k$$-linear functions, $$k$$-terms, various types of DNFs, decision lists, functions with bounded Fourier degree, and much more.

Unified Sample-Optimal Property Estimation in Near-Linear Time, by Yi Hao and Alon Orlitsky (arXiv). This paper presents a unified approach for estimating several distribution properties with both near-optimal time and sample complexity, based on piecewise-polynomial approximation. Some applications include estimators for Shannon entropy, power sums, distance to uniformity, normalized support size, and normalized support coverage. More generally, results hold for all Lipschitz properties, and consequences include high-confidence property estimation (outperforming the “median trick”) and differentially private property estimation.

Testing linear-invariant properties, by Jonathan Tidor and Yufei Zhao (arXiv). This paper studies property testing of functions which are, in a formal sense, definable by restrictions to subspaces of bounded degree. This class of properties is a broad generalization of testing whether a function is linear, or a degree-$$d$$ polynomial (for constant $$d$$). The algorithm is the oblivious one, which simply takes random restrictions repeatedly and tests whether the property is satisfied (similar to the classic linearity test of BLR, along with many others).

Approximating the Distance to Monotonicity of Boolean Functions, by Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, Erik Waingarten (ECCC). This paper studies the following fundamental question in tolerant testing: given a Boolean function on the hypercube, test whether it is $$\varepsilon’$$-close or $$\varepsilon$$-far from monotone. It is shown that there is a non-adaptive polynomial-query algorithm which can solve this problem for $$\varepsilon’ = \varepsilon/\tilde \Theta(\sqrt{n})$$, implying an algorithm which can approximate the distance to monotonicity up to a multiplicative $$\tilde O(\sqrt{n})$$ factor (addressing an open problem by Sesh). They also give a lower bound demonstrating that significantly improving this approximation factor would necessitate exponentially many queries. Interestingly, this is proved for the (easier) erasure-resilient model, and also implies lower bounds for tolerant testing of unateness and juntas.

Testing Properties of Multiple Distributions with Few Samples, by Maryam Aliakbarpour and Sandeep Silwal (arXiv). This paper introduces a new model for distribution testing. In the standard setting, we are given $$n$$ samples from a single distribution which is either (say) uniform or far from uniform, and we wish to test which is the case. The authors here study the problem where we are given a single sample from each of $$n$$ different distributions which are either all uniform or far from uniform, and we wish to test which is the case. By additionally assuming a structural condition in the latter case (it is argued that some structural condition is necessary), they give sample-optimal algorithms for testing uniformity, identity, and closeness.

Random Restrictions of High-Dimensional Distributions and Uniformity Testing with Subcube Conditioning, by Clément L. Canonne, Xi Chen, Gautam Kamath, Amit Levi, and Erik Waingarten (ECCC, arXiv). By now, it is well-known that testing uniformity over the $$n$$-dimensional hypercube requires $$\Omega(2^{n/2})$$ samples — the curse of dimensionality quickly makes this problem intractable. One option is to assume that the distribution is product, which causes the complexity to drop to $$O(\sqrt{n})$$. This paper instead assumes one has stronger access to the distribution — namely, one can receive samples conditioned on being from some subcube of the domain. With this, the paper shows that the complexity drops to the near-optimal $$\tilde O(\sqrt{n})$$ samples. The related problem of testing whether a distribution is either uniform or has large mean is also considered.
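To make the access model concrete, here is a toy Python sketch that simulates one subcube-conditional sample from an ordinary sampler by rejection (names are mine; the point of the model is precisely that this access is granted as a primitive, since rejection sampling can be exponentially slow for low-mass subcubes):

```python
import random

def subcube_sample(sampler, fixed, max_tries=100_000):
    """Simulate one subcube-conditional sample by rejection.

    `sampler()` returns a tuple in {0,1}^n; `fixed` maps coordinate
    index -> forced bit.  Correct, but potentially exponentially slow
    when the subcube has tiny mass under the distribution.
    """
    for _ in range(max_tries):
        x = sampler()
        if all(x[i] == b for i, b in fixed.items()):
            return x
    raise RuntimeError("subcube has negligible mass under this sampler")

n = 4
uniform = lambda: tuple(random.randint(0, 1) for _ in range(n))
x = subcube_sample(uniform, {0: 1, 2: 0})  # condition on x_0 = 1, x_2 = 0
print(x[0], x[2])  # 1 0
```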

Property Testing of LP-Type Problems, by Rogers Epstein, Sandeep Silwal (arXiv). An LP-Type problem (also known as a generalized linear program) is an optimization problem sharing some properties with linear programs. More formally, it consists of a set of constraints $$S$$ and a function $$\varphi$$ which maps subsets of $$S$$ to some totally ordered set, such that $$\varphi$$ possesses certain monotonicity and locality properties. This paper considers the problem of testing whether $$\varphi(S) \leq k$$, or whether at least an $$\varepsilon$$-fraction of the constraints in $$S$$ must be removed for $$\varphi(S) \leq k$$ to hold. The paper gives an algorithm with query complexity $$O(\delta/\varepsilon)$$, where $$\delta$$ is a dimension measure of the problem. This is applied to testing problems for linear separability, smallest enclosing ball, smallest intersecting ball, and smallest volume annulus. The authors also provide lower bounds for some of these problems.

Near-Optimal Algorithm for Distribution-Free Junta Testing, by Xiaojin Zhang (arXiv). This paper presents an (adaptive) algorithm for testing juntas, in the distribution-free model with one-sided error. The query complexity is $$\tilde O(k/\varepsilon)$$, which is nearly optimal. Algorithms with this sample complexity were previously known under the uniform distribution, or with two-sided error, but this is the first paper to achieve it in the distribution-free model with one-sided error.

Optimal Adaptive Detection of Monotone Patterns, by Omri Ben-Eliezer, Shoham Letzter, Erik Waingarten (arXiv). Consider the problem of testing whether a function has no monotone increasing subsequences of length $$k$$, versus being $$\varepsilon$$-far from having this property. Note that this is a generalization of testing whether a function is monotone (decreasing), which corresponds to the case $$k = 2$$. This work shows that the adaptive query complexity of this problem is $$O_{k,\varepsilon}(\log n)$$, matching the lower bound for monotonicity testing. This is in comparison to the non-adaptive query complexity, which is $$O_{k,\varepsilon}((\log n)^{\lfloor \log_2 k\rfloor})$$. In fact, the main result provides a certificate of being far, in the form of a monotone increasing subsequence of length $$k$$.

# News for October 2019

Last month was a lean month, with only three papers: one on direct product testing, one on finding forbidden patterns in a sequence, and one (an update of a paper which we had missed in the Spring) on quantum distribution testing.

Direct sum testing – the general case, by Irit Dinur and Konstantin Golubev (ECCC). Say a function $$f\colon \prod_{i=1}^d [n_i] \to \mathbb{F}_2$$ is a direct sum if it can be factored as $$f(x_1,\dots,x_d)=\sum_{i=1}^d f_i(x_i)$$, where $$f_i\colon [n_i]\to\mathbb{F}_2$$.
This paper provides a 4-query tester (i.e., a proximity oblivious tester (POT)) for the direct sum property, reminiscent of (and relying on) the BLR linearity test: specifically, draw two subsets $$S,T\subseteq [d]$$ and two inputs $$x,y\in \prod_{i=1}^d [n_i]$$ u.a.r., and accept iff
$$f(x)+f(x_Sy)+f(x_Ty)+f(x_{S\Delta T}y) = 0\,.$$
The main theorem of the paper shows that the probability that this simple test rejects is lower bounded (up to a constant factor) by the distance of $$f$$ to being a direct sum. (The authors also provide a different POT making $$d+1$$ queries, but with a simpler analysis.)
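For concreteness, one round of this test is easy to simulate in Python (a sketch of the test as described above, with the function accessed as a black box; $$x_S y$$ denotes $$x$$ with the coordinates in $$S$$ replaced by those of $$y$$):

```python
import random

def hybrid(x, y, S):
    """The string x_S y: x with coordinates in S replaced by those of y."""
    return tuple(y[i] if i in S else x[i] for i in range(len(x)))

def direct_sum_test(f, sizes):
    """One round of the 4-query direct sum test over prod_i [n_i] -> F_2."""
    d = len(sizes)
    x = tuple(random.randrange(n) for n in sizes)
    y = tuple(random.randrange(n) for n in sizes)
    S = {i for i in range(d) if random.random() < 0.5}
    T = {i for i in range(d) if random.random() < 0.5}
    # S ^ T is the symmetric difference of the two sets
    total = f(x) ^ f(hybrid(x, y, S)) ^ f(hybrid(x, y, T)) ^ f(hybrid(x, y, S ^ T))
    return total == 0

# a genuine direct sum f(x) = sum_i f_i(x_i) mod 2 always passes
sizes = [3, 5, 4]
tables = [[random.randint(0, 1) for _ in range(n)] for n in sizes]
f = lambda x: sum(t[v] for t, v in zip(tables, x)) % 2
print(all(direct_sum_test(f, sizes) for _ in range(100)))  # True
```

Each coordinate's contribution appears an even number of times across the four queries when $$f$$ is a direct sum, so the sum always vanishes; the content of the paper is the converse direction.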

Finding monotone patterns in sublinear time, by Omri Ben-Eliezer, Clément Canonne, Shoham Letzter, and Erik Waingarten (ECCC). Given a function $$f\colon [n]\to\mathbb{R}$$, a monotone subsequence of size $$k$$ is a $$k$$-tuple of indices $$i_1 < \dots <i_k$$ such that $$f(i_j) < f(i_{j+1})$$ for all $$j$$. This work considers (non-adaptive) one-sided testing of monotone-subsequence-freeness, or, equivalently, the task of finding such a monotone subsequence in a function promised to contain many of them. (This, in particular, generalizes the problem of one-sided monotonicity testing, which is the case $$k=2$$.) The main result is a full characterization of the query complexity of this question (for constant $$k$$): strange as the exponent may seem, $$\Theta_\varepsilon( (\log n)^{\lfloor \log_2 k\rfloor} )$$ queries are necessary and sufficient. The proof relies on a structural dichotomy result, stating that any far-from-free sequence either contains “easy to find” increasing subsequences with increasing gaps between the elements, or has a specific hierarchical structure.
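As a sanity check on the object being sought, here is the standard patience-sorting scan for a strictly increasing subsequence of length $$k$$. This is of course a full $$O(n \log n)$$ pass over the data — the whole point of the paper is to find such a subsequence with polylogarithmically many queries when many copies are promised:

```python
from bisect import bisect_left

def has_monotone_pattern(values, k):
    """Does the sequence contain a strictly increasing subsequence of length k?

    Patience-sorting LIS: tails[j] holds the smallest possible last
    element of an increasing subsequence of length j + 1 seen so far.
    """
    tails = []
    for v in values:
        j = bisect_left(tails, v)
        if j == len(tails):
            tails.append(v)
        else:
            tails[j] = v
        if len(tails) >= k:
            return True
    return False

print(has_monotone_pattern([5, 1, 6, 2, 7], 3))  # True (e.g. 5, 6, 7)
print(has_monotone_pattern([3, 2, 1], 2))        # False
```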

Quantum Closeness Testing: A Streaming Algorithm and Applications, by Nengkun Yu (arXiv). This paper is concerned with quantum distribution testing in the local model, which only allows a very restricted (albeit, as the author argues, more natural and easier to implement) type of measurements, and is particularly well-suited to a streaming setting. The main contribution of this paper is to show a connection to classical distribution testing, allowing one to obtain quantum distribution testing upper bounds from their classical distribution testing counterparts. In more detail, the paper shows that, from local measurements of two $$d$$-dimensional quantum states $$\rho,\sigma$$, one can provide access to two classical distributions $$p,q$$ on $$\approx d^2$$ elements such that (i) $$\| p-q\|_2 \approx \|\rho-\sigma\|_2/d$$ and (ii) $$\| p\|_2,\| q\|_2 = O(1/d)$$.
Using this connection, the paper proceeds to establish a variety of upper bounds for testing several distribution properties in the local quantum model.

# News for Sept 2019

~~Five~~ Six papers this month: results on testing separations, linearity testing in $$\mathbb{R}^n$$, testing for regular languages, graph property testing, topological property testing, and Boolean rank.

Hard properties with (very) short PCPPs and their applications, by Omri Ben-Eliezer, Eldar Fischer, Amit Levi, and Ron D. Rothblum (arXiv). Probably the most significant takeaway from this work is a (largest possible) separation between standard and tolerant property testing. PCPPs (Probabilistically Checkable Proofs of Proximity) are the “NP” variant of property testing, where the tester is aided by a proof string. Consider a property $$\mathcal{P}$$. If $$x \in \mathcal{P}$$, there must be a proof string that makes the tester accept (with probability 1). If $$x$$ is far from $$\mathcal{P}$$ (in the usual property testing sense), then for any proof string, the tester must reject with sufficiently high probability. PCPPs have played a role in the classical constructions of PCPs, but have also found uses in getting a better understanding of property testing itself. And this paper shows how PCPP constructions can be used to get property testing separations. The main result in this paper is a property $$\mathcal{P}$$ that (basically) requires $$\Omega(n)$$ queries to “property test”, but has a PCPP system where the proof length is $$\widetilde{O}(n)$$. ($$n$$ is the input length.) The main construction uses collections of random linear codes. Significantly, these constructions show a strong separation between standard vs tolerant property testing, and standard vs erasure-resilient property testing. (The latter is a recent variant by Dixit et al., where certain parts of the input are hidden from the tester.) There is a property that is testable in a constant number of queries, but requires $$\widetilde{\Omega}(n)$$ queries to test tolerantly (for any non-trivial choice of parameters). An analogous result holds for erasure-resilient testing.

Distribution-Free Testing of Linear Functions on R^n, by Noah Fleming and Yuichi Yoshida (arXiv). Linearity testing is arguably the canonical problem in property testing, yet there is still much to be learned about it. This paper considers functions $$f: \mathbb{R}^n \to \mathbb{R}$$, and the distribution-free setting. (In this setting, distance is measured according to an unknown distribution $$\mathcal{D}$$ over the input, and the tester can access samples from this distribution. For $$\mathbb{R}^n$$, the “standard” distribution would be the $$n$$-dimensional Gaussian.) The main result is that linearity testing can be done in the distribution-free setting with $$\widetilde{O}(1/\varepsilon)$$ queries, assuming that the distribution is continuous. The primary technical tool, an interesting result in its own right, is that additivity $$(f(x+y) = f(x) + f(y))$$ can be tested in $$\widetilde{O}(1/\varepsilon)$$ queries. The significance of the testing result is cemented by an $$\Omega(n)$$ lower bound for sample-based testers.
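The additivity check at the heart of this is simple to state in code. The sketch below just repeatedly samples pairs and queries $$f$$ at three points — illustrative only, not the paper's actual tester (which must be far more careful; all names here are mine):

```python
import random

def additivity_test(f, sample, trials=100, tol=1e-9):
    """Reject iff some sampled pair violates f(x + y) == f(x) + f(y).

    `sample()` draws a point from the unknown distribution D; we query
    f at x, y, and the coordinate-wise sum x + y.
    """
    for _ in range(trials):
        x, y = sample(), sample()
        z = [a + b for a, b in zip(x, y)]
        if abs(f(z) - (f(x) + f(y))) > tol:
            return False  # certificate of non-additivity
    return True

n = 3
w = [1.5, -2.0, 0.25]
linear = lambda x: sum(wi * xi for wi, xi in zip(w, x))
gauss = lambda: [random.gauss(0, 1) for _ in range(n)]
print(additivity_test(linear, gauss))                    # True
print(additivity_test(lambda x: linear(x) + 1.0, gauss))  # False
```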

Sliding window property testing for regular languages by Moses Ganardi, Danny Hucke, Markus Lohrey, Tatiana Starikovskaya (arXiv). Fix a regular language $$\mathcal{R}$$. Consider the streaming model, and the basic question of recognizing whether the string (being streamed) is in $$\mathcal{R}$$. Simple, you will say! Run the DFA recognizing $$\mathcal{R}$$ in constant space. Now, suppose there is a sliding window of length $$n$$. The aim is to determine if the past $$n$$ symbols (the “active window”) form a string in $$\mathcal{R}$$. Surprisingly (at least to me), there is a full characterization of the space required for randomized algorithms, and (depending on $$\mathcal{R}$$) it is either $$\Theta(1)$$, $$\Theta(\log\log n)$$, $$\Theta(\log n)$$, or $$\Theta(n)$$. In the interest of beating these lower bounds, suppose we wish to property test on the active window. It turns out the answer is quite nuanced. There are deterministic $$O(\log n)$$-space testers and randomized two-sided $$O(1/\varepsilon)$$-space testers for all regular languages. For randomized one-sided testers, there are multiple possibilities for the optimal space complexity, and there is a full characterization of these regular languages.
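The "simple" baseline — running the DFA in constant space over the whole stream — looks like this in Python (a toy example; the sliding-window problem is exactly what this does not handle, since a single DFA state cannot undo the effect of the window's expiring oldest symbol):

```python
def stream_membership(transitions, start, accepting, stream):
    """Constant-space streaming membership for a regular language.

    `transitions[(state, symbol)]` is the DFA transition function; we
    keep only the current state, so space never grows with the stream.
    Yields, after each symbol, whether the prefix so far is in the language.
    """
    state = start
    for symbol in stream:
        state = transitions[(state, symbol)]
        yield state in accepting

# DFA for "even number of b's" over the alphabet {a, b}
delta = {('even', 'a'): 'even', ('even', 'b'): 'odd',
         ('odd', 'a'): 'odd', ('odd', 'b'): 'even'}
answers = list(stream_membership(delta, 'even', {'even'}, "abba"))
print(answers)  # [True, False, True, True]
```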

A characterization of graph properties testable for general planar graphs with one-sided error (It’s all about forbidden subgraphs) by Artur Czumaj and Christian Sohler (arXiv). Property testing of sparse graphs has been receiving more attention, but most results focus on the bounded-degree setting. Unfortunately, many of these results break quite dramatically on sparse graphs with unbounded degrees. This paper focuses on property testing within the class of unbounded-degree planar graphs. (Meaning, the input is always assumed to be planar.) The results achieve a significant goal: as the title suggests, there is a complete characterization of properties that are constant-query testable with one-sided error. The easier part is in showing that all such properties can be reduced to testing $$H$$-freeness. The harder (remarkable) result is that $$H$$-freeness can be tested in general planar graphs with a constant number of queries. (This is non-trivial even for triangle-freeness.) And, as is easy to conjecture but hard to prove, these results carry over to all minor-closed families. As a small indication of the challenge: most testers for bounded-degree graphs work by doing constant-depth BFSes. When high-degree vertices are present, this method fails, and we really need new ideas to deal with such graphs.

Near Coverings and Cosystolic Expansion – an example of topological property testing by Irit Dinur and Roy Meshulam (ECCC). In most algebraic settings, property testing results can be seen as local to global theorems. When do local constraints on a large object imply a global condition? This paper gives a topological instantiation of this phenomenon. We need to define the cover of a simplicial complex $$X$$. For concreteness, think of a 2D simplicial complex $$X$$, which is a hypergraph with hyperedges of size at most 3, where subsets of hyperedges are also present. A 2-cover is a simplicial complex $$X’$$ with the following property. It has two copies of each vertex of $$X$$. Each hyperedge of $$X$$ must have two “corresponding” disjoint copies in $$X’$$. Let the copies of vertex $$v$$ be $$v_0, v_1$$. Then, for every hyperedge (say) $$(u,v,w)$$ of $$X$$, there must be two disjoint hyperedges in $$X’$$ involving copies of the corresponding vertices. One can consider the property testing twist: if the neighborhoods of “most” vertices $$v$$ in $$X$$ satisfy this condition (with respect to the neighborhoods of the copies of $$v$$ in $$X’$$), then is $$X’$$ close to being a cover of $$X$$? Indeed, this paper proves that such a “property testing condition” holds iff $$X$$ is a high-dimensional expander.

Property testing of the Boolean and binary rank by Michal Parnas, Dana Ron, and Adi Shraibman (arXiv). The Boolean rank of a matrix $$M$$ is a fundamental quantity that appears in many lower bound constructions. (Recall that an $$n \times m$$ Boolean matrix $$M$$ has Boolean rank at most $$d$$ if $$M$$ can be expressed as $$X \cdot Y$$, where $$X$$ is an $$n \times d$$ Boolean matrix, $$Y$$ is a $$d \times m$$ Boolean matrix, and products and sums are interpreted as logical ANDs and ORs.) In the real-valued setting, results show that one can property test rank in $$poly(d/\varepsilon)$$ queries. This paper proves an analogous result for the Boolean rank. There is a surprise element here: over the reals, the rank can be computed in polynomial time, and many of the geometric intuitions can be brought over to the property testing problem. On the other hand, the Boolean rank is NP-hard to compute exactly, yet we can still get a tester with $$poly(d)$$ query complexity. The paper also gives results for the binary rank. For the binary rank, we require the component matrices $$X, Y$$ to be Boolean, but the algebraic operations are over the reals. In this case, the tester has query complexity $$2^{2d}$$ (with varying dependencies on $$\varepsilon$$ for adaptive/non-adaptive testers). The intriguing open problem is whether $$poly(d)$$-query testers exist for the binary rank.
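The distinction between the two ranks is just a distinction between semirings, which a few lines of Python make concrete (a toy example of my own: the same 0/1 factors can be multiplied out two ways, and only one of the products is itself a 0/1 matrix):

```python
def boolean_product(X, Y):
    """Matrix product over the Boolean semiring: entries are ORs of ANDs."""
    return [[int(any(X[i][k] and Y[k][j] for k in range(len(Y))))
             for j in range(len(Y[0]))] for i in range(len(X))]

def integer_product(X, Y):
    """The same 0/1 factors multiplied over the integers (binary rank)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

X = [[1, 1]]       # 1 x 2 Boolean factor
Y = [[1], [1]]     # 2 x 1 Boolean factor
print(boolean_product(X, Y))  # [[1]]
print(integer_product(X, Y))  # [[2]]
```

Overlapping 1s collapse under OR but add up over the reals, which is why a width-$$d$$ Boolean factorization need not certify binary rank $$d$$.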

# News for August 2019

A comparatively slow month, as summer draws to a close: we found three papers online. Please let us know if we missed any! (Edit: And we added two papers missed from June.)

Testing convexity of functions over finite domains, by Aleksandrs Belovs, Eric Blais, and Abhinav Bommireddi (arXiv). This paper studies the classic problem of convexity testing, and proves a number of interesting results on the adaptive and non-adaptive complexity of this problem in single- and multi-dimensional settings. In the single-dimensional setting on domain $$[n]$$, they show that adaptivity doesn’t help: the complexity is $$O(\log n)$$ in both cases. However, in the simplest two-dimensional setting, a domain of $$[3] \times [n]$$, they give a polylogarithmic upper bound in the adaptive setting, but a polynomial lower bound in the non-adaptive setting, showing a strong separation. Finally, they provide a lower bound for $$[n]^d$$ which scales exponentially in the dimension. This leaves open the tantalizing question: is it possible to avoid the curse of dimensionality when testing convexity?
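For intuition, convexity over $$[n]$$ is just the statement that discrete slopes are non-decreasing. Here is the exhaustive check (the testers, of course, query only $$O(\log n)$$ of these points):

```python
def is_convex(values):
    """Exact convexity check on domain [n]: successive differences
    (discrete slopes) must be non-decreasing."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d1 <= d2 for d1, d2 in zip(diffs, diffs[1:]))

print(is_convex([9, 4, 1, 0, 1, 4, 9]))  # True  (i -> (i - 3)**2)
print(is_convex([0, 2, 1]))              # False
```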

Testing Isomorphism in the Bounded-Degree Graph Model, by Oded Goldreich (ECCC). This work investigates the problem of testing isomorphism of graphs, focusing on the special case when the connected components are only polylogarithmically large (the general bounded-degree case is left open). One can consider the setting where a reference graph is given explicitly, and we must test whether a graph we have query access to is isomorphic to it. This can be shown to be equivalent (up to polylogarithmic factors) to testing (from queries) whether a sequence is a permutation of a reference sequence. In turn, this can be shown to be equivalent to the classic distribution testing question of testing (from samples) whether a distribution is equal to some reference distribution. The same sequence of equivalences almost works for the case where there is no reference graph/sequence/distribution, but we only have query/query/sample access to the objects. The one exception is that the reduction from testing distributions to testing whether a sequence is a permutation breaks down, due to challenges involving sampling with and without replacement. However, the author still shows the lower bound which would be implied by such a reduction, by adapting Valiant’s proof for the distribution testing problem to this case.

Learning Very Large Graphs with Unknown Vertex Distributions, by Gábor Elek (arXiv). In this note, the author studies a variant of distribution-free property testing on graphs, in which (roughly) neighboring vertices have probabilities of bounded ratio, and a query reveals this ratio. Applications to local graph algorithms and connections to dynamical systems are also discussed.

EDIT: We apparently missed two papers from June — the first paper was accepted to NeurIPS 2019, the second to COLT 2019.
The Broad Optimality of Profile Maximum Likelihood, by Yi Hao and Alon Orlitsky (arXiv). Recently, Acharya, Das, Orlitsky, and Suresh (ICML 2017) showed that the Profile Maximum Likelihood (PML) estimator enables a unified framework for estimating a number of distribution properties, including support size, support coverage, entropy, and distance to uniformity, obtaining estimates which are competitive with the best possible. The approach is rather clean: simply estimate the PML of the distribution (i.e., the maximum likelihood distribution of the data, if the labels are discarded and only the multiplicities of elements are kept), and apply the plug-in estimator (i.e., if you want to estimate entropy, compute the entropy of the resulting PML distribution). The present work shows that PML is even more broadly applicable — such an approach applies to any property which is additive, symmetric, and appropriately Lipschitz. They also show specific results for many other properties which have been considered in the past, including Rényi entropy, distribution estimation, and identity testing.
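The profile and the plug-in step are easy to illustrate in Python. Computing the actual PML distribution is the computationally interesting part, which this sketch sidesteps by plugging in the empirical distribution instead:

```python
import math
from collections import Counter

def profile(samples):
    """The profile: the multiset of multiplicities, labels discarded."""
    return sorted(Counter(samples).values())

def plugin_entropy(dist):
    """Shannon entropy (in nats) of an explicitly given distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

samples = ['a', 'b', 'a', 'c', 'a', 'b']
print(profile(samples))  # [1, 2, 3]

# plug-in step, applied here to the empirical distribution
# (PML would instead use the distribution maximizing the profile's likelihood)
n = len(samples)
print(round(plugin_entropy([m / n for m in profile(samples)]), 3))  # 1.011
```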

Sample-Optimal Low-Rank Approximation of Distance Matrices by Piotr Indyk, Ali Vakilian, Tal Wagner, David Woodruff (arXiv). Getting a rank $$k$$ approximation of an $$n \times m$$ matrix $$M$$ is about as classic a problem as it gets. Suppose we wanted a running time of $$O(n+m)$$, which is sublinear in the matrix size. In general, this is not feasible, since there could be a single large entry that dominates the matrix norm. This paper studies the case where the matrix is itself a distance matrix. So there is an underlying point set in a metric space, and the $$(i, j)$$th entry of $$M$$ is the distance between the $$i$$th and $$j$$th points. Previous work showed the existence of $$O((n+m)^{1+\gamma})$$ time algorithms (for arbitrarily small constant $$\gamma > 0$$, with polynomial dependence on $$k$$ and the error parameters). This work gives an algorithm that runs in $$\widetilde{O}(n+m)$$ time. The main idea is to sample the rows and columns according to row/column norms.
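The norm-sampling idea is classic (in the spirit of Frieze–Kannan style sketching). Here is a toy Python sketch of the column-sampling step only — with exact norms computed for clarity, whereas the whole difficulty in the sublinear regime is estimating these norms from few entries (names and the rescaling omitted here are mine):

```python
import random

def norm_sample_columns(M, s):
    """Sample s column indices with probability proportional to the
    squared column norm.  (The rescaling that makes the sampled
    submatrix an unbiased sketch of M is omitted for brevity.)"""
    cols = list(range(len(M[0])))
    weights = [sum(row[j] ** 2 for row in M) for j in cols]
    return random.choices(cols, weights=weights, k=s)

# distance matrix of the 1-D points [0, 1, 10]: the last column dominates
pts = [0.0, 1.0, 10.0]
M = [[abs(a - b) for b in pts] for a in pts]
idx = norm_sample_columns(M, 4)
print(idx)
```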