
News for August 2024

Apologies for the late post! After the relative silence of last month, we’re getting back to a moderate cadence of papers: some distribution testing, quantum testing, and learning-meets-testing. We’ve also added a non-sublinear paper on distributions that should be of interest. And here’s the roundup.

Efficient Testable Learning of General Halfspaces with Adversarial Label Noise by Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, and Nikos Zarifis (arXiv). This paper is on the recent model of testable learning introduced by Rubinfeld-Vasilyan. Consider learning halfspaces over the Gaussian distribution. We have sample access to an input function \(f:\mathbb{R}^d \to \{0,1\}\), and our aim is to learn the halfspace closest to \(f\). The samples come from some fixed, underlying distribution \(\mathcal{D}\). But it is often infeasible to validate this distributional assumption, even in the property testing framework. A tester-learner pair will try to test this assumption, and if the test accepts, apply the learning algorithm. The guarantee is twofold: if \(\mathcal{D}\) is indeed (say) Gaussian, then the learning algorithm works as desired. And if the tester accepts and the learner outputs a hypothesis (say, a halfspace) \(h\), then \(h\) is guaranteed to be close to \(f\) according to \(\mathcal{D}\), even if \(\mathcal{D}\) is far from Gaussian. This last part makes the whole setup interesting: the distribution tester might fail to reject the distribution, but we can still trust the output hypothesis! There have been many results on learning homogeneous halfspaces under the Gaussian distribution, where the hypothesis class consists of halfspaces passing through the origin. This paper is about general (inhomogeneous) halfspaces. At first blush, this might seem like a simple issue; we’re just adding a constant term to the halfspace. But existing techniques break down, often because they solve an optimization problem of minimizing error within a band close to the origin. This paper gives a careful reduction to the “nearly” homogeneous case, and gives the first polynomial-sample tester-learner pair for this problem.
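
To make the contract concrete, here is a minimal sketch (in Python, with names of our own invention) of how a tester-learner pair fits together. The moment-based check below is a toy stand-in for the paper’s actual tester, which is far more refined:

```python
import numpy as np

def looks_standard_gaussian(X, tol=0.1):
    """Toy stand-in for a distribution tester: check that the empirical mean
    and covariance of the sample are close to those of N(0, I). (The paper's
    tester is much more sophisticated than this.)"""
    d = X.shape[1]
    return (np.linalg.norm(X.mean(axis=0)) <= tol
            and np.linalg.norm(np.cov(X.T) - np.eye(d)) <= tol)

def tester_learner(X, y, learner, tol=0.1):
    """The tester-learner contract: either reject the sample, or output a
    hypothesis that is trustworthy on the true sampling distribution D."""
    if not looks_standard_gaussian(X, tol):
        return None          # reject: distributional assumption looks violated
    return learner(X, y)     # accept: the returned hypothesis carries a guarantee
```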

Tolerant testing stabilizer states by Srinivasan Arunachalam and Arkopal Dutt (arXiv). Let us start with a familiar classical problem: low degree testing. Given access to a function \(f:\{0,1\}^n \to \{0,1\}\), we wish to test if \(f\) is a quadratic polynomial (over \(\mathbb{F}_2\)). There exist testers based on the Gowers norm: basically, compute various (constant dimensional) discrete derivatives and check that they are consistent with a quadratic function. These discrete derivatives can be analyzed using Fourier analysis, and the main aim is to show that a function that “locally” (in terms of derivatives) behaves like a quadratic is indeed globally close to being one. This method can be used to get tolerant testers for quadratic polynomials. This paper gives a quantum generalization of this testing result. The input is an \(n\)-qubit state \(|\psi_f\rangle\), promised to be a “phase state”. A phase state has a Boolean function \(f\) embedded in it: one can write it as a linear combination of the \(2^n\) “basis” states \(|x\rangle\), with coefficients \((-1)^{f(x)}\) (up to normalization). A “stabilizer state” is one where the function \(f\) is a quadratic (or so I believe; I’m probably exposing my ignorance of quantum mechanics). This paper uses the Gowers norm techniques to give the first quantum tolerant tester for stabilizer states.
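
For intuition on the classical side, here is a sketch of the derivative-based quadraticity test (our own rendition of the standard test, with inputs encoded as \(n\)-bit integers): over \(\mathbb{F}_2\), every third discrete derivative of a degree-2 polynomial vanishes, so we check random ones. The paper’s quantum tolerant tester is, of course, much more involved:

```python
import random

def quadraticity_test(f, n, trials=200):
    """Classical low-degree test: f: {0,1}^n -> {0,1} has degree <= 2 over
    F_2 iff every third discrete derivative vanishes, i.e. XORing f over
    the 8 points x + (any subset of {a, b, c}) always gives 0. We check
    random choices of (x, a, b, c); a far-from-quadratic f fails often."""
    for _ in range(trials):
        x, a, b, c = (random.getrandbits(n) for _ in range(4))
        total = 0
        for s in range(8):          # the 8 points of the 3-dimensional cube
            pt = x
            if s & 1: pt ^= a
            if s & 2: pt ^= b
            if s & 4: pt ^= c
            total ^= f(pt)
        if total:
            return False            # a violated derivative: reject
    return True
```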

Improved Bounds for High-Dimensional Equivalence and Product Testing using Subcube Queries by Tomer Adar, Eldar Fischer, and Amit Levi (arXiv). Consider distribution testing on high-dimensional data, so the domain is \(\{0,1\}^n\). A nice model for distribution testing is the subcube conditioning model of Bhattacharyya and Chakraborty. Suppose we fix the coordinates in some subset \(S \subseteq [n]\) to values \(x_S\). We can then generate a sample \(y\) from the distribution conditioned on \(y_S = x_S\) (meaning, \(y\) agrees with \(x\) on \(S\)). The problem is to perform equivalence testing of distributions in this model. Previous results gave \(O(n^2/\varepsilon^2)\)-query algorithms, and this paper gives a significantly improved algorithm making \(O(n/\varepsilon^2)\) queries. Interestingly, the algorithm only makes weaker queries. One distribution is accessed only through “marginal queries”: given \(x_S\) as before, we sample just the marginal distribution of a single coordinate \(i\), conditioned on \(S\) being fixed to \(x_S\). (Hence, the output is a single bit. The other distribution is accessed by prefix queries, a weaker version of subcube queries.) These generalizations lead to further results: testing equivalence in the interval query model, and testing whether a distribution is a product distribution. This paper also proves an \(\Omega(\sqrt{n}/\varepsilon^2)\) lower bound for testing product distributions.
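
To illustrate the query model itself (not the paper’s algorithm), here is a naive simulation of a marginal query given plain sampling access, via rejection sampling; the interface is our own:

```python
def marginal_query(sample, x_S, i, max_tries=100_000):
    """Simulate a marginal query: draw y ~ D (via `sample`, which returns a
    tuple in {0,1}^n) conditioned on y agreeing with x_S (a dict mapping
    the coordinates in S to bits), and reveal only coordinate i. Naive
    rejection sampling, purely for illustration."""
    for _ in range(max_tries):
        y = sample()
        if all(y[j] == b for j, b in x_S.items()):
            return y[i]   # the answer is a single bit
    raise RuntimeError("conditioning event too rare for rejection sampling")
```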

Parallel Sampling via Counting by Nima Anari, Ruiquan Gao, and Aviad Rubinstein (arXiv). This isn’t a “typical sublinear algorithm” paper per se, but I think it is quite interesting to those of us who think about sublinearity, adaptivity, and distributions, and it has connections to the previous paper. Our aim is to sample from an unknown distribution \(\mu\) over \([q]^n\), given access to “marginal queries” as described above. This problem appears in large language models, wherein neural nets are trained to produce marginals (the next word), but the output is a sentence (a list of words). Observe that there is a simple \(O(n)\)-round algorithm. Without fixing anything, query the marginal of the first coordinate and sample a value for it. Fixing the first coordinate to that value, query the marginal of the second coordinate. Fix the first two coordinates, rinse and repeat. This requires \(n\) rounds of marginal queries. In this model, polynomially many marginal queries can be made in a single round, and the aim is to minimize the number of rounds (basically, bounding the adaptivity). This paper gives a \(\widetilde{O}(n^{2/3})\)-round procedure for sampling, and shows an \(\Omega(n^{1/3})\) lower bound.
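
The \(n\)-round baseline is easy to write down. A sketch, where `marginal(prefix, i)` is our assumed one-round interface returning the conditional distribution of coordinate \(i\) given the coordinates fixed so far:

```python
import random

def sequential_sample(marginal, n, q):
    """The simple n-round sampler: in round i, query the marginal of
    coordinate i conditioned on the prefix fixed so far, sample a value,
    and fix it. One adaptive round per coordinate, hence n rounds total."""
    prefix = []
    for i in range(n):
        p = marginal(tuple(prefix), i)   # list of q conditional probabilities
        prefix.append(random.choices(range(q), weights=p)[0])
    return tuple(prefix)
```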

News for January 2021

The first month of 2021 has brought with it 5 papers, covering graph testing, Boolean function testing, and distribution testing — as well as database theory. Let’s dive in.

Random walks and forbidden minors III: \(\mathrm{poly}(d/\varepsilon)\)-time partition oracles for minor-free graph classes, by Akash Kumar, C. Seshadhri, and Andrew Stolman (arXiv). Bounded-degree graphs from minor-closed families have a very nice property: denoting by \(d\) the degree bound and \(n\) the number of vertices, it is always possible to partition any such graph into components of constant size, \(O_\varepsilon(1)\), just by removing a linear number of edges, merely \(\varepsilon dn\) (\(\varepsilon\) being a small parameter). This is a crucial result in many graph property testing algorithms, those which rely on something called a “partition oracle”: loosely speaking, a routine which makes “few” queries to the graph, and is able to indicate which component of an underlying partition any given vertex belongs to. But what is “few” here? The first partition oracles made \(d^{\mathrm{poly}(d,1/\varepsilon)}\) queries to the graph to answer any such request. This later got significantly improved to \(d^{\log(d/\varepsilon)}\). Using spectral graph theory techniques previously developed by the authors (hence the “III” in the title), this work settles the question, achieving partition oracles which are as good as it gets: they make only \(\mathrm{poly}(d,1/\varepsilon)\) queries to the graph! This in turn has immediate consequences for graph property testing, which the paper details.
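
To give a flavor of the interface (and only the interface), here is a toy local routine: once the \(\varepsilon dn\) edges have somehow been removed, a local BFS recovers the small component of any query vertex. The actual hard part, which the paper solves, is deciding locally and consistently which edges to remove; that is entirely elided here:

```python
from collections import deque

def component_of(v, neighbors, size_bound):
    """Toy routine underlying a partition oracle: BFS from v in the
    already-partitioned graph and return v's component, assuming all
    components have at most size_bound vertices. `neighbors(u)` yields
    the neighbors of u after the edge removals -- the part a real
    partition oracle must simulate locally."""
    seen, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in neighbors(u):
            if w not in seen:
                seen.add(w)
                queue.append(w)
                if len(seen) > size_bound:
                    return None   # shouldn't happen for a valid partition
    return frozenset(seen)        # canonical name for v's component
```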

And since we are on the topic of oracles… more exciting news on that front:

Spectral Clustering Oracles in Sublinear Time, by Grzegorz Gluch, Michael Kapralov, Silvio Lattanzi, Aida Mousavifar, and Christian Sohler (arXiv). Given a graph \(G\) and an integer \(k\), one often wants to partition the graph into components \(C_1,C_2,\dots,C_k\), such that each \(C_i\) is well-connected and has few edges to the other \(C_j\)’s. But can we get ultra-efficient algorithms for such spectral clusterings? Specifically, can we design oracles for them: sublinear-time algorithms which provide implicit access to an underlying “good” spectral clustering \(\hat{C}_1,\hat{C}_2,\dots,\hat{C}_k\), by returning, on any query vertex \(v\), the index of the cluster \(\hat{C}_i\) to which \(v\) belongs? This paper introduces the question and answers it in the affirmative: in more detail, it provides a spectral clustering oracle which, for any \(\varepsilon>0\), has preprocessing time \(2^{\mathrm{poly}(k/\varepsilon)}n^{1/2+O(\varepsilon)}\), query time \(n^{1/2+O(\varepsilon)}\), and space \(n^{1/2+O(\varepsilon)}\), and provides access to a clustering with relative error \(O(\varepsilon\log k)\) per cluster. The paper also gives tradeoffs between query time and space, and discusses applications to the Local Computation Algorithms (LCA) model.

Next stop, distribution testing…

The Sample Complexity of Robust Covariance Testing, by Ilias Diakonikolas and Daniel M. Kane (arXiv). Suppose you have i.i.d. samples from some high-dimensional Gaussian \(\mathcal{N}(0,\Sigma)\) in \(d\) dimensions, and want to test whether the unknown covariance matrix \(\Sigma\) is the identity, versus \(\varepsilon\)-far from it (in Frobenius norm). Good news: we know how to do that, and \(\Theta(d/\varepsilon^2)\) samples are necessary and sufficient. (To learn \(\Sigma\), that’d be \(\Theta(d^2/\varepsilon^2)\).) Bad news: you don’t have i.i.d. samples from some high-dimensional Gaussian \(\mathcal{N}(0,\Sigma)\); what you have are i.i.d. samples from a noisy version of it, \((1-\alpha)\mathcal{N}(0,\Sigma) + \alpha B\), where \(B\) is an arbitrary “bad” distribution (not necessarily Gaussian itself). You still want to test whether the covariance \(\Sigma\) is the identity, but now you have that extra \(\alpha\) fraction of noisy samples, and you need to do the testing robustly. The good news is, you can still do that by learning the covariance matrix \(\Sigma\) robustly, with \(O(d^2/\varepsilon^2)\) samples. The bad news is the main result of this paper: that’s also the best you can do. That is, \(\Omega(d^2)\) samples are necessary: if you have to be robust to noise, testing is no longer easier than learning…
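
For context, the non-robust tester can be based on a simple unbiased estimate of \(\|\Sigma - I\|_F^2 = \mathrm{tr}(\Sigma^2) - 2\,\mathrm{tr}(\Sigma) + d\), using \(\mathbb{E}[(x^\top y)^2] = \mathrm{tr}(\Sigma^2)\) for independent samples \(x, y\) and \(\mathbb{E}[\|x\|^2] = \mathrm{tr}(\Sigma)\). A sketch (ours, and very much non-robust; the paper shows that nothing of this sort can survive the adversarial \(\alpha\)-fraction with \(o(d^2)\) samples):

```python
import numpy as np

def cov_identity_stat(X):
    """Unbiased estimate of ||Sigma - I||_F^2 from zero-mean samples X
    (shape (2m, d)), split into disjoint pairs. Large values suggest
    Sigma is far from the identity. Completely breaks under adversarial
    contamination -- which is the point of the paper's lower bound."""
    m = len(X) // 2
    A, B = X[:m], X[m:2 * m]
    est_tr_sigma2 = np.mean(np.sum(A * B, axis=1) ** 2)     # E[(x^T y)^2]
    est_tr_sigma = 0.5 * (np.mean(np.sum(A**2, axis=1))
                          + np.mean(np.sum(B**2, axis=1)))  # E[||x||^2]
    d = X.shape[1]
    return est_tr_sigma2 - 2 * est_tr_sigma + d
```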

Onto Boolean functions!

Junta Distance Approximation with Sub-Exponential Queries, by Vishnu Iyer, Avishay Tal, and Michael Whitmeyer (ECCC). If you follow this blog, you may have seen over the past couple of years a flurry of results about tolerant junta testing: “given query access to some Boolean function \(f\colon\{-1,1\}^n\to\{-1,1\}\), how close is \(f\) to only depending on \(k \ll n\) of its variables?”
This paper contains several results on this problem, including an improved bicriteria tolerant testing algorithm: an efficient algorithm making \(\mathrm{poly}(k,1/\varepsilon)\) queries to distinguish between functions \(\varepsilon\)-close to a \(k\)-junta and those \(1.1\varepsilon\)-far from every \(k'\)-junta (with \(k' = O(k/\varepsilon^2)\)). But the main result of the paper, and the one giving it its name, is for the non-relaxed version where \(k'=k\): while all previous works had query complexity \(2^{O(k)}\), the authors show how to break that exponential barrier, giving a fully tolerant testing algorithm with query complexity \(2^{\tilde{O}(\sqrt{k})}\)!
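
As a taste of the most basic primitive in this area (far simpler than anything in the paper), here is how one estimates the influence of a single coordinate, with inputs encoded as \(n\)-bit integers; junta testers build on batched versions of such estimates over random sets of coordinates:

```python
import random

def influence(f, n, i, samples=1000):
    """Estimate the influence of coordinate i of f: {0,1}^n -> {0,1}:
    the probability over random x that flipping bit i changes f(x).
    Coordinates outside a k-junta's relevant set have zero influence."""
    flips = sum(f(x) != f(x ^ (1 << i))
                for x in (random.getrandbits(n) for _ in range(samples)))
    return flips / samples
```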

And finally, a foray into database theory:

Towards Approximate Query Enumeration with Sublinear Preprocessing Time, by Isolde Adler and Polly Fahey (arXiv). In this paper, the authors are concerned with the task of (approximate) query enumeration on databases, aiming for ultra-efficient (i.e., sublinear-time) algorithms. Leveraging techniques from property testing (specifically, in the bounded-degree graph model), they show the following:
On input databases of bounded degree and bounded tree-width, every (fixed) first-order definable query can be enumerated approximately in time linear in the output size, after only a sublinear-time preprocessing phase.

That’s all for this month! If you noticed a paper we missed, please let us know in the comments.

News for October 2020

Sorry for the delay in writing this monthly digest: we hope you didn’t spend the week frantically refreshing your browsers! We found four papers this month: let’s dive in.

A Structural Theorem for Local Algorithms with Applications to Coding, Testing, and Privacy, by Marcel Dall’Agnol, Tom Gur, and Oded Lachish (ECCC). This paper introduces the notion of a “robust (local) algorithm,” an abstraction which encompasses many types of algorithms, such as property testing algorithms and locally decodable codes. The main result of this work is that any (possibly adaptive) \(q\)-query robust local algorithm can be transformed into a non-adaptive, sample-based one making \(n^{1-1/(q^2\log q)}\) queries (where \(n\) is the input size). Here, “sample-based” means that the algorithm doesn’t get to make arbitrary queries, but only observes randomly chosen coordinates of the input. As applications of this result, the authors derive new upper and lower bounds for several of the types of local algorithms mentioned above, resolving open questions from previous works.

Testing Tail Weight of a Distribution Via Hazard Rate, by Maryam Aliakbarpour, Amartya Shankha Biswas, Kavya Ravichandran, and Ronitt Rubinfeld (arXiv). The authors consider, from the point of view of distribution testing, the question of deciding whether data is heavy-tailed: as is, for instance, data following a power law. The paper first sets out to formalize the question, discussing several possible definitional choices before settling on one; it then provides a test, and analyzes its sample complexity as a function of various parameters (such as the smoothness of the unknown distribution). The authors finally back these results with an empirical evaluation of their algorithm.
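
The central quantity is simple to state. A sketch, with `pdf` and `cdf` assumed to be numeric functions; very roughly, a hazard rate decaying to zero is the signature of a heavy tail (as for power laws), while light-tailed laws keep it bounded away from zero (the paper’s actual definitions are more careful than this):

```python
def hazard_rate(pdf, cdf, x):
    """Hazard rate h(x) = f(x) / (1 - F(x)). For example, an exponential
    with rate lam has constant hazard rate lam, sitting at the boundary
    between light and heavy tails."""
    return pdf(x) / (1.0 - cdf(x))
```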

On Testing of Samplers, by Kuldeep S. Meel, Yash Pote, and Sourav Chakraborty (arXiv). Suppose you are given a (known) distribution \(p\) over some domain \(\Omega\), and want to sample from it conditioned on some predicate \(\varphi\). Now someone comes to you with an algorithm which does exactly that, efficiently, and cheaply: great! But can you easily check that you’re not getting fooled, and that this sampler actually does what it claims? This paper provides exactly that: an algorithm which accepts if the sampled distribution is \(\varepsilon\)-close to what it should be (roughly, in a multiplicative, KL-divergence sense), and rejects if it’s \(\varepsilon'\)-far (in total variation distance). The number of samples required is polynomial in \(1/(\varepsilon'-\varepsilon)\), and depends on a characteristic of \(p\) and \(\varphi\), the “tilt”: the ratio between the max and min probability of the conditional distribution.
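
The “tilt” is easy to write down for an explicit distribution; a small sketch of our own formalization of the ratio described above:

```python
def tilt(p, phi):
    """Tilt of (p, phi): the ratio between the largest and smallest
    probability of the conditional distribution p(. | phi). Here p is an
    explicit dict {element: probability} and phi is the predicate; the
    normalization by p(phi) cancels in the ratio."""
    cond = [px for x, px in p.items() if phi(x)]
    return max(cond) / min(cond)
```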

Finally, an omission from late September:
Sample optimal Quantum identity testing via Pauli Measurements, by Nengkun Yu (arXiv). The abstract is concise and clear enough to speak for itself: “In this paper, we show that \(\Theta(\textrm{poly}(n)\cdot\frac{4^n}{\varepsilon^2})\) is the sample complexity of testing whether two \(n\)-qubit quantum states \(\rho\) and \(\sigma\) are identical or \(\varepsilon\)-far in trace distance using two-outcome Pauli measurements.”

Please let us know if you spotted a paper we missed!

News for January 2017

2017 starts off rather slow for property testing. We do, however, have an intriguing paper to report: an experimental analysis of a classic sublinear algorithm.

Evaluating a Sublinear-time Algorithm for the Minimum Spanning Tree Weight Problem, by Gabriele Santi and Leonardo De Laurentiis (arXiv). The Chazelle-Rubinfeld-Trevisan Minimum Spanning Tree algorithm is a classic in sublinear algorithms. This algorithm provides a \((1+\varepsilon)\)-approximation to the MST weight in time independent of the number of vertices (although it does depend on the average degree). But how does this compare with Prim’s algorithm on real instances, on a real (not theoretical) computer? This intriguing paper does a detailed experimental comparison. Having done experimental graph algorithms myself, I can attest to the various challenges: how to choose a test set of graphs? How to set error parameters? Can data structure optimization on the coding side beat asymptotic improvements? This paper runs a series of experiments on synthetic graph generators (such as the Erdős-Rényi, Barabási-Albert, and Watts-Strogatz models). The authors validate the basic CRT algorithm at scale, showing that it is faster than Prim’s algorithm for graphs with more than a million edges. Their experiments suggest that the sublinear-time algorithm gives little benefit when \(\varepsilon \leq 0.2\). The paper has many experiments for a variety of settings, and the authors do a comprehensive study of the various parameters. I’d definitely recommend it to anyone interested in exploring how property testing might influence algorithms in the real world.
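
The identity driving the CRT algorithm is worth restating: for a connected graph with integer weights in \(\{1,\dots,W\}\), the MST weight equals \(n - W + \sum_{i=1}^{W-1} c_i\), where \(c_i\) is the number of connected components of the subgraph using only edges of weight at most \(i\). A sketch that evaluates the identity exactly with union-find (the sublinear algorithm instead estimates each \(c_i\) by local searches from random vertices):

```python
def mst_weight_via_components(n, edges, W):
    """Exact evaluation of the CRT identity MST = n - W + sum_i c(i),
    for a connected graph on n vertices; `edges` is a list of (u, v, w)
    with w an integer in {1, ..., W}. c(i) = #components of the subgraph
    of edges of weight <= i, counted here by union-find."""
    def components(threshold):
        parent = list(range(n))
        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]   # path halving
                a = parent[a]
            return a
        count = n
        for u, v, w in edges:
            if w <= threshold:
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv
                    count -= 1
        return count
    return n - W + sum(components(i) for i in range(1, W))
```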

News for December 2015

Greetings from the exciting Workshop on Sublinear Algorithms at Johns Hopkins University! As this workshop and the upcoming SODA and ITCS conferences get 2016 off to a roaring start, let us take one last look back at property testing news from last year. In December, one work in particular caught my eye:

Non-Local Probes Do Not Help with Graph Problems by Mika Göös, Juho Hirvonen, Reut Levi, Moti Medina, and Jukka Suomela (arXiv). A generalization of property testing that has seen some fascinating developments in the past few years is the local computation algorithms (LCA) model, in which the algorithm is asked to answer some local query (such as “what is the color of this vertex in some fixed, legal coloring of the graph?”) in sublinear time. This paper relates the LCA model to message-passing models and, in the process, gives a powerful new tool for establishing lower bounds for LCAs.