Author Archives: Seshadhri

News for April 2026

Apologies for the major delay! We have a nice bounty of 8 papers, with a big haul of graph property testing papers. Let us not delay even further and get right to it.

Sublinear-query relative-error testing of halfspaces by Xi Chen, Anindya De, Yizhi Huang, Shivam Nadimpalli, Rocco A. Servedio, and Tianqi Yang (arXiv). This paper studies the relative error model for property testing, which is designed for studying “sparse” inputs. Consider functions \(f, g: \mathbb{R}^n \to \{0,1\}\) over the Gaussian measure in \(\mathbb{R}^n\). Let \(p = \Pr[f(x) = 1]\), which one can think of as the “volume” of \(f\). The relative distance of \(g\) from \(f\) is defined as \(\Pr_{x}[f(x) \neq g(x)]/p\). In situations where the volume is tiny, this definition makes more sense than the standard definition, where \(f\) would simply be close to the all zero function. To get non-trivial results, one needs another oracle called SAMP that gives a random sample of a point in \(f^{-1}(1)\). This paper studies the classic problem of halfspace testing. The main results are property testers with query complexity sublinear in \(n\) and polylogarithmic in \(1/p\). Some of these testers are given the value \(p\), and some only use the SAMP oracle (so no querying of arbitrary points). In the full model of standard queries and the SAMP oracle, one can get a \(poly(\log(1/p), \varepsilon)\) query tester.

Classes Testable with \(O(1/\varepsilon)\) Queries for Small ε Independent of the Number of Variables by Nader H. Bshouty and George Haddad (arXiv). Consider property testing in the most classic setting, functions \(f:\{0,1\}^n \to \{0,1\}\). An obvious lower bound for testing any non-trivial property is \(\Omega(1/\varepsilon)\); in fewer queries, one cannot even detect the existence of an \(\varepsilon\)-fraction of the domain that may certify that input \(f\) is not in the property. When can this trivial lower bound be matched? Specifically, this paper studies parameterized properties where the tester query complexity is \(O(\psi + 1/\varepsilon)\), where \(\psi\) depends on the property parameters, and the big-Oh only hides an “absolute” constant. Take the classic problem of \(k\)-junta testing. A \(k\)-junta is a function that depends only on \(k\) variables. An old result of Blais gives a tester with query complexity \(O(k\log k + k/\varepsilon)\). But note that this does not match the trivial lower bound as \(k\) becomes large. This paper gives a tester with complexity \(O(k2^k + 1/\varepsilon)\), thereby matching the trivial lower bound even when \(k\) has some (small) dependence on \(\varepsilon\). This paper gives a number of such results for well studied properties like low Fourier degree and being a sparse polynomial.

Distributed Quantum Property Testing with Communication Constraints by Mina Doosti, Ryan Sweke, and Chirag Wadhwa (arXiv). This paper solves a quantum analogue of a distributed distribution testing problem, introduced by Acharya-Canonne-Tyagi. Suppose \(m\) machines have access to a distribution \(\rho\). An algorithm wishes to test if this distribution is close to a known distribution \(\sigma\) (the identity testing problem), and can do independent communication with each machine. This problem is studied under constraints on the amount of communication and the amount of shared randomness between the machines. The quantum equivalent of a distribution is a quantum state and communication can be either classical bits or qubits. So \(\rho, \sigma\) denotes quantum states, and the aim is property test if they are the same. This is called the quantum “certification” problem, which is the analogue of identity testing) in the distributed setting. The main results are in the setting where there is no classical communication, and the machines only send qubits.

A characterization of one-sided error testable graph properties in bounded degeneracy graphs by Oded Lachish, Amit Levi, Ilan Newman, and Felix Reidl (arXiv). For the setting of general graphs, the standard model is the Goldreich-Ron adjacency list model, as clarified by Czumaj-Sohler. Given a vertex, one can fetch a uniform random neighbor. In this model, Czumaj-Sohler completely characterized one-sided (constant query) testable properties as those described by subgraph freeness. The main result there was to show that \(H\)-freeness is constant time testable when the input graph \(G\) is minor-free. A natural super-class of minor-free graphs is that of bounded degeneracy graphs. A graph has bounded degeneracy if all subgraphs have bounded average degree; this is a rich class of graphs containing all minor-free classes, and often occurring in practice. It has been shown that testing four cycle-freeness in bounded degeneracy graphs requires \(\Omega(n^{1/4})\) queries. So it raises the natural question of characterizing the patterns \(H\) for which \(H\)-subgraph freeness is testable in constant queries. And indeed, this paper achieves exactly that, with a neat characterization.

Lower Bounds for Testing Directed Acyclicity in the Unidirectional Bounded-Degree Model by Yuichi Yoshida (arXiv). While property testing of sparse graphs has received some attention, sublinear algorithms for directed graphs is relatively unexplored. Problems become surprisingly hard, and analysis is even harder. Consider property testing of bounded outdegree graphs with no bound on indegree. The model only gives access to the out-neighborhoods, not the in-neighborhood. This asymmetry lies at the root of hardness. This paper studies the classic property of being acyclic. The main results are new lower bounds that are significant improvements over past results. Let \(n\) be the number of vertices. Ignoring log factors, One-sided testers for acyclicity require \(\Omega(n^{2/3})\) queries and two-sided testers require \(\Omega(\sqrt{n})\) queries. And tolerant testers essentially require reading the whole graph, with a lower bound of \(\Omega(n)\) queries.

Quantum Property Testing for Bounded-Degree Directed Graphs by Pan Peng and Jingyu Wu (arXiv). Same setting as above, but assume that both outdegree and indegrees are bounded. We consider the bidirectional model, which gives access to both in and out neighborhoods, as opposed to the (“standard”) unidirectional model. Pang-Wang proved that there is a property that requires \(O(1)\) queries in bidirectional model, but (almost) \(\Omega(n)\) queries in the unidirectional model. (Ignoring dependencies of \(\varepsilon\) and the degree \(d\) for clarity.) Remarkably, this paper shows that such a lower bound does not hold in the quantum model. The main result is that any property that is classically testable with \(O(1)\) queries in bidirectional model can be tested with \(o(\sqrt{n})\) queries in undirectional model with a quantum computer. Moreover, there exists a hard property with a nearly matching lower bound. One of these consequences of this result is a quantum algorithm for testing \(H\)-freeness in the unidirectional model.

Sublinear Spectral Clustering Oracle with Little Memory by Ranran Shen, Xiaoyi Zhu, Pan Peng, and Zengfeng Huang (arXiv). The focus of this paper, as the title clearly says, are sublinear spectral clustering oracles. Consider an bounded degree input graph \(G\) that is \(k\)-clusterable, which means that (say) one can remove a small fraction of edges, to get at most \(k\) connected components, each of which has high internal conductance. A line of work has shown that one can answer cluster membership queries in sublinear time, without any preprocessing of \(G\). So this is a Local Computation Algorithm (LCA) for the clusters. Any LCA has a storage budget where it keeps a data structure to ensure that all query answers are consistent with a single solution. Previous results by Peng showed that there is a clustering oracle with \(O(\sqrt{n})\) space, such that each query can be answered in \(O(\sqrt{n})\) time. One can obviously use linear space and answer queries in constant time. This paper shows a smooth tradeoff between these solutions: given \(M\) storage, there is an oracle that answers questions in \(O(n/M)\) time.

Locally Computable High Independence Hashing by Yevgeniy Dodis, Shachar Lovett, and Daniel Wichs (ECCC). Not exactly a sublinear algorithms result, but one can interpret it as an LCA result. Consider the classic problem of constructing a family \(\mathcal{F}\) of \(k\)-wise independent hash functions. Let each function be of the form \(f:\{0,1\}^n \to \{0,1\}^n\). These constructions need a seed of $O(kn)$ bits. In many settings, \(k\) is quite large, say \(poly(n)\) (so going beyond the typical constant \(k\) setting). How many queries, denoted \(t\), to the seed does it take to compute \(f(x)\), for an input \(x \in \{0,1\}^n\)? Thinking of \(n\) as small and \(k\) as large, this is exactly an LCA for the function \(f\). An old lower bound of Siegel proves that computing \(f(x)\) requires \(\Omega(t)\) queries to the seed. The main result of this paper is giving a construction that matches this bound; unfortunately, the construction is non-constructive and depends on the existence of certain expanders. If we relax the independence condition to “almost” independent (up to some precision parameter \(\varepsilon\)), then the paper gives an explicit construction.

News for March 2026

After either really busy months or really quiet months, we’re now on a moderately busy month (which is the way it should be). A collection of 7 papers, now showing an increased interest in sublinear graph algorithms. Of course, we still have some distribution testing, and a paper on sparse functional representations.

A note on approximating the average degree of bounded arboricity graphs by Talya Eden, C. Seshadhri (arXiv). Let us begin with this simple note, since it will set the stage for the other papers. Consider the basic problem of estimating the average degree of a (simple) graph \(G = (V,E)\), under the standard adjacency list model. We assume that \(V = [n]\), and an algorithm can get the degree of a vertex and query for uar neighbors. Classic results prove that there are \(\widetilde{O}(\varepsilon^{-2} n/\sqrt{m})\) query (and time) algorithms. Moreover, when the graph arboricity is \(\alpha\), one can get \(\widetilde{O}(\varepsilon^{-2} \alpha)\) query algorithms. These algorithms and results are unfortunately buried deep inside papers, or have extraneous \(\varepsilon\) and \(\log n\) factors. This note give a simple, clean presentation of these results; the main result just takes two pages.

Almost-Uniform Edge Sampling: Leveraging Independent-Set and Local Graph Queries by Tomer Adar and Amit Levi (arXiv). Same setting as the previous blurb. A result of Eden-Rosenbaum (among others) showed that the problem of sampling a uniform random edge can be done in the same query complexity as estimating the number of edges. This is quite interesting, since these problems (at least from a sublinear perspective) are not similar, and the underlying algorithms are quite different. This paper studies this phenomenon in more general models. Very recent results (from Jan 2026!) by Adar-Hotam-Levi showed that average degree estimation can be done much faster with Independent Set (IS) queries together with standard adjacency list queries. Essentially, for sparse graphs, one can get \(O(n^{1/4})\) queries, as opposed to the usual \(\sqrt{n}\) bound. This paper shows that edge sampling can be done the same time as edge estimation (with nearly matching lower bounds). This “complexity equivalence” is also proven for the model with only IS queries.

Testing Properties of Edge Distributions by Yumou Fei (arXiv). Let us bring in some distributions to the (dense) graph setting. Consider an input graph \(G\) with \(n\) vertices, represented as an adjacency matrix. Suppose there is an unknown distribution \(\mu\) over \([n]^2\) that we can sample from. Our aim is to test if (say) the graph is bipartite, where distance is defined according to \(\mu\). If \(\mu\) is uniform over all edges, then this is standard property testing. With arbitrary \(\mu\), we are looking at bipartiteness testing in the distribution-free setting. The classic Goldreich-Goldwasser-Ron result shows that bipartiteness testing (in the standard, uniform setting) can be done in \(poly(\varepsilon^{-1})\) queries. They also prove an \(\Theta(n)\) bound when the distribution \(\mu\) is uniform over an arbitrary subset of edges. This papers gives a bipartiteness tester for any \(\mu\), with the same asymptotic complexity. Moreover, there are results for testing triangle-freeness and square-freeness in this setting; the complexities are \(\Theta(n^{4/3})\) and \(\Theta(n^{9/8})\) respectively. (Note that in the standard setting, we get complexities independent of \(n\).)

Optimal Prediction-Augmented Algorithms for Testing Independence of Distributions by Maryam Aliakbarpour, Alireza Azizi, Ria Stevens (arXiv). Now onto distribution testing. Consider the problem of determining if a distribution \(p\) with support \([n_1] \times [n_2] \times \ldots [n_d]\) is a product distribution. This is a fundamental problem that goes back to the birth of statistics and Pearson’s \(\chi^2\) test. This paper studies the problem in the learning augmented setting. Suppose we are also given a complete description of a predicted approximate distribution \(\widehat{p}\), and a tolerance \(\alpha\). If \(\|\widehat{p} – p\|_{TV} \geq \alpha\), then the algorithm can output “Inaccurate Information”. So, if the predictor \(\widehat{p}\) is accurate, then the algorithm needs to give a correct output. The main result shows that one can improve on the optimal sample complexity by \(\alpha^{1/3}\) factors, so the approximate prediction can help avoid the lower bounds for independence testing.

Improved Local Computation Algorithms for Greedy Set Cover via Retroactive Updates by Slobodan Mitrović, Srikkanth Ramachandran, Ronitt Rubinfeld, Mihir Singhal (arXiv). Consider the classic set cover problem; given a collection of sets \(\mathcal{S}\) over a universe \(\mathcal{E} = \bigcup_{S \in \mathcal{S}} S\), find the smallest subcollection that covers \(\mathcal{S}\). For the sublinear perspective, assume that the input is represented as a bipartite incidence graph between \(\mathcal{S}\) and \(\mathcal{E}\), with adjacency list access. The aim is to get a Local Computation Algorithm (LCA) that computes a \(O(\log \Delta)\)-approximate set cover (where \(\Delta\) is the largest size in \(\mathcal{S}\)). An LCA is an algorithm that, given some \(S \in \mathcal{S}\), outputs (in sublinear time) whether this set is in the output subcollection. The main challenge is to bound the query complexity of this operation. The guarantee is that all the output sets should have the desired guarantees. There are distributed protocols that can output an \(O(\log \Delta)\)-approximate set cover in \(r = O(\log \Delta \cdot \log f)\) rounds (\(f\) is the largest number of sets that an element participates in). Parnas-Ron showed that one can get an LCA using \(\Delta^r\) queries. This paper gives an LCA that makes \(2^r\) queries, which avoids numerous roadblocks in previous work.

Approximate Butterfly Counting in Sublinear Time by Chi Luo, Jiaxin Song, Yuhao Zhang, Kai Wang, Zhixing He, Kuan Yang (arXiv). This is an interesting applied algorithms paper, that at its core, uses a sublinear algorithm. A significant amount of graph data is really a bipartite graph (like actor-movie, author-paper, etc.). A butterfly is a \(K_{2,2}\), which is also a 4-cycle. Counting butterflies is a common data mining problem, since a high density of \(K_{2,2}\)’s is usually indicative of some structure or clustering in the bipartite graph. There have been a number of practical algorithms to compute (or estimate) butterfly counts in bipartite graphs. This paper gives a theoretical sublinear algorithm that runs in \(\widetilde{O}(w\sqrt{m}/b + m/b^{1/4})\), where \(w\) is the wedge (2-path) count and \(b\) is butterfly count. This algorithm is implemented and compared with previous results. Interestingly, the comparison is done on queries and raw running time, and even the latter shows orders of magnitude improvement. This kind of work bridges the theory/practice divide, and shows the power of pure theoretical ideas (like heavy-light vertex partitioning, various edge sampling strategies) for practical algorithmics. Hopefully this inspires even more work along these lines!

Testing Sparse Functions over the Reals by Vipul Arora, Arnab Bhattacharyya, Philips George John, Sayantan Sen (arXiv). The problems of linearity/low degree testing are well-known to our readers. There is a recent line of work generalized such results to the real-valued setting. Consider \(f: \mathbb{R}^n \to \mathbb{R}\). Distance is measured in \(l_1\) over the Gaussian measure. This paper studies a number of properties parameterized by sparsity. Consider linear functions, where \(f(x)\) is linear iff \(f(x) = \sum_i c_i x_i\), for some constants \(c_i\). A function is \(k\)-linear if only \(k\) of the \(c_i\) coefficients are non-zero. So the function has a small representation, independent of the dimension \(n\). This paper gives a tester that makes \(\widetilde{O}(k\log k + 1/\varepsilon)\) queries. Similar results are given for \(k\)-sparse low-degree polynomials and the general problem of \(k\)-juntas. There is a significant subtlety in querying a real-valued function. One can imagine that the value \(f(x)\) is only obtained with some precision \(\eta\). There are results for \(k\)-linearity for the standard Boolean setting, but the precision and \(l_1\)-distance introduce challenges for generalizing to the real valued setting.

News for January 2026

After some insanely busy months, we have a merely busy month. A couple of neat results on sublinear graph algorithms (a subject dear to me!), DNF testing, and two expositions that should be of interest to our readers.

When Local and Non-Local Meet: Quadratic Improvement for Edge Estimation with Independent Set Queries by Tomer Adar, Yahel Hotam, Amit Levi (arXiv). We’ve seen a lot of recent activity of the fundamental problem of edge estimation in graph. Given a simple, connected, undirected graph \(G = (V,E)\) with \(n\) vertices, we want to get a \((1+\varepsilon)\)-approximation to the number of edges \(m\). In the standard adjacency list access model, a classic result of Goldreich-Ron proves that the sample complexity of this problem is \(\Theta(n/\sqrt{m})\) (ignoring \(poly(\varepsilon^{-1} \log n)\) factors). A completely different access model is the Independent Set (IS) model. For any set \(S\), an oracle outputs a single bit indicating whether an edge is present in \(S\). Again, in this model, the optimal bound is \(\Theta(n/\sqrt{m})\). This paper shows that with access to all oracles (standard adjacency list and IS queries), one gets a quadratic improvement and the complexity is \(\Theta(\sqrt{n/\sqrt{m}})\).

Spectral Clustering in Birthday Paradox Time by Michael Kapralov, Ekaterina Kochetkova, Weronika Wrzos-Kaminska (arXiv). Consider a (bounded degree) graph \(G\) that is can be partitioned into \(k\) roughly equal “expanding clusters”. This means that one can remove an \(\varepsilon\)-fraction of the edges to get \(k\) connected components of size around \(n/k\), each of which has expansion at least \(\phi\). The aim is to get a sublinear oracle that determines the cluster that a vertex belongs to. The origins of this problem go back to problems in expansion testing and local expansion reconstruction. From a spectral perspective, the ideal “cluster classifier” is to simply take the bottom \(k\) eigenvectors of the Laplacian, and project every vertex onto this vector space. The corresponding embeddings of vertices in the same cluster will be close. This paper effectively implements this procedure in sublinear time; specifically \(\widetilde{O}(\sqrt{n})\) time, using a birthday paradox type collision argument to estimate the embedding similarities.

DNF formulas are efficiently testable with relative error by Xi Chen, William Pires, Toniann Pitassi, Rocco A. Servedio (arXiv). The relative error model for Boolean functions was recently proposed as an analogy to sparse graph testing. The standard notion of distance between two functions \(f, g: \{0,1\}^n \to \{0,1\}\) is just \(\| f – g\|_0/2^n\). But this is not meaningful when \(|f^{-1}(1)| \ll 2^n\). A natural notion of distance is to consider the sets \(f^{-1}(1)\) and \(g^{-1}(1)\) and measure their symmetric difference. One can now define distances to properties analogously. There have been numerous property testing results with this notion of distance, called the “relative error” setting. This paper considers the property of being a (small) DNF. Learning an \(s\)-term DNF requires \(n^{O(\log(s/\varepsilon))}\) queries. This paper shows that the (relative error) testing of \(s\)-term DNFs can be done in \(poly(s/\varepsilon)\) queries.

A short note on (distribution) testing lower bounds via polynomials by Clément Canonne (ECCC). As the title says, this is a short note on a fundamental question of proving lower bounds for distribution testing. For your favorite distribution testing problem, to prove a lower bound, you start with a “Yes” distribution \(\mathcal{Y}\) and a “No” distribution \(\mathcal{N}\). So \(\mathcal{Y}\) would satisfy the desired property (say, uniformity) and \(\mathcal{N}\) would not. Then, you need to argue some sort of “statistical indistinguishability” of samples. So the distribution of the set \(s\) samples from \(\mathcal{Y}\) is almost the same as that from \(\mathcal{N}\). How does one prove the latter? A convenient lemma shows that if the first \(\ell\) moments of the distributions are the same, then we get a \(\Omega(k^{1-1/\ell})\) sample lower bound (\(k\) is the support size). The key insight is that such “matching moment” distributions/random variables can be generated by looking at specific univariate polynomials. It’s a really clever trick that looks at the powers of roots scaled by the derivative at those points. A really nice read overall!

Mathematical and computational perspectives on the Boolean and binary rank and their relation to the real rank by Michal Parnas (arXiv). This is long survey on the notion of Boolean and binary rank, and an overview of their use in TCS as well as methods to bound this rank. Section 7.3 gives an excellent discussion of property testing problems on Boolean rank. It also gives some of the main open questions regarding property testing low Boolean rank.

News for October 2025

Another busy month in property testing. We have nine papers, covering a range of topics from distribution testing, pattern-freeness testing, and even LLMs! Let’s start the journey.

Proving Natural Distribution Properties is Harder than Testing Them by Tal Herman and Guy Rothblum (ECCC). This paper is on interactive proof systems for distribution testing. Suppose an untrusted prover claims to have discovered some property of a distribution \(\mathcal{D}\). The verifier can sample from \(\mathcal{D}\) and communicate with the prover, in the hope of verifying this property more efficiently that running a property tester directly. There is a rich line of work showing that many non-trivial properties can be verified much faster than testing directly. But in all these results, the honest prover learns the entire distribution and, thus has sample complexity \(\Omega(N)\) (where \(N\) is the domain size). This paper asks whether doubly-sublinear non-trivial protocols exists, wherein the honest prover has sample complexity \(o(N)\) and the verifier is more efficient than directly testing. The main result is hardness: for many natural properties, if the honest prover has \(o(N)\) sample complexity, then the verifier complexity is the same as the testing complexity. This says that proving is (provably!) more difficult than directly testing. The main technical work is in showing such a result for estimating collision statistics of the distribution. Collision statistics form the basis of most label-invariant distribution testing algorithms, leading to the various results in the paper.

Testing forbidden order-pattern properties on hypergrids by Harish Chandramouleeswaran, Ilan Newman, Tomer Pelleg, and Nithin Varma (arXiv). Monotonicity is arguably one of the most fundamental properties studied in property testing. It is a special case of “forbidden order” properties. Consider a function \(f:[n]^d \to \mathbb{R}\), and permutation \(\pi\) of \([k]\), for some constant \(k\). The function is “\(\pi\)-free” if there do not exist \(x_1 \prec x_2 \prec \ldots \prec x_k\) where \(\pi\) is the permutation of the sorting of \(f(x_1), f(x_2), \ldots, f(x_k)\). (We use \(\prec\) for the standard coordinate-wise partial order.) So monotonicity is equivalent to \((2,1)\)-freeness (or \((1,2)\)-freeness depending on ascending/descending order), since it suffices to consider the order of pairs of domain points. Even over the line (\(d=1\)), forbidden order properties exhibit a rich theory. This paper initiates the study of forbidden order freeness for the \(d=2\) case. For \(\pi = (1,2,3)\)-freeness, the paper gives a \(poly(\log n)\) time property tester. But curiously, any one-sided tester for \(\pi=(1,3,2)\)-freeness requires \(\Omega(\sqrt{n})\) queries. There is an interesting adaptivity gap: there is an adaptive \(n^{4/5}\)-query tester for any \(\pi\) for \(k=3\), but a corresponding non-adaptive (one-sided) tester requires \(\Omega(n)\) queries.

Relative-error unateness testing by Xi Chen, Diptaksho Palit, Kabir Peshawaria, William Pires, Rocco A. Servedio, Yiding Zhang (arXiv). Another variant of monotonicity testing. A function \(f:\{0,1\}^n \to \{0,1\}\) is unate if it is either non-decreasing or non-increasing along every dimension. (A monotone function must be non-decreasing in every dimension.) Interestingly, depending on the setting, unateness can be harder or have the same complexity as monotonicity testing. The paper investigates the recent “relative-error model”. In the standard setting, the distance between two functions \(f, g\) is the (normalized) Hamming distance \(\| f -g \|_0/2^n\). In the relative-error setting, we measure the distance as the symmetric difference between the “ones”, so it is \(|f^{-1}(1) \Delta g^{-1}(1)|/|f^{-1}(1)|\). This distinction is analogous to dense vs sparse graph property testing. This paper shows that the non-adaptive complexity of unateness testing is (ignoring \(\varepsilon\) factors) is \(\widetilde{\Theta}(\log N)\), where \(N = |f^{-1}(1)|\). There is also an adaptive \(\Omega((\log N)^{2/3})\) bound. (Thanks for the post by M.C. on the correction. -Ed) The upper bound is the same as recently obtained for monotonicity in this model.

Uniformity Testing under User-Level Local Privacy by Clément L. Canonne, Abigail Gentle, Vikrant Singhal (arXiv). Let us revisit the standard uniformity testing problem with user-level privacy. There are \(n\) users, each of whom holds \(m\) samples of a distribution \(\mathcal{D}\). The domain size is denoted as \(k\). These users communicate with a central server, whose aim is to test \(\mathcal{D}\) for uniformity (or equality, or whatever property you care for). This is a practically motivated problem where each user is a phone that does not want to share its private data, but the server wishes to perform some learning on all the data. If each user held one sample, the setting is called “local differential privacy”: we want the algorithm/tester output to be nearly identical if a single user changes their data. It is known that, for public coin protocols, such a tester requires \(\Theta(k)\) samples in total (in contrast with the \(\Theta(\sqrt{k})\) samples for the standard tester). In the paper’s setting, the protocol has the differentially private with respect to changes in any of the user’s data. This is much stronger, since an individual change is a much smaller fraction of the user’s data. If each user held more than \(\sqrt{k}\) samples, then each user could simply run a distribution tester and communicate a single private bit with the server. The interesting case is when \(m \ll \sqrt{k}\). This paper gives a natural tradeoff between \(m\) and \(n\), that generalizes the local DP guarantee. Basically, the paper shows that when \(mn\) is at least \(k\), then one can get uniformity testing with user-level privacy. Different results hold for private coin protocols, where local DP is harder.

Non-iid hypothesis testing: from classical to quantum by Giacomo De Palma, Marco Fanizza, Connor Mowry, Ryan O’Donnell (arXiv). This paper studies the “non-iid” distribution hypothesis/equality testing problem. Suppose there are \(T\) distributions \(p_1, p_2, \ldots, p_T\), over the domain \([d]\). The aim is to test if the average distribution \(\sum_i p_i/T\) is a known distribution \(q\). We are only allowed a few samples from each distribution. A recent result proved if we get just \(2\) samples from each distribution, then equality testing is possible when \(T \gg \sqrt{d}\) (ignoring \(\varepsilon\) factors). On the other hand, this is not possible with just a single sample from each distribution. This paper studies the quantum analogue of this problem. But first, it actually improves the \(\varepsilon\) dependence on the classical bound, matching the optimal \(\sqrt{d}/\varepsilon^2\) bound. In the quantum setting, each “distribution” is a quantum state, which is represented as a \(d \times d\) matrix. Samples of the state are basically projections/eigenvectors. It is known that \(O(d)\) samples suffices to do test if a quantum state is “maximally mixed” (the equivalent of the identity distribution). This paper shows that, in the non-iid setting, even getting one sample from each state suffices for equality testing. This is in contrast with the classical setting, where \(2\) samples are required per distribution.

Near-Optimal Property Testers for Pattern Matching by Ce Jin, Tomasz Kociumaka (arXiv). This is a result on sublinear algorithms for string matching. Consider a pattern string \(P\) of length \(m\) and a text \(T\) of length \(n\), where both are considered part of the input. Our aim is to determine if \(P\) is a substring of \(T\), or if \(P\) has Hamming distance \(k = \varepsilon m\) from every substring of \(T\). Conventionally for this literature, the parameter \(k\) (and not \(\varepsilon\)) is used. There is a simple sampling algorithm that makes \(\widetilde{O}(\sqrt{nm/k} + n/k)\) queries, but has linear running time. The main question is to get an algorithm whose running time also matches this bound. Previous work gave a \(\widetilde{O}((n^2m/k)^{1/3} + n/k)\) time procedure. The main result gives a non-adaptive algorithm whose running time matches the \(\widetilde{O}(\sqrt{nm/k} + n/k)\) query bound. Curiously, one can get an adaptive algorithm that improves the dependence on \(n\) to \(n-m\), so it is faster when the pattern and text are roughly of the same length. These upper bounds are matched with lower bounds, so it proves an adaptivity gap for this regime. The paper gets the complete picture of the time complexity for all ranges of \(n-m\).

Computational Complexity in Property Testing by Renato Ferreira Pinto Jr., Diptaksho Palit, Sofya Raskhodnikova (arXiv). A generalization of the spirit of the previous paper. The paper asks the fundamental question how query complexity and time complexity are related for property testing. A property tester is a \((q(n), t(n))\)-tester if it makes \(q(n)\) queries and has running time \(t(n)\). The paper provides a precise formalization of running time in the RAM model for a sublinear algorithm. The main result is a hierarchy theorem akin to the classic time hierarchy theorem. Assuming SETH, for any (reasonable) \((q(n), t(n))\), there is a property with a \((q(n), t(n))\)-tester that has no \((q'(n), t'(n))\)-tester, where \(q'(n) \ll q(n)\) and \(t'(n) \ll t(n)\). (There is an unconditional hierarchy theorem, with much stronger conditions on \(t'(n)\).) The gaps between query complexity and running time are explored for the classic problem of halfspace testing. For the distribution-free distance approximation problem, there are \(O(d/\varepsilon^2)\)-query testers, but they run in time exponential in \(d\). Assuming the fine-grained hardness of \(k\)-SUM, this paper proves a lower bound showing that the running time of any property tester must have such an exponential dependence.

Sublinear Algorithms for Estimating Single-Linkage Clustering Costs by Pan Peng, Christian Sohler, Yi Xu (arXiv). Single-linkage clustering (SLC) is a common practical algorithm. The input is a weighted graph, where the weight denotes the distance between the objects/vertices. Given a collection of clusters, the distance between two clusters is the distance between the closest pair of vertices. SLC just repeatedly merges the two clusters that are closest, to get a hierarchical clustering. This paper investigates SLC from the sublinear lens. One can define an objective function corresponding to SLC; for a given clustering, the cost is the sum (over clusters) of the minimum spanning trees of each cluster. For any \(k\), one can consider the optimal clustering with \(k\) clusters. The aim is to approximate these costs. The “SLC profile vector” is the list of costs over all \(k\). The main result gives a sublinear time algorithm that approximates this profile vector. The running time is \(O(\sqrt{W} d)\) (ignoring \(poly(\varepsilon^{-1} \log n)\) factors), where \(W\) is the maximum weight and \(d\) is the maximum degree of the graph. Naturally, these results are obtained by building on the classic sublinear time MST approximation algorithm. The algorithms in the paper are quite interesting, and one wonders if there is a potential to implement them.

Prior Makes It Possible: From Sublinear Graph Algorithms to LLM Test-Time Methods by Avrim Blum, Daniel Hsu, Cyrus Rashtchian, Donya Saless (arXiv). This paper is a fascinating take on Retrieval Augmented Generation (RAG) on Large Language Models (LLMs). While existing LLMs can surprise us with their synthesis abilities, they often suffer in learning new context or factual responses. RAGs are a technique used wherein the LLM is given more context through an existing, curated knowledge base to improve its accuracy. The paper models this problem in terms of a knowledge graph. Suppose there is an unknown ground truth graph \(G^*\), which one can think of as a representation of all “facts” in an area. Most question/answer or factual retrievals can be thought as a paths in this graph. (Factual data is sometimes store in graph format, where edge represent relationships between entities.) So our aim is to find a short (even constant length) \(s\)-\(t\) path in \(G^*\). There is a subgraph \(G \subseteq G^*\) that is known as “prior knowledge”. (One could think of this as what the LLM has been trained on.) There is a retrieval mechanism that generates edges from \(G^*\); in property testing language, this is exactly the query model, and provides to connection to sublinear graph algorithms. With a few queries to the retrieval mechanism, we wish to find the \(s\)-\(t\) path. The main result shows that the prior graph \(G\) has to be quite dense to have constant query path algorithms. Otherwise, there is a lower bound of \(\sqrt{n}\) queries, where \(n\) is the number of vertices in \(G^*\). There are many related results on hallucinations, where spurious edges may be in the prior.


News for July 2025

Finally, a post on the first! This month we have four papers, on understanding Nash Equilibria, deeper meditations on distribution testing, sublinear average degree estimation, and a paper on degree distributions that I’m quite excited to see.

Protocols for Verifying Smooth Strategies in Bandits and Games by Miranda Christ, Daniel Reichman, and Jonathan Shafer (arXiv). Consider a \(k\)-player game where each player has \(n\) options. Given a (mixed) strategy for each player, we would like to know if this is an approximate Nash equilibrium. But think of \(n\) as extremely large, so algorithms need to be sublinear in \(n\). This is a natural requirement, since in many real-world games, the independent options come from an extremely large space. This paper studies this setting from the verification perspective. Assume the algorithm/verifier can communicate with an all powerful prover, and has access to the game matrix/tensor. We want to verifier to make sublinear queries to the game and have sublinear communication (in bits) with the prover. The main result shows that, under some “smoothness” condition on the near equilibrium strategies, one can actually get such sublinear verifiers for Nash equilibria. The starting results in the paper are for the simpler multi-armed bandit setting, which are generalized to more complex games.

Faster Estimation of the Average Degree of a Graph Using Random Edges and Structural Queries by Lorenzo Beretta, Deeparnab Chakrabarty, and C. Seshadhri (arXiv). The problem of estimating the average degree of a graph in sublinear time is an old and classic problem. In a seminal work, Goldreich-Ron gave \((1+\varepsilon)\)-approximations with \(\widetilde{O}(\sqrt{n})\) queries, where \(n\) is the number of vertices. (Up to \(\log n, \varepsilon\) factors, the bound is optimal.) This works in the classic model where an algorithm can sample uar vertices, query degrees, and get neighbors. Suppose uar edges are also available. In that case, Motwani-Panigrahy-Xu give an algorithm that makes \(\widetilde{O}(n^{1/3})\) queries. What if one can also query pairs to check if they are edges? This is a standard operation in sublinear graph algorithms. Interestingly, this paper gives an \(\widetilde{O}(n^{1/4})\) query algorithm for this setting. And, if one can get the entire adjacency list in a single query (common in the study of web networks), then there is an \(\widetilde{O}(n^{1/5})\) query algorithm for this setting. This paper, following Beretta-Tetek, also studies the case where the number of vertices \(n\) is unknown, in which case the previous complexities change. Overall, there is a nuanced picture of the complexity of estimating the average degree.

Towards Tight Bounds for Estimating Degree Distribution in Streaming and Query Models by Arijit Bishnu, Debarshi Chanda, Gopinath Mishra (arXiv). In network science, which is study of real-world large graphs, probably one of the most important graph parameters studied is the degree distribution. Formally, let \(N(d)\) be the number of vertices of degree \(d\). The set \(\{N(d)\}\) is sometimes called the “degree distribution”, though to be precise, it is the “complementary cumulative degree histogram” (ccdh). There is a rich applied history of estimating the ccdh, but the only truly sublinear bound was given by Eden-Jain-Pinar-Ron-Seshadhri. The actual complexity was a little involved, and your truly had posed (in WOLA 2019) the question of determining the optimal bound. Which, I’m happy to say, this paper has nailed down. Ignoring log factors, the complexity is \(\Theta(m/h)\), where \(m\) is the number of edges and \(h\) is the, well, \(h\)-index of the degree distribution (the largest number \(k\) such that there are at least \(k\) vertices of degree at least \(k\)). The algorithm, which is similar to the previous result, is eminently practical and uses a combination of vertex and edge sampling to estimate different part of the distribution. Moreover, the algorithm can be implemented in the streaming setting.

Location-Invariant Properties of Functions versus Properties of Distributions: United in Testing but Separated in Verification by Oded Goldreich and Guy Rothblum (ECCC). Let us build up the distribution testing context. Consider a distribution \(\mathcal{D}\) over universe \([n]\). In the standard distribution testing setup, the tester gets samples from this distribution and has to (approximately) decide some property. One could imagine that there is an underlying space/function that actually generates this distribution. So think of \(f:[m] \to [n]\), where \(\mathcal{D}(i)\) is the fraction of the domain values \(j\) where \(f(j) = i\). A distribution property can be translated to the property/set of functions that induce the desired distributions. In the latter setting, there is more power, since a tester can explicitly query \(j \in [m]\) to get \(f(j)\). (While, in the distribution setting, the tester merely gets \(f(j)\) for a uar \(j \in [m]\).) It turns out that property testing in both settings are basically equivalent. This paper shows, to the contrary, there is a gap between these settings for the verification problem. Consider “doubly sublinear interactive protocols”, so that verifer must run with queries sublinear to the vanilla testing complexity, and the honest prover must run in queries sublinear to the learning complexity. For the classic uniformity testing problem, there is no such interactive protocol in the distribution testing setting. But, interestingly, there is a “doubly sublinear interactive protocol” in the functional setting, where the tester can get \(o(\sqrt{n})\) to verify uniformity.

News for June 2025

Another (sigh) delayed post. A moderately busy month, with 4 papers with sublinear graph algorithms and, of course, distribution testing.

A 0.51-Approximation of Maximum Matching in Sublinear \(n^{1.5}\) Time by Sepideh Mahabadi, Mohammad Roghani, and Jakub Tarnawski (arXiv). The title says it all. This paper gives a 0.51-approximation for the maximum matching in, well, \(O(n^{1.5})\) time. Here’s the back story. There are two kinds of sublinear maximum matching algorithms. The first kind (achieved after a long line of impressive results) get \((1- \varepsilon)\)-approximations in time \(O(n^{2 – \Omega_{\varepsilon}(1)})\). This means that that in sublinear \(o(n^2)\) time, one can get arbitrarily good approximations. The second kind beats the simple \(0.5\)-approximation (obtained from maximal matchings), but tries for closer to \(O(n)\) time. BehnezhadRoghaniRubinsteinSaberi gave a \((0.5+\varepsilon)\)-approximation algorithm running in \(n^{1+\varepsilon}\) time. But the largest value of \(\varepsilon\) possible in these results is tiny. Can one get a “significant” improvement over 0.5-approximations in time “significantly less” than \(n^2\)? The answer is yes, as the title explains.

Testing (Conditional) Mutual Information by Jan Seyfried, Sayantan Sen, and Marco Tomamichel (arXiv, ECCC). Consider a distribution over pairs \((A, B)\). The mutual information \(I(A,B)\) of \(A\) and \(B\) is the minimum KL-divergence between the original distribution and any product distribution over the support of \(A\) and \(B\). So the mutual information is zero iff \((A,B)\) is a product distribution, or equivalently \(A\) and \(B\) are independent. A natural distribution testing question is to distinguish \(A\) and \(B\) being independent from \(I(A,B) > \varepsilon\). One can take this problem a step further. Suppose the distribution is over triples \((A,B,C)\). Then can one distinguish conditional independence (\(I(A, B | C) = 0\)) from sufficient conditional dependence, so \(I(A, B | C) > \varepsilon\)? This problem was studied by (including our very own) Canonne-Diakonikolas-Kane-Stewart, and is known that \(O(d^{7/4})\) samples suffice. Here, \(d\) denotes the maximum alphabet size for each coordinate, so the overall domain size is \(d^3\). This paper gives a much more nuanced complexity, that depends separately on the support sizes for each coordinate. It is interesting that each coordinate plays a different role in the final complexity.

Tight simulation of a distribution using conditional samples by Tomer Adar (arXiv). Consider a distribution of \(\{0,1\}^n\). Many distribution testing results would not yield useful algorithms for this setting, since the domain is exponentially large. Inspired by this setting, the subcube conditional model was introduced by Bhattacharyya-Chakraborty. In this setting, we can generate samples conditioned on any subcube. The aim of this paper goes beyond property testing. Can we actually estimate specific values of the distribution with a few queries? Can we simulate the distribution, which means to “locally compute” a distribution that is close to the input? Specifically, given domain point \(x\), we want to compute a value \(\mu(x)\) such that \(\mu\) represents a distribution close to the input. This paper gives such a result, where each \(\mu(x)\) can be computed in \(O(n^2/\varepsilon^2)\) queries. The final distribution has KL-divergence \(\varepsilon^2\) from the original.

Sampling and Identity-Testing Without Approximate Tensorization of Entropy by William Gay, William He, Nicholas Kocurek, and Ryan O’Donnell (arXiv). Again, consider a distribution \(D\) over a large product domain \(\Sigma^n\). Our aim is identity testing, so we wish to determine if \({D} = {D}’\) or \(|{D} – {D}’| > \varepsilon\), where \({D}’\) is some explicitly known distribution. The model is much weaker than subcube condition access. Essentially, we can only get samples from subcubes of dimension (at least) \(n-1\). Blanca-Chen-Štefankovič-Vigoda studied identity testing where \({D}\) satisfies a condition called “approximate tensorization of entropy” (ATE). By Jensen’s inequality, if \({D}\) is a product distribution, the entropy of \({D}\) is at most the sum of entropies of the marginal distributions. If the entropy of \({D}\) is (say) at most a constant factor of this sum, we say it satisfies the ATE. Blanca-Chen-Štefankovič-Vigoda proves that \(\widetilde{O}(n/\varepsilon)\) suffices for identity testing. This paper studies input distributions that are mixtures of (say, a constant number of) distributions satisfying ATE. Interestingly, these mixtures do not satisfy the ATE. But this paper shows that one can still get \(\widetilde{O}(n/\varepsilon)\) query identity testers.

News for February 2025

After a few quiet months on PTReview, the community is back with a bang. We have a selection of nine papers this month, ranging from hypergraphs, vertex coloring, triangle counting, a new take on linearity testing, and a sublinear foray into transformers.

Property Testing in Bounded Degree Hypergraphs by Hugo Aaronson, Gaia Carenini, and Atreyi Chanda (arXiv). We all know about the rich history of property testing and sublinear algorithms for graphs. The space for hypergraphs is far less studied, and likely has an even richer theory. Consider the input being an \(n\) vertex \(k\)-uniform hypergraph, so every hyperedge is a subset of \(n\) vertices. This paper is the first to define the bounded degree setting. The graph is represented as an adjacency list, wherein, for each vertex, we have the list of hyperedges containing that vertex. Distance between hypergraphs is defined in terms of the symmetric difference between the sets of hyperedges. The primary focus of the paper is on the fundamental property of \(k\)-colorability (note that \(k\) is also the hyperedge size). The main lower bound result is that for \(k \geq 3\), this property requires \(\Omega(n)\) queries to test. This is in contrast to the graph setting (\(k = 2\)), where bipartiteness is known to be testable for bounded degree graphs. This paper shows if that inputs have bounded treewidth, then \(k\)-colorability is testable with one-sided error in constant queries.

Simple Sublinear Algorithms for \((\Delta + 1)\) Vertex Coloring via Asymmetric Palette Sparsification by Sepehr Assadi and Helia Yazdanyar (arXiv). Consider access to the adjacency list of a graph with \(n\) vertices and maximum degree \(\Delta\). Most of our readers will be familiar with result of Assadi-Chen-Khanna proving that \((\Delta + 1)\)-coloring can be done in \(\widetilde{O}(n^{3/2})\) time. The key tool is a palette sparsification lemma, that shows that each vertex can independently pick a small list of colors, such that the final coloring is consistent with these lists. This paper proves a weaker version of this lemma with a significantly simpler proof. The result is an asymmetric version, where vertices choose color lists of varying sizes. The average length goes up by a log factor. But the proof is simpler, and it also allows for the final coloring to just use the greedy algorithm (constrained to these lists).

Improved Sublinear Algorithms for Classical and Quantum Graph Coloring by Asaf Ferber, Liam Hardiman, and Xiaonan Chen (arXiv). Another result on \((\Delta +1)\)-coloring. A result of Morris and Song show that there are simpler algorithms that give a \((1+\varepsilon)\Delta\)-coloring. Adapting their techniques, this paper gives a sublinear algorithm for \((\Delta+1)\)-coloring that runs in \(O(n^{3/2}\sqrt{\log n})\) time. This gives a \(\sqrt{\log n}\) factor improvement from the previous results. The algorithm can be accelerated in the quantum model using Grover’s search, to get a running time of \(\widetilde{O}(n^{4/3})\). Finally, the paper gives a \(\widetilde{O}(n^{5/4})\) time quantum algorithm for getting a \((1+\varepsilon)\Delta\)-coloring.

Monotonicity Testing of High-Dimensional Distributions with Subcube Conditioning by Deeparnab Chakrabarty, Xi Chen, Simeon Ristic, C. Seshadhri, and Erik Waingarten (arXiv). In my first foray into distribution testing, this paper studies the problem of testing monotonicity of distributions over the hypercube, \(\{-1,1\}^n\). Observe that the domain is considered to be exponentially large, and the aim is to get distribution testers that make \(\textrm{poly}(n)\) samples. To get such bounds, this paper assumes subcube conditional samples. Given any subhypercube of the domain, the algorithm can generate samples restricted to this subhypercube. A distribution is called monotone, if the corresponding probability mass function is monotone (wrt to the standard partial order on \(\{-1,1\}^n\)). For this setting, this paper shows that property of monotonicity can tested with \(\widetilde{O}(n/\varepsilon^2)\) samples. Moreover, this bound is shown to be optimal, up to polylog factors. One of the main tools is a new directed isoperimetric theorem for real-valued functions on the hypercube.

Compression Barriers in Autoregressive Transformers by Themistoklis Haris and Krzysztof Onak (arXiv). While this paper does not give a sublinear time algorithm, it uses many techniques from the sublinear algorithms toolkit to prove lower bounds. And it is on transformers, arguably one of the most important components for machine learning on sequences and Large Language Models. Consider a sequence of \(n\) vectors \(v_1, v_2, \ldots, v_n \in \mathbb{R}^d\), which one can think of as vector embeddings of \(n\) words (coming from, say, some generated text). Based on extensive learning and model fine tuning, assume that the model has learned some “key” vectors \(k_1, k_2, \ldots, k_n \in \mathbb{R}^d\). The next word is generated from a “query vector” \(q\), using which we compute \(M^{-1} \sum_{\ell \leq n} \exp(q \cdot k_\ell) v_\ell\) (where \(M\) is a normalization factor). Suppose there are \(n\) query vectors, and the aim is to generate the above sum for all the query vectors (these are like the next \(n\) words). In practice, there are many schemes to speed up computation by compressing the key vectors. This paper shows that, in the worst case, a computation of the next word/vector takes \(\Omega(nd)\) time and space. Thus, any sublinear compression/speed up method must fundamentally use properties of the data.

Arboricity and Random Edge Queries Matter for Triangle Counting using Sublinear Queries by Arijit Bishnu, Debarshi Chanda, Gopinath Mishra (arXiv). Triangle counting is a classic problem, with a rich history in sublinear algorithms. The earliest work gives an algorithm running in \(\widetilde{O}(n/t^{1/3} + m^{3/2}/t)\) time, where \(n, m\) are the usual graph parameters and \(t\) is the triangle count. (Polylog and \(\varepsilon\) factors are ignored.) This complexity can be improved to \(\widetilde{O}(n/t^{1/3} + m\alpha/t)\), where \(\alpha\) is the graph arboricity/degeneracy. The standard adjacency list model can be augmented with random edge queries, in which case, it was known that \(O(m^{3/2}/t)\) time algorithms exist. This paper shows that \(O(m\alpha/t)\)-time algorithms exist for the model with random edge queries, which combines the previous results into an even better algorithm.

Improved Sublinear-time Moment Estimation using Weighted Sampling by Anup Bhattacharya, Pinki Pradhan (arXiv). Consider an array of \(n\) non-negative weights \(w_1, \ldots, w_n\). The \(t\)th-moment is the sum \(\sum_{i \leq n} w^t_i\). The aim is to estimate this moment, with proportional samples. This means that an algorithm can generate index samples where the index \(i\) has probability proportional to \(w_i\). The first result on this problem, for \(t=1\), was by Motwani-Panigrahy-Xu. Beretta-Tetek do a detailed study of this problem for case \(t=1\) (sum estimation), and show that this problem can be solved in \(O(\sqrt{n})\) queries. When the weights are vertex degrees and we have graph access, there are results showing the existence of \(\widetilde{O}(n^{1-1/t})\) algorithms for \(t > 1\). This paper gives such a result for any non-negative weights with $t \geq 1$, and proves a matching lower bound. Moreover, this paper considers the setting where $atex t < 1$. Interestingly, there is an algorithm of query complexity \(O(\sqrt{n} + n^{1/t-1})\), and a lower bound showing that \(\Omega(n)\) samples are needed when \(t \leq 1/2\).

Biased Linearity Testing in the 1% Regime by Subhash Khot and Kunal Mittal (ECCC). The classic problem of linearity testing never stops yielding new insights. A function \(f:\{0,1\}^n \to \{-1,1\}\), if \(f(x) = (-1)^{a \cdot x}\) where \(a \in \{0,1\}^n\). The classic 3-query BLR test distinguishes linear functions from those that are far from linear. Specifically, if the tester passes with probability \(1-\delta\), then the function \(f\) is \(O(\delta)\)-close to linear. This called testing in the “99% regime”. Suppose we want to tester with guarantees in the “1%” regime. This means that if the tester passes with probability \(> \delta\), then \(f\) has at least \(\epsilon\) correlation with some linear function (where \(\epsilon\) is some function of \(\delta\)). Such results were known for the uniform distribution on the hypercube. This paper studies product distributions on the hypercube, where all marginals are identically \(p\)-biased (the probability of choosing a \(1\) is \(p\)). Such 1%-testers were known for \(p \in (1/3,2/3)\) from recent work. This paper gives a \(1+1/p\)-query tester for all values of \(p\), and proves that the number of queries must be at least \(1+1/p\).

News for November 2024

Alas, another late post! But we compensate for that. With a rich hoard of ten papers this month, covering quantum property testing, distribution testing, matchings, Steiner trees, linearity testing, and dealing with corruptions. And some papers on multiple topics simultaneously.

Uniformity testing when you have the source code by Clément L. Canonne, Robin Kothari, and Ryan O’Donnell (arXiv). Consider classic problem of uniformity testing, where the domain is \([d]\). We all know by now that the complexity of uniformity testing is \(\Theta(\sqrt{d}/\varepsilon^2)\), where \(\varepsilon\) is the proximity parameter. Suppose the distribution was the output of an algorithm (like, a hash function) and we have access to the source code. (The source code is thought of as a circuit that outputs a single domain element.) Can we beat this bound? Surprisingly, the answer is yes, when the tester is allowed to be quantum. A single “sample” is basically a run of the code. The main result is an upper bound of \(O(d^{1/3}/\varepsilon^{4/3})\) (with some additional terms for the low \(\varepsilon\) setting). Tight lower bounds for this setting are still unknown, so this research direction of “distribution testing with the source code” may lead to many more results.

Coherence in Property Testing: Quantum-Classical Collapses and Separations by Fernando Jeronimo, Nir Magrafta, Joseph Slote, and Pei Wu (ECCC). The distribution testing problem again, where the domain is \(\{0,1\}^n\) and thought of as exponentially large. To distinguish (say) a uniform distribution with support size \(2^{n/8}\) vs support size \(2^{n/4}\) would require exponentially many (\(2^{n/16}\)) samples. From a quantum standpoint, we can restrict the input distribution to be coherent. Intuitively, this is analogous to being restricted to a subcube. Even then, the paper shows that the testing problem requires exponentially many samples. To show a separation between classical and quantum algorithms, the paper gives a model of interactive protocols for distribution testing (which has been studied before in the classical setting, like Chiesa-Gur). In this setting, the paper gives a quantum algorithm that runs in \(\textrm{poly}(n)\) time with polynomially many proof strings, and can approximate the support size of coherent distributions. Classical algorithms, on the other hand, still require exponentially many queries, even with access to proof strings.

Testing classical properties from quantum data by Matthias C. Caro, Preksha Naik, and Joseph Slote (arXiv). Property testing gains its power because of query access, since the tester can “zero in” on the relevant portion of the data. Sample based testers often have far worse complexities. Such testers only get access to random samples of the data/function. Consider access to a function \(f:\{0,1\}^n \rightarrow \{0,1\}\). The classic property of monotonicity can be tested in \(\sqrt{n}\) queries (ignoring the proximity parameter dependencies), but requires \(2^{\Theta(\sqrt{n})}\) samples. The paper studies sample based property testing, except with quantum “samples”. In the quantum world, the function \(f\) is stored/represented as a quantum superposition of states. Quantum samples can obtain richer information, like sampling the Fourier coefficients of \(f\). (Quantum queries give even stronger access.) This allows for many classical query-based property testers to become quantum sample-based testers. This paper gives results for many fundamental properties like monotonicity, symmetry, and triangle freeness.

Tolerant Testing of Stabilizer States with Mixed State Inputs by Vishnu Iyer and Daniel Liang (arXiv). Another quantum testing paper, but here, the property itself is quantum. The input is a quantum state, and the aim is test the property of being a stabilizer state. In all previous work on property testing of being a stabilizer, it was assumed that the input is a pure state. (One should think of a pure state as a sort of “basis” state, while a mixed state is a linear combination of these states.) Given noise, one would typically expect the input to be mixed. This paper gives the first tolerant tester for stabilizer states, where the input is allowed to be a mixed state.

Stochastic Matching via In-n-Out Local Computation Algorithms by Amir Azarmehr, Soheil Behnezhad, Alma Ghafari, and Ronitt Rubinfeld (arXiv). This is not a sublinear result by itself, but has connections that should interest our readers. The main problem is stochastic matching. We are given a graph \(G\). An input \(G_p\) is generated by keeping each edge independently with probability \(p\). The aim is to construct a sparse graph \(H\) such that the expected matching size of \(H \cap G_p\) is close to the expected matching size of \(G_p\). After a long line of work, this paper shows that there exists a subgraph \(H\) of \(\textrm{poly}(1/p)\) degree whose expected matching size is a \((1-\varepsilon)\)-approximation of the overall expectation. The main analysis technique is to study a local computation algorithm (LCA) for computing the maximum matching. An LCA essentially gives local access to a large output (say, a matching or coloring) such that each matching edge (say) of the output can be computed with sublinear lookups to the input. The standard complexity measure is the number of lookups of the input to compute a matching edge. But this paper looks at the “in complexity”, which is the number of queries that lookup a given edge. When both complexities are small, one can show a “decorrelation” of the overall output, which is used in the analysis.

Nearly Tight Bounds on Testing of Metric Properties by Yiqiao Bao, Sampath Kannan, and Erik Waingarten (arXiv). Consider an \(n \times n\) matrix of “distances” between \(n\) points. The aim is to test the fundamental property of the distances forming a metric. Despite a long line of work studying property testing of distances, points, etc., there has surprisingly been no result on this problem. The earliest work in this line is by Parnas-Ron, studying the property of being a tree metric or ultrametric (which satisfy a stronger version of triangle inequality). This paper gives a \(O(n^{2/3}/\varepsilon^{4/3})\) query algorithm for property testing metrics. There is a also a nearly matching lower bound, if one assumes that the dependence on \(\varepsilon\) is polynomial. For the problems of testing tree metrics (and ultrametrics), the paper gives a \(O(1/\varepsilon^2)\) query algorithm, an improvement from the previous bound of \(O(1/\varepsilon^6)\).

Sublinear Metric Steiner Tree via Improved Bounds for Set Cover by Sepideh Mahabadi, Mohammad Roghani, Jakub Tarnawski, Ali Vakilian (arXiv). This paper is on the classic metric Steiner tree problem. The input is an \(n \times n\) matrix of distances in a metric and a subset \(S\) of points. The aim is to estimate the minimum weight Steiner tree (a tree that connects all the terminals \(S\)). Getting a \(2\)-approximation is quite easy, since one can just consider a spanning tree on \(S\). A result of Chen-Khanna-Tan give a \((2-\eta)\)-approximation algorithm that runs in \(O(n^{13/7})\) queries (for some positive constant \(\eta\)). This paper gets the same approximation with \(O(n^{5/3})\) queries. The main technical work goes in designing a sublinear algorithm for a version of set cover problem, where the input set system is given as a matrix.

On Optimal Testing of Linearity by Vipul Arora, Esty Kelman, Uri Meir (arXiv). In property testing, it’s hard to get more fundamental than linearity testing. What is there new to say about this well-studied problem? This paper studies linearity testing under the online manipulations model of Kalemaj-Raskhodnikova-Varma. In this model, after every query of the tester, an adversary corrupts up to \(t\) entries on the input. Previous results show that linearity testing can be done in \(O(\log t + 1/\varepsilon)\) queries. But, the paper notes, all previous results require \(t \leq 2^{n/c}\) for some \(c \geq 2\). How far can be push \(t\)? Remarkably, the main result shows that \(t\) can be as large as \(2^n/\varepsilon^2\) and linearity testing is still feasible under online manipulations. The next result of this paper studies the property of low degree polynomials over the reals. The paper gives an optimal \(O(1/\varepsilon)\)-query tester for the property of being a \(d\)-degree polynomial.

Online versus Offline Adversaries in Property Testing by Esty Kelman, Ephraim Linder, and Sofya Raskhodnikova (arXiv). This paper is related to the online manipulations model discussed above. There is also an offline “erasure” model, wherein some coordinates/bits of the input are erased by an adversary. The original paper of Kalemaj-Raskhodnikova-Varma showed that sortedness of arrays can be tested with offline erasures, but cannot be tested with online erasures even when \(t=1\). This paper proves a complementary theorem. There are properties that can be tested with a \(O(1)\) queries for \(t = O(n/\log n)\) in the online manipulations model, but requires \(\Omega(\log n)\) queries in the offline setting with \(\Theta(n/\log\log n)\) erasures. So the online and offline models are incomparable in power. There is a similar result for a setting where \(t\) is constant. The lower bounds are shown using a property regarding repeated characters in strings.

New for August 2024

Apologies for the late post! After the relative silence of last month, we’re getting back to a moderate cadence of papers. Some distribution testing, quantum testing, learning and testing. We’ve also added a non-sublinear paper on distributions that should be of interest. And here’s the roundup.

Efficient Testable Learning of General Halfspaces with Adversarial Label Noise by Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, and Nikos Zarifis (arXiv). This paper is on the recent model of testable learning introduced by Rubinfeld-Vasiliyan. Consider learning halfspaces over the Gaussian distribution. We have sample access to an input function \(f:\mathbf{R}^d \to \{0,1\}\), where our aim is learn the closest halfspace to \(f\). The samples comes from some fixed, underlying distribution \(\mathcal{D}\). But it is often infeasible to validate this distributional assumption, even in the property testing framework. A tester-learner pair will try to test this assumption, and if it accepts, apply the learning algorithm. The guarantee is: if \(\mathcal{D}\) is indeed (say) Gaussian, then the learning algorithms works as desired. If the tester accepts and the learner outputs a hypothesis (say, halfspace) \(h\), then it is guaranteed that \(h\) is close to \(f\) according to \(\mathcal{D}\), even if \(\mathcal{D}\) is far from Gaussian. This last part makes the whole setup interesting; the distribution tester might fail to reject the distribution, but we can trust the output hypothesis! There have been many results on learning homogenous halfspaces under the Gaussian distribution. So the hypothesis class consists of halfspaces going through the origin. This paper is about general (inhomogenous) halfspaces. At first blush, this might seem like a simple issue; we’re just adding a constant term to the halfspace. But existing techniques break down, often because they’re solving an optimization problem of minimizing error around a band close the origin. This paper gives a careful reduction to the “nearly” homogeneous case, and gives the first polynomial sample tester-learner pair for this problem.

Tolerant testing stabilizer states by Srinivasan Arunachalam and Arkopal Dutt (arXiv). Let us start with a familiar classical problem, low degree testing. Given access to a function \(f:\{0,1\}^n \to \{0,1\}\), we wish to test if \(f\) is a quadratic polynomial (over \(\mathbb{F}_2\)). There exist testers based on the Gowers-norm: basically, compute various (constant dimensional) discrete derivatives and check they are consistent with a quadratic function. These discrete derivatives can be analyzed using Fourier analysis, and the main aim is show that a function that “locally” (in terms of derivatives) behaves like a quadratic is indeed globally close to being one. This method can be used to get tolerant testers for quadratic polynomials. This paper is on a quantum generalization of this testing result. The input is a qubit \(|\psi_f\rangle\), promised to be a “phase state”. A phase state has a Boolean function \(f\) embedded in it, because one can write a phase state as a linear combination of \(2^n\) “base” states, with the coefficients being Boolean. A “stabilizer state” is one where the function \(f\) is a quadratic (or so I believe, I’m probably exposing my ignorance of quantum mechanics). This paper uses the Gowers norm techniques to give the first quantum tolerant tester for stabilizer states.

Improved Bounds for High-Dimensional Equivalence and Product Testing using Subcube Queries by Tomer Adar, Eldar Fischer, and Amit Levi (arXiv). Consider distribution testing on high-dimensional data, so the domain is \(\{0,1\}^n\). A nice model for distribution tester is the subcube conditioning model of Bhattacharyya and Chakraborty. Suppose we fix any subset \(S \subseteq [n]\) coordinates with \(x_S\). We can generate a sample \(y\) from the distribution, conditioned on \(y_S = x_S\) (meaning, \(y\) agrees with \(x\) on \(S\)). The problem is to perform equivalence testing of distributions on this model. Previous results got \(O(n^2/\varepsilon^2)\) query algorithms, and this paper give a significantly improved algorithm making \(O(n/\varepsilon^2)\) queries. Interestingly, the algorithm only makes weaker queries. One distribution is only accessed by “marginal queries”. So, given \(x_S\) as before, but we only sample the marginal distribution over coordinate \(i\), conditioned on \(S\) being fixed as \(x_S\). (Hence, the output is a single bit. Also, we note that the other distribution is accessed by prefix queries, a weaker version of subcube queries.) These generalizations lead to more results, on testing equivalence in the interval query model, and for testing the property of product distributions. This paper also proves a \(\Omega(\sqrt{n}/\varepsilon^2)\) lower bound for testing product distributions.

Parallel Sampling via Counting by Nima Anari, Ruiquan Gao, and Aviad Rubinstein (arXiv). This isn’t a “typical sublinear algorithm” per se, but I think it is quite interesting to those of us who think about sublinearity, adaptivity, and distributions. This result has connections to the previous paper. Our aim is to sample from an unknown distribution \(\mu\) over \([q]^n\). We have access to “marginal queries” as described above. This problem appears in large language models, wherein neural nets are trained on various marginals (next word), but the output is a sentence (list of words). Observe there is a simple “\(O(n)\) round” algorithm. Without any fixing, query the marginal of the first coordinate. Fix the first coordinate, query the marginal of the second coordinate. Fix the first two coordinates, rinse and repeat. This requires \(n\) rounds with the marginal query. In this model, polynomially many marginal queries can be made in a single round, and the aim is to minimize the number of rounds (basically, bounding the adaptivity). This paper gives a \(\widetilde{O}(n^{2/3})\) round procedure for sampling, and shows an \(\Omega(n^{1/3})\) lower bound.

News for June 2024

A nice diverse month, this June 2024. We have a selection of papers on quantum property testing, shortest paths, polynomial threshold functions, and nearest neighbor search. And a nice survey on a closely related topic. Onward with the papers!

Invitation to Local Algorithms by Václav Rozhoň (arXiv). This is a great survey on local algorithms, a topic (roughly) about message passing distributed algorithms on graphs. While that is not a sublinear algorithm per se, the core combinatorial question has relevance to our readers. Informally, a “local problem” is one where, if a solution is incorrect, that can be determined by looking at a small radius around some vertex. For example, consider problems like maximal independent set, matching, etc. The problem is to construct such a solution in a distributed manner by only looking at low radius balls. There is a direct connection with Local Computation Algorithms (LCAs), with the primary difference being the complexity resource. In Local Algorithms as defined, one is interested in the number of rounds of computation, corresponding to the radius of balls. In LCAs, we care about the number of vertices seen, which is the size of the balls. The survey gives a detailed discussion of these connections.

Testably Learning Polynomial Threshold Functions by Lucas Slot, Stefan Tiegel, and Manuel Wiedmer (arXiv). Consider the classic model of agnostic learning with a distributional assumption. Our aim is to learn a function \(f: X \to \{-1,1\}\). The learner gets labeled samples \((x,f(x))\), where \(x\) is drawn from an unknown distribution \(\mathcal{D}\). In agnostic learning, we aim to output a hypothesis \(\hat{f}\) such that \(\|f – \hat{f}\|\) is small, where distance is measured according to \(\mathcal{D}\). Often one makes distributional assumptions (like \(\mathcal{D}\) is Gaussian) to get non-trivial results. This is a severe limitation, since such distributional assumptions cannot be validated. A recent model of Rubinfeld-Vasilyan asks for “tester-learner” pairs that address this issue. We first run a “tester” to check the property (say, Gaussian) of the distribution. If the tester passes, then we run the learner. If \(\mathcal{D}\) satisfies the distributional property, the tester is guaranteed to pass (whp). If the tester passes, then the learner is guaranteed to output a valid hypothesis. Thus, the tester is allowed to pass distributions that may be far from satisfying the property, but in such situations, the learner outputs a correct hypothesis. This paper gives such tester-learner pairs for learning polynomial threshold functions with polynomially many samples. Previously, such learning results were not known even for degree-2 PTFs with respect to the Gaussian distribution.

A Sublinear Algorithm for Approximate Shortest Paths in Large Networks by Sabyasachi Basu, Nadia Kōshima, Talya Eden, Omri Ben-Eliezer, C. Seshadhri (arXiv). This paper brings a mix of a classic problem, a real-world observation, and a new graph class. Finding shortest paths in graphs is about as classic as it gets. There is a plethora of applied work on shortest paths algorithms for real-world networks, much of which is based on landmarking. One precomputes distances between chosen landmarks, and then uses those to output shortest path distance queries. This paper asks: is it possible to get approximation shortest paths in sublinear time? Hence, we are not even allowed to preprocess the entire graph. This paper builds on previous work in social networks about “core-periphery” structure. The core is a denser, small region of the graph; the rest of the vertices form the periphery that connect to the core. The hypothesis is that shortest paths in real-world graphs typically go via the core. Given the (small) core, one can build small sublinear data structures that quickly answer shortest path queries. The paper uses insights from previous work on core-periphery structure, and gives a practical sublinear algorithm for computing approximate shortest paths. If the graph comes from a Chung-Lu distribution, then queries can be provably answered in \(n^{o(1)}\) time.

Quantum Property Testing Algorithm for the Concatenation of Two Palindromes Language by Kamil Khadiev and Danil Serov (arXiv). An early result of property testing is that all regular languages can tested in constant time. But the simple (non-regular) language \(\mathcal{L}\) of strings formed by concatenating two palindromes requires super-constant queries. Specifically, the complexity of property testing \(\mathcal{L}\) is \(\Theta(\sqrt{n})\) (ignoring log and \(\varepsilon\) factors). Here, \(n\) denotes the string size. This paper shows that there exist quantum property testing algorithms that beat this bound. There is a quantum property tester for the language \(\mathcal{L}\) that makes \(\widetilde{O}(n^{1/3})\) queries. The paper also shows that quantum query complexity of the exact decision problem is \(\Theta(\sqrt{n})\). Thus, one gets non-trivial quantum speedup for both the property testing and exact problem.

A Bi-metric Framework for Fast Similarity Search by Haike Xu, Sandeep Silwal, and Piotr Indyk (arXiv). Nearest-neighbor (NN) search is one of the most important algorithmic tasks in modern data analysis. The usual setting is that we have a set of \(n\) points over a metric, denoted by \(D\). We preprocess the points and construct a data structure. Given a query point \(q\), we wish to find the (approximate) closest points from our data set, in time sublinear in the data size. This paper introduces a bi-metric setting. Think of \(D\) as expensive to compute, while there exists a cheap “proxy metric” \(d\). For example, if the points represent images, the expensive metric \(D\) could be some complex (but semantically accurate) metric, while \(d\) may represent a simple Euclidean metric over an embedding. Formally, \(d\) is a (large) constant factor approximation of \(D\). This paper gives an algorithm that performs nearest neighbor search, but only preprocesses over \(d\). Given, a query \(q\), it makes sublinear queries to the expensive \(D\). All in all, it is a truly sublinear algorithm in terms of \(D\), and leverages the preprocessing over the cheap metric \(d\). Technically, this paper gives a meta-algorithm that can be applied to any “base” NN algorithm. The paper proves results for two popular NN algorithms, and gives extensive empirical evidence for the underlying approach.