Author Archives: Seshadhri

News for June 2025

Another (sigh) delayed post. A moderately busy month, with 4 papers with sublinear graph algorithms and, of course, distribution testing.

A 0.51-Approximation of Maximum Matching in Sublinear $n^{1.5}$ Time by Sepideh Mahabadi, Mohammad Roghani, and Jakub Tarnawski (arXiv). The title says it all. This paper gives a 0.51-approximation for the maximum matching in, well, $O(n^{1.5})$ time. Here’s the back story. There are two kinds of sublinear maximum matching algorithms. The first kind (achieved after a long line of impressive results) get $(1- \varepsilon)$-approximations in time $O(n^{2 – \Omega_{\varepsilon}(1)})$. This means that that in sublinear $o(n^2)$ time, one can get arbitrarily good approximations. The second kind beats the simple $0.5$-approximation (obtained from maximal matchings), but tries for closer to $O(n)$ time. Behnezhad –Roghani –Rubinstein –Saberi gave a $(0.5+\varepsilon)$-approximation algorithm running in $n^{1+\varepsilon}$ time. But the largest value of $\varepsilon$ possible in these results is tiny. Can one get a “significant” improvement over 0.5-approximations in time “significantly less” than $n^2$? The answer is yes, as the title explains.

Testing (Conditional) Mutual Information by Jan Seyfried, Sayantan Sen, and Marco Tomamichel (arXiv, ECCC). Consider a distribution over pairs $(A, B)$. The mutual information $I(A,B)$ of $A$ and $B$ is the minimum KL-divergence between the original distribution and any product distribution over the support of $A$ and $B$. So the mutual information is zero iff $(A,B)$ is a product distribution, or equivalently $A$ and $B$ are independent. A natural distribution testing question is to distinguish $A$ and $B$ being independent from $I(A,B) > \varepsilon$. One can take this problem a step further. Suppose the distribution is over triples $(A,B,C)$. Then can one distinguish conditional independence ($I(A, B | C) = 0$) from sufficient conditional dependence, so $I(A, B | C) > \varepsilon$? This problem was studied by (including our very own) Canonne-Diakonikolas-Kane-Stewart, and is known that $O(d^{7/4})$ samples suffice. Here, $d$ denotes the maximum alphabet size for each coordinate, so the overall domain size is $d^3$. This paper gives a much more nuanced complexity, that depends separately on the support sizes for each coordinate. It is interesting that each coordinate plays a different role in the final complexity.

Tight simulation of a distribution using conditional samples by Tomer Adar (arXiv). Consider a distribution of $\{0,1\}^n$. Many distribution testing results would not yield useful algorithms for this setting, since the domain is exponentially large. Inspired by this setting, the subcube conditional model was introduced by Bhattacharyya-Chakraborty. In this setting, we can generate samples conditioned on any subcube. The aim of this paper goes beyond property testing. Can we actually estimate specific values of the distribution with a few queries? Can we simulate the distribution, which means to “locally compute” a distribution that is close to the input? Specifically, given domain point $x$, we want to compute a value $\mu(x)$ such that $\mu$ represents a distribution close to the input. This paper gives such a result, where each $\mu(x)$ can be computed in $O(n^2/\varepsilon^2)$ queries. The final distribution has KL-divergence $\varepsilon^2$ from the original.

Sampling and Identity-Testing Without Approximate Tensorization of Entropy by William Gay, William He, Nicholas Kocurek, and Ryan O’Donnell (arXiv). Again, consider a distribution $D$ over a large product domain $\Sigma^n$. Our aim is identity testing, so we wish to determine if ${D} = {D}’$ or $|{D} – {D}’| > \varepsilon$, where ${D}’$ is some explicitly known distribution. The model is much weaker than subcube condition access. Essentially, we can only get samples from subcubes of dimension (at least) $n-1$. Blanca-Chen-Štefankovič-Vigoda studied identity testing where ${D}$ satisfies a condition called “approximate tensorization of entropy” (ATE). By Jensen’s inequality, if ${D}$ is a product distribution, the entropy of ${D}$ is at most the sum of entropies of the marginal distributions. If the entropy of ${D}$ is (say) at most a constant factor of this sum, we say it satisfies the ATE. Blanca-Chen-Štefankovič-Vigoda proves that $\widetilde{O}(n/\varepsilon)$ suffices for identity testing. This paper studies input distributions that are mixtures of (say, a constant number of) distributions satisfying ATE. Interestingly, these mixtures do not satisfy the ATE. But this paper shows that one can still get $\widetilde{O}(n/\varepsilon)$ query identity testers.

News for February 2025

Leave a reply

After a few quiet months on PTReview, the community is back with a bang. We have a selection of nine papers this month, ranging from hypergraphs, vertex coloring, triangle counting, a new take on linearity testing, and a sublinear foray into transformers.

Property Testing in Bounded Degree Hypergraphs by Hugo Aaronson, Gaia Carenini, and Atreyi Chanda (arXiv). We all know about the rich history of property testing and sublinear algorithms for graphs. The space for hypergraphs is far less studied, and likely has an even richer theory. Consider the input being an $n$ vertex $k$-uniform hypergraph, so every hyperedge is a subset of $n$ vertices. This paper is the first to define the bounded degree setting. The graph is represented as an adjacency list, wherein, for each vertex, we have the list of hyperedges containing that vertex. Distance between hypergraphs is defined in terms of the symmetric difference between the sets of hyperedges. The primary focus of the paper is on the fundamental property of $k$-colorability (note that $k$ is also the hyperedge size). The main lower bound result is that for $k \geq 3$, this property requires $\Omega(n)$ queries to test. This is in contrast to the graph setting ($k = 2$), where bipartiteness is known to be testable for bounded degree graphs. This paper shows if that inputs have bounded treewidth, then $k$-colorability is testable with one-sided error in constant queries.

Simple Sublinear Algorithms for $(\Delta + 1)$ Vertex Coloring via Asymmetric Palette Sparsification by Sepehr Assadi and Helia Yazdanyar (arXiv). Consider access to the adjacency list of a graph with $n$ vertices and maximum degree $\Delta$. Most of our readers will be familiar with result of Assadi-Chen-Khanna proving that $(\Delta + 1)$-coloring can be done in $\widetilde{O}(n^{3/2})$ time. The key tool is a palette sparsification lemma, that shows that each vertex can independently pick a small list of colors, such that the final coloring is consistent with these lists. This paper proves a weaker version of this lemma with a significantly simpler proof. The result is an asymmetric version, where vertices choose color lists of varying sizes. The average length goes up by a log factor. But the proof is simpler, and it also allows for the final coloring to just use the greedy algorithm (constrained to these lists).

Improved Sublinear Algorithms for Classical and Quantum Graph Coloring by Asaf Ferber, Liam Hardiman, and Xiaonan Chen (arXiv). Another result on $(\Delta +1)$-coloring. A result of Morris and Song show that there are simpler algorithms that give a $(1+\varepsilon)\Delta$-coloring. Adapting their techniques, this paper gives a sublinear algorithm for $(\Delta+1)$-coloring that runs in $O(n^{3/2}\sqrt{\log n})$ time. This gives a $\sqrt{\log n}$ factor improvement from the previous results. The algorithm can be accelerated in the quantum model using Grover’s search, to get a running time of $\widetilde{O}(n^{4/3})$. Finally, the paper gives a $\widetilde{O}(n^{5/4})$ time quantum algorithm for getting a $(1+\varepsilon)\Delta$-coloring.

Monotonicity Testing of High-Dimensional Distributions with Subcube Conditioning by Deeparnab Chakrabarty, Xi Chen, Simeon Ristic, C. Seshadhri, and Erik Waingarten (arXiv). In my first foray into distribution testing, this paper studies the problem of testing monotonicity of distributions over the hypercube, $\{-1,1\}^n$. Observe that the domain is considered to be exponentially large, and the aim is to get distribution testers that make $\textrm{poly}(n)$ samples. To get such bounds, this paper assumes subcube conditional samples. Given any subhypercube of the domain, the algorithm can generate samples restricted to this subhypercube. A distribution is called monotone, if the corresponding probability mass function is monotone (wrt to the standard partial order on $\{-1,1\}^n$). For this setting, this paper shows that property of monotonicity can tested with $\widetilde{O}(n/\varepsilon^2)$ samples. Moreover, this bound is shown to be optimal, up to polylog factors. One of the main tools is a new directed isoperimetric theorem for real-valued functions on the hypercube.

Compression Barriers in Autoregressive Transformers by Themistoklis Haris and Krzysztof Onak (arXiv). While this paper does not give a sublinear time algorithm, it uses many techniques from the sublinear algorithms toolkit to prove lower bounds. And it is on transformers, arguably one of the most important components for machine learning on sequences and Large Language Models. Consider a sequence of $n$ vectors $v_1, v_2, \ldots, v_n \in \mathbb{R}^d$, which one can think of as vector embeddings of $n$ words (coming from, say, some generated text). Based on extensive learning and model fine tuning, assume that the model has learned some “key” vectors $k_1, k_2, \ldots, k_n \in \mathbb{R}^d$. The next word is generated from a “query vector” $q$, using which we compute $M^{-1} \sum_{\ell \leq n} \exp(q \cdot k_\ell) v_\ell$ (where $M$ is a normalization factor). Suppose there are $n$ query vectors, and the aim is to generate the above sum for all the query vectors (these are like the next $n$ words). In practice, there are many schemes to speed up computation by compressing the key vectors. This paper shows that, in the worst case, a computation of the next word/vector takes $\Omega(nd)$ time and space. Thus, any sublinear compression/speed up method must fundamentally use properties of the data.

Arboricity and Random Edge Queries Matter for Triangle Counting using Sublinear Queries by Arijit Bishnu, Debarshi Chanda, Gopinath Mishra (arXiv). Triangle counting is a classic problem, with a rich history in sublinear algorithms. The earliest work gives an algorithm running in $\widetilde{O}(n/t^{1/3} + m^{3/2}/t)$ time, where $n, m$ are the usual graph parameters and $t$ is the triangle count. (Polylog and $\varepsilon$ factors are ignored.) This complexity can be improved to $\widetilde{O}(n/t^{1/3} + m\alpha/t)$, where $\alpha$ is the graph arboricity/degeneracy. The standard adjacency list model can be augmented with random edge queries, in which case, it was known that $O(m^{3/2}/t)$ time algorithms exist. This paper shows that $O(m\alpha/t)$-time algorithms exist for the model with random edge queries, which combines the previous results into an even better algorithm.

Improved Sublinear-time Moment Estimation using Weighted Sampling by Anup Bhattacharya, Pinki Pradhan (arXiv). Consider an array of $n$ non-negative weights $w_1, \ldots, w_n$. The $t$th-moment is the sum $\sum_{i \leq n} w^t_i$. The aim is to estimate this moment, with proportional samples. This means that an algorithm can generate index samples where the index $i$ has probability proportional to $w_i$. The first result on this problem, for $t=1$, was by Motwani-Panigrahy-Xu. Beretta-Tetek do a detailed study of this problem for case $t=1$ (sum estimation), and show that this problem can be solved in $O(\sqrt{n})$ queries. When the weights are vertex degrees and we have graph access, there are results showing the existence of $\widetilde{O}(n^{1-1/t})$ algorithms for $t > 1$. This paper gives such a result for any non-negative weights with $t \geq 1$, and proves a matching lower bound. Moreover, this paper considers the setting where $atex t < 1$. Interestingly, there is an algorithm of query complexity $O(\sqrt{n} + n^{1/t-1})$, and a lower bound showing that $\Omega(n)$ samples are needed when $t \leq 1/2$.

Biased Linearity Testing in the 1% Regime by Subhash Khot and Kunal Mittal (ECCC). The classic problem of linearity testing never stops yielding new insights. A function $f:\{0,1\}^n \to \{-1,1\}$, if $f(x) = (-1)^{a \cdot x}$ where $a \in \{0,1\}^n$. The classic 3-query BLR test distinguishes linear functions from those that are far from linear. Specifically, if the tester passes with probability $1-\delta$, then the function $f$ is $O(\delta)$-close to linear. This called testing in the “99% regime”. Suppose we want to tester with guarantees in the “1%” regime. This means that if the tester passes with probability $> \delta$, then $f$ has at least $\epsilon$ correlation with some linear function (where $\epsilon$ is some function of $\delta$). Such results were known for the uniform distribution on the hypercube. This paper studies product distributions on the hypercube, where all marginals are identically $p$-biased (the probability of choosing a $1$ is $p$). Such 1%-testers were known for $p \in (1/3,2/3)$ from recent work. This paper gives a $1+1/p$-query tester for all values of $p$, and proves that the number of queries must be at least $1+1/p$.

News for November 2024

Leave a reply

Alas, another late post! But we compensate for that. With a rich hoard of ten papers this month, covering quantum property testing, distribution testing, matchings, Steiner trees, linearity testing, and dealing with corruptions. And some papers on multiple topics simultaneously.

Uniformity testing when you have the source code by Clément L. Canonne, Robin Kothari, and Ryan O’Donnell (arXiv). Consider classic problem of uniformity testing, where the domain is $[d]$. We all know by now that the complexity of uniformity testing is $\Theta(\sqrt{d}/\varepsilon^2)$, where $\varepsilon$ is the proximity parameter. Suppose the distribution was the output of an algorithm (like, a hash function) and we have access to the source code. (The source code is thought of as a circuit that outputs a single domain element.) Can we beat this bound? Surprisingly, the answer is yes, when the tester is allowed to be quantum. A single “sample” is basically a run of the code. The main result is an upper bound of $O(d^{1/3}/\varepsilon^{4/3})$ (with some additional terms for the low $\varepsilon$ setting). Tight lower bounds for this setting are still unknown, so this research direction of “distribution testing with the source code” may lead to many more results.

Coherence in Property Testing: Quantum-Classical Collapses and Separations by Fernando Jeronimo, Nir Magrafta, Joseph Slote, and Pei Wu (ECCC). The distribution testing problem again, where the domain is $\{0,1\}^n$ and thought of as exponentially large. To distinguish (say) a uniform distribution with support size $2^{n/8}$ vs support size $2^{n/4}$ would require exponentially many ($2^{n/16}$) samples. From a quantum standpoint, we can restrict the input distribution to be coherent. Intuitively, this is analogous to being restricted to a subcube. Even then, the paper shows that the testing problem requires exponentially many samples. To show a separation between classical and quantum algorithms, the paper gives a model of interactive protocols for distribution testing (which has been studied before in the classical setting, like Chiesa-Gur). In this setting, the paper gives a quantum algorithm that runs in $\textrm{poly}(n)$ time with polynomially many proof strings, and can approximate the support size of coherent distributions. Classical algorithms, on the other hand, still require exponentially many queries, even with access to proof strings.

Testing classical properties from quantum data by Matthias C. Caro, Preksha Naik, and Joseph Slote (arXiv). Property testing gains its power because of query access, since the tester can “zero in” on the relevant portion of the data. Sample based testers often have far worse complexities. Such testers only get access to random samples of the data/function. Consider access to a function $f:\{0,1\}^n \rightarrow \{0,1\}$. The classic property of monotonicity can be tested in $\sqrt{n}$ queries (ignoring the proximity parameter dependencies), but requires $2^{\Theta(\sqrt{n})}$ samples. The paper studies sample based property testing, except with quantum “samples”. In the quantum world, the function $f$ is stored/represented as a quantum superposition of states. Quantum samples can obtain richer information, like sampling the Fourier coefficients of $f$. (Quantum queries give even stronger access.) This allows for many classical query-based property testers to become quantum sample-based testers. This paper gives results for many fundamental properties like monotonicity, symmetry, and triangle freeness.

Tolerant Testing of Stabilizer States with Mixed State Inputs by Vishnu Iyer and Daniel Liang (arXiv). Another quantum testing paper, but here, the property itself is quantum. The input is a quantum state, and the aim is test the property of being a stabilizer state. In all previous work on property testing of being a stabilizer, it was assumed that the input is a pure state. (One should think of a pure state as a sort of “basis” state, while a mixed state is a linear combination of these states.) Given noise, one would typically expect the input to be mixed. This paper gives the first tolerant tester for stabilizer states, where the input is allowed to be a mixed state.

Stochastic Matching via In-n-Out Local Computation Algorithms by Amir Azarmehr, Soheil Behnezhad, Alma Ghafari, and Ronitt Rubinfeld (arXiv). This is not a sublinear result by itself, but has connections that should interest our readers. The main problem is stochastic matching. We are given a graph $G$. An input $G_p$ is generated by keeping each edge independently with probability $p$. The aim is to construct a sparse graph $H$ such that the expected matching size of $H \cap G_p$ is close to the expected matching size of $G_p$. After a long line of work, this paper shows that there exists a subgraph $H$ of $\textrm{poly}(1/p)$ degree whose expected matching size is a $(1-\varepsilon)$-approximation of the overall expectation. The main analysis technique is to study a local computation algorithm (LCA) for computing the maximum matching. An LCA essentially gives local access to a large output (say, a matching or coloring) such that each matching edge (say) of the output can be computed with sublinear lookups to the input. The standard complexity measure is the number of lookups of the input to compute a matching edge. But this paper looks at the “in complexity”, which is the number of queries that lookup a given edge. When both complexities are small, one can show a “decorrelation” of the overall output, which is used in the analysis.

Nearly Tight Bounds on Testing of Metric Properties by Yiqiao Bao, Sampath Kannan, and Erik Waingarten (arXiv). Consider an $n \times n$ matrix of “distances” between $n$ points. The aim is to test the fundamental property of the distances forming a metric. Despite a long line of work studying property testing of distances, points, etc., there has surprisingly been no result on this problem. The earliest work in this line is by Parnas-Ron, studying the property of being a tree metric or ultrametric (which satisfy a stronger version of triangle inequality). This paper gives a $O(n^{2/3}/\varepsilon^{4/3})$ query algorithm for property testing metrics. There is a also a nearly matching lower bound, if one assumes that the dependence on $\varepsilon$ is polynomial. For the problems of testing tree metrics (and ultrametrics), the paper gives a $O(1/\varepsilon^2)$ query algorithm, an improvement from the previous bound of $O(1/\varepsilon^6)$.

Sublinear Metric Steiner Tree via Improved Bounds for Set Cover by Sepideh Mahabadi, Mohammad Roghani, Jakub Tarnawski, Ali Vakilian (arXiv). This paper is on the classic metric Steiner tree problem. The input is an $n \times n$ matrix of distances in a metric and a subset $S$ of points. The aim is to estimate the minimum weight Steiner tree (a tree that connects all the terminals $S$). Getting a $2$-approximation is quite easy, since one can just consider a spanning tree on $S$. A result of Chen-Khanna-Tan give a $(2-\eta)$-approximation algorithm that runs in $O(n^{13/7})$ queries (for some positive constant $\eta$). This paper gets the same approximation with $O(n^{5/3})$ queries. The main technical work goes in designing a sublinear algorithm for a version of set cover problem, where the input set system is given as a matrix.

On Optimal Testing of Linearity by Vipul Arora, Esty Kelman, Uri Meir (arXiv). In property testing, it’s hard to get more fundamental than linearity testing. What is there new to say about this well-studied problem? This paper studies linearity testing under the online manipulations model of Kalemaj-Raskhodnikova-Varma. In this model, after every query of the tester, an adversary corrupts up to $t$ entries on the input. Previous results show that linearity testing can be done in $O(\log t + 1/\varepsilon)$ queries. But, the paper notes, all previous results require $t \leq 2^{n/c}$ for some $c \geq 2$. How far can be push $t$? Remarkably, the main result shows that $t$ can be as large as $2^n/\varepsilon^2$ and linearity testing is still feasible under online manipulations. The next result of this paper studies the property of low degree polynomials over the reals. The paper gives an optimal $O(1/\varepsilon)$-query tester for the property of being a $d$-degree polynomial.

Online versus Offline Adversaries in Property Testing by Esty Kelman, Ephraim Linder, and Sofya Raskhodnikova (arXiv). This paper is related to the online manipulations model discussed above. There is also an offline “erasure” model, wherein some coordinates/bits of the input are erased by an adversary. The original paper of Kalemaj-Raskhodnikova-Varma showed that sortedness of arrays can be tested with offline erasures, but cannot be tested with online erasures even when $t=1$. This paper proves a complementary theorem. There are properties that can be tested with a $O(1)$ queries for $t = O(n/\log n)$ in the online manipulations model, but requires $\Omega(\log n)$ queries in the offline setting with $\Theta(n/\log\log n)$ erasures. So the online and offline models are incomparable in power. There is a similar result for a setting where $t$ is constant. The lower bounds are shown using a property regarding repeated characters in strings.

New for August 2024

Leave a reply

Apologies for the late post! After the relative silence of last month, we’re getting back to a moderate cadence of papers. Some distribution testing, quantum testing, learning and testing. We’ve also added a non-sublinear paper on distributions that should be of interest. And here’s the roundup.

Efficient Testable Learning of General Halfspaces with Adversarial Label Noise by Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, and Nikos Zarifis (arXiv). This paper is on the recent model of testable learning introduced by Rubinfeld-Vasiliyan. Consider learning halfspaces over the Gaussian distribution. We have sample access to an input function $f:\mathbf{R}^d \to \{0,1\}$, where our aim is learn the closest halfspace to $f$. The samples comes from some fixed, underlying distribution $\mathcal{D}$. But it is often infeasible to validate this distributional assumption, even in the property testing framework. A tester-learner pair will try to test this assumption, and if it accepts, apply the learning algorithm. The guarantee is: if $\mathcal{D}$ is indeed (say) Gaussian, then the learning algorithms works as desired. If the tester accepts and the learner outputs a hypothesis (say, halfspace) $h$, then it is guaranteed that $h$ is close to $f$ according to $\mathcal{D}$, even if $\mathcal{D}$ is far from Gaussian. This last part makes the whole setup interesting; the distribution tester might fail to reject the distribution, but we can trust the output hypothesis! There have been many results on learning homogenous halfspaces under the Gaussian distribution. So the hypothesis class consists of halfspaces going through the origin. This paper is about general (inhomogenous) halfspaces. At first blush, this might seem like a simple issue; we’re just adding a constant term to the halfspace. But existing techniques break down, often because they’re solving an optimization problem of minimizing error around a band close the origin. This paper gives a careful reduction to the “nearly” homogeneous case, and gives the first polynomial sample tester-learner pair for this problem.

Tolerant testing stabilizer states by Srinivasan Arunachalam and Arkopal Dutt (arXiv). Let us start with a familiar classical problem, low degree testing. Given access to a function $f:\{0,1\}^n \to \{0,1\}$, we wish to test if $f$ is a quadratic polynomial (over $\mathbb{F}_2$). There exist testers based on the Gowers-norm: basically, compute various (constant dimensional) discrete derivatives and check they are consistent with a quadratic function. These discrete derivatives can be analyzed using Fourier analysis, and the main aim is show that a function that “locally” (in terms of derivatives) behaves like a quadratic is indeed globally close to being one. This method can be used to get tolerant testers for quadratic polynomials. This paper is on a quantum generalization of this testing result. The input is a qubit $|\psi_f\rangle$, promised to be a “phase state”. A phase state has a Boolean function $f$ embedded in it, because one can write a phase state as a linear combination of $2^n$ “base” states, with the coefficients being Boolean. A “stabilizer state” is one where the function $f$ is a quadratic (or so I believe, I’m probably exposing my ignorance of quantum mechanics). This paper uses the Gowers norm techniques to give the first quantum tolerant tester for stabilizer states.

Improved Bounds for High-Dimensional Equivalence and Product Testing using Subcube Queries by Tomer Adar, Eldar Fischer, and Amit Levi (arXiv). Consider distribution testing on high-dimensional data, so the domain is $\{0,1\}^n$. A nice model for distribution tester is the subcube conditioning model of Bhattacharyya and Chakraborty. Suppose we fix any subset $S \subseteq [n]$ coordinates with $x_S$. We can generate a sample $y$ from the distribution, conditioned on $y_S = x_S$ (meaning, $y$ agrees with $x$ on $S$). The problem is to perform equivalence testing of distributions on this model. Previous results got $O(n^2/\varepsilon^2)$ query algorithms, and this paper give a significantly improved algorithm making $O(n/\varepsilon^2)$ queries. Interestingly, the algorithm only makes weaker queries. One distribution is only accessed by “marginal queries”. So, given $x_S$ as before, but we only sample the marginal distribution over coordinate $i$, conditioned on $S$ being fixed as $x_S$. (Hence, the output is a single bit. Also, we note that the other distribution is accessed by prefix queries, a weaker version of subcube queries.) These generalizations lead to more results, on testing equivalence in the interval query model, and for testing the property of product distributions. This paper also proves a $\Omega(\sqrt{n}/\varepsilon^2)$ lower bound for testing product distributions.

Parallel Sampling via Counting by Nima Anari, Ruiquan Gao, and Aviad Rubinstein (arXiv). This isn’t a “typical sublinear algorithm” per se, but I think it is quite interesting to those of us who think about sublinearity, adaptivity, and distributions. This result has connections to the previous paper. Our aim is to sample from an unknown distribution $\mu$ over $[q]^n$. We have access to “marginal queries” as described above. This problem appears in large language models, wherein neural nets are trained on various marginals (next word), but the output is a sentence (list of words). Observe there is a simple “$O(n)$ round” algorithm. Without any fixing, query the marginal of the first coordinate. Fix the first coordinate, query the marginal of the second coordinate. Fix the first two coordinates, rinse and repeat. This requires $n$ rounds with the marginal query. In this model, polynomially many marginal queries can be made in a single round, and the aim is to minimize the number of rounds (basically, bounding the adaptivity). This paper gives a $\widetilde{O}(n^{2/3})$ round procedure for sampling, and shows an $\Omega(n^{1/3})$ lower bound.

News for June 2024

Leave a reply

A nice diverse month, this June 2024. We have a selection of papers on quantum property testing, shortest paths, polynomial threshold functions, and nearest neighbor search. And a nice survey on a closely related topic. Onward with the papers!

Invitation to Local Algorithms by Václav Rozhoň (arXiv). This is a great survey on local algorithms, a topic (roughly) about message passing distributed algorithms on graphs. While that is not a sublinear algorithm per se, the core combinatorial question has relevance to our readers. Informally, a “local problem” is one where, if a solution is incorrect, that can be determined by looking at a small radius around some vertex. For example, consider problems like maximal independent set, matching, etc. The problem is to construct such a solution in a distributed manner by only looking at low radius balls. There is a direct connection with Local Computation Algorithms (LCAs), with the primary difference being the complexity resource. In Local Algorithms as defined, one is interested in the number of rounds of computation, corresponding to the radius of balls. In LCAs, we care about the number of vertices seen, which is the size of the balls. The survey gives a detailed discussion of these connections.

Testably Learning Polynomial Threshold Functions by Lucas Slot, Stefan Tiegel, and Manuel Wiedmer (arXiv). Consider the classic model of agnostic learning with a distributional assumption. Our aim is to learn a function $f: X \to \{-1,1\}$. The learner gets labeled samples $(x,f(x))$, where $x$ is drawn from an unknown distribution $\mathcal{D}$. In agnostic learning, we aim to output a hypothesis $\hat{f}$ such that $\|f – \hat{f}\|$ is small, where distance is measured according to $\mathcal{D}$. Often one makes distributional assumptions (like $\mathcal{D}$ is Gaussian) to get non-trivial results. This is a severe limitation, since such distributional assumptions cannot be validated. A recent model of Rubinfeld-Vasilyan asks for “tester-learner” pairs that address this issue. We first run a “tester” to check the property (say, Gaussian) of the distribution. If the tester passes, then we run the learner. If $\mathcal{D}$ satisfies the distributional property, the tester is guaranteed to pass (whp). If the tester passes, then the learner is guaranteed to output a valid hypothesis. Thus, the tester is allowed to pass distributions that may be far from satisfying the property, but in such situations, the learner outputs a correct hypothesis. This paper gives such tester-learner pairs for learning polynomial threshold functions with polynomially many samples. Previously, such learning results were not known even for degree-2 PTFs with respect to the Gaussian distribution.

A Sublinear Algorithm for Approximate Shortest Paths in Large Networks by Sabyasachi Basu, Nadia Kōshima, Talya Eden, Omri Ben-Eliezer, C. Seshadhri (arXiv). This paper brings a mix of a classic problem, a real-world observation, and a new graph class. Finding shortest paths in graphs is about as classic as it gets. There is a plethora of applied work on shortest paths algorithms for real-world networks, much of which is based on landmarking. One precomputes distances between chosen landmarks, and then uses those to output shortest path distance queries. This paper asks: is it possible to get approximation shortest paths in sublinear time? Hence, we are not even allowed to preprocess the entire graph. This paper builds on previous work in social networks about “core-periphery” structure. The core is a denser, small region of the graph; the rest of the vertices form the periphery that connect to the core. The hypothesis is that shortest paths in real-world graphs typically go via the core. Given the (small) core, one can build small sublinear data structures that quickly answer shortest path queries. The paper uses insights from previous work on core-periphery structure, and gives a practical sublinear algorithm for computing approximate shortest paths. If the graph comes from a Chung-Lu distribution, then queries can be provably answered in $n^{o(1)}$ time.

Quantum Property Testing Algorithm for the Concatenation of Two Palindromes Language by Kamil Khadiev and Danil Serov (arXiv). An early result of property testing is that all regular languages can tested in constant time. But the simple (non-regular) language $\mathcal{L}$ of strings formed by concatenating two palindromes requires super-constant queries. Specifically, the complexity of property testing $\mathcal{L}$ is $\Theta(\sqrt{n})$ (ignoring log and $\varepsilon$ factors). Here, $n$ denotes the string size. This paper shows that there exist quantum property testing algorithms that beat this bound. There is a quantum property tester for the language $\mathcal{L}$ that makes $\widetilde{O}(n^{1/3})$ queries. The paper also shows that quantum query complexity of the exact decision problem is $\Theta(\sqrt{n})$. Thus, one gets non-trivial quantum speedup for both the property testing and exact problem.

A Bi-metric Framework for Fast Similarity Search by Haike Xu, Sandeep Silwal, and Piotr Indyk (arXiv). Nearest-neighbor (NN) search is one of the most important algorithmic tasks in modern data analysis. The usual setting is that we have a set of $n$ points over a metric, denoted by $D$. We preprocess the points and construct a data structure. Given a query point $q$, we wish to find the (approximate) closest points from our data set, in time sublinear in the data size. This paper introduces a bi-metric setting. Think of $D$ as expensive to compute, while there exists a cheap “proxy metric” $d$. For example, if the points represent images, the expensive metric $D$ could be some complex (but semantically accurate) metric, while $d$ may represent a simple Euclidean metric over an embedding. Formally, $d$ is a (large) constant factor approximation of $D$. This paper gives an algorithm that performs nearest neighbor search, but only preprocesses over $d$. Given, a query $q$, it makes sublinear queries to the expensive $D$. All in all, it is a truly sublinear algorithm in terms of $D$, and leverages the preprocessing over the cheap metric $d$. Technically, this paper gives a meta-algorithm that can be applied to any “base” NN algorithm. The paper proves results for two popular NN algorithms, and gives extensive empirical evidence for the underlying approach.

News for March 2024

Leave a reply

We have a rich bounty of papers this March (with a Feb 29 submission that got associated with this month). Local Computations Algorithms, Error Correction, PageRank computations, Quantum testing, Distribution testing, Graph property testing, and nice note on a statement we all knew (but never bothered to prove!). (Ed: We missed a paper on graph property testing, and updated the blurb on “A basic lower bound for property testing”.)

Average-Case Local Computation Algorithms by Amartya Shankha Biswas, Ruidi Cao, Edward Pyne, and Ronitt Rubinfeld (arXiv). Readers of this blog are likely familiar with Local Computation Algorithms (LCA) model. Imagine the input is (say) a large undirected graph $G$, to which an algorithm has query access to. The aim is to give sublinear access to a large, linear sized output, such as a graph spanner. So each bit/coordinate/edge of the output should be computable with only sublinear access to $G$ (these are called probes). There is a rich line of results for computing $k$-spanners, which are sparse subgraphs of $G$ that approximate the shortest path distances up to a factor $k$. Previous LCA lower bounds give a barrier of $\Omega(\sqrt{n})$ probes per spanner query. To beat this bound, this paper introduces average-case LCAs. Imagine that $G$ is generated according to graph distribution, like Erdős-Rényi, and the LCA only needs to succeed with high probability over the input. For such settings, this paper gives a number of LCAs for different parameter settings that beats the $\sqrt{n}$ probe barrier. An additional concept introduced is that of joint sampling. Here, the LCA is expected to generate the random input $G$ and gives access to the spanner. It is shown that one can beat the trivial bound obtained by just coupling together LCAs for input generation and spanner construction.

Local Correction of Linear Functions over the Boolean Cube by Prashanth Amireddy, Amik Raj Behera, Manaswi Paraashar, Srikanth Srinivasan, and Madhu Sudan (ECCC, arXiv). Speaking of LCAs, a classic example of an LCA is a local error correcting code (LCC). The input is a function $f:\{0,1\}^n \to G$ where $G$ is an Abelian group. Think of codewords as linear functions over this space. A $(\delta, q)$-LCC would provide local access to the closest keyword to $f$ with distance $< \delta$ from the property of linear functions. Each coordinate of the closest keyword is obtained with $q$ queries/probes to $f$. Classic results provide a $(\Omega(1), O(1))$-LCC when $G = \mathbb{Z}_2$. The aim of this paper is to consider general range groups, such as the reals. Not much is known in this space for LCCs. This paper gives a $(1/4-\varepsilon, \widetilde{O}(\log n))$-LCC for this setting. (Note that $1/4$ is the decoding radius.) Most previous LCC results use “common” symmetries among the (hypercube) domain and range, while this result works when the range is totally unrelated to the domain. Moreover the $\widetilde{O}(\log n)$ query complexity is nearly matches the latex $\Omega(\log n)$-query lower bound for this problem.

Revisiting Local Computation of PageRank: Simple and Optimal by Hanzhi Wang, Zhewei Wei, Ji-Rong Wen, and Mingji Yang (arXiv). PageRank is one of the fundamental computations in modern network science. Given a (possibly) directed graph $G = (V,E)$, PageRank defines a “truncated random walk” that works as follows. It starts from the uniform distribution. At each step, independently with probability $\alpha$, the walk just stops. With the remaining probability, it performs a single random walk step by going to a random (out) neighbor. (This definition is equivalent to the stationary distribution of the random walk Markov Chain with a teleport probability $\alpha$.) The PageRank value of vertex $t$, denoted $\pi(t)$, is the probability of ending at $t$. The obvious algorithm for estimating $t$ is to simply perform a bunch of random walks, requiring $\Theta(1/\pi(t))$ to get non-trivial estimates. The average PageRank value is $1/n$, so this leads to a linear time algorithm for an average vertex. This is a rich history of work on beating this bound, and getting a sublinear query complexity for (almost) all vertices. The approach is to get an estimator with running time (ideally) $O(n \pi(t))$, By combining both estimators, we get $O(\sqrt{n})$ time algorithms for estimating PageRank. This paper gets as close as possible to this bound, and achieves (the almost optimal) bound of $O(\sqrt{n} \cdot m^{1/4})$. The actual bound is more involved and depends on the maximum in and out degrees. It is worth noting that random walks on directed graphs are nasty objects, so getting these algorithms is extremely challenging. Moreover, this paper nails down the upper and lower bounds in terms of the $n, m$, and the max degrees.

Efficient Algorithms for Personalized PageRank Computation: A Survey by Mingji Yang, Hanzhi Wang, Zhewei Wei, Sibo Wang, and Ji-Rong Wen (arXiv). This survey isn’t meant for a TCS audience per se, but has bearing to the previous paper. And it has many explicit mentions to sublinear algorithms for PageRank computations. This survey focuses more on Personalized PageRank (PPR), wherein the walks starts from a single source vertex. Section 7 talks about sublinear algorithms for estimating individual PPR values, and discusses the techniques involved in these algorithms. This is a good survey for getting a sense of the interesting problems in estimating (P)PR values.

New Graph and Hypergraph Container Lemmas with Applications in Property Testing by Eric Blais and Cameron Seth (arXiv). The container method is a combinatorial approach that is now seeing many property testing applications. The best way to motivate this is to consider the classic problem of property testing if a dense graph $G$ has a clique of size $\rho n$. The canonical tester samples a small set of vertices (of size $f(\rho)/poly(\varepsilon)$, where $f(\cdot)$ is some explicit function), computes the largest clique in this set, and then extrapolates that to guess the largest clique size in $G$. If $G$ has a clique of size at least $\rho n$ (the YES case), the analysis is an easy exercise. If $G$ is $\varepsilon$-far from having a large clique (the NO case), the analysis needs to deal with probability that small cliques in $G$ lead to large cliques in the sample (just by random deviation). This argument needs a union bound over “all possible (small) cliques” of $G$. And this is always the hard part. The container method is about proving that there is a small (polynomial) number of “containers”, that contain all small cliques. The union bound then works out with stronger parameters. This was introduced to property testing in a beautiful, previous work of the authors, which led to substantial improvements for many property testing problems over dense graphs. In this paper, they generalize the container method to more settings, and prove stronger property testing bounds for satisfiability and $k$-colorability.

Hamiltonian Property Testing by Andreas Bluhm, Matthias C. Caro, and Aadil Oufkir (arXiv). This paper is on property testing the quantum notion of Hamiltonian locality. This description will only highlight my poor knowledge of quantum physics, but let me try. A Hamiltonian is an operator on $n$ qubits, each of which should be thought of as a complex number with magnitude $1$ (you can think of this a 2D vector, with each axis representing a classical bit). On a single qubit, there are four possible operations: identity, negate the complex part, switch the real and complex parts, or negate the complex and switch real/complex parts. These operations can be represented as $2 \times 2$ matrices (the qubit is a 2D vector), and form a basis. We can generalize this basis to $n$ qubits as follows. Pick a string in $\{0,1,2,3\}^n$, where $0$ indicates the identity operation, and the others map to the three non-trivial qubit transforms above. Apply the corresponding operation to the corresponding qubit. Mathematically, we represent this operation as a tensor product of the corresponding $2 \times 2$ matrices. These operators form the Pauli basis; the locality of a basis operator is the number of non-trivial qubit operators used. Any Hamiltonian can be written in the Pauli basis, and it is called $k$-local if it is spanned by at most $k$-local basis operators. This paper sets up a natural property testing question. Distinguish the input Hamilton being $k$-local from the input being $\varepsilon$-far from being $k$-local. Access to the Hamiltonian (and exponentially large object to “write down”) is achieved by applying the operator/transformation to a vector of qubits. If distance between operators is measured by $l_\infty$-norm, then the property testing problem requires exponentially many queries to the Hamiltonian. The main result is that if distance is measure by $l_2$-norm, then only polynomially many queries suffice to test $k$-locality, for constant $k$.

Equivalence Testing: The Power of Bounded Adaptivity by Diptarka Chakraborty, Sourav Chakraborty, Gunjan Kumar, Kuldeep S. Meel (arXiv). We all know the problem of equivalence/identity testing of distributions. This work is on the conditional sampling model. Consider two input distributions $\mathcal{P}$ and $\mathcal{Q}$ over domain $[n]$. For any subset $S$ of the domain, we can generate a sample from the input distributions, conditioned on being in $S$. In this model, existing results show that equivalence ($\mathcal{P} = \mathcal{Q}$) can property tested using $O(\log\log n)$ samples (ignoring $\varepsilon$ dependencies), a significant demonstration of the power of conditional sampling. But this is a sequential, and hence adaptive, algorithm. The best known bound for non-adaptive adaptive algorithms is $O(\log^{12}n)$, while the known lower bound is $\Omega(\log n)$. This paper shows that with a single round of adaptivity, one can get a $\widetilde{O}(\log n)$-query equivalence tester. This result opens up a rich source of problems for the conditional sampling model, wherein we look at the power of bounded adaptivity.

On the query complexity of testing local graph properties in the bounded-degree graph model by Oded Goldreich (arXiv). Consider property testing in bounded degree graph model. The input graph $G = ([n], E)$ is represented as an adjacency list. In the dense graph setting, a vast majority of “interesting” properties are testable in time independent to graph size. For bounded-degree graphs, the story is much more complicated. A fundamental question is to characterize such “constant-time” testable properties. A local property (in this context) is one that is basically defined as subgraph freeness, with some augmentations by vertex degree. Previous work suggested that all local properties can be tested in time independent on graph size. This conjecture was subsequently refuted, but with a non-constructive argument that does not give an explicit query lower bound. This paper overcomes the non-constructive barrier, and gives a query lower bound of $\Omega(\log^{(4)} n)$ for an explicit local property ($\log^{(i)}$ is the iterated log). In a slightly weaker model, where the algorithm is not given the exact size of $G$, the lower bound can be significantly improved to $\Omega(\log n)$.

A basic lower bound for property testing by Eldar Fischer (arXiv). We end with a fundamental statement that you must have known. In any (discrete) setting, consider some non-trivial property $\mathcal{P}$ with distance measured by Hamming distance. The setting that covers almost all property testing settings. Of course, any property tester requires $\Omega(1/\varepsilon)$ queries. Intuitively, if a tester makes less $O(1/\varepsilon)$ input queries, it might not “touch” the portion of the input that helps it in the distinguishing task. This basic fact has not been proven explicitly, and this note resolves that issue. The short paper gives a full self-contained proof of this fact, and does so without resorting to Yao’s minimax lemma. The proof is not difficult, but needs some care because the tester could be adaptive. Great read for students. (Further addendum: Nader Bshouty and Oded Goldreich pointed out that a previous note by Bshouty and Goldreich also proves such a lower bound. These results differ on the precise notation of a “non-trivial property”. Fischer’s definition is more direct and slightly stronger than the Bshouty-Goldreich definition. The Bshouty-Goldreich proof uses a construction of a “sufficiently separated” set of strings, which is different from the proof in this note. Again, both proofs are excellent reads for students to learn more about property testing. These results give a good perspective on crafting the right definitions.)

News for December 2023

Leave a reply

Happy New Year! This post covers the final property testing results for 2023. We have ~~just two~~ three papers, one on group testing, a new bound on sample-based testers, and a distribution testing result over higher dimensions.

On Testing Group Properties by Oded Goldreich and Laliv Tauber (ECCC). Group testing is a classic problem going back to the early days of property testing. We are given access to a “multiplication table” $f: [n]^2 \to [n]$. Each element in $[n]$ is (supposedly) an element of a group, and $f(i,j)$ is the product of elements $i$ and $j$. Our aim is to determine, in the property testing sense, whether $f$ is the multiplication table of a group. The earliest result, from the seminal Spot-checkers paper, gave an $\widetilde{O}(n^{3/2}/\varepsilon)$ time tester (where $\varepsilon$ is the proximity parameter). This paper significantly improves that classic bound with a one-sided $\widetilde{O}(n/\varepsilon)$ tester. Moreover, this tester can be adapted for testing of $f$ is Abelian. The result is obtained by a series of testers, starting from extremely simple (yet time inefficient) to more complex versions. The best known lower bound is just $\Omega(\log n)$, and that leaves a tantalizing (and wide open) gap to be reduced.

A Structural Theorem for Local Algorithms with Applications to Coding, Testing, and Verification by Marcel Dall’Agnol, Tom Gur, and Oded Lachish (arXiv). One of the powers of property testers is their choice of query. This is unlike the typical setting in learning, where one only gets access to random samples (or evaluations). Sample-based testers were defined to bridge this gap; so it is a property tester that only has access to (evaluations of) uniform random domain points. A sample-based tester simply checks if the sample is consistent with the property. It is natural to ask if the existence of a (vanilla) property tester implies the existence of a sample based tester. For the simplest setting, consider a $q$-query non-adaptive tester for some property $\mathcal{P}$. One can visualize this as a collection of query sets of size $q$ in a universe of size $n$ (the domain). Naively, one might hope that with a $q$-way collision argument, a random sample of size $O(n^{1-1/q})$ would contain a query set, yielding a sample based tester. Previous work showed that, for any property with a $q$-query non-adaptive tester, there is a sample-based tester with complexity $O(n^{1-1/(q^2\log q)})$. Remarkably, this work gives such a bound for even adaptive testers (the best previous bound was $O(n^{1-1/\exp(q)})$). The result is placed in a broader framework of robust local algorithms, which subsume $q$ query property testers, locally decodable codes (LDC), and MA proofs of proximity.

Testing Closeness of Multivariate Distributions via Ramsey Theory by Ilias Diakonikolas, Daniel M. Kane, and Sihan Liu (arXiv). (Missed from last month. -Ed) This paper considers distribution testing where the universe is $\mathbb{R}^d$. The notion of closeness is called $\mathcal{A}_k$ distance: we cover the universe with $k$ axis-parallel rectangles, and “reduce” the distribution to a discrete universe of size $k$. We then take TV-distance over these reduced distributions. When $d=1$, the complexity of testing closeness was known to be $\Theta(k^{4/5}/poly(\epsilon))$. For $d > 1$, this paper gives the first non-trivial closeness testing result. The (optimal in $k$) bound achieved is $\Theta(k^{6/7}/poly(\epsilon))$. Interestingly, there is a jump in the exponent on $k$ from $d=1$ to $d=2$, but no jump for larger $d$.

News for July 2023

1 Reply

We’re now back to our regular cadence of 4+ monthly papers on property testing and sublinear time algorithms. July brings us a new sublinear PageRank computation and numerous results on our trending topics of distribution and monotonicity testing. Without further delay, let’s see the July spread.

Estimating Single-Node PageRank in $\widetilde{O}(\min(d_t,\sqrt{m}))$ Time by Hanzhi Wang and Zhewei Wei (arXiv). PageRank is one of the most important quantities computed on graphs, especially in today’s networked world. Consider an undirected graph $G = (V,E)$. The PageRank value of vertex $t$ is the probability that a short random walk, starting from the uniform distribution, reaches the vertex $t$. (Ok ok, I’m fudging details, but this is close enough to the truth.) The aim is to estimate this probability to within additive error $O(1/n)$, where $n$ is the number of vertices. A standard simulation would give an $O(n)$ time algorithm; can we do sublinear in graph size? Previous work (which actually has implementations!) has led to $O(\sqrt{n\cdot d_t})$ for undirected graphs [LBGS15] and (roughly) $O(n^{2/3} d^{1/3}_{max})$ for all graphs [BPP18]. Here, $d_t$ is the degree of vertex $t$ and $d_{max}$ is the maximum degree. This paper gets a bound of $\widetilde{O}(\min(d_t,\sqrt{m}))$. A lower bound is still open for the fundamental problem! (A nice problem for any students reading…?)

Directed Poincare Inequalities and $L_1$ Monotonicity Testing of Lipschitz Functions by Renato Ferreira Pinto Jr. (arXiv, ECCC). We all (or at least me) love monotonicity testing. The paper studies the continuous version, where $f:[0,1]^n \to \mathbb{R}$. To have a reasonable notion of distance and testers, we will require functions to be differentiable and $L$-Lipschitz. We measure distance using $\ell_1$ distance, so the distance between $f,g$ is $\int_{[0,1]^n} |f-g| d\nu$ (where we integrate over the uniform measure) [BRY14]. To access $f$, we are also provided a “derivative oracle”: given a point $x$ and a direction $\vec{v}$, we can query the directional derivative along $\vec{v}$ at $x$. One of the key insights in (discrete, Boolean) monotonicity testing is the connection to directed isoperimetric inequalities. These inequalities are directed versions of classic inequalities that relate boundaries (or gradients) to volumes. For $L$-Lipschitz functions, the classic Poincare inequality states that $E[\|\nabla f\|_1] \geq \textrm{var}(f)$, where $\nabla f$ is the gradient and $\textrm{var}(f)$ is (up to constant factors) the distance to being constant. This paper proves the directed version $E[\|\nabla^- f\|_1] \geq \varepsilon(f)$. Roughly speaker, $\nabla^-$ is the “negative gradient” ($min(\nabla f, 0)$) and $\varepsilon(f)$ is the $L_1$-distance to monotonicity. This result directly yields a $O(n/\varepsilon)$ query tester for monotonicity. The paper asks the tantalizing question: can we do better?

A Strong Composition Theorem for Junta Complexity and the Boosting of Property Testers by Guy Blanc, Caleb Koch, Carmen Strassle, and Li-Yang Tan (arXiv). It is best to go backwards in discussing this paper, starting from the implications and going to the core results. The problem at hand is the classic one of junta testing. Given $f:\{0,1\}^n \to \{0,1\}$, distinguish if $f$ only depends on $r$ variables (an $r$-junta) or is $\varepsilon$-far from having that property. This problem is well studied, has optimal (efficient) testers, and even has results in the distribution-free setting. In the latter setting, there exists some (unknown) distribution $\mathcal{D}$ on the domain according to which distance is defined. The tester can access to queries according to $\mathcal{D}$. The clarity of junta testing disappears when we consider tolerant testing: can we distinguish $f$ is (say) $1/4$ close to an $r$-junta from being $1/3$-far (where distances are measured according to $\mathcal{D}$)? A remarkable consequence of this paper is that this tolerant testing version is unlikely to have a $poly(n)$ time algorithm. (Note that the sample complexity might still be small.) The main tool is a composition theorem that gives reductions between low $\varepsilon$ testers and constant $\varepsilon$ testers. The details are intricate, but here’s the gist. Suppose we can construct composing functions $g$ such that the distance to “junta-ness” of $g \circ f$ is much larger than the distance of $f$. Then a tester (that can only deal with large $\varepsilon$) on $g \circ f$ can effectively property test $f$. (Obviously, I’m greatly oversimplifying and mischaracterizing. So go read the paper!)

The Full Landscape of Robust Mean Testing: Sharp Separations between Oblivious and Adaptive Contamination by Clément L. Canonne, Samuel B. Hopkins, Jerry Li, Allen Liu, and Shyam Narayanan (arXiv). Consider the fundamental hypothesis testing problem of Gaussian testing. We wish to distinguish the $d$-dimensional, zero-mean, unit covariance Gaussian $\mathcal{N}(0,I)$, from the “shifted” version $\mathcal{N}(\mu,I)$, where $\mu$ is a vector of length at least $latex \alpha$. Existing results give a simple algorithm using $\Theta(\sqrt{d}/\alpha^2)$ samples. Suppose there is an adversary who can corrupt samples. The adaptive adversary can look at the samples generated, and arbitrarily change an $\varepsilon$ fraction of entries. The weaker, oblivious adversary can also arbitrarily change an $\varepsilon$ fraction of entries, but cannot look at the samples generated. Can we perform Gaussian testing in this setting? A previous result gave an optimal bound for adaptive adversaries [N22]. This paper gives the optimal sample complexity bound for oblivious adversaries. The expression is somewhat complicated. But the punchline is that for many regimes of parameters $d, \alpha, \varepsilon$, the oblivious adversary is strictly weaker. Meaning, there is a testing algorithm (against an oblivious adversary) whose sample complexity is strictly less than the optimal bound against adaptive adversaries. This comes as a surprise, because in typical distribution testing settings, these adversaries are basically equivalent.

Uniformity Testing over Hypergrids with Subcube Conditioning by Xi Chen and Cassandra Marcussen (arXiv). A problem of generalizing from hypercube to hypergrids, an issue I have much obsessed over. Consider the problem of uniformity testing, where the domain is the hypergrid $[m]^n$. We wish to test if a distribution $\mathcal{D}$ over the hypergrid is uniform. Unlike the standard distribution testing setting, the domain size is exponentially large. To aid us, we can perform conditional sampling: we can select any sub-hypergrid, and get samples from $\mathcal{D}$ restricted to this sub-hypergrid. Previous work solved this problem for the hypercube domain (when $m=2$), by providing a tester with sample complexity $O(\sqrt{n}/\varepsilon^2)$ [CCKLW22]. The previous work did not work for any other hypergrid (even $m=3$). This paper gives the first solution for uniformity testing on hypergrids with a tester of sample complexity $O(poly(m)n/\varepsilon^2)$. The dependence on $m$ has to be at least $\sqrt{m}$, from existing results. One of the important tools is getting robust versions of classic isoperimetric inequalities over the hypergrid. An important open question is to determine the optimal dependence on $\mathcal{m}$.

Learning and Testing Latent-Tree Ising Models Efficiently by Davin Choo, Yuval Dagan, Constantinos Daskalakis, and Anthimos Vardis Kandiros (arXiv). Depending on your standpoint, one may or may not consider this a “sublinear time” paper (it does not show that testing $\ll$ learning). But given the interest in distribution testing, I think it’s germane to our blog. We have a high-dimensional distribution $\mathcal{D}$ over $(x_1, x_2, \ldots, x_n)$. Without any assumption, learning (or even identity testing) requires exponential in $n$ many samples. This paper assumes that $(x_1, x_2, \ldots, x_n)$ is generated from a latent-tree Ising model. There is a tree where each node is associated with a number (think $x_i$), and the joint distribution satisfies some dependencies according to the edges. We only observe the values of the leaves; hence, the term “latent” model. The main result shows that identity testing can be done in with polynomial in $n$ samples. The authors also prove that one can learn the distribution in (albeit larger) $poly(n)$ samples.

News for March 2023

Leave a reply

I never thought this day would come.

For the first time in PTReview history, there is no paper to report. Nada. Zilch.

The calm before the storm…?

News for December 2022

Leave a reply

Happy New Year! Apologies for the late post; I was stuck for far too long in holiday mode. We haven’t missed much. There were only two papers in the month of December, since, I’m assuming, many of us were entering holiday mode.

Testing in the bounded-degree graph model with degree bound two by Oded Goldreich and Laliv Tauber (ECCC). One of great, central results in graph property testing is that all monotone properties are testable (with query complexity independent on graph size) on dense graphs. The sparse graph universe is far, far, more complicated and interesting. Even for graphs with degree bound 3, natural graph properties can have anywhere from constant to linear (in $n$) query complexity. This note shows that when considering graphs with degree bound at most 2, the landscape is quite plain. The paper shows that all properties are testable in $poly(\varepsilon^{-1})$. Any graph with degree at most 2 is a collection of paths and cycles. In $poly(\varepsilon^{-1})$ queries, one can approximately learn the graph. (After which the testing problem is trivial.) The paper gives a simple $O(\varepsilon^{-4})$ query algorithm, which is improved to the nearly optimal $\widetilde{O}(\varepsilon^{-2})$ bound.

On the power of nonstandard quantum oracles by Roozbeh Bassirian, Bill Fefferman, and Kunal Marwaha (arXiv). This paper is on the power of oracles in quantum computation. An important question in quantum complexity theory is whether $QCMA$ is a strict subset of $QMA$. The former consists of languages decided by Merlin-Arthur quantum protocols with a classical witness (the string that Merlin provides). The latter class allows Merlin to be a quantum witness. This paper shows a property testing problem where such a separation is shown. The property is essentially graph non-expansion (does there exist a set of low conductance?). The input graph should be thought of as an even (bounded) degree with “exponentially many” vertices. So it has $N = 2^n$ vertices. The graph is represented through a special “graph-coded” function. The paper shows that there is a $poly(n)$-sized quantum witness for non-expansion that can be verified in $poly(n)$ time, which includes queries to the graph-coded function. On the other hand, there is no classic $poly(n)$-sized witness that can be verified in $poly(n)$ queries to the graph-coded function. (Informally speaking, any $QCMA$ protocol needs exponentially many queries to the graph.)

Property Testing Review

The latest in property testing and sublinear time algorithms

Author Archives: Seshadhri

News for June 2025

News for February 2025

News for November 2024

New for August 2024

News for June 2024

News for March 2024

News for December 2023

News for July 2023

News for March 2023

News for December 2022