After a few quiet months on PTReview, the community is back with a bang. We have a selection of nine papers this month, ranging from hypergraphs, vertex coloring, and triangle counting to a new take on linearity testing and a sublinear foray into transformers.
Property Testing in Bounded Degree Hypergraphs by Hugo Aaronson, Gaia Carenini, and Atreyi Chanda (arXiv). We all know about the rich history of property testing and sublinear algorithms for graphs. The space of hypergraphs is far less studied, and likely has an even richer theory. Consider the input to be an \(n\)-vertex \(k\)-uniform hypergraph, so every hyperedge is a set of \(k\) vertices. This paper is the first to define the bounded degree setting for hypergraphs. The hypergraph is represented as an adjacency list, wherein, for each vertex, we have the list of hyperedges containing that vertex. Distance between hypergraphs is defined in terms of the symmetric difference between their sets of hyperedges. The primary focus of the paper is on the fundamental property of \(k\)-colorability (note that \(k\) is also the hyperedge size). The main lower bound result is that for \(k \geq 3\), this property requires \(\Omega(n)\) queries to test. This is in contrast to the graph setting (\(k = 2\)), where bipartiteness is testable with \(\widetilde{O}(\sqrt{n})\) queries in the bounded degree model. The paper also shows that if the input has bounded treewidth, then \(k\)-colorability is testable with one-sided error using a constant number of queries.
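To make the model concrete, here is a minimal sketch of the bounded degree hypergraph representation described above; the class name, the degree-normalized distance, and the query interface are my own illustrative choices, not taken from the paper.

```python
# A minimal sketch of the bounded-degree hypergraph model described above.
# Names and the exact normalization of the distance are illustrative choices,
# not taken from the paper.

class BoundedDegreeHypergraph:
    """n-vertex k-uniform hypergraph, stored as per-vertex lists of hyperedges."""

    def __init__(self, n, k, hyperedges, max_degree):
        self.n, self.k, self.d = n, k, max_degree
        # Each hyperedge is a frozenset of k vertices.
        self.hyperedges = {frozenset(e) for e in hyperedges}
        assert all(len(e) == k for e in self.hyperedges)
        # Adjacency list: for each vertex, the hyperedges containing it.
        self.adj = [[] for _ in range(n)]
        for e in self.hyperedges:
            for v in e:
                self.adj[v].append(e)
        assert all(len(lst) <= max_degree for lst in self.adj)

    def neighbor_query(self, v, i):
        """Return the i-th hyperedge containing vertex v (the query a tester makes)."""
        return self.adj[v][i] if i < len(self.adj[v]) else None


def distance(H1, H2):
    """Symmetric difference of the hyperedge sets, normalized by d*n (one common choice)."""
    diff = H1.hyperedges ^ H2.hyperedges
    return len(diff) / (H1.d * H1.n)
```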
Simple Sublinear Algorithms for \((\Delta + 1)\) Vertex Coloring via Asymmetric Palette Sparsification by Sepehr Assadi and Helia Yazdanyar (arXiv). Consider access to the adjacency list of a graph with \(n\) vertices and maximum degree \(\Delta\). Most of our readers will be familiar with the result of Assadi-Chen-Khanna proving that \((\Delta + 1)\)-coloring can be done in \(\widetilde{O}(n^{3/2})\) time. The key tool is a palette sparsification lemma, which shows that each vertex can independently pick a small random list of colors, such that the final coloring can be chosen consistently with these lists. This paper proves a weaker version of this lemma with a significantly simpler proof. The result is an asymmetric version, where vertices choose color lists of varying sizes. The average list length goes up by a log factor. But the proof is simpler, and it also allows the final coloring to be found by the greedy algorithm (constrained to these lists).
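As a rough illustration of how sparsified palettes are used, here is a sketch of greedy list coloring on sampled palettes; the list sizes, the vertex order, and the failure handling are placeholders rather than the asymmetric choices made in the paper.

```python
# A minimal sketch of palette sparsification followed by greedy list coloring,
# under assumed list sizes; the actual (asymmetric) sizes proved in the paper differ.

import random

def sparsify_and_greedy_color(adj, delta, list_size):
    """adj: adjacency lists of an n-vertex graph with max degree delta.
    Each vertex samples `list_size` colors from {0, ..., delta}; we then
    greedily color vertices using only their sampled colors.
    Returns a proper coloring, or None if some vertex's list is exhausted."""
    n = len(adj)
    palettes = [random.sample(range(delta + 1), min(list_size, delta + 1))
                for _ in range(n)]
    color = [None] * n
    for v in range(n):                       # any fixed order; just an illustration
        used = {color[u] for u in adj[v] if color[u] is not None}
        avail = [c for c in palettes[v] if c not in used]
        if not avail:
            return None                      # sparsification failed for this sample
        color[v] = avail[0]
    return color
```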
Improved Sublinear Algorithms for Classical and Quantum Graph Coloring by Asaf Ferber, Liam Hardiman, and Xiaonan Chen (arXiv). Another result on \((\Delta +1)\)-coloring. A result of Morris and Song shows that there are simpler algorithms that give a \((1+\varepsilon)\Delta\)-coloring. Adapting their techniques, this paper gives a sublinear algorithm for \((\Delta+1)\)-coloring that runs in \(O(n^{3/2}\sqrt{\log n})\) time. This gives a \(\sqrt{\log n}\) factor improvement over previous results. The algorithm can be accelerated in the quantum model using Grover’s search, to get a running time of \(\widetilde{O}(n^{4/3})\). Finally, the paper gives a \(\widetilde{O}(n^{5/4})\) time quantum algorithm for getting a \((1+\varepsilon)\Delta\)-coloring.
Monotonicity Testing of High-Dimensional Distributions with Subcube Conditioning by Deeparnab Chakrabarty, Xi Chen, Simeon Ristic, C. Seshadhri, and Erik Waingarten (arXiv). This paper, my first foray into distribution testing, studies the problem of testing monotonicity of distributions over the hypercube, \(\{-1,1\}^n\). Observe that the domain is exponentially large, and the aim is to get distribution testers that use \(\textrm{poly}(n)\) samples. To get such bounds, this paper assumes access to subcube conditional samples: given any subcube of the domain (obtained by fixing some subset of coordinates), the algorithm can generate samples from the distribution conditioned on that subcube. A distribution is called monotone if the corresponding probability mass function is monotone (with respect to the standard partial order on \(\{-1,1\}^n\)). For this setting, this paper shows that the property of monotonicity can be tested with \(\widetilde{O}(n/\varepsilon^2)\) samples. Moreover, this bound is shown to be optimal, up to polylog factors. One of the main tools is a new directed isoperimetric theorem for real-valued functions on the hypercube.
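For intuition, here is a toy sketch of the subcube conditional sampling oracle, with the distribution stored explicitly (only feasible for tiny \(n\)); the function names and the example product distribution are mine, not the paper's.

```python
# A toy sketch of the subcube conditional sampling oracle, for intuition only:
# the distribution is stored explicitly, which is only feasible for tiny n.

import random
from itertools import product

def subcube_conditional_sample(pmf, restriction):
    """pmf: dict mapping points of {-1,1}^n to probabilities.
    restriction: dict {coordinate index: fixed value in {-1, +1}}.
    Returns a sample from pmf conditioned on the specified subcube."""
    support = [x for x in pmf
               if all(x[i] == v for i, v in restriction.items())]
    weights = [pmf[x] for x in support]
    if sum(weights) == 0:
        raise ValueError("restriction has zero probability mass")
    return random.choices(support, weights=weights, k=1)[0]

# Example: a product distribution on {-1,1}^3 biased toward +1, which is monotone.
n, p_plus = 3, 0.7
pmf = {}
for x in product((-1, 1), repeat=n):
    prob = 1.0
    for b in x:
        prob *= p_plus if b == 1 else 1 - p_plus
    pmf[x] = prob

print(subcube_conditional_sample(pmf, {0: 1}))   # sample conditioned on x_0 = +1
```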
Compression Barriers in Autoregressive Transformers by Themistoklis Haris and Krzysztof Onak (arXiv). While this paper does not give a sublinear time algorithm, it uses many techniques from the sublinear algorithms toolkit to prove lower bounds. And it is on transformers, arguably one of the most important components for machine learning on sequences and Large Language Models. Consider a sequence of \(n\) vectors \(v_1, v_2, \ldots, v_n \in \mathbb{R}^d\), which one can think of as vector embeddings of \(n\) words (coming from, say, some generated text). Based on extensive training and model fine-tuning, assume that the model has learned some “key” vectors \(k_1, k_2, \ldots, k_n \in \mathbb{R}^d\). The next word is generated from a “query vector” \(q\), using which we compute \(M^{-1} \sum_{\ell \leq n} \exp(q \cdot k_\ell) v_\ell\) (where \(M\) is a normalization factor). Suppose there are \(n\) query vectors, and the aim is to generate the above sum for all the query vectors (these are like the next \(n\) words). In practice, there are many schemes to speed up computation by compressing the key vectors. This paper shows that, in the worst case, computing the next word/vector requires \(\Omega(nd)\) time and space. Thus, any sublinear compression/speed-up method must fundamentally use properties of the data.
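Here is a direct transcription of the attention sum above in numpy, to make the \(O(nd)\) cost per query visible; the max-subtraction is just the standard numerical-stability trick and cancels in the normalization.

```python
# The naive O(nd)-per-query evaluation of the attention sum described above,
# which the lower bound says cannot be compressed below in the worst case.

import numpy as np

def attend(q, K, V):
    """q: (d,) query vector; K, V: (n, d) key and value matrices.
    Returns M^{-1} * sum_l exp(q . k_l) v_l, with M the normalizing factor."""
    scores = K @ q                            # q . k_l for every l: O(nd) work
    weights = np.exp(scores - scores.max())   # max-subtraction for numerical stability
    return (weights @ V) / weights.sum()      # weighted sum of values, then normalize

rng = np.random.default_rng(0)
n, d = 5, 4
K, V = rng.normal(size=(n, d)), rng.normal(size=(n, d))
q = rng.normal(size=d)
print(attend(q, K, V))
```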
Arboricity and Random Edge Queries Matter for Triangle Counting using Sublinear Queries by Arijit Bishnu, Debarshi Chanda, and Gopinath Mishra (arXiv). Triangle counting is a classic problem, with a rich history in sublinear algorithms. The earliest work gives an algorithm running in \(\widetilde{O}(n/t^{1/3} + m^{3/2}/t)\) time, where \(n, m\) are the usual graph parameters and \(t\) is the triangle count. (Polylog and \(\varepsilon\) factors are ignored.) This complexity can be improved to \(\widetilde{O}(n/t^{1/3} + m\alpha/t)\), where \(\alpha\) is the graph arboricity/degeneracy. The standard adjacency list model can be augmented with random edge queries, in which case it was known that \(O(m^{3/2}/t)\)-time algorithms exist. This paper shows that \(O(m\alpha/t)\)-time algorithms exist in the model with random edge queries, thereby combining the previous results into an even better algorithm.
Improved Sublinear-time Moment Estimation using Weighted Sampling by Anup Bhattacharya and Pinki Pradhan (arXiv). Consider an array of \(n\) non-negative weights \(w_1, \ldots, w_n\). The \(t\)-th moment is the sum \(\sum_{i \leq n} w^t_i\). The aim is to estimate this moment using proportional samples. This means that the algorithm can generate index samples where index \(i\) appears with probability proportional to \(w_i\). The first result on this problem, for \(t=1\), was by Motwani-Panigrahy-Xu. Beretta-Tětek did a detailed study of this problem for the case \(t=1\) (sum estimation), and showed that it can be solved with \(O(\sqrt{n})\) queries. When the weights are vertex degrees and we have graph access, there are results showing the existence of \(\widetilde{O}(n^{1-1/t})\) algorithms for \(t > 1\). This paper gives such a result for arbitrary non-negative weights with \(t \geq 1\), and proves a matching lower bound. Moreover, this paper considers the setting where \(t < 1\). Interestingly, there is an algorithm of query complexity \(O(\sqrt{n} + n^{1/t-1})\), and a lower bound showing that \(\Omega(n)\) samples are needed when \(t \leq 1/2\).
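To illustrate the proportional-sampling model, here is a toy estimator; note that it cheats by using the total weight \(W\), which is exactly what is unknown in the actual problem (estimating \(W\) is the \(t=1\) case), so this is not the paper's algorithm.

```python
# A toy illustration of the proportional-sampling model, for intuition only.
# The weights are known explicitly here, so we can cheat and use the total
# weight W; in the actual problem W is unknown.

import random

def proportional_sampler(weights):
    """Returns a function that samples index i with probability w_i / W."""
    return lambda: random.choices(range(len(weights)), weights=weights, k=1)[0]

def moment_estimate(weights, t, num_samples, W):
    """If W = sum_i w_i were known, then W * E[w_I^(t-1)] for I ~ w/W equals
    the t-th moment sum_i w_i^t; we average samples to approximate that expectation."""
    sample = proportional_sampler(weights)
    total = sum(weights[sample()] ** (t - 1) for _ in range(num_samples))
    return W * total / num_samples

w = [random.random() for _ in range(1000)]
print(moment_estimate(w, t=2, num_samples=5000, W=sum(w)),
      sum(x ** 2 for x in w))                 # estimate vs. the true second moment
```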
Biased Linearity Testing in the 1% Regime by Subhash Khot and Kunal Mittal (ECCC). The classic problem of linearity testing never stops yielding new insights. A function \(f:\{0,1\}^n \to \{-1,1\}\) is linear if \(f(x) = (-1)^{a \cdot x}\) for some \(a \in \{0,1\}^n\). The classic 3-query BLR test distinguishes linear functions from those that are far from linear. Specifically, if the tester passes with probability \(1-\delta\), then the function \(f\) is \(O(\delta)\)-close to linear. This is called testing in the “99% regime”. Suppose we want a tester with guarantees in the “1% regime”. This means that if the tester passes with probability \(> \delta\), then \(f\) has at least \(\epsilon\) correlation with some linear function (where \(\epsilon\) is some function of \(\delta\)). Such results were known for the uniform distribution on the hypercube. This paper studies product distributions on the hypercube, where all marginals are identically \(p\)-biased (the probability of choosing a \(1\) is \(p\)). Such 1%-testers were known for \(p \in (1/3,2/3)\) from recent work. This paper gives a \((1+1/p)\)-query tester for all values of \(p\), and proves that the number of queries must be at least \(1+1/p\).
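For reference, here is a sketch of the classic 3-query BLR test over the uniform distribution, in the \(\pm 1\) convention used above; the \(p\)-biased testers in the paper are different, and this is just the baseline being generalized.

```python
# A sketch of the classic 3-query BLR test over the uniform distribution,
# using the {-1,+1} range convention; not the p-biased testers from the paper.

import random

def blr_test(f, n, num_trials=100):
    """f: function from length-n 0/1 tuples to {-1, +1}.
    Accepts iff f(x) * f(y) * f(x XOR y) == 1 on every sampled pair (x, y)."""
    for _ in range(num_trials):
        x = tuple(random.randint(0, 1) for _ in range(n))
        y = tuple(random.randint(0, 1) for _ in range(n))
        z = tuple(a ^ b for a, b in zip(x, y))
        if f(x) * f(y) * f(z) != 1:
            return False          # found a violated linearity constraint
    return True

# Example: the linear function f(x) = (-1)^(a . x) for a = (1, 0, 1) always passes.
a = (1, 0, 1)
f = lambda x: (-1) ** sum(ai * xi for ai, xi in zip(a, x))
print(blr_test(f, n=3))
```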