Author Archives: Clement Canonne

News for September 2025

With October already here (!), time to look back at September, with seven property testing papers! With a healthy mix of error-correcting codes, graph testing, distribution testing, and computational geometry:

Robust Local Testability of Tensor Products of Constant-Rate Algebraic Geometry Codes, by Sumegha Garg and Akash Sengupta (ECCC). Given two error-correcting codes \(\mathcal{C}_1,\mathcal{C}_2\), their tensor product code \(\mathcal{C}_1\otimes\mathcal{C}_2\) can be represented as the set of matrices \(M\) whose rows are codewords of \(\mathcal{C}_1\) and columns codewords of \(\mathcal{C}_2\). How to (very quickly) test if an arbitrary matrix \(M\) corresponds to a codeword of \(\mathcal{C}_1\otimes\mathcal{C}_2\)? That is, when is such a tensor code a Locally Testable Code (LTC)?
A natural approach is to pick either a row or column of \(M\) uniformly at random, and check if that is a codeword of \(\mathcal{C}_1\) or \(\mathcal{C}_2\). If this natural test works, then \(\mathcal{C}_1\otimes\mathcal{C}_2\) is said to be robustly locally testable. Which brings us to this paper: the main result is to show that the tensor codes obtained from Algebraic-Geometry (AG) codes (a generalization of Reed-Solomon codes) are, indeed, robustly locally testable.

Expansion without Connectivity: A Property Testing Perspective, by Irit Dinur and Oded Goldreich (ECCC). Testing whether a (bounded-degree) \(n\)-vertex graph is an expander is among the earliest tasks considered in property testing, back to the influential work of Goldreich and Ron (2000) which connected this to the distribution testing question of uniformity testing. This paper looks at an important twist of the question: what if instead of testing if a graph is an expander, one wanted to assess whether it was expanding? This sounds like a rephrasing of the same question, but with a crucial difference: here, a graph doesn’t need to be connected to be expanding, as long as each of its connected components each (by itself) an expander. After motivating this variant of the question and connecting , the authors provide a lower bound of \(\Omega(\sqrt{n})\) queries for testing expansion, and complement it with an \(O(n^{2/3+0.0001})\)-query algorithm solving (a bicriterion version of) the testing task. Interestingly, the upper bound involves another connection to distribution testing, this time to the task of generalized uniformity testing introduced by Batu and Canonne (2017).

Plane vs. Plane Low Degree Test, by Amey Bhangale and Silas Richelson (ECCC). Low-degree tests, one of the key ingredients of the original PCP theorem and a central object of study in related areas of TCS, must, given query access to a function \(f\colon \mathbb{F}_q^m \to \mathbb{F}_q\), test whether \(f\) is a degree-\(d\) polynomial, of far from every such low-degree function. The “plane v. plane” test is a natural approach to perform this task: given a description of the function as a truth table listing its degree-d restriction \(f_P\) to every plane \(P\), do the following. Pick two random planes \(P,P’\), and check if the restrictions agree on the intersection \(P\cap P’\). The question can then be rephrased as that of characterizing the soundness of the test: if this “local test” accepts with probability \(\varepsilon\), then must \(f\) be \(\textrm{poly}(\varepsilon)\)-close to a “global” low-degree polynomial? This work significantly improves the state-of-the-art on this question, in the low agreement regime (i.e., when the soundness parameter \(\varepsilon\) can be very small: only \(\Omega(d/q)\)).

Distribution Testing in the Presence of Arbitrarily Dominant Noise with Verification Queries, by Hadley Black and Christopher Ye (arXiv). The authors introduce a new setting for distribution testing, where the (unknown) distribution \(p\) to be test is noisy (the “Huber contamination model”): that is, instead of getting i.i.d. samples from \(p\), the algorithms gets i.i.d. samples from some \(\tilde{p} = \alpha p + (1-\alpha)q\), where \(q\) is an arbitrary “noise distribution.” But all isn’t lost, as the algorithm, for any sample, can also query whether it comes from the “good” part of the mixture \(p\), or the “bad” part \(q\) (there is no “ugly” part). The obvious baseline then is to query for all samples, discard the bad ones, and run any out-of-the-box testing algorithm on what remains: this only costs an extra \(1/\alpha\) factor in the sample complexity. The authors show that one can do (much) better than this baseline! In the case of uniformity/identity and closeness testing, their results show that one can trade additional noisy samples for fewer verification queries.

On the Structure of Replicable Hypothesis Testers, by Anders Aamand, Maryam Aliakbarpour, Justin Y. Chen, Shyam Narayanan, Sandeep Silwal (arXiv). A paper from July, which we missed then. Say you want to perform your favorite distribution testing task, but in a replicable fashion (as defined by Impagliazzo, Lei, Pitassi, and Sorell (2022)): that is, run on a new set of samples, your algorithm should have the same output with high probability. “Easy”! After all, your algorithm already accepts things it should accept (in the property) with high probability, and rejects those it should reject (far from the property) with high probability… Yet not so easy: the replicability requirement also asks the algorithm to be consistent for the in-between distributions, too, those that are close but not in the property! This question was introduced by Liu and Ye in 2024: this new paper significantly improves both the results themselves and the understanding of replicable testing algorithms. In particular, it provides a characterization of what any such algorithm must satisfy, and uses this to design a “canonical tester” in the replicable setting.

Testing Depth First Search Numbering, by Artur Czumaj, Christian Sohler, Stefan Walzer (arXiv). In the bounded-degree graph model for testing, the algorithm is given query access to the adjacency function (given a vertex \(v\), can ask for the i-th neighbor of \(v\)). In this paper, the author augment this model by adding labels to the vertices, along with the ability to query the label of any vertex; but, crucially, not the reverse access, which would be to get the vertex corresponding to a chosen label. The main motivation and focus of the paper is then to test whether the (numerical) labels of the input graph correspond to a valid DFS numbering, as one would obtain by running a DFS on the graph, or if the graph is \(\varepsilon\)-far from any graph for which these labels are a DFS numbering. The authors provide both an upper and lower bound for testing DFS numbering, showing that \(\Theta(n^{1/3})\) queries are necessary and sufficient (for constant \(\varepsilon\)).

And finally, another paper we originally missed earlier this year, now updated on the arXiv:

Property Testing of Curve Similarity, by Peyman Afshani, Maike Buchin, Anne Driemel, Marena Richter, and Sampson Wong (arXiv). The authors initiate the study of property testing for Fréchet distance (continuous or discrete): given two \(n\)-vertex curves \(P,Q\) and parameters \(\delta, \varepsilon\), the algorithm must accept if \(P,Q\) have Fréchet distance at most \(\delta\), and reject if they are “\(\varepsilon\)-far from having Fréchet distance \(\delta\)” (where this “farness” is formally defined as saying the best coupling path between \(P,Q\) in their \(\delta\)-free space matrix has cost at least \(\varepsilon n\)).
The authors provide two different (incomparable) algorithms analyzed in terms of a locality parameter \(t\), the first algorithm showing that when this parameter is known only \(\tilde{O}(t/\varepsilon)\) queries are sufficient, and the second algorithm achieving query complexity \(\tilde{O}(t^2 \max(t,\log n)/\varepsilon)\) when \(t\) is unknown. The authors then conclude with a few open questions, which the readers of this blog keen to explore property testing in computational geometry may find interesting!

Update (Oct 8): another paper on replicable distribution testing we missed in July! With a very fitting title, too:
Replicable Distribution Testing, by Ilias Diakonikolas, Jingyi Gao, Daniel Kane, Sihan Liu, and Christopher Ye (arXiv). Concurrent to the work mentioned above, this paper establishes new bounds for uniformity, closeness and independence testing in the replicable setting; and complements it with a new general-purpose lower bound technique, enabling the authors to establish optimality (up to logarithmic factors) of their results on uniformity and closeness testing.

News for May 2025

Leave a reply

Apologies for the delay in publishing this digest. May has seen a lot of activity in property testing, with quite a lot from the quantum side: 8 papers in total!

Improved Sample Upper and Lower Bounds for Trace Estimation of Quantum State Powers (arXiv) by Kean Chen and Qisheng Wang. Say you have access to copies of a quantum state \(\rho\) (lucky you!) and want to estimate some property of \(\rho\): specifically, you’d like an additive estimate of \(\mathrm{Tr} \rho^q\), for some \(q> 1\) of you choosing. (This quantity, for various \(q\), has connections to Rényi and Tsallis entropies). How many copies of the state do you need? The authors significantly strengthen previous bounds on the sample complexity of the task, providing a tight answer for \(q > 2\), and improved results for \(1< q < 2\). Besides their results, they also provide a list of future directions.

On estimating the quantum \(\ell_\alpha\) distance (arXiv), by Yupan Liu and Qisheng Wang. Now you’re given access to two quantum state \(\rho_0,\rho_1\), and you want to estimate their \(\alpha\)-Schatten distance, for a fixed \(\alpha>1\) — the “quantum analogue” of estimating the \(\ell_\alpha\) distance between two (classical) discrete probability distributions. The authors provide computationally efficient algorithms with constant sample complexity (for constant additive error), and characterize the computational complexity of the task as a function of \(\alpha\) (as \(\alpha \to 1\)).

Quantum Hamiltonian Certification (arXiv), by Minbo Gao, Zhengfeng Ji, Qisheng Wang, Wenjun Yu, and Qi Zhao. Now, you are given access to the time evolution operator \(e^{-i Ht}\), for an unknown Hamiltonian \(H\) over \(n\) qubits. Can you efficiently test (certify) whether \(H\) is close to (or far from) a reference Hamiltonian \(H_0\)? The paper considers this tolerant Hamiltonian task with respect to the (normalized) Frobenius distance, and provides tight bounds on the evolution time (the measure of “sample complexity” in this time-evolution setting). The authors also discuss consequences of their results for other norms, and also give an ancillary-free algorithm for the tolerant testing task.

Hamiltonian Locality Testing via Trotterized Postselection (arXiv), by John Kallaugher and Daniel Liang. When it Hamiltonian-rains, it Hamiltonian-pours! This papers considers another tolerant testing task on Hamiltonians, where instead of certifying whether \(H\) is close to a reference \(H_0\), the goal is to tolerantly test whether it is close to being “local” (vs. far from any local Hamiltonian), where a Hamiltonian is “local” if it can be decomposed as the sum of small-weight Pauli operators. The authors provide significantly improved upper bounds on the time evolution needed for this task, as well as a tight lower (and upper) bound for algorithms alllowed reverse time evolution.

And for the last quantum paper of the month (that we could find), quantum tomography:

Sample-optimal learning of quantum states using gentle measurements (arXiv), by Cristina Butucea, Jan Johannes, and Henning Stein. A gentle measurement, as formalized by Aaronson and Rothblum, is a measurement which only “mildly” collapses the quantum state. In their paper, Aaronson and Rothblum established two-way connections between gentle measurements and differential privacy. In this work, the authors study quantum tomography with gentle measurements, and provide what looks like a connection to local differential privacy, along with a new information theoretical tool, a quantum strong data processing inequality (SDPI) for gentle measurements.

And with this perfect segue, we now switch from quantum to local differential privacy:

Locally Differentially Private Two-Sample Testing (arXiv), by Alexander Kent, Thomas B. Berrett, and Yi Yu. This paper considers the question of two-sample testing (maybe more familiar to our reader under the name closeness testing) under local differential privacy, both for discrete and smooth continuous distributions. While some previous results on identity testing under LDP carry over to closeness testing, this work extends it to the smooth continuous case, and, focusing on practicality of the algorithms, focuses on a class of testers, the permutation tests.

… and to Boolean functions!

Testing Juntas Optimally with Samples (arXiv), by Lorenzo Beretta, Nathaniel Harms, and Caleb Koch. It is a truth universally acknowledged, that a single month in possession of a good number of property testing results, must be in want of a junta testing paper. And lo and behold: junta testing, indeed! The contributions of this paper are two-fold: first, the authors provide a tight query complexity bound for junta testing the distribution-free setting (the testing analogue of the PAC learning setting, where distance is measured with respect to an arbitrary, unknown, ambient distribution). Second, they give a lower bound showing that for tolerant junta testing in the distribution-free setting, one may as well learn the whole thing: testing is as hard as learning.

And to conclude… anti-concentration inequalities, and applications.

Algebraic aspects of the polynomial Littlewood-Offord problem (arXiv), by Zhihan Jin, Matthew Kwan, Lisa Sauermann, and Yiting Wang. While this paper is concerned primarily with the (polynomial) Littlewood–Offord problem, and how to improve a recent theorem of Meka, Nguyen and Vu under additional structural assumptions, the authors remark that some of the lemmas they establish along the way have direct implications for property testing of matrices and tensors. This is discussed in Section 1.1.4 of the paper.

Announcing WoLA 2025 in Chicago

Leave a reply

The 9th edition of WoLA, the Workshop on Local Algorithms, will be taking place on August 18-20, 2025 at the Toyota Technological Institute (TTIC), Chicago, IL.

For those unfamiliar with WoLA:

Local algorithms, that is, algorithms that compute and make decisions on parts of the output considering only a portion of the input, have been studied in a number of areas in theoretical computer science and mathematics. Some of these areas include sublinear-time algorithms, distributed algorithms, inference in large networks and graphical models. These communities have similar goals but a variety of approaches, techniques, and methods. This workshop is aimed at fostering dialogue and cross-pollination of ideas between the various communities.

You are all invited and we would love to see you there! For more on (free) registration (by ⏰ August 10), how to submit a poster, list of invited speakers, and local arrangements, see the website.

Program Committee: Arnab Bhattacharyya (University of Warwick), Clément Canonne (University of Sydney), Elena Grigorescu (University of Waterloo), Moti Medina (Bar Ilan University), Rocco Servedio (Columbia University), Asaf Shapira (Tel Aviv University) [Chair], Ali Vakilian (TTIC), Yuichi Yoshida (NII)

News for April 2025

Leave a reply

The bonanza of property testing results continues this month, with seven new papers on the arXiv! And with range, too: from regular languages to quantum states, with detours through distribution testing, Boolean functions, and relative error property testing. Oh, and yes, also, (quantum) magic!

The Trichotomy of Regular Property Testing (arXiv), by Gabriel Bathie, Nathanaël Fijalkow, and Corto Mascle. In property testing of regular languages, initiated by Alon, Krivelevich, Newman, and Szegedy in 2001, one has to decide whether an input word belongs to the target language \(L\), or is at distance at least \(\varepsilon\) from every \(x \in L\) (where the distance is typically Hamming or the (easier) edit distance). Surprisingly, it was shown that every regular language could be tested with \(\tilde{O}(1/\varepsilon)\) queries, independent of the input size. But is that tight? And is that \(\tilde{O}\) necessary? Many years of work later, the main result of this paper is settling the question, showing that for testing under the Hamming distance there are only three options: either a language is trivial (0 queries needed!), or easy (\(\Theta(1/\varepsilon)\) necessary and sufficient), or just plain hard \(\Theta(\log(1/\varepsilon)/\varepsilon)\) queries necessary and sufficient). Nothing else!

A Mysterious Connection Between Tolerant Junta Testing and Agnostically Learning Conjunctions (arXiv), by Xi Chen, Shyamal Patel, and Rocco Servedio. (1) Take two well-studied, notoriously challenging and seemingly unrelated problems about Boolean functions: agnostic learning conjunctions (that is, learning conjunctions with noise), and tolerantly testing juntas. (2) Unearth a new connection between the two tasks. (3) Write a very entertaining, illuminating introduction about this. (4) Oh, also, provide a new \(\tilde{O})(2^{n^{1/3}})\)-time agnostic learning algorithm for conjunctions, improving the previous best known result after 18 years; use it to obtain a \(\tilde{O})(2^{k^{1/3}})\)-query tolerant tester for juntas using this new connection, thus showing a polynomial separation between adaptive and non-adaptive algorithms for this task. (5) You get this paper.

Distribution Testing Meets Sum Estimation (arXiv), by Pinki Pradhan and Sampriti Roy. In the sum estimation problem, there is a set of \(n\) elements, each with a non-negative weight, and the goal is to estimate the total weight \(W\) while minimizing the number of weight queried. Under a model which allows both weighted and uniform sampling, this is known to be achievable with \(O(n^{1/3})\) queries (Beretta and Tětek). This paper considers the task under two additional assumptions: first, the weights are non-increasing, and second, the algorithm is allowed conditional weighted and uniform sampling (i.e., the same two types of sampling, but conditioned on any subset \(S\) of its choosing). In this setting, the authors show how to estimate the total weight to \(1\pm\varepsilon\) with only \(O((\log n)\text{poly}(1/\varepsilon))\) queries.

Efficient witnessing and testing of magic in mixed quantum states (arXiv), by Tobias Haug, and Poetri Sonya Tarabunga. Magic is real, and it’s quantifiable: roughly speaking, it quantifies the amount or “non-stabilizerness” of a state. I won’t pretend to fully understand what this means, but this paper shows that one can test the magic of low-entropy \(n\)-qubit states (i.e., distinguish between low magic states and high-magic states) with only polynomially (in \(n\)) many copies of the state.

Mildly-Interacting Fermionic Unitaries are Efficiently Learnable (arXiv), by Vishnu Iyer. More quantum property testing! In this paper, the author shows how to test whether an \(n\)-mode fermionic unitary has Gaussian dimension at least \(k\) (or is \(\varepsilon\)-far from it in Frobenius norm) in time \(\text{poly}(n, 1/\varepsilon)\). (This is then used as a building block to efficiently learn such unitaries.)

Testing Juntas and Junta Subclasses with Relative Error (arXiv), by Xi Chen, William Pires, Toniann Pitassi, and Rocco Servedio. In this relatively (!) new model of property testing, which we covered last October, the notion of farness between two Boolean functions is relative to their number of satisfying assignments. Testing with relative error is at least as hard as in the standard setting, and could be strictly harder: this paper shows that, for the case of testing juntas, this is not the case. Even with relative error testing, \(\tilde{O}(k/\varepsilon)\) queries are necessary and sufficient for junta testing! Using ideas from “standard testing” (specifically, from the “testing by implicit learning” framework), their results further extend to testing a large number of subclasses of juntas.

Relative-error testing of conjunctions and decision lists (arXiv), by Xi Chen, William Pires, Toniann Pitassi, and Rocco Servedio. Same team, same model — more results! After testing juntas in the relative error model, the authors continue their systematic exploration of the testing questions, this time focusing on testing decision lists and conjunctions. They are able to obtain a \(\tilde{O}(1/\varepsilon)\)-tester with two-sided error for the former, and an \(O(1/\varepsilon)\)-tester with one-sided error for the latter: both matching the best-known query complexity in the standard model.

News for January 2025

Leave a reply

January 2025 was by many measures a very… eventful month; as far as property testing is concerned, not so much, with only two papers (and a third we had previously missed). Uneventful is not a bad thing, sometimes!

Many pentagons in triple systems, by Dhruv Mubayi and Jozsef Solymosi (arXiv). This paper is interested in quantifying the number of copies of \(C_k\) in 3-uniform hypergraphs. In the process, the authors establish a quantitative result very relevant to property testing, at least for those with an interest in testing triangle-freeness in dense graphs, improving on a result of Gishboliner, Shapira and Wigderson: namely, that if an \(n\)-vertex graph is \(\varepsilon\)-far from triangle-freeness, then for every \(\ell \geq 2\) it must contain \(\Omega(\varepsilon^{3\ell} n^{2\ell+1})\) copies of \(C_{2\ell+1}\).

Testing Noise Assumptions of Learning Algorithms, by Surbhi Goel, Adam R. Klivans, Konstantinos Stavropoulos, and Arsen Vasilyan (arXiv). Testable learning has seen a surge of interest since its (recent) introduction by Rubinfeld and Vasilyan (2023). In this framework, a learning algorithm which works under some data distribution assumption (e.g., the data is from a spherical ~~cow~~ Gaussian) is not “off the hook” when that assumption isn’t met, as is the case in classical learning. Instead, the algorithm must put its money where its algorithmic mouth is: if the data does indeed satisfy the assumption, then it must output a hypothesis that satisfies the learning guarantee; if the data does not satisfy the assumption, it is allowed to abort and output an error flag; but if it does output a hypothesis, regardless of whether the distributional assumption is met then that hypothesis must satisfy the learning guarantee. In this sense, the algorithm must act like a property tester for the distributional assumption made on the data.
This paper extends the testable learning framework from data distribution tonoisy data generation model: the assumption to be tested (and used) is no longer only on the distribution of the data (regardless of the associated labels), but on the distribution of the pair (data, label), including the way the label may be corrupted. In particular, the authors focus as an application on learning high-dimensional origin-centered halfspaces, where the assumption is that the data is from a Gaussian distribution, with labels perturbed by Massart noise.

Learning multivariate Gaussians with imperfect advice, by Arnab Bhattacharyya, Davin Choo, Philips George John, and Themis Gouleakis (arXiv). Suppose you want to learn the mean (or, if the covariance isn’t known, even better, the mean and covariance) of a high-dimensional Gaussian from i.i.d. samples. You’re in luck: we know how to do it, and the algorithm is very simple! You’re not in luck, though: you’ll need a lot of these i.i.d. samples to achieve non-trivial accuracy. A number either linear or quadratic in the dimension, depending on whether you’re learning only the mean vector or the whole thing.
But say you’re in “luck”: a “good” (but not very trustworthy) friend comes to your help, claiming they already know the mean and covariance, and tell you what they (claim they) are. Can you use this possibly unreliable advice to learn the Gaussian better? This is the setting of learning with advice, and, in this paper, the authors show that yes, when learning Gaussians, you can! And, what’s even better (for this blog), the algorithm they design uses as a core subroutine a tolerant tester, which allows them to carefully checks the quality of the “advice”.

News for October 2024

1 Reply

Four* papers on property testing last month! Lower bounds, upper bounds, distribution testing, quantum, and a new testing model!

* at least four. If we missed any, please let us know in the comments!

Lower Bounds for Convexity Testing, by Xi Chen, Anindya De, Shivam Nadimpalli, Rocco Servedio, and Erik Waingarten (arXiv). You’re given a membership oracle to a set \(S\) in \(\mathbb{R}^n\) (that is, query access to its indicator function \(f_S\colon \mathbb{R}^n\to \{0,1\}\)), and asked to decide if this set is convex, or “far from it”. This is a very natural and seemingly basic question— of course, we need to define what “far” means here, and the natural (normal, one may say) choice of underlying measure in \(\mathbb{R}^n\) is the standard Gaussian measure: \(d(S,T) = \Pr_{\mathcal{N}(0,I_n)}[ x \in S\Delta T]\).
Previous algorithms for this convexity testing question (and its tolerant testing analogue) are non-adaptive, and have \(2^{\tilde{O}(\sqrt{n})}\) query complexity. This paper shows that this is not just unfortunate, but also necessary: every non-adaptive tolerant tester for this question must make \(2^{\Omega(\sqrt[4]{n}}\) queries, and every (possibly adaptive) one-sided tester must have polynomial query complexity.

Replicable Uniformity Testing, by Sihan Liu and Christopher Ye (arXiv). In property testing, the algorithm must say YES with high probability on inputs which have the property, and NO with high probability on those which are far. On anything else, the algorithm is off the hook and can output either. This is typically considered to be fine, and, in any case, necessary to be able to obtain ultra-efficient algorithms. But what if, in this third case, we wanted to put the algorithm partially back on that hook, and required it to be consistent? The algorithm can answer either YES or NO, sure, but if I run it again on that same input, it should give the same answer with high probability. This is in line with a recent line of works on replicable algorithms, and is non-trivial to achieve in (the standard model of) distribution testing, where a distribution testing algorithm only gets to see random samples from the distribution, and thus needs to have a replicable behavior over that randomness. This paper introduces the question of replicable distribution testing, and provides both upper and lower bounds (essentially matching, with an asterisk) for the flagship task of uniformity testing.

Quantum property testing in sparse directed graphs, by Simon Apers, Frédéric Magniez, Sayantan Sen, and Dániel Szabó (arXiv). Graph property testing has a long and rich history in the classical setting, spanning more than two decades. There are several testing models, depending on whether the graph is dense, sparse, and directed or not: and even in the sparse, directed case, it is important to sometimes only allow outgoing edge queries. All these variants capture different meaningful scenarios, and relations and separations between them are known. This paper opens the direction of quantum testing for sparse graphs, either directed or not. The authors investigate what advantage quantum computers can bring for graph testing in this setting, and show one natural property for which a quadratic speedup exists: \(o(\sqrt{n})\) quantum queries in the outgoing-edge-query-only (unidrectional) sparse model, while classically \(\Omega(n)\) are necessary. They also show that this is not always the case: quantum testing of 3-colorability, as in the classical case, does not admit a \(o(n)\)-query tester.

Relative-error monotonicity testing, by Xi Chen, Anindya De, Yizhi Huang, Yuhao Li, Shivam Nadimpalli, Rocco Servedio, and Tianqi Yang (arXiv). Property testing of Boolean functions is defined “absolutely“: the distance between two functions is the fraction of the domain on which they differ, i.e., \(\displaystyle\frac{|f^{-1}(\{1\})\Delta g^{-1}(\{1\})|}{2^n}\)
This makes sense when the functions have a reasonable number of satisfying assignments: but may be much less meaningful for sparse functions, which only are non-zero on a \(o(1)\) fraction of the inputs—for instance, functions where “all the action” is concentrated in a tiny subcube of the hypercube. All these functions are vanishingly close to each other! To address this, the authors introduce a new distance notion, relative-error, where the distance from \(g\) to \(f\) is scaled by the sparsity of \(f\):
\(\displaystyle\frac{|f^{-1}(\{1\})\Delta g^{-1}(\{1\})|}{|f^{-1}(\{1\})|}\)
This requires a slightly different access model to avoid trivial impossibility results, so the tester is augmented with sampling access to satisfying assignments of \(f\), on top of query access to \(f\) (as otherwise it may just never even find one satisfying assignment). After introducing and motivating this testing model, the paper initiates its study in the specific case of testing monotonicity of Boolean functions.

News for May 2024

Leave a reply

May came with 3 new papers on property testing algorithms — or inspired by them.

Interactive Proofs for General Distribution Properties, by Tal Herman and Guy Rothblum (ECCC). Following a fruitful line of work (including by the authors themselves: see, e.g., this previous monthly post), this paper considers interactive proofs for distribution testing: Merlin and Arthur have data over a universe of size \(n\), Arthur wants to test properties of that data (probability distribution), but he has much less data (samples) than Merlin.
As it turns out, as long as the property he’s interested in can be checked efficiently (computationally: via a small-depth circuit), then Arthur can do it with strongly sublinear sample complexity: he needs only \(n^{1-\Omega(1)}\) samples, even for tolerant testing! And all that’s needed is a small number of rounds of interaction with Merlin. And even more, all (honest) parties can do that via a computationally efficient protocol…

Oracle-Checker Scheme for Evaluating a Generative Large Language Model, by Yueling Jenny Zeng, Li-C. Wang, and Thomas Ibbetson (arXiv). This paper draws inspiration from property testing and program checking (à la Blum, Luby, and Rubinfeld) to check the output of large language models (LLMs): specifically, for the task of entity extraction: the authors formalize how to view entity extraction as a homomorphism, and then assess empirically what using a property tester for linearity leads to. Overall, it sounds like an interesting (and somewhat unexpected?) use of property testing for LLM trustworthiness assessment!

Property testing in graphical models: testing small separation numbers, by Luc Devroye, Gábor Lugosi, and Piotr Zwiernik (arXiv). Here too, ideas from property testing are used, this time in the context of high-dimensional (Gaussian) graphical models. This paper focuses on testing properties of the structure of the graphical model: given query access to the covariance matrix \(\Sigma\) consistent with some underlying graph structure \(G\), can we test whether this structure is a tree? Is it has small separation number?
The focus differs a little from the classical setting of property testing, in that there is no distance parameter and the goal is to get an exact decision algorithm (adaptive, but with unbounded query complexity: rejecting graphs that are far from the property as a function of the unknown distance parameter, and always accepting graphs with the property). But besides this small variation, great to see more uses of property testing in the wild!

WoLA’24: Dates, Registration, and call for contributed talks and posters

Leave a reply

The 8th edition of WoLA, the Workshop on Local Algorithms, will be taking place on August 5-7, 2024 at the Simons Institute, as part of the Simons Institute’ summer program on Sublinear Algorithms.

Now, what is WoLA, some of you may ask?
“Local algorithms, that is, algorithms that compute and make decisions on parts of the output considering only a portion of the input, have been studied in a number of areas in theoretical computer science and mathematics. Some of these areas include sublinear-time algorithms, distributed algorithms, inference in large networks and graphical models. These communities have similar goals but a variety of approaches, techniques, and methods. This workshop is aimed at fostering dialogue and cross-pollination of ideas between the various communities.”

Save the date — the workshop has consistently been a great event for the local algorithms community to meet and discuss, and everyone is welcome to attend! (Registration is free)

The schedule is still being finalized, but here are some salient points:

on the first day, a celebration of Dana Ron‘s work and far-reaching influence, at the occasion of her 60th birthday
an open problem session for people to propose and discuss open questions and directions in local algorithms
Senior-junior lunches, and lightning talks (“graduating bits”) for graduating or soon-to-be-graduating students and postdocs
contributed short talks and a poster session

Importantly, there will also be another event (independent of WoLA, but very much related) on August 8 at the Simons Institute: a one-day birthday celebration for Ronitt Rubinfeld, “RR@60”, organized by Arnab Bhattacharyya, Funda Ergun, Krzysztof Onak, and Ravi Kumar. So plan on attending both!

To register your interest in attending WoLA (optional: for planning purposes), please fill the Simons Institute form on the right side here (or, alternatively, the form below). If you’d like to present your work, as a short contributed talk and/or a poster, or would like to take part in the “graduating bits” session, please express your interest by filling this form by ⏰ May 3, 5pm ET. Notifications will be sent by May 15.

If you have any questions about WoLA, or need an invitation letter to attend (for visa reasons), please indicate it in the form.

Update (25/04): updated the post to include the link to the Simons Institute registration form.

Clément Canonne, on behalf of the WoLA PC (Talya Eden, Manuela Fischer, Michael Kapralov, Robi Krauthgamer, Reut Levi, Rotem Oshman, Michal Parnas, Ron Rothblum, and Jara Uitto)

News for February 2024

Leave a reply

February this year was slightly superlinear, with 29 days instead of the usual 28. As a result… 5 property testing papers! (Including one overlooked from January),

Testing Calibration in Subquadratic Time, by Lunjia Hu, Kevin Tian, and Chutong Yang (arXiv). The authors consider the question of model calibration, where a binary predictor is said to be calibrated if \(\mathbb{E}[ y\mid v=t ] = t\) for all \(t\), where \(y\) is the observed outcome and \(v\) is the prediction. This notion, central to algorithmic fairness, comes with a host of challenges: one of them being to assess whether a given predictor is indeed calibrated, and quantifying by how much it deviates from it. Following work by Błasiok, Gopalan, Hu, and Nakkiran which introduced a notion of distance to calibration, the paper defines the (property testing) task of calibration testing, with connections to distribution testing, and provides subquadratic-time algorithms (in the sample complexity) for the task. The authors also obtain analogous results for tolerant calibration testing, which they also introduce.

The role of shared randomness in quantum state certification with unentangled measurements, by Jayadev Acharya and Yuhan Liu (arXiv). In this paper (from January), the authors consider the following question, the quantum analogue of identity testing from the classical distribution testing world: what is the copy complexity (≈sample complexity) of certifying (≈testing) whether an unknown quantum state (≈quantum analogue of a probability distribution) is equal to a known, reference quantum state? And, crucially, what about doing this when our quantum hands are tied, i.e., without using entanglement — but possibly with adaptive measurements? This is not a new question, and we previously covered a couple papers on this in April 2020 and Feb 2021. What is new here is that the authors show it’s not about adaptivity! Mirroring what happens in the classical (distributed) case, the key here turns out to be shared randomness: that is, whether the measurements are made independently (in which case \(\Theta(d^2)\) copies are necessary and sufficient), or chosen randomly but jointly (in which case the copy complexity is \(\Theta(d^{3/2})\)).

Low Acceptance Agreement Tests via Bounded-Degree Symplectic HDXs, by Yotam Dikstein, Irit Dinur, and Alexander Lubotzky (ECCC) and Constant Degree Direct Product Testers with Small Soundness, by Mitali Bafna, Noam Lifshitz, Dor Minzer (ECCC). [Two independent works]

Let \(X\) be a (small) set of \(k\)-element subsets of \([n]\), and \(\{f_S\colon S\to \Sigma\}_{S\in X}\) a family of partial functions. Is there a way to “stitch together” all the functions \(f_S\) into a global one \(G\colon X \to \Sigma\)? A testing algorithm for this is called an agreement test, and the most natural goes as follows: pick \(S,T\in X\) at random (say, with fixed, small intersection), and accept if, and only if, \(f_{S}, f_T\) agree on \(S\cap T\). Does this work? In which parameter regime (i.e., how does the acceptance probability \(\varepsilon\) relate to the closeness-to-a-global-function-\(G\)? How large does \(X\) need to be? The two papers both show that the above agreement test works in the small soundness regime (small \(\varepsilon\)), for \(= O(n)\). Or, as the authors of the first of the two papers put it: “In words, we show that ε agreement implies global structure”

Efficient learning of quantum states prepared with few fermionic non-Gaussian gates, by Antonio Anna Mele and Yaroslav Herasymenko (arXiv). While most of the paper’s focus is on tomography (learning) of a specific class of quantum states, the authors also provide in Appendix A an algorithm for a property testing question: namely, testing the Gaussian dimension of a quantum state: specifically, tolerant testing of \(t\)-compressible \(n\)-qubit Gaussian states in trace distance (Theorem 48). I do not fully grasp what all this means, to be honest, so I’ll stop here.

Sublinear Algorithms Program at the Simons Institute in 2024

Leave a reply

Exciting news!* Next year, the Simons Institute will host a summer program on Sublinear Algorithms. from May 20 to August 9, 2024. Organised by Artur Czumaj, Piotr Indyk, Jelani Nelson, Noga Ron-Zewi, Ronitt Rubinfeld, Asaf Shapira and myself, the summer program will feature 4 workshops:

the bootcamp (May 21 to 24)
Extroverted Sublinear Algorithms (June 17 to 21)
Sublinear Graph Simplification (July 29 to August 2)
the Workshop on Local Algorithms (WoLA) (August 5 to 7)

This is, of course, in addition to the bulk of the program itself: research discussions, reading groups, talks, social activities… If you happen to be in the area, you’re more than welcome to come and take part in some of these!

While each workshop will have its own set of attendees (more details soon), there are also some slots for (1) long-term visitors and (2) Simons Research Fellows (within five years of the award of their PhD at the start of academic year 2024-25, may already hold faculty positions) you can apply to! The deadline to apply is December 1, 2023:

Hope to see many of you next summer! Oh, and to conclude… have you seen our logo?

* Full disclosure: I am biased, being an organizer, but do find that very exciting.

Property Testing Review

The latest in property testing and sublinear time algorithms

Author Archives: Clement Canonne

News for September 2025

News for May 2025

Announcing WoLA 2025 in Chicago

News for April 2025

News for January 2025

News for October 2024

News for May 2024

WoLA’24: Dates, Registration, and call for contributed talks and posters

News for February 2024

Sublinear Algorithms Program at the Simons Institute in 2024