The holiday season is just around the corner. Best wishes to you and your loved ones in advance from us here at PTReview. This month, we had ~~four~~ five papers in all. To prepare for the festive mood associated with the onset of December, also included are brief updates on the recently disproven implicit graph conjecture. Let’s dig in. *(And please, do let us know if we missed your paper)*.

(** EDIT**: Thanks to our readers for pointing out a paper we missed.)

**Downsampling for Testing and Learning in Product Distributions** by Nathaniel Harms, Yuichi Yoshida (arXiv). Contemplating on connections in algorithms used for testing a diverse collection of function properties, this paper provides a unified and generalized view of a technique: which the authors call downsampling. As the paper explains, the name is motivated by analogy to image/signal processing tasks. Very often, these tasks involve two steps. In the first step, you break the input domain into “grid cells”. You use oracle calls to the function to obtain a crude approximation over all these cells. In the second step, you learn this gridded-up or coarsened function cell-by-cell.

This two-step attack could just be what the doctor ordered for your favorite function property testing problem: in particular, it has been put to work for approximately testing visual properties, approximating distance to monotonicity in high dimensions, testing \(k\)-monotone functions, and more. However, if you wanted to obtain property testers using this approach in the distribution-free setup, your ordeal might be far from over. The unknown distribution your domain is equipped with can mess with geometric arguments your gridding approach hopes to exploit. This is precisely the setup considered in the paper (i.e, distribution-free testing of function properties).

The essence of downsampling, is captured by a slick proposition that prescribes coarsening as your goto weapon if the

1) fraction of cells on which \(f\) is not constant, and

2) a measure of how non-uniform the unknown distribution D your domain is equipped with is

are both small.

Equipped with this machinery, the paper tackles the task of designing distribution-free testers for boolean monotonicity with the underlying domain being \(\mathbb{R}^d\). The argument is pretty short and improves upon the sample complexity of the corresponding result in the paper by Black-Chakrabarti-Seshadhri. Do check it out, looks like some nice ammo to add to your toolkit.

Let’s stay on board for sublinear time algorithms for gap-edit distance.

**Gap Edit Distance via Non-Adaptive Queries: Simple and Optimal** by Elazar Goldenberg, Tomasz Kociumaka, Robert Krauthgamer, Barna Saha (arXiv). The edit distance problem needs no introduction. This paper studies the problem of approximating edit distance in sublinear time. In particular, this is formalized by introducing the gapped version of the problem. This entails the following computational task: Fix \(k \in \mathbb{N}\) and some \(c > 0\). Given a pair \(x, y\) of strings over a finite alphabet, decide whether the edit distance between \(x,y\) is at most \(k\) or whether it is at least \(k^c\). This paper resolves the non-adaptive query complexity of edit distance and proves that the above gapped version can be decided in at most \(O\left(\frac{n}{k^{c – 1/2}}\right)\) queries. The paper also proves that this bound is almost optimal up to some polylog factors.

Next up, we have *time-optimal* algorithms for maximum matching and vertex cover! Sweet, right?

**Time-Optimal Sublinear Algorithms for Matching and Vertex Cover** by Soheil Behnezhad (arXiv). This paper gives algorithms for the classic problem of estimating the size of vertex cover and maximum matching in graphs in sublinear time (in both adjacency matrix and adjacency list models). This is another nice read for the holiday season — after all the paper obtains time-optimal algorithms (up to polylog factors) in both of these models for a multiplicative-additive \((2, \varepsilon n)\) algorithms. Let me set up some historical context to appreciate the maximum matching result better.

In a classic work, Parnas and Ron gave sublinear time algorithms that obtain a \((2, \varepsilon n)\) estimate to the size of a maximum matching in a graph using query access to the adjacency list in sublinear time. Their sublinear time algorithm is inspired by ideas with roots in distributed algorithms. In particular, their algorithm returns in time

\(\Delta^{O(\log {\frac{\delta}{\varepsilon}})}\) (where \(\Delta = \Delta(G)\) denotes the degree bound) an estimate \(\text{EST}\) to the size of the max-matching where \(OPT \leq \text{EST} \leq 2 \cdot OPT + \varepsilon n\). Unfortunately, algorithms in this Parnas-Ron Framework must necessarily suffer a quasi-polynomial running time because of lower bounds from distributed algorithms. Using a randomized greedy approach to estimating the size of a maximum matching, Yoshida, Yamamoto, and Ito gave the first algorithm which returned a \((2, \varepsilon n)\)-estimate to maximum matching in time \(poly(\Delta/\varepsilon)\). This is where the story essentially froze — though there were results that improved the dependence on \(\Delta\) to \(O(\Delta^2)\). Unfortunately, this is not truly sublinear for large \(\Delta\). In another direction, Kapralov-Mitrovic-Norouzi Fard-Tardos gave an algorithm that estimates the maximum matching size in time \(O(\Delta)\) (which is truly sublinear) but it no longer guarantees a \((2, \varepsilon n)\)-approximation and instead returns a \((O(1), \varepsilon n)\)-approximation. The current paper, as remarked above achieves the best of both worlds.

Final stop, updates on the implicit graph conjecture.

**The Implicit Graph Conjecture is False** by Hamed Hatami, Pooya Hatami (arXiv). While not a property testing paper per se, it is always good to see an old conjecture resolved one way or the other. In this paper, the Hatami brothers (Pooya and Hamed) refute a three decade old conjecture — the one mentioned in the title. Here is a brief history. Sampath Kannan, Moni Naor and Steven Rudich defined the notion of implicit representation of graphs. Take a family of \(\mathcal{F}\) of graphs and consider a \(n\) vertex graph \(G \in \mathcal{F}\). You say this family admits an efficient implicit representation if there exists an assignment of \(O(\log n)\) length labels to all the vertices in \(V(G)\) such that the adjacencies between every pair of vertices is a function of the labels of the corresponding pair. Crucially, the labeling function may depend on the family, but not on individual graphs in the family. What is cool about families admitting such an efficient implicit representation, is that the number of \(n\)-vertex graphs in this family cannot be any bigger than \(2^{O(n \log n)}\) — that is such families have at most factorial growth rate. Implicit graph conjecture asserts that for every hereditary graph family, the converse also holds. Namely, if the hereditary family has at most factorial growth rate, then the family admits efficient implicit representations. The key, as shown in this paper (which spans all of six pages!) is to choose as your hereditary graph class, the closure of a random collection of graphs. The authors show that now your hand is forced and your representation will no longer be efficient. However, annoyingly, your hereditary graph class need not have a factorial growth rate as taking the closure of a random collection expands to contain all graphs and has growth rate \(2^{\Omega(n^2)}\). The cool thing is, you can avoid this issue by choosing a random collection of slightly sparse random graphs (with \(n^{2-\varepsilon}\) edges). Interestingly, this gives you enough ammo to finally control the growth rate which in turn allows the authors to slay this conjecture.

(** EDIT**: Omitted Stop, added retroactively).

**Sublinear quantum algorithms for estimating von Neumann entropy** by Tom Gur, Min-Hsiu Hsieh, Sathyawageeswar Subramanian (arXiv). This paper presents quantum algorithms for the problem of obtaining multiplicative estimates of the Shannon entropy of the familiar classical distributions and the more exotic von Neumann entropy of mixed quantum systems. In particular, the paper presents an \(\widetilde{O}(n^{\frac{1+\eta }{2\gamma^2 }})\)-query quantum algorithm that achieves a \(\gamma\)-multiplicative approximation algorithm for the Shannon Entropy of an input distribution \(\mathbf{p}\) supported on a universe of size \(n\) where \(H(\mathbf{p}) \geq \gamma/\eta\). As if this were not already cool enough, the paper also presents sublinear quantum algorithms for estimating Von Neumann Entropy as well. This is supplemented by a lower bound of \(\Omega(n^{\frac{1}{3 \gamma^2}})\) queries for achieving a \(\gamma\)-factor approximation for any of these two kinds of entropies.

Oded GoldreichYou missed the breakthrough result reported in “Locally Testable Codes with constant rate, distance, and locality” (ECCC TR21-151, see https://eccc.weizmann.ac.il/report/2021/151/).

I would also recommend my own digest at https://eccc.weizmann.ac.il/report/2021/175/….

SeshSorry for the delay. We’ll incorporate this result in our Dec post, as well as your digest.

Pingback: News for March 2022 | Property Testing Review