News for November 2020

Highlights in property testing this month include developments across the board. We have a nice potpourri of results which will catch your fancy: from a result which shows the tightness of the quadratic gap between the query complexities of adaptive and non-adaptive testers for dense graph properties, to isoperimetric Talagrand-like inequalities for real-valued functions over the hypercube, to results which consider a face-off between quantum computing and distribution testing, and more. Looks like the festive season came early for the PTReview readers.

Non-adaptive vs Adaptive Queries in the Dense Graph Testing Model (by Oded Goldreich and Avi Wigderson) (ECCC) It has long been established in the dense graph property testing literature that the query complexities of adaptive and non-adaptive testers are at most a quadratic factor apart. This work asks whether this gap is necessarily tight, and answers in the affirmative. The main conceptual tool used in the paper is the notion of robustly self-ordered graphs and local self-ordering procedures for these graphs (which was developed by the same authors and was covered in our September post). The results use these tools to port lower bounds on property testing problems for bit strings to graph properties.

Direct Sum and Partitionability Testing over General Groups (by Andrej Bogdanov and Gautam Prakriya) (ECCC) In their celebrated work, Blum, Luby and Rubinfeld gave a four-query test to determine whether a function $f \colon \mathbb{F}_2^n \to \mathbb{F}_2$ is affine. This paper considers a generalization of the notion of affinity to functions $f \colon \{0,1\}^n \to G$ where $G$ is an Abelian group. The generalization sought asks: does $f$ have the form $f(x) = \sum_i x_i g_i + g_0$ for group elements $g_0, g_1, \ldots, g_n \in G$? In this setting, the BLR analysis does not apply as the domain and range are not equipped with the same group structure. This work presents a four-query test for the above problem. In fact, the test is more general and can be used to decide whether a function $f \colon D_1 \times D_2 \times \cdots \times D_n \to G$ from a product domain $D_1 \times D_2 \times \cdots \times D_n$ to an Abelian group $G$ is a direct sum, that is, whether it has the form $f(x_1, x_2, \ldots, x_n) = \sum_i f_i(x_i)$. This resolves a conjecture of Dinur and Golubev. The work also presents results for testing whether a function $f$ defined from a product domain to an Abelian group $G$ is a direct product.
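
To make the starting point concrete, here is a minimal sketch of a four-query affinity check over $\mathbb{F}_2^n$ in the BLR style: it accepts iff $f(x) + f(y) + f(z) = f(x + y + z)$ on random triples. This is the classical special case only, not the test from the paper for general groups. The function name `affinity_test`, the bitmask encoding of points, and the `trials` parameter are assumptions made for illustration.

```python
import random

def affinity_test(f, n, trials=100, rng=None):
    """Four-query affinity check over F_2^n (illustrative sketch).

    Points of F_2^n are encoded as n-bit integers, so vector addition is XOR.
    f maps such an integer to 0 or 1. Affine functions are always accepted.
    """
    rng = rng or random.Random()
    for _ in range(trials):
        x, y, z = (rng.getrandbits(n) for _ in range(3))
        # Four queries: f(x), f(y), f(z) and f(x + y + z).
        if (f(x) ^ f(y) ^ f(z)) != f(x ^ y ^ z):
            return False  # found a violated constraint
    return True

def parity_with_shift(mask, shift):
    """Hypothetical helper: the affine map x -> <mask, x> + shift over F_2."""
    return lambda x: (bin(x & mask).count("1") & 1) ^ shift

def and_of_first_two_bits(x):
    """A non-affine function, used only as a negative example."""
    return (x & 1) & ((x >> 1) & 1)
```

For example, `affinity_test(parity_with_shift(0b1011, 1), 4)` always accepts, while `and_of_first_two_bits` is rejected with high probability once `trials` is moderately large.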

Isoperimetric Inequalities for Real-Valued Functions with Applications to Monotonicity Testing (by Hadley Black, Iden Kalemaj and Sofya Raskhodnikova) (ArXiv) This paper generalizes the celebrated isoperimetric inequalities for Boolean functions over the hypercube to the setting of real-valued functions. This generalized isoperimetry is then put to work to obtain a monotonicity tester for $f \colon \{0,1\}^n \to \mathbb{R}$. The tester makes $\min(\tilde{O}(r \sqrt n), O(n))$ non-adaptive queries (where $r$ is the size of the image of $f$) and has one-sided error. The authors generalize the KMS (Khot, Minzer and Safra) inequality via a Boolean decomposition theorem, which allows them to represent any real-valued function over the cube as a collection of Boolean functions over the cube which capture $f$’s distance to monotonicity as well as the structure of violations to monotonicity in $f$.
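
For readers new to monotonicity testing, the sketch below shows the classical edge tester for $f \colon \{0,1\}^n \to \mathbb{R}$: sample random hypercube edges and reject iff some edge is violated. It is non-adaptive and has one-sided error, but it is a simpler tester than the one in the paper and does not achieve the query bound quoted above. The name `edge_tester` and the parameter `num_edges` are assumptions made for illustration.

```python
import random

def edge_tester(f, n, num_edges=100, rng=None):
    """Classical edge tester for monotonicity (illustrative sketch).

    f takes a tuple of n bits and returns a real number.
    Monotone functions are never rejected (one-sided error).
    """
    rng = rng or random.Random()
    for _ in range(num_edges):
        x = [rng.randint(0, 1) for _ in range(n)]
        i = rng.randrange(n)
        # The two endpoints of the sampled edge differ only in coordinate i.
        lower = tuple(x[:i] + [0] + x[i + 1:])
        upper = tuple(x[:i] + [1] + x[i + 1:])
        if f(lower) > f(upper):
            return False  # violated edge: f decreases when a bit is raised
    return True
```

As a sanity check, `edge_tester(sum, 8)` always accepts, since the Hamming weight is monotone, whereas `edge_tester(lambda x: -sum(x), 8)` rejects on the first sampled edge.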

Erasure-Resilient Sublinear-Time Graph Algorithms (by Amit Levi, Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova and Nithin Varma) (ArXiv) In a previous work, a subset of the authors explored how to equip property testers with erasure resilience for function properties. With graph properties, a different picture emerges. In the dense graph model, where you have query access to the adjacency matrix, the situation is still fine: adjacency matrices are functional representations of graphs. Therefore, if you have a black belt in making property testers erasure-resilient for function properties, you would be able to test properties of dense graphs too. However, if I give you query access to the graph through adjacency lists, the picture changes. These are non-functional representations of graphs and therefore need new conceptual tools. This paper begins the study of erasure resilience in the adjacency list model. It focuses on two computational tasks: testing connectivity and estimating average degree. Let me showcase the results in the paper on testing connectivity. It is shown that you encounter a threshold phenomenon in testing connectivity. As long as the fraction of erasures is small, you get testers which run in time independent of the size of the graph. But when the fraction of erasures exceeds a certain cutoff, it is shown that the tester needs a number of queries linear in the size of the adjacency list.

Expander Random Walks: A Fourier Analytic Approach (by Gil Cohen, Noam Peri and Amnon Ta-Shma) (ECCC) While this is not exactly a property testing paper, how can you not report a paper which proves a significant strengthening of the classic CLT for Markov Chains (alright, for random walks on expanders) with respect to the TV distance? Let us now unpack the above a little. Consider the following setup: suppose you have a $d$-regular expander $G$. Now, imagine running the following process $\textsf{(P)}$:

1) Label half the vertices in $G$ as $+1$ and the other half as $-1$ arbitrarily.
2) Take a $(t-1)$-step random walk, which involves $t$ vertices: say $v_1, v_2, \ldots, v_t$.
3) Finally, return the labels of all the visited vertices. This gives a vector $x \in \{-1,1\}^t$.
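
The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the walk starts at a uniformly random vertex (the post does not fix the start), the complete graph stands in for the expander $G$, and the names `complete_graph`, `balanced_labels` and `sample_walk_labels` are hypothetical helpers.

```python
import random

def complete_graph(n):
    """Adjacency lists of K_n, used here as a stand-in for an expander."""
    return {v: [u for u in range(n) if u != v] for v in range(n)}

def balanced_labels(n, rng):
    """Step 1: label half the vertices +1 and the other half -1."""
    labels = [1] * (n // 2) + [-1] * (n - n // 2)
    rng.shuffle(labels)
    return labels

def sample_walk_labels(adj, labels, t, rng):
    """Steps 2 and 3: walk over t vertices and return their labels."""
    v = rng.choice(list(adj))        # assumed: uniformly random start vertex
    out = [labels[v]]
    for _ in range(t - 1):           # t - 1 steps visit t vertices in total
        v = rng.choice(adj[v])
        out.append(labels[v])
    return out

rng = random.Random(2020)
graph = complete_graph(10)
labels = balanced_labels(10, rng)
x = sample_walk_labels(graph, labels, t=6, rng=rng)  # a vector in {-1, 1}^6
```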

Now, let us ask: which classes of functions are fooled by this process?

The main result of the paper shows that this process fools

1) Symmetric functions
2) The class $AC^0$
3) Functions $f$ which are computable by low width read once branching programs.

In more detail, let $f$ be a symmetric function and let $G$ be a $\lambda$-spectral expander. For the process $\textsf{(P)}$ defined above, it follows from the spectral expansion of $G$ that $\mathrm{discrepancy}(f) = \left| \mathbb{E}_{x \sim \textsf{(P)}} [f(x)] - \mathbb{E}_{x \sim U_t} [f(x)] \right|$ is small, where $U_t$ denotes the uniform distribution over $\{-1,1\}^t$.

Now note that the total variation distance is precisely equal to the maximum discrepancy you get over all symmetric functions $f$. This connection allows the authors to upgrade the CLT for expander random walks: they are able to show a CLT for Markov Chains with respect to the TV distance (as opposed to the CLT with respect to the Kolmogorov distance, which is what was known from the work of Kipnis and Varadhan since 1986).
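
On small graphs, the TV distance in question can be computed exactly. The sketch below uses a dynamic program over (current vertex, number of $+1$ labels seen) to get the distribution of the label count under process $\textsf{(P)}$, and compares it with $\mathrm{Binomial}(t, 1/2)$. The uniform start vertex, the toy graphs, and the function names are assumptions made for illustration; this is not the Fourier-analytic argument of the paper.

```python
from math import comb

def count_distribution(adj, labels, t):
    """Exact law of the number of +1 labels on a t-vertex walk.

    adj maps vertex -> list of neighbours; labels[v] is +1 or -1.
    The walk is assumed to start at a uniformly random vertex.
    """
    n = len(adj)
    # state[v][k] = P(walk is currently at v and has seen k labels equal to +1)
    state = {v: [0.0] * (t + 1) for v in adj}
    for v in adj:
        state[v][1 if labels[v] == 1 else 0] = 1.0 / n
    for _ in range(t - 1):
        nxt = {v: [0.0] * (t + 1) for v in adj}
        for v, probs in state.items():
            share = 1.0 / len(adj[v])
            for k, p in enumerate(probs):
                if p == 0.0:
                    continue
                for u in adj[v]:
                    nxt[u][k + (1 if labels[u] == 1 else 0)] += p * share
        state = nxt
    return [sum(state[v][k] for v in adj) for k in range(t + 1)]

def tv_to_binomial(adj, labels, t):
    """TV distance between the walk's label count and Binomial(t, 1/2)."""
    walk = count_distribution(adj, labels, t)
    uniform = [comb(t, k) / 2 ** t for k in range(t + 1)]
    return 0.5 * sum(abs(p - q) for p, q in zip(walk, uniform))

# Good expander: complete graph on 100 vertices, first half labelled +1.
k100 = {v: [u for u in range(100) if u != v] for v in range(100)}
good = tv_to_binomial(k100, [1] * 50 + [-1] * 50, t=4)

# Bad case: complete bipartite graph labelled by side, so labels alternate.
k33 = {v: ([3, 4, 5] if v < 3 else [0, 1, 2]) for v in range(6)}
bad = tv_to_binomial(k33, [1, 1, 1, -1, -1, -1], t=4)
```

Here `good` comes out close to zero while `bad` is large, matching the intuition that the guarantee rests on spectral expansion.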

New Sublinear Algorithms and Lower Bounds for LIS Estimation (by Ilan Newman and Nithin Varma) (ArXiv) As the title suggests, this paper considers the task of estimating the length of the longest increasing subsequence (LIS) in an array $A$ of length $n$. In particular, it gives a non-adaptive algorithm which estimates the LIS length to within an additive error of $\pm \varepsilon n$ using $\tilde{O}\left(\frac{r}{\varepsilon^3}\right)$ queries (where $r$ is the number of distinct elements in $A$). On the lower bound side, the paper presents a lower bound of $(\log n)^{\Omega(1/\varepsilon)}$ non-adaptive queries. The lower bound construction uses ideas from the lower bound of Ben-Eliezer, Canonne, Letzter and Waingarten on testing monotone pattern freeness.
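
To pin down the quantity being estimated, here is the standard exact LIS computation, which reads the whole array and runs in $O(n \log n)$ time. It is a baseline for comparison, not the sublinear algorithm from the paper. Treating "increasing" as strictly increasing is an assumption, and `within_additive_error` is a hypothetical helper spelling out the $\pm \varepsilon n$ guarantee.

```python
from bisect import bisect_left

def lis_length(a):
    """Length of the longest strictly increasing subsequence of a."""
    # tails[k] = smallest possible last element of an increasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for v in a:
        pos = bisect_left(tails, v)
        if pos == len(tails):
            tails.append(v)   # v extends the longest subsequence found so far
        else:
            tails[pos] = v    # v gives a smaller tail for length pos + 1
    return len(tails)

def within_additive_error(estimate, a, eps):
    """Check whether an estimate is within eps * n of the true LIS length."""
    return abs(estimate - lis_length(a)) <= eps * len(a)
```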

Stability and testability: equations in permutations (by Oren Becker, Alexander Lubotzky, and Jonathan Mosheiff) (ArXiv) Consider the following question: Suppose I am given two permutations $A, B \in \mathrm{Sym}(n)$, and I want to decide whether these permutations commute or whether they are far from any pair of permutations which commute. Can this question be answered in time independent of $n$? The authors answer this question in the affirmative. They show a simple Sample and Substitute procedure which essentially does the following: take samples $x_1, x_2, \ldots, x_k \sim [n]$, and accept iff $ABx_i = BAx_i$ for each $i$. The authors call this particular set of equations/constraints (checking whether $AB = BA$) stable, as it admits a Sample and Substitute style tester with query complexity independent of $n$. This allows the authors to consider the group theoretic notion of stability as a property testing concept and allows them to examine the notion of stability from a computational standpoint.
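
The Sample and Substitute procedure for $AB = BA$ is short enough to write down directly. In this sketch, permutations are Python lists over $\{0, \ldots, n-1\}$ (a 0-indexed stand-in for $[n]$), $ABx$ is read as $A(B(x))$, and the function name and the choice of `k` are assumptions made for illustration.

```python
import random

def sample_and_substitute(A, B, k, rng=None):
    """Accept iff A(B(x)) == B(A(x)) on k uniformly sampled points x.

    A and B are permutations of range(n) given as lists.
    Commuting pairs are always accepted.
    """
    rng = rng or random.Random()
    n = len(A)
    for _ in range(k):
        x = rng.randrange(n)
        if A[B[x]] != B[A[x]]:
            return False  # x witnesses that AB and BA differ
    return True

n = 10
shift_one = [(x + 1) % n for x in range(n)]   # cyclic shift by 1
shift_two = [(x + 2) % n for x in range(n)]   # commutes with shift_one
reversal = [n - 1 - x for x in range(n)]      # disagrees with shift_one everywhere
```

Note that the number of queries depends only on `k`, not on `n`: `sample_and_substitute(shift_one, shift_two, 5)` accepts, and `sample_and_substitute(shift_one, reversal, 5)` rejects on its first sample because the two compositions differ at every point.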

StoqMA meets distribution testing (by Yupan Liu) (ArXiv) This paper shows a containment result. A certain complexity class $\textsf{eStoqMA}$ is shown to be contained in $\textsf{MA}$. Let us informally define what these classes are. A language $L$ is in $\textsf{StoqMA}$ if there exists a verifier (which is a reversible quantum circuit) such that for every input $x \in L$ there exists a witness quantum state which makes the verifier accept with probability at least $2/3$, and for every $x \not\in L$ and every quantum state, the verifier rejects $x$ with probability at least $2/3$. The paper characterizes this class from a distribution testing point of view, and this connection allows the author to conclude that $\textsf{eStoqMA} = \textsf{MA}$. Here, $\textsf{eStoqMA}$ is a sub-class of $\textsf{StoqMA}$ where all YES instances have an easy witness.

(Edit: Added later) We missed the following two papers. Thanks to our readers for pointing them out.

Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu (by Arnab Bhattacharyya, Sutanu Gayen, Eric Price and N.V. Vinodchandran) (ArXiv) Graphical models are a convenient way to describe high dimensional data in terms of its dependence structure. An important question in this area concerns learning/inferring the underlying graphical model from iid samples. This paper considers the problem when the underlying graph is in fact a tree. Thus, the challenge is to learn a tree-structured distribution. The main result is that, given sample access to a tree-structured distribution on $n$ variables, you can recover an $\varepsilon$-approximate tree with good probability using $O(n/\varepsilon)$ samples.
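
For reference, the classical Chow-Liu procedure named in the title estimates the mutual information of every pair of variables from the samples and returns a maximum-weight spanning tree. The sketch below is a minimal version with a plug-in estimator and Kruskal's algorithm; it illustrates the procedure only and says nothing about the paper's sample complexity analysis. The function names and the toy Markov chain used to generate data are assumptions made for illustration.

```python
import math
import random
from collections import Counter
from itertools import combinations

def mutual_information(samples, i, j):
    """Plug-in estimate (in nats) of I(X_i; X_j) from a list of tuples."""
    m = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    left = Counter(s[i] for s in samples)
    right = Counter(s[j] for s in samples)
    return sum((c / m) * math.log(c * m / (left[a] * right[b]))
               for (a, b), c in joint.items())

def chow_liu_tree(samples):
    """Edges of a maximum-weight spanning tree under empirical MI."""
    n = len(samples[0])
    edges = sorted(combinations(range(n), 2),
                   key=lambda e: mutual_information(samples, *e),
                   reverse=True)
    parent = list(range(n))          # union-find forest for Kruskal

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:                 # the edge joins two components: keep it
            parent[ru] = rv
            tree.append((u, v))
    return tree

def sample_chain(m, n=4, flip=0.1, seed=0):
    """Toy data: a binary chain X_0 - X_1 - ... where each bit flips w.p. flip."""
    rng = random.Random(seed)
    out = []
    for _ in range(m):
        x = [rng.randint(0, 1)]
        for _ in range(n - 1):
            x.append(x[-1] ^ (rng.random() < flip))
        out.append(tuple(x))
    return out
```

On data drawn from the toy chain, adjacent variables have noticeably higher mutual information than non-adjacent ones, so with a few thousand samples `chow_liu_tree(sample_chain(5000))` should recover the path.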

Relaxed Locally Correctable Codes with Improved Parameters (by Vahid R. Asadi and Igor Shinkar) (ECCC) In their work on Robust PCPs of Proximity, Ben-Sasson et al. asked whether it is possible to obtain a $q$-query relaxed Locally Decodable Code whose block length is strictly smaller than the best known bounds on block lengths for Locally Decodable Codes (LDCs). Recall that with relaxed LDCs you are allowed to abort instead of outputting the $i^{th}$ bit of the message should you detect that the received word is not a valid codeword. This paper makes progress on this problem and shows that you can actually construct $O(q)$-query relaxed LDCs which encode a message of length $k$ into a codeword of block length $O(k^{1 + 1/q})$. This matches some of the known lower bounds for constant-query LDCs, and thus makes progress towards understanding the gap between LDCs and relaxed LDCs in the regime $q = O(1)$.

Property Testing Review

The latest in property testing and sublinear time algorithms