News for July 2014

In this month’s post, we review papers that raise two questions that I find particularly interesting. First, can sublinear algorithms find applications in efficient approximation of CSPs? And second, can property testing be extended to give meaningful results in settings where Hamming distance is not the “right” measure of distance between functions?

Going for Speed: Sublinear Algorithms for Dense r-CSPs by Grigory Yaroslavtsev (arXiv). In their seminal paper, Goldreich, Goldwasser, and Ron considered the MAX-CUT problem. They showed that both the testing variant of the problem (where the algorithm must distinguish between graphs with a cut of size \(s\) from those with maximum cut size at most \(s – \epsilon N^2\)) and the approximation variant (where the algorithm must find a cut that has at most \(\epsilon N^2\) fewer edges than the largest cut in the graph) can be solved in sublinear time. This result inspired the very successful line of research on testing properties of graphs. We can also view this result as motivation for studying the approximability of general CSPs: given some constraint satisfaction problem, can we find an assignment that satisfies almost as many constraints as the maximally satisfying assignment in sublinear time? This current paper begins the exploration of this question with an improved bound on the query complexity of the MAX-CUT problem and with the first general result for approximating CSPs of arity greater than 2 (i.e., where the constraints are on more than 2 variables).

Lp testing by Piotr BermanSofya Raskhodnikova, and Grigory Yaroslavtsev (preprint). Almost all of the work in testing properties of functions is based on the Hamming distance: for a given property P, a tester is required to accept functions that have the property P and to reject the functions that differ from every function with the property P on an \(\epsilon\) fraction of the inputs. This notion of distance is appropriate for algebraic properties of functions and for properties of boolean functions, but for other properties of real-valued functions, \(L_p\) distances are sometimes more appropriate. This paper provides the first systematic study of property testing under these distances. An important high-level message from this work is that the testing problems can be very different under these distances than under Hamming distance. For example, the classic problem of \(\epsilon\)-testing whether \(f : [n] \to \mathbb{R}\) is monotone requires \(\Theta(\frac{\log n}{\epsilon})\) queries in the Hamming distance setting, but only \(\Theta(\frac1{\epsilon^p})\) queries in the \(L_p\) distance setting!

News for May 2014

May seems to be a slow month for property testing, with only one paper on locally testable codes. Nonetheless, we have unearthed a paper from April (missed despite our best efforts) on a related topic of hidden set approximations.

The Power of an Example: Hidden Set Size Approximation Using Group Queries and Conditional Sampling by Dana Ron and Gilad Tsur (arXiv). Consider a known universe \(U\) and an unknown subset \(S\). Our aim is to approximately determine \(|S|\). We can perform subset queries: for a query subset \(T\), an oracle returns whether \(T \cap S \neq \emptyset\) or (a more powerful model) the oracle returns a uniform random element in \(T \cap S\). This paper gives a detailed study of this problem under various settings. There is a fascinating array of upper and lower bounds, including situations where \(U\) is ordered and \(T\) must be an interval, the difference between adaptivity and non-adaptivity, arbitrary subset queries, and much more. The topic seems to be quite rich, and it should yield some nice sets of problems to study further. It appears that some lower bounds are quite open. Students, pay attention!

Limitations on Testable Affine-Invariant Codes in the High-Rate Regime by Venkatesan Guruswami, Madhu Sudan, Ameya Velingker, and Carol Wang (ECCC). Locally Testable Codes (LTCs) are codes that (typically) have a constant time tester. Interestingly, recent work on testing codes using a linear number of queries (some small constant fraction of the code length) has connections to the Small Set Expansion (SSE) problem, which in turn is related to the famous Unique Games Conjecture (UGC). A paper by Barak et al. gives a construction of a small set expander (a graph where all small sets have high conductance) where the Laplacian has many small eigenvalues (so it’s “difficult” to find a low conductance cut). This construction uses high-rate LTCs that are linear time testable. In some sense, the better the rate, the better construction one gets. This paper asks how far this idea can go. So the idea is to construct the best rate linear-time testable LTC. Reed-Muller codes fall in this category, but their rate is quite far from optimal. Recent work by Guo et al. gives a slightly better construction, but this is far from the (near) optimal rate achieved by BCH codes (which are not known to be testable). All in all, this paper shows that for affine-invariant codes, the Guo et al. construction is essentially the best testable high-rate code one can get. I have obviously done no justice to the depth of the connections, the intricacies of the parameters, or the overall coolness of this subject. So go read the paper!

News for April 2014

As winter finally releases its grip here on the east coast, we celebrate with forests, with bounded-derivative properties, and with hypergraphs. (I believe these are the traditional ingredients for the beginning-of-spring celebrations in New England, but I could be wrong.)

Testing Forest-Isomorphism in the Adjacency List Model by Mitsuru Kusumoto and Yuichi Yoshida (arXiv). Which properties of sparse graphs can we test efficiently? To a large extent, this fundamental question remains open. The natural setting in which to study this question is the adjacency list query model. In this model, a testing algorithm can select any vertex \(v\) and index \(i\); it then receives the identity \(w\) of the \(i\)th neighbor of \(v\). Kusumoto and Yoshida show that in this model, we can test if two forests (i.e., collections of trees) on \(n\) vertices are identical up to relabeling of the vertices with polylog(\(n\)) queries. They also show that, remarkably, this tester can be used to test any property of forests in the adjacency list model with polylog(\(n\)) queries.

Property Testing on Product Distributions: Optimal Testers for Bounded Derivative Properties by Deeparnab Chakrabarty, Kashyap Dixit, Madhav Jha, and C. Seshadhri (arXiv). Two basic properties of a function \(f : [n]^d \to \mathbb{R}\) on the hypergrid are monotonicity and the Lipschitz property. These two properties are special cases of a more general class of properties called bounded-derivative properties. In this paper, the authors give optimal bounds on the number of queries required to test these properties over every product distribution on the hypergrid. These results are obtained by deriving new dimension-reduction results and, most interestingly, by establishing and exploiting a strong connection between bounded-derivative properties and binary search trees.

Any Monotone Property of 3-uniform Hypergraphs is Weakly Evasive by Raghav Kulkarni, Youming Qiao, and Xiaoming Sun (ECCC). What happens if we remove the distance promise in the definition of property testing? Or, equivalently, how efficient can testers be if they must distinguish objects with some property \(P\) from every object that does not have the property \(P\)? The answer to this question seems completely obvious: no non-trivial property can be tested in this setting with a sublinear number of queries. We can easily verify that this answer is indeed correct for our favorite properties… but is it really true for every non-trivial property? As it turns out, this question is far from trivial, and in fact leads to the famous evasiveness conjectures and to the celebrated results of Rivest and Vuillemin and of Kahn, Saks, and Sturvevant. The current paper combines and extends ideas from both of these papers to show that deterministic testers for any monotone non-trivial property of 3-uniform hypergraphs indeed have linear query complexity.

News for March 2014

On Learning and Testing Dynamic Environments by Oded Goldreich and Dana Ron (ECCC). Given a dynamic environment that is evolving according to a local rule, the goal is to query some of the states in the system at some point of time and determine if the system is evolving according to some fixed rule or is far from it. A harder problem is to learn the evolution of the environment. They give some upper and some lower bounds for various versions of these problems. In particular they show that using sub-linear queries a lot can be achieved even in this setting. This is the first of its kind result and possibly many more results in this area will follow soon.

News for February 2014

So what did we see in February 2014?

Local Algorithms for Sparse Spanning Graphs by Reut Levi, Dana Ron, and Ronitt Rubinfeld (arXiv). Given a graph \(G = (V,E)\), the aim is the find a small sparse spanning subgraph \(G’\) in the local algorithm setting. So, given any edge \((u,v) \in E\), we must determine in sublinear time whether this edge exists in \(G’\). The main result is an \(\Omega(|V|)\) lower bound, and a matching algorithm when \(G\) is either an expander or hyperfinite.

A Characterization of Locally Testable Affine-Invariant Properties via Decomposition Theorems by Yuichi Yoshida (arXiv). This result gives a complete characterization of (two-sided) locally testable affine-invariant properties. These are based on existing decomposition theorems for functions into structured and pseudorandom parts. This result is analogous to that of Alon, Fischer, Newman, and Shapira (link) on characterizations for testability over dense graphs.

Testing probability distributions underlying aggregated data by Clément Canonne and Ronitt Rubinfeld (arXiv). There has been much recent work on distribution testing, and this work discusses stronger models where better algorithms can be obtained. Given a distribution \(D\) over \([n]\), we can get samples from this distribution, and also query the probability \(D(i)\) of any \(i \in [n]\). With these “dual” queries, one gets distribution testers for a variety of problems surpassing previous lower bounds. Interestingly, this is stronger than the conditional model introduced recently (here and here).

Strong Locally Testable Codes with Relaxed Local Decoders by Oded Goldreich, Tom Gur, and Ilan Komargodski (ECCC). For a locally testable code (LTC), the property of being a codeword is efficiently (constant time) testable. A locally decodable code (LDC) is a code from which individual bits of the original message can be recovered in constant time. A relaxed local decoder is allowed to declare failure for recovered a small fraction of message bits. This allows for bypassing the difficult (and prominent) open problem of constructing small LDCs. This result gives a construction of near-linear LTCs with a relaxed local decoder. This result leads to a property that is hard to test but easy to verify (in the MA proofs of proximity sense, refer to last month’s open problem).

Open problem for February 2014: Better approximations for the distance to monotonicity

Slightly delayed, Feb’s open problem is by one of the three possible “yours truly”s, Sesh. Looking for more reader participation, hint, hint.

Basic setting: Consider \(f:\{0,1\}^n \rightarrow R\), where \(R\) is some ordered range. There is a natural coordinate-wise partial order denoted by \(\prec\). The function is monotone if for all \(x \prec y\), \(f(x) \leq f(y)\). The distance to monotonicity, \(\epsilon_f\) is the minimum fraction of values that must be changed to make \(f\) monotone. This is an old problem in property testing.

Open problem: Is there an efficient constant factor approximation algorithm for \(\epsilon_f\)? In other words, is there a \(poly(n/\epsilon_f)\) time procedure that can output a value \(\epsilon’ = \Theta(\epsilon_f)\)?

State of the art: Existing monotonicity testers give an \(O(n)\)-approximation for \(\epsilon_f\), so there is much much room for improvement. (I’d be happy to see a \(O(\log n)\)-approximation.) Basically, it is known that the number of edges of \(\{0,1\}^n\) that violate monotonicity is at least \(\epsilon_f 2^{n-1}\) [GGLRS00], [CS13]. A simple exercise (given below) shows that there are at most \(n\epsilon_f 2^n\) violated edges. So just estimate the number of violated edges for an \(O(n)\)-approximation.
(Consider \(S \subseteq \{0,1\}^n\) such that modifying \(f\) on \(S\) makes it monotone. Every violated edge must have an endpoint in \(S\).)

Related work: Fattal and Ron [FR10] is a great place to look at various related problems, especially for hypergrid domains.

References

[CS13] D. Chakrabarty and C. Seshadhri. Optimal Bounds for Monotonicity and Lipschitz Testing over Hypercubes and Hypergrids. Symposium on Theory of Computing, 2013.

[FR10] S. Fattal and D. Ron. Approximating the distance to monotonicity in high dimensions . ACM Transactions on Algorithms, 2010.

[GGL+00] O. Goldreich, S. Goldwasser, E. Lehman, D. Ron, and A. Samorodnitsky. Testing Monotonicity . Combinatorica, 2000.

News for January 2014

As we already covered in earlier posts, quite a few notable property testing papers were presented at the SODA and ITCS conferences this month. In addition, an exciting new set of results on a variant of the linearity testing problem was also posted on ECCC.

Direct Sum Testing by Roee David, Irit Dinur, Elazar Goldenberg, Guy Kindler, and Igor Shinkar. One of the true gems of property testing is the linearity testing: as Blum, Luby, and Rubinfeld first showed, we can test whether a boolean function is linear with a (very natural) 3-query test. A fascinating variant of this problem that is still largely open to this day is the following: is linearity still testable even when we consider functions defined on a subset of the hypercube? This paper considers the problem when the subsets in question are layers of the hypercube, a problem that is closely connected to PCP constructions and direct sum operations on multi-player games. The main result of the paper shows that we can also test linearity for functions defined on layers of the hypercube with a natural 3-query tester.

Open problem for Jan 2014: Explicit lower bound for MAPs

The open problem of the month is a new feature on PTReview. Every month we put up on open problem contributed by our readers. To know more about getting your problem broadcasted, check out the link “About Open Problem of the Month” at the top of the page.

The first ever open problem post is by Tom Gur on Merlin-Arthur proofs of proximity. This post doubles up a great survey on this new topic. Thanks Tom!

Primary reference: Non-interactive proofs of proximity by Gur and Rothblum, ECCC, 2013.

Basic setting: Recent work by Ron Rothblum and myself [GR13] initiates the study of \(\mathsf{MA}\) proof-system analogues of property testing, referred to as \(\mathsf{MA}\) proofs of proximity (\(\mathsf{MAP}\)s). A proof-aided tester for property \(\Pi\) is given oracle access to an input \(x\) and free access to a proof \(w\). A valid \(\mathsf{MAP}\) demands the following property. Given a proximity parameter \(\epsilon>0\), for inputs \(x \in \Pi\), there exists a proof that the tester accepts with high probability, and for inputs \(x\) that are \(\epsilon\)-far from \(\Pi\) no proof will make the tester accept, except with some small probability of error.

Open problem: Can one show an explicit property where \(\mathsf{MAP}\)s do not give additional power over standard testers? Formally, show an explicit property that requires any \(\mathsf{MAP}\) to make \(\Omega(|x|)\) queries even when given a long (e.g., length \(|x|/100\)) proof?

Background: Goldreich, Goldwasser, and Ron [GGR98] showed that “almost all” properties require probing a constant fraction of the tested object. This limitation was the primary motivation for \(\mathsf{MAP}\)s. Observe that given a sufficiently long proof, every property can be tested very efficiently by an \(\mathsf{MAP}\). Consider a proof that fully describes the object. The tester has free access to the proof, so all it has to do is verify the consistency of the proof and the object (which can be done by only making \(O(1/\epsilon)\) random queries). The tester can check for free that the object described in the proof has the desired property.

Indeed, the name of the game is to construct \(\mathsf{MAP}\)s with short proofs and significantly lower query complexity than standard testers. There exists a property that has an \(\mathsf{MAP}\) using a logarithmic-length proof and only a constant number of queries, but requires nearly linear number of queries for standard testing (see Section 3 of [GR13]).

Nevertheless, \(\mathsf{MAP}\)s with short proofs are far from being omnipotent. A random property requires any \(\mathsf{MAP}\) to perform \(\Omega(n)\) queries (where \(n\) is the length of the input), even when given a \(n/100\) length proof (Section 5 of [GR13]). The open problem is to find an explicit property that matches this lower bound. The best lower bound on explicit properties is far weaker; specifically, Fischer, Goldhirsh, and Lachish [FGL13] showed that testing a certain family of linear codes (namely, codes with “large” dual distance) requires any \(\mathsf{MAP}\) with a proof of length \(p \ge 1\) to make \(\Omega(n/p)\) queries.

References

[FGL13] E. Fischer, Y. Goldhirsh, and O. Lachish. Partial tests, universal tests and decomposability. Innovations in Theoretical Computer Science (ITCS), 2013.

[GGR98] O. Goldreich, S. Goldwasser, and D. Ron. Property testing and its connection to learning and approximation. Journal of the ACM (JACM), 1998.

[GR13] T. Gur and R. Rothblum. Non-interactive proofs of proximity. Electronic Colloquium on Computational Complexity (ECCC), 2013.