# News for November 2015

(Updating post with an additional paper that we missed in our first posting. Sorry! Feel free to email us at little.oh.of.n@gmail.com to inform us of papers we should mention.)

We’ve got exciting new in November! Optimal results for testing of monotone conjunctions, a new lower bound for monotonicity testing (yay!), and new lower bounds for Locally Testable Codes.

Tight Bounds for the Distribution-Free Testing of Monotone Conjunctions by Xi Chen and Jinyu Xie (arxiv). Consider functions $$f: \{0,1\}^n \rightarrow \{0,1\}$$ (amen), and the basic property of being a monotone conjunction. Our notion of distance is with respect to a distribution $$\mathcal{D}$$, so the distance between functions $$f$$ and $$g$$ is $$\mathop{Pr}_{x \sim \mathcal{D}} [f(x) \neq g(x)]$$. In the distribution-free testing model, the tester does not know the distribution $$\mathcal{D}$$, but is allowed samples from $$\mathcal{D}$$. When $$\mathcal{D}$$ is uniform, this coincides with standard property testing. There can be significant gaps between standard and distribution-free testing, evidenced by conjunctions. Parnas, Ron, and Samorodnitsky proved that monotone conjunctions can be tested in the vanilla setting in time independent of $$\mathcal{n}$$, while Glasner-Servedio prove a $$\Omega(n^{1/5})$$ lower bound for distribution-free testing. This paper provides a one-sided, adaptive distribution-free tester that makes $$\widetilde{O}(n^{1/3})$$ queries, and they also prove a matching (up to polylog terms and $$\epsilon$$-dependencies) two-sided, adaptive lower bound. This is a significant improvement on the previous upper bound of $$\widetilde{O}(\sqrt{n})$$ of Dolev-Ron, as well as over the previous lower bound. Furthermore, the results of this paper hold for general conjunctions.

A Polynomial Lower Bound for Testing Monotonicity by Aleksandrs Belovs and Eric Blais (arxiv). Surely the reader needs little introduction to testing monotonicity of Boolean functions, and a previous Open Problem post should help. A quick summary: we want to test monotonicity of $$f:\{0,1\}^n \rightarrow \{0,1\}$$. The best upper bound is the $$\widetilde{O}(\sqrt{n})$$ non-adaptive, one-sided tester of Khot et al. There is a matching, non-adaptive lower bound (up to polylog terms) by Chen et al, implying the best-known adaptive lower bound of $$\Omega(\log n)$$. Can adaptivity help? This paper proves an adaptive lower bound of $$\Omega(n^{1/4})$$. Exciting! The approach of Chen et al is to focus on monotonicity of linear threshold functions (technically, regular LTFS, where the weights are bounded). The authors show that this property can be tested in $$\mathop{poly}(\log n)$$ time, shooting down hopes of extending the LTF approach for adaptive lower bounds. The key insight is to work with the distribution of Talagrand’s random DNF instead, which is the most noise sensitive DNF. (Talagrand strikes again. He helps the upper bound, he helps the lower bound.) Perturbations of this DNF lead to non-monotone functions, the paper proves that these distributions cannot be distinguished in $$o(n^{1/4})$$ queries.

Lower bounds for constant query affine-invariant LCCs and LTCs by Arnab Bhattacharyya and Sivakanth Gopi (arxiv). Think of any code as a set/property of codewords in the domain $$\Sigma^N$$. Such a code is an $$r$$-LTC if it has an $$r$$-query property tester. An $$r$$-LCC is has the property that, given any $$x \in \Sigma^n$$ sufficiently close to a codeword, one can determine any coordinate of the codeword using $$r$$-queries. A fundamental question is to understand the length of LCCs and LTCS, or alternately (fixing $$\Sigma^N$$) determining the largest possible set of codewords. Existing constructions typically have much symmetry, either linear or affine invariance. (Check out Arnab Bhattacharyya’s survey on affine invariant properties for more details.) It is convenient to think of any codeword as a function $$f: [N] \rightarrow \Sigma$$, and furthermore, think of $$[N]$$ as a vector space $$\mathbb{K}^n$$. The best known construction of Guo, Kopparty, and Sudan yields (affine-invariant) LCCs of size $$\exp(\Theta(n^{r-1}))$$ and LTCs of size $$\exp(\Theta(n^{r-2}))$$ (many dependences on $$\mathbb{K}$$, the rate, etc. are hidden in the big-Oh). This paper show that these bounds are actually the best achievable by any affine-invariant code. (Previous lower bounds of Ben-Sasson and Sudan only held for linear codes.) The intriguing and wide-open question is to prove such lower bounds without affine invariance.

# News (of lack thereof) for October 2015

Alas! A dry month, with no property testing or sublinear papers. Maybe next month…

# Sublinear Algorithms Workshop at Johns Hopkins University

(Posting an announcement for a workshop on sublinear algorithms.)

We are organizing a Sublinear Algorithms workshop that will take place at Johns Hopkins University, January 7-9, 2016. The workshop will bring together researchers interested in sublinear algorithms, including sublinear-time algorithms (e.g., property testing and distribution testing), sublinear-space algorithms (e.g., sketching and streaming) and sublinear measurements (e.g., sparse recovery and compressive sensing).

The workshop will be held right before SODA’16, which starts on January 10 in Arlington, VA (about 50 miles from JHU).

Participation in this workshop is open to all, with free registration. In addition to 20+ excellent invited talks, the program will include short contributed talks by graduating students and postdocs, as well as a poster session. To participate in the contributed talk session and/or the poster session, apply by December 1.

For further details and registration, please visit

Best,

Piotr Indyk, MIT
Robert Krauthgamer, Weizmann Institute of Science

# News for July 2015

The summer month of July brings to us exciting work on distribution testing, scale-free graph testing, personal PageRank estimation, and (hold your breath!) cake cutting. Let’s hear more…

Avid followers of PTReview need little introduction to distribution testing, since we’ve seen much work on this topic over the past year. We have two concurrent results that really push the state-of-the-art of distribution testing.
Testing Shape Restrictions of Discrete Distributions by Clément Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld (arxiv). Most (sublinear) distribution testing results work for structured classes, like monotone, k-modal, Poisson, multinomial etc. distributions. This paper gives an overarching framework that unifies these concepts in terms of succinctness. Consider a class $$\mathbb{C}$$ of distributions over $$[n]$$. This class is succinct if for any distribution $$\mathcal{D} \in \mathbb{C}$$, one can partition $$[n]$$ into a “small” number of intervals such that $$\mathcal{D}$$ is “approximately constant” with each interval. The main punchline is that any such class has a sublinear sample distribution tester. Amazingly, the generic tester given is optimal (up to poly log factors and the distance parameter)! Much previous work is subsumed as a corollary of this main theorem, and the paper also gives a lower bound framework to get matching lower bounds.

Optimal Testing for Properties of Distributions by Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath (arxiv). This paper concurrently gets many of the results of the previous paper using completely different techniques. The focus is on monotone, product, and log-concave distributions, where the paper gives completely optimal testers (both in terms on the domain size and the distance parameter). The key insight is to focus on the $$\chi^2$$-distance between distributions, though distribution testing typically uses the total-variation (or $$\ell_1$$) distance. The tester is also robust/tolerant, in that yes instances include distributions close to the class considered.

Bidirectional PageRank Estimation: From Average-Case to Worst-Case by Peter Lofgren, Siddhartha Banerjee, and Ashish Goel (arxiv). Not your typical sublinear algorithms paper, but that never stopped us. Personalized PageRank (PPR) is a fundamental similarity notion between vertices in a graph, and is (sort of) the probability that a short random walk from $$u$$ ends at $$v$$. Naturally, estimating the PPR value from $$u$$ to $$v$$ can be done in time $$O(1/p)$$, where $$p$$ is the PPR value. (Simply take many random walks from $$u$$, and see how often you hit $$v$$.) Can you do sublinear in this value? The answer is yes, when the graph is undirected and has bounded degree. One can do this in $$O(1/\sqrt{p})$$ time, a significant improvement. It basically uses bidirectional search, by doing forward walks from $$u$$ and backward walks from $$v$$. These ideas were developed in earlier work of Lofgren et al, based on true-blooded sublinear graph algorithms for expander reconstruction by Kale, Peres, and Seshadhri, which are further based on truer-blooded property testing questions by Goldreich and Ron on expansion testing. The best thing, these algorithms work in practice!

Every Property is Testable on a Natural Class of Scale-Free Multigraphs – ver. 2 by Hiro Ito (arxiv). Alas, PTRreview missed ver. 1, which appeared in April. One of the weaknesses of property testing on sparse graphs is the restriction to bounded degree. This paper considers arbitrary sparse graphs, and focuses on graphs with a power-law degree distribution. There are classic results of Barabási and Albert showing the significance of such graphs (also called scale-free networks) in the real-world. The main result is that special classes of scale-free graphs defined by clustering properties are hyperfinite. This means that one can remove a small fraction of edges to split the graph into constant-sized pieces. An important result of Newman and Sohler showed (in the bounded degree case) that any property defined by a subset of hyperfinite graphs graphs is testable. This paper leverages that result to show that properties defined by clustered classes of scale-free graphs are also testable.

How to solve the cake-cutting problem in sublinear time – ver. 2 by Hiro Ito and Takahiro Ueda (arxiv). Again, ver. 1 appeared in April and was missed by us. Apologies! Cake cutting is a classic resource allocation problem. Consider a cake, as the unit interval $$[0,1]$$. There are $$n$$ players, each of whom has a different utility of the cake, each specified by a continuous, real function on $$[0,1]$$. We wish to give each player a slice (subinterval of $$[0,1]$$) such that they all have utility $$1/n$$. An old result of Even and Paz give $$O(n \log n)$$ divide and conquer algorithms. What can be said in sublinear time? This paper shows that one can give fair assignments to $$o(n)$$ players in $$o(n)$$ time, such that the assignment can be extended to almost all remaining players.

# News for March 2015

This month’s post will liberally link to previous posts, because it turns out we have written enough in the past to motivate this month’s work. Our news this March has more monotonicity testing, non-deterministic testing, and image testing. Let me test your patience no more…

On the Complexity of Nondeterministically Testable Hypergraph Parameters

by Marek Karpinski and Ronald Markó (arXiv). The notion of nondeterministic graph property testing was introduced by Lovász and Vestergombi and further developed by Gishboliner and Shapira, and the authors. Our August 2014 news post explains these results, rather (ahem) beautifully, so check that out. This result generalizes nondeterministic property testing to hypergraphs, and proves the equivalence to deterministic (standard) property testing. Unlike their previous result, this does not give an explicit function for the complexity of a deterministic tester for a specified nondeterministically testable property. (We only know that it is “some” constant.) This is analogous to the original work of Lovász and Vestergombi, which was non-constructive. These non-constructive results use limit theorems for dense (hyper)graphs. This certainly leaves the open question of getting explicit complexity bounds.

Quantum Algorithm for Monotonicity Testing on the Hypercube by Aleksandrs Belovs and (PTReview’s very own) Eric Blais (arXiv). Much has been said about Boolean monotonicity testing in our reports, and you can refresh everything by checking out the open problem for Feb 2015. This result gives a $$O(n^{1/4}\epsilon^{-1/2})$$ in the quantum testing model (check Ashley Montanaro’s survey on PTReview). Standard quantum amplification of the classic edge tester and the recent Khot-Minzer-Safra monotonicity tester yield $$O(\sqrt{n/\epsilon})$$ and $$(n^{1/4}/\epsilon)$$ testers respectively. The point of this result is to get the best dependence on both $$n$$ and $$\epsilon$$. For $$\epsilon = 1/\sqrt{n}$$, the new quantum algorithm gives a cubic speedup over existing classical testers. From a quantum computing perspective, there are few problems with polynomial super-quadratic speedup, and Boolean monotonicity testing now enters that list.

Constant-Time Testing and Learning of Image Properties by Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova (arXiv). The notion of testing image properties was first introduced by Raskhodnikova and further developed by Tsur and Ron. There have been some really interesting results on applying these methods in practice for computer vision. This paper studies different query models for the property tester. Typically, the input is a grid of (black and white) pixels, and we wish to determine if the black region is connected, convex, etc. Previous testers were allowed to query any pixel of choice and could be adaptive. This paper focuses on the restricted sample based models, where the tester only sees uniform random pixels, and variants thereof. Interestingly, this limited power suffices to test and even tolerant test the properties of being a half-plane, connectedness, and convexity. There are numerous results for the different models and tolerant vs standard testers. Having richer models would likely have more impact in practice, where query models may not be feasible. Maybe this result will lead to more work in computer vision!

# Open problem for February 2015

Today’s post by Clément Canonne.

Following the Boolean monotonicity testing bonanza, here’s an open problem. In short, does adaptivity help for monotonicity testing of Boolean functions?

Problem: Consider the problem of monotonicity testing for Boolean functions on the hypercube. Given oracle access to $$f\colon \{0,1\}^n \to \{0,1\}$$, we wish to decide if $$f$$ is (i) monotone vs. (ii) $$\epsilon$$-far from monotone (in Hamming distance). For either the one-sided or two-sided version of the problem, what is the exact status of adaptive testers?

State of the art:
Fischer et al. [FLN+02] showed one-sided non-adaptive testers require $$\sqrt{n}$$ queries. This implies an $$\Omega(\log n)$$ lower bound for one-sided adaptive testers.
Chen et al. [CDST15] proved that two-sided non-adaptive testers require (essentially) $$\Omega(\sqrt{n})$$ queries. This implies an $$\Omega(\log n)$$ lower bound for 2-sided adaptive testers.
Khot et al. [KMS15] recently gave a one-sided non-adaptive tester making $$\tilde{O}(\sqrt{n}/\epsilon^2)$$ queries. The story is essentially complete for non-adaptive testing.

Comments: As of now, it is not clear whether adaptivity can help. Berman et al. [BRY14] showed the benefit of adaptivity for Boolean monotonicity testing over the domain $$[n]^2$$ (switch the $$2$$ and the $$n$$ from the hypercube). A gap provably exists between adaptive and non-adaptive testers: $$O(1/\epsilon)$$ vs. $$\Omega(\log(1/\epsilon)/\epsilon)$$.

References:

[FLN+02] E. Fischer, E. Lehman, I. Newman, S. Raskhodnikova, R. Rubinfeld, and A. Samorodnitsky. Monotonicity testing over general poset domains. Symposium on Theory of Computing, 2002

[BRY14] P. Berman, S. Raskhodnikova, and G. Yaroslavtsev. $$L_p$$ testing. Symposium on Theory of Computing, 2014

[CDST15] X. Chen, A. De, R. Servedio, L.-Y. Tang. Boolean function monotonicity testing requires (almost) $$n^{1/2}$$ non-adaptive queries. Symposium on Theory of Computing, 2015

[KMS15] S. Khot, D. Minzer, and S. Safra. On monotonicity testing and Boolean Isoperimetric type theorems. ECCC, 2015

Erratum: a previous version of this post stated (incorrectly)  lower bound of $$\Omega(\sqrt{n}/\epsilon^2)$$. This has been corrected to $$\Omega(\sqrt{n})$$.

# News for December 2014

Happy new year! I just looked back at our archives, and saw that PTReview has been on from July 2013. I’m happy (and maybe mildly surprised) that it’s still going strong. And so the chronicling of $$o(n)$$ continues…

Much coolness we have to report: permutation testing, linearity testing, distribution testing, and monotonicity testing. Without further ado:

Large permutations and parameter testing by Roman Glebov, Carlos Hoppen, Tereza Klimosova, Yoshiharu Kohayakawa, Daniel Kral, and Hong Liu (arXiv). Just as typical dense graph testing involves checking properties on a random, constant-sized induced subgraph, we can look at permutation testers that test properties of permutations by sampling sub-permutations. The theory of dense graph testing is closely tied to the Szemeredi regularity lemma, the notion of graph limits, and the behavior of subgraph densities. (Check out this survey by Borgs et al.) An analogous theory for permutations has been built by a subset of the authors in a series of papers (survey). There is a notion of permutation properties/parameters that are forcible, meaning that two permutations that have similar frequencies of (a fixed, finite set of) subpermutations have close values of the parameter. This seems naturally connected to testing. Indeed, a permutation parameter that is forcible can be approximated in constant-time by simply approximating the frequencies of the subpermutations. This paper shows that converse is false: there is an constant-time approximable property that is not forcible.

A Self-Tester for Linear Functions over the Integers with an Elementary Proof of Correctness by Sheela Devadas and Ronitt Rubinfeld (arXiv). The venerable property of linearity testing needs no introduction, and has enough history to fill up this month’s post (and the next and the next). But what if the domain is n-bit integers and we care about running time in terms of bit operations? Suppose we wish to check if a program supposed to compute $$f(x) = b\cdot x$$ (for fixed, known $$b$$). Inputs like $$2^k$$ are easy to evaluate by the checker, and could be used to get faster checkers. This paper gives testers for linearity, multivariate linear functions, and multilinear functions, where computation time is always linear in sample complexity.

$$\ell_p$$ Testing and Learning of Discrete Distributions by Bo Waggoner (arXiv). Distribution testing needs little introduction, and we have seen much progress over our time in PTReview. Let’s start with the basics. The seminal work of Batu et al. and later Paninksi showed that the sample complexity of testing uniformity of a distribution $$\mathcal{D}$$ over universe $$[n]$$ is $$\Theta(\sqrt{n}/\epsilon^2)$$. Meaning, the sample complexity of checking $$\|\mathcal{D} – \mathcal{U}\|_1 > \epsilon$$ (where $$\mathcal{U}$$ is uniform on $$[n]$$) is $$\Theta(\sqrt{n}/\epsilon^2)$$. But what if we had $$\ll \sqrt{n}/\epsilon^2$$ samples? From prior work, nothing could be inferred. (Update: As Ilias Diakonikolas pointed out to me, the $$\ell_2$$ results were previously known, both by the Batu et al. work and a recent paper by Chan et al. that settles the $$\ell_2$$ question.) This paper changes that, and says that we can still infer something about other $$\ell_p$$ norms, $$\|\mathcal{D} – \mathcal{U}\|_p$$. What impressed me about this paper is the detailed understand of the interplay between $$n, \epsilon, p$$. For example, the sample complexity of uniformity testing over $$\ell_p$$-norm for any $$p > 1$$ is independent of $$n$$. There are many, many results in this paper with an intriguing threshold phenomenon at $$p=2$$. For distribution testing in practice, I would think that this result would be of much significance.

New algorithms and lower bounds for monotonicity testing by Xi Chen, Rocco A. Servedio, and Li-Yang Tan (arXiv). Ah yes, Boolean monotonicity testing. Consider the standard coordinate wise partial order on $$\{0,1\}^n$$, given by $$\prec$$. A function $$f:\{0,1\}^n \rightarrow \{0,1\}$$ is monotone if $$\forall x \prec y, f(x) \leq f(y)$$. The complexity of property testing (Boolean) monotonicity is one of those tantalizing, simple-to-state questions that is still quite open. I’ll spare you the full story and the epsilons, but here’s the status. The best upper bound is a non-adaptive, one-sided $$O(n^{7/8})$$ tester by Chakrabarty and Seshadhri. The best lower bound is a non-adaptive, one-sided lower bound of $$\Omega(\sqrt{n})$$ by Fischer et al. This implies an $$\Omega(\log\log n)$$ lower bound for general testers. This paper changes all of this. The authors prove a $$\Omega(n^{1/5})$$ lower bound for two-sided non-adaptive testers, leading to an exponentially better $$\Omega(\log n)$$ lower bound for general testers. The main insight is to focus on monotone vs non-monotone families of linear threshold functions, and show that procedures making few (non-adaptive) queries cannot distinguish between such families. The main hammer is to use recent Central Limit Theorems. As an added bonus, this paper improves the upper bound for monotonicity testing to $$O(n^{5/6})$$, with a modified tester (and better analysis) of Chakrabarty and Seshadhri. But can the lower bound be improved?

Boolean function monotonicity testing requires (almost) $$n^{1/2}$$ non-adaptive queries by Xi Chen, Anindya De, Rocco A. Servedio, and Li-Yang Tan (arXiv). Yes, the lower bound can be improved. This paper gets the non-adaptive two-sided bound to (almost) $$\Omega(\sqrt{n})$$, matching the one-sided Fischer et al bound. The paper proves improved Central Limit Theorems, tailored for this application. The authors, and I with them, believe that this is the true complexity. At that intriguing note, we end 2014!

# New for August 2014

We’ve got a nice collection of papers this August, ranging from graph property testing, locally testable codes, to new lower bound approaches.

Deconstructions of Reductions from Communication Complexity to Property Testing using Generalized Parity Decision Trees by Roei Tell (ECCC). Blais, Brody, and Matulef (BBM) introduced the beautiful and powerful method of using communication complexity to derive property testing lower bounds. This result is all about understanding this approach is greater depth, and really trying to get at the crux of the reductions. The BBM approach uses rather heavy hammers (communication complexity lower bounds), so could we get away with weaker hammers to get property testing lower bounds? The key insight is that less powerful models can be reduced to existing property testing problems, and these are equivalent to the BBM reductions. While this (as of now) does not lead to newer bounds, it does shed significant light on existing bounds. The weaker model is essentially that of linear decision trees. Such reductions were also discovered by Brushundi, Chakraborty, and Kulkarni, proving $$\Omega(k)$$ lower bounds for testing $$k$$-linearity.

Locally Correctable and Testable Codes Approaching the Singleton Bound by Or Meir (ECCC). Locally Testable Codes (LTCs) are codes that have a constant (or polylogarithmic) time tester. It is known that such codes with minor caveats cannot have a constant rate (a number of papers on this topic: Katz and Trevisan, Dinur and Kaufman, Ben-Sasson and Viderman). But when the query complexity can be $$n^\beta$$ (where $$n$$ is the codeword size), the rate can be arbitrarily close to $$1$$ (Viderman, and more refs in the paper). This result takes this fact to the very extreme. The Singleton Bound is a lower bound between the tradeoff between rate and relative distance of a code. Amazingly, this lower bound tradeoff can be achieved by an LTC (and also Locally Correctable Code) testable with $$n^\beta$$ queries!

Complexity of Nondeterministic Graph Parameter Testing by Marek Karpinksi and Ronald Mark&#243 (arXiv). It is easiest to think of nondeterminstic graph testing with the example of testing bipartiteness for dense graphs. Suppose we wish to test $$G$$ for bipartiteness. As a “certificate”, we are provided a candidate partition $$V = V_1 \cup V_2$$. There is a “simple” constant-time tester that can property testing if $$G$$ is bipartite with respect to this partition. (Just sample two uniform random vertices, and check if the edge between them respects the bipartition.) So we say that bipartiteness is non-determinstically testable: if $$G$$ is far from being bipartite, no certificate can convince the simple tester. The full definition is somewhat complex, but it is basically the same idea. (In this context, “testablility” always refers to constant time testable.) Lovász and Vestergombi proved the striking result that any non-deterministically testable property is also deterministically testable, but their proof did not give an explicit bound on the query complexity. Gishboliner and Shapira fixed that, and gave a tower-type bound for the query complexity. So if the query complexity of the underlying “simple” tester is $$q$$, then the non-deterministic property has a tester whose query complexity is a tower of height $$q$$. This results goes much further, and shows the final query complexity is at most triply exponential in $$q$$. This is quite significant, since it is uses a weaker regularity approach than previous results.

# News for May 2014

May seems to be a slow month for property testing, with only one paper on locally testable codes. Nonetheless, we have unearthed a paper from April (missed despite our best efforts) on a related topic of hidden set approximations.

The Power of an Example: Hidden Set Size Approximation Using Group Queries and Conditional Sampling by Dana Ron and Gilad Tsur (arXiv). Consider a known universe $$U$$ and an unknown subset $$S$$. Our aim is to approximately determine $$|S|$$. We can perform subset queries: for a query subset $$T$$, an oracle returns whether $$T \cap S \neq \emptyset$$ or (a more powerful model) the oracle returns a uniform random element in $$T \cap S$$. This paper gives a detailed study of this problem under various settings. There is a fascinating array of upper and lower bounds, including situations where $$U$$ is ordered and $$T$$ must be an interval, the difference between adaptivity and non-adaptivity, arbitrary subset queries, and much more. The topic seems to be quite rich, and it should yield some nice sets of problems to study further. It appears that some lower bounds are quite open. Students, pay attention!

Limitations on Testable Affine-Invariant Codes in the High-Rate Regime by Venkatesan Guruswami, Madhu Sudan, Ameya Velingker, and Carol Wang (ECCC). Locally Testable Codes (LTCs) are codes that (typically) have a constant time tester. Interestingly, recent work on testing codes using a linear number of queries (some small constant fraction of the code length) has connections to the Small Set Expansion (SSE) problem, which in turn is related to the famous Unique Games Conjecture (UGC). A paper by Barak et al. gives a construction of a small set expander (a graph where all small sets have high conductance) where the Laplacian has many small eigenvalues (so it’s “difficult” to find a low conductance cut). This construction uses high-rate LTCs that are linear time testable. In some sense, the better the rate, the better construction one gets. This paper asks how far this idea can go. So the idea is to construct the best rate linear-time testable LTC. Reed-Muller codes fall in this category, but their rate is quite far from optimal. Recent work by Guo et al. gives a slightly better construction, but this is far from the (near) optimal rate achieved by BCH codes (which are not known to be testable). All in all, this paper shows that for affine-invariant codes, the Guo et al. construction is essentially the best testable high-rate code one can get. I have obviously done no justice to the depth of the connections, the intricacies of the parameters, or the overall coolness of this subject. So go read the paper!

# News for February 2014

So what did we see in February 2014?

Local Algorithms for Sparse Spanning Graphs by Reut Levi, Dana Ron, and Ronitt Rubinfeld (arXiv). Given a graph $$G = (V,E)$$, the aim is the find a small sparse spanning subgraph $$G’$$ in the local algorithm setting. So, given any edge $$(u,v) \in E$$, we must determine in sublinear time whether this edge exists in $$G’$$. The main result is an $$\Omega(|V|)$$ lower bound, and a matching algorithm when $$G$$ is either an expander or hyperfinite.

A Characterization of Locally Testable Affine-Invariant Properties via Decomposition Theorems by Yuichi Yoshida (arXiv). This result gives a complete characterization of (two-sided) locally testable affine-invariant properties. These are based on existing decomposition theorems for functions into structured and pseudorandom parts. This result is analogous to that of Alon, Fischer, Newman, and Shapira (link) on characterizations for testability over dense graphs.

Testing probability distributions underlying aggregated data by Clément Canonne and Ronitt Rubinfeld (arXiv). There has been much recent work on distribution testing, and this work discusses stronger models where better algorithms can be obtained. Given a distribution $$D$$ over $$[n]$$, we can get samples from this distribution, and also query the probability $$D(i)$$ of any $$i \in [n]$$. With these “dual” queries, one gets distribution testers for a variety of problems surpassing previous lower bounds. Interestingly, this is stronger than the conditional model introduced recently (here and here).

Strong Locally Testable Codes with Relaxed Local Decoders by Oded Goldreich, Tom Gur, and Ilan Komargodski (ECCC). For a locally testable code (LTC), the property of being a codeword is efficiently (constant time) testable. A locally decodable code (LDC) is a code from which individual bits of the original message can be recovered in constant time. A relaxed local decoder is allowed to declare failure for recovered a small fraction of message bits. This allows for bypassing the difficult (and prominent) open problem of constructing small LDCs. This result gives a construction of near-linear LTCs with a relaxed local decoder. This result leads to a property that is hard to test but easy to verify (in the MA proofs of proximity sense, refer to last month’s open problem).