Happy new year! And now for the last post of 2019 papers. We have found a diverse collection of six papers, ranging from classic property testing topics to new perspectives on sublinear computation.
Sublinear Optimal Policy Value Estimation in Contextual Bandits by Weihao Kong, Gregory Valiant, Emma Brunskill (arXiv). This isn’t our usual sublinear paper, but it is definitely of interest to us sublinear folk. Let’s start with a stripped down definition of the problem (or rather, game). There are \(K\) “arms”, where the \(i\)th arm is represented by an unknown vector in \(\beta_i \in \mathbb{R}^d\). We are presented with a “context”, which is a vector \(x \in \mathbb{R}^d\). Our job is to choose an arm \(i \in [K]\). We get the reward \(x \cdot \beta_i\) (with some noise added). The contexts appears from a known distribution. To aid us, we observe the rewards of \(N\) iid contexts, so we observe a total of \(T = KN\) rewards. There has been much work on figuring out the minimum value of \(T\) required to learn the optimal policy. One requires at least \(d\) (the dimension) samples to estimate any of the arm vectors. This papers shows that one can actually estimate the expected reward of the optimal policy, without being able to describe it, with sublinear in \(d\) (technically, \(\widetilde{O}(\sqrt{d})\)) samples. We see this a lot in property testing, where producing the “optimal” solution for a problem requires linear-in-dimension samples, but estimating the optimal value is much cheaper (consider, for example, the situation of linearity testing, where we wish to find the closest linear function).
Sublinear Time Numerical Linear Algebra for Structured Matrices by Xiaofei Shi and David P. Woodruff (arXiv). This follows the recent linear of advances in sublinear time linear algebra. Given a matrix \(A \in \mathbb{R}^{n \times d}\), the aim is to get algorithms that only look at \(o(nnz(A))\) entries (where \(nnz(A)\) is the number of non-zeroes, or the support). Consider the classic talk of low rank approximation. Unfortunately, suppose one entry is extremely large, and the others are extremely small. One has to find this large entry for any reasonable approximation, which (in the worst-case) requires \(nnz(A)\) queries into \(A\). Thus, previous papers make structural assumption (such as, \(A\) being a Vandermonde matrix) to get sublinear bounds. This paper gives a clean black box method to get a variety of such results. Basically, one can replace the usual \(nnz(A)\) term in many algorithms, by \(T(A)\), which is the time to compute the matrix-vector product \( Ay\), for \(y \in \mathbb{R}^d\). In many cases \(T(A) = \widetilde{O}(n)\), which can be significantly smaller than \(nnz(A)\). This paper gives such results for low-rank approximations and many regression problems.
Robust and Sample Optimal Algorithms for PSD Low-Rank Approximation by Ainesh Bakshi, Nadiia Chepurko, and David P. Woodruff. Consider the low rank problem discussed above. As mentioned in the previous paragraph, we need structural assumptions on \(A\). Previous results gave sublinear time low-rank approximations assuming that \(A\) is positive semidefinite (PSD). The aim is to get a rank \(k\) matrix \(B\) such that \(\|A-B\|^2_2\) is at most \((1+\epsilon)\)-times the optimal such approximation. The previous algorithm of Musco-Woodruff makes \(\widetilde{O}(nk/\epsilon^{2.5})\) queries in to \(A\), while there is a lower bound of \(\Omega(nk/\epsilon)\). This gap between the complexities is resolved in this paper with an upper bound of \(\widetilde{O}(nk/\epsilon)\) queries.
Constructive derandomization of query algorithms by Guy Blanc, Jane Lange, and Li-Yang Tan (arXiv). This paper discusses an intriguing angle to sublinear question: when can they be derandomized? Abstractly, consider a randomized algorithm \(R\) that makes \(q\) queries. Think of \(R\) as a function \(R(x,r)\), where \(x\) is the input, and \(r\) is the randomness. We would like to design a deterministic algorithm \(D\) making, ideally, close to \(q\) queries and approximates \(\mathop{E}_r[R(x,r)]\). For starters, consider some distribution over \(x\), and suppose we want \(\mathbb{E}_x[D(x) – \mathbb{E}_r[R(x,r)]] < \epsilon\). By (the easy direction of) Yao’s minimax lemma, one can show the existence of such an algorithm \(D\) that makes \(O(q/\epsilon)\) queries. But how to explicitly construct it? Indeed, the first result of this paper gives a “meta-algorithm” that takes as input the description of \(R\) (which is of size \(N\)), has running time \(poly(N)2^{O(q/\epsilon)}\) and outputs a description of \(D\). When \(R\) satisfies the stronger property of “bounded error”, one can get a \(O(q^3)\)-query algorithm \(D\) that approximates \(\mathop{E}_r[R(x,r)]\) for all \(x\) (again, the existence is proven by a classic theorem of Nisan). Overall, this paper gives a method to derandomize sublinear time algorithms, and I wonder if there could be some applications of this method for proving lower bounds. After all, Yao’s minimax theorem is the tool for property testing lower bounds, and any perspective on Yao’s theorem is likely of relevance.
Testing Membership for Timed Automata by Richard Lassaigne and Michel de Rougemont (arXiv). Property testing for regular languages is a classic result in sublinear algorithms. This paper focuses on the more complex notion of timed automata. The technical definition is quite complicated, but here’s an overview. There is a finite automaton and a collection of “clocks”. Imagine a string being processed, where each alphabet symbol appears with a new timestamp. Thus, the input word is called a “timed word”. The transitions of the automaton involve the new symbol read, as well as constraints involving the clock times and the timestamp. Thus, we can enforce conditions like “only transition if another symbol is read within a single time unit”. In general, deciding whether a timed word is accepted by a timed automaton is NP-complete. This papers studies the property testing viewpoint. The paper gives a new definition of “timed edit distance” between timed words. The main result shows that one can distinguish time words accepted by a timed automaton from words that are far (according to timed edit distance), by querying a constant number of word positions.
On the query complexity of estimating the distance to hereditary graph properties by Carlos Hoppen, Yoshiharu Kohayakawa, Richard Lang, Hanno Lefmann, and Henrique Stagni (arXiv). This paper concerns the classic setting of property testing of dense graphs. It is well-known that all hereditary graph properties are testable, and are moreover, one can estimate the distance to the property in time that only depends on \(\varepsilon\). Unfortunately, the queries complexities have large tower dependencies on \(\varepsilon\), arising from the use of the Szemeredi regularity lemma. The question of property testing in dense graphs can be reduced to finding “removal” lemmas (such as the classic triangle remove lemma). Such a lemma states that if at least \(\varepsilon n^2\) edges need to be removed from \(G\) to destroy all “forbidden subgraphs”, then there must be “many” forbidden subgraphs in \(G\). There is much recent research on finding families of forbidden subgraphs, where the “many” (in the above statement) is at least \(poly(\varepsilon)\) times the trivial upper bound. This paper shows that one can also estimate the distance to any hereditary property, in a query complexity that depends directly on the corresponding removal lemma parameters. As a compelling example, one can estimate the distance to being chordal in \(\exp(1/\varepsilon)\) queries, a significant improvement over standard tower bounds.