Apologies for the late post! After the relative silence of last month, we’re getting back to a moderate cadence of papers. Some distribution testing, quantum testing, learning and testing. We’ve also added a non-sublinear paper on distributions that should be of interest. And here’s the roundup.
Efficient Testable Learning of General Halfspaces with Adversarial Label Noise by Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, and Nikos Zarifis (arXiv). This paper is on the recent model of testable learning introduced by Rubinfeld-Vasiliyan. Consider learning halfspaces over the Gaussian distribution. We have sample access to an input function f:\mathbf{R}^d \to \{0,1\}, where our aim is learn the closest halfspace to f. The samples comes from some fixed, underlying distribution \mathcal{D}. But it is often infeasible to validate this distributional assumption, even in the property testing framework. A tester-learner pair will try to test this assumption, and if it accepts, apply the learning algorithm. The guarantee is: if \mathcal{D} is indeed (say) Gaussian, then the learning algorithms works as desired. If the tester accepts and the learner outputs a hypothesis (say, halfspace) h, then it is guaranteed that h is close to f according to \mathcal{D}, even if \mathcal{D} is far from Gaussian. This last part makes the whole setup interesting; the distribution tester might fail to reject the distribution, but we can trust the output hypothesis! There have been many results on learning homogenous halfspaces under the Gaussian distribution. So the hypothesis class consists of halfspaces going through the origin. This paper is about general (inhomogenous) halfspaces. At first blush, this might seem like a simple issue; we’re just adding a constant term to the halfspace. But existing techniques break down, often because they’re solving an optimization problem of minimizing error around a band close the origin. This paper gives a careful reduction to the “nearly” homogeneous case, and gives the first polynomial sample tester-learner pair for this problem.
Tolerant testing stabilizer states by Srinivasan Arunachalam and Arkopal Dutt (arXiv). Let us start with a familiar classical problem, low degree testing. Given access to a function f:\{0,1\}^n \to \{0,1\}, we wish to test if f is a quadratic polynomial (over \mathbb{F}_2). There exist testers based on the Gowers-norm: basically, compute various (constant dimensional) discrete derivatives and check they are consistent with a quadratic function. These discrete derivatives can be analyzed using Fourier analysis, and the main aim is show that a function that “locally” (in terms of derivatives) behaves like a quadratic is indeed globally close to being one. This method can be used to get tolerant testers for quadratic polynomials. This paper is on a quantum generalization of this testing result. The input is a qubit |\psi_f\rangle, promised to be a “phase state”. A phase state has a Boolean function f embedded in it, because one can write a phase state as a linear combination of 2^n “base” states, with the coefficients being Boolean. A “stabilizer state” is one where the function f is a quadratic (or so I believe, I’m probably exposing my ignorance of quantum mechanics). This paper uses the Gowers norm techniques to give the first quantum tolerant tester for stabilizer states.
Improved Bounds for High-Dimensional Equivalence and Product Testing using Subcube Queries by Tomer Adar, Eldar Fischer, and Amit Levi (arXiv). Consider distribution testing on high-dimensional data, so the domain is \{0,1\}^n. A nice model for distribution tester is the subcube conditioning model of Bhattacharyya and Chakraborty. Suppose we fix any subset S \subseteq [n] coordinates with x_S. We can generate a sample y from the distribution, conditioned on y_S = x_S (meaning, y agrees with x on S). The problem is to perform equivalence testing of distributions on this model. Previous results got O(n^2/\varepsilon^2) query algorithms, and this paper give a significantly improved algorithm making O(n/\varepsilon^2) queries. Interestingly, the algorithm only makes weaker queries. One distribution is only accessed by “marginal queries”. So, given x_S as before, but we only sample the marginal distribution over coordinate i, conditioned on S being fixed as x_S. (Hence, the output is a single bit. Also, we note that the other distribution is accessed by prefix queries, a weaker version of subcube queries.) These generalizations lead to more results, on testing equivalence in the interval query model, and for testing the property of product distributions. This paper also proves a \Omega(\sqrt{n}/\varepsilon^2) lower bound for testing product distributions.
Parallel Sampling via Counting by Nima Anari, Ruiquan Gao, and Aviad Rubinstein (arXiv). This isn’t a “typical sublinear algorithm” per se, but I think it is quite interesting to those of us who think about sublinearity, adaptivity, and distributions. This result has connections to the previous paper. Our aim is to sample from an unknown distribution \mu over [q]^n. We have access to “marginal queries” as described above. This problem appears in large language models, wherein neural nets are trained on various marginals (next word), but the output is a sentence (list of words). Observe there is a simple “O(n) round” algorithm. Without any fixing, query the marginal of the first coordinate. Fix the first coordinate, query the marginal of the second coordinate. Fix the first two coordinates, rinse and repeat. This requires n rounds with the marginal query. In this model, polynomially many marginal queries can be made in a single round, and the aim is to minimize the number of rounds (basically, bounding the adaptivity). This paper gives a \widetilde{O}(n^{2/3}) round procedure for sampling, and shows an \Omega(n^{1/3}) lower bound.