Despite a wealth of February conference deadlines, papers were fairly sparse. We found ~~two~~ ~~EDIT: three~~ EDIT EDIT: four papers for the month of February; please share if we missed any relevant papers.

**Monotone probability distributions over the Boolean cube can be learned with sublinear samples**, by Ronitt Rubinfeld and Arsen Vasilyan (arXiv). By now, it is well known that assuming an (unknown) distribution enjoys some sort of structure can lead to more efficient algorithms for learning and testing. Often one proves that the structure permits a convenient representation, and exploits this representation to solve the problem at hand. This paper studies the learning of monotone distributions over the Boolean hypercube. The authors exploit and extend a structural statement about monotone Boolean *functions* by Blais, Håstad, Servedio, and Tan, using it to provide sublinear algorithms for estimating the support size, distance to uniformity, and the distribution itself.

**Locally Private Hypothesis Selection**, by Sivakanth Gopi, Gautam Kamath, Janardhan Kulkarni, Aleksandar Nikolov, Zhiwei Steven Wu, and Huanyu Zhang (arXiv). Given a collection of \(k\) distributions and a set of samples from one of them, can we identify which distribution it is? This paper studies this problem (and an agnostic generalization of it) under the constraint of *local differential privacy*. The authors show that this problem requires \(\Omega(k)\) samples, in contrast to the \(O(\log k)\) complexity in the non-private model. Furthermore, they give \(\tilde O(k)\)-sample upper bounds in various interactivity models.

**Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning**, by Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, and N. V. Vinodchandran (arXiv). Given samples from two distributions, can you estimate the total variation distance between them? This paper gives a framework for solving this problem for *structured* distribution classes, including Ising models, Bayesian networks, Gaussians, and causal models. The approach can be decomposed into two steps: properly learning the distributions, and then estimating the distance between the two hypotheses. Challenges arise when densities are hard to compute exactly.
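To make the learn-then-estimate decomposition concrete, here is a toy sketch of our own, not the paper's algorithm. It assumes the simplest structured class, product distributions over \(\{0,1\}^n\), where densities are easy to compute, so it sidesteps exactly the challenge mentioned above. The function names (`learn_product`, `estimate_tv`, and so on) and the clipping constant are illustrative choices. The distance step relies on the identity \(d_{TV}(P,Q) = \mathbb{E}_{x \sim P}[\max(0, 1 - Q(x)/P(x))]\), which holds whenever \(P(x) > 0\) on the sampled points.

```python
import random

def learn_product(samples, clip=0.01):
    """Properly learn a product distribution: estimate each coordinate's bias.

    Biases are clipped away from 0 and 1 (an illustrative choice) so that
    every point has positive density under the hypothesis.
    """
    m, n = len(samples), len(samples[0])
    return [min(max(sum(x[i] for x in samples) / m, clip), 1 - clip)
            for i in range(n)]

def density(biases, x):
    """Probability of the bit-vector x under the product distribution."""
    p = 1.0
    for b, xi in zip(biases, x):
        p *= b if xi else 1 - b
    return p

def draw(biases, rng):
    """Draw one sample from the product distribution."""
    return tuple(int(rng.random() < b) for b in biases)

def estimate_tv(p_biases, q_biases, trials, rng):
    """Monte Carlo estimate of TV(P, Q) = E_{x~P}[max(0, 1 - Q(x)/P(x))]."""
    total = 0.0
    for _ in range(trials):
        x = draw(p_biases, rng)
        total += max(0.0, 1.0 - density(q_biases, x) / density(p_biases, x))
    return total / trials

if __name__ == "__main__":
    rng = random.Random(0)
    true_p, true_q = [0.5, 0.5, 0.5], [0.7, 0.5, 0.3]
    # Step 1: learn a hypothesis for each unknown distribution from samples.
    hyp_p = learn_product([draw(true_p, rng) for _ in range(5000)])
    hyp_q = learn_product([draw(true_q, rng) for _ in range(5000)])
    # Step 2: estimate the distance between the two hypotheses.
    print(estimate_tv(hyp_p, hyp_q, 20000, rng))
```

If both hypotheses are close to the truth in total variation, the triangle inequality ensures that the distance between hypotheses is close to the distance between the original distributions.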

**Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Discrete Distributions**, by Yi Hao and Alon Orlitsky (arXiv). The histogram of a dataset is the collection of frequency counts of domain elements. The *profile* of a dataset can be succinctly described as the histogram of the histogram. Recent works have shown that, in some sense, discarding information about your dataset by looking solely at the profile can be beneficial for certain problems in which it is “universal”. This work explores two new quantities, the entropy and dimension of the profile, which turn out to play a key role in quantifying the performance of estimators based on the profile.
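For readers who prefer code, a minimal sketch of the histogram and profile definitions above (the helper names `histogram` and `profile` are ours, not the paper's):

```python
from collections import Counter

def histogram(samples):
    """Map each observed domain element to its frequency count."""
    return Counter(samples)

def profile(samples):
    """Map each frequency count to the number of elements having that count,
    i.e. the histogram of the histogram."""
    return Counter(histogram(samples).values())

if __name__ == "__main__":
    samples = ["a", "b", "a", "c", "b", "a", "d"]
    print(dict(histogram(samples)))  # element -> count
    print(dict(profile(samples)))    # count -> number of elements with that count
```

Note that the profile forgets element labels: renaming the domain elements changes the histogram but leaves the profile unchanged.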