Paper Digest: COLT 2014 Highlights
The Annual Conference on Learning Theory (COLT) focuses on theoretical aspects of machine learning and related topics.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: COLT 2014 Papers
# | Title | Authors | Highlight |
---|---|---|---|
1 | Preface | Maria Florina Balcan, Csaba Szepesvári | Preface |
2 | Open Problem: Tightness of maximum likelihood semidefinite relaxations | Afonso S. Bandeira, Yuehaw Khoo, Amit Singer | As an illustrative example, we focus on the generalized Procrustes problem. |
3 | Open Problem: A (Missing) Boosting-type Convergence Result for AdaBoost.MH with Factorized Multi-class Classifiers | Balázs Kégl | In this open problem paper we take a step back to the basic setup of boosting generic multi-class factorized (Hamming) classifiers (so no trees), and state the classical problem of boosting-like convergence of the training error. |
4 | Open Problem: Finding Good Cascade Sampling Processes for the Network Inference Problem | Manuel Gomez-Rodriguez, Le Song, Bernhard Schoelkopf | Information spreads across social and technological networks, but often the network structures are hidden and we only observe the traces left by the diffusion processes, called cascades. |
5 | Open Problem: Tensor Decompositions: Algorithms up to the Uniqueness Threshold? | Aditya Bhaskara, Moses Charikar, Ankur Moitra, Aravindan Vijayaraghavan | Open Problem: Tensor Decompositions: Algorithms up to the Uniqueness Threshold? |
6 | Open Problem: The Statistical Query Complexity of Learning Sparse Halfspaces | Vitaly Feldman | We propose a potentially easier question: what is the query complexity of this learning problem in the statistical query (SQ) model of Kearns (1998)? |
7 | Open Problem: Online Local Learning | Paul Christiano | The question we pose is: how general is this phenomenon? |
8 | Open Problem: Shifting Experts on Easy Data | Manfred K. Warmuth, Wouter M. Koolen | In the full information setting, the FlipFlop algorithm by De Rooij et al. (2014) combines the best of the iid optimal Follow-The-Leader (FL) and the worst-case-safe Hedge algorithms, whereas in the bandit information case SAO by Bubeck and Slivkins (2012) competes with the iid optimal UCB and the worst-case-safe EXP3. |
9 | Open Problem: Efficient Online Sparse Regression | Satyen Kale | We provide one natural formulation as an online sparse regression problem with squared loss, and ask whether it is possible to achieve sublinear regret with efficient algorithms (i.e. polynomial running time in the natural parameters of the problem). |
10 | Distribution-independent Reliable Learning | Varun Kanade, Justin Thaler | We study several questions in the *reliable* agnostic learning framework of Kalai et al. (2009), which captures learning tasks in which one type of error is costlier than other types. |
11 | Learning without concentration | Shahar Mendelson | We obtain sharp bounds on the convergence rate of Empirical Risk Minimization performed in a convex class and with respect to the squared loss, without any boundedness assumptions on class members or on the target. |
12 | Uniqueness of Ordinal Embedding | Matthäus Kleindessner, Ulrike Luxburg | Uniqueness of Ordinal Embedding |
13 | Bayes-Optimal Scorers for Bipartite Ranking | Aditya Krishna Menon, Robert C. Williamson | We address the following seemingly simple question: what is the Bayes-optimal scorer for a bipartite ranking risk? |
14 | Multiarmed Bandits With Limited Expert Advice | Satyen Kale | We consider the problem of minimizing regret in the setting of advice-efficient multiarmed bandits with expert advice. |
15 | Learning Sparsely Used Overcomplete Dictionaries | Alekh Agarwal, Animashree Anandkumar, Prateek Jain, Praneeth Netrapalli, Rashish Tandon | We consider the problem of learning sparsely used overcomplete dictionaries, where each observation is a sparse combination of elements from an unknown overcomplete dictionary. |
16 | Community Detection via Random and Adaptive Sampling | Se-Young Yun, Alexandre Proutiere | In this paper, we consider networks consisting of a finite number of non-overlapping communities. |
17 | A second-order bound with excess losses | Pierre Gaillard, Gilles Stoltz, Tim van Erven | A second-order bound with excess losses |
18 | Logistic Regression: Tight Bounds for Stochastic and Online Optimization | Elad Hazan, Tomer Koren, Kfir Y. Levy | In this paper we investigate the question of whether these smoothness and convexity properties make the logistic loss preferable to other widely considered options such as the hinge loss. |
19 | Higher-Order Regret Bounds with Switching Costs | Eyal Gofer | This work examines online linear optimization with full information and switching costs (SCs) and focuses on regret bounds that depend on properties of the loss sequences. |
20 | The Complexity of Learning Halfspaces using Generalized Linear Methods | Amit Daniely, Nati Linial, Shai Shalev-Shwartz | We study the performance of this approach in the problem of (agnostically and improperly) learning halfspaces with margin γ. |
21 | Optimal learners for multiclass problems | Amit Daniely, Shai Shalev-Shwartz | In this paper we seek a generic optimal learner for *multiclass* prediction. |
22 | Stochastic Regret Minimization via Thompson Sampling | Sudipto Guha, Kamesh Munagala | Our goal in this paper is to make progress towards understanding the empirical success of this policy. |
23 | Approachability in unknown games: Online learning meets multi-objective optimization | Shie Mannor, Vianney Perchet, Gilles Stoltz | We revisit the classical setting and consider the setting where the player has a preference relation between target sets: she wishes to approach the smallest (“best”) set possible given the observed average payoffs in hindsight. |
24 | Belief propagation, robust reconstruction and optimal recovery of block models | Elchanan Mossel, Joe Neeman, Allan Sly | We consider the problem of reconstructing sparse symmetric block models with two blocks, where the inter- and intra-block edge probabilities are a/n and b/n respectively. |
25 | Sample Compression for Multi-label Concept Classes | Rahim Samei, Pavel Semukhin, Boting Yang, Sandra Zilles | For a specific extension of the notion of VC-dimension to multi-label classes, we prove that every maximum multi-label class of dimension d has a sample compression scheme in which every sample is compressed to a subset of size at most d. |
26 | Finding a most biased coin with fewest flips | Karthekeyan Chandrasekaran, Richard Karp | We study the problem of learning a most biased coin among a set of coins by tossing the coins adaptively. |
27 | Volumetric Spanners: an Efficient Exploration Basis for Learning | Elad Hazan, Zohar Karnin, Raghu Meka | We define a novel geometric notion of exploration basis with low variance called volumetric spanners, and give efficient algorithms to construct such bases. |
28 | lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits | Kevin Jamieson, Matthew Malloy, Robert Nowak, Sébastien Bubeck | The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples. |
29 | An Inequality with Applications to Structured Sparsity and Multitask Dictionary Learning | Andreas Maurer, Massimiliano Pontil, Bernardino Romera-Paredes | An Inequality with Applications to Structured Sparsity and Multitask Dictionary Learning |
30 | On the Complexity of A/B Testing | Emilie Kaufmann, Olivier Cappé, Aurélien Garivier | When the distributions of the outcomes are Gaussian, we prove that the complexity of the fixed-confidence and fixed-budget settings are equivalent, and that uniform sampling of both alternatives is optimal only in the case of equal variances. |
31 | Elicitation and Identification of Properties | Ingo Steinwart, Chloé Pasin, Robert Williamson, Siyu Zhang | We extend existing results to characterize the elicitability of properties in a general setting. |
32 | The sample complexity of agnostic learning under deterministic labels | Shai Ben-David, Ruth Urner | For any d, we present classes of VC-dimension d that are learnable from Õ(d/ε)-many samples and classes that require samples of size Ω(d/ε²). |
33 | Density-preserving quantization with application to graph downsampling | Morteza Alamgir, Gábor Lugosi, Ulrike Luxburg | We consider the problem of vector quantization of i.i.d. samples drawn from a density p on ℝ^d. |
34 | A Convex Formulation for Mixed Regression with Two Components: Minimax Optimal Rates | Yudong Chen, Xinyang Yi, Constantine Caramanis | We consider the mixed regression problem with two components, under adversarial and stochastic noise. |
35 | Efficiency of conformalized ridge regression | Evgeny Burnaev, Vladimir Vovk | In this paper we explore the degree to which this additional requirement of efficiency is satisfied in the case of Bayesian ridge regression; we find that asymptotically conformal prediction sets differ little from ridge regression prediction intervals when the standard Bayesian assumptions are satisfied. |
36 | Most Correlated Arms Identification | Che-Yu Liu, S�bastien Bubeck | We study the problem of finding the most mutually correlated arms among many arms. |
37 | Fast matrix completion without the condition number | Moritz Hardt, Mary Wootters | We give the first algorithm for Matrix Completion that achieves running time and sample complexity that is polynomial in the rank of the unknown target matrix, *linear* in the dimension of the matrix, and *logarithmic* in the condition number of the matrix. |
38 | Learning Coverage Functions and Private Release of Marginals | Vitaly Feldman, Pravesh Kothari | We study the problem of approximating and learning coverage functions. |
39 | Computational Limits for Matrix Completion | Moritz Hardt, Raghu Meka, Prasad Raghavendra, Benjamin Weitz | On the technical side, we contribute several new ideas on how to encode hard combinatorial problems in low-rank optimization problems. |
40 | Robust Multi-objective Learning with Mentor Feedback | Alekh Agarwal, Ashwinkumar Badanidiyuru, Miroslav Dudík, Robert E. Schapire, Aleksandrs Slivkins | We present an algorithm with a vanishing regret compared with the optimal possible improvement, and show that our regret bound is the best possible. |
41 | Uniqueness of Tensor Decompositions with Applications to Polynomial Identifiability | Aditya Bhaskara, Moses Charikar, Aravindan Vijayaraghavan | Given the importance of Kruskal’s theorem in the tensor literature, we expect that our robust version will have several applications beyond the settings we explore in this work. |
42 | New Algorithms for Learning Incoherent and Overcomplete Dictionaries | Sanjeev Arora, Rong Ge, Ankur Moitra | This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman et al. (2012), who gave an algorithm for the undercomplete case, which is rarely the case in applications. |
43 | Online Linear Optimization via Smoothing | Jacob Abernethy, Chansoo Lee, Abhinav Sinha, Ambuj Tewari | We present a new optimization-theoretic approach to analyzing Follow-the-Leader style algorithms, particularly in the setting where perturbations are used as a tool for regularization. |
44 | Learning Mixtures of Discrete Product Distributions using Spectral Decompositions | Prateek Jain, Sewoong Oh | In this paper, we introduce a polynomial time/sample complexity method for learning a mixture of r discrete product distributions over {1, 2, …, ℓ}^n, for general ℓ and r. |
45 | Localized Complexities for Transductive Learning | Ilya Tolstikhin, Gilles Blanchard, Marius Kloft | We give a preliminary analysis of the localized complexities for the prominent case of kernel classes. |
46 | On the Consistency of Output Code Based Learning Algorithms for Multiclass Learning Problems | Harish G. Ramaswamy, Balaji Srinivasan Babu, Shivani Agarwal, Robert C. Williamson | In this paper, we consider the question of statistical consistency of such methods. |
47 | Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results | Jiaming Xu, Laurent Massoulié, Marc Lelarge | We propose a computationally efficient spectral algorithm and show it allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes. |
48 | Lower Bounds on the Performance of Polynomial-time Algorithms for Sparse Linear Regression | Yuchen Zhang, Martin J. Wainwright, Michael I. Jordan | Under a standard assumption in complexity theory (NP not in P/poly), we demonstrate a gap between the minimax prediction risk for sparse linear regression that can be achieved by polynomial-time algorithms, and that achieved by optimal algorithms. |
49 | Follow the Leader with Dropout Perturbations | Tim Van Erven, Wojciech Kotlowski, Manfred K. Warmuth | We consider online prediction with expert advice. |
50 | Lipschitz Bandits: Regret Lower Bound and Optimal Algorithms | Stefan Magureanu, Richard Combes, Alexandre Proutiere | For discrete Lipschitz bandits, we derive asymptotic problem specific lower bounds for the regret satisfied by any algorithm, and propose OSLB and CKL-UCB, two algorithms that efficiently exploit the Lipschitz structure of the problem. |
51 | Sample Complexity Bounds on Differentially Private Learning via Communication Complexity | Vitaly Feldman, David Xiao | In this work we analyze the sample complexity of classification by differentially private algorithms. |
52 | Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations | H. Brendan McMahan, Francesco Orabona | We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. |
53 | Principal Component Analysis and Higher Correlations for Distributed Data | Ravi Kannan, Santosh Vempala, David Woodruff | We present algorithms for two illustrative problems on massive data sets: (1) computing a low-rank approximation of a matrix A = A^1 + A^2 + … + A^s, with matrix A^t stored on server t and (2) computing a function of a vector a_1 + a_2 + … + a_s, where server t has the vector a_t; this includes the well-studied special case of computing frequency moments and separable functions, as well as higher-order correlations such as the number of subgraphs of a specified type occurring in a graph. |
54 | Compressed Counting Meets Compressed Sensing | Ping Li, Cun-Hui Zhang, Tong Zhang | By observing that natural signals (e.g., images or network data) are often nonnegative, we propose a framework for nonnegative signal recovery using *Compressed Counting* (CC). |
55 | The Geometry of Losses | Robert C. Williamson | In doing so we show a formal connection between proper losses and norms. |
56 | Resourceful Contextual Bandits | Ashwinkumar Badanidiyuru, John Langford, Aleksandrs Slivkins | We design the first algorithm for solving these problems that improves over a trivial reduction to the non-contextual case. |
57 | The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures | Joseph Anderson, Mikhail Belkin, Navin Goyal, Luis Rademacher, James Voss | In this paper we show that very large mixtures of Gaussians are efficiently learnable in high dimension. |
58 | Near-Optimal Herding | Nick Harvey, Samira Samadi | We present a new polynomial-time algorithm that solves the sampling problem with error O(√d · log^{2.5}|X| / t), assuming that X is finite. |
59 | Faster and Sample Near-Optimal Algorithms for Proper Learning Mixtures of Gaussians | Constantinos Daskalakis, Gautam Kamath | One of our main contributions is an improved and generalized algorithm for selecting a good candidate distribution from among competing hypotheses. |
60 | Online Learning with Composite Loss Functions | Ofer Dekel, Jian Ding, Tomer Koren, Yuval Peres | We study a new class of online learning problems where each of the online algorithm’s actions is assigned an adversarial value, and the loss of the algorithm at each step is a known and deterministic function of the values assigned to its recent actions. |
61 | Online Non-Parametric Regression | Alexander Rakhlin, Karthik Sridharan | We establish optimal rates for online regression for arbitrary classes of regression functions in terms of the sequential entropy introduced in (Rakhlin et al., 2010). |