Paper Digest: ICML 2021 Highlights
To help the community quickly catch up on the work presented at this conference, the Paper Digest team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper. Based in New York, Paper Digest is dedicated to producing high-quality text analysis results that people can actually use on a daily basis. In the past four years, we have been serving users across the world with a number of exclusive services on ranking, search, tracking, and review. This month we feature the Literature Review Generator, which automatically generates a literature review around any topic.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated with new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: Paper Digest: ICML 2021 Highlights
# | Paper | Author(s) |
---|---|---|
1 | A New Representation of Successor Features for Transfer Across Dissimilar Environments Highlight: To address this problem, we propose an approach based on successor features in which we model successor feature functions with Gaussian Processes permitting the source successor features to be treated as noisy measurements of the target successor feature function. | Majid Abdolshah; Hung Le; Thommen Karimpanal George; Sunil Gupta; Santu Rana; Svetha Venkatesh; |
2 | Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling Highlight: In this paper, we propose a novel scheme for desynchronizing the evaluation of clauses, eliminating the voting bottleneck. | Kuruge Darshana Abeyrathna; Bimal Bhattarai; Morten Goodwin; Saeed Rahimi Gorji; Ole-Christoffer Granmo; Lei Jiao; Rupsa Saha; Rohan K Yadav; |
3 | Debiasing Model Updates for Improving Personalized Federated Training Highlight: We propose a novel method for federated learning that is customized specifically to the objective of a given edge device. | Durmus Alp Emre Acar; Yue Zhao; Ruizhao Zhu; Ramon Matas; Matthew Mattina; Paul Whatmough; Venkatesh Saligrama; |
4 | Memory Efficient Online Meta Learning Highlight: We propose a novel algorithm for online meta learning where task instances are sequentially revealed with limited supervision and a learner is expected to meta learn them in each round, so as to allow the learner to customize a task-specific model rapidly with little task-level supervision. | Durmus Alp Emre Acar; Ruizhao Zhu; Venkatesh Saligrama; |
5 | Robust Testing and Estimation Under Manipulation Attacks Highlight: We study robust testing and estimation of discrete distributions in the strong contamination model. | Jayadev Acharya; Ziteng Sun; Huanyu Zhang; |
6 | GP-Tree: A Gaussian Process Classifier for Few-Shot Incremental Learning Highlight: Here, we propose GP-Tree, a novel method for multi-class classification with Gaussian processes and DKL. | Idan Achituve; Aviv Navon; Yochai Yemini; Gal Chechik; Ethan Fetaya; |
7 | F-Domain Adversarial Learning: Theory and Algorithms Highlight: In this paper, we introduce a novel and general domain-adversarial framework. | David Acuna; Guojun Zhang; Marc T. Law; Sanja Fidler; |
8 | Towards Rigorous Interpretations: A Formalisation of Feature Attribution Highlight: In this paper we propose to formalise feature selection/attribution based on the concept of relaxed functional dependence. | Darius Afchar; Vincent Guigue; Romain Hennequin; |
9 | Acceleration Via Fractal Learning Rate Schedules Highlight: We provide some experiments to challenge conventional beliefs about stable learning rates in deep learning: the fractal schedule enables training to converge with locally unstable updates which make negative progress on the objective. | Naman Agarwal; Surbhi Goel; Cyril Zhang; |
10 | A Regret Minimization Approach to Iterative Learning Control Highlight: In this setting, we propose a new performance metric, planning regret, which replaces the standard stochastic uncertainty assumptions with worst case regret. | Naman Agarwal; Elad Hazan; Anirudha Majumdar; Karan Singh; |
11 | Towards The Unification and Robustness of Perturbation and Gradient Based Explanations Highlight: In this work, we analyze two popular post hoc interpretation techniques: SmoothGrad which is a gradient based method, and a variant of LIME which is a perturbation based method. | Sushant Agarwal; Shahin Jabbari; Chirag Agarwal; Sohini Upadhyay; Steven Wu; Himabindu Lakkaraju; |
12 | Label Inference Attacks from Log-loss Scores Highlight: In this paper, we investigate the problem of inferring the labels of a dataset from single (or multiple) log-loss score(s), without any other access to the dataset. | Abhinav Aggarwal; Shiva Kasiviswanathan; Zekun Xu; Oluwaseyi Feyisetan; Nathanael Teissier; |
13 | Deep Kernel Processes Highlight: We define deep kernel processes in which positive definite Gram matrices are progressively transformed by nonlinear kernel functions and by sampling from (inverse) Wishart distributions. | Laurence Aitchison; Adam Yang; Sebastian W. Ober; |
14 | How Does Loss Function Affect Generalization Performance of Deep Learning? Application to Human Age Estimation Highlight: In summary, our main statement in this paper is: choose a stable loss function, generalize better. | Ali Akbari; Muhammad Awais; Manijeh Bashar; Josef Kittler; |
15 | On Learnability Via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting Highlight: In this paper, we explore theoretical analysis on training two-layer ReLU neural networks in a teacher-student regression model, in which a student network learns an unknown teacher network through its outputs. | Shunta Akiyama; Taiji Suzuki; |
16 | Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks Highlight: In contrast to traditional weight optimization in a continuous space, we demonstrate the existence of effective random networks whose weights are never updated. | Maxwell M Aladago; Lorenzo Torresani; |
17 | A Large-scale Benchmark for Few-shot Program Induction and Synthesis Highlight: In this work, we propose a new way of leveraging unit tests and natural inputs for small programs as meaningful input-output examples for each sub-program of the overall program. | Ferran Alet; Javier Lopez-Contreras; James Koppel; Maxwell Nye; Armando Solar-Lezama; Tomas Lozano-Perez; Leslie Kaelbling; Joshua Tenenbaum; |
18 | Robust Pure Exploration in Linear Bandits with Limited Budget Highlight: We consider the pure exploration problem in the fixed-budget linear bandit setting. | Ayya Alieva; Ashok Cutkosky; Abhimanyu Das; |
19 | Communication-Efficient Distributed Optimization with Quantized Preconditioners Highlight: We investigate fast and communication-efficient algorithms for the classic problem of minimizing a sum of strongly convex and smooth functions that are distributed among $n$ different nodes, which can communicate using a limited number of bits. | Foivos Alimisis; Peter Davies; Dan Alistarh; |
20 | Non-Exponentially Weighted Aggregation: Regret Bounds for Unbounded Loss Functions Highlight: In this paper, we study a generalized aggregation strategy, where the weights no longer depend exponentially on the losses. | Pierre Alquier; |
21 | Dataset Dynamics Via Gradient Flows in Probability Space Highlight: In this work, we propose a novel framework for dataset transformation, which we cast as optimization over data-generating joint probability distributions. | David Alvarez-Melis; Nicolò Fusi; |
22 | Submodular Maximization Subject to A Knapsack Constraint: Combinatorial Algorithms with Near-optimal Adaptive Complexity Highlight: In this work we obtain the first \emph{constant factor} approximation algorithm for non-monotone submodular maximization subject to a knapsack constraint with \emph{near-optimal} $O(\log n)$ adaptive complexity. | Georgios Amanatidis; Federico Fusco; Philip Lazos; Stefano Leonardi; Alberto Marchetti-Spaccamela; Rebecca Reiffenhäuser; |
23 | Safe Reinforcement Learning with Linear Function Approximation Highlight: In this paper, we address both problems by first modeling safety as an unknown linear cost function of states and actions, which must always fall below a certain threshold. | Sanae Amani; Christos Thrampoulidis; Lin Yang; |
24 | Automatic Variational Inference with Cascading Flows Highlight: Here, we combine the flexibility of normalizing flows and the prior-embedding property of ASVI in a new family of variational programs, which we named cascading flows. | Luca Ambrogioni; Gianluigi Silvestri; Marcel Van Gerven; |
25 | Sparse Bayesian Learning Via Stepwise Regression Highlight: Herein, we propose a coordinate ascent algorithm for SBL termed Relevance Matching Pursuit (RMP) and show that, as its noise variance parameter goes to zero, RMP exhibits a surprising connection to Stepwise Regression. | Sebastian E. Ament; Carla P. Gomes; |
26 | Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards Highlight: We propose a new exploration method, based on two intuitions: (1) the choice of the next exploratory action should depend not only on the (Markovian) state of the environment, but also on the agent’s trajectory so far, and (2) the agent should utilize a measure of spread in the state space to avoid getting stuck in a small region. | Susan Amin; Maziar Gomrokchi; Hossein Aboutalebi; Harsh Satija; Doina Precup; |
27 | Preferential Temporal Difference Learning Highlight: We propose an approach to re-weighting states used in TD updates, both when they are the input and when they provide the target for the update. | Nishanth V. Anand; Doina Precup; |
28 | Unitary Branching Programs: Learnability and Lower Bounds Highlight: In this work, we study a generalized version of bounded width branching programs where instructions are defined by unitary matrices of bounded dimension. | Fidel Ernesto Diaz Andino; Maria Kokkou; Mateus De Oliveira Oliveira; Farhad Vadiee; |
29 | The Logical Options Framework Highlight: We introduce a hierarchical reinforcement learning framework called the Logical Options Framework (LOF) that learns policies that are satisfying, optimal, and composable. | Brandon Araki; Xiao Li; Kiran Vodrahalli; Jonathan Decastro; Micah Fry; Daniela Rus; |
30 | Annealed Flow Transport Monte Carlo Highlight: We propose here a novel Monte Carlo algorithm, Annealed Flow Transport (AFT), that builds upon AIS and SMC and combines them with normalizing flows (NFs) for improved performance. | Michael Arbel; Alex Matthews; Arnaud Doucet; |
31 | Permutation Weighting Highlight: In this work we introduce permutation weighting, a method for estimating balancing weights using a standard binary classifier (regardless of cardinality of treatment). | David Arbour; Drew Dimmery; Arjun Sondhi; |
32 | Analyzing The Tree-layer Structure of Deep Forests Highlight: In this paper, our aim is not to benchmark DF performances but to investigate instead their underlying mechanisms. | Ludovic Arnould; Claire Boyer; Erwan Scornet; |
33 | Dropout: Explicit Forms and Capacity Control Highlight: We investigate the capacity control provided by dropout in various machine learning problems. | Raman Arora; Peter Bartlett; Poorya Mianjy; Nathan Srebro; |
34 | Tighter Bounds on The Log Marginal Likelihood of Gaussian Process Regression Using Conjugate Gradients Highlight: We propose a lower bound on the log marginal likelihood of Gaussian process regression models that can be computed without matrix factorisation of the full kernel matrix. | Artem Artemev; David R Burt; Mark Van Der Wilk; |
35 | Deciding What to Learn: A Rate-Distortion Approach Highlight: In this work, leveraging rate-distortion theory, we automate this process such that the designer need only express their preferences via a single hyperparameter and the agent is endowed with the ability to compute its own learning targets that best achieve the desired trade-off. | Dilip Arumugam; Benjamin Van Roy; |
36 | Private Adaptive Gradient Methods for Convex Optimization Highlight: We study adaptive methods for differentially private convex optimization, proposing and analyzing differentially private variants of a Stochastic Gradient Descent (SGD) algorithm with adaptive stepsizes, as well as the AdaGrad algorithm. | Hilal Asi; John Duchi; Alireza Fallah; Omid Javidbakht; Kunal Talwar; |
37 | Private Stochastic Convex Optimization: Optimal Rates in L1 Geometry Highlight: We show that, up to logarithmic factors the optimal excess population loss of any $(\epsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n} + \sqrt{d}/\epsilon n.$ The upper bound is based on a new algorithm that combines the iterative localization approach of Feldman et al. (2020) with a new analysis of private regularized mirror descent. | Hilal Asi; Vitaly Feldman; Tomer Koren; Kunal Talwar; |
38 | Combinatorial Blocking Bandits with Stochastic Delays Highlight: In this work, we extend the above model in two directions: (i) We consider the general combinatorial setting where more than one arms can be played at each round, subject to feasibility constraints. (ii) We allow the blocking time of each arm to be stochastic. | Alexia Atsidakou; Orestis Papadigenopoulos; Soumya Basu; Constantine Caramanis; Sanjay Shakkottai; |
39 | Dichotomous Optimistic Search to Quantify Human Perception Highlight: In this paper we address a variant of the continuous multi-armed bandits problem, called the threshold estimation problem, which is at the heart of many psychometric experiments. | Julien Audiffren; |
40 | Federated Learning Under Arbitrary Communication Patterns Highlight: In this paper, we investigate the performance of an asynchronous version of local SGD wherein the clients can communicate with the server at arbitrary time intervals. | Dmitrii Avdiukhin; Shiva Kasiviswanathan; |
41 | Asynchronous Distributed Learning: Adapting to Gradient Delays Without Prior Knowledge Highlight: We propose a robust training method for the constrained setting and derive non-asymptotic convergence guarantees that do not depend on prior knowledge of update delays, objective smoothness, and gradient variance. | Rotem Zamir Aviv; Ido Hakimi; Assaf Schuster; Kfir Yehuda Levy; |
42 | Decomposable Submodular Function Minimization Via Maximum Flow Highlight: We solve this minimization problem by lifting the solutions of a parametric cut problem, which we obtain via a new efficient combinatorial reduction to maximum flow. | Kyriakos Axiotis; Adam Karczmarz; Anish Mukherjee; Piotr Sankowski; Adrian Vladu; |
43 | Differentially Private Query Release Through Adaptive Projection Highlight: We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like k-way marginals, subject to differential privacy. | Sergul Aydore; William Brown; Michael Kearns; Krishnaram Kenthapadi; Luca Melis; Aaron Roth; Ankit A Siva; |
44 | On The Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent Highlight: We develop a novel technique for deriving the inductive bias of gradient-flow and use it to obtain closed-form implicit regularizers for multiple cases of interest. | Shahar Azulay; Edward Moroshko; Mor Shpigel Nacson; Blake E Woodworth; Nathan Srebro; Amir Globerson; Daniel Soudry; |
45 | On-Off Center-Surround Receptive Fields for Accurate and Robust Image Classification Highlight: To this end, our paper extends the receptive field of convolutional neural networks with two residual components, ubiquitous in the visual processing system of vertebrates: On-center and off-center pathways, with an excitatory center and inhibitory surround; OOCS for short. | Zahra Babaiee; Ramin Hasani; Mathias Lechner; Daniela Rus; Radu Grosu; |
46 | Uniform Convergence, Adversarial Spheres and A Simple Remedy Highlight: We provide an extensive theoretical investigation of the previously studied data setting through the lens of infinitely-wide models. | Gregor Bachmann; Seyed-Mohsen Moosavi-Dezfooli; Thomas Hofmann; |
47 | Faster Kernel Matrix Algebra Via Density Estimation Highlight: We study fast algorithms for computing basic properties of an n x n positive semidefinite kernel matrix K corresponding to n points x_1,…,x_n in R^d. | Arturs Backurs; Piotr Indyk; Cameron Musco; Tal Wagner; |
48 | Robust Reinforcement Learning Using Least Squares Policy Iteration with Provable Performance Guarantees Highlight: This paper addresses the problem of model-free reinforcement learning for Robust Markov Decision Process (RMDP) with large state spaces. | Kishan Panaganti Badrinath; Dileep Kalathil; |
49 | Skill Discovery for Exploration and Planning Using Deep Skill Graphs Highlight: We introduce a new skill-discovery algorithm that builds a discrete graph representation of large continuous MDPs, where nodes correspond to skill subgoals and the edges to skill policies. | Akhil Bagaria; Jason K Senthil; George Konidaris; |
50 | Locally Adaptive Label Smoothing Improves Predictive Churn Highlight: In this paper, we present several baselines for reducing churn and show that training on soft labels obtained by adaptively smoothing each example’s label based on the example’s neighboring labels often outperforms the baselines on churn while improving accuracy on a variety of benchmark classification tasks and model architectures. | Dara Bahri; Heinrich Jiang; |
51 | How Important Is The Train-Validation Split in Meta-Learning? Highlight: We provide a detailed theoretical study on whether and when the train-validation split is helpful in the linear centroid meta-learning problem. | Yu Bai; Minshuo Chen; Pan Zhou; Tuo Zhao; Jason Lee; Sham Kakade; Huan Wang; Caiming Xiong; |
52 | Stabilizing Equilibrium Models By Jacobian Regularization Highlight: In this paper, we propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models. | Shaojie Bai; Vladlen Koltun; Zico Kolter; |
53 | Don’t Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification Highlight: In this paper, we show theoretically that over-parametrization is not the only reason for over-confidence. | Yu Bai; Song Mei; Huan Wang; Caiming Xiong; |
54 | Principled Exploration Via Optimistic Bootstrapping and Backward Induction Highlight: In this paper, we propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I). | Chenjia Bai; Lingxiao Wang; Lei Han; Jianye Hao; Animesh Garg; Peng Liu; Zhaoran Wang; |
55 | GLSearch: Maximum Common Subgraph Detection Via Learning to Search Highlight: We propose GLSearch, a Graph Neural Network (GNN) based learning to search model. | Yunsheng Bai; Derek Xu; Yizhou Sun; Wei Wang; |
56 | Breaking The Limits of Message Passing Graph Neural Networks Highlight: In this paper, we show that if the graph convolution supports are designed in the spectral domain by a non-linear custom function of eigenvalues and masked with an arbitrarily large receptive field, the MPNN is theoretically more powerful than the 1-WL test and experimentally as powerful as existing 3-WL models, while remaining spatially localized. | Muhammet Balcilar; Pierre Heroux; Benoit Gauzere; Pascal Vasseur; Sebastien Adam; Paul Honeine; |
57 | Instance Specific Approximations for Submodular Maximization Highlight: We develop an algorithm that gives an instance-specific approximation for any solution of an instance of monotone submodular maximization under a cardinality constraint. | Eric Balkanski; Sharon Qian; Yaron Singer; |
58 | Augmented World Models Facilitate Zero-Shot Dynamics Generalization From A Single Offline Environment Highlight: However, little attention has been paid to potentially changing dynamics when transferring a policy to the online setting, where performance can be reduced by up to 90% for existing methods. In this paper we address this problem with Augmented World Models (AugWM). | Philip J Ball; Cong Lu; Jack Parker-Holder; Stephen Roberts; |
59 | Regularized Online Allocation Problems: Fairness and Beyond Highlight: In this paper, we introduce the regularized online allocation problem, a variant that includes a non-linear regularizer acting on the total resource consumption. | Santiago Balseiro; Haihao Lu; Vahab Mirrokni; |
60 | Predict Then Interpolate: A Simple Algorithm to Learn Stable Classifiers Highlight: We propose Predict then Interpolate (PI), a simple algorithm for learning correlations that are stable across environments. | Yujia Bao; Shiyu Chang; Regina Barzilay; |
61 | Variational (Gradient) Estimate of The Score Function in Energy-based Latent Variable Models Highlight: This paper presents new estimates of the score function and its gradient with respect to the model parameters in a general energy-based latent variable model (EBLVM). | Fan Bao; Kun Xu; Chongxuan Li; Lanqing Hong; Jun Zhu; Bo Zhang; |
62 | Compositional Video Synthesis with Action Graphs Highlight: To address this challenge, we propose to represent the actions in a graph structure called Action Graph and present the new "Action Graph To Video" synthesis task. | Amir Bar; Roei Herzig; Xiaolong Wang; Anna Rohrbach; Gal Chechik; Trevor Darrell; Amir Globerson; |
63 | Approximating A Distribution Using Weight Queries Highlight: We propose an interactive algorithm that iteratively selects data set examples and performs corresponding weight queries. | Nadav Barak; Sivan Sabato; |
64 | Graph Convolution for Semi-Supervised Classification: Improved Linear Separability and Out-of-Distribution Generalization Highlight: To understand the merits of this approach, we study the classification of a mixture of Gaussians, where the data corresponds to the node attributes of a stochastic block model. | Aseem Baranwal; Kimon Fountoulakis; Aukosh Jagannath; |
65 | Training Quantized Neural Networks to Global Optimality Via Semidefinite Programming Highlight: In this work, we introduce a convex optimization strategy to train quantized NNs with polynomial activations. | Burak Bartan; Mert Pilanci; |
66 | Beyond $log^2(T)$ Regret for Decentralized Bandits in Matching Markets Highlight: We propose a phase based algorithm, where in each phase, besides deleting the globally communicated dominated arms the agents locally delete arms with which they collide often. | Soumya Basu; Karthik Abinav Sankararaman; Abishek Sankararaman; |
67 | Optimal Thompson Sampling Strategies for Support-aware CVaR Bandits Highlight: In this paper we study a multi-arm bandit problem in which the quality of each arm is measured by the Conditional Value at Risk (CVaR) at some level alpha of the reward distribution. | Dorian Baudry; Romain Gautron; Emilie Kaufmann; Odalric Maillard; |
68 | On Limited-Memory Subsampling Strategies for Bandits Highlight: Our first contribution is to show that a simple deterministic subsampling rule, proposed in the recent work of \citet{baudry2020sub} under the name of “last-block subsampling”, is asymptotically optimal in one-parameter exponential families. | Dorian Baudry; Yoan Russac; Olivier Cappé; |
69 | Generalized Doubly Reparameterized Gradient Estimators Highlight: Here, we develop two generalizations of the DReGs estimator and show that they can be used to train conditional and hierarchical VAEs on image modelling tasks more effectively. | Matthias Bauer; Andriy Mnih; |
70 | Directional Graph Networks Highlight: To overcome this limitation, we propose the first globally consistent anisotropic kernels for GNNs, allowing for graph convolutions that are defined according to topologically-derived directional flows. | Dominique Beaini; Saro Passaro; Vincent Létourneau; Will Hamilton; Gabriele Corso; Pietro Liò; |
71 | Policy Analysis Using Synthetic Controls in Continuous-Time Highlight: We propose a continuous-time alternative that models the latent counterfactual path explicitly using the formalism of controlled differential equations. | Alexis Bellot; Mihaela Van Der Schaar; |
72 | Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling Highlight: In this paper, we in fact demonstrate the existence of mode-connecting simplicial complexes that form multi-dimensional manifolds of low loss, connecting many independently trained models. | Gregory Benton; Wesley Maddox; Sanae Lotfi; Andrew Gordon Wilson; |
73 | TFix: Learning to Fix Coding Errors with A Text-to-Text Transformer Highlight: In this paper, we address this challenge and present a new learning-based system, called TFix. | Berkay Berabi; Jingxuan He; Veselin Raychev; Martin Vechev; |
74 | Learning Queueing Policies for Organ Transplantation Allocation Using Interpretable Counterfactual Survival Analysis Highlight: In this paper, we develop a data-driven model for (real-time) organ allocation using observational data for transplant outcomes. Furthermore, we introduce a novel organ-allocation simulator to accurately test new policies. | Jeroen Berrevoets; Ahmed Alaa; Zhaozhi Qian; James Jordon; Alexander E.S. Gimson; Mihaela Van Der Schaar; |
75 | Learning from Biased Data: A Semi-Parametric Approach Highlight: We consider risk minimization problems where the (source) distribution $P_S$ of the training observations $Z_1, \ldots, Z_n$ differs from the (target) distribution $P_T$ involved in the risk that one seeks to minimize. | Patrice Bertail; Stephan Clémençon; Yannick Guyonvarch; Nathan Noiry; |
76 | Is Space-Time Attention All You Need for Video Understanding? Highlight: We present a convolution-free approach to video classification built exclusively on self-attention over space and time. |
Gedas Bertasius; Heng Wang; Lorenzo Torresani; |
77 | Confidence Scores Make Instance-dependent Label-noise Learning Possible Highlight: To alleviate this issue, we introduce confidence-scored instance-dependent noise (CSIDN), where each instance-label pair is equipped with a confidence score. |
Antonin Berthon; Bo Han; Gang Niu; Tongliang Liu; Masashi Sugiyama; |
78 | Size-Invariant Graph Representations for Graph Classification Extrapolations Highlight: In this work we consider an underexplored area of an otherwise rapidly developing field of graph representation learning: The task of out-of-distribution (OOD) graph classification, where train and test data have different distributions, with test data unavailable during training. |
Beatrice Bevilacqua; Yangze Zhou; Bruno Ribeiro; |
79 | Principal Bit Analysis: Autoencoding with Schur-Concave Loss Highlight: We consider a linear autoencoder in which the latent variables are quantized, or corrupted by noise, and the constraint is Schur-concave in the set of latent variances. |
Sourbh Bhadane; Aaron B Wagner; Jayadev Acharya; |
80 | Lower Bounds on Cross-Entropy Loss in The Presence of Test-time Adversaries Highlight: In this paper, we determine optimal lower bounds on the cross-entropy loss in the presence of test-time adversaries, along with the corresponding optimal classification outputs. |
Arjun Nitin Bhagoji; Daniel Cullina; Vikash Sehwag; Prateek Mittal; |
81 | Additive Error Guarantees for Weighted Low Rank Approximation Highlight: We study a natural greedy algorithm for weighted low rank approximation and develop a simple condition under which it yields bi-criteria approximation up to a small additive factor in the error. |
Aditya Bhaskara; Aravinda Kanchana Ruwanpathirana; Maheshakya Wijewardena; |
82 | Sample Complexity of Robust Linear Classification on Separated Data Highlight: We consider the sample complexity of learning with adversarial robustness. |
Robi Bhattacharjee; Somesh Jha; Kamalika Chaudhuri; |
83 | Finding $k$ in Latent $k$-Polytope Highlight: The first important contribution of this paper is to show that under \emph{standard assumptions} $k$ equals the \INR of a \emph{subset smoothed data matrix} defined from data generated from an $\LkP$. |
Chiranjib Bhattacharyya; Ravindran Kannan; Amit Kumar; |
84 | Non-Autoregressive Electron Redistribution Modeling for Reaction Prediction Highlight: To address these issues, we devise a non-autoregressive learning paradigm that predicts reaction in one shot. |
Hangrui Bi; Hengyi Wang; Chence Shi; Connor Coley; Jian Tang; Hongyu Guo; |
85 | TempoRL: Learning When to Act Highlight: To address this, we propose a proactive setting in which the agent not only selects an action in a state but also for how long to commit to that action. |
André Biedenkapp; Raghu Rajan; Frank Hutter; Marius Lindauer; |
86 | Follow-the-Regularized-Leader Routes to Chaos in Routing Games Highlight: We study the emergence of chaotic behavior of Follow-the-Regularized-Leader (FoReL) dynamics in games. |
Jakub Bielawski; Thiparat Chotibut; Fryderyk Falniowski; Grzegorz Kosiorowski; Michal Misiurewicz; Georgios Piliouras; |
87 | Neural Symbolic Regression That Scales Highlight: In this paper, we introduce the first symbolic regression method that leverages large scale pre-training. We procedurally generate an unbounded set of equations, and simultaneously pre-train a Transformer to predict the symbolic equation from a corresponding set of input-output pairs. |
Luca Biggio; Tommaso Bendinelli; Alexander Neitz; Aurelien Lucchi; Giambattista Parascandolo; |
88 | Model Distillation for Revenue Optimization: Interpretable Personalized Pricing Highlight: We present a novel, customized, prescriptive tree-based algorithm that distills knowledge from a complex black-box machine learning algorithm, segments customers with similar valuations and prescribes prices in such a way that maximizes revenue while maintaining interpretability. |
Max Biggs; Wei Sun; Markus Ettl; |
89 | Scalable Normalizing Flows for Permutation Invariant Densities Highlight: In this work, we demonstrate how calculating the trace, a crucial step in this method, raises issues that occur both during training and inference, limiting its practicality. |
Marin Biloš; Stephan Günnemann; |
90 | Online Learning for Load Balancing of Unknown Monotone Resource Allocation Games Highlight: To overcome this, we propose a simple algorithm that learns to shift the NE of the game to meet the total load constraints by adjusting the pricing coefficients in an online manner. |
Ilai Bistritz; Nicholas Bambos; |
91 | Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision Highlight: In this paper we consider continuous control with the state-of-the-art SAC agent and demonstrate that a naïve adaptation of low-precision methods from supervised learning fails. |
Johan Björck; Xiangyu Chen; Christopher De Sa; Carla P Gomes; Kilian Weinberger; |
92 | Multiplying Matrices Without Multiplying Highlight: Consequently, the task of efficiently approximating matrix products has received significant attention. We introduce a learning-based algorithm for this task that greatly outperforms existing methods. |
Davis Blalock; John Guttag; |
93 | One for One, or All for All: Equilibria and Optimality of Collaboration in Federated Learning Highlight: Inspired by game theoretic notions, this paper introduces a framework for incentive-aware learning and data sharing in federated learning. |
Avrim Blum; Nika Haghtalab; Richard Lanas Phillips; Han Shao; |
94 | Black-box Density Function Estimation Using Recursive Partitioning Highlight: We present a novel approach to Bayesian inference and general Bayesian computation that is defined through a sequential decision loop. |
Erik Bodin; Zhenwen Dai; Neill Campbell; Carl Henrik Ek; |
95 | Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks Highlight: To overcome these limitations, we propose Message Passing Simplicial Networks (MPSNs), a class of models that perform message passing on simplicial complexes (SCs). |
Cristian Bodnar; Fabrizio Frasca; Yuguang Wang; Nina Otter; Guido F Montufar; Pietro Liò; Michael Bronstein; |
96 | The Hintons in Your Neural Network: A Quantum Field Theory View of Deep Learning Highlight: In this work we develop a quantum field theory formalism for deep learning, where input signals are encoded in Gaussian states, a generalization of Gaussian processes which encode the agent’s uncertainty about the input signal. |
Roberto Bondesan; Max Welling; |
97 | Offline Contextual Bandits with Overparameterized Models Highlight: We formally prove upper bounds on the regret of overparameterized value-based learning and lower bounds on the regret for policy-based algorithms. |
David Brandfonbrener; William Whitney; Rajesh Ranganath; Joan Bruna; |
98 | High-Performance Large-Scale Image Recognition Without Normalization Highlight: In this work, we develop an adaptive gradient clipping technique which overcomes these instabilities, and design a significantly improved class of Normalizer-Free ResNets. |
Andy Brock; Soham De; Samuel L Smith; Karen Simonyan; |
99 | Evaluating The Implicit Midpoint Integrator for Riemannian Hamiltonian Monte Carlo Highlight: In this work, we examine the implicit midpoint integrator as an alternative to the generalized leapfrog integrator. |
James Brofos; Roy R Lederman; |
100 | Reinforcement Learning of Implicit and Explicit Control Flow Instructions Highlight: We focus here on the problem of learning control flow that deviates from a strict step-by-step execution of instructions—that is, control flow that may skip forward over parts of the instructions or return backward to previously completed or skipped steps. |
Ethan Brooks; Janarthanan Rajendran; Richard L Lewis; Satinder Singh; |
101 | Machine Unlearning for Random Forests Highlight: In this paper, we introduce data removal-enabled (DaRE) forests, a variant of random forests that enables the removal of training data with minimal retraining. |
Jonathan Brophy; Daniel Lowd; |
102 | Value Alignment Verification Highlight: In this paper we formalize and theoretically analyze the problem of efficient value alignment verification: how to efficiently test whether the behavior of another agent is aligned with a human’s values? |
Daniel S Brown; Jordan Schneider; Anca Dragan; Scott Niekum; |
103 | Model-Free and Model-Based Policy Evaluation When Causality Is Uncertain Highlight: We develop worst-case bounds to assess sensitivity to these unobserved confounders in finite horizons when confounders are drawn iid each period. |
David A Bruns-Smith; |
104 | Narrow Margins: Classification, Margins and Fat Tails Highlight: We investigate the case where this convergence property is not guaranteed to hold and show that it can be fully characterised by the distribution of error terms in the latent variable interpretation of linear classifiers. |
Francois Buet-Golfouse; |
105 | Differentially Private Correlation Clustering Highlight: We propose an algorithm that achieves subquadratic additive error compared to the optimal cost. |
Mark Bun; Marek Elias; Janardhan Kulkarni; |
106 | Disambiguation of Weak Supervision Leading to Exponential Convergence Rates Highlight: In this paper, we focus on partial labelling, an instance of weak supervision where, from a given input, we are given a set of potential targets. |
Vivien A Cabannnes; Francis Bach; Alessandro Rudi; |
107 | Finite Mixture Models Do Not Reliably Learn The Number of Components Highlight: In this paper, we add rigor to data-analysis folk wisdom by proving that under even the slightest model misspecification, the FMM component-count posterior diverges: the posterior probability of any particular finite number of components converges to 0 in the limit of infinite data. |
Diana Cai; Trevor Campbell; Tamara Broderick; |
108 | A Theory of Label Propagation for Subpopulation Shift Highlight: In this work, we propose a provably effective framework based on label propagation by using an input consistency loss. |
Tianle Cai; Ruiqi Gao; Jason Lee; Qi Lei; |
109 | Lenient Regret and Good-Action Identification in Gaussian Process Bandits Highlight: In this paper, we study the problem of Gaussian process (GP) bandits under relaxed optimization criteria stating that any function value above a certain threshold is “good enough”. |
Xu Cai; Selwyn Gomes; Jonathan Scarlett; |
110 | A Zeroth-Order Block Coordinate Descent Algorithm for Huge-Scale Black-Box Optimization Highlight: In this paper, we propose a novel algorithm, coined ZO-BCD, that exhibits favorable overall query complexity and has a much smaller per-iteration computational complexity. |
Hanqin Cai; Yuchen Lou; Daniel Mckenzie; Wotao Yin; |
111 | GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training Highlight: In this paper, we study what normalization is effective for Graph Neural Networks (GNNs). |
Tianle Cai; Shengjie Luo; Keyulu Xu; Di He; Tie-Yan Liu; Liwei Wang; |
112 | On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization Highlight: In this paper, we consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), which can be viewed as a non-Bayesian Gaussian process bandit problem. |
Xu Cai; Jonathan Scarlett; |
113 | High-dimensional Experimental Design and Kernel Bandits Highlight: In this work, we propose a rounding procedure that frees $N$ of any dependence on the dimension $d$, while achieving nearly the same performance guarantees of existing rounding procedures. |
Romain Camilleri; Kevin Jamieson; Julian Katz-Samuels; |
114 | A Gradient Based Strategy for Hamiltonian Monte Carlo Hyperparameter Optimization Highlight: Instead, we propose to optimize an objective that quantifies directly the speed of convergence to the target distribution. |
Andrew Campbell; Wenlong Chen; Vincent Stimper; Jose Miguel Hernandez-Lobato; Yichuan Zhang; |
115 | Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections Highlight: In this paper, we focus on the so-called ‘implicit effect’ of GNIs, which is the effect of the injected noise on the dynamics of SGD. |
Alexander Camuto; Xiaoyu Wang; Lingjiong Zhu; Chris Holmes; Mert Gurbuzbalaban; Umut Simsekli; |
116 | Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design Highlight: To overcome these challenges, we propose Fold2Seq, a novel transformer-based generative framework for designing protein sequences conditioned on a specific target fold. |
Yue Cao; Payel Das; Vijil Chenthamarakshan; Pin-Yu Chen; Igor Melnyk; Yang Shen; |
117 | Learning from Similarity-Confidence Data Highlight: In this paper, we investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data, where only unlabeled data pairs equipped with confidence that illustrates their degree of similarity (two examples are similar if they belong to the same class) are needed for training a discriminative binary classifier. |
Yuzhou Cao; Lei Feng; Yitian Xu; Bo An; Gang Niu; Masashi Sugiyama; |
118 | Parameter-free Locally Accelerated Conditional Gradients Highlight: We remove this limitation by introducing a novel, Parameter-Free Locally accelerated CG (PF-LaCG) algorithm, for which we provide rigorous convergence guarantees. |
Alejandro Carderera; Jelena Diakonikolas; Cheuk Yin Lin; Sebastian Pokutta; |
119 | Optimizing Persistent Homology Based Functions Highlight: Building on real analytic geometry arguments, we propose a general framework that allows us to define and compute gradients for persistence-based functions in a very simple way. |
Mathieu Carriere; Frederic Chazal; Marc Glisse; Yuichi Ike; Hariprasad Kannan; Yuhei Umeda; |
120 | Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with $\sqrt{T}$ Regret Highlight: We present the first model-free algorithm that achieves similar regret guarantees. |
Asaf B Cassel; Tomer Koren; |
121 | Multi-Receiver Online Bayesian Persuasion Highlight: We study, for the first time, an online Bayesian persuasion setting with multiple receivers. |
Matteo Castiglioni; Alberto Marchesi; Andrea Celli; Nicola Gatti; |
122 | Marginal Contribution Feature Importance – An Axiomatic Approach for Explaining Data Highlight: Therefore, we develop a set of axioms to capture properties expected from a feature importance score when explaining data and prove that there exists only one score that satisfies all of them, the Marginal Contribution Feature Importance (MCI). |
Amnon Catav; Boyang Fu; Yazeed Zoabi; Ahuva Libi Weiss Meilik; Noam Shomron; Jason Ernst; Sriram Sankararaman; Ran Gilad-Bachrach; |
123 | Disentangling Syntax and Semantics in The Brain with Deep Networks Highlight: Overall, this study introduces a versatile framework to isolate, in the brain activity, the distributed representations of linguistic constructs. |
Charlotte Caucheteux; Alexandre Gramfort; Jean-Remi King; |
124 | Fair Classification with Noisy Protected Attributes: A Framework with Provable Guarantees Highlight: We present an optimization framework for learning a fair classifier in the presence of noisy perturbations in the protected attributes. |
L. Elisa Celis; Lingxiao Huang; Vijay Keswani; Nisheeth K. Vishnoi; |
125 | Best Model Identification: A Rested Bandit Formulation Highlight: We introduce and analyze a best arm identification problem in the rested bandit setting, wherein arms are themselves learning algorithms whose expected losses decrease with the number of times the arm has been played. |
Leonardo Cella; Massimiliano Pontil; Claudio Gentile; |
126 | Revisiting Rainbow: Promoting More Insightful and Inclusive Deep Reinforcement Learning Research Highlight: In this work we argue that, despite the community’s emphasis on large-scale environments, the traditional small-scale environments can still yield valuable scientific insights and can help reduce the barriers to entry for underprivileged communities. |
Johan Samir Obando Ceron; Pablo Samuel Castro; |
127 | Learning Routines for Effective Off-Policy Reinforcement Learning Highlight: We propose a novel framework for reinforcement learning that effectively lifts such constraints. |
Edoardo Cetin; Oya Celiktutan; |
128 | Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks Highlight: In this work, the gradient model is extended to a gated version and we prove that it, unlike the gradient model, is a universal approximator for flows on graphs. |
Ciwan Ceylan; Salla Franzén; Florian T. Pokorny; |
129 | GRAND: Graph Neural Diffusion Highlight: We present Graph Neural Diffusion (GRAND) that approaches deep learning on graphs as a continuous diffusion process and treats Graph Neural Networks (GNNs) as discretisations of an underlying PDE. |
Ben Chamberlain; James Rowbottom; Maria I Gorinova; Michael Bronstein; Stefan Webb; Emanuele Rossi; |
130 | HoroPCA: Hyperbolic Dimensionality Reduction Via Horospherical Projections Highlight: We generalize each of these concepts to the hyperbolic space and propose HoroPCA, a method for hyperbolic dimensionality reduction. |
Ines Chami; Albert Gu; Dat P Nguyen; Christopher Re; |
131 | Goal-Conditioned Reinforcement Learning with Imagined Subgoals Highlight: In this work, we propose to incorporate imagined subgoals into policy learning to facilitate learning of complex tasks. |
Elliot Chane-Sane; Cordelia Schmid; Ivan Laptev; |
132 | Locally Private K-Means in One Round Highlight: We provide an approximation algorithm for k-means clustering in the \emph{one-round} (aka \emph{non-interactive}) local model of differential privacy (DP). |
Alisa Chang; Badih Ghazi; Ravi Kumar; Pasin Manurangsi; |
133 | Modularity in Reinforcement Learning Via Algorithmic Independence in Credit Assignment Highlight: We introduce what we call the modularity criterion for testing whether a learning algorithm satisfies this constraint by performing causal analysis on the algorithm itself. |
Michael Chang; Sid Kaushik; Sergey Levine; Tom Griffiths; |
134 | Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection Highlight: We address object-level resampling by introducing an object-centric sampling strategy based on a dynamic, episodic memory bank. |
Nadine Chang; Zhiding Yu; Yu-Xiong Wang; Animashree Anandkumar; Sanja Fidler; Jose M Alvarez; |
135 | DeepWalking Backwards: From Embeddings Back to Graphs Highlight: Focusing on a variant of the popular DeepWalk method (Perozzi et al., 2014; Qiu et al., 2018), we present algorithms for accurate embedding inversion – i.e., from the low-dimensional embedding of a graph $G$, we can find a graph $\tilde G$ with a very similar embedding. |
Sudhanshu Chanpuriya; Cameron Musco; Konstantinos Sotiropoulos; Charalampos Tsourakakis; |
136 | Differentiable Spatial Planning Using Transformers Highlight: We propose Spatial Planning Transformers (SPT), which given an obstacle map learns to generate actions by planning over long-range spatial dependencies, unlike prior data-driven planners that propagate information locally via convolutional structure in an iterative manner. |
Devendra Singh Chaplot; Deepak Pathak; Jitendra Malik; |
137 | Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning Highlight: In this work, we first introduce a suite of challenging simulated manipulation tasks where current reinforcement learning and trajectory optimisation techniques perform poorly. |
Henry J Charlesworth; Giovanni Montana; |
138 | Classification with Rejection Based on Cost-sensitive Classification Highlight: In this paper, based on the relationship between classification with rejection and cost-sensitive classification, we propose a novel method of classification with rejection by learning an ensemble of cost-sensitive classifiers, which satisfies all the following properties: (i) it can avoid estimating class-posterior probabilities, resulting in improved classification accuracy. |
Nontawat Charoenphakdee; Zhenghang Cui; Yivan Zhang; Masashi Sugiyama; |
139 | Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills Highlight: In particular, we propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset. |
Yevgen Chebotar; Karol Hausman; Yao Lu; Ted Xiao; Dmitry Kalashnikov; Jacob Varley; Alex Irpan; Benjamin Eysenbach; Ryan C Julian; Chelsea Finn; Sergey Levine; |
140 | Unified Robust Semi-Supervised Variational Autoencoder Highlight: In this paper, we propose a novel noise-robust semi-supervised deep generative model by jointly tackling noisy labels and outliers simultaneously in a unified robust semi-supervised variational autoencoder (URSVAE). |
Xu Chen; |
141 | Unsupervised Learning of Visual 3D Keypoints for Control Highlight: In this work, we propose a framework to learn such a 3D geometric structure directly from images in an end-to-end unsupervised manner. |
Boyuan Chen; Pieter Abbeel; Deepak Pathak; |
142 | Integer Programming for Causal Structure Learning in The Presence of Latent Variables Highlight: We propose a novel exact score-based method that solves an integer programming (IP) formulation and returns a score-maximizing ancestral ADMG for a set of continuous variables that follow a multivariate Gaussian distribution. |
Rui Chen; Sanjeeb Dash; Tian Gao; |
143 | Improved Corruption Robust Algorithms for Episodic Reinforcement Learning Highlight: We propose new algorithms which, compared to the existing results in (Lykouris et al., 2020), achieve strictly better regret bounds in terms of total corruptions for the tabular setting. |
Yifang Chen; Simon Du; Kevin Jamieson; |
144 | Scalable Computations of Wasserstein Barycenter Via Input Convex Neural Networks Highlight: In this work, we present a novel scalable algorithm to approximate the Wasserstein Barycenters aiming at high-dimensional applications in machine learning. |
Yongxin Chen; Jiaojiao Fan; Amirhossein Taghvaei; |
145 | Neural Feature Matching in Implicit 3D Representations Highlight: While the benefits from the global latent space do not correspond to explicit points at local level, we propose to track the continuous point trajectory by matching implicit features with the latent code interpolating between shapes, from which we corroborate the hierarchical functionality of the deep implicit functions, where early layers map the latent code to fitting the coarse shape structure, and deeper layers further refine the shape details. |
Yunlu Chen; Basura Fernando; Hakan Bilen; Thomas Mensink; Efstratios Gavves; |
146 | Decentralized Riemannian Gradient Descent on The Stiefel Manifold Highlight: We present a decentralized Riemannian stochastic gradient method (DRSGD) with the convergence rate of $\mathcal{O}(1/\sqrt{K})$ to a stationary point. |
Shixiang Chen; Alfredo Garcia; Mingyi Hong; Shahin Shahrampour; |
147 | Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation Highlight: In this paper, we propose a novel attention network, named \textit{self-modulating attention}, that models the complex and non-linearly evolving dynamic user preferences. |
Chao Chen; Haoyu Geng; Nianzu Yang; Junchi Yan; Daiyue Xue; Jianping Yu; Xiaokang Yang; |
148 | Mandoline: Model Evaluation Under Distribution Shift Highlight: Our key insight is that practitioners may have prior knowledge about the ways in which the distribution shifts, which we can use to better guide the importance weighting procedure. |
Mayee Chen; Karan Goel; Nimit S Sohoni; Fait Poms; Kayvon Fatahalian; Christopher Re; |
149 | Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation Highlight: In this work, we provide an expression for the likelihood of a graph generative model and show that its calculation is closely related to the problem of graph automorphism. |
Xiaohui Chen; Xu Han; Jiajing Hu; Francisco Ruiz; Liping Liu; |
150 | CARTL: Cooperative Adversarially-Robust Transfer Learning Highlight: To address such a problem, we propose a novel cooperative adversarially-robust transfer learning (CARTL) by pre-training the model via feature distance minimization and fine-tuning the pre-trained model with non-expansive fine-tuning for target domain tasks. |
Dian Chen; Hongxin Hu; Qian Wang; Li Yinli; Cong Wang; Chao Shen; Qi Li; |
151 | Finding The Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case Highlight: Specifically, we develop algorithms that achieve $O(\sqrt{S^2ADT_\star K})$ regret for the full-information setting and $O(\sqrt{S^3A^2DT_\star K})$ regret for the bandit feedback setting, where $D$ is the diameter, $T_\star$ is the expected hitting time of the optimal policy, $S$ is the number of states, $A$ is the number of actions, and $K$ is the number of episodes. |
Liyu Chen; Haipeng Luo; |
152 | SpreadsheetCoder: Formula Prediction from Semi-structured Context Highlight: In this work, we present the first approach for synthesizing spreadsheet formulas from tabular context, which includes both headers and semi-structured tabular data. |
Xinyun Chen; Petros Maniatis; Rishabh Singh; Charles Sutton; Hanjun Dai; Max Lin; Denny Zhou; |
153 | Large-Margin Contrastive Learning with Distance Polarization Regularizer Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose large-margin contrastive learning (LMCL) with a distance polarization regularizer, motivated by the distribution characteristic of pairwise distances in metric learning. |
Shuo Chen; Gang Niu; Chen Gong; Jun Li; Jian Yang; Masashi Sugiyama; |
154 | Z-GCNETs: Time Zigzags at Graph Convolutional Networks for Time Series Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: As convergence of these two emerging ideas, we propose to enhance DL architectures with the most salient time-conditioned topological information of the data and introduce the concept of zigzag persistence into time-aware graph convolutional networks (GCNs). |
Yuzhou Chen; Ignacio Segovia; Yulia R. Gel; |
155 | A Unified Lottery Ticket Hypothesis for Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, this paper first presents a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights, for effectively accelerating GNN inference on large-scale graphs. Leveraging this new tool, we further generalize the recently popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network, which can be jointly identified from the original GNN and the full dense graph by iteratively applying UGS. |
Tianlong Chen; Yongduo Sui; Xuxi Chen; Aston Zhang; Zhangyang Wang; |
156 | Network Inference and Influence Maximization from Samples Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider the more realistic sampling setting where the network is unknown and we only have a set of passively observed cascades that record the set of activated nodes at each diffusion step. |
Wei Chen; Xiaoming Sun; Jialin Zhang; Zhijie Zhang; |
157 | Data-driven Prediction of General Hamiltonian Dynamics Via Learning Exactly-Symplectic Maps Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the learning and prediction of nonlinear time series generated by a latent symplectic map. |
Renyi Chen; Molei Tao; |
158 | Analysis of Stochastic Lanczos Quadrature for Spectrum Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an error analysis for stochastic Lanczos quadrature (SLQ). |
Tyler Chen; Thomas Trogdon; Shashanka Ubaru; |
159 | Large-Scale Multi-Agent Deep FBSDEs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present a scalable deep learning framework for finding Markovian Nash Equilibria in multi-agent stochastic games using fictitious play. |
Tianrong Chen; Ziyi O Wang; Ioannis Exarchos; Evangelos Theodorou; |
160 | Representation Subspace Distance for Domain Adaptation Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on this finding, we propose to close the domain gap through orthogonal bases of the representation spaces, which are free from feature scaling. |
Xinyang Chen; Sinan Wang; Jianmin Wang; Mingsheng Long; |
161 | Overcoming Catastrophic Forgetting By Bayesian Generative Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new method to overcome catastrophic forgetting by adding generative regularization to the Bayesian inference framework. |
Pei-Hung Chen; Wei Wei; Cho-Jui Hsieh; Bo Dai; |
162 | Cyclically Equivariant Neural Decoders for Cyclic Codes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a novel neural decoder for cyclic codes by exploiting their cyclically invariant property. |
Xiangyu Chen; Min Ye; |
163 | A Receptor Skeleton for Capsule Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a new capsule structure, which contains a set of optimizable receptors and a transmitter devised on the capsule’s representation. |
Jintai Chen; Hongyun Yu; Chengde Qian; Danny Z Chen; Jian Wu; |
164 | Accelerating Gossip SGD with Periodic Global Averaging Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces Gossip-PGA, which adds Periodic Global Averaging to accelerate Gossip SGD. |
Yiming Chen; Kun Yuan; Yingya Zhang; Pan Pan; Yinghui Xu; Wotao Yin; |
165 | ActNN: Reducing Training Memory Footprint Via 2-Bit Activation Compressed Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose ActNN, a memory-efficient training framework that stores randomly quantized activations for back propagation. |
Jianfei Chen; Lianmin Zheng; Zhewei Yao; Dequan Wang; Ion Stoica; Michael Mahoney; Joseph Gonzalez; |
166 | SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: By leveraging the generalized Courant-Fischer theorem, we propose a SPADE score for evaluating the adversarial robustness of a given model, which is proved to be an upper bound of the best Lipschitz constant under the manifold setting. |
Wuxinlin Cheng; Chenhui Deng; Zhiqiang Zhao; Yaohui Cai; Zhiru Zhang; Zhuo Feng; |
167 | Self-supervised and Supervised Joint Training for Resource-rich Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a joint training approach, F2-XEnDec, to combine self-supervised and supervised learning to optimize NMT models. |
Yong Cheng; Wei Wang; Lu Jiang; Wolfgang Macherey; |
168 | Exact Optimization of Conformal Predictors Via Incremental and Decremental Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we show that it is possible to speed up a CP classifier considerably, by studying it in conjunction with the underlying ML method, and by exploiting incremental and decremental learning. |
Giovanni Cherubin; Konstantinos Chatzikokolakis; Martin Jaggi; |
169 | Problem Dependent View on Structured Thresholding Bandit Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the problem dependent regime in the stochastic Thresholding Bandit problem (TBP) under several shape constraints. |
James Cheshire; Pierre Menard; Alexandra Carpentier; |
170 | Online Optimization in Games Via Control Theory: Connecting Regret, Passivity and Poincaré Recurrence Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel control-theoretic understanding of online optimization and learning in games, via the notion of passivity. |
Yun Kuen Cheung; Georgios Piliouras; |
171 | Understanding and Mitigating Accuracy Disparity in Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the accuracy disparity problem in regression. |
Jianfeng Chi; Yuan Tian; Geoffrey J. Gordon; Han Zhao; |
172 | Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the problem of differentially private (DP) matrix completion under user-level privacy. |
Steve Chien; Prateek Jain; Walid Krichene; Steffen Rendle; Shuang Song; Abhradeep Thakurta; Li Zhang; |
173 | Light RUMs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we consider the question of the (lossy) compressibility of RUMs on a universe of size $n$, i.e., the minimum number of bits required to approximate the winning probabilities of each slate. |
Flavio Chierichetti; Ravi Kumar; Andrew Tomkins; |
174 | Parallelizing Legendre Memory Unit Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we leverage the linear time-invariant (LTI) memory component of the LMU to construct a simplified variant that can be parallelized during training (and yet executed as an RNN during inference), resulting in up to 200 times faster training. |
Narsimha Reddy Chilkuri; Chris Eliasmith; |
175 | Quantifying and Reducing Bias in Maximum Likelihood Estimation of Structured Anomalies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we demonstrate that in the normal means setting, the bias of the MLE depends on the size of the anomaly family. |
Uthsav Chitra; Kimberly Ding; Jasper C.H. Lee; Benjamin J Raphael; |
176 | Robust Learning-Augmented Caching: An Experimental Study Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that a straightforward method – blindly following either a predictor or a classical robust algorithm, and switching whenever one becomes worse than the other – has only a low overhead over a well-performing predictor, while competing with classical methods when the coupled predictor fails, thus providing a cheap worst-case insurance. |
Jakub Chledowski; Adam Polak; Bartosz Szabucki; Konrad Tomasz Zolna; |
177 | Unifying Vision-and-Language Tasks Via Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To alleviate these hassles, in this work, we propose a unified framework that learns different tasks in a single architecture with the same language modeling objective, i.e., multimodal conditional text generation, where our models learn to generate labels in text based on the visual and textual inputs. |
Jaemin Cho; Jie Lei; Hao Tan; Mohit Bansal; |
178 | Learning from Nested Data with Ornstein Auto-Encoders Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: After identifying several issues with RIOAE, we present the product-space OAE (PSOAE) that minimizes a tighter upper bound of the distance and achieves orthogonality in the representation space. |
Youngwon Choi; Sungdong Lee; Joong-Ho Won; |
179 | Variational Empowerment As Representation Learning for Goal-Conditioned Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we discuss how these two approaches, goal-conditioned RL (GCRL) and MI-based RL, can be generalized into a single family of methods, interpreting mutual information maximization and variational empowerment as representation learning methods that acquire functionally aware state representations for goal reaching. |
Jongwook Choi; Archit Sharma; Honglak Lee; Sergey Levine; Shixiang Shane Gu; |
180 | Label-Only Membership Inference Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Whereas current attack methods all require access to the model’s predicted confidence score, we introduce a label-only attack that instead evaluates the robustness of the model’s predicted (hard) labels under perturbations of the input, to infer membership. |
Christopher A. Choquette-Choo; Florian Tramer; Nicholas Carlini; Nicolas Papernot; |
181 | Modeling Hierarchical Structures with Continuous Recursive Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose Continuous Recursive Neural Network (CRvNN) as a backpropagation-friendly alternative to address the aforementioned limitations. |
Jishnu Ray Chowdhury; Cornelia Caragea; |
182 | Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel method to automatically identify agents which may benefit from sharing parameters by partitioning them based on their abilities and goals. |
Filippos Christianos; Georgios Papoudakis; Muhammad A Rahman; Stefano V Albrecht; |
183 | Beyond Variance Reduction: Understanding The True Impact of Baselines on Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we demonstrate that the standard view is too limited for bandit and RL problems. |
Wesley Chung; Valentin Thomas; Marlos C. Machado; Nicolas Le Roux; |
184 | First-Order Methods for Wasserstein Distributionally Robust MDP Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a framework for solving Distributionally robust MDPs via first-order methods, and instantiate it for several types of Wasserstein ambiguity sets. |
Julien Grand-Clément; Christian Kroer; |
185 | Phasic Policy Gradient Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce Phasic Policy Gradient (PPG), a reinforcement learning framework which modifies traditional on-policy actor-critic methods by separating policy and value function training into distinct phases. |
Karl W Cobbe; Jacob Hilton; Oleg Klimov; John Schulman; |
186 | Riemannian Convex Potential Maps Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose and study a class of flows that uses convex potentials from Riemannian optimal transport. |
Samuel Cohen; Brandon Amos; Yaron Lipman; |
187 | Scaling Properties of Deep Residual Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. |
Alain-Sam Cohen; Rama Cont; Alain Rossier; Renyuan Xu; |
188 | Differentially-Private Clustering of Easy Instances Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we aim at providing simple, implementable differentially private clustering algorithms when the data is "easy," e.g., when there exists a significant separation between the clusters. |
Edith Cohen; Haim Kaplan; Yishay Mansour; Uri Stemmer; Eliad Tsfadia; |
189 | Improving Ultrametrics Embeddings Through Coresets Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We improve the above result and show how to sharpen the guarantee from $5c$ to $\sqrt{2}c+\epsilon$ while achieving the same asymptotic running time. |
Vincent Cohen-Addad; Rémi De Joannis De Verclos; Guillaume Lagarde; |
190 | Correlation Clustering in Constant Many Parallel Rounds Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we propose a massively parallel computation (MPC) algorithm for this problem that is considerably faster than prior work. |
Vincent Cohen-Addad; Silvio Lattanzi; Slobodan Mitrovic; Ashkan Norouzi-Fard; Nikos Parotsidis; Jakub Tarnawski; |
191 | Concentric Mixtures of Mallows Models for Top-$k$ Rankings: Sampling and Identifiability Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study mixtures of two Mallows models for top-$k$ rankings with equal location parameters but with different scale parameters (a mixture of concentric Mallows models). |
Fabien Collas; Ekhine Irurozki; |
192 | Exploiting Shared Representations for Personalized Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on this intuition, we propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client. |
Liam Collins; Hamed Hassani; Aryan Mokhtari; Sanjay Shakkottai; |
193 | Differentiable Particle Filtering Via Entropy-Regularized Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: By leveraging optimal transport ideas, we introduce a principled differentiable particle filter and provide convergence results. |
Adrien Corenflos; James Thornton; George Deligiannidis; Arnaud Doucet; |
194 | Fairness and Bias in Online Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We address the issues of fairness and bias in online selection by introducing multi-color versions of the classic secretary and prophet problem. |
Jose Correa; Andres Cristi; Paul Duetting; Ashkan Norouzi-Fard; |
195 | Relative Deviation Margin Bounds Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a series of new and more favorable margin-based learning guarantees that depend on the empirical margin loss of a predictor. |
Corinna Cortes; Mehryar Mohri; Ananda Theertha Suresh; |
196 | A Discriminative Technique for Multiple-Source Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new discriminative technique for the multiple-source adaptation (MSA) problem. |
Corinna Cortes; Mehryar Mohri; Ananda Theertha Suresh; Ningshan Zhang; |
197 | Characterizing Fairness Over The Set of Good Models Under Selective Labels Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance, or “the set of good models.” |
Amanda Coston; Ashesh Rambachan; Alexandra Chouldechova; |
198 | Two-way Kernel Matrix Puncturing: Towards Resource-efficient PCA and Spectral Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The article introduces an elementary cost and storage reduction method for spectral clustering and principal component analysis. |
Romain Couillet; Florent Chatelain; Nicolas Le Bihan; |
199 | Explaining Time Series Predictions with Dynamic Masks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these challenges, we propose dynamic masks (Dynamask). |
Jonathan Crabbé; Mihaela Van Der Schaar; |
200 | Generalised Lipschitz Regularisation Equals Distributional Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In response, we have been able to significantly sharpen existing results regarding the relationship between distributional robustness and regularisation, when defined with a transportation cost uncertainty set. |
Zac Cranko; Zhan Shi; Xinhua Zhang; Richard Nock; Simon Kornblith; |
201 | Environment Inference for Invariant Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose EIIL, a general framework for domain-invariant learning that incorporates Environment Inference to directly infer partitions that are maximally informative for downstream Invariant Learning. |
Elliot Creager; Joern-Henrik Jacobsen; Richard Zemel; |
202 | Mind The Box: $l_1$-APGD for Sparse Adversarial Attacks on Image Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that when taking into account also the image domain $[0,1]^d$, established $l_1$-projected gradient descent (PGD) attacks are suboptimal as they do not consider that the effective threat model is the intersection of the $l_1$-ball and $[0,1]^d$. |
Francesco Croce; Matthias Hein; |
203 | Parameterless Transductive Feature Re-representation for Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a parameterless transductive feature re-representation framework that differs from all existing solutions from the following perspectives. |
Wentao Cui; Yuhong Guo; |
204 | Randomized Algorithms for Submodular Function Maximization with A $k$-System Constraint Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the problem of non-negative submodular function maximization subject to a $k$-system constraint, which generalizes many other important constraints in submodular optimization such as cardinality constraint, matroid constraint, and $k$-extendible system constraint. |
Shuang Cui; Kai Han; Tianshuai Zhu; Jing Tang; Benwei Wu; He Huang; |
205 | GBHT: Gradient Boosting Histogram Transform for Density Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a density estimation algorithm called Gradient Boosting Histogram Transform (GBHT), where we adopt the Negative Log Likelihood as the loss function to make the boosting procedure available for unsupervised tasks. |
Jingyi Cui; Hanyuan Hang; Yisen Wang; Zhouchen Lin; |
206 | ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose ProGraML – Program Graphs for Machine Learning – a language-independent, portable representation of program semantics. |
Chris Cummins; Zacharias V. Fisches; Tal Ben-Nun; Torsten Hoefler; Michael F P O’Boyle; Hugh Leather; |
207 | Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem while attaining near-optimal sample complexity guarantees. |
Sebastian Curi; Ilija Bogunovic; Andreas Krause; |
208 | Quantifying Availability and Discovery in Recommender Systems Via Stochastic Reachability Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we consider how preference models in interactive recommendation systems determine the availability of content and users’ opportunities for discovery. |
Mihaela Curmei; Sarah Dean; Benjamin Recht; |
209 | Dynamic Balancing for Model Selection in Bandits and RL Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a framework for model selection by combining base algorithms in stochastic bandits and reinforcement learning. |
Ashok Cutkosky; Christoph Dann; Abhimanyu Das; Claudio Gentile; Aldo Pacchiano; Manish Purohit; |
210 | ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we introduce gated positional self-attention (GPSA), a form of positional self-attention which can be equipped with a “soft” convolutional inductive bias. |
Stéphane D’Ascoli; Hugo Touvron; Matthew L Leavitt; Ari S Morcos; Giulio Biroli; Levent Sagun; |
211 | Consistent Regression When Oblivious Outliers Overwhelm Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider a robust linear regression model $y=X\beta^* + \eta$, where an adversary oblivious to the design $X\in \mathbb{R}^{n\times d}$ may choose $\eta$ to corrupt all but an $\alpha$ fraction of the observations $y$ in an arbitrary way. |
Tommaso D’Orsi; Gleb Novikov; David Steurer; |
212 | Offline Reinforcement Learning with Pseudometric Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose an iterative procedure to learn a pseudometric (closely related to bisimulation metrics) from logged transitions, and use it to define this notion of closeness. |
Robert Dadashi; Shideh Rezaeifar; Nino Vieillard; Léonard Hussenot; Olivier Pietquin; Matthieu Geist; |
213 | A Tale of Two Efficient and Informative Negative Sampling Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show two classes of distributions where the sampling scheme is truly adaptive and provably generates negative samples in near-constant time. |
Shabnam Daghaghi; Tharun Medini; Nicholas Meisburger; Beidi Chen; Mengnan Zhao; Anshumali Shrivastava; |
214 | SiameseXML: Siamese Networks Meet Extreme Classifiers with 100M Labels Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these, this paper develops the SiameseXML framework based on a novel probabilistic model that naturally motivates a modular approach melding Siamese architectures with high-capacity extreme classifiers, and a training pipeline that effortlessly scales to tasks with 100 million labels. |
Kunal Dahiya; Ananye Agarwal; Deepak Saini; Gururaj K; Jian Jiao; Amit Singh; Sumeet Agarwal; Purushottam Kar; Manik Varma; |
215 | Fixed-Parameter and Approximation Algorithms for PCA with Outliers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study this problem from the perspective of parameterized complexity by investigating how parameters like the dimension of the data, the subspace dimension, the number of outliers and their structure, and approximation error, influence the computational complexity of the problem. |
Yogesh Dahiya; Fedor Fomin; Fahad Panolan; Kirill Simonov; |
216 | Sliced Iterative Normalizing Flows Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop an iterative (greedy) deep learning (DL) algorithm which is able to transform an arbitrary probability distribution function (PDF) into the target PDF. |
Biwei Dai; Uros Seljak; |
217 | Convex Regularization in Monte-Carlo Tree Search Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we overcome these limitations by introducing the use of convex regularization in Monte-Carlo Tree Search (MCTS) to drive exploration efficiently and to improve policy updates. |
Tuan Q Dam; Carlo D’Eramo; Jan Peters; Joni Pajarinen; |
218 | Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel approach to learning few-shot-imitation agents that we call demonstration-conditioned reinforcement learning (DCRL). |
Christopher R. Dance; Julien Perez; Théo Cachet; |
219 | Re-understanding Finite-State Representations of Recurrent Policy Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce an approach for understanding control policies represented as recurrent neural networks. |
Mohamad H Danesh; Anurag Koul; Alan Fern; Saeed Khorram; |
220 | Newton Method Over Networks Is Fast Up to The Statistical Precision Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a distributed cubic regularization of the Newton method for solving (constrained) empirical risk minimization problems over a network of agents, modeled as an undirected graph. |
Amir Daneshmand; Gesualdo Scutari; Pavel Dvurechensky; Alexander Gasnikov; |
221 | BasisDeVAE: Interpretable Simultaneous Dimensionality Reduction and Feature-Level Clustering with Derivative-Based Variational Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present DeVAE, a novel VAE-based model with a derivative-based forward mapping, allowing for greater control over decoder behaviour via specification of the decoder function in derivative space. |
Dominic Danks; Christopher Yau; |
222 | Intermediate Layer Optimization for Inverse Problems Using Deep Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Intermediate Layer Optimization (ILO), a novel optimization algorithm for solving inverse problems with deep generative models. |
Giannis Daras; Joseph Dean; Ajil Jalal; Alex Dimakis; |
223 | Measuring Robustness in Deep Learning Based Compressive Sensing Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In order to understand the sensitivity to such perturbations, in this work, we measure the robustness of different approaches for image reconstruction including trained and un-trained neural networks as well as traditional sparsity-based methods. |
Mohammad Zalbagi Darestani; Akshay S Chaudhari; Reinhard Heckel; |
224 | SAINT-ACC: Safety-Aware Intelligent Adaptive Cruise Control for Autonomous Vehicles Using Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel adaptive cruise control (ACC) system, SAINT-ACC (Safety-Aware Intelligent ACC), that is designed to achieve simultaneous optimization of traffic efficiency, driving safety, and driving comfort through dynamic adaptation of the inter-vehicle gap based on deep reinforcement learning (RL). |
Lokesh Chandra Das; Myounggyu Won; |
225 | Lipschitz Normalization for Self-attention Layers with Application to Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we show that enforcing Lipschitz continuity by normalizing the attention scores can significantly improve the performance of deep attention models. |
George Dasoulas; Kevin Scaman; Aladin Virmaux; |
226 | Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In retrospect, we explore classical Householder transformation as a candidate for sketching and accurately solving LMS problems. |
Jyotikrishna Dass; Rabi Mahapatra; |
227 | Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We provide convergence analyses for both strongly-convex and non-convex smooth objectives in the heterogeneous data setting. |
Deepesh Data; Suhas Diggavi; |
228 | Catformer: Designing Stable Transformers Via Sensitivity Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we improve upon recent analysis of Transformers and formalize a notion of sensitivity to capture the difficulty of training. |
Jared Q Davis; Albert Gu; Krzysztof Choromanski; Tri Dao; Christopher Re; Chelsea Finn; Percy Liang; |
229 | Diffusion Source Identification on Networks with Statistical Confidence Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a statistical framework for the study of this problem and develop a confidence set inference approach inspired by hypothesis testing. |
Quinlan E Dawkins; Tianxi Li; Haifeng Xu; |
230 | Bayesian Deep Learning Via Subnetwork Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. |
Erik Daxberger; Eric Nalisnick; James U Allingham; Javier Antoran; Jose Miguel Hernandez-Lobato; |
231 | Adversarial Robustness Guarantees for Random Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We explore the properties of adversarial examples for deep neural networks with random weights and biases, and prove that for any p$\geq$1, the \ell^p distance of any given input from the classification boundary scales as one over the square root of the dimension of the input times the \ell^p norm of the input. |
Giacomo De Palma; Bobak Kiani; Seth Lloyd; |
232 | High-Dimensional Gaussian Process Inference with Derivatives Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that in the \emph{low-data} regime $N < D$, the Gram matrix can be decomposed in a manner that reduces the cost of inference to $\mathcal{O}(N^2D + (N^2)^3)$ (i.e., linear in the number of dimensions) and, in special cases, to $\mathcal{O}(N^2D + N^3)$. |
Filip De Roos; Alexandra Gessner; Philipp Hennig; |
233 | Transfer-Based Semantic Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that a previously overlooked strategy for anomaly detection (AD) is to introduce an explicit inductive bias toward representations transferred over from some large and varied semantic task. |
Lucas Deecke; Lukas Ruff; Robert A. Vandermeulen; Hakan Bilen; |
234 | Grid-Functioned Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new neural network architecture that we call "grid-functioned" neural networks. |
Javier Dehesa; Andrew Vidler; Julian Padget; Christof Lutteroth; |
235 | Multidimensional Scaling: Approximation and Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we prove that minimizing the Kamada-Kawai objective is NP-hard and give a provable approximation algorithm for optimizing it, which in particular is a PTAS on low-diameter graphs. |
Erik Demaine; Adam Hesterberg; Frederic Koehler; Jayson Lynch; John Urschel; |
236 | What Does Rotation Prediction Tell Us About Classifier Accuracy Under Varying Testing Environments? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we train semantic classification and rotation prediction in a multi-task way. |
Weijian Deng; Stephen Gould; Liang Zheng; |
237 | Toward Better Generalization Bounds with Locally Elastic Stability Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Given that, we propose \emph{locally elastic stability} as a weaker and distribution-dependent stability notion, which still yields exponential generalization bounds. |
Zhun Deng; Hangfeng He; Weijie Su; |
238 | Revenue-Incentive Tradeoffs in Dynamic Reserve Pricing Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study how to set reserves to boost revenue based on the historical bids of strategic buyers, while controlling the impact of such a policy on the incentive compatibility of the repeated auctions. |
Yuan Deng; Sebastien Lahaie; Vahab Mirrokni; Song Zuo; |
239 | Heterogeneity for The Win: One-Shot Federated Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we explore the unique challenges—and opportunities—of unsupervised federated learning (FL). |
Don Kurian Dennis; Tian Li; Virginia Smith; |
240 | Kernel Continual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces kernel continual learning, a simple but effective variant of continual learning that leverages the non-parametric nature of kernel methods to tackle catastrophic forgetting. |
Mohammad Mahdi Derakhshani; Xiantong Zhen; Ling Shao; Cees Snoek; |
241 | Bayesian Optimization Over Hybrid Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel approach referred as Hybrid Bayesian Optimization (HyBO) by utilizing diffusion kernels, which are naturally defined over continuous and discrete variables. |
Aryan Deshwal; Syrine Belakaria; Janardhan Rao Doppa; |
242 | Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We address these limitations through a novel automated Navigation Turing Test (ANTT) that learns to predict human judgments of human-likeness. |
Sam Devlin; Raluca Georgescu; Ida Momennejad; Jaroslaw Rzepecki; Evelyn Zuniga; Gavin Costello; Guy Leroy; Ali Shaw; Katja Hofmann; |
243 | Versatile Verification of Tree Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces a generic algorithm called Veritas that enables tackling multiple different verification tasks for tree ensemble models like random forests (RFs) and gradient boosted decision trees (GBDTs). |
Laurens Devos; Wannes Meert; Jesse Davis; |
244 | On The Inherent Regularization Effects of Noise Injection During Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a theoretical study of one particular way of random perturbation, which corresponds to injecting artificial noise to the training data. |
Oussama Dhifallah; Yue Lu; |
245 | Hierarchical Agglomerative Graph Clustering in Nearly-Linear Time Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the widely-used hierarchical agglomerative clustering (HAC) algorithm on edge-weighted graphs. |
Laxman Dhulipala; David Eisenstat; Jakub Lacki; Vahab Mirrokni; Jessica Shi; |
246 | Learning Online Algorithms with Distributional Advice Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the problem of designing online algorithms given advice about the input. |
Ilias Diakonikolas; Vasilis Kontonis; Christos Tzamos; Ali Vakilian; Nikos Zarifis; |
247 | A Wasserstein Minimax Framework for Mixed Linear Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an optimal transport-based framework for MLR problems, Wasserstein Mixed Linear Regression (WMLR), which minimizes the Wasserstein distance between the learned and target mixture regression models. |
Theo Diamandis; Yonina Eldar; Alireza Fallah; Farzan Farnia; Asuman Ozdaglar; |
248 | Context-Aware Online Collective Inference for Templated Graphical Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we examine online collective inference, the problem of maintaining and performing inference over a sequence of evolving graphical models. |
Charles Dickens; Connor Pryor; Eriq Augustine; Alexander Miller; Lise Getoor; |
249 | ARMS: Antithetic-REINFORCE-Multi-Sample Gradient for Binary Variables Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To better utilize more than two samples, we propose ARMS, an Antithetic REINFORCE-based Multi-Sample gradient estimator. |
Aleksandar Dimitriev; Mingyuan Zhou; |
250 | XOR-CD: Linearly Convergent Constrained Structure Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose XOR-Contrastive Divergence learning (XOR-CD), a provable approach for constrained structure generation, which remains difficult for state-of-the-art neural network and constraint reasoning approaches. |
Fan Ding; Jianzhu Ma; Jinbo Xu; Yexiang Xue; |
251 | Dual Principal Component Pursuit for Robust Subspace Learning: Theory and Algorithms for A Holistic Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider a DPCP approach for simultaneously computing the entire basis of the orthogonal complement subspace (we call this a holistic approach) by solving a non-convex non-smooth optimization problem over the Grassmannian. |
Tianyu Ding; Zhihui Zhu; Rene Vidal; Daniel P Robinson; |
252 | Coded-InvNet for Resilient Prediction Serving Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by a new coded computation algorithm for invertible functions, we propose Coded-InvNet a new approach to design resilient prediction serving systems that can gracefully handle stragglers or node failures. |
Tuan Dinh; Kangwook Lee; |
253 | Estimation and Quantization of Expected Persistence Diagrams Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this article, we study two such summaries, the Expected Persistence Diagram (EPD), and its quantization. |
Vincent Divol; Theo Lacombe; |
254 | On Energy-Based Models with Overparametrized Shallow Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Building from the incipient theory of overparametrized neural networks, we show that models trained in the so-called 'active' regime provide a statistical advantage over their associated 'lazy' or kernel regime, leading to improved adaptivity to hidden low-dimensional structure in the data distribution, as already observed in supervised learning. |
Carles Domingo-Enrich; Alberto Bietti; Eric Vanden-Eijnden; Joan Bruna; |
255 | Kernel-Based Reinforcement Learning: A Finite-Time Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce Kernel-UCBVI, a model-based optimistic algorithm that leverages the smoothness of the MDP and a non-parametric kernel estimator of the rewards and transitions to efficiently balance exploration and exploitation. |
Omar Darwiche Domingues; Pierre Menard; Matteo Pirotta; Emilie Kaufmann; Michal Valko; |
256 | Attention Is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work proposes a new way to understand self-attention networks: we show that their output can be decomposed into a sum of smaller terms—or paths—each involving the operation of a sequence of attention heads across layers. |
Yihe Dong; Jean-Baptiste Cordonnier; Andreas Loukas; |
257 | How Rotational Invariance of Common Kernels Prevents Generalization in High Dimensions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that in high dimensions, the rotational invariance property of commonly studied kernels (such as RBF, inner product kernels and fully-connected NTK of any depth) leads to inconsistent estimation unless the ground truth is a low-degree polynomial. |
Konstantin Donhauser; Mingqi Wu; Fanny Yang; |
258 | Fast Stochastic Bregman Gradient Methods: Sharp Analysis and Variance Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the problem of minimizing a relatively-smooth convex function using stochastic Bregman gradient methods. |
Radu Alexandru Dragomir; Mathieu Even; Hadrien Hendrikx; |
259 | Bilinear Classes: A Structural Framework for Provable Generalization in RL Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work introduces Bilinear Classes, a new structural framework, which permits generalization in reinforcement learning in a wide variety of settings through the use of function approximation. |
Simon Du; Sham Kakade; Jason Lee; Shachar Lovett; Gaurav Mahajan; Wen Sun; Ruosong Wang; |
260 | Improved Contrastive Divergence Training of Energy-Based Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an adaptation to improve contrastive divergence training by scrutinizing a gradient term that is difficult to calculate and is often left out for convenience. |
Yilun Du; Shuang Li; Joshua Tenenbaum; Igor Mordatch; |
261 | Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new training objective named order-agnostic cross entropy (OaXE) for fully non-autoregressive translation (NAT) models. |
Cunxiao Du; Zhaopeng Tu; Jing Jiang; |
262 | Putting The Learning Into Learning-Augmented Algorithms for Frequency Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Learning here is used to predict heavy hitters from a data stream, which are counted explicitly outside the sketch. |
Elbert Du; Franklyn Wang; Michael Mitzenmacher; |
263 | Estimating $\alpha$-Rank from A Few Entries with Low Rank Matrix Completion Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we aim to reduce the number of pairwise comparisons in recovering a satisfying ranking for $n$ strategies in two-player meta-games, by exploring the fact that agents with similar skills may achieve similar payoffs against others. |
Yali Du; Xue Yan; Xu Chen; Jun Wang; Haifeng Zhang; |
264 | Learning Diverse-Structured Networks for Adversarial Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we argue that NA and AT cannot be handled independently, since given a dataset, the optimal NA in ST would be no longer optimal in AT. |
Xuefeng Du; Jingfeng Zhang; Bo Han; Tongliang Liu; Yu Rong; Gang Niu; Junzhou Huang; Masashi Sugiyama; |
265 | Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper considers batch Reinforcement Learning (RL) with general value function approximation. |
Yaqi Duan; Chi Jin; Zhiyuan Li; |
266 | Sawtooth Factorial Topic Embeddings Guided Gamma Belief Network Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To relax this assumption, we propose sawtooth factorial topic embedding guided GBN, a deep generative model of documents that captures the dependencies and semantic similarities between the topics in the embedding space. |
Zhibin Duan; Dongsheng Wang; Bo Chen; Chaojie Wang; Wenchao Chen; Yewen Li; Jie Ren; Mingyuan Zhou; |
267 | Exponential Reduction in Sample Complexity with Learning of Ising Model Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the problem of reconstructing binary graphical models from correlated samples produced by a dynamical process, which is natural in many applications. |
Arkopal Dutt; Andrey Lokhov; Marc D Vuffray; Sidhant Misra; |
268 | Reinforcement Learning Under Moral Uncertainty Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper translates such insights to the field of reinforcement learning, proposes two training methods that realize different points among competing desiderata, and trains agents in simple environments to act under moral uncertainty. |
Adrien Ecoffet; Joel Lehman; |
269 | Confidence-Budget Matching for Sequential Budgeted Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we formalize decision-making problems with querying budget, where there is a (possibly time-dependent) hard limit on the number of reward queries allowed. |
Yonathan Efroni; Nadav Merlis; Aadirupa Saha; Shie Mannor; |
270 | Self-Paced Context Evaluation for Contextual Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To improve sample efficiency for learning on such instances of a problem domain, we present Self-Paced Context Evaluation (SPaCE). |
Theresa Eimer; André Biedenkapp; Frank Hutter; Marius Lindauer; |
271 | Provably Strict Generalisation Benefit for Equivariant Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: By considering the simplest case of linear models, this paper provides the first provably non-zero improvement in generalisation for invariant/equivariant models when the target distribution is invariant/equivariant with respect to a compact group. |
Bryn Elesedy; Sheheryar Zaidi; |
272 | Efficient Iterative Amortized Inference for Learning Symmetric and Disentangled Multi-Object Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce EfficientMORL, an efficient framework for the unsupervised learning of object-centric representations. |
Patrick Emami; Pan He; Sanjay Ranka; Anand Rangarajan; |
273 | Implicit Bias of Linear RNNs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, RNNs’ poor ability to capture long-term dependencies has not been fully understood. This paper provides a rigorous explanation of this property in the special case of linear RNNs. |
Melikasadat Emami; Mojtaba Sahraee-Ardakan; Parthe Pandit; Sundeep Rangan; Alyson K Fletcher; |
274 | Global Optimality Beyond Two Layers: Training Deep ReLU Networks Via Convex Programs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we develop a novel unified framework to reveal a hidden regularization mechanism through the lens of convex optimization. |
Tolga Ergen; Mert Pilanci; |
275 | Revealing The Structure of Deep Neural Networks Via Convex Duality Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of the hidden layers. |
Tolga Ergen; Mert Pilanci; |
276 | Whitening for Self-Supervised Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a different direction and a new loss function for SSL, which is based on the whitening of the latent-space features. |
Aleksandr Ermolov; Aliaksandr Siarohin; Enver Sangineto; Nicu Sebe; |
277 | Graph Mixture Density Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce the Graph Mixture Density Networks, a new family of machine learning models that can fit multimodal output distributions conditioned on graphs of arbitrary topology. |
Federico Errica; Davide Bacciu; Alessio Micheli; |
278 | Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by ideas from continual learning, we propose Cross-Gradient Aggregation (CGA), a novel decentralized learning algorithm where (i) each agent aggregates cross-gradient information, i.e., derivatives of its model with respect to its neighbors’ datasets, and (ii) updates its model using a projected gradient based on quadratic programming (QP). |
Yasaman Esfandiari; Sin Yong Tan; Zhanhong Jiang; Aditya Balu; Ethan Herron; Chinmay Hegde; Soumik Sarkar; |
279 | Weight-covariance Alignment for Adversarially Robust Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new SNN that achieves state-of-the-art performance without relying on adversarial training, and enjoys solid theoretical justification. |
Panagiotis Eustratiadis; Henry Gouk; Da Li; Timothy Hospedales; |
280 | Data Augmentation for Deep Learning Based Accelerated MRI Reconstruction with Limited Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by the success of Data Augmentation (DA) for classification problems, in this paper, we propose a pipeline for data augmentation for accelerated MRI reconstruction and study its effectiveness at reducing the required training data in a variety of settings. |
Zalan Fabian; Reinhard Heckel; Mahdi Soltanolkotabi; |
281 | Poisson-Randomised DirBN: Large Mutation Is Needed in Dirichlet Belief Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose Poisson-randomised Dirichlet Belief Networks (Pois-DirBN), which allows large mutations for the latent distributions across layers to enlarge the representation capability. |
Xuhui Fan; Bin Li; Yaqiong Li; Scott A. Sisson; |
282 | Model-based Reinforcement Learning for Continuous Control with Posterior Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study model-based posterior sampling for reinforcement learning (PSRL) in continuous state-action spaces theoretically and empirically. |
Ying Fan; Yifei Ming; |
283 | SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we consider robust policy learning which targets zero-shot generalization to unseen visual environments with large distributional shift. |
Linxi Fan; Guanzhi Wang; De-An Huang; Zhiding Yu; Li Fei-Fei; Yuke Zhu; Animashree Anandkumar; |
284 | On Estimation in Latent Variable Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider a gradient based method via using variance reduction technique to accelerate estimation procedure. |
Guanhua Fang; Ping Li; |
285 | On Variational Inference in Biclustering Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we develop a theory for the estimation of general biclustering models, where the data is assumed to follow certain statistical distribution with underlying biclustering structure. |
Guanhua Fang; Ping Li; |
286 | Learning Bounds for Open-Set Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we target a more challenging and realistic setting: open-set learning (OSL), where there exist test samples from the classes that are unseen during training. |
Zhen Fang; Jie Lu; Anjin Liu; Feng Liu; Guangquan Zhang; |
287 | Streaming Bayesian Deep Tensor Factorization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: More important, for highly expressive, deep factorization, we lack an effective approach to handle streaming data, which are ubiquitous in real-world applications. To address these issues, we propose SBTD, a Streaming Bayesian Deep Tensor factorization method. |
Shikai Fang; Zheng Wang; Zhimeng Pan; Ji Liu; Shandian Zhe; |
288 | PID Accelerated Value Iteration Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose modifications to VI in order to potentially accelerate its convergence behaviour. |
Amir-Massoud Farahmand; Mohammad Ghavamzadeh; |
289 | Near-Optimal Entrywise Anomaly Detection for Low-Rank Matrices with Sub-Exponential Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: So motivated, we propose a conceptually simple entrywise approach to anomaly detection in low-rank matrices. |
Vivek Farias; Andrew A Li; Tianyi Peng; |
290 | Connecting Optimal Ex-Ante Collusion in Teams to Extensive-Form Correlation: Faster Algorithms and Positive Complexity Results Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We focus on the problem of finding an optimal strategy for a team of players that faces an opponent in an imperfect-information zero-sum extensive-form game. |
Gabriele Farina; Andrea Celli; Nicola Gatti; Tuomas Sandholm; |
291 | Train Simultaneously, Generalize Better: Stability of Gradient-based Minimax Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that the optimization algorithm also plays a key role in the generalization performance of the trained minimax model. |
Farzan Farnia; Asuman Ozdaglar; |
292 | Unbalanced Minibatch Optimal Transport; Applications to Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: As an alternative, we suggest that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behaviors. |
Kilian Fatras; Thibault Sejourne; Rémi Flamary; Nicolas Courty; |
293 | Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study function approximation for episodic reinforcement learning with entropic risk measure. |
Yingjie Fei; Zhuoran Yang; Zhaoran Wang; |
294 | Lossless Compression of Efficient Private Local Randomizers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we demonstrate a general approach that, under standard cryptographic assumptions, compresses every efficient LDP algorithm with negligible loss in privacy and utility guarantees. |
Vitaly Feldman; Kunal Talwar; |
295 | Dimensionality Reduction for The Sum-of-Distances Metric Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We give a dimensionality reduction procedure to approximate the sum of distances of a given set of n points in Rd to any shape that lies in a k-dimensional subspace. |
Zhili Feng; Praneeth Kacham; David Woodruff; |
296 | Reserve Price Optimization for First Price Auctions in Display Advertising Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a gradient-based algorithm to adaptively update and optimize reserve prices based on estimates of bidders’ responsiveness to experimental shocks in reserves. |
Zhe Feng; Sebastien Lahaie; Jon Schneider; Jinchao Ye; |
297 | Uncertainty Principles of Encoding GANs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study this predicament of encoding GANs, which is indispensable research for the GAN community. |
Ruili Feng; Zhouchen Lin; Jiapeng Zhu; Deli Zhao; Jingren Zhou; Zheng-Jun Zha; |
298 | Pointwise Binary Classification with Pairwise Confidence Comparisons Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Thus, in this paper, we propose a novel setting called pairwise comparison (Pcomp) classification, where we have only pairs of unlabeled data that we know one is more likely to be positive than the other. |
Lei Feng; Senlin Shu; Nan Lu; Bo Han; Miao Xu; Gang Niu; Bo An; Masashi Sugiyama; |
299 | Provably Correct Optimization and Exploration with Non-linear Policies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address this question by designing ENIAC, an actor-critic method that allows non-linear function approximation in the critic. |
Fei Feng; Wotao Yin; Alekh Agarwal; Lin Yang; |
300 | KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation Via Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address the above problems, we propose a privacy-preserving UMDA paradigm named Knowledge Distillation based Decentralized Domain Adaptation (KD3A), which performs domain adaptation through the knowledge distillation on models from different source domains. |
Haozhe Feng; Zhaoyang You; Minghao Chen; Tianye Zhang; Minfeng Zhu; Fei Wu; Chao Wu; Wei Chen; |
301 | Understanding Noise Injection in GANs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a geometric framework to theoretically analyze the role of noise injection in GANs. |
Ruili Feng; Deli Zhao; Zheng-Jun Zha; |
302 | GNNAutoScale: Scalable and Expressive Graph Neural Networks Via Historical Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present GNNAutoScale (GAS), a framework for scaling arbitrary message-passing GNNs to large graphs. |
Matthias Fey; Jan E. Lenssen; Frank Weichert; Jure Leskovec; |
303 | PsiPhi-Learning: Reinforcement Learning with Demonstrations Using Successor Features and Inverse Temporal Difference Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a multi-task inverse reinforcement learning (IRL) algorithm, called \emph{inverse temporal difference learning} (ITD), that learns shared state features, alongside per-agent successor features and preference vectors, purely from demonstrations without reward labels. |
Angelos Filos; Clare Lyle; Yarin Gal; Sergey Levine; Natasha Jaques; Gregory Farquhar; |
304 | A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we provide a completely general algorithm for solving for the equivariant layers of matrix groups. |
Marc Finzi; Max Welling; Andrew Gordon Wilson; |
305 | Few-Shot Conformal Prediction with Auxiliary Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we obtain substantially tighter prediction sets while maintaining desirable marginal guarantees by casting conformal prediction as a meta-learning paradigm over exchangeable collections of auxiliary tasks. |
Adam Fisch; Tal Schuster; Tommi Jaakkola; Regina Barzilay; |
306 | Scalable Certified Segmentation Via Randomized Smoothing Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new certification method for image and point cloud segmentation based on randomized smoothing. |
Marc Fischer; Maximilian Baader; Martin Vechev; |
307 | What’s in The Box? Exploring The Inner Life of Neural Networks with Robust Rules Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel method for exploring how neurons within neural networks interact. |
Jonas Fischer; Anna Olah; Jilles Vreeken; |
308 | Online Learning with Optimism and Delay Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by the demands of real-time climate and weather forecasting, we develop optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delayed feedback. |
Genevieve E Flaspohler; Francesco Orabona; Judah Cohen; Soukayna Mouatadid; Miruna Oprescu; Paulo Orenstein; Lester Mackey; |
309 | Online A-Optimal Design and Active Linear Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider in this paper the problem of optimal experiment design where a decision maker can choose which points to sample to obtain an estimate $\hat{\beta}$ of the hidden parameter $\beta^{\star}$ of an underlying linear model. |
Xavier Fontaine; Pierre Perrault; Michal Valko; Vianney Perchet; |
310 | Deep Adaptive Design: Amortizing Sequential Bayesian Experimental Design Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce Deep Adaptive Design (DAD), a method for amortizing the cost of adaptive Bayesian experimental design that allows experiments to be run in real-time. |
Adam Foster; Desi R Ivanova; Ilyas Malik; Tom Rainforth; |
311 | Efficient Online Learning for Dynamic K-Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study dynamic clustering problems from the perspective of online learning. |
Dimitris Fotakis; Georgios Piliouras; Stratis Skoulakis; |
312 | Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work addresses the problem of optimizing communications between server and clients in federated learning (FL). |
Yann Fraboni; Richard Vidal; Laetitia Kameni; Marco Lorenzi; |
313 | Agnostic Learning of Halfspaces with Gradient Descent Via Soft Margins Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that when a quantity we refer to as the \textit{soft margin} is well-behaved—a condition satisfied by log-concave isotropic distributions among others—minimizers of convex surrogates for the zero-one loss are approximate minimizers for the zero-one loss itself. |
Spencer Frei; Yuan Cao; Quanquan Gu; |
314 | Provable Generalization of SGD-trained Neural Networks of Any Width in The Presence of Adversarial Label Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To the best of our knowledge, this is the first work to show that overparameterized neural networks trained by SGD can generalize when the data is corrupted with adversarial label noise. |
Spencer Frei; Yuan Cao; Quanquan Gu; |
315 | Post-selection Inference with HSIC-Lasso Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a selective inference procedure using the so-called model-free "HSIC-Lasso" based on the framework of truncated Gaussians combined with the polyhedral lemma. |
Tobias Freidling; Benjamin Poignard; Héctor Climente-González; Makoto Yamada; |
316 | Variational Data Assimilation with A Learned Inverse Observation Operator Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We learn a mapping from observational data to physical states and show how it can be used to improve optimizability. |
Thomas Frerix; Dmitrii Kochkov; Jamie Smith; Daniel Cremers; Michael Brenner; Stephan Hoyer; |
317 | Bayesian Quadrature on Riemannian Data Manifolds Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To ease this computational burden, we advocate probabilistic numerical methods for Riemannian statistics. |
Christian Fröhlich; Alexandra Gessner; Philipp Hennig; Bernhard Schölkopf; Georgios Arvanitidis; |
318 | Learn-to-Share: A Hardware-friendly Transfer Learning Framework Exploiting Computation and Parameter Sharing Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose LeTS, a framework that leverages both computation and parameter sharing across multiple tasks. |
Cheng Fu; Hanxian Huang; Xinyun Chen; Yuandong Tian; Jishen Zhao; |
319 | Learning Task Informed Abstractions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To mitigate this problem, we propose learning Task Informed Abstractions (TIA) that explicitly separates reward-correlated visual features from distractors. |
Xiang Fu; Ge Yang; Pulkit Agrawal; Tommi Jaakkola; |
320 | Double-Win Quant: Aggressively Winning Robustness of Quantized Deep Neural Networks Via Random Precision Training and Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we demonstrate a new perspective regarding quantization’s role in DNNs’ robustness, advocating that quantization can be leveraged to largely boost DNNs’ robustness, and propose a framework dubbed Double-Win Quant that can boost the robustness of quantized DNNs over their full precision counterparts by a large margin. |
Yonggan Fu; Qixuan Yu; Meng Li; Vikas Chandra; Yingyan Lin; |
321 | Auto-NBA: Efficient and Effective Search Over The Joint Space of Networks, Bitwidths, and Accelerators Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To tackle these daunting challenges towards optimal and fast development of DNN accelerators, we propose a framework dubbed Auto-NBA to enable jointly searching for the Networks, Bitwidths, and Accelerators, by efficiently localizing the optimal design within the huge joint design space for each target dataset and acceleration specification. |
Yonggan Fu; Yongan Zhang; Yang Zhang; David Cox; Yingyan Lin; |
322 | A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with The Successor Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy. |
Scott Fujimoto; David Meger; Doina Precup; |
323 | Learning Disentangled Representations Via Product Manifold Projection Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel approach to disentangle the generative factors of variation underlying a given set of observations. |
Marco Fumero; Luca Cosmo; Simone Melzi; Emanuele Rodola; |
324 | Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose policy information capacity (PIC) – the mutual information between policy parameters and episodic return – and policy-optimal information capacity (POIC) – between policy parameters and episodic optimality – as two environment-agnostic, algorithm-agnostic quantitative metrics for task difficulty. |
Hiroki Furuta; Tatsuya Matsushima; Tadashi Kozuno; Yutaka Matsuo; Sergey Levine; Ofir Nachum; Shixiang Shane Gu; |
325 | An Information-Geometric Distance on The Space of Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop an algorithm to compute the distance which iteratively transports the marginal on the data of the source task to that of the target task while updating the weights of the classifier to track this evolving data distribution. |
Yansong Gao; Pratik Chaudhari; |
326 | Maximum Mean Discrepancy Test Is Aware of Adversarial Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Given this phenomenon, we raise a question: are natural and adversarial data really from different distributions? The answer is affirmative: previous uses of the MMD test for this purpose missed three key factors, and accordingly, we propose three components. |
Ruize Gao; Feng Liu; Jingfeng Zhang; Bo Han; Tongliang Liu; Gang Niu; Masashi Sugiyama; |
327 | Unsupervised Co-part Segmentation Through Assembly Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an unsupervised learning approach for co-part segmentation from images. |
Qingzhe Gao; Bin Wang; Libin Liu; Baoquan Chen; |
328 | Discriminative Complementary-Label Learning with Weighted Loss Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we derive a simple and theoretically-sound \emph{discriminative} model towards $P(\bar y\mid {\bm x})$, which naturally leads to a risk estimator with estimation error bound at $\mathcal{O}(1/\sqrt{n})$ convergence rate. |
Yi Gao; Min-Ling Zhang; |
329 | RATT: Leveraging Unlabeled Data to Guarantee Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we leverage unlabeled data to produce generalization bounds. |
Saurabh Garg; Sivaraman Balakrishnan; Zico Kolter; Zachary Lipton; |
330 | On Proximal Policy Optimization’s Heavy-tailed Gradients Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a detailed empirical study to characterize the heavy-tailed nature of the gradients of the PPO surrogate reward function. |
Saurabh Garg; Joshua Zhanson; Emilio Parisotto; Adarsh Prasad; Zico Kolter; Zachary Lipton; Sivaraman Balakrishnan; Ruslan Salakhutdinov; Pradeep Ravikumar; |
331 | What Does LIME Really See in Images? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: On the theoretical side, we show that when the number of generated examples is large, LIME explanations are concentrated around a limit explanation for which we give an explicit expression. |
Damien Garreau; Dina Mardaoui; |
332 | Parametric Graph for Unimodal Ranking Bandit Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an original algorithm, easy to implement and with strong theoretical guarantees to tackle this problem in the Position-Based Model (PBM) setting, well suited for applications where items are displayed on a grid. |
Camille-Sovanneary Gauthier; Romaric Gaudel; Elisa Fromont; Boammani Aser Lompo; |
333 | Let’s Agree to Degree: Comparing Graph Convolutional Networks in The Message-Passing Framework Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we cast neural networks defined on graphs as message-passing neural networks (MPNNs) to study the distinguishing power of different classes of such models. |
Floris Geerts; Filip Mazowiecki; Guillermo Perez; |
334 | On The Difficulty of Unbiased Alpha Divergence Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we study unbiased methods for alpha-divergence minimization through the Signal-to-Noise Ratio (SNR) of the gradient estimator. |
Tomas Geffner; Justin Domke; |
335 | How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe and analyze observational sampling from randomized controlled trials (OSRCT), a method for evaluating causal inference methods using data from randomized controlled trials (RCTs). |
Amanda M Gentzel; Purva Pruthi; David Jensen; |
336 | Strategic Classification in The Dark Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we generalize the strategic classification model to such scenarios and analyze the effect of an unknown classifier. |
Ganesh Ghalme; Vineet Nair; Itay Eilat; Inbal Talgam-Cohen; Nir Rosenfeld; |
337 | EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we closely investigate an important simplification of BCQ (Fujimoto et al., 2018) – a prior approach for offline RL – removing a heuristic design choice. |
Seyed Kamyar Seyed Ghasemipour; Dale Schuurmans; Shixiang Shane Gu; |
338 | Differentially Private Aggregation in The Shuffle Model: Almost Central Accuracy in Almost A Single Message Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study the problem of summing (aggregating) real numbers or integers, a basic primitive in numerous machine learning tasks, in the shuffle model. |
Badih Ghazi; Ravi Kumar; Pasin Manurangsi; Rasmus Pagh; Amer Sinha; |
339 | The Power of Adaptivity for Stochastic Submodular Cover Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We ask: how well can solutions with only a few adaptive rounds approximate fully-adaptive solutions? |
Rohan Ghuge; Anupam Gupta; Viswanath Nagarajan; |
340 | Differentially Private Quantiles Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we propose an instance of the exponential mechanism that simultaneously estimates exactly $m$ quantiles from $n$ data points while guaranteeing differential privacy. |
Jennifer Gillenwater; Matthew Joseph; Alex Kulesza; |
341 | Query Complexity of Adversarial Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: There are two main attack models considered in the adversarial robustness literature: black-box and white-box. We consider these threat models as two ends of a fine-grained spectrum, indexed by the number of queries the adversary can ask. |
Grzegorz Gluch; Rüdiger Urbanke; |
342 | Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We diverge from this view and show we can recover the performance of these developments not by changing the objective, but by regularising the value-function estimator. |
Florin Gogianu; Tudor Berariu; Mihaela C Rosca; Claudia Clopath; Lucian Busoniu; Razvan Pascanu; |
343 | 12-Lead ECG Reconstruction Via Koopman Operators Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present a methodology to reconstruct missing or noisy leads using the theory of Koopman Operators. |
Tomer Golany; Kira Radinsky; Daniel Freedman; Saar Minha; |
344 | Function Contrastive Learning of Transferable Meta-Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study the implications of this joint training on the transferability of the meta-representations. |
Muhammad Waleed Gondal; Shruti Joshi; Nasim Rahaman; Stefan Bauer; Manuel Wuthrich; Bernhard Schölkopf; |
345 | Active Slices for Sliced Stein Discrepancy Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: First, we show in theory that the requirement of using optimal slicing directions in the kernelized version of SSD can be relaxed, validating the resulting discrepancy with finite random slicing directions. Second, given that good slicing directions are crucial for practical performance, we propose a fast algorithm for finding good slicing directions based on ideas of active sub-space construction and spectral decomposition. |
Wenbo Gong; Kaibo Zhang; Yingzhen Li; Jose Miguel Hernandez-Lobato; |
346 | On The Problem of Underranking in Group-Fair Ranking Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we formulate the problem of underranking in group-fair rankings based on how close the group-fair rank of each item is to its original rank, and prove a lower bound on the trade-off achievable for simultaneous underranking and group fairness in ranking. |
Sruthi Gorantla; Amit Deshpande; Anand Louis; |
347 | MARINA: Faster Non-Convex Distributed Learning with Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop and analyze MARINA: a new communication efficient method for non-convex distributed learning over heterogeneous datasets. |
Eduard Gorbunov; Konstantin P. Burlachenko; Zhize Li; Peter Richtarik; |
348 | Systematic Analysis of Cluster Similarity Indices: How to Validate Validation Measures Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a theoretical framework to tackle this problem: we develop a list of desirable properties and conduct an extensive theoretical analysis to verify which indices satisfy them. |
Martijn M Gösgens; Alexey Tikhonov; Liudmila Prokhorenkova; |
349 | Revisiting Point Cloud Shape Classification with A Simple and Effective Baseline Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: First, we find that auxiliary factors like different evaluation schemes, data augmentation strategies, and loss functions, which are independent of the model architecture, make a large difference in performance. |
Ankit Goyal; Hei Law; Bowei Liu; Alejandro Newell; Jia Deng; |
350 | Dissecting Supervised Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we address the question whether there are fundamental differences in the sought-for representation geometry in the output space of the encoder at minimal loss. |
Florian Graf; Christoph Hofer; Marc Niethammer; Roland Kwitt; |
351 | Oops I Took A Gradient: Scalable Sampling for Discrete Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. |
Will Grathwohl; Kevin Swersky; Milad Hashemi; David Duvenaud; Chris Maddison; |
352 | Detecting Rewards Deterioration in Episodic Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address this problem by focusing directly on the rewards and testing for degradation. We present this problem as a multivariate mean-shift detection problem with possibly partial observations. |
Ido Greenberg; Shie Mannor; |
353 | Crystallization Learning with The Delaunay Triangulation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on the Delaunay triangulation, we propose the crystallization learning to estimate the conditional expectation function in the framework of nonparametric regression. |
Jiaqi Gu; Guosheng Yin; |
354 | AutoAttend: Automated Attention Representation Search Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we automate Key, Query and Value representation design, which is one of the most important steps to obtain effective self-attentions. |
Chaoyu Guan; Xin Wang; Wenwu Zhu; |
355 | Operationalizing Complex Causes: A Pragmatic View of Mediation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Given a collection of candidate mediators, we propose (a) a two-step method for predicting the causal responses of crude interventions; and (b) a testing procedure to identify mediators of crude interventions. |
Limor Gultchin; David Watson; Matt Kusner; Ricardo Silva; |
356 | On A Combination of Alternating Minimization and Nesterov’s Momentum Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we combine AM and Nesterov’s acceleration to propose an accelerated alternating minimization algorithm. |
Sergey Guminov; Pavel Dvurechensky; Nazarii Tupitsa; Alexander Gasnikov; |
357 | Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the global convergence and global optimality of the actor-critic algorithm applied for the zero-sum two-player stochastic games in a decentralized manner. |
Hongyi Guo; Zuyue Fu; Zhuoran Yang; Zhaoran Wang; |
358 | Adversarial Policy Learning in Two-player Competitive Games Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a new adversarial learning algorithm. |
Wenbo Guo; Xian Wu; Sui Huang; Xinyu Xing; |
359 | Soft Then Hard: Rethinking The Quantization in Neural Image Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We thus propose a novel soft-then-hard quantization strategy for neural image compression that first learns an expressive latent space softly, then closes the train-test mismatch with hard quantization. |
Zongyu Guo; Zhizheng Zhang; Runsen Feng; Zhibo Chen; |
360 | UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Specifically, we propose a novel MARL approach called Universal Value Exploration (UneVEn) that learns a set of related tasks simultaneously with a linear decomposition of universal successor features. |
Tarun Gupta; Anuj Mahajan; Bei Peng; Wendelin Boehmer; Shimon Whiteson; |
361 | Distribution-Free Calibration Guarantees for Histogram Binning Without Sample Splitting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We prove calibration guarantees for the popular histogram binning (also called uniform-mass binning) method of Zadrozny and Elkan (2001). |
Chirag Gupta; Aaditya Ramdas; |
362 | Correcting Exposure Bias for Link Recommendation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose estimators that leverage known exposure probabilities to mitigate this bias and consequent feedback loops. |
Shantanu Gupta; Hao Wang; Zachary Lipton; Yuyang Wang; |
363 | The Heavy-Tail Phenomenon in SGD Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we argue that these three seemingly unrelated perspectives for generalization are deeply linked to each other. |
Mert Gurbuzbalaban; Umut Simsekli; Lingjiong Zhu; |
364 | Knowledge Enhanced Machine Learning Pipeline Against Diverse Adversarial Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we aim to enhance the ML robustness from a different perspective by leveraging domain knowledge: We propose a Knowledge Enhanced Machine Learning Pipeline (KEMLP) to integrate domain knowledge (i.e., logic relationships among different predictions) into a probabilistic graphical model via first-order logic rules. |
Nezihe Merve Gürel; Xiangyu Qi; Luka Rimanic; Ce Zhang; Bo Li; |
365 | Adapting to Delays and Data in Adversarial Multi-Armed Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the adversarial multi-armed bandit problem under delayed feedback. |
Andras Gyorgy; Pooria Joulani; |
366 | Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we build upon and extend the recent results of (Xu & Raginsky, 2020) to analyze the MER in Bayesian learning and derive information-theoretic bounds on it. |
Hassan Hafez-Kolahi; Behrad Moniri; Shohreh Kasaei; Mahdieh Soleymani Baghshah; |
367 | Regret Minimization in Stochastic Non-Convex Learning Via A Proximal-Gradient Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: On that account, we propose a conceptual approach that leverages non-convex optimality measures, leading to a suitable generalization of the learner’s local regret. |
Nadav Hallak; Panayotis Mertikopoulos; Volkan Cevher; |
368 | Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, sample-aware policy entropy regularization is proposed to enhance the conventional policy entropy regularization for better exploration. |
Seungyul Han; Youngchul Sung; |
369 | Adversarial Combinatorial Bandits with General Non-linear Reward Functions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study the adversarial combinatorial bandit with a known non-linear reward function, extending existing work on adversarial linear combinatorial bandit. |
Yanjun Han; Yining Wang; Xi Chen; |
370 | A Collective Learning Framework to Boost GNN Expressiveness for Node Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we investigate this question and propose {\em collective learning} for GNNs —a general collective classification approach for node representation learning that increases their representation power. |
Mengyue Hang; Jennifer Neville; Bruno Ribeiro; |
371 | Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop a new model, EMMA (Entity Mapper with Multi-modal Attention) which uses an entity-conditioned attention module that allows for selective focus over relevant descriptions in the manual for each entity in the environment. |
Austin W. Hanjie; Victor Y Zhong; Karthik Narasimhan; |
372 | Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper provides a statistical analysis of high-dimensional batch reinforcement learning (RL) using sparse linear function approximation. |
Botao Hao; Yaqi Duan; Tor Lattimore; Csaba Szepesvari; Mengdi Wang; |
373 | Bootstrapping Fitted Q-Evaluation for Off-Policy Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the use of bootstrapping in off-policy evaluation (OPE), and in particular, we focus on the fitted Q-evaluation (FQE) that is known to be minimax-optimal in the tabular and linear-model cases. |
Botao Hao; Xiang Ji; Yaqi Duan; Hao Lu; Csaba Szepesvari; Mengdi Wang; |
374 | Compressed Maximum Likelihood Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by recent advances in estimating distribution functionals, we propose $\textit{compressed maximum likelihood}$ (CML) that applies ML to the compressed samples. |
Yi Hao; Alon Orlitsky; |
375 | Valid Causal Inference with (Some) Invalid Instruments Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show how to perform consistent IV estimation despite violations of the exclusion assumption. |
Jason S Hartford; Victor Veitch; Dhanya Sridhar; Kevin Leyton-Brown; |
376 | Model Performance Scaling with Multiple Data Sources Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that there is a simple scaling law that predicts the loss incurred by a model even under varying dataset composition. |
Tatsunori Hashimoto; |
377 | Hierarchical VAEs Know What They Don’t Know Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In the context of hierarchical variational autoencoders, we provide evidence to explain this behavior by out-of-distribution data having in-distribution low-level features. |
Jakob D. Drachmann Havtorn; Jes Frellsen; Soren Hauberg; Lars Maaløe; |
378 | Defense Against Backdoor Attacks Via Robust Covariance Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel defense algorithm using robust covariance estimation to amplify the spectral signature of corrupted data. |
Jonathan Hayase; Weihao Kong; Raghav Somani; Sewoong Oh; |
379 | Boosting for Online Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the decision-making framework of online convex optimization with a very large number of experts. |
Elad Hazan; Karan Singh; |
380 | PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose PipeTransformer, which leverages automated elastic pipelining for efficient distributed training of Transformer models. |
Chaoyang He; Shen Li; Mahdi Soltanolkotabi; Salman Avestimehr; |
381 | SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new framework SoundDet, which is an end-to-end trainable and light-weight framework, for polyphonic moving sound event detection and localization. |
Yuhang He; Niki Trigoni; Andrew Markham; |
382 | Logarithmic Regret for Reinforcement Learning with Linear Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that logarithmic regret is attainable under two recently proposed linear MDP assumptions provided that there exists a positive sub-optimality gap for the optimal action-value function. |
Jiafan He; Dongruo Zhou; Quanquan Gu; |
383 | Finding Relevant Information Via A Discrete Fourier Expansion Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this, we propose a Fourier-based approach to extract relevant information in the supervised setting. |
Mohsen Heidari; Jithin Sreedharan; Gil I Shamir; Wojciech Szpankowski; |
384 | Zeroth-Order Non-Convex Learning Via Hierarchical Dual Averaging Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a hierarchical version of dual averaging for zeroth-order online non-convex optimization {–} i.e., learning processes where, at each stage, the optimizer is facing an unknown non-convex loss function and only receives the incurred loss as feedback. |
Amélie Héliou; Matthieu Martin; Panayotis Mertikopoulos; Thibaud Rahier; |
385 | Improving Molecular Graph Neural Network Explainability with Orthonormalization and Induced Sparsity Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To help, we propose two simple regularization techniques to apply during the training of GCNNs: Batch Representation Orthonormalization (BRO) and Gini regularization. |
Ryan Henderson; Djork-Arné Clevert; Floriane Montanari; |
386 | Muesli: Combining Improvements in Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel policy update that combines regularized policy optimization with model learning as an auxiliary loss. |
Matteo Hessel; Ivo Danihelka; Fabio Viola; Arthur Guez; Simon Schmitt; Laurent Sifre; Theophane Weber; David Silver; Hado Van Hasselt; |
387 | Learning Representations By Humans, for Humans Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we propose a framework to directly support human decision-making, in which the role of machines is to reframe problems rather than to prescribe actions through prediction. |
Sophie Hilgard; Nir Rosenfeld; Mahzarin R Banaji; Jack Cao; David Parkes; |
388 | Optimizing Black-box Metrics with Iterative Example Weighting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our approach is to adaptively learn example weights on the training dataset such that the resulting weighted objective best approximates the metric on the validation sample. |
Gaurush Hiranandani; Jatin Mathur; Harikrishna Narasimhan; Mahdi Milani Fard; Sanmi Koyejo; |
389 | Trees with Attention for Set Prediction Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Set-Tree, presented in this work, extends the support for sets to tree-based models, such as Random-Forest and Gradient-Boosting, by introducing an attention mechanism and set-compatible split criteria. |
Roy Hirsch; Ran Gilad-Bachrach; |
390 | Multiplicative Noise and Heavy Tails in Stochastic Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Modeling stochastic optimization algorithms as discrete random recurrence relations, we show that multiplicative noise, as it commonly arises due to variance in local rates of convergence, results in heavy-tailed stationary behaviour in the parameters. |
Liam Hodgkinson; Michael Mahoney; |
391 | MC-LSTM: Mass-Conserving LSTM Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our novel Mass-Conserving LSTM (MC-LSTM) adheres to these conservation laws by extending the inductive bias of LSTM to model the redistribution of those stored quantities. |
Pieter-Jan Hoedt; Frederik Kratzert; Daniel Klotz; Christina Halmich; Markus Holzleitner; Grey S Nearing; Sepp Hochreiter; Guenter Klambauer; |
392 | Learning Curves for Analysis of Deep Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a method to robustly estimate learning curves, abstract their parameters into error and data-reliance, and evaluate the effectiveness of different parameterizations. |
Derek Hoiem; Tanmay Gupta; Zhizhong Li; Michal Shlapentokh-Rothman; |
393 | Equivariant Learning of Stochastic Fields: Gaussian Processes and Steerable Conditional Neural Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by objects such as electric fields or fluid streams, we study the problem of learning stochastic fields, i.e. stochastic processes whose samples are fields like those occurring in physics and engineering. |
Peter Holderrieth; Michael J Hutchinson; Yee Whye Teh; |
394 | Latent Programmer: Discrete Latent Codes for Program Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on these insights, we introduce the Latent Programmer (LP), a program synthesis method that first predicts a discrete latent code from input/output examples, and then generates the program in the target language. |
Joey Hong; David Dohan; Rishabh Singh; Charles Sutton; Manzil Zaheer; |
395 | Chebyshev Polynomial Codes: Task Entanglement-based Coding for Distributed Matrix Multiplication Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Chebyshev polynomial codes, which can achieve order-wise improvement in encoding complexity at the master and communication load in distributed matrix multiplication using task entanglement. |
Sangwoo Hong; Heecheol Yang; Youngseok Yoon; Taehyun Cho; Jungwoo Lee; |
396 | Federated Learning of User Verification Models Without Sharing Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this problem, we propose Federated User Verification (FedUV), a framework in which users jointly learn a set of vectors and maximize the correlation of their instance embeddings with a secret linear combination of those vectors. |
Hossein Hosseini; Hyunsin Park; Sungrack Yun; Christos Louizos; Joseph Soriaga; Max Welling; |
397 | The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In particular, we show that a wide class of state-of-the-art schemes and heuristics may converge with arbitrarily high probability to attractors that are in no way min-max optimal or even stationary. |
Ya-Ping Hsieh; Panayotis Mertikopoulos; Volkan Cevher; |
398 | Near-Optimal Representation Learning for Linear Bandits and Linear RL Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a sample-efficient algorithm, MTLR-OFUL, which leverages the shared representation to achieve $\tilde{O}(M\sqrt{dkT} + d\sqrt{kMT} )$ regret, with $T$ being the number of total steps. |
Jiachen Hu; Xiaoyu Chen; Chi Jin; Lihong Li; Liwei Wang; |
399 | On The Random Conjugate Kernel and Neural Tangent Kernel Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the distributions of Conjugate Kernel (CK) and Neural Tangent Kernel (NTK) for ReLU networks with random initialization. |
Zhengmian Hu; Heng Huang; |
400 | Off-Belief Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Policies learned through self-play may adopt arbitrary conventions and implicitly rely on multi-step reasoning based on fragile assumptions about other agents’ actions and thus fail when paired with humans or independently trained agents at test time. To address this, we present off-belief learning (OBL). |
Hengyuan Hu; Adam Lerer; Brandon Cui; Luis Pineda; Noam Brown; Jakob Foerster; |
401 | Generalizable Episodic Memory for Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this problem, we propose Generalizable Episodic Memory (GEM), which effectively organizes the state-action values of episodic memory in a generalizable manner and supports implicit planning on memorized trajectories. |
Hao Hu; Jianing Ye; Guangxiang Zhu; Zhizhou Ren; Chongjie Zhang; |
402 | A Scalable Deterministic Global Optimization Algorithm for Clustering Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we model the MSSC task as a two-stage optimization problem and propose a tailored reduced-space branch and bound (BB) algorithm. |
Kaixun Hua; Mingfei Shi; Yankai Cao; |
403 | On Recovering from Modeling Errors Using Testing Bayesian Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of supervised learning with Bayesian Networks when the used dependency structure is incomplete due to missing edges or missing variable states. |
Haiying Huang; Adnan Darwiche; |
404 | A Novel Sequential Coreset Method for Gradient Descent Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, based on the “locality” property of gradient descent algorithms, we propose a new framework, termed “sequential coreset”, which effectively avoids these obstacles. |
Jiawei Huang; Ruomin Huang; Wenjie Liu; Nikolaos Freris; Hu Ding; |
405 | FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The current paper presents a new class of convergence analysis for FL, Federated Neural Tangent Kernel (FL-NTK), which corresponds to overparameterized ReLU neural networks trained by gradient descent in FL and is inspired by the analysis in Neural Tangent Kernel (NTK). |
Baihe Huang; Xiaoxiao Li; Zhao Song; Xin Yang; |
406 | STRODE: Stochastic Boundary Ordinary Differential Equation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a probabilistic ordinary differential equation (ODE), called STochastic boundaRy ODE (STRODE), that learns both the timings and the dynamics of time series data without requiring any timing annotations during training. |
Hengguan Huang; Hongfu Liu; Hao Wang; Chang Xiao; Ye Wang; |
407 | A Riemannian Block Coordinate Descent Method for Computing The Projection Robust Wasserstein Distance Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a Riemannian block coordinate descent (RBCD) method to solve this problem, which is based on a novel reformulation of the regularized max-min problem over the Stiefel manifold. |
Minhui Huang; Shiqian Ma; Lifeng Lai; |
408 | Projection Robust Wasserstein Barycenters Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes the projection robust Wasserstein barycenter (PRWB) that has the potential to mitigate the curse of dimensionality, and a relaxed PRWB (RPRWB) model that is computationally more tractable. |
Minhui Huang; Shiqian Ma; Lifeng Lai; |
409 | Accurate Post Training Quantization With Small Calibration Sets Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we minimize the quantization errors of each layer or block separately by optimizing its parameters over the calibration set. |
Itay Hubara; Yury Nahshan; Yair Hanani; Ron Banner; Daniel Soudry; |
410 | Learning and Planning in Complex Action Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a general framework to reason in a principled way about policy evaluation and improvement over such sampled action subsets. |
Thomas Hubert; Julian Schrittwieser; Ioannis Antonoglou; Mohammadamin Barekatain; Simon Schmitt; David Silver; |
411 | Generative Adversarial Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce the GANsformer, a novel and efficient type of transformer, and explore it for the task of visual generative modeling. |
Drew A Hudson; Larry Zitnick; |
412 | Neural Pharmacodynamic State Space Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a deep generative model that makes use of a novel attention-based neural architecture inspired by the physics of how treatments affect disease state. |
Zeshan M Hussain; Rahul G. Krishnan; David Sontag; |
413 | Hyperparameter Selection for Imitation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We address the issue of tuning hyperparameters (HPs) for imitation learning algorithms in the context of continuous-control, when the underlying reward function of the demonstrating expert cannot be observed at any time. |
Léonard Hussenot; Marcin Andrychowicz; Damien Vincent; Robert Dadashi; Anton Raichuk; Sabela Ramos; Nikola Momchev; Sertan Girgin; Raphael Marinier; Lukasz Stafiniak; Manu Orsini; Olivier Bachem; Matthieu Geist; Olivier Pietquin; |
414 | Pareto GAN: Extending The Representational Power of GANs to Heavy-Tailed Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We identify issues with standard loss functions and propose the use of alternative metric spaces that enable stable and efficient learning. |
Todd Huster; Jeremy Cohen; Zinan Lin; Kevin Chan; Charles Kamhoua; Nandi O. Leslie; Cho-Yu Jason Chiang; Vyas Sekar; |
415 | LieTransformer: Equivariant Self-Attention for Lie Groups Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups. |
Michael J Hutchinson; Charline Le Lan; Sheheryar Zaidi; Emilien Dupont; Yee Whye Teh; Hyunjik Kim; |
416 | Crowdsourcing Via Annotator Co-occurrence Imputation and Provable Symmetric Nonnegative Matrix Factorization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work recasts the pairwise co-occurrence based D&S model learning problem as a symmetric NMF (SymNMF) problem—which offers enhanced identifiability relative to CNMF. |
Shahana Ibrahim; Xiao Fu; |
417 | Selecting Data Augmentation for Simulating Interventions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the case where the problem arises through spurious correlation between the observed domains and the actual task labels. |
Maximilian Ilse; Jakub M Tomczak; Patrick Forré; |
418 | Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present a scalable marginal-likelihood estimation method to select both hyperparameters and network architectures, based on the training data alone. |
Alexander Immer; Matthias Bauer; Vincent Fortuin; Gunnar Rätsch; Mohammad Emtiyaz Khan; |
419 | Active Learning for Distributionally Robust Level-Set Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we addressed this problem by considering the \textit{distributionally robust PTR} (DRPTR) measure, which considers the worst-case PTR within given candidate distributions. |
Yu Inatsu; Shogo Iwazaki; Ichiro Takeuchi; |
420 | Learning Randomly Perturbed Structured Predictors for Direct Loss Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we interpolate between these techniques by learning the variance of randomized structured predictors as well as their mean, in order to balance between the learned score function and the randomized noise. |
Hedda Cohen Indelman; Tamir Hazan; |
421 | Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our method aims to leverage these commonalities by asking the question: “What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?” |
Shariq Iqbal; Christian A Schroeder De Witt; Bei Peng; Wendelin Boehmer; Shimon Whiteson; Fei Sha; |
422 | Randomized Exploration in Reinforcement Learning with General Value Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm as well as the optimism principle. |
Haque Ishfaq; Qiwen Cui; Viet Nguyen; Alex Ayoub; Zhuoran Yang; Zhaoran Wang; Doina Precup; Lin Yang; |
423 | Distributed Second Order Methods with Fast Rates and Compressed Communication Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop several new communication-efficient second-order methods for distributed optimization. |
Rustem Islamov; Xun Qian; Peter Richtarik; |
424 | What Are Bayesian Neural Network Posteriors Really Like? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To investigate foundational questions in Bayesian deep learning, we instead use full batch Hamiltonian Monte Carlo (HMC) on modern architectures. |
Pavel Izmailov; Sharad Vikram; Matthew D Hoffman; Andrew Gordon Wilson; |
425 | How to Learn When Data Reacts to Your Model: Performative Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we introduce \emph{performative gradient descent} (PerfGD), an algorithm for computing performatively optimal points. |
Zachary Izzo; Lexing Ying; James Zou; |
426 | Perceiver: General Perception with Iterative Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we introduce the Perceiver – a model that builds upon Transformers and hence makes few architectural assumptions about the relationship between its inputs, but that also scales to hundreds of thousands of inputs, like ConvNets. |
Andrew Jaegle; Felix Gimeno; Andy Brock; Oriol Vinyals; Andrew Zisserman; Joao Carreira; |
427 | Imitation By Predicting Observations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new method for imitation solely from observations that achieves comparable performance to experts on challenging continuous control tasks while also exhibiting robustness in the presence of observations unrelated to the task. |
Andrew Jaegle; Yury Sulsky; Arun Ahuja; Jake Bruce; Rob Fergus; Greg Wayne; |
428 | Local Correlation Clustering with Asymmetric Classification Errors Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the $\ell_p$ objective in Correlation Clustering under the following assumption: Every similar edge has weight in $[\alpha\mathbf{w},\mathbf{w}]$ and every dissimilar edge has weight at least $\alpha\mathbf{w}$ (where $\alpha \leq 1$ and $\mathbf{w}>0$ is a scaling parameter). |
Jafar Jafarov; Sanchit Kalhan; Konstantin Makarychev; Yury Makarychev; |
429 | Alternative Microfoundations for Strategic Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we argue that a direct combination of these ingredients leads to brittle solution concepts of limited descriptive and prescriptive value. |
Meena Jagadeesan; Celestine Mendler-Dünner; Moritz Hardt; |
430 | Robust Density Estimation from Batches: The Best Things in Life Are (Nearly) Free Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We answer this question, showing that, perhaps surprisingly, up to logarithmic factors, the optimal sample complexity is the same as for genuine, non-adversarial, data! |
Ayush Jain; Alon Orlitsky; |
431 | Instance-Optimal Compressed Sensing Via Posterior Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show for Gaussian measurements and \emph{any} prior distribution on the signal, that the posterior sampling estimator achieves near-optimal recovery guarantees. |
Ajil Jalal; Sushrut Karmalkar; Alex Dimakis; Eric Price; |
432 | Fairness for Image Generation with Uncertain Sensitive Attributes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work tackles the issue of fairness in the context of generative procedures, such as image super-resolution, which entail different definitions from the standard classification setting. |
Ajil Jalal; Sushrut Karmalkar; Jessica Hoffmann; Alex Dimakis; Eric Price; |
433 | Feature Clustering for Support Identification in Extreme Regions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The present paper develops a novel optimization-based approach to assess the dependence structure of extremes. |
Hamid Jalalzai; Rémi Leluc; |
434 | Improved Regret Bounds of Bilinear Bandits Using Action Space Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we make progress towards closing the gap between the upper and lower bound on the optimal regret. |
Kyoungseok Jang; Kwang-Sung Jun; Se-Young Yun; Wanmo Kang; |
435 | Inverse Decision Modeling: Learning Interpretable Representations of Behavior Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we develop an expressive, unifying perspective on *inverse decision modeling*: a framework for learning parameterized representations of sequential decision behavior. |
Daniel Jarrett; Alihan Hüyük; Mihaela Van Der Schaar; |
436 | Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We highlight that poor final generalization coincides with the trace of the FIM attaining a large value early in training, to which we refer as catastrophic Fisher explosion. |
Stanislaw Jastrzebski; Devansh Arpit; Oliver Astrand; Giancarlo B Kerg; Huan Wang; Caiming Xiong; Richard Socher; Kyunghyun Cho; Krzysztof J Geras; |
437 | Policy Gradient Bayesian Robust Optimization for Imitation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We derive a novel policy gradient-style robust optimization approach, PG-BROIL, that optimizes a soft-robust objective that balances expected performance and risk. |
Zaynah Javed; Daniel S Brown; Satvik Sharma; Jerry Zhu; Ashwin Balakrishna; Marek Petrik; Anca Dragan; Ken Goldberg; |
438 | In-Database Regression in Input Sparsity Time Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we design subspace embeddings for database joins which can be computed significantly faster than computing the join. |
Rajesh Jayaram; Alireza Samadian; David Woodruff; Peng Ye; |
439 | Parallel and Flexible Sampling from Autoregressive Models Via Langevin Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces an alternative approach to sampling from autoregressive models. |
Vivek Jayaram; John Thickstun; |
440 | Objective Bound Conditional Gaussian Process for Bayesian Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new surrogate model, called the objective bound conditional Gaussian process (OBCGP), to condition a Gaussian process on a bound on the optimal function value. |
Taewon Jeong; Heeyoung Kim; |
441 | Quantifying Ignorance in Individual-Level Causal-Effect Estimates Under Hidden Confounding Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new parametric interval estimator suited for high-dimensional data, that estimates a range of possible CATE values when given a predefined bound on the level of hidden confounding. |
Andrew Jesson; Søren Mindermann; Yarin Gal; Uri Shalit; |
442 | DeepReDuce: ReLU Reduction for Fast Private Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes DeepReDuce: a set of optimizations for the judicious removal of ReLUs to reduce private inference latency. |
Nandan Kumar Jha; Zahra Ghodsi; Siddharth Garg; Brandon Reagen; |
443 | Factor-analytic Inverse Regression for High-dimension, Small-sample Dimensionality Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To overcome this limitation, we propose Class-conditional Factor Analytic Dimensions (CFAD), a model-based dimensionality reduction method for high-dimensional, small-sample data. |
Aditi Jha; Michael J. Morais; Jonathan W Pillow; |
444 | Fast Margin Maximization Via Dual Acceleration Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present and analyze a momentum-based gradient method for training linear classifiers with an exponentially-tailed loss (e.g., the exponential or logistic loss), which maximizes the classification margin on separable data at a rate of O(1/t^2). |
Ziwei Ji; Nathan Srebro; Matus Telgarsky; |
445 | Marginalized Stochastic Natural Gradients for Black-Box Variational Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a stochastic natural gradient estimator that is as broadly applicable and unbiased, but improves efficiency by exploiting the curvature of the variational bound, and provably reduces variance by marginalizing discrete latent variables. |
Geng Ji; Debora Sujono; Erik B Sudderth; |
446 | Bilevel Optimization: Convergence Analysis and Enhanced Design Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate the nonconvex-strongly-convex bilevel optimization problem. |
Kaiyi Ji; Junjie Yang; Yingbin Liang; |
447 | Efficient Statistical Tests: A Neural Tangent Kernel Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a shift-invariant convolutional neural tangent kernel (SCNTK) based outlier detector and two-sample tests with maximum mean discrepancy (MMD) that are O(n) in the number of samples due to using the random feature approximation. |
Sheng Jia; Ehsan Nezhadarya; Yuhuai Wu; Jimmy Ba; |
448 | Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we leverage a noisy dataset of over one billion image alt-text pairs, obtained without expensive filtering or post-processing steps in the Conceptual Captions dataset. |
Chao Jia; Yinfei Yang; Ye Xia; Yi-Ting Chen; Zarana Parekh; Hieu Pham; Quoc Le; Yun-Hsuan Sung; Zhen Li; Tom Duerig; |
449 | Multi-Dimensional Classification Via Sparse Label Encoding Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel MDC approach named SLEM which learns the predictive model in an encoded label space instead of the original heterogeneous one. |
Bin-Bin Jia; Min-Ling Zhang; |
450 | Self-Damaging Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes to explicitly tackle this challenge, via a principled framework called Self-Damaging Contrastive Learning (SDCLR), to automatically balance the representation learning without knowing the classes. |
Ziyu Jiang; Tianlong Chen; Bobak J Mortazavi; Zhangyang Wang; |
451 | Prioritized Level Replay Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce Prioritized Level Replay (PLR), a general framework for selectively sampling the next training level by prioritizing those with higher estimated learning potential when revisited in the future. |
Minqi Jiang; Edward Grefenstette; Tim Rocktäschel; |
452 | Monotonic Robust Policy Optimization with Model Discrepancy Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Since the average and worst-case performance are both important for generalization in RL, in this paper, we propose a policy optimization approach for concurrently improving the policy’s performance in the average and worst-case environment. |
Yuankun Jiang; Chenglin Li; Wenrui Dai; Junni Zou; Hongkai Xiong; |
453 | Approximation Theory of Convolutional Architectures for Time Series Modelling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we derive parallel results for convolutional architectures, with WaveNet being a prime example. |
Haotian Jiang; Zhong Li; Qianxiao Li; |
454 | Streaming and Distributed Algorithms for Robust Column Subset Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We give the first single-pass streaming algorithm for Column Subset Selection with respect to the entrywise $\ell_p$-norm with $1 \leq p < 2$. |
Shuli Jiang; Dennis Li; Irene Mengze Li; Arvind V Mahankali; David Woodruff; |
455 | Single Pass Entrywise-Transformed Low Rank Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we resolve this open question, obtaining the first single-pass algorithm for this problem and for the same class of functions $f$ studied by Liang et al. |
Yifei Jiang; Yi Li; Yiming Sun; Jiaxin Wang; David Woodruff; |
456 | The Emergence of Individuality Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by the notion that individuality means being an individual separate from others, we propose a simple yet efficient method for the emergence of individuality (EOI) in multi-agent reinforcement learning (MARL). |
Jiechuan Jiang; Zongqing Lu; |
457 | Online Selection Problems Against Constrained Adversary Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by a recent line of work in online algorithms with predictions, we study the constrained adversary model that utilizes predictions from a different perspective. |
Zhihao Jiang; Pinyan Lu; Zhihao Gavin Tang; Yuhao Zhang; |
458 | Active Covering Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We analyze the problem of active covering, where the learner is given an unlabeled dataset and can sequentially label query examples. |
Heinrich Jiang; Afshin Rostamizadeh; |
459 | Emphatic Algorithms for Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we extend the use of emphatic methods to deep reinforcement learning agents. |
Ray Jiang; Tom Zahavy; Zhongwen Xu; Adam White; Matteo Hessel; Charles Blundell; Hado Van Hasselt; |
460 | Characterizing Structural Regularities of Labeled Data in Overparameterized Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We analyze how individual instances are treated by a model via a consistency score. The score characterizes the expected accuracy for a held-out instance given training sets of varying size sampled from the data distribution. |
Ziheng Jiang; Chiyuan Zhang; Kunal Talwar; Michael C Mozer; |
461 | Optimal Streaming Algorithms for Multi-Armed Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an algorithm that works for any $k$ and achieves the optimal sample complexity $O(\frac{n}{\epsilon^2} \log\frac{k}{\delta})$ using a single-arm memory and a single pass of the stream. |
Tianyuan Jin; Keke Huang; Jing Tang; Xiaokui Xiao; |
462 | Towards Tight Bounds on The Sample Complexity of Average-reward MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: When the mixing time of the probability transition matrix of all policies is at most $t_\mathrm{mix}$, we provide an algorithm that solves the problem using $\widetilde{O}(t_\mathrm{mix} \epsilon^{-3})$ (oblivious) samples per state-action pair. |
Yujia Jin; Aaron Sidford; |
463 | Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the anytime batched multi-armed bandit problem. |
Tianyuan Jin; Jing Tang; Pan Xu; Keke Huang; Xiaokui Xiao; Quanquan Gu; |
464 | MOTS: Minimax Optimal Thompson Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we fill this long open gap by proposing a new Thompson sampling algorithm called MOTS that adaptively truncates the sampling result of the chosen arm at each time step. |
Tianyuan Jin; Pan Xu; Jieming Shi; Xiaokui Xiao; Quanquan Gu; |
465 | Is Pessimism Provably Efficient for Offline RL? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a pessimistic variant of the value iteration algorithm (PEVI), which incorporates an uncertainty quantifier as the penalty function. |
Ying Jin; Zhuoran Yang; Zhaoran Wang; |
466 | Adversarial Option-Aware Hierarchical Imitation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose Option-GAIL, a novel method to learn skills at long horizon. |
Mingxuan Jing; Wenbing Huang; Fuchun Sun; Xiaojian Ma; Tao Kong; Chuang Gan; Lei Li; |
467 | Discrete-Valued Latent Preference Matrix Estimation with Graph Side Information Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a new model in which 1) the unknown latent preference matrix can have any discrete values, and 2) users can be clustered into multiple clusters, thereby relaxing the assumptions made in prior work. |
Changhun Jo; Kangwook Lee; |
468 | Provable Lipschitz Certification for Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a scalable technique for upper bounding the Lipschitz constant of generative models. |
Matt Jordan; Alex Dimakis; |
469 | Isometric Gaussian Process Latent Variable Model for Dissimilarity Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a probabilistic model where the latent variable respects both the distances and the topology of the modeled data. |
Martin Jørgensen; Soren Hauberg; 
470 | On The Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the generalization performance of min $\ell_2$-norm overfitting solutions for the neural tangent kernel (NTK) model of a two-layer neural network with ReLU activation that has no bias term. |
Peizhong Ju; Xiaojun Lin; Ness Shroff; |
471 | Improved Confidence Bounds for The Linear Logistic Model and Applications to Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose improved fixed-design confidence bounds for the linear logistic model. |
Kwang-Sung Jun; Lalit Jain; Houssam Nassif; Blake Mason; |
472 | Detection of Signal in The Spiked Rectangular Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of detecting signals in the rank-one signal-plus-noise data matrix models that generalize the spiked Wishart matrices. |
Ji Hyung Jung; Hye Won Chung; Ji Oon Lee; |
473 | Estimating Identifiable Causal Effects on Markov Equivalence Class Through Double Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the problem of causal estimation from a MEC represented by a partial ancestral graph (PAG), which is learnable from observational data. |
Yonghan Jung; Jin Tian; Elias Bareinboim; |
474 | A Nullspace Property for Subspace-Preserving Recovery Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper derives a necessary and sufficient condition for subspace-preserving recovery that is inspired by the classical nullspace property. Based on this novel condition, called here the subspace nullspace property, we derive equivalent characterizations that either admit a clear geometric interpretation that relates data distribution and subspace separation to the recovery success, or can be verified using a finite set of extreme points of a properly defined set. |
Mustafa D Kaba; Chong You; Daniel P Robinson; Enrique Mallada; Rene Vidal; |
475 | Training Recurrent Neural Networks Via Forward Propagation Through Time Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel forward-propagation algorithm, FPTT, where at each time, for an instance, we update RNN parameters by optimizing an instantaneous risk function. |
Anil Kag; Venkatesh Saligrama; |
476 | The Distributed Discrete Gaussian Mechanism for Federated Learning with Secure Aggregation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a comprehensive end-to-end system, which appropriately discretizes the data and adds discrete Gaussian noise before performing secure aggregation. |
Peter Kairouz; Ziyu Liu; Thomas Steinke; |
477 | Practical and Private (Deep) Learning Without Sampling or Shuffling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider training models with differential privacy (DP) using mini-batch gradients. |
Peter Kairouz; Brendan Mcmahan; Shuang Song; Om Thakkar; Abhradeep Thakurta; Zheng Xu; |
478 | A Differentiable Point Process with Its Application to Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper is concerned about a learning algorithm for a probabilistic model of spiking neural networks (SNNs). |
Hiroshi Kajino; |
479 | Projection Techniques to Update The Truncated SVD of Evolving Matrices with Applications Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The algorithm presented in this paper undertakes a projection viewpoint and focuses on building a pair of subspaces which approximate the linear span of the sought singular vectors of the updated matrix. |
Vasileios Kalantzis; Georgios Kollias; Shashanka Ubaru; Athanasios N. Nikolakopoulos; Lior Horesh; Kenneth Clarkson; |
480 | Optimal Off-Policy Evaluation from Multiple Logging Policies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we resolve this dilemma by finding the OPE estimator for multiple loggers with minimum variance for any instance, i.e., the efficient one. |
Nathan Kallus; Yuta Saito; Masatoshi Uehara; |
481 | Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To bridge the gap between theory and practice, we introduce a novel bilinear saddle-point framework using Lagrangian duality. |
Angeliki Kamoutsi; Goran Banjac; John Lygeros; |
482 | Statistical Estimation from Dependent Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: As our main contribution we provide algorithms and statistically efficient estimation rates for this model, giving several instantiations of our bounds in logistic regression, sparse logistic regression, and neural network regression settings with dependent data. |
Vardis Kandiros; Yuval Dagan; Nishanth Dikkala; Surbhi Goel; Constantinos Daskalakis; |
483 | SKIing on Simplices: Kernel Interpolation on The Permutohedral Lattice for Scalable Gaussian Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we develop a connection between SKI and the permutohedral lattice used for high-dimensional fast bilateral filtering. |
Sanyam Kapoor; Marc Finzi; Ke Alexander Wang; Andrew Gordon Gordon Wilson; |
484 | Variational Auto-Regressive Gaussian Processes for Continual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: By relying on sparse inducing point approximations for scalable posteriors, we propose a novel auto-regressive variational distribution which reveals two fruitful connections to existing results in Bayesian inference, expectation propagation and orthogonal inducing points. |
Sanyam Kapoor; Theofanis Karaletsos; Thang D Bui; |
485 | Off-Policy Confidence Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop confidence bounds that hold uniformly over time for off-policy evaluation in the contextual bandit setting. |
Nikos Karampatziakis; Paul Mineiro; Aaditya Ramdas; |
486 | Learning from History for Byzantine Robust Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these issues, we present two surprisingly simple strategies: a new robust iterative clipping procedure, and incorporating worker momentum to overcome time-coupled attacks. |
Sai Praneeth Karimireddy; Lie He; Martin Jaggi; |
487 | Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, to mitigate train-loss hacking, we propose non-negative correction for empirical BD estimators. |
Masahiro Kato; Takeshi Teshima; |
488 | Improved Algorithms for Agnostic Pool-based Active Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we propose an algorithm that, in contrast to uniform sampling over the disagreement region, solves an experimental design problem to determine a distribution over examples from which to request labels. |
Julian Katz-Samuels; Jifan Zhang; Lalit Jain; Kevin Jamieson; |
489 | When Does Data Augmentation Help With Membership Inference Attacks? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Employing two recent MIAs, we explore the lower bound on the risk in the absence of formal upper bounds. |
Yigitcan Kaya; Tudor Dumitras; |
490 | Regularized Submodular Maximization at Scale Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose scalable methods for maximizing a regularized submodular function $f \triangleq g-\ell$ expressed as the difference between a monotone submodular function $g$ and a modular function $\ell$. |
Ehsan Kazemi; Shervin Minaee; Moran Feldman; Amin Karbasi; |
491 | Prior Image-Constrained Reconstruction Using Style-Based Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, a framework is proposed for estimating an object of interest that is semantically related to a known prior image. |
Varun A Kelkar; Mark Anastasio; |
492 | Self Normalizing Flows Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose \emph{Self Normalizing Flows}, a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer. |
Thomas A Keller; Jorn W.T. Peters; Priyank Jaini; Emiel Hoogeboom; Patrick Forré; Max Welling; 
493 | Interpretable Stability Bounds for Spectral Graph Filters Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study filter stability and provide a novel and interpretable upper bound on the change of filter output, where the bound is expressed in terms of the endpoint degrees of the deleted and newly added edges, as well as the spatial proximity of those edges. |
Henry Kenlay; Dorina Thanou; Xiaowen Dong; |
494 | Affine Invariant Analysis of Frank-Wolfe on Strongly Convex Sets Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce new structural assumptions on the problem (such as the directional smoothness) and derive an affine invariant, norm-independent analysis of Frank-Wolfe. |
Thomas Kerdreux; Lewis Liu; Simon Lacoste-Julien; Damien Scieur; |
495 | Markpainting: Adversarial Machine Learning Meets Inpainting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study how to manipulate it using our markpainting technique. |
David Khachaturov; Ilia Shumailov; Yiren Zhao; Nicolas Papernot; Ross Anderson; |
496 | Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we provide finite-sample convergence guarantees for an off-policy variant of the natural actor-critic (NAC) algorithm based on Importance Sampling. |
Sajad Khodadadian; Zaiwei Chen; Siva Theja Maguluri; |
497 | Functional Space Analysis of Local GAN Convergence Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel perspective where we study the local dynamics of adversarial training in the general functional space and show how it can be represented as a system of partial differential equations. |
Valentin Khrulkov; Artem Babenko; Ivan Oseledets; |
498 | Hey, That’s Not An ODE: Faster ODE Adjoints Via Seminorms Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we demonstrate that the particular structure of the adjoint equations makes the usual choices of norm (such as $L^2$) unnecessarily stringent. |
Patrick Kidger; Ricky T. Q. Chen; Terry J Lyons; |
499 | Neural SDEs As Infinite-Dimensional GANs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we show that the current classical approach to fitting SDEs may be approached as a special case of (Wasserstein) GANs, and in doing so the neural and classical regimes may be brought together. |
Patrick Kidger; James Foster; Xuechen Li; Terry J Lyons; |
500 | GRAD-MATCH: Gradient Matching Based Data Subset Selection for Efficient Deep Model Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a general framework, GRAD-MATCH, which finds subsets that closely match the gradient of the \emph{training or validation} set. |
Krishnateja Killamsetty; Durga S; Ganesh Ramakrishnan; Abir De; Rishabh Iyer; |
501 | Improving Predictors Via Combination Across Diverse Task Categories Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our algorithm aligns the heterogeneous domains of different predictors in a shared latent space to facilitate comparisons of predictors independently of the domains on which they are originally defined. |
Kwang In Kim; |
502 | Self-Improved Retrosynthetic Planning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by this, we propose an end-to-end framework for directly training the DNNs towards generating reaction pathways with the desirable properties. |
Junsu Kim; Sungsoo Ahn; Hankook Lee; Jinwoo Shin; |
503 | Reward Identification in Inverse Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we formalize the reward identification problem in IRL and study how identifiability relates to properties of the MDP model. |
Kuno Kim; Shivam Garg; Kirankumar Shiragur; Stefano Ermon; |
504 | I-BERT: Integer-only BERT Quantization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose I-BERT, a novel quantization scheme for Transformer based models that quantizes the entire inference with integer-only arithmetic. |
Sehoon Kim; Amir Gholami; Zhewei Yao; Michael W. Mahoney; Kurt Keutzer; |
505 | Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we propose Message Passing Adaptive Resonance Theory (MPART) that learns the distribution and topology of input data online. |
Taehyeong Kim; Injune Hwang; Hyundo Lee; Hyunseo Kim; Won-Seok Choi; Joseph J Lim; Byoung-Tak Zhang; |
506 | Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. |
Jaehyeon Kim; Jungil Kong; Juhee Son; |
507 | A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel meta-multiagent policy gradient theorem that directly accounts for the non-stationary policy dynamics inherent to multiagent learning settings. |
Dong Ki Kim; Miao Liu; Matthew D Riemer; Chuangchuang Sun; Marwa Abdulhai; Golnaz Habibi; Sebastian Lopez-Cot; Gerald Tesauro; Jonathan How; |
508 | Inferring Latent Dynamics Underlying Neural Population Activity Via Neural Differential Equations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we address this problem by introducing a low-dimensional nonlinear model for latent neural population dynamics using neural ordinary differential equations (neural ODEs), with noisy sensory inputs and Poisson spike train outputs. |
Timothy D Kim; Thomas Z Luo; Jonathan W Pillow; Carlos Brody; |
509 | The Lipschitz Constant of Self-Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate the Lipschitz constant of self-attention, a non-linear neural network module widely used in sequence modelling. |
Hyunjik Kim; George Papamakarios; Andriy Mnih; |
510 | Unsupervised Skill Discovery with Bottleneck Option Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel unsupervised skill discovery method named Information Bottleneck Option Learning (IBOL). |
Jaekyeom Kim; Seohong Park; Gunhee Kim; |
511 | ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a minimal VLP model, Vision-and-Language Transformer (ViLT), monolithic in the sense that the processing of visual inputs is drastically simplified to just the same convolution-free manner that we process textual inputs. |
Wonjae Kim; Bokyung Son; Ildoo Kim; |
512 | Bias-Robust Bayesian Optimization Via Dueling Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our first contribution is a reduction of the confounded setting to the dueling bandit model. Then we propose a novel approach for dueling bandits based on information-directed sampling (IDS). |
Johannes Kirschner; Andreas Krause; |
513 | CLOCS: Contrastive Learning of Cardiac Signals Across Space, Time, and Patients Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a family of contrastive learning methods, CLOCS, that encourages representations across space, time, \textit{and} patients to be similar to one another. |
Dani Kiyasseh; Tingting Zhu; David A Clifton; |
514 | Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we propose two effective log-linear time approximations of the cost matrix: First, a sparse approximation based on locality sensitive hashing (LSH) and, second, a Nyström approximation with LSH-based sparse corrections, which we call locally corrected Nyström (LCN). |
Johannes Klicpera; Marten Lienen; Stephan Günnemann; 
515 | Representational Aspects of Depth and Conditioning in Normalizing Flows Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In our paper, we tackle representational aspects around depth and conditioning of normalizing flows: both for general invertible architectures, and for a particular common architecture, affine couplings. |
Frederic Koehler; Viraj Mehta; Andrej Risteski; |
516 | WILDS: A Benchmark of In-the-Wild Distribution Shifts Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this gap, we present WILDS, a curated benchmark of 10 datasets reflecting a diverse range of distribution shifts that naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time and location in satellite imaging and poverty mapping. |
Pang Wei Koh; Shiori Sagawa; Henrik Marklund; Sang Michael Xie; Marvin Zhang; Akshay Balsubramani; Weihua Hu; Michihiro Yasunaga; Richard Lanas Phillips; Irena Gao; Tony Lee; Etienne David; Ian Stavness; Wei Guo; Berton Earnshaw; Imran Haque; Sara M Beery; Jure Leskovec; Anshul Kundaje; Emma Pierson; Sergey Levine; Chelsea Finn; Percy Liang; |
517 | One-sided Frank-Wolfe Algorithms for Saddle Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study a class of convex-concave saddle-point problems of the form $\min_x\max_y \langle Kx,y\rangle+f_{\cal P}(x)-h^*(y)$ where $K$ is a linear operator, $f_{\cal P}$ is the sum of a convex function $f$ with a Lipschitz-continuous gradient and the indicator function of a bounded convex polytope ${\cal P}$, and $h^\ast$ is a convex (possibly nonsmooth) function. |
Vladimir Kolmogorov; Thomas Pock; |
518 | A Lower Bound for The Sample Complexity of Inverse Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper develops an information-theoretic lower bound for the sample complexity of the finite state, finite action IRL problem. |
Abi Komanduru; Jean Honorio; |
519 | Consensus Control for Decentralized Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We identify the changing consensus distance between devices as a key parameter to explain the gap between centralized and decentralized training. |
Lingjing Kong; Tao Lin; Anastasia Koloskova; Martin Jaggi; Sebastian Stich; |
520 | A Distribution-dependent Analysis of Meta Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For this case we propose to adopt the EM method, which is shown to enjoy efficient updates in our case. |
Mikhail Konobeev; Ilja Kuzborskij; Csaba Szepesvari; |
521 | Evaluating Robustness of Predictive Uncertainty Estimation: Are Dirichlet-based Models Reliable? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present the first large-scale, in-depth study of the robustness of DBU models under adversarial attacks. |
Anna-Kathrin Kopetzki; Bertrand Charpentier; Daniel Zügner; Sandhya Giri; Stephan Günnemann; 
522 | Kernel Stein Discrepancy Descent Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the properties of its Wasserstein gradient flow to approximate a target probability distribution $\pi$ on $\mathbb{R}^d$, known up to a normalization constant. |
Anna Korba; Pierre-Cyril Aubin-Frankowski; Szymon Majewski; Pierre Ablin; |
523 | Boosting The Throughput and Accelerator Utilization of Specialized CNN Inference Beyond Increasing Batch Size Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose FoldedCNNs, a new approach to CNN design that increases inference throughput and utilization beyond large batch size. |
Jack Kosaian; Amar Phanishayee; Matthai Philipose; Debadeepta Dey; Rashmi Vinayak; |
524 | NeRF-VAE: A Geometry Aware 3D Scene Generative Model Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose NeRF-VAE, a 3D scene generative model that incorporates geometric structure via Neural Radiance Fields (NeRF) and differentiable volume rendering. |
Adam R Kosiorek; Heiko Strathmann; Daniel Zoran; Pol Moreno; Rosalia Schneider; Sona Mokra; Danilo Jimenez Rezende; |
525 | Active Testing: Sample-Efficient Model Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new framework for sample-efficient model evaluation that we call active testing. |
Jannik Kossen; Sebastian Farquhar; Yarin Gal; Tom Rainforth; |
526 | High Confidence Generalization for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present several classes of reinforcement learning algorithms that safely generalize to Markov decision processes (MDPs) not seen during training. |
James Kostas; Yash Chandak; Scott M Jordan; Georgios Theocharous; Philip Thomas; |
527 | Offline Reinforcement Learning with Fisher Divergence Critic Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose an alternative approach to encouraging the learned policy to stay close to the data, namely parameterizing the critic as the log-behavior-policy, which generated the offline data, plus a state-action value offset term, which can be learned using a neural network. |
Ilya Kostrikov; Rob Fergus; Jonathan Tompson; Ofir Nachum; |
528 | ADOM: Accelerated Decentralized Optimization Method for Time-Varying Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose ADOM – an accelerated method for smooth and strongly convex decentralized optimization over time-varying networks. |
Dmitry Kovalev; Egor Shulgin; Peter Richtarik; Alexander V Rogozin; Alexander Gasnikov; |
529 | Revisiting Peng’s Q($\lambda$) for Modern Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by the empirical results and the lack of theory, we carry out theoretical analyses of Peng’s Q($\lambda$), a representative example of non-conservative algorithms. |
Tadashi Kozuno; Yunhao Tang; Mark Rowland; Remi Munos; Steven Kapturowski; Will Dabney; Michal Valko; David Abel; |
530 | Adapting to Misspecification in Contextual Bandits with Offline Regression Oracles Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a simple family of contextual bandit algorithms that adapt to misspecification error by reverting to a good safe policy when there is evidence that misspecification is causing a regret increase. |
Sanath Kumar Krishnamurthy; Vitor Hadad; Susan Athey; |
531 | Out-of-Distribution Generalization Via Risk Extrapolation (REx) Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We motivate this approach, Risk Extrapolation (REx), as a form of robust optimization over a perturbation set of extrapolated domains (MM-REx), and propose a penalty on the variance of training risks (V-REx) as a simpler variant. |
David Krueger; Ethan Caballero; Joern-Henrik Jacobsen; Amy Zhang; Jonathan Binas; Dinghuai Zhang; Remi Le Priol; Aaron Courville; |
532 | Near-Optimal Confidence Sequences for Bounded Random Variables Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this question, we provide a near-optimal confidence sequence for bounded random variables by utilizing Bentkus’ concentration results. |
Arun K Kuchibhotla; Qinqing Zheng; |
533 | Differentially Private Bayesian Inference for Generalized Linear Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, with logistic and Poisson regression as running examples, we introduce a generic noise-aware DP Bayesian inference method for a GLM at hand, given a noisy sum of summary statistics. |
Tejas Kulkarni; Joonas Jälkö; Antti Koskela; Samuel Kaski; Antti Honkela; 
534 | Bayesian Structural Adaptation for Continual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel Bayesian framework based on continually learning the structure of deep neural networks, to unify these distinct yet complementary approaches. |
Abhishek Kumar; Sunabha Chatterjee; Piyush Rai; |
535 | Implicit Rate-constrained Optimization of Non-decomposable Objectives Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our key idea is to formulate a rate-constrained optimization that expresses the threshold parameter as a function of the model parameters via the Implicit Function theorem. |
Abhishek Kumar; Harikrishna Narasimhan; Andrew Cotter; |
536 | A Scalable Second Order Method for Ill-Conditioned Matrix Completion from Few Samples Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an iterative algorithm for low-rank matrix completion that can be interpreted as an iteratively reweighted least squares (IRLS) algorithm, a saddle-escaping smoothing Newton method or a variable metric proximal gradient method applied to a non-convex rank surrogate. |
Christian Kümmerle; Claudio M. Verdun; 
537 | Meta-Thompson Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose several efficient implementations of MetaTS and analyze it in Gaussian bandits. |
Branislav Kveton; Mikhail Konobeev; Manzil Zaheer; Chih-Wei Hsu; Martin Mladenov; Craig Boutilier; Csaba Szepesvari; |
538 | Targeted Data Acquisition for Evolving Negotiation Agents Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this, we introduce a targeted data acquisition framework where we guide the exploration of a reinforcement learning agent using annotations from an expert oracle. |
Minae Kwon; Siddharth Karamcheti; Mariano-Florentino Cuellar; Dorsa Sadigh; |
539 | ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce the concept of adaptive sharpness which is scale-invariant and propose the corresponding generalization bound. |
Jungmin Kwon; Jeongseop Kim; Hyunseo Park; In Kwon Choi; |
540 | On The Price of Explainability for Some Clustering Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we study this price for the following clustering problems: $k$-means, $k$-medians, $k$-centers and maximum-spacing. |
Eduardo S Laber; Lucas Murtinho; |
541 | Adaptive Newton Sketch: Linear-time Optimization with Quadratic Convergence and Effective Hessian Dimensionality Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function. |
Jonathan Lacotte; Yifei Wang; Mert Pilanci; |
542 | Generalization Bounds in The Presence of Outliers: A Median-of-Means Study Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this context, the present work proposes a general study of MoM’s concentration properties under the contamination regime, that provides a clear understanding on the impact of the outlier proportion and the number of blocks chosen. |
Pierre Laforgue; Guillaume Staerman; Stephan Clémençon; 
543 | Model Fusion for Personalized Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To accommodate for such scenarios, we develop a new personalized learning framework that synthesizes customized models for unseen tasks via fusion of independently pre-trained models of related tasks. |
Thanh Chi Lam; Nghia Hoang; Bryan Kian Hsiang Low; Patrick Jaillet; |
544 | Gradient Disaggregation: Breaking Privacy in Federated Learning By Reconstructing The User Participant Matrix Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our method revolves around reconstructing participant information (e.g: which rounds of training users participated in) from aggregated model updates by leveraging summary information from device analytics commonly used to monitor, debug, and manage federated learning systems. |
Maximilian Lam; Gu-Yeon Wei; David Brooks; Vijay Janapa Reddi; Michael Mitzenmacher; |
545 | Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our main contribution is algorithms that achieve near-optimal regret in each of the settings, with an additional additive dependence on the quantiles of the delay distribution. |
Tal Lancewicki; Shahar Segal; Tomer Koren; Yishay Mansour; |
546 | Discovering Symbolic Policies with Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose deep symbolic policy, a novel approach to directly search the space of symbolic policies. |
Mikel Landajuela; Brenden K Petersen; Sookyung Kim; Claudio P Santiago; Ruben Glatt; Nathan Mundhenk; Jacob F Pettit; Daniel Faissol; |
547 | Graph Cuts Always Find A Global Optimum for Potts Models (With A Catch) Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We prove that the alpha-expansion algorithm for MAP inference always returns a globally optimal assignment for Markov Random Fields with Potts pairwise potentials, with a catch: the returned assignment is only guaranteed to be optimal for an instance within a small perturbation of the original problem instance. |
Hunter Lang; David Sontag; Aravindan Vijayaraghavan; |
548 | Efficient Message Passing for 0-1 ILPs with Binary Decision Diagrams Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a message passing method for 0-1 integer linear programs. |
Jan-Hendrik Lange; Paul Swoboda; |
549 | CountSketches, Feature Hashing and The Median of Three Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we revisit the classic CountSketch method, which is a sparse, random projection that transforms a (high-dimensional) Euclidean vector $v$ to a vector of dimension $(2t-1) s$, where $t, s > 0$ are integer parameters. |
Kasper Green Larsen; Rasmus Pagh; Jakub Tetek; |
550 | MorphVAE: Generating Neural Morphologies from 3D-Walks Using A Variational Autoencoder with Spherical Latent Space Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we propose MorphVAE, a sequence-to-sequence variational autoencoder with spherical latent space as a generative model for neural morphologies. |
Sophie C. Laturnus; Philipp Berens; |
551 | Improved Regret Bound and Experience Replay in Regularized Policy Iteration Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study algorithms for learning in infinite-horizon undiscounted Markov decision processes (MDPs) with function approximation. |
Nevena Lazic; Dong Yin; Yasin Abbasi-Yadkori; Csaba Szepesvari; |
552 | LAMDA: Label Matching Deep Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose and study a new challenging setting that allows us to use a Wasserstein distance (WS) to not only quantify the data shift but also to define the label shift directly. |
Trung Le; Tuan Nguyen; Nhat Ho; Hung Bui; Dinh Phung; |
553 | Gaussian Process-Based Real-Time Learning for Safety Critical Applications Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Due to its high computational complexity, Gaussian process regression must be used offline on batches of data, which prevents applications where fast adaptation through online learning is necessary to ensure safety. In order to overcome this issue, we propose the LoG-GP. |
Armin Lederer; Alejandro J Ordóñez Conejo; Korbinian A Maier; Wenxin Xiao; Jonas Umlauft; Sandra Hirche; |
554 | Sharing Less Is More: Lifelong Learning in Deep Networks with Selective Layer Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We first show that the lifelong learning performance of several current deep learning architectures can be significantly improved by transfer at the appropriate layers. We then develop an expectation-maximization (EM) method to automatically select the appropriate transfer configuration and optimize the task network weights. |
Seungwon Lee; Sima Behpour; Eric Eaton; |
555 | Fair Selective Classification Via Sufficiency Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We prove that the sufficiency criterion can be used to mitigate these disparities by ensuring that selective classification increases performance on all groups, and introduce a method for mitigating the disparity in precision across the entire coverage scale based on this criterion. |
Joshua K Lee; Yuheng Bu; Deepta Rajan; Prasanna Sattigeri; Rameswar Panda; Subhro Das; Gregory W Wornell; |
556 | On-the-fly Rectification for Robust Large-Vocabulary Topic Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose novel methods that simultaneously compress and rectify co-occurrence statistics, scaling gracefully with the size of vocabulary and the dimension of latent space. |
Moontae Lee; Sungjun Cho; Kun Dong; David Mimno; David Bindel; |
557 | Unsupervised Embedding Adaptation Via Early-Stage Feature Reconstruction for Few-Shot Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose unsupervised embedding adaptation for the downstream few-shot classification task. |
Dong Hoon Lee; Sae-Young Chung; |
558 | Continual Learning in The Teacher-Student Setup: Impact of Task Similarity Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we attempt to narrow this gap between theory and practice by studying continual learning in the teacher-student setup. |
Sebastian Lee; Sebastian Goldt; Andrew Saxe; |
559 | OptiDICE: Offline Policy Optimization Via Stationary Distribution Correction Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present an offline RL algorithm that prevents overestimation in a more principled way. |
Jongmin Lee; Wonseok Jeon; Byungjun Lee; Joelle Pineau; Kee-Eung Kim; |
560 | SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To mitigate these issues, we present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy RL algorithms. |
Kimin Lee; Michael Laskin; Aravind Srinivas; Pieter Abbeel; |
561 | Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we develop linear bandit algorithms that automatically adapt to different environments. |
Chung-Wei Lee; Haipeng Luo; Chen-Yu Wei; Mengxiao Zhang; Xiaojin Zhang; |
562 | PEBBLE: Feedback-Efficient Interactive Reinforcement Learning Via Relabeling Experience and Unsupervised Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an off-policy, interactive RL algorithm that capitalizes on the strengths of both feedback and off-policy learning. |
Kimin Lee; Laura M Smith; Pieter Abbeel; |
563 | Near-Optimal Linear Regression Under Distribution Shift Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop estimators that achieve minimax linear risk for linear regression problems under distribution shift. |
Qi Lei; Wei Hu; Jason Lee; |
564 | Stability and Generalization of Stochastic Gradient Methods for Minimax Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we provide a comprehensive generalization analysis of stochastic gradient methods for minimax problems under both convex-concave and nonconvex-nonconcave cases through the lens of algorithmic stability. |
Yunwen Lei; Zhenhuan Yang; Tianbao Yang; Yiming Ying; |
565 | Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our contribution, Melting Pot, is a MARL evaluation suite that fills this gap and uses reinforcement learning to reduce the human labor required to create novel test scenarios. |
Joel Z Leibo; Edgar A Dueñez-Guzman; Alexander Vezhnevets; John P Agapiou; Peter Sunehag; Raphael Koster; Jayd Matyas; Charlie Beattie; Igor Mordatch; Thore Graepel; |
566 | Better Training Using Weight-Constrained Stochastic Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We provide a general approach to efficiently incorporate constraints into a stochastic gradient Langevin framework, allowing enhanced exploration of the loss landscape. |
Benedict Leimkuhler; Tiffany J Vlaar; Timothée Pouchon; Amos Storkey; |
567 | Globally-Robust Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that widely-used architectures can be easily adapted to this objective by incorporating efficient global Lipschitz bounds into the network, yielding certifiably-robust models by construction that achieve state-of-the-art verifiable accuracy. |
Klas Leino; Zifan Wang; Matt Fredrikson; |
568 | Learning to Price Against A Moving Target Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we study the problem where the buyer’s value is a moving target, i.e., it changes over time either by a stochastic process or adversarially with bounded variation. |
Renato Paes Leme; Balasubramanian Sivan; Yifeng Teng; Pratik Worah; |
569 | SigGPDE: Scaling Sparse Gaussian Processes on Sequential Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop SigGPDE, a new scalable sparse variational inference framework for Gaussian Processes (GPs) on sequential data. |
Maud Lemercier; Cristopher Salvi; Thomas Cass; Edwin V. Bonilla; Theodoros Damoulas; Terry J Lyons; |
570 | Strategic Classification Made Practical Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present a learning framework for strategic classification that is practical. |
Sagi Levanon; Nir Rosenfeld; |
571 | Improved, Deterministic Smoothing for L_1 Certified Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a non-additive and deterministic smoothing method, Deterministic Smoothing with Splitting Noise (DSSN). |
Alexander J Levine; Soheil Feizi; |
572 | BASE Layers: Simplifying Training of Large, Sparse Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new balanced assignment of experts (BASE) layer for large language models that greatly simplifies existing high capacity sparse layers. |
Mike Lewis; Shruti Bhosale; Tim Dettmers; Naman Goyal; Luke Zettlemoyer; |
573 | Run-Sort-ReRun: Escaping Batch Size Limitations in Sliced Wasserstein Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we build upon recent progress in sliced Wasserstein distances, a family of differentiable metrics for distribution discrepancy based on the Optimal Transport paradigm. |
Jose Lezama; Wei Chen; Qiang Qiu; |
574 | PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel stochastic gradient estimator—ProbAbilistic Gradient Estimator (PAGE)—for nonconvex optimization. |
Zhize Li; Hongyan Bao; Xiangliang Zhang; Peter Richtarik; |
575 | Tightening The Dependence on Horizon in The Sample Complexity of Q-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we sharpen the sample complexity of synchronous Q-learning to the order of $\frac{|S||A|}{(1-\gamma)^4\varepsilon^2}$ (up to some logarithmic factor) for any $0 < \varepsilon < 1$, leading to an order-wise improvement in $\frac{1}{1-\gamma}$. |
Gen Li; Changxiao Cai; Yuxin Chen; Yuantao Gu; Yuting Wei; Yuejie Chi; |
576 | Winograd Algorithm for AdderNet Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To further optimize the hardware overhead of using AdderNet, this paper studies the winograd algorithm, which is a widely used fast algorithm for accelerating convolution and saving the computational costs. |
Wenshuo Li; Hanting Chen; Mingqiang Huang; Xinghao Chen; Chunjing Xu; Yunhe Wang; |
577 | A Free Lunch From ANN: Towards Efficient, Accurate Spiking Neural Networks Calibration Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce SNN Calibration, a cheap but extraordinarily effective method by leveraging the knowledge within a pre-trained Artificial Neural Network (ANN). |
Yuhang Li; Shikuang Deng; Xin Dong; Ruihao Gong; Shi Gu; |
578 | Privacy-Preserving Feature Selection with Secure Multiparty Computation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose the first MPC based protocol for private feature selection based on the filter method, which is independent of model training, and can be used in combination with any MPC protocol to rank features. |
Xiling Li; Rafael Dowsley; Martine De Cock; |
579 | Theory of Spectral Method for Union of Subspaces-Based Random Geometry Graph Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper establishes a theory to show the power of this method for the first time, in which we demonstrate the mechanism of spectral clustering by analyzing a simplified algorithm under the widely used semi-random model. |
Gen Li; Yuantao Gu; |
580 | MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study a more tractable class of reinforcement learning problems defined simply by examples of successful outcome states, which can be much easier to provide while still making the exploration problem more tractable. |
Kevin Li; Abhishek Gupta; Ashwin Reddy; Vitchyr H Pong; Aurick Zhou; Justin Yu; Sergey Levine; |
581 | Ditto: Fair and Robust Federated Learning Through Personalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we identify that robustness to data and model poisoning attacks and fairness, measured as the uniformity of performance across devices, are competing constraints in statistically heterogeneous networks. |
Tian Li; Shengyuan Hu; Ahmad Beirami; Virginia Smith; |
582 | Quantization Algorithms for Random Fourier Features Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on developing quantization algorithms for RFF. |
Xiaoyun Li; Ping Li; |
583 | Approximate Group Fairness for Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Particularly, we propose two dimensions to relax core requirements: one is on the degree of distance improvement, and the other is on the size of deviating coalition. |
Bo Li; Lijun Li; Ankang Sun; Chenhao Wang; Yingfan Wang; |
584 | Sharper Generalization Bounds for Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a unified clustering learning framework and investigate its excess risk bounds, obtaining state-of-the-art upper bounds under mild assumptions. |
Shaojie Li; Yong Liu; |
585 | Provably End-to-end Label-noise Learning Without Anchor Points Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an end-to-end framework for solving label-noise learning without anchor points, in which we simultaneously optimize two objectives: the cross entropy loss between the noisy label and the predicted probability by the neural network, and the volume of the simplex formed by the columns of the transition matrix. |
Xuefeng Li; Tongliang Liu; Bo Han; Gang Niu; Masashi Sugiyama; |
586 | A Novel Method to Solve Neural Knapsack Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a game-theoretic method to solve 0-1 knapsack problems (KPs) where the number of items (products) is large and the values of items are not predetermined but decided by an external value assignment function (e.g., a neural network in our case) during the optimization process. |
Duanshun Li; Jing Liu; Dongeun Lee; Ali Seyedmazloom; Giridhar Kaushik; Kookjin Lee; Noseong Park; |
587 | Mixed Cross Entropy Loss for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose mixed Cross Entropy loss (mixed CE) as a substitute for CE in both training approaches. |
Haoran Li; Wei Lu; |
588 | Training Graph Neural Networks with 1000 Layers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study reversible connections, group convolutions, weight tying, and equilibrium models to advance the memory and parameter efficiency of GNNs. |
Guohao Li; Matthias Müller; Bernard Ghanem; Vladlen Koltun; |
589 | Active Feature Acquisition with Generative Surrogate Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we consider models that perform active feature acquisition (AFA) and query the environment for unobserved features to improve the prediction assessments at evaluation time. |
Yang Li; Junier Oliva; |
590 | Partially Observed Exchangeable Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a novel framework, partially observed exchangeable modeling (POEx) that takes in a set of related partially observed instances and infers the conditional distribution for the unobserved dimensions over multiple elements. |
Yang Li; Junier Oliva; |
591 | Testing DNN-based Autonomous Driving Systems Under Critical Environmental Conditions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose to test DNN-based ADS under different environmental conditions to identify the critical ones, that is, the environmental conditions under which the ADS are more prone to errors. |
Zhong Li; Minxue Pan; Tian Zhang; Xuandong Li; |
592 | The Symmetry Between Arms and Knapsacks: A Primal-Dual Approach for Bandits with Knapsacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the bandits with knapsacks (BwK) problem and develop a primal-dual based algorithm that achieves a problem-dependent logarithmic regret bound. |
Xiaocheng Li; Chunlin Sun; Yinyu Ye; |
593 | Distributionally Robust Optimization with Markovian Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a data-driven distributionally robust optimization model to estimate the problem’s objective function and optimal solution. |
Mengmeng Li; Tobias Sutter; Daniel Kuhn; |
594 | Communication-Efficient Distributed SVD Via Local Power Iterations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In the aggregation, we propose to weight each local eigenvector matrix with orthogonal Procrustes transformation (OPT). |
Xiang Li; Shusen Wang; Kun Chen; Zhihua Zhang; |
595 | FILTRA: Rethinking Steerable CNN By Filter Transform Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that kernel constructed by filter transform can also be interpreted in the group representation theory. |
Bo Li; Qili Wang; Gim Hee Lee; |
596 | Online Unrelated Machine Load Balancing with Predictions Revisited Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the online load balancing problem with machine learned predictions, and give results that improve upon and extend those in a recent paper by Lattanzi et al. (2020). |
Shi Li; Jiayi Xian; |
597 | Asymptotic Normality and Confidence Intervals for Prediction Risk of The Min-Norm Least Squares Estimator Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper quantifies the uncertainty of prediction risk for the min-norm least squares estimator in high-dimensional linear regression models. |
Zeng Li; Chuanlong Xie; Qinwen Wang; |
598 | TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we identify a new and orthogonal dimension from existing model parallel approaches: it is possible to perform pipeline parallelism within a single training sequence for Transformer-based language models thanks to its autoregressive property. |
Zhuohan Li; Siyuan Zhuang; Shiyuan Guo; Danyang Zhuo; Hao Zhang; Dawn Song; Ion Stoica; |
599 | A Second Look at Exponential and Cosine Step Sizes: Simplicity, Adaptivity, and Performance Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study two step size schedules whose power has been repeatedly confirmed in practice: the exponential and the cosine step sizes. |
Xiaoyu Li; Zhenxun Zhuang; Francesco Orabona; |
600 | Towards Understanding and Mitigating Social Biases in Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: With these tools, we propose steps towards mitigating social biases during text generation. |
Paul Pu Liang; Chiyu Wu; Louis-Philippe Morency; Ruslan Salakhutdinov; |
601 | Uncovering The Connections Between Adversarial Transferability and Knowledge Transferability Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, as the first work, we analyze and demonstrate the connections between knowledge transferability and another important phenomenon–adversarial transferability, i.e., adversarial examples generated against one model can be transferred to attack other models. |
Kaizhao Liang; Jacky Y Zhang; Boxin Wang; Zhuolin Yang; Sanmi Koyejo; Bo Li; |
602 | Parallel Droplet Control in MEDA Biochips Using Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To overcome these problems, we present a multi-agent reinforcement learning (MARL) droplet-routing solution that can be used for various sizes of MEDA biochips with integrated sensors, and we demonstrate the reliable execution of a serial-dilution bioassay with the MARL droplet router on a fabricated MEDA biochip. |
Tung-Che Liang; Jin Zhou; Yun-Sheng Chan; Tsung-Yi Ho; Krishnendu Chakrabarty; Cy Lee; |
603 | Information Obfuscation of Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data. |
Peiyuan Liao; Han Zhao; Keyulu Xu; Tommi Jaakkola; Geoffrey J. Gordon; Stefanie Jegelka; Ruslan Salakhutdinov; |
604 | Guided Exploration with Proximal Policy Optimization Using A Single Demonstration Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We train an agent on a combination of demonstrations and own experience to solve problems with variable initial conditions and we integrate it with proximal policy optimization (PPO). |
Gabriele Libardi; Gianni De Fabritiis; Sebastian Dittert; |
605 | Debiasing A First-order Heuristic for Approximate Bi-level Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We contribute by theoretically characterizing FOM’s gradient bias under mild assumptions. |
Valerii Likhosherstov; Xingyou Song; Krzysztof Choromanski; Jared Q Davis; Adrian Weller; |
606 | Making Transport More Robust and Interpretable By Moving Data Through A Small Number of Anchor Points Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we introduce Latent Optimal Transport (LOT), a new approach for OT that simultaneously learns low-dimensional structure in data while leveraging this structure to solve the alignment task. |
Chi-Heng Lin; Mehdi Azabou; Eva Dyer; |
607 | Straight to The Gradient: Learning to Use Novel Tokens for Neural Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce ScaleGrad, a modification straight to the gradient of the loss function, to remedy the degeneration issue of the standard MLE objective. |
Xiang Lin; Simeng Han; Shafiq Joty; |
608 | Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate and identify the limitation of several decentralized optimization algorithms for different degrees of data heterogeneity. |
Tao Lin; Sai Praneeth Karimireddy; Sebastian Stich; Martin Jaggi; |
609 | Generative Causal Explanations for Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents Gem, a model-agnostic approach for providing interpretable explanations for any GNNs on various graph learning tasks. |
Wanyu Lin; Hao Lan; Baochun Li; |
610 | Tractable Structured Natural-gradient Descent Using Local Parameterizations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We address this issue by using \emph{local-parameter coordinates} to obtain a flexible and efficient NGD method that works well for a wide-variety of structured parameterizations. |
Wu Lin; Frank Nielsen; Khan Mohammad Emtiyaz; Mark Schmidt; |
611 | Active Learning of Continuous-time Bayesian Networks Through Interventions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel criterion for experimental design based on a variational approximation of the expected information gain. |
Dominik Linzner; Heinz Koeppl; |
612 | Phase Transitions, Distance Functions, and Implicit Neural Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we draw inspiration from the theory of phase transitions of fluids and suggest a loss for training INRs that learns a density function that converges to a proper occupancy function, while its log transform converges to a distance function. |
Yaron Lipman; |
613 | The Earth Mover’s Pinball Loss: Quantiles for Histogram-Valued Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a dedicated method for Deep Learning-based histogram regression, which incorporates cross-bin information and yields distributions over possible histograms, expressed by $\tau$-quantiles of the cumulative histogram in each bin. |
Florian List; |
614 | Understanding Instance-Level Label Noise: Disparate Impacts and Treatments Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper aims to provide understandings for the effect of an over-parameterized model, e.g. a deep neural network, memorizing instance-dependent noisy labels. |
Yang Liu; |
615 | APS: Active Pretraining with Successor Features Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new unsupervised pretraining objective for reinforcement learning. |
Hao Liu; Pieter Abbeel; |
616 | Learning By Turning: Neural Architecture Aware Optimisation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this problem, this paper conducts a combined study of neural architecture and optimisation, leading to a new optimiser called Nero: the neuronal rotator. |
Yang Liu; Jeremy Bernstein; Markus Meister; Yisong Yue; |
617 | Dynamic Game Theoretic Neural Optimizer Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a novel dynamic game perspective by viewing each layer as a player in a dynamic game characterized by the DNN itself. |
Guan-Horng Liu; Tianrong Chen; Evangelos Theodorou; |
618 | Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To bridge this gap, we propose to exploit the low-dimensional structures of the real world datasets and establish theoretical guarantees of convolutional residual networks (ConvResNet) in terms of function approximation and statistical recovery for binary classification problem. |
Hao Liu; Minshuo Chen; Tuo Zhao; Wenjing Liao; |
619 | Just Train Twice: Improving Group Robustness Without Training Group Information Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a simple two-stage approach, JTT, that achieves comparable performance to group DRO while only requiring group annotations on a significantly smaller validation set. |
Evan Z Liu; Behzad Haghgoo; Annie S Chen; Aditi Raghunathan; Pang Wei Koh; Shiori Sagawa; Percy Liang; Chelsea Finn; |
620 | Event Outlier Detection in Continuous Time Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study and develop methods for detecting outliers in continuous-time event sequences, including unexpected absence and unexpected occurrences of events. |
Siqi Liu; Milos Hauskrecht; |
621 | Heterogeneous Risk Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose Heterogeneous Risk Minimization (HRM) framework to achieve joint learning of latent heterogeneity among the data and invariant relationship, which leads to stable prediction despite distributional shifts. |
Jiashuo Liu; Zheyuan Hu; Peng Cui; Bo Li; Zheyan Shen; |
622 | Stochastic Iterative Graph Matching Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Considering that model outputs are complex matchings, we devise several techniques to improve the learning of GNNs and obtain a new model, Stochastic Iterative Graph MAtching (SIGMA). |
Linfeng Liu; Michael C Hughes; Soha Hassoun; Liping Liu; |
623 | Cooperative Exploration for Multi-Agent Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this shortcoming, in this paper, we propose cooperative multi-agent exploration (CMAE): agents share a common goal while exploring. |
Iou-Jen Liu; Unnat Jain; Raymond A Yeh; Alexander Schwing; |
624 | Elastic Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In particular, we propose a novel and general message passing scheme into GNNs. |
Xiaorui Liu; Wei Jin; Yao Ma; Yaxin Li; Hua Liu; Yiqi Wang; Ming Yan; Jiliang Tang; |
625 | One Pass Late Fusion Multi-view Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this issue, we propose to unify the aforementioned two learning procedures into a single optimization, in which the consensus partition matrix can better serve for the generation of cluster labels, and the latter is able to guide the learning of the former. |
Xinwang Liu; Li Liu; Qing Liao; Siwei Wang; Yi Zhang; Wenxuan Tu; Chang Tang; Jiyuan Liu; En Zhu; |
626 | Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Coordinating teams with such dynamic composition is challenging: the optimal team strategy varies with the composition. We propose COPA, a coach-player framework to tackle this problem. |
Bo Liu; Qiang Liu; Peter Stone; Animesh Garg; Yuke Zhu; Anima Anandkumar; |
627 | From Local to Global Norm Emergence: Dissolving Self-reinforcing Substructures with Incremental Social Instruments Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose incremental social instruments (ISI) to dissolve these SRSs by creating ties between agents. |
Yiwei Liu; Jiamou Liu; Kaibin Wan; Zhan Qin; Zijian Zhang; Bakhadyr Khoussainov; Liehuang Zhu; |
628 | A Value-Function-based Interior-point Method for Non-convex Bi-level Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a new gradient-based solution scheme, namely, the Bi-level Value-Function-based Interior-point Method (BVFIM). |
Risheng Liu; Xuan Liu; Xiaoming Yuan; Shangzhi Zeng; Jin Zhang; |
629 | Selfish Sparse RNN Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an approach to train intrinsically sparse RNNs with a fixed parameter count in one single run, without compromising performance. |
Shiwei Liu; Decebal Constantin Mocanu; Yulong Pei; Mykola Pechenizkiy; |
630 | Temporal Difference Learning As Gradient Splitting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We provide an interpretation of this method in terms of a splitting of the gradient of an appropriately chosen function. |
Rui Liu; Alex Olshevsky; |
631 | On Robust Mean Estimation Under Coordinate-level Corruption Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the problem of robust mean estimation and introduce a novel Hamming distance-based measure of distribution shift for coordinate-level corruptions. |
Zifan Liu; Jong Ho Park; Theodoros Rekatsinas; Christos Tzamos; |
632 | Decoupling Exploration and Exploitation for Meta-Reinforcement Learning Without Sacrifices Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We alleviate both concerns by constructing an exploitation objective that automatically identifies task-relevant information and an exploration objective to recover only this information. |
Evan Z Liu; Aditi Raghunathan; Percy Liang; Chelsea Finn; |
633 | How Do Adam and Training Strategies Help BNNs Optimization? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this, in this paper we first investigate the trajectories of gradients and weights in BNNs during the training process. |
Zechun Liu; Zhiqiang Shen; Shichao Li; Koen Helwegen; Dong Huang; Kwang-Ting Cheng; |
634 | SagaNet: A Small Sample Gated Network for Pediatric Cancer Diagnosis Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a novel model to solve the diagnosis task of small round blue cell tumors (SRBCTs). |
Yuhan Liu; Shiliang Sun; |
635 | Learning Deep Neural Networks Under Agnostic Corrupted Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To alleviate this problem, we present an efficient robust algorithm that achieves strong guarantees without any assumption on the type of corruption and provides a unified framework for both classification and regression problems. |
Boyang Liu; Mengying Sun; Ding Wang; Pang-Ning Tan; Jiayu Zhou; |
636 | Leveraging Public Data for Practical Private Query Release Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: With the goal of releasing statistics about a private dataset, we present PMW^Pub, which—unlike existing baselines—leverages public data drawn from a related distribution as prior information. |
Terrance Liu; Giuseppe Vietri; Thomas Steinke; Jonathan Ullman; Steven Wu; |
637 | Watermarking Deep Neural Networks with Greedy Residuals Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel watermark-based ownership protection method by using the residuals of important parameters. |
Hanwen Liu; Zhenyu Weng; Yuesheng Zhu; |
638 | Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce a new perspective on training deep neural networks capable of state-of-the-art performance without the need for the expensive over-parameterization by proposing the concept of In-Time Over-Parameterization (ITOP) in sparse training. |
Shiwei Liu; Lu Yin; Decebal Constantin Mocanu; Mykola Pechenizkiy; |
639 | A Sharp Analysis of Model-based Reinforcement Learning with Self-Play Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a sharp analysis of model-based self-play algorithms for multi-agent Markov games. |
Qinghua Liu; Tiancheng Yu; Yu Bai; Chi Jin; |
640 | Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we investigate the underlying condition and rationale behind the winning property, and find that the underlying reason is largely attributed to the correlation between initialized weights and final-trained weights when the learning rate is not sufficiently large. |
Ning Liu; Geng Yuan; Zhengping Che; Xuan Shen; Xiaolong Ma; Qing Jin; Jian Ren; Jian Tang; Sijia Liu; Yanzhi Wang; |
641 | Group Fisher Pruning for Practical Network Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a general channel pruning approach that can be applied to various complicated structures. |
Liyang Liu; Shilong Zhang; Zhanghui Kuang; Aojun Zhou; Jing-Hao Xue; Xinjiang Wang; Yimin Chen; Wenming Yang; Qingmin Liao; Wayne Zhang; |
642 | Infinite-Dimensional Optimization for Zero-Sum Games Via Variational Transport Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider infinite-dimensional zero-sum games by a min-max distributional optimization problem over a space of probability measures defined on a continuous variable set, which is inspired by finding a mixed NE for finite-dimensional zero-sum games. |
Lewis Liu; Yufeng Zhang; Zhuoran Yang; Reza Babanezhad; Zhaoran Wang; |
643 | Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose to study the basic properties of SGD and its variants in the non-vanishing learning rate regime. |
Kangqiao Liu; Liu Ziyin; Masahito Ueda; |
644 | Multi-layered Network Exploration Via Random Walks: From Offline Optimization to Online Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The MuLaNE task is to allocate total random walk budget B into each network layer so that the total weights of the unique nodes visited by random walks are maximized. We systematically study this problem from offline optimization to online learning. |
Xutong Liu; Jinhang Zuo; Xiaowei Chen; Wei Chen; John C. S. Lui; |
645 | Relative Positional Encoding for Transformers with Linear Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we bridge this gap and present Stochastic Positional Encoding as a way to generate PE that can be used as a replacement to the classical additive (sinusoidal) PE and provably behaves like RPE. |
Antoine Liutkus; Ondřej Cífka; Shih-Lun Wu; Umut Simsekli; Yi-Hsuan Yang; Gael Richard; |
646 | Joint Online Learning and Decision-making Via Dual Mirror Descent Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel offline benchmark and a new algorithm that mixes an online dual mirror descent scheme with a generic parameter learning process. |
Alfonso Lobos; Paul Grigas; Zheng Wen; |
647 | Symmetric Spaces for Graph Embeddings: A Finsler-Riemannian Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This enables us to introduce a new method, the use of Finsler metrics integrated in a Riemannian optimization scheme, that better adapts to dissimilar structures in the graph. |
Federico Lopez; Beatrice Pozzetti; Steve Trettel; Michael Strube; Anna Wienhard; |
648 | HEMET: A Homomorphic-Encryption-Friendly Privacy-Preserving Mobile Neural Network Architecture Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a HE-friendly privacy-preserving Mobile neural nETwork architecture, HEMET. |
Qian Lou; Lei Jiang; |
649 | Optimal Complexity in Decentralized Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we provide a tight lower bound on the iteration complexity for such methods in a stochastic non-convex setting. |
Yucheng Lu; Christopher De Sa; |
650 | DANCE: Enhancing Saliency Maps Using Decoys Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these issues, we propose a framework, DANCE, which improves the robustness of saliency methods by following a two-step procedure. |
Yang Young Lu; Wenbo Guo; Xinyu Xing; William Stafford Noble; |
651 | Binary Classification from Multiple Unlabeled Datasets Via Surrogate Set Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new approach for binary classification from $m$ U-sets for $m\ge2$. |
Nan Lu; Shida Lei; Gang Niu; Issei Sato; Masashi Sugiyama; |
652 | Variance Reduced Training with Stratified Sampling for Forecasting Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we provably show under such heterogeneity, training a forecasting model with commonly used stochastic optimizers (e.g. SGD) potentially suffers large variance on gradient estimation, and thus incurs long-time training. |
Yucheng Lu; Youngsuk Park; Lifan Chen; Yuyang Wang; Christopher De Sa; Dean Foster; |
653 | ACE: Explaining Cluster from An Adversarial Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we propose an integrated deep learning framework, Adversarial Clustering Explanation (ACE), that bundles all three steps into a single workflow. |
Yang Young Lu; Timothy C Yu; Giancarlo Bonora; William Stafford Noble; |
654 | On Monotonic Linear Interpolation of Neural Network Parameters Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This Monotonic Linear Interpolation (MLI) property, first observed by Goodfellow et al. 2014, persists in spite of the non-convex objectives and highly non-linear training dynamics of neural networks. Extending this work, we evaluate several hypotheses for this property that, to our knowledge, have not yet been explored. |
James R Lucas; Juhan Bae; Michael R Zhang; Stanislav Fort; Richard Zemel; Roger B Grosse; |
655 | Improving Breadth-Wise Backpropagation in Graph Neural Networks Helps Learning Long-Range Dependencies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we focus on the ability of graph neural networks (GNNs) to learn long-range patterns in graphs with edge features. |
Denis Lukovnikov; Asja Fischer; |
656 | GraphDF: A Discrete Flow Model for Molecular Graph Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose GraphDF, a novel discrete latent variable model for molecular graph generation based on normalizing flow methods. |
Youzhi Luo; Keqiang Yan; Shuiwang Ji; |
657 | Trajectory Diversity for Zero-Shot Coordination Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we introduce Trajectory Diversity (TrajeDi) – a differentiable objective for generating diverse reinforcement learning policies. |
Andrei Lupu; Brandon Cui; Hengyuan Hu; Jakob Foerster; |
658 | HyperHyperNetwork for The Design of Antenna Arrays Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present deep learning methods for the design of arrays and single instances of small antennas. |
Shahar Lutati; Lior Wolf; |
659 | Value Iteration in Continuous Actions, States and Time Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose continuous fitted value iteration (cFVI). |
Michael Lutter; Shie Mannor; Jan Peters; Dieter Fox; Animesh Garg; |
660 | Meta-Cal: Well-controlled Post-hoc Calibration By Ranking Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce two constraints that are worth consideration in designing a calibration map for post-hoc calibration. |
Xingchen Ma; Matthew B. Blaschko; |
661 | Neural-Pull: Learning Signed Distance Function from Point Clouds By Learning to Pull Space Onto Surface Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce Neural-Pull, a new approach that is simple and leads to high quality SDFs. |
Baorui Ma; Zhizhong Han; Yu-Shen Liu; Matthias Zwicker; |
662 | Learning Stochastic Behaviour from Aggregate Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel method using the weak form of Fokker Planck Equation (FPE) — a partial differential equation — to describe the density evolution of data in a sampled form, which is then combined with Wasserstein generative adversarial network (WGAN) in the training process. |
Shaojun Ma; Shu Liu; Hongyuan Zha; Haomin Zhou; |
663 | Local Algorithms for Finding Densely Connected Clusters Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Following this line of research, in this work we study local algorithms for finding a pair of vertex sets defined with respect to their inter-connection and their relationship with the rest of the graph. |
Peter Macgregor; He Sun; |
664 | Learning to Generate Noise for Multi-Attack Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these challenges, we propose a novel meta-learning framework that explicitly learns to generate noise to improve the model’s robustness against multiple types of attacks. |
Divyam Madaan; Jinwoo Shin; Sung Ju Hwang; |
665 | Learning Interaction Kernels for Agent Systems on Riemannian Manifolds Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of learning interaction kernels in these dynamical systems constrained to evolve on Riemannian manifolds from given trajectory data. |
Mauro Maggioni; Jason J Miller; Hongda Qiu; Ming Zhong; |
666 | Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we focus on the fundamental hurdle affecting both value-based and policy-gradient approaches: an exponential blowup of the action space with the number of agents. |
Anuj Mahajan; Mikayel Samvelyan; Lei Mao; Viktor Makoviychuk; Animesh Garg; Jean Kossaifi; Shimon Whiteson; Yuke Zhu; Animashree Anandkumar; |
667 | Domain Generalization Using Causal Matching Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on this objective, we propose matching-based algorithms when base objects are observed (e.g., through data augmentation) and approximate the objective when objects are not observed (MatchDG). |
Divyat Mahajan; Shruti Tople; Amit Sharma; |
668 | Stability and Convergence of Stochastic Gradient Clipping: Beyond Lipschitz Continuity and Smoothness Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper establishes both qualitative and quantitative convergence results of the clipped stochastic (sub)gradient method (SGD) for non-smooth convex functions with rapidly growing subgradients. |
Vien V. Mai; Mikael Johansson; |
669 | Nonparametric Hamiltonian Monte Carlo Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces the Nonparametric Hamiltonian Monte Carlo (NP-HMC) algorithm which generalises HMC to nonparametric models. |
Carol Mak; Fabian Zaiser; Luke Ong; |
670 | Exploiting Structured Data for Learning Contagious Diseases Under Incomplete Testing Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we ask: can we build reliable infection prediction models when the observed data is collected under limited, and biased testing that prioritizes testing symptomatic individuals? |
Maggie Makar; Lauren West; David Hooper; Eric Horvitz; Erica Shenoy; John Guttag; |
671 | Near-Optimal Algorithms for Explainable K-Medians and K-Means Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new algorithm for this problem which is $\tilde O(\log k)$ competitive with $k$-medians with $\ell_1$ norm and $\tilde O(k)$ competitive with $k$-means. |
Konstantin Makarychev; Liren Shan; |
672 | KO Codes: Inventing Nonlinear Encoding and Decoding for Reliable Wireless Communication Via Deep-learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we construct KO codes, a computationally efficient family of deep-learning driven (encoder, decoder) pairs that outperform the state-of-the-art reliability performance on the standardized AWGN channel. |
Ashok V Makkuva; Xiyang Liu; Mohammad Vahid Jamali; Hessam Mahdavifar; Sewoong Oh; Pramod Viswanath; |
673 | Quantifying The Benefit of Using Differentiable Learning Over Tangent Kernels Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the relative power of learning with gradient descent on differentiable models, such as neural networks, versus using the corresponding tangent kernels. |
Eran Malach; Pritish Kamath; Emmanuel Abbe; Nathan Srebro; |
674 | Inverse Constrained Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we consider the problem of learning constraints from demonstrations of a constraint-abiding agent’s behavior. |
Shehryar Malik; Usman Anwar; Alireza Aghasi; Ali Ahmed; |
675 | A Sampling-Based Method for Tensor Ring Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a sampling-based method for computing the tensor ring (TR) decomposition of a data tensor. |
Osman Asif Malik; Stephen Becker; |
676 | Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To resolve this discrepancy between theory and practice, we introduce the Effective Planning Window (EPW) condition, a structural condition on MDPs that makes no linearity assumptions. |
Dhruv Malik; Aldo Pacchiano; Vishwak Srinivasan; Yuanzhi Li; |
677 | Beyond The Pareto Efficient Frontier: Constraint Active Search for Multiobjective Experimental Design Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce an active search algorithm called Expected Coverage Improvement (ECI) to efficiently discover the region of satisfaction and simultaneously sample diverse acceptable configurations. |
Gustavo Malkomes; Bolong Cheng; Eric H Lee; Mike McCourt; |
678 | Consistent Nonparametric Methods for Network Assisted Covariate Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose a new similarity measure between two nodes based on the patterns of their 2-hop neighborhoods. |
Xueyu Mao; Deepayan Chakrabarti; Purnamrita Sarkar; |
679 | Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Restarted Q-Learning with Upper Confidence Bounds (RestartQ-UCB), the first model-free algorithm for non-stationary RL, and show that it outperforms existing solutions in terms of dynamic regret. |
Weichao Mao; Kaiqing Zhang; Ruihao Zhu; David Simchi-Levi; Tamer Basar; |
680 | Adaptive Sampling for Best Policy Identification in Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model. |
Aymen Al Marjani; Alexandre Proutiere; |
681 | Explanations for Monotonic Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes novel algorithms for the computation of one formal explanation of a (black-box) monotonic classifier. |
Joao Marques-Silva; Thomas Gerspacher; Martin C Cooper; Alexey Ignatiev; Nina Narodytska; |
682 | Multi-Agent Training Beyond Zero-Sum with Correlated Equilibrium Meta-Solvers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium. |
Luke Marris; Paul Muller; Marc Lanctot; Karl Tuyls; Thore Graepel; |
683 | Blind Pareto Fairness and Subgroup Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we analyze the space of solutions for worst-case fairness beyond demographics, and propose Blind Pareto Fairness (BPF), a method that leverages no-regret dynamics to recover a fair minimax classifier that reduces worst-case risk of any potential subgroup of sufficient size, and guarantees that the remaining population receives the best possible level of service. |
Natalia L Martinez; Martin A Bertran; Afroditi Papadaki; Miguel Rodrigues; Guillermo Sapiro; |
684 | Necessary and Sufficient Conditions for Causal Feature Selection in Time Series with Latent Common Causes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the identification of direct and indirect causes on time series with latent variables, and provide a constrained-based causal feature selection method, which we prove that is both sound and complete under some graph constraints. |
Atalanti A Mastakouri; Bernhard Schölkopf; Dominik Janzing; |
685 | Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose two kernel-based methods for nonlinear causal effect estimation in this setting: (a) a two-stage regression approach, and (b) a maximum moment restriction approach. |
Afsaneh Mastouri; Yuchen Zhu; Limor Gultchin; Anna Korba; Ricardo Silva; Matt Kusner; Arthur Gretton; Krikamol Muandet; |
686 | Robust Unsupervised Learning Via L-statistic Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a general approach to this problem focusing on unsupervised learning. |
Andreas Maurer; Daniela Angela Parletta; Andrea Paudice; Massimiliano Pontil; |
687 | Adversarial Multi Class Learning Under Weak Supervision with Performance Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop a rigorous approach for using a set of arbitrarily correlated weak supervision sources in order to solve a multiclass classification task when only a very small set of labeled data is available. |
Alessio Mazzetto; Cyrus Cousins; Dylan Sam; Stephen H Bach; Eli Upfal; |
688 | Fundamental Tradeoffs in Distributionally Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the distribution perturbing adversary framework wherein the adversary can change the test distribution within a neighborhood of the training data distribution. |
Mohammad Mehrabi; Adel Javanmard; Ryan A. Rossi; Anup Rao; Tung Mai; |
689 | Leveraging Non-uniformity in First-order Non-convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by properties of objective functions that arise in machine learning, we propose a non-uniform refinement of these notions, leading to Non-uniform Smoothness (NS) and the Non-uniform Łojasiewicz inequality (NŁ). |
Jincheng Mei; Yue Gao; Bo Dai; Csaba Szepesvari; Dale Schuurmans; |
690 | Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of controlling a partially-observed dynamic process on a graph by a limited number of interventions. |
Eli Meirom; Haggai Maron; Shie Mannor; Gal Chechik; |
691 | A Theory of High Dimensional Regression with Arbitrary Correlations Between Input Features and Target Functions: Sample Complexity, Multiple Descent Curves and A Hierarchy of Phase Transitions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To understand this better we revisit ridge regression in high dimensions, which corresponds to an exceedingly simple architecture and loss function, but we analyze its performance under arbitrary correlations between input features and the target function. |
Gabriel Mel; Surya Ganguli; |
692 | Neural Architecture Search Without Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network’s trained performance. |
Joe Mellor; Jack Turner; Amos Storkey; Elliot J Crowley; |
693 | Fast Active Learning for Pure Exploration in Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that, surprisingly, for a pure-exploration objective of reward-free exploration, bonuses that scale with $1/n$ bring faster learning rates, improving the known upper bounds with respect to the dependence on the horizon $H$. |
Pierre Menard; Omar Darwiche Domingues; Anders Jonsson; Emilie Kaufmann; Edouard Leurent; Michal Valko; |
694 | UCB Momentum Q-learning: Correcting The Bias Without Forgetting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose UCBMQ, Upper Confidence Bound Momentum Q-learning, a new algorithm for reinforcement learning in tabular and possibly stage-dependent, episodic Markov decision process. |
Pierre Menard; Omar Darwiche Domingues; Xuedong Shang; Michal Valko; |
695 | An Integer Linear Programming Framework for Mining Constraints from Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a general framework for mining constraints from data. |
Tao Meng; Kai-Wei Chang; |
696 | A Statistical Perspective on Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a statistical perspective on distillation which provides an answer to these questions. |
Aditya K Menon; Ankit Singh Rawat; Sashank Reddi; Seungyeon Kim; Sanjiv Kumar; |
697 | Learn2Hop: Learned Optimization on Rough Landscapes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose adapting recent developments in meta-learning to these many-minima problems by learning the optimization algorithm for various loss landscapes. |
Amil Merchant; Luke Metz; Samuel S Schoenholz; Ekin D Cubuk; |
698 | Counterfactual Credit Assignment in Model-Free Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We formulate a family of policy gradient algorithms that use these future-conditional value functions as baselines or critics, and show that they are provably low variance. |
Thomas Mesnard; Theophane Weber; Fabio Viola; Shantanu Thakoor; Alaa Saade; Anna Harutyunyan; Will Dabney; Thomas S Stepleton; Nicolas Heess; Arthur Guez; Eric Moulines; Marcus Hutter; Lars Buesing; Remi Munos; |
699 | Provably Efficient Learning of Transferable Rewards Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the theoretical properties of the class of reward functions that are compatible with the expert’s behavior. |
Alberto Maria Metelli; Giorgia Ramponi; Alessandro Concetti; Marcello Restelli; |
700 | Mixed Nash Equilibria in The Adversarial Examples Game Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper tackles the problem of adversarial examples from a game theoretic point of view. |
Laurent Meunier; Meyer Scetbon; Rafael B Pinot; Jamal Atif; Yann Chevaleyre; |
701 | Learning in Nonzero-Sum Stochastic Games with Potentials Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce a new generation of MARL learners that can handle nonzero-sum payoff structures and continuous settings. |
David H Mguni; Yutong Wu; Yali Du; Yaodong Yang; Ziyi Wang; Minne Li; Ying Wen; Joel Jennings; Jun Wang; |
702 | EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we address the Text-to-Speech (TTS) task by proposing a non-autoregressive architecture called EfficientTTS. |
Chenfeng Miao; Liang Shuang; Zhengchen Liu; Chen Minchuan; Jun Ma; Shaojun Wang; Jing Xiao; |
703 | Outside The Echo Chamber: Optimizing The Performative Risk Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we shift attention beyond performative stability and focus on optimizing the performative risk directly. |
John P Miller; Juan C Perdomo; Tijana Zrnic; |
704 | Accuracy on The Line: on The Strong Correlation Between Out-of-Distribution and In-Distribution Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we empirically show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts. |
John P Miller; Rohan Taori; Aditi Raghunathan; Shiori Sagawa; Pang Wei Koh; Vaishaal Shankar; Percy Liang; Yair Carmon; Ludwig Schmidt; |
705 | Signatured Deep Fictitious Play for Mean Field Games with Common Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, based on the rough path theory, we propose a novel single-loop algorithm, named signatured deep fictitious play (Sig-DFP), by which we can work with the unfixed common noise setup to avoid the nested loop structure and reduce the computational complexity significantly. |
Ming Min; Ruimeng Hu; |
706 | Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose StyleSpeech, a new TTS model which not only synthesizes high-quality speech but also effectively adapts to new speakers. |
Dongchan Min; Dong Bok Lee; Eunho Yang; Sung Ju Hwang; |
707 | On The Explicit Role of Initialization on The Convergence and Implicit Bias of Overparametrized Linear Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a novel analysis of single-hidden-layer linear networks trained under gradient flow, which connects initialization, optimization, and overparametrization. |
Hancheng Min; Salma Tarmoun; Rene Vidal; Enrique Mallada; |
708 | An Identifiable Double VAE For Disentangled Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Working along this line, we propose a novel VAE-based generative model with theoretical guarantees on identifiability. |
Graziano Mita; Maurizio Filippone; Pietro Michiardi; |
709 | Offline Meta-Reinforcement Learning with Advantage Weighting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces the offline meta-reinforcement learning (offline meta-RL) problem setting and proposes an algorithm that performs well in this setting. |
Eric Mitchell; Rafael Rafailov; Xue Bin Peng; Sergey Levine; Chelsea Finn; |
710 | The Power of Log-Sum-Exp: Sequential Density Ratio Matrix Estimation for Speed-Accuracy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a model for multiclass classification of time series to make a prediction as early and as accurate as possible. |
Taiki Miyagawa; Akinori F Ebihara; |
711 | PODS: Policy Optimization Via Differentiable Simulation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, with the goal of improving the performance exhibited by RL algorithms, we explore a systematic way of leveraging the additional information provided by an emerging class of differentiable simulators. |
Miguel Angel Zamora Mora; Momchil P Peychev; Sehoon Ha; Martin Vechev; Stelian Coros; |
712 | Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Integrating the idea of time selection into counterfactual regret minimization (CFR), we introduce the extensive-form regret minimization (EFR) algorithm that achieves hindsight rationality for any given set of behavioral deviations with computation that scales closely with the complexity of the set. |
Dustin Morrill; Ryan D'Orazio; Marc Lanctot; James R Wright; Michael Bowling; Amy R Greenwald; |
713 | Neural Rough Differential Equations for Long Time Series Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Existing methods for computing the forward pass of a Neural CDE involve embedding the incoming time series into path space, often via interpolation, and using evaluations of this path to drive the hidden state. Here, we use rough path theory to extend this formulation. |
James Morrill; Cristopher Salvi; Patrick Kidger; James Foster; |
714 | Connecting Interpretability and Robustness in Decision Trees Through Separation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Curiously, a connection between robustness and interpretability was empirically observed, but the theoretical reasoning behind it remained elusive. In this paper, we rigorously investigate this connection. |
Michal Moshkovitz; Yao-Yuan Yang; Kamalika Chaudhuri; |
715 | Outlier-Robust Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this issue, we propose an outlier-robust formulation of OT. |
Debarghya Mukherjee; Aritra Guha; Justin M Solomon; Yuekai Sun; Mikhail Yurochkin; |
716 | Oblivious Sketching for Logistic Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To answer this question, we present the first data oblivious sketch for logistic regression. |
Alexander Munteanu; Simon Omlor; David Woodruff; |
717 | Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study a new local algorithm called Bias-Variance Reduced Local SGD (BVR-L-SGD) for nonconvex distributed optimization. |
Tomoya Murata; Taiji Suzuki; |
718 | Implicit-PDF: Non-Parametric Representation of Probability Distributions on The Rotation Manifold Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our key idea is to represent the distributions implicitly, with a neural network that estimates the probability density, given the input image and a candidate pose. |
Kieran A Murphy; Carlos Esteves; Varun Jampani; Srikumar Ramalingam; Ameesh Makadia; |
719 | No-regret Algorithms for Capturing Events in Poisson Point Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: By partitioning the domain into separate small regions, and using heteroscedastic linear regression, we propose a tractable estimator of Poisson process rates for two feedback models: count-record, where exact locations of events are observed, and histogram feedback, where only counts of events are observed. |
Mojmir Mutny; Andreas Krause; |
720 | Online Limited Memory Neural-Linear Bandits with Likelihood Matching Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To alleviate this, we propose a likelihood matching algorithm that is resilient to catastrophic forgetting and is completely online. |
Ofir Nabati; Tom Zahavy; Shie Mannor; |
721 | Quantitative Understanding of VAE As A Non-linearly Scaled Isometric Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper provides a quantitative understanding of VAE property through the differential geometric and information-theoretic interpretations of VAE. |
Akira Nakagawa; Keizo Kato; Taiji Suzuki; |
722 | GMAC: A Distributional Perspective on Actor-Critic Framework Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we devise a distributional framework on actor-critic as a solution to distributional instability, action type restriction, and conflation between samples and statistics. |
Daniel W Nam; Younghoon Kim; Chan Y Park; |
723 | Memory-Efficient Pipeline-Parallel DNN Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose PipeDream-2BW, a system that supports memory-efficient pipeline parallelism. |
Deepak Narayanan; Amar Phanishayee; Kaiyu Shi; Xie Chen; Matei Zaharia; |
724 | Randomized Dimensionality Reduction for Facility Location and Single-Linkage Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Random dimensionality reduction is a versatile tool for speeding up algorithms for high-dimensional problems. We study its application to two clustering problems: the facility location problem, and the single-linkage hierarchical clustering problem, which is equivalent to computing the minimum spanning tree. |
Shyam Narayanan; Sandeep Silwal; Piotr Indyk; Or Zamir; |
725 | Generating Images with Sparse Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an alternative approach, inspired by common image compression methods like JPEG, and convert images to quantized discrete cosine transform (DCT) blocks, which are represented sparsely as a sequence of DCT channel, spatial location, and DCT coefficient triples. |
Charlie Nash; Jacob Menick; Sander Dieleman; Peter Battaglia; |
726 | Geometric Convergence of Elliptical Slice Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Under weak regularity assumptions on the posterior density we show that the corresponding Markov chain is geometrically ergodic and therefore yield qualitative convergence guarantees. |
Viacheslav Natarovskii; Daniel Rudolf; Björn Sprungk; |
727 | HardCoRe-NAS: Hard Constrained DiffeRentiable Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we resolve this by introducing Hard Constrained diffeRentiable NAS (HardCoRe-NAS), that is based on an accurate formulation of the expected resource requirement and a scalable search method that satisfies the hard constraint throughout the search. |
Niv Nayman; Yonathan Aflalo; Asaf Noy; Lihi Zelnik; |
728 | Emergent Social Learning Via Multi-agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper investigates whether independent reinforcement learning (RL) agents in a multi-agent environment can learn to use social learning to improve their performance. |
Kamal K Ndousse; Douglas Eck; Sergey Levine; Natasha Jaques; |
729 | Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To tackle this problem, we present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm’s output. |
Willie Neiswanger; Ke Alexander Wang; Stefano Ermon; |
730 | Continuous Coordination As A Realistic Scenario for Lifelong Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. |
Hadi Nekoei; Akilesh Badrinaaraayanan; Aaron Courville; Sarath Chandar; |
731 | Policy Caches with Successor Features Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present new bounds for the performance of optimal policies in a new task, as well as an approach to use these bounds to decide, when presented with a new task, whether to use cached policies or learn a new policy. |
Mark W Nemecek; Ron Parr; |
732 | Causality-aware Counterfactual Confounding Adjustment As An Alternative to Linear Residualization in Anticausal Prediction Tasks Based on Linear Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we compare the linear residualization approach against the causality-aware confounding adjustment in anticausal prediction tasks. |
Elias Chaibub Neto; |
733 | Incentivizing Compliance with Algorithmic Instruments Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop a novel recommendation mechanism that views the planner’s recommendation as a form of instrumental variable (IV) that only affects an agents’ action selection, but not the observed rewards. |
Dung Daniel T Ngo; Logan Stapleton; Vasilis Syrgkanis; Steven Wu; |
734 | On The Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We give a simple proof for the global convergence of gradient descent in training deep ReLU networks with the standard square loss, and show some of its improvements over the state-of-the-art. |
Quynh Nguyen; |
735 | Value-at-Risk Optimization with Gaussian Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a novel VaR upper confidence bound (V-UCB) algorithm for maximizing the VaR of a black-box objective function with the first no-regret guarantee. |
Quoc Phong Nguyen; Zhongxiang Dai; Bryan Kian Hsiang Low; Patrick Jaillet; |
736 | Cross-model Back-translated Distillation for Unsupervised Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a novel component to the standard UMT framework called Cross-model Back-translated Distillation (CBD), that is aimed to induce another level of data diversification that existing principles lack. |
Xuan-Phi Nguyen; Shafiq Joty; Thanh-Tung Nguyen; Kui Wu; Ai Ti Aw; |
737 | Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Building upon tree-Wasserstein (TW), which is a negative definite variant of OT, we develop a novel discrepancy for neural architectures, and demonstrate it within a Gaussian process surrogate model for the sequential NAS settings. |
Vu Nguyen; Tam Le; Makoto Yamada; Michael A. Osborne; |
738 | Interactive Learning from Activity Description Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel interactive learning protocol that enables training request-fulfilling agents by verbally describing their activities. |
Khanh X Nguyen; Dipendra Misra; Robert Schapire; Miroslav Dudik; Patrick Shafto; |
739 | Nonmyopic Multifidelity Active Search Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a model of multifidelity active search, as well as a novel, computationally efficient policy for this setting that is motivated by state-of-the-art classical policies. |
Quan Nguyen; Arghavan Modiri; Roman Garnett; |
740 | Tight Bounds on The Smallest Eigenvalue of The Neural Tangent Kernel for Deep ReLU Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU nets, both in the limiting case of infinite widths and for finite widths. |
Quynh Nguyen; Marco Mondelli; Guido F Montufar; |
741 | Temporal Predictive Coding For Model-Based Planning In Latent Space Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present an information-theoretic approach that employs temporal predictive coding to encode elements in the environment that can be predicted across time. |
Tung D Nguyen; Rui Shu; Tuan Pham; Hung Bui; Stefano Ermon; |
742 | Differentially Private Densest Subgraph Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the densest subgraph problem in the edge privacy model, in which the edges of the graph are private. We present the first sequential and parallel differentially private algorithms for this problem. |
Dung Nguyen; Anil Vullikanti; |
743 | Data Augmentation for Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We systematically dissect the meta-learning pipeline and investigate the distinct ways in which data augmentation can be integrated at both the image and class levels. |
Renkun Ni; Micah Goldblum; Amr Sharaf; Kezhi Kong; Tom Goldstein; |
744 | Improved Denoising Diffusion Probabilistic Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that with a few simple modifications, DDPMs can also achieve competitive log-likelihoods while maintaining high sample quality. |
Alexander Quinn Nichol; Prafulla Dhariwal; |
745 | Smooth $p$-Wasserstein Distance: Structure, Empirical Approximation, and Statistical Applications Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by the scalability of this framework to high dimensions, we investigate the structural and statistical behavior of the Gaussian-smoothed p-Wasserstein distance W_p^(s), for arbitrary p ≥ 1. |
Sloan Nietert; Ziv Goldfeld; Kengo Kato; |
746 | AdaXpert: Adapting Neural Architecture for Growing Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data. |
Shuaicheng Niu; Jiaxiang Wu; Guanghui Xu; Yifan Zhang; Yong Guo; Peilin Zhao; Peng Wang; Mingkui Tan; |
747 | Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we reformulate the update procedure of ECL such that it implicitly includes the gradient modification of SVR by optimally selecting a constraint-strength control parameter. |
Kenta Niwa; Guoqiang Zhang; W. Bastiaan Kleijn; Noboru Harada; Hiroshi Sawada; Akinori Fujino; |
748 | WGAN with An Infinitely Wide Generator Has No Spurious Stationary Points Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we show that GANs with a 2-layer infinite-width generator and a 2-layer finite-width discriminator trained with stochastic gradient ascent-descent have no spurious stationary points. |
Albert No; Taeho Yoon; Kwon Sehyun; Ernest K Ryu; |
749 | The Impact of Record Linkage on Learning from Feature Partitioned Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we provide the first assessment of the problem for supervised learning. |
Richard Nock; Stephen Hardy; Wilko Henecka; Hamish Ivey-Law; Jakub Nabaglo; Giorgio Patrini; Guillaume Smith; Brian Thorne; |
750 | Accuracy, Interpretability, and Differential Privacy Via Explainable Boosting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that adding differential privacy to Explainable Boosting Machines (EBMs), a recent method for training interpretable ML models, yields state-of-the-art accuracy while protecting privacy. |
Harsha Nori; Rich Caruana; Zhiqi Bu; Judy Hanwen Shen; Janardhan Kulkarni; |
751 | Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we exploit the idea of hindsight and introduce posterior value functions. |
Chris Nota; Philip Thomas; Bruno C. Da Silva; |
752 | Global Inducing Point Variational Posteriors for Bayesian Neural Networks and Deep Gaussian Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the optimal approximate posterior over the top-layer weights in a Bayesian neural network for regression, and show that it exhibits strong dependencies on the lower-layer weights. |
Sebastian W Ober; Laurence Aitchison; |
753 | Regularizing Towards Causal Invariance: Linear Models with Proxies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a method for learning linear models whose predictive performance is robust to causal interventions on unobserved variables, when noisy proxies of those variables are available. |
Michael Oberst; Nikolaj Thams; Jonas Peters; David Sontag; |
754 | Sparsity-Agnostic Lasso Bandit Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The main contribution of this paper is to propose an algorithm that does not require prior knowledge of the sparsity index $s_0$ and establish tight regret bounds on its performance under mild conditions. |
Min-Hwan Oh; Garud Iyengar; Assaf Zeevi; |
755 | Autoencoder Image Interpolation By Shaping The Latent Space Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a regularization technique that shapes the latent representation to follow a manifold that is consistent with the training images and that forces the manifold to be smooth and locally convex. |
Alon Oring; Zohar Yakhini; Yacov Hel-Or; |
756 | Generalization Guarantees for Neural Architecture Search with Train-Validation Split Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: NAS methods commonly use bilevel optimization where one optimizes the weights over the training data (lower-level problem) and hyperparameters – such as the architecture – over the validation data (upper-level problem). This paper explores the statistical aspects of such problems with train-validation splits. |
Samet Oymak; Mingchen Li; Mahdi Soltanolkotabi; |
757 | Vector Quantized Models for Planning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new approach that handles stochastic and partially-observable environments. |
Sherjil Ozair; Yazhe Li; Ali Razavi; Ioannis Antonoglou; Aaron Van Den Oord; Oriol Vinyals; |
758 | Training Adversarially Robust Sparse Networks Via Bayesian Connectivity Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by the efficient and stable computational function of the brain in the presence of a highly dynamic synaptic connectivity structure, we propose an intrinsically sparse rewiring approach to train neural networks with state-of-the-art robust learning objectives under high sparsity. |
Ozan Özdenizci; Robert Legenstein; |
759 | Opening The Blackbox: Accelerating Neural Differential Equations By Regularizing Internal Solver Heuristics Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a novel regularization method that uses the internal cost heuristics of adaptive differential equation solvers combined with discrete adjoint sensitivities to guide the training process towards learning NDEs that are easier to solve. |
Avik Pal; Yingbo Ma; Viral Shah; Christopher V Rackauckas; |
760 | RNN with Particle Flow for Probabilistic Spatio-temporal Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we consider the time-series data as a random realization from a nonlinear state-space model and target Bayesian inference of the hidden states for probabilistic forecasting. |
Soumyasundar Pal; Liheng Ma; Yingxue Zhang; Mark Coates; |
761 | Inference for Network Regression Models with Community Structure Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present a novel regression modeling framework that models the errors as resulting from a community-based dependence structure and exploits the subsequent exchangeability properties of the error distribution to obtain parsimonious standard errors for regression parameters. |
Mengjie Pan; Tyler Mccormick; Bailey Fosdick; |
762 | Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a latent space energy-based prior model for text generation and classification. |
Bo Pang; Ying Nian Wu; |
763 | Leveraging Good Representations in Linear Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we first provide a systematic analysis of the different definitions of “good” representations proposed in the literature. We then propose a novel selection algorithm able to adapt to the best representation in a set of M candidates. |
Matteo Papini; Andrea Tirinzoni; Marcello Restelli; Alessandro Lazaric; Matteo Pirotta; |
764 | Wasserstein Distributional Normalization For Robust Distributional Certification of Noisy Labeled Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel Wasserstein distributional normalization method that can classify noisy labeled data accurately. |
Sung Woo Park; Junseok Kwon; |
765 | Unsupervised Representation Learning Via Neural Activation Coding Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present neural activation coding (NAC) as a novel approach for learning deep representations from unlabeled data for downstream applications. |
Yookoon Park; Sangho Lee; Gunhee Kim; David Blei; |
766 | Conditional Distributional Treatment Effect with Kernel Conditional Mean Embeddings and U-Statistic Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose to analyse the conditional distributional treatment effect (CoDiTE), which, in contrast to the more common conditional average treatment effect (CATE), is designed to encode a treatment’s distributional aspects beyond the mean. |
Junhyung Park; Uri Shalit; Bernhard Schölkopf; Krikamol Muandet; |
767 | Generative Adversarial Networks for Markovian Temporal Dynamics: Stochastic Continuous Data Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a novel generative adversarial network (GAN) that can describe Markovian temporal dynamics. |
Sung Woo Park; Dong Wook Shu; Junseok Kwon; |
768 | Optimal Counterfactual Explanations in Tree Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we take a disciplined approach towards counterfactual explanations for tree ensembles. |
Axel Parmentier; Thibaut Vidal; |
769 | PHEW: Constructing Sparse Networks That Learn Fast and Generalize Well Without Training Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We first show that even though Synflow-L2 is optimal in terms of convergence, for a given network density, it results in sub-networks with bottleneck (narrow) layers, leading to poor performance as compared to other data-agnostic methods that use the same number of parameters. Then we propose a new method to construct sparse networks, without any training data, referred to as Paths with Higher-Edge Weights (PHEW). |
Shreyas Malakarjun Patil; Constantine Dovrolis; |
770 | CombOptNet: Fit The Right NP-Hard Problem By Learning Integer Programming Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we aim to integrate integer programming solvers into neural network architectures as layers capable of learning both the cost terms and the constraints. |
Anselm Paulus; Michal Rolinek; Vit Musil; Brandon Amos; Georg Martius; |
771 | Ensemble Bootstrapping for Q-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce a new bias-reduced algorithm called Ensemble Bootstrapped Q-Learning (EBQL), a natural extension of Double-Q-learning to ensembles. |
Oren Peer; Chen Tessler; Nadav Merlis; Ron Meir; |
772 | Homomorphic Sensing: Sparsity and Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present tighter and simpler conditions for the homomorphic sensing problem to admit a unique solution. |
Liangzu Peng; Boshi Wang; Manolis Tsakiris; |
773 | How Could Neural Networks Understand Programs? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by this, we propose a novel program semantics learning paradigm, that the model should learn from information composed of (1) the representations which align well with the fundamental operations in operational semantics, and (2) the information of environment transition, which is indispensable for program understanding. |
Dinglan Peng; Shuxin Zheng; Yatao Li; Guolin Ke; Di He; Tie-Yan Liu; |
774 | Privacy-Preserving Video Classification with Convolutional Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a privacy-preserving implementation of single-frame method based video classification with convolutional neural networks that allows a party to infer a label from a video without necessitating the video owner to disclose their video to other entities in an unencrypted manner. |
Sikha Pentyala; Rafael Dowsley; Martine De Cock; |
775 | Rissanen Data Analysis: Examining Dataset Characteristics Via Description Length Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a method to determine if a certain capability helps to achieve an accurate model of given data. |
Ethan Perez; Douwe Kiela; Kyunghyun Cho; |
776 | Modelling Behavioural Diversity for Learning in Open-Ended Games Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on determinantal point processes (DPP). |
Nicolas Perez-Nieves; Yaodong Yang; Oliver Slumbers; David H Mguni; Ying Wen; Jun Wang; |
777 | From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium Via Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we investigate the Follow the Regularized Leader dynamics in sequential imperfect information games (IIG). |
Julien Perolat; Remi Munos; Jean-Baptiste Lespiau; Shayegan Omidshafiei; Mark Rowland; Pedro Ortega; Neil Burch; Thomas Anthony; David Balduzzi; Bart De Vylder; Georgios Piliouras; Marc Lanctot; Karl Tuyls; |
778 | Spectral Smoothing Unveils Phase Transitions in Hierarchical Variational Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We suggest that the hierarchical VAE objective explicitly includes the variance of the function parameterizing the mean and variance of the latent Gaussian distribution which itself is often a high variance function. |
Adeel Pervez; Efstratios Gavves; |
779 | Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: That is, the ground truth order of sets of samples is known, while their absolute values remain unsupervised. For that, we propose differentiable sorting networks by relaxing their pairwise conditional swap operations. |
Felix Petersen; Christian Borgelt; Hilde Kuehne; Oliver Deussen; |
780 | Megaverse: Simulating Embodied Agents at One Million Experiences Per Second Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present Megaverse, a new 3D simulation platform for reinforcement learning and embodied AI research. |
Aleksei Petrenko; Erik Wijmans; Brennan Shacklett; Vladlen Koltun; |
781 | Towards Practical Mean Bounds for Small Samples Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For the first time since then, we present a new family of bounds that compares favorably to Anderson’s. |
My Phan; Philip Thomas; Erik Learned-Miller; |
782 | DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm Via Langevin Monte Carlo Within Gibbs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose to fill this gap in the case where the dataset is partitioned and stored on computing nodes within a cluster under a master/slaves architecture. |
Vincent Plassier; Maxime Vono; Alain Durmus; Eric Moulines; |
783 | GeomCA: Geometric Evaluation of Data Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present Geometric Component Analysis (GeomCA) algorithm that evaluates representation spaces based on their geometric and topological properties. |
Petra Poklukar; Anastasiia Varava; Danica Kragic; |
784 | Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we introduce Grad-TTS, a novel text-to-speech model with score-based decoder producing mel-spectrograms by gradually transforming noise predicted by encoder and aligned with text input by means of Monotonic Alignment Search. |
Vadim Popov; Ivan Vovk; Vladimir Gogoryan; Tasnima Sadekova; Mikhail Kudinov; |
785 | Bias-Free Scalable Gaussian Processes Via Randomized Truncations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We find that both methods introduce a systematic bias on the learned hyperparameters: CG tends to underfit while RFF tends to overfit. We address these issues using randomized truncation estimators that eliminate bias in exchange for increased variance. |
Andres Potapczynski; Luhuan Wu; Dan Biderman; Geoff Pleiss; John P Cunningham; |
786 | Dense for The Price of Sparse: Improved Performance of Sparsely Initialized Networks Via A Subspace Offset Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce a new ‘DCT plus Sparse’ layer architecture, which maintains information propagation and trainability even with as little as 0.01% trainable parameters remaining. |
Ilan Price; Jared Tanner; |
787 | BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose BANG, a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation. |
Weizhen Qi; Yeyun Gong; Jian Jiao; Yu Yan; Weizhu Chen; Dayiheng Liu; Kewen Tang; Houqiang Li; Jiusheng Chen; Ruofei Zhang; Ming Zhou; Nan Duan; |
788 | A Probabilistic Approach to Neural Network Pruning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Given a target network, we provide a universal approach to bound the gap between a pruned and the target network in a probabilistic sense, which is the first study of this nature. |
Xin Qian; Diego Klabjan; |
789 | Global Prosody Style Transfer Without Text Transcriptions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose AutoPST, which can disentangle global prosody style from speech without relying on any text transcriptions. |
Kaizhi Qian; Yang Zhang; Shiyu Chang; Jinjun Xiong; Chuang Gan; David Cox; Mark Hasegawa-Johnson; |
790 | Efficient Differentiable Simulation of Articulated Bodies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a method for efficient differentiable simulation of articulated bodies. |
Yi-Ling Qiao; Junbang Liang; Vladlen Koltun; Ming C Lin; |
791 | Oneshot Differentially Private Top-k Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present the oneshot Laplace mechanism, which generalizes the well-known Report Noisy Max \cite{dwork2014algorithmic} mechanism to reporting noisy top-$k$ elements. |
Gang Qiao; Weijie Su; Li Zhang; |
792 | Density Constrained Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study constrained reinforcement learning (CRL) from a novel perspective by setting constraints directly on state density functions, rather than the value functions considered by previous works. |
Zengyi Qin; Yuxiao Chen; Chuchu Fan; |
793 | Budgeted Heterogeneous Treatment Effect Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: By deriving an informative generalization bound and connecting to active learning, we propose an effective and efficient method which is validated both theoretically and empirically. |
Tian Qin; Tian-Zuo Wang; Zhi-Hua Zhou; |
794 | Neural Transformation Learning for Deep Anomaly Detection Beyond Images Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The key idea is to embed the transformed data into a semantic space such that the transformed data still resemble their untransformed form, while different transformations are easily distinguishable. |
Chen Qiu; Timo Pfrommer; Marius Kloft; Stephan Mandt; Maja Rudolph; |
795 | Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We take steps forward by proposing and analyzing new fictitious play policy optimization algorithms for two-player zero-sum Markov games with structured but unknown transitions. |
Shuang Qiu; Xiaohan Wei; Jieping Ye; Zhaoran Wang; Zhuoran Yang; |
796 | Optimization Planning for 3D ConvNets Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we decompose the path into a series of training "states" and specify the hyper-parameters, e.g., learning rate and the length of input clips, in each state. |
Zhaofan Qiu; Ting Yao; Chong-Wah Ngo; Tao Mei; |
797 | On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Specifically, we propose to explore via an optimistic variant of the value-iteration algorithm incorporating kernel and neural function approximations, where we adopt the associated exploration bonus as the exploration reward. |
Shuang Qiu; Jieping Ye; Zhaoran Wang; Zhuoran Yang; |
798 | Learning Transferable Visual Models From Natural Language Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet. |
Alec Radford; Jong Wook Kim; Chris Hallacy; Aditya Ramesh; Gabriel Goh; Sandhini Agarwal; Girish Sastry; Amanda Askell; Pamela Mishkin; Jack Clark; Gretchen Krueger; Ilya Sutskever; |
799 | A General Framework For Detecting Anomalous Inputs to DNN Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an unsupervised anomaly detection framework based on the internal DNN layer representations in the form of a meta-algorithm with configurable components. |
Jayaram Raghuram; Varun Chandrasekaran; Somesh Jha; Suman Banerjee; |
800 | Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we consider open teams by allowing agents with different fixed policies to enter and leave the environment without prior notification. |
Muhammad A Rahman; Niklas Hopner; Filippos Christianos; Stefano V Albrecht; |
801 | Decoupling Value and Policy for Generalization in Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To alleviate this problem, we propose two approaches which are combined to create IDAAC: Invariant Decoupled Advantage Actor-Critic. |
Roberta Raileanu; Rob Fergus; |
802 | Hierarchical Clustering of Data Streams: Scalable Algorithms and Approximation Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the problem of hierarchically clustering data streams containing metric data in R^d. |
Anand Rajagopalan; Fabio Vitale; Danny Vainstein; Gui Citovsky; Cecilia M Procopiuc; Claudio Gentile; |
803 | Differentially Private Sliced Wasserstein Distance Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our main contribution is as follows: we analyze the property of adding a Gaussian perturbation to the intrinsic randomized mechanism of the Sliced Wasserstein Distance, and we establish the sensitivity of the resulting differentially private mechanism. |
Alain Rakotomamonjy; Ralaivola Liva; |
804 | Zero-Shot Text-to-Image Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a simple approach for this task based on a transformer that autoregressively models the text and image tokens as a single stream of data. |
Aditya Ramesh; Mikhail Pavlov; Gabriel Goh; Scott Gray; Chelsea Voss; Alec Radford; Mark Chen; Ilya Sutskever; |
805 | End-to-End Learning of Coherent Probabilistic Forecasts for Hierarchical Time Series Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a novel approach for hierarchical time series forecasting that produces coherent, probabilistic forecasts without requiring any explicit post-processing reconciliation. |
Syama Sundar Rangapuram; Lucien D Werner; Konstantinos Benidis; Pedro Mercado; Jan Gasthaus; Tim Januschowski; |
806 | MSA Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a protein language model which takes as input a set of sequences in the form of a multiple sequence alignment. |
Roshan M Rao; Jason Liu; Robert Verkuil; Joshua Meier; John Canny; Pieter Abbeel; Tom Sercu; Alexander Rives; |
807 | Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose TimeGrad, an autoregressive model for multivariate probabilistic time series forecasting which samples from the data distribution at each time step by estimating its gradient. |
Kashif Rasul; Calvin Seward; Ingmar Schuster; Roland Vollgraf; |
808 | Generative Particle Variational Inference Via Estimation of Functional Gradients Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work proposes a new method for learning to approximately sample from the posterior distribution. |
Neale Ratzlaff; Qinxun Bai; Li Fuxin; Wei Xu; |
809 | Enhancing Robustness of Neural Networks Through Fourier Stabilization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel approach, Fourier stabilization, for designing evasion-robust neural networks with binary inputs. |
Netanel Raviv; Aidan Kelley; Minzhe Guo; Yevgeniy Vorobeychik; |
810 | Disentangling Sampling and Labeling Bias for Learning in Large-output Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a new connection between these schemes and loss modification techniques for countering label imbalance. |
Ankit Singh Rawat; Aditya K Menon; Wittawat Jitkrittum; Sadeep Jayasumana; Felix Yu; Sashank Reddi; Sanjiv Kumar; |
811 | Cross-domain Imitation from Observations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the problem of how to imitate tasks when discrepancies exist between the expert and agent MDP. |
Dripta S. Raychaudhuri; Sujoy Paul; Jeroen Vanbaar; Amit K. Roy-Chowdhury; |
812 | Implicit Regularization in Tensor Factorization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: As a step further towards practical deep learning, we provide the first theoretical analysis of implicit regularization in tensor factorization — tensor completion via certain type of non-linear neural network. |
Noam Razin; Asaf Maman; Nadav Cohen; |
813 | Align, Then Memorise: The Dynamics of Learning with Feedback Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we propose a theory of feedback alignment algorithms. |
Maria Refinetti; Stéphane d'Ascoli; Ruben Ohana; Sebastian Goldt; |
814 | Classifying High-dimensional Gaussian Mixtures: Where Kernel Methods Fail and Neural Networks Succeed Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we show that two-layer neural networks with *only a few neurons* achieve near-optimal performance on high-dimensional Gaussian mixture classification while lazy training approaches such as random features and kernel methods do not. |
Maria Refinetti; Sebastian Goldt; Florent Krzakala; Lenka Zdeborová; |
815 | Sharf: Shape-conditioned Radiance Fields from A Single View Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a method for estimating neural scenes representations of objects given only a single image. |
Konstantinos Rematas; Ricardo Martin-Brualla; Vittorio Ferrari; |
816 | LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we present LEGO, a Latent Execution-Guided reasOning framework to handle this challenge in KGQA. |
Hongyu Ren; Hanjun Dai; Bo Dai; Xinyun Chen; Michihiro Yasunaga; Haitian Sun; Dale Schuurmans; Jure Leskovec; Denny Zhou; |
817 | Interpreting and Disentangling Feature Components of Various Complexity from DNNs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper aims to define, visualize, and analyze the feature complexity that is learned by a DNN. |
Jie Ren; Mingjie Li; Zexu Liu; Quanshi Zhang; |
818 | Integrated Defense for Resilient Graph Matching Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we identify and study two types of unique topology attacks in graph matching: inter-graph dispersion and intra-graph assembly attacks. |
Jiaxiang Ren; Zijie Zhang; Jiayin Jin; Xin Zhao; Sixing Wu; Yang Zhou; Yelong Shen; Tianshi Che; Ruoming Jin; Dejing Dou; |
819 | Solving High-dimensional Parabolic PDEs Using The Tensor Train Format Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we argue that tensor trains provide an appealing approximation framework for parabolic PDEs: the combination of reformulations in terms of backward stochastic differential equations and regression-type methods in the tensor format holds the promise of leveraging latent low-rank structures enabling both compression and efficient computation. |
Lorenz Richter; Leon Sallandt; Nikolas Nüsken; |
820 | Best Arm Identification in Graphical Bilinear Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new graphical bilinear bandit problem where a learner (or a \emph{central entity}) allocates arms to the nodes of a graph and observes for each edge a noisy bilinear reward representing the interaction between the two end nodes. |
Geovani Rizk; Albert Thomas; Igor Colin; Rida Laraki; Yann Chevaleyre; |
821 | Principled Simplicial Neural Networks for Trajectory Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on these properties, we propose a simple convolutional architecture, rooted in tools from algebraic topology, for the problem of trajectory prediction, and show that it obeys all three of these properties when an odd, nonlinear activation function is used. |
T. Mitchell Roddenberry; Nicholas Glaze; Santiago Segarra; |
822 | On Linear Identifiability of Learned Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, building on recent advances in nonlinear Independent Components Analysis, we aim to rehabilitate identifiability by showing that a large family of discriminative models are in fact identifiable in function space, up to a linear indeterminacy. |
Geoffrey Roeder; Luke Metz; Durk Kingma; |
823 | Representation Matters: Assessing The Importance of Subgroup Allocations in Training Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our analysis and experiments describe how dataset compositions influence performance and provide constructive results for using trends in existing data, alongside domain knowledge, to help guide intentional, objective-aware dataset design. |
Esther Rolf; Theodora T Worledge; Benjamin Recht; Michael Jordan; |
824 | TeachMyAgent: A Benchmark for Automatic Curriculum Learning in Deep RL Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we identify several key challenges faced by ACL algorithms. |
Clément Romac; Rémy Portelas; Katja Hofmann; Pierre-Yves Oudeyer; |
825 | Discretization Drift in Two-Player Games Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Using backward error analysis, we derive modified continuous dynamical systems that closely follow the discrete dynamics. |
Mihaela C Rosca; Yan Wu; Benoit Dherin; David Barrett; |
826 | On The Predictability of Pruning Across Scales Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that the error of iteratively magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task. |
Jonathan S Rosenfeld; Jonathan Frankle; Michael Carbin; Nir Shavit; |
827 | Benchmarks, Algorithms, and Metrics for Hierarchical Disentanglement Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we develop benchmarks, algorithms, and metrics for learning such hierarchical representations. |
Andrew Ross; Finale Doshi-Velez; |
828 | Simultaneous Similarity-based Self-Distillation for Deep Metric Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To remedy this, we propose S2SD – Simultaneous Similarity-based Self-distillation. |
Karsten Roth; Timo Milbich; Bjorn Ommer; Joseph Paul Cohen; Marzyeh Ghassemi; |
829 | Multi-group Agnostic PAC Learnability Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by such fairness concerns, we study multi-group agnostic PAC learnability: fixing a measure of loss, a benchmark class $\mathcal{H}$ and a (potentially) rich collection of subgroups $\mathcal{G}$, the objective is to learn a single predictor such that the loss experienced by every group $g \in \mathcal{G}$ is not much larger than the best possible loss for this group within $\mathcal{H}$. |
Guy N Rothblum; Gal Yona; |
830 | PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We provide a theoretical analysis using the PAC-Bayesian framework and derive novel generalization bounds for meta-learning. |
Jonas Rothfuss; Vincent Fortuin; Martin Josifoski; Andreas Krause; |
831 | An Algorithm for Stochastic and Adversarial Bandits with Switching Costs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an algorithm for stochastic and adversarial multiarmed bandits with switching costs, where the algorithm pays a price $\lambda$ every time it switches the arm being played. |
Chloé Rouyer; Yevgeny Seldin; Nicolò Cesa-Bianchi; |
832 | Improving Lossless Compression Rates Via Monte Carlo Bits-Back Coding Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show how to remove this gap asymptotically by deriving bits-back coding algorithms from tighter variational bounds. |
Yangjun Ruan; Karen Ullrich; Daniel S Severo; James Townsend; Ashish Khisti; Arnaud Doucet; Alireza Makhzani; Chris Maddison; |
833 | On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that the gradient estimates used in training Deep Gaussian Processes (DGPs) with importance-weighted variational inference are susceptible to signal-to-noise ratio (SNR) issues. |
Tim G. J. Rudner; Oscar Key; Yarin Gal; Tom Rainforth; |
834 | Tilting The Playing Field: Dynamical Loss Functions for Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that learning can be improved by using loss functions that evolve cyclically during training to emphasize one class at a time. |
Miguel Ruiz-Garcia; Ge Zhang; Samuel S Schoenholz; Andrea J. Liu; |
835 | UnICORNN: A Recurrent Model for Learning Very Long Time Dependencies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To overcome this, we propose a novel RNN architecture which is based on a structure preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of oscillators. |
T. Konstantin Rusch; Siddhartha Mishra; |
836 | Simple and Effective VAE Training with Calibrated Decoders Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the impact of calibrated decoders, which learn the uncertainty of the decoding distribution and can determine this amount of information automatically, on the VAE performance. |
Oleh Rybkin; Kostas Daniilidis; Sergey Levine; |
837 | Model-Based Reinforcement Learning Via Latent-Space Collocation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study how the long-horizon planning abilities can be improved with an algorithm that optimizes over sequences of states, rather than actions, which allows better credit assignment. |
Oleh Rybkin; Chuning Zhu; Anusha Nagabandi; Kostas Daniilidis; Igor Mordatch; Sergey Levine; |
838 | Training Data Subset Selection for Regression with Controlled Generalization Error Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, our goal is to design an algorithm for selecting a subset of the training data, so that the model can be trained quickly, without significantly sacrificing on accuracy. |
Durga S; Rishabh Iyer; Ganesh Ramakrishnan; Abir De; |
839 | Unsupervised Part Representation By Flow Capsules Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this issue we propose a way to learn primary capsule encoders that detect atomic parts from a single image. |
Sara Sabour; Andrea Tagliasacchi; Soroosh Yazdani; Geoffrey Hinton; David J Fleet; |
840 | Stochastic Sign Descent Methods: New Algorithms and Better Theory Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we analyze sign-based methods for non-convex optimization in three key settings: (i) standard single node, (ii) parallel with shared data and (iii) distributed with partitioned data. |
Mher Safaryan; Peter Richtarik; |
841 | Adversarial Dueling Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce the problem of regret minimization in Adversarial Dueling Bandits. |
Aadirupa Saha; Tomer Koren; Yishay Mansour; |
842 | Dueling Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We address the problem of convex optimization with preference (dueling) feedback. |
Aadirupa Saha; Tomer Koren; Yishay Mansour; |
843 | Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively, guaranteeing the \emph{optimal} regret bound mentioned above, up to additional logarithmic factors. |
Aadirupa Saha; Nagarajan Natarajan; Praneeth Netrapalli; Prateek Jain; |
844 | Asymptotics of Ridge Regression in Convolutional Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we analyze the asymptotics of estimation error in ridge estimators for convolutional linear models. |
Mojtaba Sahraee-Ardakan; Tung Mai; Anup Rao; Ryan A. Rossi; Sundeep Rangan; Alyson K Fletcher; |
845 | Momentum Residual Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose to change the forward rule of a ResNet by adding a momentum term. |
Michael E. Sander; Pierre Ablin; Mathieu Blondel; Gabriel Peyré; |
846 | Meta-Learning Bidirectional Update Rules Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce a new type of generalized neural network where neurons and synapses maintain multiple states. |
Mark Sandler; Max Vladymyrov; Andrey Zhmoginov; Nolan Miller; Tom Madams; Andrew Jackson; Blaise Agüera y Arcas; |
847 | Recomposing The Reinforcement Learning Building Blocks with Hypernetworks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Standard architectures tend to ignore these variables’ underlying interpretations and simply concatenate their features into a single vector. In this work, we argue that this choice may lead to poor gradient estimation in actor-critic algorithms and high variance learning steps in Meta-RL algorithms. |
Elad Sarafian; Shai Keynan; Sarit Kraus; |
848 | Towards Understanding Learning in Neural Networks with Linear Teachers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we prove that SGD globally optimizes this learning problem for a two-layer network with Leaky ReLU activations. |
Roei Sarussi; Alon Brutzkus; Amir Globerson; |
849 | E(n) Equivariant Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces a new model to learn graph neural networks equivariant to rotations, translations, reflections and permutations called E(n)-Equivariant Graph Neural Networks (EGNNs). |
Víctor Garcia Satorras; Emiel Hoogeboom; Max Welling; |
850 | A Representation Learning Perspective on The Importance of Train-Validation Splitting in Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present theoretical results that formalize this idea for linear representation learning on a subspace meta-learning instance, and experimentally verify this practical benefit of splitting in simulations and on standard meta-learning benchmarks. |
Nikunj Saunshi; Arushi Gupta; Wei Hu; |
851 | Low-Rank Sinkhorn Factorization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Building on this, we introduce in this work a generic approach that aims at solving, in full generality, the OT problem under low-nonnegative rank constraints with arbitrary costs. |
Meyer Scetbon; Marco Cuturi; Gabriel Peyré; |
852 | Linear Transformers Are Secretly Fast Weight Programmers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show the formal equivalence of linearised self-attention mechanisms and fast weight controllers from the early ’90s, where a slow neural net learns by gradient descent to program the fast weights of another net through sequences of elementary programming instructions which are additive outer products of self-invented activation patterns (today called keys and values). |
Imanol Schlag; Kazuki Irie; Jürgen Schmidhuber; |
853 | Descending Through A Crowded Valley – Benchmarking Deep Learning Optimizers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we aim to replace these anecdotes, if not with a conclusive ranking, then at least with evidence-backed heuristics. |
Robin M Schmidt; Frank Schneider; Philipp Hennig; |
854 | Equivariant Message Passing for The Prediction of Tensorial Properties and Molecular Spectra Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: On this basis, we propose the polarizable atom interaction neural network (PaiNN) and improve on common molecule benchmarks over previous networks, while reducing model size and inference time. |
Kristof Schütt; Oliver Unke; Michael Gastegger; |
855 | Just How Toxic Is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We observe that data poisoning and backdoor attacks are highly sensitive to variations in the testing setup. |
Avi Schwarzschild; Micah Goldblum; Arjun Gupta; John P Dickerson; Tom Goldstein; |
856 | Connecting Sphere Manifolds Hierarchically for Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper considers classification problems with hierarchically organized classes. |
Damien Scieur; Youngsung Kim; |
857 | Learning Intra-Batch Connections for Deep Metric Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose an approach based on message passing networks that takes all the relations in a mini-batch into account. |
Jenny Denise Seidenschwarz; Ismail Elezi; Laura Leal-Taixé; |
858 | Top-k EXtreme Contextual Bandits with Arm Hierarchy Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by modern applications, such as online advertisement and recommender systems, we study the top-k extreme contextual bandits problem, where the total number of arms can be enormous, and the learner is allowed to select k arms and observe all or some of the rewards for the chosen arms. |
Rajat Sen; Alexander Rakhlin; Lexing Ying; Rahul Kidambi; Dean Foster; Daniel N Hill; Inderjit S. Dhillon; |
859 | Pure Exploration and Regret Minimization in Matching Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms up to reaching a linear dependency in the number of vertices (up to poly-log terms). |
Flore Sentenac; Jialin Yi; Clement Calauzenes; Vianney Perchet; Milan Vojnovic; |
860 | State Entropy Maximization with Random Encoders for Efficient Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents Random Encoders for Efficient Exploration (RE3), an exploration method that utilizes state entropy as an intrinsic reward. |
Younggyo Seo; Lili Chen; Jinwoo Shin; Honglak Lee; Pieter Abbeel; Kimin Lee; |
861 | Online Submodular Resource Allocation with Applications to Rebalancing Shared Mobility Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a distributed scheme to maximize the cumulative welfare by designing a repeated game among the agents, who learn to act via regret minimization. |
Pier Giuseppe Sessa; Ilija Bogunovic; Andreas Krause; Maryam Kamgarpour; |
862 | RRL: Resnet As Representation for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose RRL: Resnet as representation for Reinforcement Learning – a straightforward yet effective approach that can learn complex behaviors directly from proprioceptive inputs. |
Rutav M Shah; Vikash Kumar; |
863 | Equivariant Networks for Pixelized Spheres Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show how to model this interplay using ideas from group theory, identify the equivariant linear maps, and introduce equivariant padding that respects these symmetries. |
Mehran Shakerinava; Siamak Ravanbakhsh; |
864 | Personalized Federated Learning Using Hypernetworks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel approach to this problem using hypernetworks, termed pFedHN for personalized Federated HyperNetworks. |
Aviv Shamsian; Aviv Navon; Ethan Fetaya; Gal Chechik; |
865 | On The Power of Localized Perceptron for Label-Optimal Learning of Halfspaces with Adversarial Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our main contribution is a Perceptron-like online active learning algorithm that runs in polynomial time, and under the conditions that the marginal distribution is isotropic log-concave and $\nu = \Omega(\epsilon)$, where $\epsilon \in (0, 1)$ is the target error rate, our algorithm PAC learns the underlying halfspace with near-optimal label complexity of $\tilde{O}\big(d \cdot \mathrm{polylog}(\frac{1}{\epsilon})\big)$ and sample complexity of $\tilde{O}\big(\frac{d}{\epsilon} \big)$. |
Jie Shen; |
866 | Sample-Optimal PAC Learning of Halfspaces with Malicious Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we present a new analysis for the algorithm of Awasthi et al. (2017) and show that it essentially achieves the near-optimal sample complexity bound of $\tilde{O}(d)$, improving the best known result of $\tilde{O}(d^2)$. |
Jie Shen; |
867 | Backdoor Scanning for Deep Neural Networks Through K-Arm Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by Multi-Arm Bandit in Reinforcement Learning, we propose a K-Arm optimization method for backdoor detection. |
Guangyu Shen; Yingqi Liu; Guanhong Tao; Shengwei An; Qiuling Xu; Siyuan Cheng; Shiqing Ma; Xiangyu Zhang; |
868 | State Relevance for Off-Policy Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce Omitting-States-Irrelevant-to-Return Importance Sampling (OSIRIS), an estimator which reduces variance by strategically omitting likelihood ratios associated with certain states. |
Simon P Shen; Yecheng Ma; Omer Gottesman; Finale Doshi-Velez; |
869 | SparseBERT: Rethinking The Importance Analysis in Self-attention Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To rethink the importance analysis in self-attention, we study the significance of different positions in attention matrix during pre-training. |
Han Shi; Jiahui Gao; Xiaozhe Ren; Hang Xu; Xiaodan Liang; Zhenguo Li; James Tin-Yau Kwok; |
870 | Learning Gradient Fields for Molecular Conformation Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by the traditional force field methods for molecular dynamics simulation, in this paper, we propose a novel approach called ConfGF by directly estimating the gradient fields of the log density of atomic coordinates. |
Chence Shi; Shitong Luo; Minkai Xu; Jian Tang; |
871 | Segmenting Hybrid Trajectories Using Latent ODEs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we propose the Latent Segmented ODE (LatSegODE), which uses Latent ODEs to perform reconstruction and changepoint detection within hybrid trajectories featuring jump discontinuities and switching dynamical modes. |
Ruian Shi; Quaid Morris; |
872 | Deeply-Debiased Off-Policy Interval Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel procedure to construct an efficient, robust, and flexible CI on a target policy’s value. |
Chengchun Shi; Runzhe Wan; Victor Chernozhukov; Rui Song; |
873 | GANMEX: One-vs-One Attributions Using GAN-based Model Explainability Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present GANMEX, a novel approach applying Generative Adversarial Networks (GAN) by incorporating the to-be-explained classifier as part of the adversarial networks. |
Sheng-Min Shih; Pin-Ju Tien; Zohar Karnin; |
874 | Large-Scale Meta-Learning with Continual Trajectory Shifting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we first show that allowing the meta-learners to take a larger number of inner gradient steps better captures the structure of heterogeneous and large-scale task distributions, thus resulting in better initialization points. Further, in order to increase the frequency of meta-updates even with the excessively long inner-optimization trajectories, we propose to estimate the required shift of the task-specific parameters with respect to the change of the initialization parameters. |
Jaewoong Shin; Hae Beom Lee; Boqing Gong; Sung Ju Hwang; |
875 | AGENT: A Benchmark for Core Psychological Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by cognitive development studies on intuitive psychology, we present a benchmark consisting of a large dataset of procedurally generated 3D animations, AGENT (Action, Goal, Efficiency, coNstraint, uTility), structured around four scenarios (goal preferences, action efficiency, unobserved constraints, and cost-reward trade-offs) that probe key concepts of core intuitive psychology. |
Tianmin Shu; Abhishek Bhandwaldar; Chuang Gan; Kevin Smith; Shari Liu; Dan Gutfreund; Elizabeth Spelke; Joshua Tenenbaum; Tomer Ullman; |
876 | Zoo-Tuning: Adaptive Transfer from A Zoo of Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Zoo-Tuning to address these challenges, which learns to adaptively transfer the parameters of pretrained models to the target task. |
Yang Shu; Zhi Kou; Zhangjie Cao; Jianmin Wang; Mingsheng Long; |
877 | Aggregating From Multiple Target-Shifted Sources Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we analyze the problem of aggregating source domains with different label distributions, where most recent source selection approaches fail. |
Changjian Shui; Zijian Li; Jiaqi Li; Christian Gagné; Charles X Ling; Boyu Wang; |
878 | Testing Group Fairness Via Optimal Transport Projections Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We have developed a statistical testing framework to detect if a given machine learning classifier fails to satisfy a wide range of group fairness notions. |
Nian Si; Karthyek Murthy; Jose Blanchet; Viet Anh Nguyen; |
879 | On Characterizing GAN Convergence Through Proximal Duality Gap Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we extend the notion of duality gap to proximal duality gap that is applicable to the general context of training GANs where Nash equilibria may not exist. |
Sahil Sidheekh; Aroof Aimen; Narayanan C Krishnan; |
880 | A Precise Performance Analysis of Support Vector Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the hard and soft support vector regression techniques applied to a set of $n$ linear measurements of the form $y_i=\boldsymbol{\beta}_\star^{T}\mathbf{x}_i + n_i$ where $\boldsymbol{\beta}_\star$ is an unknown vector, $\left\{\mathbf{x}_i\right\}_{i=1}^n$ are the feature vectors and $\left\{n_i\right\}_{i=1}^n$ model the noise. |
Houssem Sifaou; Abla Kammoun; Mohamed-Slim Alouini; |
881 | Directed Graph Embeddings in Pseudo-Riemannian Manifolds Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that general directed graphs can be effectively represented by an embedding model that combines three components: a pseudo-Riemannian metric structure, a non-trivial global topology, and a unique likelihood function that explicitly incorporates a preferred direction in embedding space. |
Aaron Sim; Maciej L Wiatrak; Angus Brayne; Paidi Creed; Saee Paliwal; |
882 | Collaborative Bayesian Optimization with Fair Regret Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by social welfare concepts from economics, we propose a new notion of regret capturing these properties and a collaborative BO algorithm whose convergence rate can be theoretically guaranteed by bounding the new regret, both of which share an adjustable parameter for trading off between fairness vs. efficiency. |
Rachael Hwee Ling Sim; Yehong Zhang; Bryan Kian Hsiang Low; Patrick Jaillet; |
883 | Dynamic Planning and Learning Under Recovering Rewards Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: With the objective of maximizing expected cumulative rewards over $T$ time periods, we propose, construct and prove performance guarantees for a class of “Purely Periodic Policies”. |
David Simchi-Levi; Zeyu Zheng; Feng Zhu; |
884 | PopSkipJump: Decision-Based Attack for Probabilistic Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We therefore propose a new adversarial decision-based attack specifically designed for classifiers with probabilistic outputs. |
Carl-Johann Simon-Gabriel; Noman Ahmed Sheikh; Andreas Krause; |
885 | Geometry of The Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study how permutation symmetries in overparameterized multi-layer neural networks generate ‘symmetry-induced’ critical points. |
Berfin Simsek; François Ged; Arthur Jacot; Francesco Spadaro; Clement Hongler; Wulfram Gerstner; Johanni Brea; |
886 | Flow-based Attribution in Graphical Models: A Recursive Shapley Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the attribution problem in a graphical model, wherein the objective is to quantify how the effect of changes at the source nodes propagates through the graph. |
Raghav Singal; George Michailidis; Hoiyi Ng; |
887 | Structured World Belief for Reinforcement Learning in POMDP Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose Structured World Belief, a model for learning and inference of object-centric belief states. |
Gautam Singh; Skand Peri; Junghyun Kim; Hyunseok Kim; Sungjin Ahn; |
888 | Skew Orthogonal Convolutions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a GNP convolution layer called Skew Orthogonal Convolution (SOC) that uses the following mathematical property: when a matrix is Skew-Symmetric, its exponential function is an orthogonal matrix. |
Sahil Singla; Soheil Feizi; |
889 | Multi-Task Reinforcement Learning with Context-based Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this framework, metadata can help to learn interpretable representations and provide the context to inform which representations to compose and how to compose them. |
Shagun Sodhani; Amy Zhang; Joelle Pineau; |
890 | Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on the agent’s trajectory that improves the sample efficiency in sparse-reward MDPs. |
Sungryull Sohn; Sungtae Lee; Jongwook Choi; Harm H Van Seijen; Mehdi Fatemi; Honglak Lee; |
891 | Accelerating Feedforward Computation Via Parallel Nonlinear Equation Solving Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To enable parallelization, we frame the task of feedforward computation as solving a system of nonlinear equations. |
Yang Song; Chenlin Meng; Renjie Liao; Stefano Ermon; |
892 | PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work studies a computationally and statistically efficient model-based algorithm for both Kernelized Nonlinear Regulators (KNR) and linear Markov Decision Processes (MDPs). |
Yuda Song; Wen Sun; |
893 | Fast Sketching of Polynomial Kernels of Polynomial Degree Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Combined with a novel sampling scheme, we give the fastest algorithms for approximating a large family of slow-growing kernels. |
Zhao Song; David Woodruff; Zheng Yu; Lichen Zhang; |
894 | Variance Reduction Via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For the primal-dual formulation of this problem, we propose a novel algorithm called Variance Reduction via Primal-Dual Accelerated Dual Averaging (VRPDA). |
Chaobing Song; Stephen J Wright; Jelena Diakonikolas; |
895 | Oblivious Sketching-based Central Path Method for Linear Programming Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a sketching-based central path method for solving linear programmings, whose running time matches the state of the art results [Cohen, Lee, Song STOC 19; Lee, Song, Zhang COLT 19]. |
Zhao Song; Zheng Yu; |
896 | Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a novel intrinsic reward, called causal curiosity, and show that it allows our agents to learn optimal sequences of actions, and to discover causal factors in the dynamics. |
Sumedh A Sontakke; Arash Mehrjou; Laurent Itti; Bernhard Schölkopf; |
897 | Decomposed Mutual Information Estimation for Contrastive Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. |
Alessandro Sordoni; Nouha Dziri; Hannes Schulz; Geoff Gordon; Philip Bachman; Remi Tachet Des Combes; |
898 | Decoupling Representation Learning from Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In an effort to overcome limitations of reward-driven feature learning in deep reinforcement learning (RL) from images, we propose decoupling representation learning from policy learning. |
Adam Stooke; Kimin Lee; Pieter Abbeel; Michael Laskin; |
899 | K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, instead of counting on a single supernet, we introduce $K$-shot supernets and take their weights for each operation as a dictionary. |
Xiu Su; Shan You; Mingkai Zheng; Fei Wang; Chen Qian; Changshui Zhang; Chang Xu; |
900 | More Powerful and General Selective Inference for Stepwise Feature Selection Using Homotopy Method Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we develop a more powerful and general conditional SI method for SFS using the homotopy method which enables us to overcome this limitation. |
Kazuya Sugiyama; Vo Nguyen Le Duy; Ichiro Takeuchi; |
901 | Not All Memories Are Created Equal: Learning to Forget By Expiring Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information. |
Sainbayar Sukhbaatar; Da Ju; Spencer Poff; Stephen Roller; Arthur Szlam; Jason Weston; Angela Fan; |
902 | Nondeterminism and Instability in Neural Network Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we establish an experimental protocol for understanding the effect of optimization nondeterminism on model diversity, allowing us to isolate the effects of a variety of sources of nondeterminism. |
Cecilia Summers; Michael J. Dinneen; |
903 | AutoSampling: Search for Effective Data Sampling Schedules Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an AutoSampling method to automatically learn sampling schedules for model training, which consists of the multi-exploitation step aiming for optimal local sampling schedules and the exploration step for the ideal sampling distribution. |
Ming Sun; Haoxuan Dou; Baopu Li; Junjie Yan; Wanli Ouyang; Lei Cui; |
904 | What Makes for End-to-End Object Detection? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we first point out that one-to-one positive sample assignment is the key factor, while one-to-many assignment in previous detectors causes redundant predictions in inference. |
Peize Sun; Yi Jiang; Enze Xie; Wenqi Shao; Zehuan Yuan; Changhu Wang; Ping Luo; |
905 | DFAC Framework: Factorizing The Value Function Via Quantile Mixture for Multi-Agent Distributional Q-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address the above issues, we integrate distributional RL and value function factorization methods by proposing a Distributional Value Function Factorization (DFAC) framework to generalize expected value function factorization methods to their distributional variants. |
Wei-Fang Sun; Cheng-Kuang Lee; Chun-Yi Lee; |
906 | Scalable Variational Gaussian Processes Via Harmonic Kernel Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability. |
Shengyang Sun; Jiaxin Shi; Andrew Gordon Wilson; Roger B Grosse; |
907 | Reasoning Over Virtual Knowledge Bases With Open Predicate Relations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present the Open Predicate Query Language (OPQL); a method for constructing a virtual KB (VKB) trained entirely from text. |
Haitian Sun; Patrick Verga; Bhuwan Dhingra; Ruslan Salakhutdinov; William W Cohen; |
908 | PAC-Learning for Strategic Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we generalize both of these through a unified framework for strategic classification and introduce the notion of strategic VC-dimension (SVC) to capture the PAC-learnability in our general strategic setup. |
Ravi Sundaram; Anil Vullikanti; Haifeng Xu; Fan Yao; |
909 | Reinforcement Learning for Cost-Aware Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper addresses this deficiency by introducing two new, model-free RL algorithms for solving cost-aware Markov decision processes, where the goal is to maximize the ratio of long-run average reward to long-run average cost. |
Wesley Suttle; Kaiqing Zhang; Zhuoran Yang; Ji Liu; David Kraemer; |
910 | Model-Targeted Poisoning Attacks with Provable Convergence Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider poisoning attacks against convex machine learning models and propose an efficient poisoning attack designed to induce a model specified by the adversary. |
Fnu Suya; Saeed Mahloujifar; Anshuman Suri; David Evans; Yuan Tian; |
911 | Generalization Error Bound for Hyperbolic Ordinal Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, through our novel characterization of HOE with decomposed Lorentz Gramian matrices, we provide a generalization error bound of HOE for the first time, which is at most exponential with respect to the embedding space’s radius. |
Atsushi Suzuki; Atsushi Nitanda; Jing Wang; Linchuan Xu; Kenji Yamanishi; Marc Cavazza; |
912 | Of Moments and Matching: A Game-Theoretic Framework for Closing The Imitation Gap Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We provide a unifying view of a large family of previous imitation learning algorithms through the lens of moment matching. |
Gokul Swamy; Sanjiban Choudhury; J. Andrew Bagnell; Steven Wu; |
913 | Parallel Tempering on Optimized Paths Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this issue, we expand the framework of PT to general families of paths, formulate the choice of path as an optimization problem that admits tractable gradient estimates, and propose a flexible new family of spline interpolation paths for use in practice. |
Saifuddin Syed; Vittorio Romaniello; Trevor Campbell; Alexandre Bouchard-Cote; |
914 | Robust Representation Learning Via Perceptual Similarity Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose Contrastive Input Morphing (CIM), a representation learning framework that learns input-space transformations of the data to mitigate the effect of irrelevant input features on downstream performance. |
Saeid A Taghanaki; Kristy Choi; Amir Hosein Khasahmadi; Anirudh Goyal; |
915 | DriftSurf: Stable-State / Reactive-State Learning Under Concept Drift Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an adaptive learning algorithm that extends previous drift-detection-based methods by incorporating drift detection into a broader stable-state/reactive-state process. |
Ashraf Tahmasbi; Ellango Jothimurugesan; Srikanta Tirthapura; Phillip B Gibbons; |
916 | Sinkhorn Label Allocation: Semi-Supervised Classification Via Annealed Self-Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we reinterpret this label assignment process as an optimal transportation problem between examples and classes, wherein the cost of assigning an example to a class is mediated by the current predictions of the classifier. |
Kai Sheng Tai; Peter D Bailis; Gregory Valiant; |
917 | Approximation Theory Based Methods for RKHS Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Using an approximation method, we propose efficient algorithms for the stochastic RKHS bandit problem and the first general algorithm for the adversarial RKHS bandit problem. |
Sho Takemori; Masahiro Sato; |
918 | Supervised Tree-Wasserstein Distance Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose the Supervised Tree-Wasserstein (STW) distance, a fast, supervised metric learning method based on the tree metric. |
Yuki Takezawa; Ryoma Sato; Makoto Yamada; |
919 | EfficientNetV2: Smaller Models and Faster Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces EfficientNetV2, a new family of convolutional networks that have faster training speed and better parameter efficiency than previous models. |
Mingxing Tan; Quoc Le; |
920 | SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise with unknown statistics. |
Anshoo Tandon; Aldric Han; Vincent Tan; |
921 | 1-bit Adam: Communication Efficient Large-Scale Training with Adam’s Convergence Speed Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose 1-bit Adam that reduces the communication volume by up to 5x, offers much better scalability, and provides the same convergence speed as uncompressed Adam. |
Hanlin Tang; Shaoduo Gan; Ammar Ahmad Awan; Samyam Rajbhandari; Conglong Li; Xiangru Lian; Ji Liu; Ce Zhang; Yuxiong He; |
922 | Taylor Expansion of Discount Factors Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study the effect that this discrepancy of discount factors has during learning, and discover a family of objectives that interpolate value functions of two distinct discount factors. |
Yunhao Tang; Mark Rowland; Remi Munos; Michal Valko; |
923 | REPAINT: Knowledge Transfer in Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work proposes REPresentation And INstance Transfer (REPAINT) algorithm for knowledge transfer in deep reinforcement learning. |
Yunzhe Tao; Sahika Genc; Jonathan Chung; Tao Sun; Sunil Mallya; |
924 | Understanding The Dynamics of Gradient Flow in Overparameterized Linear Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We provide a detailed analysis of the dynamics of the gradient flow in overparameterized two-layer linear models. |
Salma Tarmoun; Guilherme Franca; Benjamin D Haeffele; Rene Vidal; |
925 | Sequential Domain Adaptation By Synthesizing Distributionally Robust Experts Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Given available data, we investigate novel strategies to synthesize a family of least squares estimator experts that are robust with regard to moment conditions. |
Bahar Taskesen; Man-Chung Yue; Jose Blanchet; Daniel Kuhn; Viet Anh Nguyen; |
926 | A Language for Counterfactual Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present Omega, a probabilistic programming language with support for counterfactual inference. |
Zenna Tavares; James Koppel; Xin Zhang; Ria Das; Armando Solar-Lezama; |
927 | Synthesizer: Rethinking Self-Attention for Transformer Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose Synthesizer, a model that learns synthetic attention weights without token-token interactions. |
Yi Tay; Dara Bahri; Donald Metzler; Da-Cheng Juan; Zhe Zhao; Che Zheng; |
928 | OmniNet: Omnidirectional Representations from Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes Omnidirectional Representations from Transformers (OMNINET). |
Yi Tay; Mostafa Dehghani; Vamsi Aribandi; Jai Gupta; Philip M Pham; Zhen Qin; Dara Bahri; Da-Cheng Juan; Donald Metzler; |
929 | T-SCI: A Two-Stage Conformal Inference Algorithm with Guaranteed Coverage for Cox-MLP Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To recover the guaranteed coverage without linear assumption, we propose two algorithms based on conformal inference. |
Jiaye Teng; Zeren Tan; Yang Yuan; |
930 | Moreau-Yosida $f$-divergences Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by this, we define the Moreau-Yosida approximation of $f$-divergences with respect to the Wasserstein-1 metric. |
Dávid Terjék; |
931 | Understanding Invariance Via Feedforward Inversion of Discriminatively Trained Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We explore this phenomenon further using a novel synthesis of methods, yielding a feedforward inversion model that produces remarkably high fidelity reconstructions, qualitatively superior to those of past efforts. |
Piotr Teterwak; Chiyuan Zhang; Dilip Krishnan; Michael C Mozer; |
932 | Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study exploration in stochastic multi-armed bandits when we have access to a divisible resource that can be allocated in varying amounts to arm pulls. |
Brijen Thananjeyan; Kirthevasan Kandasamy; Ion Stoica; Michael Jordan; Ken Goldberg; Joseph Gonzalez; |
933 | Monte Carlo Variational Auto-Encoders Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address both issues and demonstrate the performance of the resulting Monte Carlo VAEs on a variety of applications. |
Achille Thin; Nikita Kotelevskii; Arnaud Doucet; Alain Durmus; Eric Moulines; Maxim Panov; |
934 | Efficient Generative Modelling of Protein Structure Fragments Using A Deep Markov Model Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these issues, we developed BIFROST, a novel take on the fragment library problem based on a Deep Markov Model architecture combined with directional statistics for angular degrees of freedom, implemented in the deep probabilistic programming language Pyro. |
Christian B Thygesen; Christian Skjødt Steenmans; Ahmad Salim Al-Sibahi; Lys Sanz Moreta; Anders Bundgård Sørensen; Thomas Hamelryck; |
935 | Understanding Self-supervised Learning Dynamics Without Contrastive Pairs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we answer this question via a simple theoretical study and propose a novel approach, \ourmethod{}, that \emph{directly} sets the linear predictor based on the statistics of its inputs, rather than trained with gradient update. |
Yuandong Tian; Xinlei Chen; Surya Ganguli; |
936 | Online Learning in Unknown Markov Games Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study online learning in unknown Markov games, a problem that arises in episodic multi-agent reinforcement learning where the actions of the opponents are unobservable. |
Yi Tian; Yuanhao Wang; Tiancheng Yu; Suvrit Sra; |
937 | BORE: Bayesian Optimization By Density-Ratio Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we cast the computation of EI as a binary classification problem, building on the link between class-probability estimation and density-ratio estimation, and the lesser-known link between density-ratios and EI. |
Louis C Tiao; Aaron Klein; Matthias W Seeger; Edwin V. Bonilla; Cedric Archambeau; Fabio Ramos; |
938 | Nonparametric Decomposition of Sparse Tensors Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this model misspecification and to exploit the sparse tensor structures, we propose Nonparametric dEcomposition of Sparse Tensors (\ours), which can capture both the sparse structure properties and complex relationships between the tensor nodes to enhance the embedding estimation. |
Conor Tillinghast; Shandian Zhe; |
939 | Probabilistic Programs with Stochastic Conditioning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a generalization of deterministic conditioning to stochastic conditioning, that is, conditioning on the marginal distribution of a variable taking a particular form. |
David Tolpin; Yuan Zhou; Tom Rainforth; Hongseok Yang; |
940 | Deep Continuous Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we propose deep continuous networks (DCNs), which combine spatially continuous filters, with the continuous depth framework of neural ODEs. |
Nergis Tomen; Silvia-Laura Pintea; Jan Van Gemert; |
941 | Diffusion Earth Mover’s Distance and Distribution Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new fast method of measuring distances between large numbers of related high dimensional datasets called the Diffusion Earth Mover’s Distance (EMD). |
Alexander Y Tong; Guillaume Huguet; Amine Natik; Kincaid Macdonald; Manik Kuchroo; Ronald Coifman; Guy Wolf; Smita Krishnaswamy; |
942 | Training Data-efficient Image Transformers & Distillation Through Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we produce competitive convolution-free transformers trained on ImageNet only using a single computer in less than 3 days. |
Hugo Touvron; Matthieu Cord; Matthijs Douze; Francisco Massa; Alexandre Sablayrolles; Herve Jegou; |
943 | Conservative Objective Models for Effective Offline Model-Based Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we aim to solve data-driven model-based optimization (MBO) problems, where the goal is to find a design input that maximizes an unknown objective function provided access to only a static dataset of inputs and their corresponding objective values. |
Brandon Trabucco; Aviral Kumar; Xinyang Geng; Sergey Levine; |
944 | Sparse Within Sparse Gaussian Processes Using Neighbor Information Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In particular, we introduce a novel hierarchical prior, which imposes sparsity on the set of inducing variables. |
Gia-Lac Tran; Dimitrios Milios; Pietro Michiardi; Maurizio Filippone; |
945 | SMG: A Shuffling Gradient-Based Method with Momentum Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We combine two advanced ideas widely used in optimization for machine learning: \textit{shuffling} strategy and \textit{momentum} technique to develop a novel shuffling gradient-based method with momentum, coined \textbf{S}huffling \textbf{M}omentum \textbf{G}radient (SMG), for non-convex finite-sum optimization problems. |
Trang H Tran; Lam M Nguyen; Quoc Tran-Dinh; |
946 | Bayesian Optimistic Optimisation with Exponentially Decaying Regret Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal O(N^{-\sqrt{N}})$ under the assumption that the objective function is sampled from a Gaussian process with a Matérn kernel with smoothness parameter $\nu > 4 +\frac{D}{2}$, where $D$ is the number of dimensions. |
Hung Tran-The; Sunil Gupta; Santu Rana; Svetha Venkatesh; |
947 | On Disentangled Representations Learned from Correlated Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data in a large-scale empirical study (including 4260 models). |
Frederik Träuble; Elliot Creager; Niki Kilbertus; Francesco Locatello; Andrea Dittadi; Anirudh Goyal; Bernhard Schölkopf; Stefan Bauer; |
948 | A New Formalism, Method and Open Issues for Zero-Shot Coordination Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: However, until now, this label-free problem has only been informally defined. We formalize this setting as the label-free coordination (LFC) problem by defining the label-free coordination game. |
Johannes Treutlein; Michael Dennis; Caspar Oesterheld; Jakob Foerster; |
949 | Learning A Universal Template for Few-shot Dataset Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose to utilize the diverse training set to construct a \emph{universal template}: a partial model that can define a wide array of dataset-specialized models, by plugging in appropriate components. |
Eleni Triantafillou; Hugo Larochelle; Richard Zemel; Vincent Dumoulin; |
950 | Provable Meta-Learning of Linear Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the problem of multi-task linear regression—in which multiple linear regression models share a common, low-dimensional linear representation. |
Nilesh Tripuraneni; Chi Jin; Michael Jordan; |
951 | Cumulants of Hawkes Processes Are Robust to Observation Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we address the problem of learning the causal structure of MHPs when the observed timestamps of events are subject to random and unknown shifts, also known as random translations. |
William Trouleau; Jalal Etesami; Matthias Grossglauser; Negar Kiyavash; Patrick Thiran; |
952 | PixelTransformer: Sample Conditioned Signal Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a generative model that can infer a distribution for the underlying spatial signal conditioned on sparse samples e.g. plausible images given a few observed pixels. |
Shubham Tulsiani; Abhinav Gupta; |
953 | A Framework for Private Matrix Analysis in Sliding Window Model Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We perform a rigorous study of private matrix analysis when only the last $W$ updates to matrices are considered useful for analysis. |
Jalaj Upadhyay; Sarvagya Upadhyay; |
954 | Fast Projection Onto Convex Smooth Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we focus on projection problems where the constraints are smooth and the number of constraints is significantly smaller than the dimension. |
Ilnura Usmanova; Maryam Kamgarpour; Andreas Krause; Kfir Levy; |
955 | SGLB: Stochastic Gradient Langevin Boosting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces Stochastic Gradient Langevin Boosting (SGLB) – a powerful and efficient machine learning framework that may deal with a wide range of loss functions and has provable generalization guarantees. |
Aleksei Ustimenko; Liudmila Prokhorenkova; |
956 | LTL2Action: Generalizing LTL Instructions for Multi-Task RL Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To reduce the overhead of learning LTL semantics, we introduce an environment-agnostic LTL pretraining scheme which improves sample-efficiency in downstream environments. |
Pashootan Vaezipoor; Andrew C Li; Rodrigo A Toro Icarte; Sheila A. Mcilraith; |
957 | Active Deep Probabilistic Subsampling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We generalize DPS to a sequential method that actively picks the next sample based on the information acquired so far; dubbed Active-DPS (A-DPS). |
Hans Van Gorp; Iris Huijben; Bastiaan S Veeling; Nicola Pezzotti; Ruud J. G. Van Sloun; |
958 | CURI: A Benchmark for Productive Concept Learning Under Uncertainty Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new benchmark, Compositional Reasoning Under Uncertainty (CURI) that instantiates a series of few-shot, meta-learning tasks in a productive concept space to evaluate different aspects of systematic generalization under uncertainty, including splits that test abstract understandings of disentangling, productive generalization, learning boolean operations, variable binding, etc. |
Ramakrishna Vedantam; Arthur Szlam; Maximillian Nickel; Ari Morcos; Brenden M Lake; |
959 | Towards Domain-Agnostic Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To overcome such limitation, we propose a domain-agnostic approach to contrastive learning, named DACL, that is applicable to problems where domain-specific data augmentations are not readily available. |
Vikas Verma; Thang Luong; Kenji Kawaguchi; Hieu Pham; Quoc Le; |
960 | Sparsifying Networks Via Subdifferential Inclusion Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this article, we propose a new formulation of the problem of generating sparse weights for a pre-trained neural network. |
Sagar Verma; Jean-Christophe Pesquet; |
961 | Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a method called Persistent Evolution Strategies (PES), which divides the computation graph into a series of truncated unrolls, and performs an evolution strategies-based update step after each unroll. |
Paul Vicol; Luke Metz; Jascha Sohl-Dickstein; |
962 | Online Graph Dictionary Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We fill this gap by proposing a new online Graph Dictionary Learning approach, which uses the Gromov Wasserstein divergence for the data fitting term. |
Cédric Vincent-Cuaz; Titouan Vayer; Rémi Flamary; Marco Corneli; Nicolas Courty; |
963 | Neuro-algorithmic Policies Enable Fast Combinatorial Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that, for a certain subclass of the MDP framework, this can be alleviated by a neuro-algorithmic policy architecture that embeds a time-dependent shortest path solver in a deep neural network. |
Marin Vlastelica; Michal Rolinek; Georg Martius; |
964 | Efficient Training of Robust Decision Trees Against Adversarial Examples Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present GROOT, an efficient algorithm for training robust decision trees and random forests that runs in a matter of seconds to minutes. |
Daniël Vos; Sicco Verwer; |
965 | Object Segmentation Without Labels with Large-Scale Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work demonstrates that large-scale unsupervised models can also perform a more challenging object segmentation task, requiring neither pixel-level nor image-level labeling. |
Andrey Voynov; Stanislav Morozov; Artem Babenko; |
966 | Principal Component Hierarchy for Sparse Quadratic Programs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Exploiting this property, we propose two scalable optimization algorithms, coined as the best response and the dual program, that can efficiently screen the potential indices of the nonzero elements of the original program. |
Robbie Vreugdenhil; Viet Anh Nguyen; Armin Eftekhari; Peyman Mohajerin Esfahani; |
967 | Whitening and Second Order Optimization Both Make Information in The Dataset Unusable During Training, and Can Reduce or Prevent Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that both data whitening and second order optimization can harm or entirely prevent generalization. |
Neha Wadia; Daniel Duckworth; Samuel S Schoenholz; Ethan Dyer; Jascha Sohl-Dickstein; |
968 | Safe Reinforcement Learning Using Advantage-Based Intervention Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new algorithm, SAILR, that uses an intervention mechanism based on advantage functions to keep the agent safe throughout training and optimizes the agent’s policy using off-the-shelf RL algorithms designed for unconstrained MDPs. |
Nolan C Wagener; Byron Boots; Ching-An Cheng; |
969 | Task-Optimal Exploration in Linear Dynamical Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study task-guided exploration and determine what precisely an agent must learn about their environment in order to complete a particular task. |
Andrew J Wagenmaker; Max Simchowitz; Kevin Jamieson; |
970 | Learning and Planning in Average-Reward Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce learning and planning algorithms for average-reward MDPs, including 1) the first general proven-convergent off-policy model-free control algorithm without reference states, 2) the first proven-convergent off-policy model-free prediction algorithm, and 3) the first off-policy learning algorithm that converges to the actual value function rather than to the value function plus an offset. |
Yi Wan; Abhishek Naik; Richard S Sutton; |
971 | Think Global and Act Local: Bayesian Optimisation Over High-Dimensional Categorical and Mixed Search Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel solution—we combine local optimisation with a tailored kernel design, effectively handling high-dimensional categorical and mixed search spaces, whilst retaining sample efficiency. |
Xingchen Wan; Vu Nguyen; Huong Ha; Binxin Ru; Cong Lu; Michael A. Osborne; |
972 | Zero-Shot Knowledge Distillation from A Decision-Based Black-Box Model Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose to generate pseudo samples that are distinguished by the decision boundaries of the DB3 teacher to the largest extent and construct soft labels for these samples, which are used as the transfer set. |
Zi Wang; |
973 | Fairness of Exposure in Stochastic Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To remedy this problem, we propose a new bandit objective that guarantees merit-based fairness of exposure to the items while optimizing utility to the users. |
Lequn Wang; Yiwei Bai; Wen Sun; Thorsten Joachims; |
974 | A Proxy Variable View of Shared Confounding Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the setting where there are many treatments with shared confounding, and we study under what conditions is causal identification possible. |
Yixin Wang; David Blei; |
975 | Fast Algorithms for Stackelberg Prediction Game with Least Squares Loss Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In contrast, we propose a novel approach that reformulates a SPG-LS as a single SDP of a similar form and the same dimension as those solved in the bisection method. |
Jiali Wang; He Chen; Rujun Jiang; Xudong Li; Zihao Li; |
976 | Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this framework, since collecting too much data for training the regression is very time-costly, we propose two approaches to lower the cost: 1) specializing the polynomial to ensure an accurate regression even with less training data; 2) employing iterative pruning and fine-tuning to collect the data faster. |
Wenxiao Wang; Minghao Chen; Shuai Zhao; Long Chen; Jinming Hu; Haifeng Liu; Deng Cai; Xiaofei He; Wei Liu; |
977 | Explainable Automated Graph Representation Learning with Hyperparameter Importance Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an explainable AutoML approach for graph representation (e-AutoGR) which utilizes explainable graph features during performance estimation and learns decorrelated importance weights for different hyperparameters in affecting the model performance through a non-linear decorrelated weighting regression. |
Xin Wang; Shuyi Fan; Kun Kuang; Wenwu Zhu; |
978 | Self-Tuning for Data-Efficient Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To escape from this dilemma, we present Self-Tuning to enable data-efficient deep learning by unifying the exploration of labeled and unlabeled data and the transfer of a pre-trained model, as well as a Pseudo Group Contrast (PGC) mechanism to mitigate the reliance on pseudo-labels and boost the tolerance to false labels. |
Ximei Wang; Jinghan Gao; Mingsheng Long; Jianmin Wang; |
979 | Label Distribution Learning Machine Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Specifically, we extend the margin theory to LDL and propose a new LDL method called \textbf{L}abel \textbf{D}istribution \textbf{L}earning \textbf{M}achine (LDLM). |
Jing Wang; Xin Geng; |
980 | AlphaNet: Improved Training of Supernets with Alpha-Divergence Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose to improve the supernet training with a more generalized alpha-divergence. |
Dilin Wang; Chengyue Gong; Meng Li; Qiang Liu; Vikas Chandra; |
981 | Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the policy gradient (PG) method for the linear-quadratic mean-field control and game, where we assume each agent has identical linear state transitions and quadratic cost functions. |
Weichen Wang; Jiequn Han; Zhuoran Yang; Zhaoran Wang; |
982 | SG-PALM: A Fast Physically Interpretable Tensor Graphical Model Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new graphical model inference procedure, called SG-PALM, for learning conditional dependency structure of high-dimensional tensor-variate data. |
Yu Wang; Alfred Hero; |
983 | Deep Generative Learning Via Schrödinger Bridge Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose to learn a generative model via entropy interpolation with a Schrödinger Bridge. |
Gefei Wang; Yuling Jiao; Qian Xu; Yang Wang; Can Yang; |
984 | Robust Inference for High-Dimensional Linear Models Via Residual Randomization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a residual randomization procedure designed for robust inference using Lasso estimates in the high-dimensional setting. |
Y. Samuel Wang; Si Kai Lee; Panos Toulis; Mladen Kolar; |
985 | A Modular Analysis of Provable Acceleration Via Polyak's Momentum: Training A Wide ReLU Network and A Deep Linear Network Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work establishes that momentum does indeed speed up neural net training. |
Jun-Kun Wang; Chi-Heng Lin; Jacob D Abernethy; |
986 | Optimal Non-Convex Exact Recovery in Stochastic Block Model Via Projected Power Method Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the problem of exact community recovery in the symmetric stochastic block model, where a graph of $n$ vertices is randomly generated by partitioning the vertices into $K \ge 2$ equal-sized communities and then connecting each pair of vertices with probability that depends on their community memberships. |
Peng Wang; Huikang Liu; Zirui Zhou; Anthony Man-Cho So; |
987 | ConvexVST: A Convex Optimization Approach to Variance-stabilizing Transformation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we converted the VST problem into a convex optimization problem, which can always be efficiently solved, identified the specific structure of the convex problem, which further improved the efficiency of the proposed algorithm, and showed that any finite discrete distributions and the discretized version of any continuous distributions from real data can be variance-stabilized in an easy and nonparametric way. |
Mengfan Wang; Boyu Lyu; Guoqiang Yu; |
988 | The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the implicit bias of adaptive optimization algorithms on homogeneous neural networks. |
Bohan Wang; Qi Meng; Wei Chen; Tie-Yan Liu; |
989 | Robust Learning for Data Poisoning Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the robustness of stochastic approximation approaches against data poisoning attacks. |
Yunjuan Wang; Poorya Mianjy; Raman Arora; |
990 | SketchEmbedNet: Learning Novel Concepts By Imitating Drawings Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: While earlier approaches focus on generation quality or retrieval, we explore properties of image representations learned by training a model to produce sketches of images. |
Alexander Wang; Mengye Ren; Richard Zemel; |
991 | Directional Bias Amplification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we focus on one aspect of the problem, namely bias amplification: the tendency of models to amplify the biases present in the data they are trained on. |
Angelina Wang; Olga Russakovsky; |
992 | An Exact Solver for The Weston-Watkins SVM Subproblem Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose an algorithm that solves the subproblem exactly using a novel reparametrization of the Weston-Watkins dual problem. |
Yutong Wang; Clayton Scott; |
993 | SCC: An Efficient Deep Reinforcement Learning Agent Mastering The Game of StarCraft II Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we’ll share the key insights and optimizations on efficient imitation learning and reinforcement learning for StarCraft II full game. |
Xiangjun Wang; Junxiao Song; Penghui Qi; Peng Peng; Zhenkun Tang; Wei Zhang; Weimin Li; Xiongjun Pi; Jujie He; Chao Gao; Haitao Long; Quan Yuan; |
994 | Quantum Algorithms for Reinforcement Learning with A Generative Model Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For such an MDP, we design quantum algorithms that approximate an optimal policy ($\pi^*$), the optimal value function ($v^*$), and the optimal $Q$-function ($q^*$), assuming the algorithms can access samples from the environment in quantum superposition. |
Daochen Wang; Aarthi Sundaram; Robin Kothari; Ashish Kapoor; Martin Roetteler; |
995 | Matrix Completion with Model-free Weighting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel method for matrix completion under general non-uniform missing structures. |
Jiayi Wang; Raymond K. W. Wong; Xiaojun Mao; Kwun Chuen Gary Chan; |
996 | UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a unified pre-training approach called UniSpeech to learn speech representations with both labeled and unlabeled data, in which supervised phonetic CTC learning and phonetically-aware contrastive self-supervised learning are conducted in a multi-task learning manner. |
Chengyi Wang; Yu Wu; Yao Qian; Kenichi Kumatani; Shujie Liu; Furu Wei; Michael Zeng; Xuedong Huang; |
997 | Instabilities of Offline RL with Pre-Trained Neural Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In particular, our methodology explores these ideas when using features from pre-trained neural networks, in the hope that these representations are powerful enough to permit sample efficient offline RL. |
Ruosong Wang; Yifan Wu; Ruslan Salakhutdinov; Sham Kakade; |
998 | Learning to Weight Imperfect Demonstrations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In contrast, this paper proposes a method of learning to weight imperfect demonstrations in GAIL without imposing extensive prior information. |
Yunke Wang; Chang Xu; Bo Du; Honglak Lee; |
999 | Evolving Attention with Residual Convolutions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel and generic mechanism based on evolving attention to improve the performance of transformers. |
Yujing Wang; Yaming Yang; Jiangang Bai; Mingliang Zhang; Jing Bai; Jing Yu; Ce Zhang; Gao Huang; Yunhai Tong; |
1000 | Guarantees for Tuning The Step Size Using A Learning-to-Learn Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem of tuning the step size for quadratic loss. |
Xiang Wang; Shuai Yuan; Chenwei Wu; Rong Ge; |
1001 | Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we take one important step further to understand the close connection between these two learning paradigms, through both theoretical analysis and empirical investigation. |
Haoxiang Wang; Han Zhao; Bo Li; |
1002 | Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To solve this problem, we reformulate the graph drawing objective into a generalized form and derive a new learning objective, which is proved to have eigenvectors as its unique global minimizer. |
Kaixin Wang; Kuangqi Zhou; Qixin Zhang; Jie Shao; Bryan Hooi; Jiashi Feng; |
1003 | Robust Asymmetric Learning in POMDPs Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this issue, we derive an update which, when applied iteratively to an expert, maximizes the expected reward of the trainee’s policy. |
Andrew Warrington; Jonathan W Lavington; Adam Scibior; Mark Schmidt; Frank Wood; |
1004 | A Unified Generative Adversarial Network Training Via Self-Labeling and Self-Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel GAN training scheme that can handle any level of labeling in a unified manner. |
Tomoki Watanabe; Paolo Favaro; |
1005 | Decision-Making Under Selective Labels: Optimal Finite-Domain Policies and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper studies the learning of decision policies in the face of selective labels, in an online setting that balances learning costs against future utility. |
Dennis Wei; |
1006 | Inferring Serial Correlation with Dynamic Backgrounds Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a Total Variation (TV) constrained least square estimator coupled with hypothesis tests to infer the serial correlation in the presence of unknown and unstructured dynamic background. |
Song Wei; Yao Xie; Dobromir Rahnev; |
1007 | Meta-learning Hyperparameter Performance Prediction with Neural Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an end-to-end surrogate named as Transfer NeuralProcesses (TNP) that learns a comprehensive set of meta-knowledge, including the parameters of historical surrogates, historical trials, and initial configurations for other datasets. |
Ying Wei; Peilin Zhao; Junzhou Huang; |
1008 | A Structured Observation Distribution for Generative Biological Sequence Prediction and Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these problems, we propose a principled drop-in alternative to MSA preprocessing in the form of a structured observation distribution (the "MuE" distribution). |
Eli N Weinstein; Debora Marks; |
1009 | Thinking Like Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language. |
Gail Weiss; Yoav Goldberg; Eran Yahav; |
1010 | Leveraged Weighted Loss for Partial Label Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a family of loss functions named \textit{Leveraged Weighted} (LW) loss, which for the first time introduces the leverage parameter $\beta$ to consider the trade-off between losses on partial labels and non-partial ones. |
Hongwei Wen; Jingyi Cui; Hanyuan Hang; Jiabin Liu; Yisen Wang; Zhouchen Lin; |
1011 | Characterizing The Gap Between Actor-Critic and Policy Gradient Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we explain the gap between AC and PG methods by identifying the exact adjustment to the AC objective/gradient that recovers the true policy gradient of the cumulative reward objective (PG). |
Junfeng Wen; Saurabh Kumar; Ramki Gummadi; Dale Schuurmans; |
1012 | Toward Understanding The Feature Learning Process of Self-supervised Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an underlying principle called feature decoupling to explain the effects of augmentations, where we theoretically characterize how augmentations can reduce the correlations of dense features between positive samples while keeping the correlations of sparse features intact, thereby forcing the neural networks to learn from the self-supervision of sparse features. |
Zixin Wen; Yuanzhi Li; |
1013 | Keyframe-Focused Visual Imitation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a solution that outperforms these prior approaches by upweighting demonstration keyframes corresponding to expert action changepoints. |
Chuan Wen; Jierui Lin; Jianing Qian; Yang Gao; Dinesh Jayaraman; |
1014 | Learning De-identified Representations of Prosody from Raw Audio Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a method for learning de-identified prosody representations from raw audio using a contrastive self-supervised signal. |
Jack Weston; Raphael Lenain; Udeepa Meepegama; Emil Fristed; |
1015 | Solving Inverse Problems with A Flow-based Noise Model Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study image inverse problems with a normalizing flow prior. |
Jay Whang; Qi Lei; Alex Dimakis; |
1016 | Composing Normalizing Flows for Inverse Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by this, we propose a framework for approximate inference that estimates the target conditional as a composition of two flow models. |
Jay Whang; Erik Lindgren; Alex Dimakis; |
1017 | Which Transformer Architecture Fits My Data? A Vocabulary Bottleneck in Self-attention Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We theoretically predict the existence of an embedding rank bottleneck that limits the contribution of self-attention width to the Transformer expressivity. |
Noam Wies; Yoav Levine; Daniel Jannai; Amnon Shashua; |
1018 | Prediction-Centric Learning of Independent Cascade Dynamics from Partial Observations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a computationally efficient algorithm, based on a scalable dynamic message-passing approach, which is able to learn parameters of the effective spreading model given only limited information on the activation times of nodes in the network. |
Mateusz Wilinski; Andrey Lokhov; |
1019 | Leveraging Language to Learn Program Abstractions and Search Heuristics Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce LAPS (Language for Abstraction and Program Search), a technique for using natural language annotations to guide joint learning of libraries and neurally-guided search models for synthesis. |
Catherine Wong; Kevin M Ellis; Joshua Tenenbaum; Jacob Andreas; |
1020 | Leveraging Sparse Linear Layers for Debuggable Deep Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. |
Eric Wong; Shibani Santurkar; Aleksander Madry; |
1021 | Learning Neural Network Subspaces Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks. |
Mitchell Wortsman; Maxwell C Horton; Carlos Guestrin; Ali Farhadi; Mohammad Rastegari; |
1022 | Conjugate Energy-Based Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose conjugate energy-based models (CEBMs), a new class of energy-based models that define a joint density over data and latent variables. |
Hao Wu; Babak Esmaeili; Michael Wick; Jean-Baptiste Tristan; Jan-Willem Van De Meent; |
1023 | Making Paper Reviewing Robust to Bid Manipulation Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the efficacy of such bid manipulation attacks and find that, indeed, they can jeopardize the integrity of the review process. |
Ruihan Wu; Chuan Guo; Felix Wu; Rahul Kidambi; Laurens Van Der Maaten; Kilian Weinberger; |
1024 | LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We specifically design these tasks to be synthetic and devoid of mathematical knowledge to ensure that only the fundamental reasoning biases can be learned from these tasks. This defines a new pre-training methodology called LIME (Learning Inductive bias for Mathematical rEasoning). |
Yuhuai Wu; Markus N Rabe; Wenda Li; Jimmy Ba; Roger B Grosse; Christian Szegedy; |
1025 | ChaCha for Online AutoML Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose the ChaCha (Champion-Challengers) algorithm for making an online choice of hyperparameters in online learning settings. |
Qingyun Wu; Chi Wang; John Langford; Paul Mineiro; Marco Rossi; |
1026 | Temporally Correlated Task Scheduling for Sequence Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce a learnable scheduler to sequence learning, which can adaptively select auxiliary tasks for training depending on the model status and the current training data. |
Xueqing Wu; Lewen Wang; Yingce Xia; Weiqing Liu; Lijun Wu; Shufang Xie; Tao Qin; Tie-Yan Liu; |
1027 | Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To give an affirmative answer, in this paper, we propose a framework called \emph{Class2Simi}: it transforms data points with noisy \emph{class labels} to data pairs with noisy \emph{similarity labels}, where a similarity label denotes whether a pair shares the class label or not. |
Songhua Wu; Xiaobo Xia; Tongliang Liu; Bo Han; Mingming Gong; Nannan Wang; Haifeng Liu; Gang Niu; |
1028 | On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: When the total number of corrupted episodes is known, we propose an algorithm, Corruption Robust Monotonic Value Propagation (\textsf{CR-MVP}), which achieves a regret bound of $\tilde{O}\left(\left(\sqrt{SAK}+S^2A+CSA\right)\mathrm{polylog}(H)\right)$, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $K$ is the number of episodes, and $C$ is the corruption level. |
Tianhao Wu; Yunchang Yang; Simon Du; Liwei Wang; |
1029 | Generative Video Transformer: Can Objects Be The Words? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose the ObjectCentric Video Transformer (OCVT) which utilizes an object-centric approach for decomposing scenes into tokens suitable for use in a generative video transformer. |
Yi-Fu Wu; Jaesik Yoon; Sungjin Ahn; |
1030 | Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Uncertainty Weighted Actor-Critic (UWAC), an algorithm that detects OOD state-action pairs and down-weights their contribution in the training objectives accordingly. |
Yue Wu; Shuangfei Zhai; Nitish Srivastava; Joshua M Susskind; Jian Zhang; Ruslan Salakhutdinov; Hanlin Goh; |
1031 | Towards Open-World Recommendation: An Inductive Model-based Collaborative Filtering Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an inductive collaborative filtering framework that contains two representation models. |
Qitian Wu; Hengrui Zhang; Xiaofeng Gao; Junchi Yan; Hongyuan Zha; |
1032 | Data-efficient Hindsight Off-policy Option Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce Hindsight Off-policy Options (HO2), a data-efficient option learning algorithm. |
Markus Wulfmeier; Dushyant Rao; Roland Hafner; Thomas Lampe; Abbas Abdolmaleki; Tim Hertweck; Michael Neunert; Dhruva Tirumala; Noah Siegel; Nicolas Heess; Martin Riedmiller; |
1033 | A Bit More Bayesian: Domain-Invariant Learning with Uncertainty Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address both challenges with a probabilistic framework based on variational Bayesian inference, by incorporating uncertainty into neural network weights. |
Zehao Xiao; Jiayi Shen; Xiantong Zhen; Ling Shao; Cees Snoek; |
1034 | On The Optimality of Batch Policy Optimization Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Therefore, to establish a framework for distinguishing algorithms, we introduce a new weighted-minimax criterion that considers the inherent difficulty of optimal value prediction. |
Chenjun Xiao; Yifan Wu; Jincheng Mei; Bo Dai; Tor Lattimore; Lihong Li; Csaba Szepesvari; Dale Schuurmans; |
1035 | CRFL: Certifiably Robust Federated Learning Against Backdoor Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper provides the first general framework, Certifiably Robust Federated Learning (CRFL), to train certifiably robust FL models against backdoors. |
Chulin Xie; Minghao Chen; Pin-Yu Chen; Bo Li; |
1036 | RNNRepair: Automatic RNN Repair Via Model-based Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a lightweight model-based approach (RNNRepair) to help understand and repair incorrect behaviors of an RNN. |
Xiaofei Xie; Wenbo Guo; Lei Ma; Wei Le; Jian Wang; Lingjun Zhou; Yang Liu; Xinyu Xing; |
1037 | Deep Reinforcement Learning Amidst Continual Structured Non-Stationarity Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we formalize this problem setting, and draw upon ideas from the online learning and probabilistic inference literature to derive an off-policy RL algorithm that can reason about and tackle such lifelong non-stationarity. |
Annie Xie; James Harrison; Chelsea Finn; |
1038 | Batch Value-function Approximation with Only Realizability Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We make progress in a long-standing problem of batch reinforcement learning (RL): learning Q* from an exploratory and polynomial-sized dataset, using a realizable and otherwise arbitrary function class. |
Tengyang Xie; Nan Jiang; |
1039 | Interaction-Grounded Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose \emph{Interaction-Grounded Learning} for this novel setting, in which a learner’s goal is to interact with the environment with no grounding or explicit reward to optimize its policies. |
Tengyang Xie; John Langford; Paul Mineiro; Ida Momennejad; |
1040 | Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We focus on prediction problems with structured outputs that are subject to output validity constraints, e.g. pseudocode-to-code translation where the code must compile. |
Sang Michael Xie; Tengyu Ma; Percy Liang; |
1041 | Learning While Playing in Mean-Field Games: Convergence and Optimality Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To bridge such a gap, we propose a fictitious play algorithm, which alternatively updates the policy (learning) and the mean-field state (playing) by one step of policy optimization and gradient descent, respectively. |
Qiaomin Xie; Zhuoran Yang; Zhaoran Wang; Andreea Minca; |
1042 | Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For simulating SGN at low computational costs and without changing the learning rate or batch size, we propose the Positive-Negative Momentum (PNM) approach that is a powerful alternative to conventional Momentum in classic optimizers. |
Zeke Xie; Li Yuan; Zhanxing Zhu; Masashi Sugiyama; |
1043 | A Hybrid Variance-Reduced Method for Decentralized Stochastic Non-Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this context, we propose a novel single-loop decentralized hybrid variance-reduced stochastic gradient method, called GT-HSGD, that outperforms the existing approaches in terms of both the oracle complexity and practical implementation. |
Ran Xin; Usman Khan; Soummya Kar; |
1044 | Explore Visual Concept Formation for Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Inspired by this, we propose a learning strategy of visual concept formation (LSOVCF) based on the ConvNet, in which the two intertwined parts of concept formation, i.e. feature extraction and concept description, are learned together. |
Shengzhou Xiong; Yihua Tan; Guoyou Wang; |
1045 | CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In contrast, we propose a primal approach, called constraint-rectified policy optimization (CRPO), which updates the policy alternatingly between objective improvement and constraint satisfaction. |
Tengyu Xu; Yingbin Liang; Guanghui Lan; |
1046 | To Be Robust or to Be Fair: Towards Fairness in Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we empirically and theoretically show that this phenomenon can generally happen under adversarial training algorithms which minimize DNN models’ robust errors. |
Han Xu; Xiaorui Liu; Yaxin Li; Anil Jain; Jiliang Tang; |
1047 | Interpretable Stein Goodness-of-fit Tests on Riemannian Manifold Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we develop goodness-of-fit testing and interpretable model criticism methods for general distributions on Riemannian manifolds, including those with an intractable normalization constant. |
Wenkai Xu; Takeru Matsuda; |
1048 | Rethinking Neural Vs. Matrix-Factorization Collaborative Filtering: The Theoretical Perspectives Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address the comparison rigorously by answering the following questions: 1. what is the limiting expressivity of each model; 2. under the practical gradient descent, to which solution does each optimization path converge; 3. how would the models generalize under the inductive and transductive learning setting. |
Da Xu; Chuanwei Ruan; Evren Korpeoglu; Sushant Kumar; Kannan Achan; |
1049 | Dash: Semi-Supervised Learning with Dynamic Thresholding Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we develop a simple yet powerful framework, whose key idea is to select a subset of training examples from the unlabeled data when performing existing SSL methods so that only the unlabeled examples with pseudo labels related to the labeled data will be used to train models. |
Yi Xu; Lei Shang; Jinxing Ye; Qi Qian; Yu-Feng Li; Baigui Sun; Hao Li; Rong Jin; |
1050 | An End-to-End Framework for Molecular Conformation Generation Via Bilevel Programming Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an end-to-end solution for molecular conformation prediction called ConfVAE based on the conditional variational autoencoder framework. |
Minkai Xu; Wujie Wang; Shitong Luo; Chence Shi; Yoshua Bengio; Rafael Gomez-Bombarelli; Jian Tang; |
1051 | Self-supervised Graph-level Representation Learning with Local and Global Structure Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a unified framework called Local-instance and Global-semantic Learning (GraphLoG) for self-supervised whole-graph representation learning. |
Minghao Xu; Hang Wang; Bingbing Ni; Hongyu Guo; Jian Tang; |
1052 | Conformal Prediction Interval for Dynamic Time-series Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop a method to construct distribution-free prediction intervals for dynamic time-series, called \texttt{EnbPI}, that wraps around any bootstrap ensemble estimator to construct sequential prediction intervals. |
Chen Xu; Yao Xie; |
1053 | Learner-Private Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study how to optimally obfuscate the learner’s queries in convex optimization with first-order feedback, so that their learned optimal value is provably difficult to estimate for the eavesdropping adversary. |
Jiaming Xu; Kuang Xu; Dana Yang; |
1054 | Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP, which can take advantage of learned nuisance functions to reduce estimation errors. |
Tengyu Xu; Zhuoran Yang; Zhaoran Wang; Yingbin Liang; |
1055 | Optimization of Graph Neural Networks: Implicit Acceleration By Skip Connections and More Depth Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. |
Keyulu Xu; Mozhi Zhang; Stefanie Jegelka; Kenji Kawaguchi; |
1056 | Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel group-sparse penalty that exploits this sparsity to perform transfer learning when there is very little text data available in the target domain—e.g., a single article of text. |
Kan Xu; Xuanyi Zhao; Hamsa Bastani; Osbert Bastani; |
1057 | KNAS: Green Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: According to this hypothesis, we propose a new kernel based architecture search approach KNAS. |
Jingjing Xu; Liang Zhao; Junyang Lin; Rundong Gao; Xu Sun; Hongxia Yang; |
1058 | Structured Convolutional Kernel Networks for Airline Crew Scheduling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by the needs from an airline crew scheduling application, we introduce structured convolutional kernel networks (Struct-CKN), which combine CKNs from Mairal et al. (2014) in a structured prediction framework that supports constraints on the outputs. |
Yassine Yaakoubi; Francois Soumis; Simon Lacoste-Julien; |
1059 | Mediated Uncoupled Learning: Learning Functions Without Direct Input-output Correspondences Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider the task of predicting $Y$ from $X$ when we have no paired data of them, but we have two separate, independent datasets of $X$ and $Y$ each observed with some mediating variable $U$, that is, we have two datasets $S_X = \{(X_i, U_i)\}$ and $S_Y = \{(U'_j, Y'_j)\}$. |
Ikko Yamane; Junya Honda; Florian Yger; Masashi Sugiyama; |
1060 | EL-Attention: Memory Efficient Lossless Attention for Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose memory-efficient lossless attention (called EL-attention) to address this issue. |
Yu Yan; Jiusheng Chen; Weizhen Qi; Nikhil Bhendawade; Yeyun Gong; Nan Duan; Ruofei Zhang; |
1061 | Link Prediction with Persistent Homology: An Interactive View Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel topological approach to characterize interactions between two nodes. |
Zuoyu Yan; Tengfei Ma; Liangcai Gao; Zhi Tang; Chao Chen; |
1062 | CATE: Computation-aware Neural Architecture Encoding with Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we introduce a Computation-Aware Transformer-based Encoding method called CATE. |
Shen Yan; Kaiqiang Song; Fei Liu; Mi Zhang; |
1063 | On Perceptual Lossy Compression: The Cost of Perceptual Reconstruction and An Optimal Training Framework Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper provides nontrivial results theoretically revealing that, 1) the cost of achieving perfect perception quality is exactly a doubling of the lowest achievable MSE distortion, 2) an optimal encoder for the classic rate-distortion problem is also optimal for the perceptual compression problem, 3) distortion loss is unnecessary for training a perceptual decoder. |
Zeyu Yan; Fei Wen; Rendong Ying; Chao Ma; Peilin Liu; |
1064 | CIFS: Improving Adversarial Robustness of CNNs Via Channel-wise Importance-based Feature Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To examine this hypothesis, we introduce a novel mechanism, \textit{i.e.}, \underline{C}hannel-wise \underline{I}mportance-based \underline{F}eature \underline{S}election (CIFS). |
Hanshu Yan; Jingfeng Zhang; Gang Niu; Jiashi Feng; Vincent Tan; Masashi Sugiyama; |
1065 | Exact Gap Between Generalization Error and Uniform Convergence in Random Feature Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To better understand this gap, we study the uniform convergence in the nonlinear random feature model and perform a precise theoretical analysis on how uniform convergence depends on the sample size and the number of parameters. |
Zitong Yang; Yu Bai; Song Mei; |
1066 | Learning Optimal Auctions with Correlated Valuations from Samples Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we investigate the robustness of the optimal auction with correlated valuations via sample complexity analysis. |
Chunxue Yang; Xiaohui Bei; |
1067 | Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose simple modifications to the standard parametrization to allow for feature learning in the limit. |
Greg Yang; Edward J. Hu; |
1068 | LARNet: Lie Algebra Residual Network for Face Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel method with Lie algebra theory to explore how face rotation in the 3D space affects the deep feature generation process of convolutional neural networks (CNNs). |
Xiaolong Yang; Xiaohong Jia; Dihong Gong; Dong-Ming Yan; Zhifeng Li; Wei Liu; |
1069 | BASGD: Buffered Asynchronous SGD for Byzantine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel method, called buffered asynchronous stochastic gradient descent (BASGD), for ABL. |
Yi-Rui Yang; Wu-Jun Li; |
1070 | Tensor Programs IIb: Architectural Universality Of Neural Tangent Kernel Training Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To achieve this result, we apply the Tensor Programs technique: Write the entire SGD dynamics inside a Tensor Program and analyze it via the Master Theorem. |
Greg Yang; Etai Littwin; |
1071 | Graph Neural Networks Inspired By Classical Iterative Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To at least partially address these issues within a simple transparent framework, we consider a new family of GNN layers designed to mimic and integrate the update rules of two classical iterative algorithms, namely, proximal gradient descent and iterative reweighted least squares (IRLS). |
Yongyi Yang; Tang Liu; Yangkun Wang; Jinjing Zhou; Quan Gan; Zhewei Wei; Zheng Zhang; Zengfeng Huang; David Wipf; |
1072 | Representation Matters: Offline Pretraining for Sequential Decision Making Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider a slightly different approach to incorporating offline data into sequential decision-making. |
Mengjiao Yang; Ofir Nachum; |
1073 | Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In order to safely learn from baseline policies, we propose an iterative policy optimization algorithm that alternates between maximizing expected return on the task, minimizing distance to the baseline policy, and projecting the policy onto the constraint-satisfying set. |
Tsung-Yen Yang; Justinian Rosca; Karthik Narasimhan; Peter J Ramadge; |
1074 | Voice2Series: Reprogramming Acoustic Models for Time Series Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by the advances in deep speech processing models and the fact that voice data are univariate temporal signals, in this paper we propose Voice2Series (V2S), a novel end-to-end approach that reprograms acoustic models for time series classification, through input transformation learning and output label mapping. |
Chao-Han Huck Yang; Yun-Yun Tsai; Pin-Yu Chen; |
1075 | When All We Need Is A Piece of The Pie: A Generic Framework for Optimizing Two-way Partial AUC Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this issue, we propose a generic framework to construct surrogate optimization problems, which supports efficient end-to-end training with deep-learning. |
Zhiyong Yang; Qianqian Xu; Shilong Bao; Yuan He; Xiaochun Cao; Qingming Huang; |
1076 | Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel regression loss based on Gaussian Wasserstein distance as a fundamental approach to solve the problem. |
Xue Yang; Junchi Yan; Qi Ming; Wentao Wang; Xiaopeng Zhang; Qi Tian; |
1077 | Delving Into Deep Imbalanced Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by the intrinsic difference between categorical and continuous label space, we propose distribution smoothing for both labels and features, which explicitly acknowledges the effects of nearby targets, and calibrates both label and learned feature distributions. |
Yuzhe Yang; Kaiwen Zha; Yingcong Chen; Hao Wang; Dina Katabi; |
1078 | Backpropagated Neighborhood Aggregation for Accurate Training of Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel BP-like method, called neighborhood aggregation (NA), which computes accurate error gradients guiding weight updates that may lead to discontinuous modifications of firing activities. |
Yukun Yang; Wenrui Zhang; Peng Li; |
1079 | SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a conceptually simple but very effective attention module for Convolutional Neural Networks (ConvNets). |
Lingxiao Yang; Ru-Yuan Zhang; Lida Li; Xiaohua Xie; |
1080 | HAWQ-V3: Dyadic Neural Network Quantization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this, we present HAWQ-V3, a novel mixed-precision integer-only quantization framework. |
Zhewei Yao; Zhen Dong; Zhangcheng Zheng; Amir Gholami; Jiali Yu; Eric Tan; Leyuan Wang; Qijing Huang; Yida Wang; Michael Mahoney; Kurt Keutzer; |
1081 | Improving Generalization in Meta-learning Via Task Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Concretely, we propose two task augmentation methods, including MetaMix and Channel Shuffle. |
Huaxiu Yao; Long-Kai Huang; Linjun Zhang; Ying Wei; Li Tian; James Zou; Junzhou Huang; Zhenhui Li; |
1082 | Deep Learning for Functional Data Analysis with Adaptive Basis Layers Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce neural networks that employ a new Basis Layer whose hidden units are each basis functions themselves implemented as a micro neural network. |
Junwen Yao; Jonas Mueller; Jane-Ling Wang; |
1083 | Addressing Catastrophic Forgetting in Few-Shot Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We demonstrate that the popular gradient-based model-agnostic meta-learning algorithm (MAML) indeed suffers from catastrophic forgetting and introduce a Bayesian online meta-learning framework that tackles this problem. |
Pauching Yap; Hippolyt Ritter; David Barber; |
1084 | Reinforcement Learning with Prototypical Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these challenges we propose Proto-RL, a self-supervised framework that ties representation learning with exploration through prototypical representations. |
Denis Yarats; Rob Fergus; Alessandro Lazaric; Lerrel Pinto; |
1085 | Elementary Superexpressive Activations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We call a finite family of activation functions \emph{superexpressive} if any multivariate continuous function can be approximated by a neural network that uses these activations and has a fixed architecture only depending on the number of input variables (i.e., to achieve any accuracy we only need to adjust the weights, without increasing the number of neurons). |
Dmitry Yarotsky; |
1086 | Break-It-Fix-It: Unsupervised Learning for Program Repair Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Existing works create training data consisting of (bad, good) pairs by corrupting good examples using heuristics (e.g., dropping tokens). To bridge this gap, we propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas: (i) we use the critic to check a fixer’s output on real bad inputs and add good (fixed) outputs to the training data, and (ii) we train a breaker to generate realistic bad code from good code. |
Michihiro Yasunaga; Percy Liang; |
1087 | Improving Gradient Regularization Using Complex-Valued Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: A form of complex-valued neural network (CVNN) is proposed to improve the performance of gradient regularization on classification tasks of real-valued input in adversarial settings. |
Eric C Yeats; Yiran Chen; Hai Li; |
1088 | Neighborhood Contrastive Learning Applied to Online Patient Monitoring Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we overcome this limitation by supplementing time-series data augmentation techniques with a novel contrastive learning objective which we call neighborhood contrastive learning (NCL). |
Hugo Yèche; Gideon Dresdner; Francesco Locatello; Matthias Hüser; Gunnar Rätsch; |
1089 | From Local Structures to Size Generalization in Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we identify an important type of data where generalization from small to large graphs is challenging: graph distributions for which the local structure depends on the graph size. |
Gilad Yehudai; Ethan Fetaya; Eli Meirom; Gal Chechik; Haggai Maron; |
1090 | Improved OOD Generalization Via Adversarial Training and Pretraing Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, after defining OOD generalization by Wasserstein distance, we theoretically justify that a model robust to input perturbation also generalizes well on OOD data. |
Mingyang Yi; Lu Hou; Jiacheng Sun; Lifeng Shang; Xin Jiang; Qun Liu; Zhiming Ma; |
1091 | Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper considers online convex optimization with long term constraints, where constraints can be violated in intermediate rounds, but need to be satisfied in the long run. |
Xinlei Yi; Xiuxian Li; Tao Yang; Lihua Xie; Tianyou Chai; Karl Johansson; |
1092 | Continuous-time Model-based Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To avoid time-discretization approximation of the underlying process, we propose a continuous-time MBRL framework based on a novel actor-critic method. |
Cagatay Yildiz; Markus Heinonen; Harri Lähdesmäki; |
1093 | Distributed Nyström Kernel Learning with Communications Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the statistical performance for distributed kernel ridge regression with Nyström (DKRR-NY) and with Nyström and iterative solvers (DKRR-NY-PCG) and successfully derive the optimal learning rates, which can improve the ranges of the number of local processors $p$ to the optimal in existing state-of-the-art bounds. |
Rong Yin; Weiping Wang; Dan Meng; |
1094 | Path Planning Using Neural A* Search Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we reformulate a canonical A* search algorithm to be differentiable and couple it with a convolutional encoder to form an end-to-end trainable neural network planner. |
Ryo Yonetani; Tatsunori Taniai; Mohammadamin Barekatain; Mai Nishimura; Asako Kanezaki; |
1095 | SinIR: Efficient General Image Manipulation with Single Image Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose SinIR, an efficient reconstruction-based framework trained on a single natural image for general image manipulation, including super-resolution, editing, harmonization, paint-to-image, photo-realistic style transfer, and artistic style transfer. |
Jihyeong Yoo; Qifeng Chen; |
1096 | Conditional Temporal Neural Processes with Covariance Loss Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a novel loss function, Covariance Loss, which is conceptually equivalent to conditional neural processes and has a form of regularization, making it applicable to many kinds of neural networks. |
Boseon Yoo; Jiwoo Lee; Janghoon Ju; Seijun Chung; Soyeon Kim; Jaesik Choi; |
1097 | Adversarial Purification with Score-based Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel adversarial purification method based on an EBM trained with DSM. |
Jongmin Yoon; Sung Ju Hwang; Juho Lee; |
1098 | Federated Continual Learning with Weighted Inter-client Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To resolve these issues, we propose a novel federated continual learning framework, Federated Weighted Inter-client Transfer (FedWeIT), which decomposes the network weights into global federated parameters and sparse task-specific parameters, and each client receives selective knowledge from other clients by taking a weighted combination of their task-specific parameters. |
Jaehong Yoon; Wonyong Jeong; Giwoong Lee; Eunho Yang; Sung Ju Hwang; |
1099 | Autoencoding Under Normalization Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose the Normalized Autoencoder (NAE), a normalized probabilistic model constructed from an autoencoder. |
Sangwoong Yoon; Yung-Kyun Noh; Frank Park; |
1100 | Accelerated Algorithms for Smooth Convex-Concave Minimax Problems with O(1/k^2) Rate on Squared Gradient Norm Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study the computational complexity of reducing the squared gradient magnitude for smooth minimax optimization problems. |
Taeho Yoon; Ernest K Ryu; |
1101 | Lower-Bounded Proper Losses for Weakly Supervised Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper discusses the problem of weakly supervised classification, in which instances are given weak labels that are produced by some label-corruption process. |
Shuhei M Yoshida; Takashi Takenouchi; Masashi Sugiyama; |
1102 | Graph Contrastive Learning Automated Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Aiming to fill in this crucial gap, this paper proposes a unified bi-level optimization framework to automatically, adaptively and dynamically select data augmentations when performing GraphCL on specific graph data. |
Yuning You; Tianlong Chen; Yang Shen; Zhangyang Wang; |
1103 | LogME: Practical Assessment of Pre-trained Models for Transfer Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In pursuit of a practical assessment method, we propose to estimate the maximum value of label evidence given features extracted by pre-trained models. |
Kaichao You; Yong Liu; Jianmin Wang; Mingsheng Long; |
1104 | Exponentially Many Local Minima in Quantum Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We conduct a quantitative investigation on the landscape of loss functions of QNNs and identify a class of simple yet extremely hard QNN instances for training. |
Xuchen You; Xiaodi Wu; |
1105 | DAGs with No Curl: An Efficient DAG Structure Learning Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To further improve efficiency, we propose a novel learning framework to model and learn the weighted adjacency matrices in the DAG space directly. |
Yue Yu; Tian Gao; Naiyu Yin; Qiang Ji; |
1106 | Provably Efficient Algorithms for Multi-Objective Competitive RL Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our results extend Blackwell’s approachability theorem (Blackwell, 1956) to tabular RL, where strategic exploration becomes essential. |
Tiancheng Yu; Yi Tian; Jingzhao Zhang; Suvrit Sra; |
1107 | Whittle Networks: A Deep Likelihood Model for Time Series Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose the first probabilistic circuits (PCs) approach for modeling the joint distribution of multivariate time series, called Whittle sum-product networks (WSPNs). |
Zhongjie Yu; Fabrizio G Ventola; Kristian Kersting; |
1108 | Deep Latent Graph Matching Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this, we propose to learn the (distribution of) latent topology, which can better support the downstream GM task. |
Tianshu Yu; Runzhong Wang; Junchi Yan; Baoxin Li; |
1109 | Learning Generalized Intersection Over Union for Dense Pixelwise Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose PixIoU, a generalized IoU for pixelwise prediction that is sensitive to the distance for non-overlapping cases and the locations in prediction. |
Jiaqian Yu; Jingtao Xu; Yiwei Chen; Weiming Li; Qiang Wang; Byungin Yoo; Jae-Joon Han; |
1110 | Large Scale Private Learning Via Low-rank Reparametrization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a reparametrization scheme to address the challenges of applying differentially private SGD on large neural networks, which are 1) the huge memory cost of storing individual gradients and 2) the added noise suffering from notorious dimensional dependence. |
Da Yu; Huishuai Zhang; Wei Chen; Jian Yin; Tie-Yan Liu; |
1111 | Federated Deep AUC Maximization for Heterogeneous Data with A Constant Communication Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose improved FDAM algorithms for heterogeneous data by solving the popular non-convex strongly-concave min-max formulation of DAM in a distributed fashion, which can also be applied to a class of non-convex strongly-concave min-max problems. |
Zhuoning Yuan; Zhishuai Guo; Yi Xu; Yiming Ying; Tianbao Yang; |
1112 | Neural Tangent Generalization Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the generalization attacks against DNNs, where an attacker aims to slightly modify training data in order to spoil the training process such that a trained network lacks generalizability. |
Chia-Hung Yuan; Shan-Hung Wu; |
1113 | On Explainability of Graph Neural Networks Via Subgraph Explorations Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. |
Hao Yuan; Haiyang Yu; Jie Wang; Kang Li; Shuiwang Ji; |
1114 | Federated Composite Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the Federated Composite Optimization (FCO) problem, in which the loss function contains a non-smooth regularizer. |
Honglin Yuan; Manzil Zaheer; Sashank Reddi; |
1115 | Three Operator Splitting with A Nonconvex Loss Function Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of minimizing the sum of three functions, one of which is nonconvex but differentiable, and the other two are convex but possibly nondifferentiable. |
Alp Yurtsever; Varun Mangalick; Suvrit Sra; |
1116 | Grey-box Extraction of Natural Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present algebraic and hybrid algebraic/learning-based attacks on large-scale natural language models. |
Santiago Zanella-Beguelin; Shruti Tople; Andrew Paverd; Boris Köpf; |
1117 | Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL Can Be Exponentially Harder Than Online RL Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For both tasks we derive exponential information-theoretic lower bounds in discounted infinite horizon MDPs with a linear function representation for the action value function even if 1) realizability holds, 2) the batch algorithm observes the exact reward and transition functions, and 3) the batch algorithm is given the best a priori data distribution for the problem class. |
Andrea Zanette; |
1118 | Learning Binary Decision Trees By Argmin Differentiation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose to learn discrete parameters (i.e., for tree traversals and node pruning) and continuous parameters (i.e., for tree split functions and prediction functions) simultaneously using argmin differentiation. |
Valentina Zantedeschi; Matt Kusner; Vlad Niculae; |
1119 | Barlow Twins: Self-Supervised Learning Via Redundancy Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an objective function that naturally avoids collapse by measuring the cross-correlation matrix between the outputs of two identical networks fed with distorted versions of a sample, and making it as close to the identity matrix as possible. |
Jure Zbontar; Li Jing; Ishan Misra; Yann LeCun; Stephane Deny; |
1120 | You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that a Bernoulli sampling attention mechanism based on Locality Sensitive Hashing (LSH) decreases the quadratic complexity of such models to linear. |
Zhanpeng Zeng; Yunyang Xiong; Sathya Ravi; Shailesh Acharya; Glenn M Fung; Vikas Singh; |
1121 | DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a conceptually simple yet effective DouDizhu AI system, namely DouZero, which enhances traditional Monte-Carlo methods with deep neural networks, action encoding, and parallel actors. |
Daochen Zha; Jingru Xie; Wenye Ma; Sheng Zhang; Xiangru Lian; Xia Hu; Ji Liu; |
1122 | DORO: Distributional and Outlier Robust Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To resolve this issue, we propose the framework of DORO, for Distributional and Outlier Robust Optimization. |
Runtian Zhai; Chen Dan; Zico Kolter; Pradeep Ravikumar; |
1123 | Can Subnetwork Structure Be The Key to Out-of-Distribution Generalization? Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we use a functional modular probing method to analyze deep model structures under OOD setting. |
Dinghuai Zhang; Kartik Ahuja; Yilun Xu; Yisen Wang; Aaron Courville; |
1124 | Towards Certifying L-infinity Robustness Using Neural Networks with L-inf-dist Neurons Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we seek a new approach to developing a theoretically principled neural network that inherently resists $\ell_\infty$ perturbations. |
Bohang Zhang; Tianle Cai; Zhou Lu; Di He; Liwei Wang; |
1125 | Efficient Lottery Ticket Finding: Less Data Is More Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper explores a new perspective on finding lottery tickets more efficiently, by doing so only with a specially selected subset of data, called Pruning-Aware Critical set (PrAC set), rather than using the full training set. |
Zhenyu Zhang; Xuxi Chen; Tianlong Chen; Zhangyang Wang; |
1126 | Robust Policy Gradient Against Strong Data Corruption Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the problem of robust reinforcement learning under adversarial corruption on both rewards and transitions. |
Xuezhou Zhang; Yiding Chen; Xiaojin Zhu; Wen Sun; |
1127 | Near Optimal Reward-Free Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the reward-free reinforcement learning framework, which is particularly suitable for batch reinforcement learning and scenarios where one needs policies for multiple reward functions. |
Zihan Zhang; Simon Du; Xiangyang Ji; |
1128 | Bayesian Attention Belief Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces Bayesian attention belief networks, which construct a decoder network by modeling unnormalized attention weights with a hierarchy of gamma distributions, and an encoder network by stacking Weibull distributions with a deterministic-upward-stochastic-downward structure to approximate the posterior. |
Shujian Zhang; Xinjie Fan; Bo Chen; Mingyuan Zhou; |
1129 | Understanding Failures in Out-of-Distribution Detection with Deep Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Deep generative models (DGMs) seem a natural fit for detecting out-of-distribution (OOD) inputs, but such models have been shown to assign higher probabilities or densities to OOD images than images from the training distribution. In this work, we explain why this behavior should be attributed to model misestimation. |
Lily Zhang; Mark Goldstein; Rajesh Ranganath; |
1130 | Poolingformer: Long Document Modeling with Pooling Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce a two-level attention schema, Poolingformer, for long document modeling. |
Hang Zhang; Yeyun Gong; Yelong Shen; Weisheng Li; Jiancheng Lv; Nan Duan; Weizhu Chen; |
1131 | Probabilistic Generating Circuits Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we explore their use as a tractable probabilistic model, and propose probabilistic generating circuits (PGCs) for their efficient representation. |
Honghua Zhang; Brendan Juba; Guy Van den Broeck; |
1132 | PAPRIKA: Private Online False Discovery Rate Control Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study False Discovery Rate (FDR) control in multiple hypothesis testing under the constraint of differential privacy for the sample. |
Wanrong Zhang; Gautam Kamath; Rachel Cummings; |
1133 | Learning from Noisy Labels with No Change to The Training Process Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that this is really unnecessary: one can simply perform class probability estimation (CPE) on the noisy examples, e.g. using a standard (multiclass) logistic regression algorithm, and then apply noise-correction only in the final prediction step. |
Mingyuan Zhang; Jane Lee; Shivani Agarwal; |
1134 | Progressive-Scale Boundary Blackbox Attack Via Projective Gradient Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that such efficiency highly depends on the scale at which the attack is applied, and attacking at the optimal scale significantly improves the efficiency. |
Jiawei Zhang; Linyi Li; Huichen Li; Xiaolu Zhang; Shuang Yang; Bo Li; |
1135 | FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a novel multi-agent actor-critic method, FOP, which can factorize the optimal joint policy induced by maximum-entropy multi-agent reinforcement learning (MARL) into individual policies. |
Tianhao Zhang; Yueheng Li; Chen Wang; Guangming Xie; Zongqing Lu; |
1136 | Learning Noise Transition Matrix from Only Noisy Labels Via Total Variation Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a theoretically grounded method that can estimate the noise transition matrix and learn a classifier simultaneously, without relying on the error-prone noisy class-posterior estimation. |
Yivan Zhang; Gang Niu; Masashi Sugiyama; |
1137 | Quantile Bandits for Best Arms Identification Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by risk-averse decision-making problems, our goal is to identify a set of $m$ arms with the highest $\tau$-quantile values within a fixed budget. |
Mengyan Zhang; Cheng Soon Ong; |
1138 | Towards Better Robust Generalization with Shift Consistency Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Towards better robust generalization, we propose a new regularization method, shift consistency regularization (SCR), to steer the same-class latent features of both natural and adversarial data into a common direction during adversarial training. |
Shufei Zhang; Zhuang Qian; Kaizhu Huang; Qiufeng Wang; Rui Zhang; Xinping Yi; |
1139 | On-Policy Deep Reinforcement Learning for The Average-Reward Criterion Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop theory and algorithms for average-reward on-policy Reinforcement Learning (RL). |
Yiming Zhang; Keith W Ross; |
1140 | Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Unlike prior art that carefully tunes these values, we present a fully differentiable approach to learn all of them, named Differentiable Dynamic Quantization (DDQ), which has several benefits. |
Zhaoyang Zhang; Wenqi Shao; Jinwei Gu; Xiaogang Wang; Ping Luo; |
1141 | IDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we tackle the hypergradient computation in DARTS based on the implicit function theorem, making it depend only on the obtained solution to the inner-loop optimization and agnostic to the optimization path. |
Miao Zhang; Steven W. Su; Shirui Pan; Xiaojun Chang; Ehsan M Abbasnejad; Reza Haffari; |
1142 | Deep Coherent Exploration for Continuous Control Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce deep coherent exploration, a general and scalable exploration framework for deep RL algorithms for continuous control, that generalizes step-based and trajectory-based exploration. |
Yijie Zhang; Herke Van Hoof; |
1143 | Average-Reward Off-Policy Policy Evaluation with Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address the deadly triad, we propose two novel algorithms, reproducing the celebrated success of Gradient TD algorithms in the average-reward setting. |
Shangtong Zhang; Yi Wan; Richard S Sutton; Shimon Whiteson; |
1144 | Matrix Sketching for Secure Collaborative Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a practical defense which we call Double-Blind Collaborative Learning (DBCL). |
Mengjiao Zhang; Shusen Wang; |
1145 | MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this challenge, we explicitly model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning, and introduce a novel empowerment-driven exploration objective, which aims to maximize information gain for task identification. |
Jin Zhang; Jianhao Wang; Hao Hu; Tong Chen; Yingfeng Chen; Changjie Fan; Chongjie Zhang; |
1146 | World Model As A Graph: Learning Latent Landmarks for Planning Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose to learn graph-structured world models composed of sparse, multi-step transitions. |
Lunjun Zhang; Ge Yang; Bradly C Stadie; |
1147 | Breaking The Deadly Triad with A Target Network Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate the target network as a tool for breaking the deadly triad, providing theoretical support for the conventional wisdom that a target network stabilizes training. |
Shangtong Zhang; Hengshuai Yao; Shimon Whiteson; |
1148 | Multiscale Invertible Generative Networks for High-Dimensional Bayesian Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a Multiscale Invertible Generative Network (MsIGN) and associated training algorithm that leverages multiscale structure to solve high-dimensional Bayesian inference. |
Shumao Zhang; Pengchuan Zhang; Thomas Y Hou; |
1149 | Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study meta learning for support (i.e., the set of non-zero entries) recovery in high-dimensional precision matrix estimation where we reduce the sufficient sample complexity in a novel task with the information learned from other auxiliary tasks. |
Qian Zhang; Yilin Zheng; Jean Honorio; |
1150 | Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we consider the problem of learning an $\epsilon$-optimal policy for a discounted Markov Decision Process (MDP). |
Zihan Zhang; Yuan Zhou; Xiangyang Ji; |
1151 | Learning to Rehearse in Long Sequence Memorization Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose the Rehearsal Memory (RM) to enhance long-sequence memorization by self-supervised rehearsal with a history sampler. |
Zhu Zhang; Chang Zhou; Jianxin Ma; Zhijie Lin; Jingren Zhou; Hongxia Yang; Zhou Zhao; |
1152 | Dataset Condensation with Differentiable Siamese Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on condensing large training sets into significantly smaller synthetic sets which can be used to train deep neural networks from scratch with minimum drop in performance. |
Bo Zhao; Hakan Bilen; |
1153 | Joining Datasets Via Data Augmentation in The Label Space for Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this article, we are interested in systematic ways to join datasets that are built for similar purposes. |
Junbo Zhao; Mingfeng Ou; Linji Xue; Yunkai Cui; Sai Wu; Gang Chen; |
1154 | Calibrate Before Use: Improving Few-shot Performance of Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: GPT-3 can perform numerous tasks when provided a natural language prompt that contains a few training examples. We show that this type of few-shot learning can be unstable: the choice of prompt format, training examples, and even the order of the examples can cause accuracy to vary from near chance to near state-of-the-art. |
Zihao Zhao; Eric Wallace; Shi Feng; Dan Klein; Sameer Singh; |
1155 | Few-Shot Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose few-shot NAS that uses multiple supernetworks, called sub-supernet, each covering different regions of the search space to alleviate the undesired co-adaption. |
Yiyang Zhao; Linnan Wang; Yuandong Tian; Rodrigo Fonseca; Tian Guo; |
1156 | Expressive 1-Lipschitz Neural Networks for Robust Multiple Graph Learning Against Adversarial Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes an attack-agnostic graph-adaptive 1-Lipschitz neural network, ERNN, for improving the robustness of deep multiple graph learning while achieving remarkable expressive power. |
Xin Zhao; Zeru Zhang; Zijie Zhang; Lingfei Wu; Jiayin Jin; Yang Zhou; Ruoming Jin; Dejing Dou; Da Yan; |
1157 | Fused Acoustic and Text Encoding for Multimodal Bilingual Pretraining and Speech Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these problems, we propose a Fused Acoustic and Text Masked Language Model (FAT-MLM) which jointly learns a unified representation for both acoustic and text input from various types of corpora including parallel data for speech recognition and machine translation, and even pure speech and text data. |
Renjie Zheng; Junkun Chen; Mingbo Ma; Liang Huang; |
1158 | Two Heads Are Better Than One: Hypergraph-Enhanced Graph Reasoning for Visual Event Ratiocination Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose a novel multi-modal model, Hypergraph-Enhanced Graph Reasoning. |
Wenbo Zheng; Lan Yan; Chao Gou; Fei-Yue Wang; |
1159 | How Framelets Enhance Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a new approach for assembling graph neural networks based on framelet transforms. |
Xuebin Zheng; Bingxin Zhou; Junbin Gao; Yuguang Wang; Pietro Liò; Ming Li; Guido Montufar; |
1160 | Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider a best arm identification (BAI) problem for stochastic bandits with adversarial corruptions in the fixed-budget setting of T steps. |
Zixin Zhong; Wang Chi Cheung; Vincent Tan; |
1161 | Towards Distraction-Robust Active Visual Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this issue, we propose a mixed cooperative-competitive multi-agent game, where a target and multiple distractors form a collaborative team to play against a tracker and make it fail to follow. |
Fangwei Zhong; Peng Sun; Wenhan Luo; Tingyun Yan; Yizhou Wang; |
1162 | Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study reinforcement learning for discounted Markov Decision Processes (MDPs), where the transition kernel can be parameterized as a linear function of certain feature mapping. |
Dongruo Zhou; Jiafan He; Quanquan Gu; |
1163 | Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation, calibration, and out-of-distribution robustness with deep networks. |
Aurick Zhou; Sergey Levine; |
1164 | Optimal Estimation of High Dimensional Smooth Additive Function Based on Noisy Observations Highlight: We inherit the idea from a recent work which introduced an effective bias reduction technique through iterative bootstrap and derive a bias-reducing estimator. |
Fan Zhou; Ping Li; |
1165 | Incentivized Bandit Learning with Self-Reinforcing User Preferences Highlight: In this paper, we investigate a new multi-armed bandit (MAB) online learning model that considers real-world phenomena in many recommender systems: (i) the learning agent cannot pull the arms by itself and thus has to offer rewards to users to incentivize arm-pulling indirectly; and (ii) if users with specific arm preferences are well rewarded, they induce a "self-reinforcing" effect in the sense that they will attract more users of similar arm preferences. |
Tianchen Zhou; Jia Liu; Chaosheng Dong; Jingyuan Deng; |
1166 | Towards Defending Against Adversarial Examples Via Attack-Invariant Features Highlight: To solve this problem, in this paper, we propose to remove adversarial noise by learning generalizable invariant features across attacks which maintain semantic classification information. |
Dawei Zhou; Tongliang Liu; Bo Han; Nannan Wang; Chunlei Peng; Xinbo Gao; |
1167 | Asymmetric Loss Functions for Learning with Noisy Labels Highlight: In this work, we propose a new class of loss functions, namely asymmetric loss functions, which are robust to learning from noisy labels for arbitrary noise type. |
Xiong Zhou; Xianming Liu; Junjun Jiang; Xin Gao; Xiangyang Ji; |
1168 | Examining and Combating Spurious Features Under Distribution Shift Highlight: In this paper, we define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics. |
Chunting Zhou; Xuezhe Ma; Paul Michel; Graham Neubig; |
1169 | Sparse and Imperceptible Adversarial Attack Via A Homotopy Algorithm Highlight: In this paper, we address this challenge by proposing a homotopy algorithm, to jointly tackle the sparsity and the perturbation bound in one unified framework. |
Mingkang Zhu; Tianlong Chen; Zhangyang Wang; |
1170 | Data-Free Knowledge Distillation for Heterogeneous Federated Learning Highlight: Inspired by the prior art, we propose a data-free knowledge distillation approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcasted to users, regulating local training using the learned knowledge as an inductive bias. |
Zhuangdi Zhu; Junyuan Hong; Jiayu Zhou; |
1171 | Spectral Vertex Sparsifiers and Pair-wise Spanners Over Distributed Graphs Highlight: In this work, we design communication-efficient distributed algorithms for constructing spectral vertex sparsifiers, which closely preserve effective resistance distances on a subset of vertices of interest in the original graphs, under the well-established message passing communication model. |
Chunjiang Zhu; Qinqing Liu; Jinbo Bi; |
1172 | Few-shot Language Coordination By Modeling Theory of Mind Highlight: Drawing inspiration from the study of theory-of-mind (ToM; Premack & Woodruff (1978)), we study the effect of the speaker explicitly modeling the listener’s mental state. |
Hao Zhu; Graham Neubig; Yonatan Bisk; |
1173 | Clusterability As An Alternative to Anchor Points When Learning with Noisy Labels Highlight: Our main contribution is the discovery of an efficient estimation procedure based on a clusterability condition. |
Zhaowei Zhu; Yiwen Song; Yang Liu; |
1174 | Commutative Lie Group VAE for Disentanglement Learning Highlight: A simple model named Commutative Lie Group VAE is introduced to realize the group-based disentanglement learning. |
Xinqi Zhu; Chang Xu; Dacheng Tao; |
1175 | Accumulated Decoupled Learning with Gradient Staleness Mitigation for Convolutional Neural Networks Highlight: In this paper, we propose an accumulated decoupled learning (ADL), which includes a module-wise gradient accumulation in order to mitigate the gradient staleness. |
Huiping Zhuang; Zhenyu Weng; Fulin Luo; Kar-Ann Toh; Haizhou Li; Zhiping Lin; |
1176 | Demystifying Inductive Biases for (Beta-)VAE Based Architectures Highlight: In this work, we shed light on the inductive bias responsible for the success of VAE-based architectures. |
Dominik Zietlow; Michal Rolinek; Georg Martius; |
1177 | Recovering AES Keys with A Deep Cold Boot Attack Highlight: In this work we combine a deep error correcting code technique together with a modified SAT solver scheme in order to apply the attack to AES keys. |
Itamar Zimerman; Eliya Nachmani; Lior Wolf; |
1178 | Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning Highlight: As a solution method, we propose a novel neural network architecture, which is composed of two sub-networks specifically designed for taking into account these two aspects of fairness. |
Matthieu Zimmer; Claire Glanois; Umer Siddique; Paul Weng; |
1179 | Contrastive Learning Inverts The Data Generating Process Highlight: We here prove that feedforward models trained with objectives belonging to the commonly used InfoNCE family learn to implicitly invert the underlying generative model of the observed data. |
Roland S. Zimmermann; Yash Sharma; Steffen Schneider; Matthias Bethge; Wieland Brendel; |
1180 | Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning Highlight: To address this, we propose HyperX, which uses novel reward bonuses for meta-training to explore in approximate hyper-state space (where hyper-states represent the environment state and the agent’s task belief). |
Luisa M Zintgraf; Leo Feng; Cong Lu; Maximilian Igl; Kristian Hartikainen; Katja Hofmann; Shimon Whiteson; |
1181 | Provable Robustness of Adversarial Training for Learning Halfspaces with Noise Highlight: To the best of our knowledge, this is the first work showing that adversarial training provably yields robust classifiers in the presence of noise. |
Difan Zou; Spencer Frei; Quanquan Gu; |
1182 | On The Convergence of Hamiltonian Monte Carlo with Stochastic Gradients Highlight: In this paper, we propose a general framework for proving the convergence rate of HMC with stochastic gradient estimators, for sampling from strongly log-concave and log-smooth target distributions. |
Difan Zou; Quanquan Gu; |
1183 | A Functional Perspective on Learning Symmetric Functions with Neural Networks Highlight: In this work we treat symmetric functions (of any size) as functions over probability measures, and study the learning and representation of neural networks defined on measures. |
Aaron Zweig; Joan Bruna; |