Paper Digest: NeurIPS 2022 Highlights
To help the community quickly catch up on the work presented in this conference, the Paper Digest team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly get the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website and are behind our services, including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: NeurIPS 2022 Highlights
No. | Paper | Author(s) |
---|---|---|
1 | Not All Bits Have Equal Value: Heterogeneous Weight Precisions Via Trainable Noise Tensors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a method to directly optimize how many bits are used to represent each parameter in a network. |
Pedro Savarese; Xin Yuan; Yanjing Li; Michael Maire; |
2 | S-PIFu: Integrating Parametric Human Models with PIFu for Single-view Clothed Human Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present three novel strategies to incorporate a parametric body model into a pixel-aligned implicit model for single-view clothed human reconstruction. |
Kennard Chan; Guosheng Lin; Haiyu Zhao; Weisi Lin; |
3 | Target Alignment in Truncated Kernel Ridge Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study how the alignment between the target function and the kernel affects the performance of the KRR. |
Arash Amini; Richard Baumgartner; Dai Feng; |
4 | Uncertainty Estimation Using Riemannian Model Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we combine parametric and nonparametric methods for uncertainty estimation through a novel latent space based metric. |
Guy Tennenholtz; Shie Mannor; |
5 | Bivariate Causal Discovery for Categorical Data Via Classification with Optimal Label Permutation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel causal model for categorical data based on a new classification model, termed classification with optimal label permutation (COLP). |
Yang Ni; |
6 | Adversarial Reprogramming Revisited Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that neural networks with random weights are susceptible to adversarial reprogramming, and that in some settings training the network can cause its adversarial reprogramming to fail. |
Matthias Englert; Ranko Lazic; |
7 | Efficient and Effective Augmentation Strategy for Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an effective augmentation strategy for Adversarial Training that can be integrated with several Adversarial Training algorithms and data augmentations. |
Sravanti Addepalli; Samyak Jain; Venkatesh Babu R; |
8 | Pragmatically Learning from Pedagogical Demonstrations in Multi-Goal Environments Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Building on the pedagogy and pragmatism concepts from Developmental Psychology, we show how learning from demonstration can benefit from a Bayesian goal inference mechanism to reduce goal ambiguity and learn faster in multi-goal environments. |
Hugo Caselles-Dupré; Olivier Sigaud; Mohamed CHETOUANI; |
9 | Instance-based Learning for Knowledge Base Completion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new method for knowledge base completion (KBC): instance-based learning (IBL). |
Wanyun Cui; Xingran Chen; |
10 | On The Convergence Theory for Hessian-Free Bilevel Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper provides a novel convergence rate analysis for Hessian-free bilevel algorithms with partial hypergradient estimation. |
Daouda Sow; Kaiyi Ji; Yingbin Liang; |
11 | Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a method, DiFa, to address the diverse generation and faithful adaptation issues in one-shot generative domain adaptation. |
Yabo Zhang; mingshuai Yao; Yuxiang Wei; Zhilong Ji; Jinfeng Bai; Wangmeng Zuo; |
12 | Pay Attention to Your Loss: Understanding Misconceptions About Lipschitz Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Lipschitz neural networks are good classifiers: they are expressive, they are provably robust, and they generalize. |
Louis Béthune; Thibaut Boissin; Mathieu Serrurier; Franck Mamalet; Corentin Friedrich; Alberto Gonzalez Sanz; |
13 | Decision Trees with Short Explainable Rules Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: There is indeed a vast literature on the design and analysis of decision tree algorithms that aim at optimizing these parameters. This paper contributes to this important line of research: we propose, as a novel criterion for measuring the interpretability of a decision tree, the sparsity of the set of attributes that are (on average) required to explain the classification of the examples. |
Ferdinando Cicalese; Victor Feitosa Souza; Eduardo Laber; Marco Molinaro; |
14 | Does Momentum Change The Implicit Regularization on Separable Data? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive the implicit regularization of momentum-based optimizers on the linearly separable datasets. |
Bohan Wang; Qi Meng; Huishuai Zhang; Ruoyu Sun; Wei Chen; Zhi-Ming Ma; Tie-Yan Liu; |
15 | Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present SHGP, a novel Self-supervised Heterogeneous Graph Pre-training approach, which does not need to generate any positive examples or negative examples. |
Yaming Yang; Ziyu Guan; Zhe Wang; Wei Zhao; Cai Xu; Weigang Lu; Jianbin Huang; |
16 | Object Scene Representation Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Object Scene Representation Transformer (OSRT), a highly efficient 3D-centric model in which individual object representations naturally emerge through novel view synthesis. |
Mehdi S. M. Sajjadi; Daniel Duckworth; Aravindh Mahendran; Sjoerd van Steenkiste; Filip Pavetić; Mario Lucic; Leonidas Guibas; Klaus Greff; Thomas Kipf; |
17 | Explicable Policy Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Prior work on explicable planning describes the ability of agents to respect their human teammate’s expectations by trading off task performance for more expected or "explicable" behaviors. In this paper, we introduce Explicable Policy Search (EPS) to significantly extend such an ability to a reinforcement learning (RL) setting and to handle stochastic domains with continuous state and action spaces. |
Ze Gong; Yu ("Tony") Zhang; |
18 | Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation By Anchored Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we first revisit TTT assumptions and categorize TTT protocols by two key factors. Among the multiple protocols, we adopt a realistic sequential test-time training (sTTT) protocol, under which we further develop a test-time anchored clustering (TTAC) approach to enable stronger test-time feature learning. |
Yongyi Su; Xun Xu; Kui Jia; |
19 | TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: TokenMixup is a general token-level augmentation method, which provides an efficient augmentation means for vision transformer models. |
Hyeong Kyu Choi; Joonmyung Choi; Hyunwoo Kim; |
20 | Optimistic Tree Searches for Combinatorial Black-Box Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel tree search algorithm for solving black-box combinatorial optimization problems. |
Cedric Malherbe; Antoine Grosnit; Rasul Tutunov; Haitham Bou Ammar; Jun Wang; |
21 | Learning Robust Rule Representations for Abstract Reasoning Via Internal Inferences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel framework, ARII, that learns rule representations for Abstract Reasoning via Internal Inferences. |
Wenbo Zhang; likai tang; Site Mo; Xianggen Liu; Sen Song; |
22 | Adversarial Style Augmentation for Domain Generalized Urban-Scene Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel adversarial style augmentation approach for domain generalization in semantic segmentation, which is easy to implement and can effectively improve the model performance on unseen real domains. |
Zhun Zhong; Yuyang Zhao; Gim Hee Lee; Nicu Sebe; |
23 | Amortized Projection Optimization for Sliced Wasserstein Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose to utilize amortized optimization to solve the computational issue of sliced Wasserstein in deep learning applications. |
Khai Nguyen; Nhat Ho; |
24 | OpenAUC: Towards AUC-Oriented Open-Set Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, a systematic analysis reveals that most existing metrics are essentially inconsistent with the aforementioned goal of OSR: (1) For metrics extended from close-set classification, such as Open-set F-score, Youden’s index, and Normalized Accuracy, a poor open-set prediction can escape from a low performance score with a superior close-set prediction. (2) Novelty detection AUC, which measures the ranking performance between close-set and open-set samples, ignores the close-set performance. |
Zitai Wang; Qianqian Xu; Zhiyong Yang; Yuan He; Xiaochun Cao; Qingming Huang; |
25 | Don’t Pour Cereal Into Coffee: Differentiable Temporal Logic for Temporal Action Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a differentiable linear temporal logic framework to provide explicit temporal constraints to action segmentation models, which results in improved performance. |
Ziwei Xu; Yogesh Rawat; Yongkang Wong; Mohan Kankanhalli; Mubarak Shah; |
26 | Revisiting Sliced Wasserstein on Images: From Vectorization to Convolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose convolution sliced Wasserstein between probability measures over images that are based on convolution operators. |
Khai Nguyen; Nhat Ho; |
27 | A Lower Bound of Hash Codes’ Performance Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a lower bound of hash codes’ performance and a posterior estimation surrogate model over hash codes to improve hash learning. |
Xiaosu Zhu; Jingkuan Song; Yu Lei; Lianli Gao; Hengtao Shen; |
28 | I2Q: A Fully Decentralized Q-Learning Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To deal with non-stationarity, we first introduce stationary ideal transition probabilities, on which independent Q-learning could converge to the global optimum. Further, we propose a fully decentralized method, I2Q, which performs independent Q-learning on the modeled ideal transition function to reach the global optimum. |
Jiechuan Jiang; Zongqing Lu; |
29 | Unifying Voxel-based Representation with Transformer for 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a unified framework for multi-modality 3D object detection, named UVTR. |
Yanwei Li; Yilun Chen; Xiaojuan Qi; Zeming Li; Jian Sun; Jiaya Jia; |
30 | Multiple-sample Neural Image Compression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to train NIC with multiple-sample importance weighted autoencoder (IWAE) target, which is tighter than ELBO and converges to log likelihood as sample size increases. |
Tongda Xu; Yan Wang; Dailan He; Chenjian Gao; Han Gao; Kunzan Liu; Hongwei Qin; |
31 | The Unreliability of Explanations in Few-Shot In-Context Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Explanations generated by LLMs can be unreliable, but they can still be useful as a way to verify GPT-3’s predictions post-hoc. |
Xi Ye; Greg Durrett; |
32 | Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main contribution is to show that under proper choices of the regularization parameter, the gradient descent ascent algorithm converges to the Nash equilibrium of the original unregularized problem. |
Sihan Zeng; Thinh Doan; Justin Romberg; |
33 | The Price of Unfairness in Linear Bandits with Biased Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the problem of linear bandits with biased feedback and characterize it in terms of worst-case and gap-dependent regret. |
Solenne Gaucher; Alexandra Carpentier; Christophe Giraud; |
34 | Approximate Euclidean Lengths and Distances Beyond Johnson-Lindenstrauss Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We investigate techniques related to the Hutch++ algorithm to improve classical Johnson-Lindenstrauss approximations. |
Aleksandros Sobczyk; Mathieu Luisier; |
35 | NS3: Neuro-symbolic Semantic Code Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, current language models are known to struggle with longer, compositional sentences, and multi-step reasoning. To overcome this limitation, we propose supplementing the query sentence with a layout of its semantic structure. |
Shushan Arakelyan; Anna Hakhverdyan; Miltiadis Allamanis; Christophe Hauser; Luis Garcia; Xiang Ren; |
36 | DeepInteraction: Exploring Multi-modal Interaction for 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a novel 3D object detection architecture, dubbed as DeepInteraction, characterized by bilateral interaction and association throughout both representation encoding and decoding, in order to maximally exploit the inter-modal complementary property. |
Zeyu Yang; Jiaqi Chen; Zhenwei Miao; Wei Li; Xiatian Zhu; Li Zhang; |
37 | [Re] Exacerbating Algorithmic Bias Through Fairness Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The presented study evaluates “Exacerbating Algorithmic Bias through Fairness Attacks” by Mehrabi et al. (2021) within the scope of the ML Reproducibility Challenge 2021. |
Angelos Nalmpantis; Apostolos Panagiotopoulos; John Gkountouras; Konstantinos Papakostas; |
38 | [Re] Differentiable Spatial Planning Using Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This report covers our reproduction effort of the paper ‘Differentiable Spatial Planning using Transformers’ by Chaplot et al. |
Rohit Ranjan; Himadri Bhakta; Animesh Jha; Parv Maheshwari; |
39 | FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a fast and memory-efficient exact attention algorithm by accounting for GPU memory reads/writes, yielding faster end-to-end training time and higher quality models with longer sequences. |
Tri Dao; Dan Fu; Stefano Ermon; Atri Rudra; Christopher Ré; |
40 | Distributed Learning of Finite Gaussian Mixtures Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this situation, the split-and-conquer strategy is among the most effective solutions to many statistical problems, including quantile processes, regression analysis, principal eigenspaces, and exponential families. This paper applies this strategy to develop a distributed learning procedure of finite Gaussian mixtures. |
Qiong Zhang; Jiahua Chen; |
41 | Geo-SIC: Learning Deformable Geometric Shapes in Deep Image Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents Geo-SIC, the first deep learning model to learn deformable shapes in a deformation space for an improved performance of image classification. |
Jian Wang; Miaomiao Zhang; |
42 | Explainable Reinforcement Learning Via Model Transforms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use formal MDP abstractions and transforms, previously used for expediting planning, to automatically explain discrepancies between the behavior of a DRL agent and the behavior that is anticipated by an observer. |
Mira Finkelstein; Nitsan levy; Lucy Liu; Yoav Kolumbus; David Parkes; Jeffrey S Rosenschein; Sarah Keren; |
43 | Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide a unifying non-asymptotic analysis of recent variational inference methods based on Markovian gradients and propose an improved scheme. |
Kyurae Kim; Jisu Oh; Jacob Gardner; Adji Bousso Dieng; Hongseok Kim; |
44 | Local Latent Space Bayesian Optimization Over Structured Inputs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose LOL-BO, which adapts the notion of trust regions explored in recent work on high-dimensional Bayesian optimization to the structured setting. |
Natalie Maus; Haydn Jones; Juston Moore; Matt Kusner; John Bradshaw; Jacob Gardner; |
45 | A Hybrid Neural Autoencoder for Sensory Neuroprostheses and Its Applications in Bionic Vision Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an encoder-decoder based stimulus encoding framework for sensory neuroprostheses and demonstrate its effectiveness for visual prostheses. |
Jacob Granley; Lucas Relic; Michael Beyeler; |
46 | Unsupervised Domain Adaptation for Semantic Segmentation Using Depth Distribution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Besides the existing methods that only use depth regression as an auxiliary task, we propose to use depth distribution density to support semantic segmentation. |
Quanliang Wu; Huajun Liu; |
47 | Visual Correspondence-based Explanations Improve AI Robustness and Human-AI Team Accuracy Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose and evaluate two novel, explainable image classifiers that explain before making decisions by computing explicit visual correspondence with exemplars. |
Mohammad Reza Taesiri; Giang Nguyen; Anh Nguyen; |
48 | Interaction Modeling with Multiplex Attention Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we introduce a method for accurately modeling multi-agent systems. |
Fan-Yun Sun; Isaac Kauvar; Ruohan Zhang; Jiachen Li; Mykel J Kochenderfer; Jiajun Wu; Nick Haber; |
49 | FedSR: A Simple and Effective Domain Generalization Method for Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a domain generalization learning method suitable for federated learning by implicit representation alignment. |
A. Tuan Nguyen; Ser Nam Lim; Philip Torr; |
50 | Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a latent adaptive structure-aware generative language model for universal information extraction. |
Hao Fei; Shengqiong Wu; Libo Qin; Jingye Li; Bobo Li; Fei Li; Meishan Zhang; Min Zhang; Tat-Seng Chua; |
51 | Physically-Based Face Rendering for NIR-VIS Face Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we reconstruct 3D face shape and reflectance from a large 2D facial dataset and introduce a novel method of transforming the VIS reflectance to NIR reflectance. |
Yunqi Miao; Alexandros Lattas; Jiankang Deng; Jungong Han; Stefanos Zafeiriou; |
52 | Unsupervised Learning From Incomplete Measurements for Inverse Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present necessary and sufficient conditions and a new unsupervised loss for learning from incomplete measurement data associated with multiple measurement operators. |
Julián Tachella; Dongdong Chen; Mike Davies; |
53 | Dynamic 3D from Monocular Video: Reality Check Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing works on dynamic view synthesis from monocular video actually evaluate on protocols that are essentially multi-view. We propose an actual monocular dataset and evaluation protocols that show there’s much room for improvement. |
Hang Gao; Ruilong Li; Shubham Tulsiani; Bryan Russell; Angjoo Kanazawa; |
54 | Peripheral Vision Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore blending human peripheral vision with machine vision for image recognition. |
Juhong Min; Yucheng Zhao; Chong Luo; Minsu Cho; |
55 | Simple Mechanisms for Welfare Maximization in Rich Advertising Auctions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the problem of rich ads and give a simple truthful mechanism that achieves a constant of the optimal welfare. |
Gagan Aggarwal; Kshipra Bhawalkar; Aranyak Mehta; Divyarthi Mohan; Alexandros Psomas; |
56 | Are All Frames Equal? Active Sparse Labeling for Video Action Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose active sparse labeling (ASL), a novel active learning strategy for video action detection. |
Aayush Rana; Yogesh Rawat; |
57 | A Practical, Progressively-Expressive GNN Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our work puts forth such a proposal: Namely, we first propose the (k, c)(≤)-SETWL hierarchy with greatly reduced complexity from k-WL, achieved by moving from k-tuples of nodes to sets with ≤k nodes defined over ≤c connected components in the induced original graph. |
Lingxiao Zhao; Neil Shah; Leman Akoglu; |
58 | Constrained Predictive Coding As A Biologically Plausible Model of The Cortical Hierarchy Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By employing a constraint on the latent variables, we derive an upper bound for the predictive-coding objective, which we use to obtain a biologically plausible neural network that shows excellent agreement with experimental observations. |
Siavash Golkar; Tiberiu Tesileanu; Yanis Bahroun; Anirvan Sengupta; Dmitri Chklovskii; |
59 | SPDNet: A Large-Scale Imagery Dataset and Benchmark for Spatial Precipitation Downscaling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the lack of a well-organized and annotated large-scale dataset hinders the training and verification of more effective and advanced deep-learning models for precipitation downscaling. To alleviate these obstacles, we present the first large-scale spatial precipitation downscaling dataset named SPDNet, which contains more than 62,400 pairs of high-quality low/high-resolution precipitation maps for over 17 years, ready to help the evolution of deep learning models in precipitation downscaling. |
Xuanhong Chen; Kairui Feng; Bingbing Ni; Naiyuan Liu; Yifan Lu; Ziang Liu; Zhengyan Tong; |
60 | Semi-supervised Vision Transformers at Scale Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our proposed method, dubbed Semi-ViT, achieves comparable or better performance than the CNN counterparts in the semi-supervised classification setting. |
Zhaowei Cai; Avinash Ravichandran; Paolo Favaro; Manchen Wang; Davide Modolo; Rahul Bhotika; Zhuowen Tu; Stefano Soatto; |
61 | Deep Fourier Up-Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This is the first attempt to propose a theoretically feasible Deep Fourier Up-sampling for multi-scale modeling. |
man zhou; Hu Yu; Jie Huang; Feng Zhao; Jinwei Gu; Chen Change Loy; Deyu Meng; Chongyi Li; |
62 | Free Probability As A Solution to The Problem of Tuning Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Stability (and hence performance) of NNs can be probed before training thanks to Free Probability Theory, which gives a computable metamodel in the infinite width regime. |
Reda CHHAIBI; Tariq Daouda; Ezechiel Kahn; |
63 | Meta-Query-Net: Resolving Purity-Informativeness Dilemma in Open-set Active Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Meta-Query-Net that adaptively finds the best balancing between purity and informativeness for open-set active learning. |
Dongmin Park; Yooju Shin; Jihwan Bang; Youngjun Lee; Hwanjun Song; Jae-Gil Lee; |
64 | Lazy and Fast Greedy MAP Inference for Determinantal Point Process Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We combine the lazy greedy algorithm and the Cholesky-factorization-based fast greedy algorithm for faster greedy DPP MAP inference. |
Shinichi Hemmi; Taihei Oki; Shinsaku Sakaue; Kaito Fujii; Satoru Iwata; |
65 | Label Noise in Adversarial Training: A Novel Perspective to Study Robust Overfitting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that label noise exists in adversarial training and can explain robust overfitting as well as its intriguing behaviors. |
Chengyu Dong; Liyuan Liu; Jingbo Shang; |
66 | Weakly Supervised Representation Learning with Sparse Perturbations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a natural estimation procedure based on this theory and illustrate it on low-dimensional synthetic and image-based experiments. |
Kartik Ahuja; Jason Hartford; Yoshua Bengio; |
67 | A Character-Level Length Control Algorithm for Non-Autoregressive Sentence Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a Non-Autoregressive summarization model with Character-level length Control (NACC) approach, which not only can control the number of characters in the model output explicitly but also is efficient in inference. |
Puyuan Liu; Xiang Zhang; Lili Mou; |
68 | Risk-Driven Design of Safety-Critical Perception Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Not all perception errors are equally unsafe. We combine closed-loop risk assessment with supervised learning to train safer perception systems. |
Anthony Corso; Sydney Katz; Craig Innes; Xin Du; Subramanian Ramamoorthy; Mykel J Kochenderfer; |
69 | Flatten The Curve: Efficiently Training Low-Curvature Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a practical method to train neural networks such that they have a low curvature, without losing predictive accuracy. |
Suraj Srinivas; Kyle Matoba; Himabindu Lakkaraju; François Fleuret; |
70 | Self-explaining Deep Models with Logic Rule Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a framework for integrating self-explaining capabilities into a given deep model, so that it predicts accurately and explains with logic rules that are coherent with human decision logic. |
Seungeon Lee; Xiting Wang; Sungwon Han; Eunji Lee; Xiaoyuan Yi; Xing Xie; Meeyoung Cha; |
71 | Causal Disentanglement for Time Series Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper establishes the identifiability theories of unsupervised causal representation learning for sequential data and proposes an implementation of the assumed causal model as a sequential deep generative model. |
Weiran Yao; Guangyi Chen; Kun Zhang; |
72 | FlowHMM: Flow-based Continuous Hidden Markov Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Continuous hidden Markov models (HMMs) assume that observations are generated from a mixture of Gaussian densities, limiting their ability to model more complex distributions. In this work, we address this shortcoming and propose novel continuous HMM models, dubbed FlowHMMs, that allow learning general continuous observation densities without constraining them to follow a Gaussian distribution or their mixtures. |
Pawel Lorek; Rafal Nowak; Tomasz Trzcinski; Maciej Zieba; |
73 | Markovian Interference in Experiments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce an on-policy estimator: the Differences-In-Q’s (DQ) estimator. |
Vivek Farias; Andrew Li; Tianyi Peng; Andrew Zheng; |
74 | Lifting Weak Supervision To Structured Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study weak supervision for structured prediction, obtaining favorable generalization guarantees despite using noisy pseudo-labels. |
Harit Vishwakarma; Frederic Sala; |
75 | Masked Autoencoders That Listen Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Audio-MAE learns SoTA embeddings from audio spectrograms. Without external pretraining, it achieves best performance with high masking ratio (80%) and decoders with local attention. Qualitative audible reconstructions demonstrate its effectiveness. |
Po-Yao Huang; Hu Xu; Juncheng Li; Alexei Baevski; Michael Auli; Wojciech Galuba; Florian Metze; Christoph Feichtenhofer; |
76 | Unsupervised Point Cloud Completion and Segmentation By Generative Adversarial Autoencoding Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an unsupervised method for point cloud completion and segmentation. |
Changfeng Ma; Yang Yang; Jie Guo; Fei Pan; Chongjun Wang; Yanwen Guo; |
77 | BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a simple LiDAR-camera fusion framework that overcomes the downside of previous fusion approaches. |
Tingting Liang; Hongwei Xie; Kaicheng Yu; Zhongyu Xia; Zhiwei Lin; Yongtao Wang; Tao Tang; Bing Wang; Zhi Tang; |
78 | Multi-agent Covering Option Discovery Based on Kronecker Product of Factor Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our key idea is to approximate the joint state space as the Kronecker product of individual agents’ state spaces, based on which we can directly estimate the Fiedler vector of the joint state space using the Laplacian spectrum of individual agents’ transition graphs. |
Jiayu Chen; Jingdi Chen; Tian Lan; Vaneet Aggarwal; |
79 | Neural Transmitted Radiance Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we aim at addressing the problem of rendering novel transmitted views given a set of reflection-corrupted images. |
Chengxuan Zhu; Renjie Wan; Boxin Shi; |
80 | Neural Basis Models for Interpretability Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel subfamily of GAMs that utilizes basis decomposition of shape functions, called Neural Basis Models (NBMs). NBMs exploit the feature correlations and allow GAMs to scale by order of magnitude while preserving the interpretability. |
Filip Radenovic; Abhimanyu Dubey; Dhruv Mahajan; |
81 | On Divergence Measures for Bayesian Pseudocoresets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore three divergence measures, reverse KLD, Wasserstein distance, and forward KLD, for constructing a Bayesian pseudocoreset. |
Balhae Kim; Jungwon Choi; Seanie Lee; Yoonho Lee; Jung-Woo Ha; Juho Lee; |
82 | Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose test-time prompt tuning (TPT) for CLIP to improve its zero-shot generalization. Our method works on a single test sample without the need for training data or annotations. |
Manli Shu; Chaowei Xiao; Weili Nie; De-An Huang; Zhiding Yu; Tom Goldstein; Anima Anandkumar; |
83 | Exact Solutions of A Deep Linear Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We find the analytical expression of the global minima of a deep feedforward linear network. |
Liu Ziyin; Botao Li; Xiangming Meng; |
84 | Maximum Likelihood Training of Implicit Nonlinear Diffusion Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a trainable implicit nonlinear diffusion process. |
Dongjun Kim; Byeonghu Na; Se Jung Kwon; Dongsoo Lee; Wanmo Kang; Il-chul Moon; |
85 | Relation-Constrained Decoding for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel algorithm RESEAL for relation-constrained decoding. |
Xiang Chen; Zhixian Yang; Xiaojun Wan; |
86 | Efficiency Ordering of Stochastic Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the notion of efficiency ordering as an alternative metric for comparing the performance of different stochastic input sequences for the Stochastic Gradient Descent algorithm. |
Jie Hu; Vishwaraj Doshi; Do-Young Eun; |
87 | Mirror Descent with Relative Smoothness in Measure Spaces, with Application to Sinkhorn and EM Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive the convergence of mirror descent for relatively smooth and strongly convex pairs of functionals over measure spaces, applying it to Sinkhorn’s primal iterations and the EM algorithm through the KL. |
Pierre-Cyril Aubin-Frankowski; Anna Korba; Flavien Léger; |
88 | Interaction-Grounded Learning with Action-inclusive Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove that interaction-grounded learning is possible when the feedback has the full information of the action embedded in it. |
Tengyang Xie; Akanksha Saran; Dylan J Foster; Lekan Molu; Ida Momennejad; Nan Jiang; Paul Mineiro; John Langford; |
89 | Learning to Attack Federated Learning: A Model-based Reinforcement Learning Attack Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a model-based reinforcement learning framework to derive untargeted poisoning attacks against federated learning (FL) systems. |
Henger Li; Xiaolin Sun; Zizhan Zheng; |
90 | On Enforcing Better Conditioned Meta-Learning for Rapid Few-Shot Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the concept of preconditioning, we propose a novel method to significantly increase adaptation speed for gradient-based meta-learning methods without incurring extra parameters. |
Markus Hiller; Mehrtash Harandi; Tom Drummond; |
91 | Cluster and Aggregate: Face Recognition with Large Probe Set Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a two-stage feature fusion paradigm, Cluster and Aggregate, that can both scale to large $N$ and maintain the ability to perform sequential inference with order invariance. |
Minchul Kim; Feng Liu; Anil K Jain; Xiaoming Liu; |
92 | Safety Guarantees for Neural Network Dynamic Systems Via Stochastic Barrier Functions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce a method of safety certification and control for neural network dynamic systems via stochastic barrier functions. |
Rayan Mazouz; Karan Muvvala; Akash Ratheesh Babu; Luca Laurenti; Morteza Lahijanian; |
93 | Online Frank-Wolfe with Arbitrary Delays Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A variant of online Frank-Wolfe for online learning with arbitrary delays is proposed, and it is robust to a relatively large amount of delay. |
Yuanyu Wan; Wei-Wei Tu; Lijun Zhang; |
94 | What Is A Good Metric to Study Generalization of Minimax Learners? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A fundamental question remains elusive: What is a good metric to study generalization of minimax learners? In this paper, we aim to answer this question by first showing that primal risk, a universal metric to study generalization in minimization problems, fails in simple examples of minimax problems. |
Asuman Ozdaglar; Sarath Pattathil; Jiawei Zhang; Kaiqing Zhang; |
95 | Globally Convergent Policy Search for Output Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop the first direct policy search algorithm which provably converges to the globally optimal dynamic filter for the classical problem of predicting the outputs of a linear dynamical system, given noisy, partial observations. |
Jack Umenberger; Max Simchowitz; Juan Perdomo; Kaiqing Zhang; Russ Tedrake; |
96 | Improving Neural Ordinary Differential Equations with Nesterov’s Accelerated Gradient Method Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the Nesterov neural ordinary differential equations (NesterovNODEs) whose layers solve the second-order ordinary differential equations limit of Nesterov’s accelerated gradient method for speeding up the training and inference of NODEs. |
Ho Huu Nghia Nguyen; Tan Nguyen; Huyen Vo; Stanley Osher; Thieu Vo; |
97 | Learning Structure from The Ground Up—Hierarchical Representation Learning By Chunking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the Gestalt principle of grouping by proximity and theories of chunking in cognitive science, we propose a hierarchical chunking model (HCM). |
Shuchen Wu; Noemi Elteto; Ishita Dasgupta; Eric Schulz; |
98 | Are Defenses for Graph Neural Networks Robust? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Adaptive evaluation reveals that most examined adversarial defenses for GNNs show no or only marginal improvement in robustness. |
Felix Mujkanovic; Simon Geisler; Aleksandar Bojchevski; Stephan Günnemann; |
99 | Giving Feedback on Interactive Student Programs with Meta-Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We build a system that interacts with a student program to find bugs and provides feedback with near human-level accuracy by showing that finding bugs is a meta-exploration problem. |
Evan Liu; Moritz Stephan; Allen Nie; Chris Piech; Emma Brunskill; Chelsea Finn; |
100 | Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we extensively investigate the transfer performance of various types of self-supervised methods, e.g., MoCo and SimCLR, on three downstream tasks, including semantic segmentation, drivable area segmentation, and traffic object detection, on the large-scale driving dataset BDD100K. |
Xiwen Liang; Yangxin Wu; Jianhua Han; Hang Xu; Chunjing XU; Xiaodan Liang; |
101 | SPD Domain-specific Batch Normalization to Crack Interpretable Unsupervised Domain Adaptation in EEG Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose and evaluate (using EEG) an unsupervised domain adaptation framework around SPD domain-specific momentum batch normalization that enables end-to-end learning of tangent space mapping models. |
Reinmar Kobler; Jun-ichiro Hirayama; Qibin Zhao; Motoaki Kawanabe; |
102 | Gradient Estimation with Discrete Stein Operators Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To improve the quality of gradient estimation, we introduce a variance reduction technique based on Stein operators for discrete distributions. |
Jiaxin Shi; Yuhao Zhou; Jessica Hwang; Michalis Titsias; Lester Mackey; |
103 | Counterfactual Neural Temporal Point Process for Misinformation Impact Estimation on Social Media Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a machine learning based counterfactual analysis framework to examine the misinformation’s causal influence on people. |
Yizhou Zhang; Defu Cao; Yan Liu; |
104 | Smoothed Embeddings for Certified Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings. |
Mikhail Pautov; Olesya Kuznetsova; Nurislam Tursynbek; Aleksandr Petiushko; Ivan Oseledets; |
105 | Training Subset Selection for Weak Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how to use pretrained representations to select high-quality subsets of weakly labeled training data. Training with these subsets improves the performance of weak supervision. |
Hunter Lang; Aravindan Vijayaraghavan; David Sontag; |
106 | Learning in Observable POMDPs, Without Computationally Intractable Oracles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide a quasi-polynomial time algorithm for learning POMDPs. |
Noah Golowich; Ankur Moitra; Dhruv Rohatgi; |
107 | Robust Generalized Method of Moments: A Finite Sample Viewpoint Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a computationally efficient robustification of the generalized method of moments, which can tolerate a constant fraction of arbitrary outliers. |
Dhruv Rohatgi; Vasilis Syrgkanis; |
108 | Multi-Objective Deep Learning with Adaptive Reference Vectors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Many deep learning models involve optimizing multiple objectives. Since objectives are often conflicting, we aim to get diverse and representative trade-off solutions among these objectives. |
Weiyu Chen; James Kwok; |
109 | Implicit Neural Representations with Levels-of-Experts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address the limitation, we propose the Levels-of-Experts (LoE) framework, which is a novel coordinate-based representation consisting of an MLP with periodic, position-dependent weights arranged hierarchically. |
Zekun Hao; Arun Mallya; Serge Belongie; Ming-Yu Liu; |
110 | A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We investigate the stability properties of learned optimizers, and apply the insights gleaned to develop a learned optimization architecture that yields strong performance improvements over existing architectures. |
James Harrison; Luke Metz; Jascha Sohl-Dickstein; |
111 | A Solver-free Framework for Scalable Learning in Neural ILP Architectures Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: For learning constraints in a neural ILP architecture, we propose a scalable solver-free framework that doesn’t require calling the solver to compute gradients. |
Yatin Nandwani; Rishabh Ranjan; Mausam; Parag Singla; |
112 | Sublinear Algorithms for Hierarchical Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The focus of this work is to study hierarchical clustering for massive graphs under three well-studied models of sublinear computation which focus on space, time, and communication, respectively, as the primary resources to optimize: (1) (dynamic) streaming model where edges are presented as a stream, (2) query model where the graph is queried using neighbor and degree queries, (3) massively parallel computation (MPC) model where the edges of the graph are partitioned over several machines connected via a communication channel. |
Arpit Agarwal; Sanjeev Khanna; Huan Li; Prathamesh Patil; |
113 | On Efficient Online Imitation Learning Via Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give new positive and negative computational and statistical results on the fundamental feasibility of regret minimization in online imitation learning with discrete action spaces, in the general nonrealizable case. |
Yichen Li; Chicheng Zhang; |
114 | Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we conduct a comprehensive analysis of existing domain bridging methods for the domain adaptive semantic segmentation task and resort to two complementary data mixing techniques to propose a deliberated domain bridging strategy. |
Lin Chen; Zhixiang Wei; Xin Jin; Huaian Chen; Miao Zheng; Kai Chen; Yi Jin; |
115 | Generalization for Multiclass Classification with Overparameterized Linear Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Via an overparameterized linear model with Gaussian features, we provide conditions for good generalization for multiclass classification of minimum-norm interpolating solutions in an asymptotic setting where both the number of underlying features and the number of classes scale with the number of training points. |
Vignesh Subramanian; Rahul Arya; Anant Sahai; |
116 | Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we instead maintain a belief distribution over dynamics, and evaluate/optimize policy through biased sampling from the belief. |
Kaiyang Guo; Shao Yunfeng; Yanhui Geng; |
117 | Dance of SNN and ANN: Solving Binding Problem By Combining Spike Timing and Reconstructive Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a brain-inspired unsupervised hybrid neural network (HNN) that introduces temporal binding theory originated from neuroscience into ANNs by integrating spike timing dynamics (via spiking neural networks, SNNs) with reconstructive attention (by ANNs). |
Hao Zheng; Luping Shi; Rong Zhao; Hui Lin; |
118 | JAW: Guaranteed Predictive Inference Under Covariate Shift Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose JAWS, a series of wrapper methods for distribution-free uncertainty quantification under covariate shift, including: the jackknife+ with likelihood ratio weights; a computationally-efficient approximation; and extensions to error assessment. |
Drew Prinster; Anqi Liu; Suchi Saria; |
119 | Efficiently Factorizing Boolean Matrices Using Proximal Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel elastic-net based regularizer that permits efficient Boolean matrix factorization using proximal gradient descent. |
Sebastian Dalleiger; Jilles Vreeken; |
120 | Prompt Certified Machine Unlearning with Randomized Gradient Smoothing and Quantization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a prompt certified machine unlearning algorithm, PCMU, which executes a one-time operation of simultaneous training and unlearning in advance for a series of machine unlearning requests, without knowledge of the removed/forgotten data. |
Zijie Zhang; Xin Zhao; Tianshi Che; Yang Zhou; Lingjuan Lyu; |
121 | Hyperparameter Sensitivity in Deep Outlier Detection: Analysis and A Scalable Hyper-Ensemble Solution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We conduct a sensitivity analysis of unsupervised deep outlier detection methods to hyper-parameter (HP) settings, and design a scalable hyper-ensemble to circumvent the HP sensitivity issue in the literature. |
Xueying Ding; Lingxiao Zhao; Leman Akoglu; |
122 | Learning to Follow Instructions in Text-Based Games Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Such observations typically include instructions that, in a reinforcement learning (RL) setting, can directly or indirectly guide a player towards completing reward-worthy tasks. In this work, we study the ability of RL agents to follow such instructions. |
Mathieu Tuli; Andrew Li; Pashootan Vaezipoor; Toryn Klassen; Scott Sanner; Sheila McIlraith; |
123 | The Importance of Baselines in Policy Gradients Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our first contribution is to show that the state value baseline allows on-policy stochastic natural policy gradient (NPG) to converge to an optimal policy at an $O(1/t)$ rate, which was not previously known. |
Jincheng Mei; Wesley Chung; Valentin Thomas; Bo Dai; Csaba Szepesvari; Dale Schuurmans; |
124 | Online Minimax Multiobjective Optimization: Multicalibeating and Other Applications Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a simple but general online learning framework in which a learner plays against an adversary in a vector-valued game that changes every round. |
Daniel Lee; Georgy Noarov; Mallesh Pai; Aaron Roth; |
125 | Learning from A Sample in Online Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: But can we go beyond the worst-case? In this work we give algorithms that perform substantially better when a $p$-fraction of the input is given as a sample: the algorithms use this sample to learn a good strategy to use for the rest of the input. |
C.J. Argue; Anupam Gupta; Alan Frieze; Christopher Seiler; |
126 | Robustness Disparities in Face Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Additionally, no prior work has focused on the robustness of these systems under various perturbations and corruptions, which leaves open the question of how various people are impacted by these phenomena. We present the first of its kind detailed benchmark of face detection systems, specifically examining the robustness to noise of commercial and academic models. |
Samuel Dooley; George Z Wei; Tom Goldstein; John Dickerson; |
127 | SMPL: Simulated Industrial Manufacturing and Process Control Learning Environments Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To bridge this gap, we develop an easy-to-use library that includes five high-fidelity simulation environments: BeerFMTEnv, ReactorEnv, AtropineEnv, PenSimEnv and mAbEnv, which cover a wide range of manufacturing processes. |
Mohan Zhang; Xiaozhou Wang; Benjamin Decardi-Nelson; Bo Song; An Zhang; Jinfeng Liu; Sile Tao; Jiayi Cheng; Xiaohong Liu; Dengdeng Yu; Matthew Poon; Animesh Garg; |
128 | GriddlyJS: A Web IDE for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, existing environments often require complex build processes, making reproducing results difficult. To address these issues, we introduce GriddlyJS, a web-based Integrated Development Environment (IDE) based on the Griddly engine. |
Christopher Bamford; Minqi Jiang; Mikayel Samvelyan; Tim Rocktäschel; |
129 | Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Avalon is a benchmark for generalization in RL where all individual tasks are constructed via finely controlled procedural generation of environments. |
Joshua Albrecht; Abraham Fetterman; Bryden Fogelman; Ellie Kitanidis; Bartosz Wróblewski; Nicole Seo; Michael Rosenthal; Maksis Knutins; Zack Polizzi; James Simon; Kanjun Qiu; |
130 | CLEVRER-Humans: Describing Physical and Causal Events The Human Way Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The CLEVRER-Humans benchmark is a video reasoning dataset for causal judgment of physical events with human labels. |
Jiayuan Mao; Xuelin Yang; Xikun Zhang; Noah Goodman; Jiajun Wu; |
131 | Ambiguous Images With Human Judgments for Robust Visual Event Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a procedure for creating datasets of ambiguous images and use it to produce DAI (Dataset of Ambiguous Images), a collection of noisy images extracted from videos and corresponding human uncertainty judgments. |
Kate Sanders; Reno Kriz; Anqi Liu; Benjamin Van Durme; |
132 | Finding Naturally Occurring Physical Backdoors in Image Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We discover and validate the existence of natural backdoors in existing image datasets. |
Emily Wenger; Roma Bhattacharjee; Arjun Nitin Bhagoji; Josephine Passananti; Emilio Andere; Heather Zheng; Ben Zhao; |
133 | A Large Scale Search Dataset for Unbiased Learning to Rank Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new large-scale unbiased learning to rank dataset with rich real-world user feedback and sufficient display information. |
Lixin Zou; Haitao Mao; Xiaokai Chu; Jiliang Tang; Wenwen Ye; Shuaiqiang Wang; Dawei Yin; |
134 | SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We are releasing SoundSpaces 2.0: a fast, continuous, configurable and generalizable audio-visual simulation platform for visual acoustic machine learning research, e.g., audio-visual navigation, far-field speech recognition, and acoustic matching. |
Changan Chen; Carl Schissler; Sanchit Garg; Philip Kobernik; Alexander Clegg; Paul Calamia; Dhruv Batra; Philip Robinson; Kristen Grauman; |
135 | ActionNet: A Multimodal Dataset for Human Activities Using Wearable Sensors in A Kitchen Environment Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces ActionNet, a multimodal dataset and recording framework with an emphasis on wearable sensing in a kitchen environment. |
Joseph DelPreto; Chao Liu; Yiyue Luo; Michael Foshey; Yunzhu Li; Antonio Torralba; Wojciech Matusik; Daniela Rus; |
136 | SkinCon: A Skin Disease Dataset Densely Annotated By Domain Experts for Fine-grained Debugging and Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: SkinCon is a skin disease dataset densely annotated by domain experts for developing interpretability/explainability methods and fine-grained error analysis. |
Roxana Daneshjou; Mert Yuksekgonul; Zhuo Ran Cai; Roberto Novoa; James Zou; |
137 | CEDe: A Collection of Expert-curated Datasets with Atom-level Entity Annotations for Optical Chemical Structure Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A collection of datasets containing more than 700,000 atom-level entity annotations and their corresponding bounding boxes. These labels constitute all the necessary information for complete chemical graph reconstruction. |
Rodrigo Hormazabal; Changyoung Park; Soonyoung Lee; Sehui Han; Yeonsik Jo; Jaewan Lee; Ahra Jo; Seung Hwan Kim; Jaegul Choo; Moontae Lee; Honglak Lee; |
138 | MVP-N: A Dataset and Benchmark for Real-World Multi-View Object Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a dataset and benchmark for multi-view object classification. |
REN WANG; Jiayue Wang; Tae Sung Kim; JINSUNG KIM; Hyuk-Jae Lee; |
139 | A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we present the first large-scale benchmark of Korean legal AI datasets, LBOX OPEN, that consists of one legal corpus, two classification tasks, two legal judgement prediction (LJP) tasks, and one summarization task. |
Wonseok Hwang; Dongjun Lee; Kyoungyeon Cho; Hanuhl Lee; Minjoon Seo; |
140 | Kantorovich Strikes Back! Wasserstein GANs Are Not Optimal Transport? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we address these questions. We construct 1-Lipschitz functions and use them to build ray monotone transport plans. |
Alexander Korotin; Alexander Kolesov; Evgeny Burnaev; |
141 | AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To mitigate the limitations, we present AMOS, a large-scale, diverse, clinical dataset for abdominal organ segmentation. |
Yuanfeng Ji; Haotian Bai; Chongjian GE; Jie Yang; Ye Zhu; Ruimao Zhang; Zhen Li; Lingyan Zhanng; Wanling Ma; Xiang Wan; Ping Luo; |
142 | TGEA 2.0: A Large-Scale Diagnostically Annotated Dataset with Benchmark Tasks for Text Generation of Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: With the diagnostically annotated dataset, we propose 5 diagnosis benchmark tasks (i.e., erroneous text detection, MiSEW extraction, erroneous span location and correction together with error type classification) and 2 pathology mitigation benchmark tasks (pairwise comparison and word prediction). |
Huibin Ge; Xiaohu Zhao; Chuang Liu; Yulong Zeng; Qun Liu; Deyi Xiong; |
143 | The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper documents the data creation and curation efforts undertaken by BigScience to assemble the Responsible Open-science Open-collaboration Text Sources (ROOTS) corpus, a 1.6TB dataset spanning 59 languages that was used to train the 176-billion-parameter BigScience Large Open-science Open-access Multilingual (BLOOM) language model. |
Hugo Laurençon; Lucile Saulnier; Thomas Wang; Christopher Akiki; Albert Villanova del Moral; Teven Le Scao; Leandro Von Werra; Chenghao Mou; Eduardo González Ponferrada; Huu Nguyen; Jörg Frohberg; Mario Šaško; Quentin Lhoest; Angelina McMillan-Major; Gerard Dupont; Stella Biderman; Anna Rogers; Loubna Ben allal; Francesco De Toni; Giada Pistilli; Olivier Nguyen; Somaieh Nikpoor; Maraim Masoud; Pierre Colombo; Javier de la Rosa; Paulo Villegas; Tristan Thrush; Shayne Longpre; Sebastian Nagel; Leon Weber; Manuel Muñoz; Jian Zhu; Daniel Van Strien; Zaid Alyafeai; Khalid Almubarak; Minh Chien Vu; Itziar Gonzalez-Dios; Aitor Soroa; Kyle Lo; Manan Dey; Pedro Ortiz Suarez; Aaron Gokaslan; Shamik Bose; David Adelani; Long Phan; Hieu Tran; Ian Yu; Suhas Pai; Jenny Chim; Violette Lepercq; Suzana Ilic; Margaret Mitchell; Alexandra V Luccioni; Yacine Jernite; |
144 | NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we construct a unified, expressive yet compact search space, covering 26,206 unique graph neural network (GNN) architectures and propose a principled evaluation protocol. |
Yijian Qin; Ziwei Zhang; Xin Wang; Zeyang Zhang; Wenwu Zhu; |
145 | Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This could lead to less optimal baselines, hindering the fair and faithful evaluations of newly designed methodologies. To address this problem, this work presents the first comprehensive benchmarking study from three under-explored perspectives beyond algorithms. |
Hui En Pang; Zhongang Cai; Lei Yang; Tianwei Zhang; Ziwei Liu; |
146 | MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the Multi-Agent Tracking Environment (MATE), a novel multi-agent environment that simulates the target coverage control problems in the real world. |
Xuehai Pan; Mickel Liu; Fangwei Zhong; Yaodong Yang; Song-Chun Zhu; Yizhou Wang; |
147 | How Transferable Are Video Representations Based on Synthetic Data? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose SynAPT, a novel benchmark for action recognition based on a combination of existing synthetic datasets, in which a model is pre-trained on synthetic videos rendered by various graphics simulators, and then transferred to a set of downstream action recognition datasets, containing different categories than the synthetic data. |
Yo-whan Kim; Samarth Mishra; SouYoung Jin; Rameswar Panda; Hilde Kuehne; Leonid Karlinsky; Venkatesh Saligrama; Kate Saenko; Aude Oliva; Rogerio Feris; |
148 | NAS-Bench-Suite-Zero: Accelerating Research on Zero Cost Proxies Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We create a benchmark suite for zero-cost proxies, and we use it to show how to effectively combine them to improve performance. |
Arjun Krishnakumar; Colin White; Arber Zela; Renbo Tu; Mahmoud Safari; Frank Hutter; |
149 | Pile of Law: Learning Responsible Data Filtering from The Law and A 256GB Open-Source Legal Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we examine how the law and legal data can inform data filtering practices and provide an extensive 256GB legal dataset (the Pile of Law) that can be used to learn these norms, and for pretraining. |
Peter Henderson; Mark Krass; Lucia Zheng; Neel Guha; Christopher D Manning; Dan Jurafsky; Daniel Ho; |
150 | Breaking Bad: A Dataset for Geometric Fracture and Reassembly Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Breaking Bad, a large-scale dataset of fractured objects. |
Silvia Sellán; Yun-Chun Chen; Ziyi Wu; Animesh Garg; Alec Jacobson; |
151 | Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the Reddit Photo Critique Dataset (RPCD), which contains tuples of image and photo critiques. |
Daniel Vera Nieto; Luigi Celona; Clara Fernandez Labrador; |
152 | EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce VISOR, a new dataset of pixel annotations and a benchmark suite for segmenting hands and active objects in egocentric video. |
Ahmad Darkhalil; Dandan Shan; Bin Zhu; Jian Ma; Amlan Kar; Richard Higgins; Sanja Fidler; David Fouhey; Dima Damen; |
153 | PFL-Bench: A Comprehensive Benchmark for Personalized Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the first comprehensive benchmark for personalized Federated Learning, containing more than 10 datasets, 20 pFL methods, and systematic evaluation with highlighted benefits and potential of pFL. |
Daoyuan Chen; Dawei Gao; Weirui Kuang; Yaliang Li; Bolin Ding; |
154 | Touch and Go: Learning from Human-Collected Vision and Touch Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce “Touch and Go”, a human-collected dataset containing paired visual and tactile data from real-world scenes. |
Fengyu Yang; Chenyang Ma; Jiacheng Zhang; Jing Zhu; Wenzhen Yuan; Andrew Owens; |
155 | DDXPlus: A New Dataset For Automatic Medical Diagnosis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a large-scale synthetic dataset of roughly 1.3 million patients that includes a differential diagnosis, along with the ground truth pathology, symptoms and antecedents for each patient. |
Arsene Fansi Tchango; Rishab Goel; Zhi Wen; Julien Martel; Joumana Ghosn; |
156 | AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce AutoWS-Bench-101: a benchmarking framework for automated weak supervision techniques on diverse tasks. |
Nicholas Roberts; Xintong Li; Tzu-Heng Huang; Dyah Adila; Spencer Schoenberg; Cheng-Yu Liu; Lauren Pick; Haotian Ma; Aws Albarghouthi; Frederic Sala; |
157 | This Is The Way: Designing and Compiling LEPISZCZE, A Comprehensive NLP Benchmark for Polish Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we introduce LEPISZCZE (lepiszcze is the Polish word for glew, the Middle English predecessor of glue), a new, comprehensive benchmark for Polish NLP with a large variety of tasks and high-quality operationalization of the benchmark. |
Łukasz Augustyniak; Kamil Tagowski; Albert Sawczyn; Denis Janiak; Roman Bartusiak; Adrian Szymczak; Arkadiusz Janz; Piotr Szymański; Marcin Wątroba; Mikołaj Morzy; Tomasz Kajdanowicz; Maciej Piasecki; |
158 | Evaluating Out-of-Distribution Performance on Document Image Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our paper introduces new out-of-distribution data for evaluating document classifiers, and finds that models trained on RVL-CDIP but tested on our new out-of-distribution data tend to underperform. |
Stefan Larson; Yi Yang Gordon Lim; Yutong Ai; David Kuang; Kevin Leach; |
159 | FETA: Towards Specializing Foundational Models for Expert Task Applications Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a first-of-its-kind FETA benchmark built around the task of teaching FMs to understand technical documentation, via learning to match their graphical illustrations to corresponding language descriptions. |
Amit Alfassy; Assaf Arbelle; Oshri Halimi; Sivan Harary; Roei Herzig; Eli Schwartz; Rameswar Panda; Michele Dolfi; Christoph Auer; Peter Staar; Kate Saenko; Rogerio Feris; Leonid Karlinsky; |
160 | Why Do Tree-based Models Still Outperform Deep Learning on Typical Tabular Data? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Results show that tree-based models remain state-of-the-art on medium-sized data (10K samples) even without accounting for their superior speed. To understand this gap, we conduct an empirical investigation into the differing inductive biases of tree-based models and neural networks. |
Leo Grinsztajn; Edouard Oyallon; Gael Varoquaux; |
161 | Myriad: A Real-world Testbed to Bridge Trajectory Optimization and Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a testbed to benchmark imitation learning and reinforcement learning algorithms against trajectory optimization-based methods in real-world environments. |
Nikolaus Howe; Simon Dufort-Labbé; Nitarshan Rajkumar; Pierre-Luc Bacon; |
162 | METS-CoV: A Dataset of Medical Entity and Targeted Sentiment on COVID-19 Related Tweets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we release METS-CoV, a dataset containing medical entities and targeted sentiments from COVID-19 related tweets. |
Peilin Zhou; Zeqiang Wang; Dading Chong; Zhijiang Guo; Yining Hua; Zichang Su; Zhiyang Teng; Jiageng Wu; Jie Yang; |
163 | A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a comprehensive and fair benchmark study on large-scale graph training and further propose a new layer-wise training manner that achieves new SOTA performance on large-scale graph datasets. |
Keyu Duan; Zirui Liu; Peihao Wang; Wenqing Zheng; Kaixiong Zhou; Tianlong Chen; Xia Hu; Zhangyang Wang; |
164 | CARLANE: A Lane Detection Benchmark for Unsupervised Domain Adaptation from Simulation to Multiple Real-World Domains Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose CARLANE, a 3-way sim-to-real domain adaptation benchmark for 2D lane detection. |
Bonifaz Stuhr; Johann Haselberger; Julian Gebele; |
165 | Hard ImageNet: Segmentations for Objects with Strong Spurious Cues Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The severity of this problem varies significantly by class. We identify 15 classes in ImageNet with very strong spurious cues, and collect segmentation masks for these challenging objects to form Hard ImageNet. |
Mazda Moayeri; Sahil Singla; Soheil Feizi; |
166 | MSDS: A Large-Scale Chinese Signature and Token Digit String Dataset for Handwriting Verification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Although online handwriting verification has made great progress recently, verification performance still falls far behind real-world usage owing to the small scale of existing datasets as well as the limited biometric mediums. Therefore, this paper proposes a new handwriting verification benchmark dataset named Multimodal Signature and Digit String (MSDS), which consists of two subsets: MSDS-ChS (Chinese Signatures) and MSDS-TDS (Token Digit Strings), contributed by 402 users, with 20 genuine samples and 20 skilled forgeries per user per subset. |
Peirong Zhang; Jiajia Jiang; Yuliang Liu; Lianwen Jin; |
167 | SurDis: A Surface Discontinuity Dataset for Wearable Technology to Assist Blind Navigation in Urban Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce SurDis, a novel dataset of depth maps and stereo images that exemplifies the issue of surface discontinuity in the urban areas of Klang Valley, Malaysia. |
Kuan Yew Leong; Siew Mooi Lim; |
168 | ADBench: Anomaly Detection Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Given a long list of anomaly detection algorithms developed in the last few decades, how do they perform with regard to (i) varying levels of supervision, (ii) different types of anomalies, and (iii) noisy and corrupted data? In this work, we answer these key questions by conducting (to our best knowledge) the most comprehensive anomaly detection benchmark with 30 algorithms on 57 benchmark datasets, named ADBench. |
Songqiao Han; Xiyang Hu; Hailiang Huang; Minqi Jiang; Yue Zhao; |
169 | AirfRANS: High Fidelity Computational Fluid Dynamics Dataset for Approximating Reynolds-Averaged Navier–Stokes Solutions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a high fidelity aerodynamic dataset of Reynolds-Averaged Navier–Stokes (RANS) simulations over airfoils. |
Florent Bonnet; Jocelyn Mazari; Paola Cinnella; Patrick Gallinari; |
170 | A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address these issues, we categorize existing works into three practical scenarios in which attackers release datasets, pre-trained models, and fine-tuned models respectively, then discuss their unique evaluation methodologies. |
Ganqu Cui; Lifan Yuan; Bingxiang He; Yangyi Chen; Zhiyuan Liu; Maosong Sun; |
171 | MBW: Multi-view Bootstrapping in The Wild Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The approach, however, is based on calibrated cameras and rigid geometry, making it expensive, difficult to manage, and impractical in real-world scenarios. In this paper, we address these bottlenecks by combining a non-rigid 3D neural prior with deep flow to obtain high-fidelity landmark estimates from videos with only two or three uncalibrated, handheld cameras. |
Mosam Dabhi; Chaoyang Wang; Tim Clifford; László Jeni; Ian Fasel; Simon Lucey; |
172 | Chartalist: Labeled Graph Datasets for UTXO and Account-based Blockchains Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We created the first blockchain ML-ready dataset platform. |
Kiarash Shamsi; Friedhelm Victor; Murat Kantarcioglu; Yulia Gel; Cuneyt G Akcora; |
173 | Learning Long-Term Crop Management Strategies with CyclesGym Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce CYCLESGYM, an RL environment based on the multi-year, multi-crop CGM Cycles. |
Matteo Turchetta; Luca Corinzia; Scott Sussex; Amanda Burton; Juan Herrera; Ioannis Athanasiadis; Joachim M Buhmann; Andreas Krause; |
174 | LIPS – Learning Industrial Physical Simulation Benchmark Suite Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces a new benchmark suite, "Learning Industrial Physical Simulations" (LIPS), whose purpose is to assess the quality of surrogate models for emulating a physical system according to various categories of evaluation criteria. |
Milad LEYLI ABADI; Antoine Marot; Jérôme Picault; David Danan; Mouadh Yagoubi; Benjamin Donnot; Seif Attoui; Pavel Dimitrov; Asma Farjallah; Clement Etienam; |
175 | FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel cross-silo dataset suite focused on healthcare, FLamby (Federated Learning AMple Benchmark of Your cross-silo strategies), to bridge the gap between theory and practice of cross-silo FL. FLamby encompasses 7 healthcare datasets with natural splits, covering multiple tasks, modalities, and data volumes, each accompanied with baseline training code. |
Jean Ogier du Terrail; Samy-Safwan Ayed; Edwige Cyffers; Felix Grimberg; Chaoyang He; Regis Loeb; Paul Mangold; Tanguy Marchand; Othmane Marfoq; Erum Mushtaq; Boris Muzellec; Constantin Philippenko; Santiago Silva; Maria Teleńczuk; Shadi Albarqouni; Salman Avestimehr; Aurélien Bellet; Aymeric Dieuleveut; Martin Jaggi; Sai Praneeth Karimireddy; Marco Lorenzi; Giovanni Neglia; Marc Tommasi; Mathieu Andreux; |
176 | PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work proposes a comprehensive and multi-task benchmark for protein sequence understanding, which studies both single-task and multi-task learning. |
Minghao Xu; Zuobai Zhang; Jiarui Lu; Zhaocheng Zhu; Yangtian Zhang; Ma Chang; Runcheng Liu; Jian Tang; |
177 | MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new dataset and framework for evaluating video-language models on activity recognition at multiple levels of granularity. |
Zelun Luo; Zane Durante; Linden Li; Wanze Xie; Ruochen Liu; Emily Jin; Zhuoyi Huang; Lun Yu Li; Jiajun Wu; Juan Carlos Niebles; Ehsan Adeli; Fei-Fei Li; |
178 | Towards Open Set 3D Learning: Benchmarking and Understanding Semantic Novelty Detection on Pointclouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a novel testbed for semantic novelty detection that considers several settings with increasing difficulties in terms of category semantic shift, and covers both in-domain (synthetic-to-synthetic, real-to-real) and cross-domain (synthetic-to-real) scenarios. |
Antonio Alliegro; Francesco Cappio Borlino; Tatiana Tommasi; |
179 | Unravelling The Performance of Physics-informed Graph Neural Networks for Dynamical Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and decoupling of kinetic and potential energies, exhibit significantly enhanced performance. |
Abishek Thangamuthu; Gunjan Kumar; Suresh Bishnoi; Ravinder Bhattoo; N M Anoop Krishnan; Sayan Ranu; |
180 | A Greek Parliament Proceedings Dataset for Computational Linguistics and Political Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a curated dataset of the Greek Parliament Proceedings that extends chronologically from 1989 up to 2020. |
Konstantina Dritsa; Aikaterini Thoma; Ioannis Pavlopoulos; Panos Louridas; |
181 | A Survey and Datasheet Repository of Publicly Available US Criminal Justice Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To raise awareness of publicly available criminal justice datasets and encourage their responsible use, we conduct a survey, consider contexts, highlight potential uses, and identify gaps and limitations. |
Miri Zilka; Bradley Butcher; Adrian Weller; |
182 | A Benchmark for Compositional Visual Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we introduce a novel visual reasoning benchmark, Compositional Visual Relations (CVR), to drive progress towards the development of more data-efficient learning algorithms. |
Aimen Zerroug; Mohit Vaishnav; Julien Colin; Sebastian Musslick; Thomas Serre; |
183 | FACT: Learning Governing Abstractions Behind Integer Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Integer sequences are of central importance to the modeling of concepts admitting complete finitary descriptions. We introduce a novel view on the learning of such concepts and lay down a set of benchmarking tasks aimed at conceptual understanding by machine learning models. |
Peter Belcak; Ard Kastrati; Flavio Schenker; Roger Wattenhofer; |
184 | XView3-SAR: Detecting Dark Fishing Activity Using Synthetic Aperture Radar Imagery Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the largest labeled dataset for training ML models to detect and characterize vessels and ocean structures in SAR imagery. |
Fernando Paolo; Tsu-ting Tim Lin; Ritwik Gupta; Bryce Goodman; Nirav Patel; Daniel Kuster; David Kroodsma; Jared Dunnmon; |
185 | BLOX: Macro Neural Architecture Search Benchmark and Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To provide a systematic study of the performance of NAS algorithms on a macro search space, we release Blox – a benchmark that consists of 91k unique models trained on the CIFAR-100 dataset. |
Thomas Chau; Łukasz Dudziak; Hongkai Wen; Nicholas Lane; Mohamed Abdelfattah; |
186 | ETAB: A Benchmark Suite for Visual Representation Learning in Echocardiography Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we design a suite of benchmarks that can be used to pre-train and evaluate echocardiographic representations with respect to various clinically-relevant tasks using publicly accessible data sets. |
Ahmed M. Alaa; Anthony Philippakis; David Sontag; |
187 | BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present BOND, a comprehensive benchmark for unsupervised node outlier detection on attributed static graphs. |
Kay Liu; Yingtong Dou; Yue Zhao; Xueying Ding; Xiyang Hu; Ruitong Zhang; Kaize Ding; Canyu Chen; Hao Peng; Kai Shu; Lichao Sun; Jundong Li; George H Chen; Zhihao Jia; Philip S Yu; |
188 | Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a bimanual dexterous manipulation benchmark according to literature from cognitive science for comprehensive reinforcement learning research. |
Yuanpei Chen; Tianhao Wu; Shengjie Wang; Xidong Feng; Jiechuan Jiang; Zongqing Lu; Stephen McAleer; Hao Dong; Song-Chun Zhu; Yaodong Yang; |
189 | APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The lack of APT datasets hinders the development and evaluation of video-based animal pose estimation and tracking methods, limiting the applications in real world, e.g., understanding animal behavior in wildlife conservation. To fill this gap, we make the first step and propose APT-36K, i.e., the first large-scale benchmark for animal pose estimation and tracking. |
Yuxiang Yang; Junjie Yang; Yufei Xu; Jing Zhang; Long Lan; Dacheng Tao; |
190 | JAHS-Bench-201: A Foundation For Research On Joint Architecture And Hyperparameter Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present JAHS-Bench-201, the first collection of surrogate benchmarks for Joint Architecture and Hyperparameter Search, built to also facilitate research on multi-objective, cost-aware and (multi) multi-fidelity optimization algorithms. |
Archit Bansal; Danny Stoll; Maciej Janowski; Arber Zela; Frank Hutter; |
191 | Forecasting Future World Events With Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a dataset for forecasting diverse future world events. |
Andy Zou; Tristan Xiao; Ryan Jia; Joe Kwon; Mantas Mazeika; Richard Li; Dawn Song; Jacob Steinhardt; Owain Evans; Dan Hendrycks; |
192 | CAESAR: An Embodied Simulator for Generating Multimodal Referring Expression Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As models can use complementary information from multimodal cues to recognize referring expressions, generating multimodal data from multiple views can help to develop robust models. To address these critical issues, in this paper, we present a novel embodied simulator, CAESAR, to generate multimodal referring expressions containing both verbal utterances and nonverbal cues captured from multiple views. |
Md Mofijul Islam; Reza Mirzaiee; Alexi Gladstone; Haley Green; Tariq Iqbal; |
193 | How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce two large-scale video datasets for predicting how videos would affect the emotional state and wellbeing of viewers. |
Mantas Mazeika; Eric Tang; Andy Zou; Steven Basart; Jun Shern Chan; Dawn Song; David Forsyth; Jacob Steinhardt; Dan Hendrycks; |
194 | MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: MineDojo is a new framework built on the Minecraft game for developing open-ended, generally capable embodied agents. |
Linxi Fan; Guanzhi Wang; Yunfan Jiang; Ajay Mandlekar; Yuncong Yang; Haoyi Zhu; Andrew Tang; De-An Huang; Yuke Zhu; Anima Anandkumar; |
195 | StrokeRehab: A Benchmark Dataset for Sub-second Action Identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new benchmark dataset for the identification of subtle and short-duration actions. We also propose a novel seq2seq approach, which outperforms the existing methods on the new as well as standard benchmark datasets. |
Aakash Kaku; Kangning Liu; Avinash Parnandi; Haresh Rengaraj Rajamohan; Kannan Venkataramanan; Anita Venkatesan; Audre Wirtanen; Natasha Pandit; Heidi Schambra; Carlos Fernandez-Granda; |
196 | TwiBot-22: Towards Graph-Based Twitter Bot Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We make the case for graph-based Twitter bot detection and propose a graph-based benchmark TwiBot-22, which addresses the issues of limited dataset scale, incomplete graph structure, and low annotation quality in previous datasets. |
Shangbin Feng; Zhaoxuan Tan; Herun Wan; Ningnan Wang; Zilong Chen; Binchi Zhang; Qinghua Zheng; Wenqian Zhang; Zhenyu Lei; Shujie Yang; Xinshun Feng; Qingyue Zhang; Hongrui Wang; Yuhan Liu; Yuyang Bai; Heng Wang; Zijian Cai; Yanbo Wang; Lijing Zheng; Zihan Ma; Jundong Li; Minnan Luo; |
197 | TAP-Vid: A Benchmark for Tracking Any Point in A Video Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we first formalize the problem, naming it tracking any point (TAP). We introduce a companion benchmark,TAP-Vid, which is composed of both real-world videos with accurate human annotations of point tracks, and synthetic videos with perfect ground-truth point tracks. |
Carl Doersch; Ankush Gupta; Larisa Markeeva; Adria Recasens; Lucas Smaira; Yusuf Aytar; Joao Carreira; Andrew Zisserman; Yi Yang; |
198 | Is One Annotation Enough? – A Data-centric Image Classification Benchmark for Noisy and Ambiguous Label Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The aggregation of such annotations to determine the label of an image leads to lower data quality. We propose a data-centric image classification benchmark with nine real-world datasets and multiple annotations per image to allow researchers to investigate and quantify the impact of such data quality issues. |
Lars Schmarje; Vasco Grossmann; Claudius Zelenka; Sabine Dippel; Rainer Kiko; Mariusz Oszust; Matti Pastell; Jenny Stracke; Anna Valros; Nina Volkmann; Reinhard Koch; |
199 | Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A large-scale Chinese cross-modal dataset, called Wukong, containing 100 million image-text pairs is released. Models with either global similarity or token-wise similarity are pre-trained and benchmarked on extensive downstream tasks. |
Jiaxi Gu; Xiaojun Meng; Guansong Lu; Lu Hou; Niu Minzhe; Xiaodan Liang; Lewei Yao; Runhui Huang; Wei Zhang; Xin Jiang; Chunjing XU; Hang Xu; |
200 | BackdoorBench: A Comprehensive Benchmark of Backdoor Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We also provide comprehensive evaluations of every pair of 8 attacks against 9 defenses, with 5 poisoning ratios, based on 5 models and 4 datasets, thus 8,000 pairs of evaluations in total. We present abundant analysis from different perspectives about these 8,000 evaluations, studying the effects of different factors in backdoor learning. |
Baoyuan Wu; Hongrui Chen; Mingda Zhang; Zihao Zhu; Shaokui Wei; Danni Yuan; Chao Shen; |
201 | M4Singer: A Multi-Style, Multi-Singer and Musical Score Provided Mandarin Singing Corpus Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The lack of publicly available high-quality and accurately labeled datasets has long been a major bottleneck for singing voice synthesis (SVS). To tackle this problem, we present M4Singer, a free-to-use Multi-style, Multi-singer Mandarin singing collection with elaborately annotated Musical scores as well as its benchmarks. |
Lichao Zhang; Ruiqi Li; Shoutong Wang; Liqun Deng; Jinglin Liu; Yi Ren; Jinzheng He; Rongjie Huang; Jieming Zhu; Xiao Chen; Zhou Zhao; |
202 | IKEA-Manual: Seeing Shape Assembly Step By Step Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We identify that this is due to 1) a lack of realistic 3D assembly objects that have paired manuals and 2) the difficulty of extracting structured information from purely image-based manuals. Motivated by this observation, we present IKEA-Manual, a dataset consisting of 102 IKEA objects paired with assembly manuals. |
Ruocheng Wang; Yunzhi Zhang; Jiayuan Mao; Ran Zhang; Chin-Yi Cheng; Jiajun Wu; |
203 | HandMeThat: Human-Robot Communication in Physical and Social Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: HandMeThat is a benchmark for evaluating instruction understanding and following in physical and social environments. |
Yanming Wan; Jiayuan Mao; Josh Tenenbaum; |
204 | How Well Do Unsupervised Learning Algorithms Model Human Real-time and Life-long Learning? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we establish benchmarks for both real-time and life-long continual visual learning. |
Chengxu Zhuang; Ziyu Xiang; Yoon Bai; Xiaoxuan Jia; Nicholas Turk-Browne; Kenneth Norman; James J DiCarlo; Dan Yamins; |
205 | CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents CLiMB, a benchmark to study the challenge of learning vision-language tasks in a continual learning setting, and to systematically evaluate how upstream continual learning can rapidly transfer to new multi- and unimodal tasks. |
Tejas Srinivasan; Ting-Yun Chang; Leticia Pinto Alva; Georgios Chochlakis; Mohammad Rostami; Jesse Thomason; |
206 | Long Range Graph Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the Long Range Graph Benchmark (LRGB) with 5 datasets that can be used for the development of models enabling long range dependencies in graphs, like Graph Transformers. |
Vijay Prakash Dwivedi; Ladislav Rampášek; Mikhail Galkin; Ali Parviz; Guy Wolf; Anh Tuan Luu; Dominique Beaini; |
207 | Wild-Time: A Benchmark of In-the-Wild Distribution Shift Over Time Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this gap, we curate Wild-Time, a benchmark of 7 datasets that reflect temporal distribution shifts arising in a variety of real-world applications, including drug discovery, patient prognosis, and news classification. On these datasets, we systematically benchmark 13 approaches with various inductive biases. |
Huaxiu Yao; Caroline Choi; Bochuan Cao; Yoonho Lee; Pang Wei Koh; Chelsea Finn; |
208 | ConfLab: A Data Collection Concept, Dataset, and Benchmark for Machine Analysis of Free-Standing Social Interactions in The Wild Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose ConfLab (Conference Living Lab) as a new concept for in-the-wild recording of real-life social human behavior, and provide a dataset from the first edition of ConfLab at a major international conference. |
Chirag Raman; Jose Vargas Quiros; Stephanie Tan; Ashraful Islam; Ekin Gedik; Hayley Hung; |
209 | Communicating Natural Programs to Humans and Machines Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We collect a dataset called LARC, consisting of natural language instructions used by end-users to instruct each other how to solve tasks from ARC (a notoriously difficult dataset for AI and program synthesis). |
Sam Acquaviva; Yewen Pu; Marta Kryven; Theodoros Sechopoulos; Catherine Wong; Gabrielle Ecanow; Maxwell Nye; Michael Tessler; Josh Tenenbaum; |
210 | USB: A Unified Semi-supervised Learning Benchmark for Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In addition, previous work typically trains deep neural networks from scratch, which is time-consuming and environmentally unfriendly. To address the above issues, we construct a Unified SSL Benchmark (USB) for classification by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which we systematically evaluate the dominant SSL methods, and also open-source a modular and extensible codebase for fair evaluation of these SSL methods. |
Yidong Wang; Hao Chen; Yue Fan; Wang SUN; Ran Tao; Wenxin Hou; Renjie Wang; Linyi Yang; Zhi Zhou; Lan-Zhe Guo; Heli Qi; Zhen Wu; Yu-Feng Li; Satoshi Nakamura; Wei Ye; Marios Savvides; Bhiksha Raj; Takahiro Shinozaki; Bernt Schiele; Jindong Wang; Xing Xie; Yue Zhang; |
211 | OpenSRH: Optimizing Brain Tumor Surgery Using Intraoperative Stimulated Raman Histology Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: OpenSRH is the first ever publicly available stimulated Raman histology (SRH) dataset and benchmark, which will facilitate the clinical translation of rapid optical imaging and real-time ML-based surgical decision support. |
Cheng Jiang; Asadur Chowdury; Xinhai Hou; Akhil Kondepudi; Christian Freudiger; Kyle Conway; Sandra Camelo-Piragua; Daniel Orringer; Honglak Lee; Todd Hollon; |
212 | Turning The Tables: Biased, Dynamic, Imbalanced Tabular Datasets for ML Research Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, tabular data — which is prevalent in many high-stakes domains — has been lagging behind. To bridge this gap, we present Bank Account Fraud (BAF), the first publicly available privacy-preserving, large-scale, realistic suite of tabular datasets. |
Sérgio Jesus; José Pombal; Duarte Alves; André Cruz; Pedro Saleiro; Rita Ribeiro; João Gama; Pedro Bizarro; |
213 | Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper thoroughly investigates the performance of 25 molecular design algorithms on 23 single-objective (scalar) optimization tasks with a particular focus on sample efficiency. |
Wenhao Gao; Tianfan Fu; Jimeng Sun; Connor Coley; |
214 | Video Compression Dataset and Benchmark of Learning-based Video-quality Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a new benchmark for video-quality metrics that evaluates video compression. |
Anastasia Antsiferova; Sergey Lavrushkin; Maksim Smirnov; Aleksandr Gushchin; Dmitriy Vatolin; Dmitriy Kulikov; |
215 | Flare7K: A Phenomenological Nighttime Flare Removal Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We design a new phenomenological synthetic flare dataset to help us remove the lens flare artifact at night. |
Yuekun Dai; Chongyi Li; Shangchen Zhou; Ruicheng Feng; Chen Change Loy; |
216 | GOOD: A Graph Out-of-Distribution Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we aim at developing an OOD benchmark, known as GOOD, for graphs specifically. |
Shurui Gui; Xiner Li; Limei Wang; Shuiwang Ji; |
217 | TaiSu: A 166M Large-scale High-Quality Dataset for Chinese Vision-Language Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We release a new large-scale dataset for Chinese vision-language pretraining. |
Yulong Liu; Guibo Zhu; Bin Zhu; Qi Song; Guojing Ge; Haoran Chen; GuanHui Qiao; Ru Peng; Lingxiang Wu; Jinqiao Wang; |
218 | VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: VLMbench is the first benchmark with compositional designs for vision-and-language reasoning, and it categorizes the manipulation tasks from the perspective of task constraints. |
Kaizhi Zheng; Xiaotong Chen; Odest Chadwicke Jenkins; Xin Wang; |
219 | ViSioNS: Visual Search in Natural Scenes Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper builds a benchmark for comparing state-of-the-art human visual search models on different datasets comprising eye movements in natural scenes, discussing their limitations and how their integration could lead to performance improvements. |
Fermín Travi; Gonzalo Ruarte; Gaston Bujia; Juan Esteban Kamienkowski; |
220 | WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce WinoGAViL: an online game to collect vision-and-language associations, used as a dynamic benchmark to evaluate state-of-the-art models. |
Yonatan Bitton; Nitzan Bitton Guetta; Ron Yosef; Yuval Elovici; Mohit Bansal; Gabriel Stanovsky; Roy Schwartz; |
221 | Multi-LexSum: Real-world Summaries of Civil Rights Lawsuits at Multiple Granularities Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Multi-LexSum is a multi-document summarization dataset for civil rights lawsuits with summaries at three granularities. |
Zejiang Shen; Kyle Lo; Lauren Yu; Nathan Dahlberg; Margo Schlanger; Doug Downey; |
222 | OpenFilter: A Framework to Democratize Research Access to Social Media AR Filters Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents two datasets of beautified faces — FairBeauty and B-LFW — and insights obtained through experiments; the datasets were created using a custom framework (OpenFilter). |
Piera Riccio; Bill Psomas; Francesco Galati; Francisco Escolano; Thomas Hofmann; Nuria Oliver; |
223 | Towards Video Text Visual Question Answering: Benchmark and Baseline Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a new task named Video Text Visual Question Answering (ViteVQA in short) that aims at answering questions by reasoning over textual and visual information spatiotemporally in a given video. |
Minyi Zhao; Bingjia Li; Jie Wang; Wanqing Li; Wenjing Zhou; Lan Zhang; Shijie Xuyang; Zhihang Yu; Xinkun Yu; Guangze Li; Aobotao Dai; Shuigeng Zhou; |
224 | Model Zoos: A Dataset of Diverse Populations of Neural Network Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To enable the investigation of populations of neural network models, we release a novel dataset of diverse model zoos with this work. |
Konstantin Schürholt; Diyar Taskiran; Boris Knyazev; Xavier Giro-i-Nieto; Damian Borth; |
225 | Active-Passive SimStereo – Benchmarking The Cross-Generalization Capabilities of Deep Learning-based Stereo Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the first dataset of active+passive stereo images to evaluate the generalisation ability of stereo deep learning models. |
Laurent Jospin; Allen Antony; Lian Xu; Hamid Laga; Farid Boussaid; Mohammed Bennamoun; |
226 | ENS-10: A Dataset For Post-Processing Ensemble Weather Forecasts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a dataset containing ten ensemble members over 20 years for post-processing ensemble weather forecasts. |
Saleh Ashkboos; Langwen Huang; Nikoli Dryden; Tal Ben-Nun; Peter Dueben; Lukas Gianinazzi; Luca Kummer; Torsten Hoefler; |
227 | NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide a benchmark for neural architecture search on a diverse set of understudied tasks. |
Renbo Tu; Nicholas Roberts; Misha Khodak; Junhong Shen; Frederic Sala; Ameet Talwalkar; |
228 | Beyond Real-world Benchmark Datasets: An Empirical Study of Node Classification with GNNs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We empirically study the performance of GNNs with various synthetic graphs by synthetically changing one or a few target characteristic(s) of graphs while keeping other characteristics fixed. |
Seiji Maekawa; Koki Noda; Yuya Sasaki; makoto onizuka; |
229 | FlyView: A Bio-inspired Optical Flow Truth Dataset for Visual Navigation Using Panoramic Stereo Vision Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we introduce FlyView, a novel bio-inspired truth dataset for visual navigation. |
Alix Leroy; Graham Taylor; |
230 | Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Though work to evaluate language model harms is under way, translating foresight about which harms may arise into rigorous benchmarks is not straightforward. To facilitate this translation, we outline six ways of characterizing harmful text which merit explicit consideration when designing new benchmarks. |
Maribeth Rauh; John Mellor; Jonathan Uesato; Po-Sen Huang; Johannes Welbl; Laura Weidinger; Sumanth Dathathri; Amelia Glaese; Geoffrey Irving; Iason Gabriel; William Isaac; Lisa Anne Hendricks; |
231 | Pythae: Unifying Generative Autoencoders in Python – A Benchmarking Use Case Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present Pythae, a versatile Python library providing both a unified implementation and a dedicated framework allowing straightforward, reproducible, and reliable use of generative autoencoder models. |
Clément Chadebec; Louis Vincent; Stephanie Allassonniere; |
232 | Robustness Analysis of Video-Language Models Against Visual and Language Perturbations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we perform the first extensive robustness study of video-language models against various real-world perturbations. |
Madeline Chantry; Shruti Vyas; Hamid Palangi; Yogesh Rawat; Vibhav Vineet; |
233 | A New Dataset for Multilingual Keyphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While there are many recent papers on English keyphrase generation, keyphrase generation for other languages remains vastly understudied, mostly due to the absence of datasets. To address this, we present a novel dataset called Papyrus, composed of 16427 pairs of abstracts and keyphrases. |
Frédéric Piedboeuf; Philippe Langlais; |
234 | Benchmarking Heterogeneous Treatment Effect Models Through The Lens of Interpretability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We construct a benchmarking environment that allows us to empirically investigate the ability of personalized treatment effect models to identify predictive covariates. |
Jonathan Crabbé; Alicia Curth; Ioana Bica; Mihaela van der Schaar; |
235 | A Dataset for Efforts Towards Achieving The Sustainable Development Goal of Safe Working Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Consequently, we introduce a new dataset called the Labour Inspection Checklists Dataset (LICD), which we have made publicly available. |
Eirik Lund Flogard; Ole Jakob Mengshoel; |
236 | EgoTaskQA: Understanding Human Tasks in Egocentric Videos Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the EgoTaskQA benchmark that targets action dependencies, post-effects, agents’ intents and goals, as well as multi-agent belief modeling in egocentric goal-oriented videos. |
Baoxiong Jia; Ting Lei; Song-Chun Zhu; Siyuan Huang; |
237 | Ontologue: Declarative Benchmark Construction for Ontological Multi-Label Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Ontologue is a toolkit for ontological multi-label classification dataset construction from DBPedia. This toolkit allows users to control contextual, distributional, and structured properties and create customized datasets. |
Sean Yang; Bernease Herman; Bill Howe; |
238 | BigBio: A Framework for Data-Centric Biomedical Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: BigBio is a community library of 126+ biomedical NLP datasets, covering 13 tasks and 10 languages. |
Jason Fries; Leon Weber; Natasha Seelam; Gabriel Altay; Debajyoti Datta; Samuele Garda; Sunny Kang; Rosaline Su; Wojciech Kusa; Samuel Cahyawijaya; Fabio Barth; Simon Ott; Matthias Samwald; Stephen Bach; Stella Biderman; Mario Sänger; Bo Wang; Alison Callahan; Daniel León Periñán; Théo Gigant; Patrick Haller; Jenny Chim; Jose Posada; John Giorgi; Karthik Rangasai Sivaraman; Marc Pàmies; Marianna Nezhurina; Robert Martin; Michael Cullan; Moritz Freidank; Nathan Dahlberg; Shubhanshu Mishra; Shamik Bose; Nicholas Broad; Yanis Labrak; Shlok Deshmukh; Sid Kiblawi; Ayush Singh; Minh Chien Vu; Trishala Neeraj; Jonas Golde; Albert Villanova del Moral; Benjamin Beilharz; |
239 | Honor of Kings Arena: An Environment for Generalization in Competitive Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces Honor of Kings Arena, a reinforcement learning (RL) environment based on Honor of Kings, one of the world’s most popular games at present. |
Hua Wei; Jingxiao Chen; Xiyang Ji; Hongyang Qin; Minwen Deng; Siqin Li; Liang Wang; Weinan Zhang; Yong Yu; Liu Linc; Lanxiao Huang; Deheng Ye; Qiang Fu; Wei Yang; |
240 | SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the first unified platform SafeBench to effectively and efficiently evaluate autonomous driving algorithms against different types of safety-critical testing scenarios. |
Chejian Xu; Wenhao Ding; Weijie Lyu; ZUXIN LIU; Shuai Wang; Yihan He; Hanjiang Hu; DING ZHAO; Bo Li; |
241 | AnoShift: A Distribution Shift Benchmark for Unsupervised Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The existing benchmarks are focused on supervised learning, and to the best of our knowledge, there is none for unsupervised learning. Therefore, we introduce an unsupervised anomaly detection benchmark with data that shifts over time, built over Kyoto-2006+, a traffic dataset for network intrusion detection. |
Marius Dragoi; Elena Burceanu; Emanuela Haller; Andrei Manolache; Florin Brad; |
242 | Addressing Resource Scarcity Across Sign Languages with Multilingual Pretraining and Unified-Vocabulary Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We release the largest available pretraining dataset for sign language across multiple languages and show how multilingual fine-tuning using a unified vocabulary helps achieve SOTA results. |
Gokul NC; Manideep Ladi; Sumit Negi; Prem Selvaraj; Pratyush Kumar; Mitesh Khapra; |
243 | OpenXAI: Towards A Transparent Evaluation of Model Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce OpenXAI, a flexible and comprehensive open source ecosystem for evaluating, comparing, and benchmarking SOTA as well as any newly proposed explanation methods. |
Chirag Agarwal; Satyapriya Krishna; Eshika Saxena; Martin Pawelczyk; Nari Johnson; Isha Puri; Marinka Zitnik; Himabindu Lakkaraju; |
244 | MRI: Multi-modal 3D Human Pose Estimation Dataset Using MmWave, RGB-D, and Inertial Sensors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: mRI is a large-scale multi-modal human pose estimation dataset focusing on rehab movements, supporting human pose estimation and human activity recognition tasks. |
Sizhe An; Yin Li; Umit Ogras; |
245 | Change Event Dataset for Discovery from Spatio-temporal Remote Sensing Imagery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, finding such interesting and meaningful change events from the vast data is challenging. In this paper, we present new datasets for such change events that include semantically meaningful events like road construction. |
Utkarsh Mall; Bharath Hariharan; Kavita Bala; |
246 | PROSPECT: Labeled Tandem Mass Spectrometry Dataset for Machine Learning in Proteomics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The paper introduces a labeled tandem Mass Spectrometry dataset for machine learning in proteomics and recommends evaluation metrics. |
Omar Shouman; Wassim Gabriel; Victor-George Giurcoiu; Vitor Sternlicht; Mathias Wilhelm; |
247 | Tenrec: A Large-scale Multipurpose Benchmark Dataset for Recommender Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we describe Tenrec, a novel and publicly available data collection for RS that records various user feedback from four different recommendation scenarios. |
Guanghu Yuan; Fajie Yuan; Yudong Li; Beibei Kong; Shujie Li; Lei Chen; Min Yang; Chenyun Yu; Bo Hu; Zang Li; Yu Xu; Xiaohu Qie; |
248 | DC-BENCH: Dataset Condensation Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work provides the first large-scale standardized benchmark on Dataset Condensation. |
Justin Cui; Ruochen Wang; Si Si; Cho-Jui Hsieh; |
249 | DABS 2.0: Improved Datasets and Algorithms for Universal Self-Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We extend the DABS benchmark, presenting improved datasets and algorithms for universal self-supervision. |
Alex Tamkin; Gaurab Banerjee; Mohamed Owda; Vincent Liu; Shashank Rammoorthy; Noah Goodman; |
250 | The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We demonstrate PPO’s effectiveness in popular multi-agent benchmarks and analyze its properties and implementation details through empirical studies. |
Chao Yu; Akash Velu; Eugene Vinitsky; Jiaxuan Gao; Yu Wang; Alexandre Bayen; Yi Wu; |
251 | GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the first multi-year mobile sensing datasets, covering over 700 users, to support the ML community in developing generalizable longitudinal behavior modeling algorithms. |
Xuhai Xu; Han Zhang; Yasaman Sefidgar; Yiyi Ren; Xin Liu; Woosuk Seo; Jennifer Brown; Kevin Kuehn; Mike Merrill; Paula Nurius; Shwetak Patel; Tim Althoff; Margaret Morris; Eve Riskin; Jennifer Mankoff; Anind Dey; |
252 | ComMU: Dataset for Combinatorial Music Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose ComMU, a dataset for generating diverse and high-quality music with rich musical metadata. |
Lee Hyun; Taehyun Kim; Hyolim Kang; Minjoo Ki; Hyeonchan Hwang; Kwanho Park; Sharang Han; Seon Joo Kim; |
253 | SCAMPS: Synthetics for Camera Measurement of Physiological Signals Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: SCAMPS is a dataset of high-fidelity synthetics containing 2,800 videos (1.68M frames) of avatars with aligned cardiac and respiratory signals and facial action intensities. |
Daniel McDuff; Miah Wander; Xin Liu; Brian Hill; Javier Hernandez; Jonathan Lester; Tadas Baltrusaitis; |
254 | Enabling Detailed Action Recognition Evaluation Through Video Dataset Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the Human-centric Analysis Toolkit (HAT), which enables evaluation of learned background bias without the need for new manual video annotation. |
Jihoon Chung; Yu Wu; Olga Russakovsky; |
255 | CGLB: Benchmark Tasks for Continual Graph Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we systematically study the task configurations in different application scenarios and develop a comprehensive Continual Graph Learning Benchmark (CGLB) curated from different public datasets. |
Xikun Zhang; Dongjin Song; Dacheng Tao; |
256 | Towards Better Evaluation for Dynamic Link Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose tools to improve the evaluation of dynamic link prediction, including new datasets, new negative sampling strategies, and a strong baseline. |
Farimah Poursafaei; Shenyang Huang; Kellin Pelrine; Reihaneh Rabbany; |
257 | AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use open source 3D movies to make a new 2D animation dataset with ground truth optical flow and segment-wise correspondence labels. |
Li Siyao; Yuhang Li; Bo Li; Chao Dong; Ziwei Liu; Chen Change Loy; |
258 | FLAIR: Federated Learning Annotated Image Repository Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper describes the FLAIR dataset that we are releasing later this month to accelerate research in Federated Learning. This is a large, heterogeneous image dataset, with images grouped by Flickr users and annotated by humans. |
Congzheng Song; Filip Granqvist; Kunal Talwar; |
259 | LAION-5B: An Open Large-scale Dataset for Training Next Generation Image-text Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present LAION-5B, an open, publicly available dataset of 5.8B image-text pairs, and validate it by reproducing results of training state-of-the-art CLIP models at different scales. |
Christoph Schuhmann; Romain Beaumont; Richard Vencu; Cade Gordon; Ross Wightman; Mehdi Cherti; Theo Coombes; Aarush Katta; Clayton Mullis; Mitchell Wortsman; Patrick Schramowski; Srivatsa Kundurthy; Katherine Crowson; Ludwig Schmidt; Robert Kaczmarczyk; Jenia Jitsev; |
260 | OpenOOD: Benchmarking Generalized Out-of-Distribution Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We build an open-source codebase called OpenOOD to support and compare 30+ methods for OOD detection and beyond. |
Jingkang Yang; Pengyun Wang; Dejian Zou; Zitang Zhou; Kunyuan Ding; Wenxuan Peng; Haoqi Wang; Guangyao Chen; Bo Li; Yiyou Sun; Xuefeng Du; Kaiyang Zhou; Wayne Zhang; Dan Hendrycks; Yixuan Li; Ziwei Liu; |
261 | Nocturne: A Scalable Driving Benchmark for Bringing Multi-agent Learning One Step Closer to The Real World Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a fast, data-driven simulator for studying multi-agent partially observed coordination in human driving. |
Eugene Vinitsky; Nathan Lichtlé; Xiaomeng Yang; Brandon Amos; Jakob Foerster; |
262 | PDEBench: An Extensive Benchmark for Scientific Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide a benchmark for Scientific Machine Learning. |
Makoto Takamoto; Timothy Praditia; Raphael Leiteritz; Daniel MacKinlay; Francesco Alesiani; Dirk Pflüger; Mathias Niepert; |
263 | ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: ELEVATER provides the first public platform and toolkit to evaluate vision foundation models on large-scale task-level visual transfer, covering 20 image classification tasks and 35 object detection tasks. |
Chunyuan Li; Haotian Liu; Liunian Li; Pengchuan Zhang; Jyoti Aneja; Jianwei Yang; Ping Jin; Houdong Hu; Zicheng Liu; Yong Jae Lee; Jianfeng Gao; |
264 | Open High-Resolution Satellite Imagery: The WorldStrat Dataset – With Application to Super-Resolution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Analyzing the planet at scale with satellite imagery and machine learning is a dream that has been constantly hindered by the cost of difficult-to-access, highly-representative, high-resolution imagery. To remediate this, we introduce here the WorldStrat dataset. |
Julien Cornebise; Ivan Oršolić; Freddie Kalaitzis; |
265 | EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we aim to address a common bottleneck in the RL training system, i.e., parallel environment execution, which is often the slowest part of the whole system but receives little attention. |
Jiayi Weng; Min Lin; Shengyi Huang; Bo Liu; Denys Makoviichuk; Viktor Makoviychuk; Zichen Liu; Yufan Song; Ting Luo; Yukun Jiang; Zhongwen Xu; Shuicheng Yan; |
266 | Dungeons and Data: A Large-Scale NetHack Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce and evaluate a new large-scale dataset for the game of NetHack, including 10 billion transitions from humans, 3 billion from a symbolic bot, and code for researchers to record and load their own trajectories. |
Eric Hambro; Roberta Raileanu; Danielle Rothermel; Vegard Mella; Tim Rocktäschel; Heinrich Küttler; Naila Murray; |
267 | EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a new text-to-SQL dataset for electronic health records (EHRs), where the utterances are collected from 222 hospital staff, including physicians, nurses, and insurance review and health records teams, through a poll conducted at a university hospital. |
Gyubok Lee; Hyeonji Hwang; Seongsu Bae; Yeonsu Kwon; Woncheol Shin; Seongjun Yang; Minjoon Seo; Jong-Yeup Kim; Edward Choi; |
268 | TempEL: Linking Dynamically Evolving and Newly Emerging Entities Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study how this evolutionary scenario impacts the performance on a well established entity linking (EL) task. For that study, we introduce TempEL, an entity linking dataset that consists of time-stratified English Wikipedia snapshots from 2013 to 2022, from which we collect both anchor mentions of entities, and these target entities’ descriptions. |
Klim Zaporojets; Lucie-Aimée Kaffee; Johannes Deleu; Thomas Demeester; Chris Develder; Isabelle Augenstein; |
269 | FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an openly accessible FinRL-Meta library that has been actively maintained by the FinRL community. |
Xiao-Yang Liu; Ziyi Xia; Jingyang Rui; Jiechao Gao; Hongyang Yang; Ming Zhu; Christina Wang; Zhaoran Wang; Jian Guo; |
270 | MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We release a dataset of experts and their rollouts for tracking 3.5 hours of MoCap data in dm_control. |
Nolan Wagener; Andrey Kolobov; Felipe Vieira Frujeri; Ricky Loynd; Ching-An Cheng; Matthew Hausknecht; |
271 | OccGen: Selection of Real-world Multilingual Parallel Data Balanced in Gender Within Occupations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present the OccGen toolkit that builds multilingual parallel data sets balanced in gender within occupations. The toolkit is released together with two datasets in four high-resource languages and in a low-resource language (with English). |
Marta Costa-jussà; Christine Basta; Oriol Domingo; André Rubungo; |
272 | The Dollar Street Dataset: Images Representing The Geographic and Socioeconomic Diversity of The World Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Dollar Street, a supervised dataset that contains 38,479 images of everyday household items from homes around the world, including tags for objects and demographic data such as region, country and home monthly income. |
William Gaviria Rojas; Sudnya Diamos; Keertan Kini; David Kanter; Vijay Janapa Reddi; Cody Coleman; |
273 | NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: NeoRL presents conservative datasets for offline RL, highlights the complete pipeline for deploying offline RL in real-world applications, and also benchmarks recent offline RL algorithms on NeoRL under the complete pipeline. |
Rong-Jun Qin; Xingyuan Zhang; Songyi Gao; Xiong-Hui Chen; Zewen Li; Weinan Zhang; Yang Yu; |
274 | PeRFception: Perception Using Radiance Fields Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose PeRFception, a new unified radiance field dataset for 2D image classification, 3D shape classification, and 3D semantic segmentation. |
Yoonwoo Jeong; Seungjoo Shin; Junha Lee; Chris Choy; Anima Anandkumar; Minsu Cho; Jaesik Park; |
275 | PyKT: A Python Library to Benchmark Deep Learning Based Knowledge Tracing Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a comprehensive Python-based benchmark platform, pyKT, to guarantee valid comparisons across deep learning based knowledge tracing methods via thorough evaluations. |
Zitao Liu; Qiongqiong Liu; Jiahao Chen; Shuyan Huang; Jiliang Tang; Weiqi Luo; |
276 | TweetNERD – End to End Entity Linking Benchmark for Tweets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: TweetNERD is a dataset of 340K+ Tweets for benchmarking Named Entity Recognition and Disambiguation systems on English Tweets. |
Shubhanshu Mishra; Aman Saini; Raheleh Makki; Sneha Mehta; Aria Haghighi; Ali Mollahosseini; |
277 | Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Meta-Album, an image classification meta-dataset designed to facilitate few-shot learning, transfer learning, and meta-learning, among other tasks. |
Ihsan Ullah; Dustin Carrión-Ojeda; Sergio Escalera; Isabelle Guyon; Mike Huisman; Felix Mohr; Jan N. van Rijn; Haozhe Sun; Joaquin Vanschoren; Phan Anh Vu; |
278 | OpenFWI: Large-scale Multi-structural Benchmark Datasets for Full Waveform Inversion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an open-source platform for Full Waveform Inversion with twelve datasets and benchmarks on four deep learning methods. |
Chengyuan Deng; Shihang Feng; Hanchen Wang; Xitong Zhang; Peng Jin; Yinan Feng; Qili Zeng; Yinpeng Chen; Youzuo Lin; |
279 | OLIVES Dataset: Ophthalmic Labels for Investigating Visual Eye Semantics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a first-of-its-kind dataset that combines clinical labels, biomarkers, fundus images, and OCT scans for disease prediction, treatment analysis, and biomarker detection. |
Mohit Prabhushankar; Kiran Kokilepersaud; Yash-yee Logan; Stephanie Trejo Corona; Ghassan AlRegib; Charles Wykoff; |
280 | MTNeuro: A Benchmark for Evaluating Representations of Brain Structure Across Multiple Levels of Abstraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to readout information about brain structure and architecture from the same image. |
Jorge Quesada; Lakshmi Sathidevi; Ran Liu; Nauman Ahad; Joy Jackson; Mehdi Azabou; Jingyun Xiao; Christopher Liding; Matthew Jin; Carolina Urzay; William Gray-Roncal; Erik Johnson; Eva Dyer; |
281 | K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce KAIST-Radar (K-Radar), a novel large-scale object detection dataset and benchmark that contains 35K frames of 4D Radar tensor (4DRT) data with power measurements along the Doppler, range, azimuth, and elevation dimensions, together with carefully annotated 3D bounding box labels of objects on the roads. |
Dong-Hee Paek; Seung-Hyun Kong; Kevin Tirta Wijaya; |
282 | Multilingual Abusive Comment Detection at Scale for Indic Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To facilitate and encourage research in this important direction, we contribute for the first time MACD – a large-scale (150K), human-annotated, multilingual (5 languages), balanced (49% abusive content) and diverse (70K users) abuse detection dataset of user comments, sourced from a popular social media platform – ShareChat. |
Vikram Gupta; Sumegh Roychowdhury; Mithun Das; Somnath Banerjee; Punyajoy Saha; Binny Mathew; Hastagiri Prakash Vanchinathan; Animesh Mukherjee; |
283 | Geoclidean: Few-Shot Generalization in Euclidean Geometry Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Geoclidean, a domain-specific language for Euclidean geometry, and use it to generate two datasets of geometric concept learning tasks for benchmarking generalization judgements of humans and machines. |
Joy Hsu; Jiajun Wu; Noah Goodman; |
284 | HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We construct and analyze a large-scale longitudinal dataset of commercial ML API predictions. |
Lingjiao Chen; Zhihua Jin; Evan Sabri Eyuboglu; Christopher Ré; Matei Zaharia; James Zou; |
285 | DART: Articulated Hand Model with Diverse Accessories and Rich Textures Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present DART, which extends MANO with diverse accessories and rich textures, and synthesize a large-scale (800K) hand dataset. |
Daiheng Gao; Yuliang Xiu; Kailin Li; Lixin Yang; Feng Wang; Peng Zhang; Bang Zhang; Cewu Lu; Ping Tan; |
286 | DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents DGraph, a real-world dynamic graph in the finance domain. |
Xuanwen Huang; Yang Yang; Yang Wang; Chunping Wang; Zhisheng Zhang; Jiarong Xu; Lei Chen; Michalis Vazirgiannis; |
287 | PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: PulseImpute is the first mHealth pulsative signal imputation challenge which includes realistic missingness models, clinical downstream tasks, and an extensive set of baselines, including an augmented transformer that achieves SOTA performance. |
Maxwell Xu; Alexander Moreno; Supriya Nagesh; Varol Aydemir; David Wetter; Santosh Kumar; James Rehg; |
288 | On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with No Catastrophic Forgetting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We describe and exploit connections between two distinct paradigms for expressing preferences over outputs of language models: reward maximization and distribution matching. |
Tomasz Korbak; Hady Elsahar; Germán Kruszewski; Marc Dymetman; |
289 | On The Strong Correlation Between Model Invariance and Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Building on this qualitative implication we make two contributions. First, we introduce effective invariance (EI), a simple and reasonable measure of model invariance which does not rely on image labels. Given predictions on a test image and its transformed version, EI measures how well the predictions agree and with what level of confidence. Second, using invariance scores computed by EI, we perform large-scale quantitative correlation studies between generalization and invariance, focusing on rotation and grayscale transformations. |
Weijian Deng; Stephen Gould; Liang Zheng; |
290 | Adaptive Interest for Emphatic Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a way to automatically learn the interest function of emphatic algorithms and verify our approach on a wide range of environments. |
Martin Klissarov; Rasool Fakoor; Jonas Mueller; Kavosh Asadi; Taesup Kim; Alexander Smola; |
291 | Hilbert Distillation for Cross-Dimensionality Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel Hilbert curve-based cross-dimensionality distillation approach that facilitates the knowledge of 3D networks to improve the performance of 2D networks. |
Dian Qin; Haishuai Wang; Zhe Liu; Hongjia Xu; Sheng Zhou; Jiajun Bu; |
292 | Distributionally Adaptive Meta Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop a framework for meta-RL algorithms that are able to behave appropriately under test-time distribution shifts in the space of tasks. |
Anurag Ajay; Dibya Ghosh; Sergey Levine; Pulkit Agrawal; Abhishek Gupta; |
293 | Simplified Graph Convolution with Heterophily Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a simple, non-deep method for graph convolution which can handle both homophilous and heterophilous graphs. |
Sudhanshu Chanpuriya; Cameron Musco; |
294 | Online Allocation and Learning in The Presence of Strategic Agents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of sequentially allocating items to n potentially strategic agents with unknown prior on their value distribution. |
Steven Yin; Shipra Agrawal; Assaf Zeevi; |
295 | Accelerating Certified Robustness Training Via Knowledge Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Certified Robustness Transfer (CRT), a general-purpose framework for reducing the computational overhead of any certifiably robust training method through knowledge transfer. |
Pratik Vaishnavi; Kevin Eykholt; Amir Rahmati; |
296 | Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that one can beat the exponential computation-statistical gap for worst-case function classes in smooth online learning when one considers generalized linear function classes. |
Adam Block; Max Simchowitz; |
297 | ShapeCrafter: A Recursive Text-Conditioned 3D Shape Generation Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We recursively generate 3D shape distributions from progressively evolving phrase sequences. |
Rao Fu; Xiao Zhan; YIWEN CHEN; Daniel Ritchie; Srinath Sridhar; |
298 | Trajectory Inference Via Mean-field Langevin in Path Space Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The estimator for trajectory inference that minimizes the entropy relative to Wiener measure can be computed with a Langevin dynamics in path space (convergence guaranteed). |
Stephen Zhang; Lénaïc Chizat; Matthieu Heitz; Geoffrey Schiebinger; |
299 | Beyond Black Box Densities: Parameter Learning for The Deviated Components Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the "deviating mixture model" and study its theoretical properties. |
Dat Do; Nhat Ho; XuanLong Nguyen; |
300 | Boosting Barely Robust Learners: A New Perspective on Adversarial Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an oracle-efficient algorithm for boosting robustness to adversarial examples. |
Avrim Blum; Omar Montasser; Greg Shakhnarovich; Hongyang Zhang; |
301 | Optimal Efficiency-Envy Trade-Off Via Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We use tools from Optimal Transport to achieve optimal trade-off between efficiency and envy in resource allocation problems. |
Steven Yin; Christian Kroer; |
302 | One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a method named Free-Form CLIP (FFCLIP), aiming to establish an automatic latent mapping so that one manipulation model handles free-form text prompts. |
Yiming Zhu; Hongyu Liu; Yibing Song; Ziyang Yuan; Xintong Han; Chun Yuan; Qifeng Chen; Jue Wang; |
303 | (De-)Randomized Smoothing for Decision Stump Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a (De-)Randomized Smoothing approach for decision stump ensembles, which i) significantly improves SOTA certified Lp-norm robustness for tree-based models and ii) enables joint certificates of numerical & categorical perturbations. |
Miklós Horváth; Mark Müller; Marc Fischer; Martin Vechev; |
304 | Generative Multitask Learning Mitigates Target-causing Confounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use ideas from causality to develop an inference objective for MTL that improves robustness to target shift. |
Taro Makino; Krzysztof Geras; Kyunghyun Cho; |
305 | IM-Loss: Information Maximization Loss for Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the forward-passing $0/1$ spike quantization will cause information loss and accuracy degradation. To deal with this problem, we propose the Information Maximization Loss (IM-Loss), which aims at maximizing the information flow in the SNN. |
Yufei Guo; Yuanpei Chen; Liwen Zhang; Xiaode Liu; Yinglei Wang; Xuhui Huang; Zhe Ma; |
306 | Low-Rank Modular Reinforcement Learning Via Muscle Synergy Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Motivated by the way the human central nervous system controls numerous muscles, we propose a Synergy-Oriented LeARning (SOLAR) framework that exploits the redundant nature of DoF in robot control. |
Heng Dong; Tonghan Wang; Chongjie Zhang; |
307 | A Differentially Private Linear-Time FPTAS for The Minimum Enclosing Ball Problem Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we give the first differentially private (DP) FPTAS for the Minimum Enclosing Ball problem, improving both on the runtime and the utility bound of the best known DP-PTAS for the problem, due to Ghazi et al. (2020). |
Bar Mahpud; Or Sheffet; |
308 | Optimal Query Complexities for Dynamic Trace Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give tight bounds for implicit trace estimation in a dynamic setting. |
David Woodruff; Fred Zhang; Richard Zhang; |
309 | GALOIS: Boosting Deep Reinforcement Learning Via Generalizable Logic Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on that, GALOIS proposes a sketch-based program synthesis method to automatically generate white-box programs with generalizable and interpretable cause-effect logic. |
Yushi Cao; Zhiming Li; Tianpei Yang; Hao Zhang; Yan Zheng; Yi Li; Jianye Hao; Yang Liu; |
310 | Near-Optimal Sample Complexity Bounds for Constrained MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide minimax sample-complexity bounds for learning near-optimal policies for discounted constrained Markov decision processes (CMDPs) with access to a simulator. |
Sharan Vaswani; Lin Yang; Csaba Szepesvari; |
311 | ReFactorGNNs: Revisiting Factorisation-based Models from A Message-Passing Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose ReFactor GNNs inspired by revisiting FMs from the perspective of message-passing. |
Yihong Chen; Pushkar Mishra; Luca Franceschi; Pasquale Minervini; Pontus Lars Erik Saito Stenetorp; Sebastian Riedel; |
312 | When Adversarial Training Meets Vision Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper investigates training techniques and leverages the unique architecture of Vision Transformers to improve their adversarial robustness. |
Yichuan Mo; Dongxian Wu; Yifei Wang; Yiwen Guo; Yisen Wang; |
313 | Interventions, Where and How? Bayesian Active Causal Discovery at Scale Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we incorporate recent advances in Bayesian causal discovery into the Bayesian optimal experimental design framework, which allows for active causal discovery of nonlinear, large SCMs, while selecting both the target and the value to intervene with. |
Panagiotis Tigas; Yashas Annadani; Andrew Jesson; Bernhard Schölkopf; Yarin Gal; Stefan Bauer; |
314 | Nonlinear MCMC for Bayesian Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide a convergence guarantee in total variation that uses novel results for long-time convergence and large-particle (“propagation of chaos”) convergence. We apply this nonlinear MCMC technique to sampling problems including a Bayesian neural network on CIFAR10. |
James Vuckovic; |
315 | Robust Neural Posterior Estimation and Statistical Model Criticism Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a remedy we argue that principled scientific inquiry with simulators should incorporate a model criticism component, to facilitate interpretable identification of misspecification and a robust inference component, to fit "wrong but useful" models. We propose robust neural posterior estimation (RNPE), an extension of NPE to simultaneously achieve both these aims, through explicitly modelling the discrepancies between simulations and the observed data. |
Daniel Ward; Patrick Cannon; Mark Beaumont; Matteo Fasiolo; Sebastian Schmon; |
316 | Multi-layer State Evolution Under Random Convolutional Design Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how to handle convolutional matrices with Approximate Message Passing. |
Max Daniels; Cedric Gerbelot; Florent Krzakala; Lenka Zdeborová; |
317 | Unsupervised Object Representation Learning Using Translation and Rotation Group Equivariant VAE Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a translation and rotation group equivariant variational autoencoder by performing direct inference on these transformations. |
Alireza Nasiri; Tristan Bepler; |
318 | Estimation of Entropy in Constant Space with Improved Sample Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we give a new constant memory scheme that reduces the sample complexity to $(k/\epsilon^2)\cdot \text{polylog}(1/\epsilon)$. |
Maryam Aliakbarpour; Andrew McGregor; Jelani Nelson; Erik Waingarten; |
319 | Grounding Aleatoric Uncertainty in Unsupervised Environment Design Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We characterize how curriculum learning can induce suboptimal reinforcement learning policies with respect to a ground-truth distribution of environments, and propose a method for correcting this effect. |
Minqi Jiang; Michael Dennis; Jack Parker-Holder; Andrei Lupu; Heinrich Küttler; Edward Grefenstette; Tim Rocktäschel; Jakob Foerster; |
320 | A Deep Learning Toolbox for Stochastic Stabilized Supralinear Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a method to train biologically realistic stabilized supralinear networks in a stable manner. |
Wayne Soo; Mate Lengyel; |
321 | Batch Bayesian Optimization on Permutations Using The Acquisition Weighted Kernel Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we propose a batch Bayesian optimization method for combinatorial problems on permutations, which is well suited for expensive-to-evaluate objectives. |
Changyong Oh; Roberto Bondesan; Efstratios Gavves; Max Welling; |
322 | Aligning Individual Brains with Fused Unbalanced Gromov Wasserstein Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive a new unbalanced optimal transport loss to align individual human brains using fMRI data while preserving their anatomical topology. |
Alexis Thual; Quang Huy Tran; Tatiana Zemskova; Nicolas Courty; Rémi Flamary; Stanislas Dehaene; Bertrand Thirion; |
323 | [Re] Does Self-Supervision Always Improve Few-Shot Learning? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Scope of Reproducibility: This report covers our reproduction and extension of the paper ‘When Does Self-Supervision Improve Few-shot Learning?’ |
Arjun Ashok; Haswanth Aekula; |
324 | Sequential Latent Variable Models for Multiagent Trajectories Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a semi-supervised generative framework for modeling and annotating trajectories of multiple agents. |
Dennis Fassmeyer; Pascal Fassmeyer; Ulf Brefeld; |
325 | Bayesian Inference Via Sparse Hamiltonian Flows Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper shows how to (1) construct Bayesian coresets simply and tractably using variational flows, and (2) make variational flows cheaper in the large-data regime via coresets. |
Naitong Chen; Zuheng Xu; Trevor Campbell; |
326 | Learning with Convolution and Pooling Operations in Kernel Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We describe the generalization properties of a one-layer convolutional kernel with pooling and downsampling. |
Theodor Misiakiewicz; Song Mei; |
327 | Mean Estimation with User-level Privacy Under Data Heterogeneity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study mean estimation in the setting where users hold heterogeneous data. |
Rachel Cummings; Vitaly Feldman; Audra McMillan; Kunal Talwar; |
328 | On The Efficient Implementation of High Accuracy Optimality of Profile Maximum Likelihood Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we provide an efficient algorithm to compute a universal plug-in estimator for symmetric properties of distributions that is sample optimal up to accuracy $\epsilon \gg n^{-1/3}$, where $n$ is the sample size. |
Moses Charikar; Zhihao Jiang; Kirankumar Shiragur; Aaron Sidford; |
329 | Data Augmentation for Compositional Data: Advancing Predictive Models of The Microbiome Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose novel data augmentation strategies that yield significant performance gains for microbiome compositional data. |
Elliott Gordon-Rodriguez; Thomas Quinn; John Cunningham; |
330 | Learning in Distributed Contextual Linear Bandits Without Sharing The Context Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a method to compress the context using $\approx 5d$ bits per context if the context distribution is unknown and $0$ bits per context if the context distribution is known, while achieving optimal regret. |
Osama Hanna; Lin Yang; Christina Fragouli; |
331 | Parameter-free Dynamic Graph Embedding for Link Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, this paper proposes FreeGEM, a parameter-free dynamic graph embedding method for link prediction. |
Jiahao Liu; Dongsheng Li; Hansu Gu; Tun Lu; Peng Zhang; Ning Gu; |
332 | Subgroup Robustness Grows On Trees: An Empirical Baseline Investigation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that tree-based methods are surprisingly strong baselines for subgroup robustness on tabular data. |
Josh Gardner; Zoran Popovic; Ludwig Schmidt; |
333 | Characterizing The Ventral Visual Stream with Response-Optimized Neural Encoding Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a data-driven, hypothesis-agnostic computational approach to understand representations within the human ventral visual pathway. |
Meenakshi Khosla; Keith Jamison; Amy Kuceyeski; Mert Sabuncu; |
334 | Multi-Class $H$-Consistency Bounds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an extensive study of $H$-consistency bounds for multi-class classification. |
Pranjal Awasthi; Anqi Mao; Mehryar Mohri; Yutao Zhong; |
335 | Learning Bipartite Graphs: Heavy Tails and Multiple Components Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose estimators for (k-component) bipartite graphs under the assumption that the observed data is heavy-tailed. |
José Vinícius de Miranda Cardoso; Jiaxi Ying; Daniel Palomar; |
336 | Autoregressive Perturbations for Data Poisoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce autoregressive (AR) poisoning, a method that can generate poisoned data without access to the broader dataset. |
Pedro Sandoval-Segura; Vasu Singla; Jonas Geiping; Micah Goldblum; Tom Goldstein; David Jacobs; |
337 | Does GNN Pretraining Help Molecular Representation? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate graph pretraining for molecular representation. We conduct thorough ablation studies on the key components of GNN pretraining and find that, on many occasions, the benefits from self-supervised pretraining on molecular data are negligible. |
Ruoxi Sun; Hanjun Dai; Adams Yu; |
338 | Discrete Compositional Representations As An Abstraction for Goal Conditioned Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Defining goals in the space of noisy, high-dimensional sensory inputs is one possibility, yet this poses a challenge for training goal-conditioned agents, or even for generalization to novel goals. We propose to address this by learning compositional representations of goals and processing the resulting representation via a discretization bottleneck, for coarser specification of goals, through an approach we call DGRL. |
Riashat Islam; Hongyu Zang; Anirudh Goyal; Alex Lamb; Kenji Kawaguchi; Xin Li; Romain Laroche; Yoshua Bengio; Remi Tachet des Combes; |
339 | E-MAPP: Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose E-MAPP, a framework for parallel program guided multi-agent reinforcement learning, which outperforms strong baselines in long-horizon cooperation tasks and generalizes well. |
Can Chang; Ni Mu; Jiajun Wu; Ling Pan; Huazhe Xu; |
340 | Unsupervised Learning of Shape Programs with Repeatable Implicit Parts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a shape ProGram with Repeatable Implicit Parts (ProGRIP), along with an unsupervised learning strategy that helps learn high-fidelity structured shapes while accounting for self-similarity. |
Boyang Deng; Sumith Kulal; Zhengyang Dong; Congyue Deng; Yonglong Tian; Jiajun Wu; |
341 | Towards Hard-pose Virtual Try-on Via 3D-aware Global Correspondence Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we target image-based person-to-person virtual try-on in the presence of diverse poses and large viewpoint variations. |
Zaiyu Huang; Hanhui Li; Zhenyu Xie; Michael Kampffmeyer; Qingling Cai; Xiaodan Liang; |
342 | A Fourier Approach to Mixture Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give a simple algorithm that learns spherical Gaussian mixtures with a nearly-optimal separation in the moderate-dimension regime. |
Mingda Qiao; Guru Guruganesh; Ankit Rawat; Kumar Avinava Dubey; Manzil Zaheer; |
343 | The Minority Matters: A Diversity-Promoting Collaborative Metric Learning Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a diversity control regularization term to accommodate the multi-vector representation strategy better. |
Shilong Bao; Qianqian Xu; Zhiyong Yang; Yuan He; Xiaochun Cao; Qingming Huang; |
344 | Debiased Machine Learning Without Sample-Splitting for Stable Estimators Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove asymptotic normality for a target parameter of interest, of debiased machine learning semi-parametric estimators without sample splitting, when the machine learning estimators used for the nuisance functions are leave-one-out stable. |
Qizhao Chen; Vasilis Syrgkanis; Morgane Austern; |
345 | Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose DecSPS, a novel variant of the stochastic Polyak stepsize (SPS) for SGD, yielding the first stochastic *adaptive* optimization method that converges to the exact solution without restrictive assumptions like bounded iterates/gradients or interpolation. |
Antonio Orvieto; Simon Lacoste-Julien; Nicolas Loizou; |
346 | Generalization Error Bounds on Deep Learning with Markov Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we derive upper bounds on generalization errors for deep neural networks with Markov datasets. |
Lan V. Truong; |
347 | Differentially Private Graph Learning Via Sensitivity-Bounded Personalized PageRank Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide the first differentially private algorithm for approximating Personalized PageRank. |
Alessandro Epasto; Vahab Mirrokni; Bryan Perozzi; Anton Tsitsulin; Peilin Zhong; |
348 | Online Agnostic Multiclass Boosting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We give the first boosting algorithm for online agnostic multiclass classification by reducing boosting to online convex optimization. |
Vinod Raman; Ambuj Tewari; |
349 | Data-Efficient Structured Pruning Via Submodular Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a principled data-efficient structured pruning method based on submodular optimization. |
Marwa El Halabi; Suraj Srinivas; Simon Lacoste-Julien; |
350 | Masked Prediction: A Parameter Identifiability View Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work offers a new lens to understanding self-supervised learning: one of parameter identifiability. We show that with proper choices of parametric forms and prediction tasks, masked prediction tasks can recover parameters of HMMs. |
Bingbin Liu; Daniel Hsu; Pradeep Ravikumar; Andrej Risteski; |
351 | A Unified Analysis of Federated Learning with Arbitrary Client Participation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a unified framework for analyzing the convergence of federated learning with arbitrary participation of clients. |
Shiqiang Wang; Mingyue Ji; |
352 | When Does Return-conditioned Supervised Learning Work for Offline Reinforcement Learning? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we provide a rigorous study of the capabilities and limitations of RCSL, something which is crucially missing in previous work. |
David Brandfonbrener; Alberto Bietti; Jacob Buckman; Romain Laroche; Joan Bruna; |
353 | Near-Optimal Private and Scalable $k$-Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide nearly optimal algorithms for differentially private k-means and k-median clustering in Euclidean space, in the massively parallel computation model. |
Vincent Cohen-Addad; Alessandro Epasto; Vahab Mirrokni; Shyam Narayanan; Peilin Zhong; |
354 | Decision-Focused Learning Without Decision-Making: Learning Locally Optimized Decision Losses Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide a novel way to learn loss functions for predictive models so as to improve their performance when used in conjunction with specific optimization tasks. |
Sanket Shah; Kai Wang; Bryan Wilder; Andrew Perrault; Milind Tambe; |
355 | PAC Prediction Sets for Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel algorithm to construct a prediction set for meta learning that satisfies a probably approximately correct (PAC) guarantee tailored to meta learning. |
Sangdon Park; Edgar Dobriban; Insup Lee; Osbert Bastani; |
356 | Policy Optimization with Advantage Regularization for Long-Term Fairness in Decision Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use policy optimization with advantage regularization to improve long-term fairness of decision-making policies. |
Eric Yu; Zhizhen Qin; Min Kyung Lee; Sicun Gao; |
357 | Layer Freezing & Data Sieving: Missing Pieces of A Generic Framework for Sparse Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper intends to explore other possible directions to effectively and efficiently reduce sparse training costs while preserving accuracy. |
Geng Yuan; Yanyu Li; Sheng Li; Zhenglun Kong; Sergey Tulyakov; Xulong Tang; Yanzhi Wang; Jian Ren; |
358 | Rapid Model Architecture Adaption for Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The combinatorial search complexity $T \times H$ creates a fundamental search efficiency challenge if one naively applies existing NAS methods to these scenarios. To overcome this issue, we show, for the first time, how to rapidly adapt model architectures to new tasks in a \emph{many-task many-hardware} few-shot learning setup by integrating Model Agnostic Meta Learning (MAML) into the NAS flow. |
Yiren Zhao; Xitong Gao; I Shumailov; Nicolo Fusi; Robert Mullins; |
359 | Compositional Generalization Through Abstract Representations in Human and Artificial Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the impact of abstract representations on compositional generalization in human imaging data and simple artificial neural networks. |
Takuya Ito; Tim Klinger; Doug Schultz; John Murray; Michael Cole; Mattia Rigotti; |
360 | Surprising Instabilities in Training Deep Networks and A Theoretical Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We empirically demonstrate numerical instabilities in training deep networks with SGD and provide a theoretical analysis for it. |
Yuxin Sun; Dong Lao; Ganesh Sundaramoorthi; Anthony Yezzi; |
361 | Change-point Detection for Sparse and Dense Functional Data in General Dimensions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study the problem of change-point detection and localisation for functional data sequentially observed on a general $d$-dimensional space, where we allow the functional curves to be either sparsely or densely sampled. |
Carlos Misael Madrid Padilla; Daren Wang; Zifeng Zhao; Yi Yu; |
362 | Mesoscopic Modeling of Hidden Spiking Neurons Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We derive a neuronally-grounded latent variable model for multi-neuronal spike trains. |
Shuqi Wang; Valentin Schmutz; Guillaume Bellec; Wulfram Gerstner; |
363 | A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of online learning with feedback graphs and present an algorithm capable of achieving near-optimal pseudo-regret bounds simultaneously against adversarial and stochastic sequences of losses. |
Chloé Rouyer; Dirk van der Hoeven; Nicolò Cesa-Bianchi; Yevgeny Seldin; |
364 | Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new method for approximating active learning acquisition strategies that are based on retraining with hypothetically-labeled candidate data points. |
Mohamad Amin Mohamadi; Wonho Bae; Danica J. Sutherland; |
365 | Graph Neural Networks Are Dynamic Programmers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We use category theory and abstract algebra to further uncover the relationship between graph neural nets and dynamic programming, which was previously done handwavily over specific examples. |
Andrew J Dudzik; Petar Veličković; |
366 | Generalized Laplacian Eigenmaps Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose GLEN, an NP-hard rank difference minimization problem for graph node embedding that enjoys the intra-class separation guarantee and can be solved with a logdet relaxation. |
Hao Zhu; Piotr Koniusz; |
367 | Invertible Monotone Operators for Normalizing Flows Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a monotone operator-based normalizing flow by parametrizing the Cayley operator of monotone operators. |
Byeongkeun Ahn; Chiyoon Kim; Youngjoon Hong; Hyunwoo Kim; |
368 | FR: Folded Rationalization with A Unified Encoder Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such a two-phase model may incur the degeneration problem, where the predictor overfits to the noise generated by a not-yet-well-trained generator and, in turn, leads the generator to converge to a suboptimal model that tends to select senseless pieces. To tackle this challenge, we propose Folded Rationalization (FR), which folds the two phases of the rationale model into one from the perspective of text semantic extraction. |
Wei Liu; Haozhao Wang; Jun Wang; Ruixuan Li; Chao Yue; YuanKai Zhang; |
369 | Riemannian Diffusion Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a continuous-time diffusion model for data represented on a Riemannian manifold. |
Chin-Wei Huang; Milad Aghajohari; Joey Bose; Prakash Panangaden; Aaron Courville; |
370 | Training with More Confidence: Mitigating Injected and Natural Backdoors During Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By further analyzing the training process and model architectures, we found that piece-wise linear functions cause this hyperplane surface. In this paper, we design a novel training method that forces the training to avoid generating such hyperplanes and thus remove the injected backdoors. |
Zhenting Wang; Hailun Ding; Juan Zhai; Shiqing Ma; |
371 | GhostNetV2: Enhance Cheap Operation with Long-Range Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a hardware-friendly attention mechanism (dubbed DFC attention) and then present a new GhostNetV2 architecture for mobile applications. |
Yehui Tang; Kai Han; Jianyuan Guo; Chang Xu; Chao Xu; Yunhe Wang; |
372 | Chefs’ Random Tables: Non-Trigonometric Random Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a new family of random features for the Gaussian kernel. Extensive theoretical and empirical analysis is presented. |
Valerii Likhosherstov; Krzysztof M Choromanski; Kumar Avinava Dubey; Frederick Liu; Tamas Sarlos; Adrian Weller; |
373 | Hierarchical Agglomerative Graph Clustering in Poly-Logarithmic Depth Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give a new efficient approximate parallel algorithm for graph-based average-linkage HAC which is scalable and high quality relative to existing state-of-the-art hierarchical clustering algorithms. |
Laxman Dhulipala; David Eisenstat; Jakub Lacki; Vahab Mirrokni; Jessica Shi; |
374 | Efficient Non-Parametric Optimizer Search for Diverse Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: With the goal of democratizing research and application of optimizer search, we present the first efficient, scalable and generalizable framework that can directly search on the tasks of interest. |
Ruochen Wang; Yuanhao Xiong; Minhao Cheng; Cho-Jui Hsieh; |
375 | Factored DRO: Factored Distributionally Robust Policies for Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our algorithm Factored-DRO learns distributionally robust batch contextual bandit policies, and can separately handle distribution shifts in the context distribution and shifts in the reward generating process. |
Tong Mu; Yash Chandak; Tatsunori Hashimoto; Emma Brunskill; |
376 | Scalable Interpretability Via Polynomials Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Second degree polynomials can be used as drop-in replacements for DNNs on most tabular and processed image datasets for interpretability with no loss in performance. |
Abhimanyu Dubey; Filip Radenovic; Dhruv Mahajan; |
377 | Diffusion Models As Plug-and-Play Priors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider the problem of inferring high-dimensional data $x$ in a model that consists of a prior $p(x)$ and an auxiliary constraint $c(x,y)$. |
Alexandros Graikos; Nikolay Malkin; Nebojsa Jojic; Dimitris Samaras; |
378 | Non-Gaussian Tensor Programs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We have extended the Tensor Programs framework to non-Gaussian weight distributions and recovered all existing applications of its main theorem. |
Eugene Golikov; Greg Yang; |
379 | When to Update Your Model: Constrained Model-based Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We analyze the optimization monotonicity for MBRL algorithms under a novel and general scheme, upon which we develop an algorithm CMLO equipped with an event-triggered mechanism to learn the model from a dynamically-varying number of explorations. |
Tianying Ji; Yu Luo; Fuchun Sun; Mingxuan Jing; Fengxiang He; Wenbing Huang; |
380 | Learning to Navigate Wikipedia with Graph Diffusion Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an efficient technique for learning to navigate web knowledge sources like Wikipedia by pretraining on random walks. This navigating agent can be used for precise evidence gathering on downstream tasks like QA and fact verification. |
Manzil Zaheer; Kenneth Marino; Will Grathwohl; John Schultz; Wenling Shang; Sheila Babayan; Arun Ahuja; Ishita Dasgupta; Christine Kaeser-Chen; Rob Fergus; |
381 | ZIN: When and How to Learn Invariance By Environment Inference? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a framework to provably learn invariant feature without environment partition. |
Yong Lin; Shengyu Zhu; Lu Tan; Peng Cui; |
382 | Information-Theoretic Generative Model Compression with Variational Energy-based Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an information-theoretic knowledge distillation approach for the compression of generative adversarial networks, which aims to maximize the mutual information between teacher and student networks. |
Minsoo Kang; Hyewon Yoo; Eunhee Kang; Sehwan Ki; Hyong Euk Lee; Bohyung Han; |
383 | Understanding Programmatic Weak Supervision Via Source-aware Influence Function Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a general framework for understanding the behavior of models produced by Programmatic Weak Supervision (PWS). |
Jieyu Zhang; Haonan Wang; Cheng-Yu Hsieh; Alexander Ratner; |
384 | Learning-based Manipulation Planning in Dynamic Environments Using GNNs and Temporal Encoding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a GNN-based neural architecture that involves temporal encoding, and use imitation learning with data aggregation procedures for learning both the embedding and edge prioritization policies. |
Ruipeng Zhang; Chenning Yu; Jingkai Chen; Chuchu Fan; Sicun Gao; |
385 | Learning Modular Simulations for Homogeneous Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a modular approach to model the dynamics of homogeneous networks, where the nodes are modeled using a ‘message passing neural ODE’ algorithm, an extension over neural ODE that enables node-node communication. |
Jayesh Gupta; Sai Vemprala; Ashish Kapoor; |
386 | UMIX: Improving Importance Weighting for Subpopulation Shift Via Uncertainty-Aware Mixup Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simple and practical approach called uncertainty-aware mixup (UMIX) to improve previous IW methods by re-weighting the mixed samples (a toy sketch of this re-weighting follows this entry). We also provide insightful theoretical analysis to explain why it works. |
Zongbo Han; Zhipeng Liang; Fan Yang; Liu Liu; Lanqing Li; Yatao Bian; Peilin Zhao; Bingzhe Wu; Changqing Zhang; Jianhua Yao; |
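To make the mixed-sample re-weighting above concrete, here is a minimal, hedged sketch: samples are mixed as in standard mixup, and each mixed example's loss is weighted by the mixed uncertainty of its two sources. The per-sample `uncertainty` tensor and the simple linear weighting rule are illustrative assumptions; UMIX's actual weighting scheme is more elaborate.

```python
import torch
import torch.nn.functional as F

def umix_style_loss(model, x, y, uncertainty, alpha=1.0):
    """Toy uncertainty-aware mixup step: mix sample pairs and re-weight each
    mixed example by the (mixed) uncertainty of its two source samples."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]           # standard mixup of inputs
    logits = model(x_mix)
    loss_a = F.cross_entropy(logits, y, reduction="none")
    loss_b = F.cross_entropy(logits, y[perm], reduction="none")
    per_example = lam * loss_a + (1.0 - lam) * loss_b
    weights = lam * uncertainty + (1.0 - lam) * uncertainty[perm]
    return (weights * per_example).mean()
```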
387 | Where2comm: Communication-Efficient Collaborative Perception Via Spatial Confidence Maps Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: It inevitably results in a fundamental trade-off between perception performance and communication bandwidth. To tackle this bottleneck issue, we propose a spatial confidence map, which reflects the spatial heterogeneity of perceptual information. |
Yue Hu; Shaoheng Fang; Zixing Lei; Yiqi Zhong; Siheng Chen; |
388 | Natural Color Fool: Towards Boosting Black-box Unrestricted Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a Natural Color Fool (NCF), which fully exploits color distributions of semantic classes in an image to craft human-imperceptible, flexible, and highly transferable adversarial examples. |
Shengming Yuan; Qilong Zhang; Lianli Gao; Yaya Cheng; Jingkuan Song; |
389 | AutoST: Towards The Universal Modeling of Spatio-temporal Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, manually-designed heterogeneous models can hardly satisfy the differing priorities that various tasks place on capturing spatial versus temporal dependencies. To address this, we propose a universal modeling framework with three distinctive characteristics: (i) Attention-based network backbone, including S2T Layer (spatial first), T2S Layer (temporal first), and STS Layer (spatio-temporal synchronous). |
Jianxin Li; Shuai Zhang; Hui Xiong; Haoyi Zhou; |
390 | On Solving Class Incremental Learning in Continual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper performs a theoretical study on how to solve the class-incremental learning (CIL) problem and proposes two strong CIL algorithms. |
Gyuhak Kim; Changnan Xiao; Tatsuya Konishi; Zixuan Ke; Bing Liu; |
391 | On The Effectiveness of Persistent Homology Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The goal of this work is to identify some types of problems where PH performs well or even better than other state-of-the-art methods in data analysis. |
Renata Turkes; Guido Montufar; Nina Otter; |
392 | Learning Debiased Classifier with Biased Committee Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a new method for training a debiased classifier: learning a debiased classifier with a biased committee (LWBC). |
Nayeong Kim; SEHYUN HWANG; Sungsoo Ahn; Jaesik Park; Suha Kwak; |
393 | 3DB: A Framework for Debugging Computer Vision Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation. |
Guillaume Leclerc; Hadi Salman; Andrew Ilyas; Sai Vemprala; Logan Engstrom; Vibhav Vineet; Kai Xiao; Pengchuan Zhang; Shibani Santurkar; Greg Yang; Ashish Kapoor; Aleksander Madry; |
394 | ToDD: Topological Compound Fingerprinting in Computer-Aided Drug Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our primary contribution is framing the VS process as a new topology-based graph ranking problem by partitioning a compound into chemical substructures informed by the periodic properties of its atoms and extracting their persistent homology features at multiple resolution levels. |
Andaç Demir; Baris Coskunuzer; Yulia Gel; Ignacio Segovia-Dominguez; Yuzhou Chen; Bulent Kiziltan; |
395 | Trading Off Utility, Informativeness, and Complexity in Emergent Communication Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Training agents to communicate according to a tradeoff between utility, communicative accuracy, and complexity allows us to generate varied emergent communication, much like differing human languages. |
Mycal Tucker; Roger Levy; Julie Shah; Noga Zaslavsky; |
396 | Grounded Video Situation Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new task, Grounded Video Situation Recognition (GVSR). In addition to predicting the verbs and semantic roles in the form of captions, we also ground them in the spatio-temporal domain in a weakly-supervised setup, in an end-to-end fashion. |
Zeeshan Khan; C.V. Jawahar; Makarand Tapaswi; |
397 | HierSpeech: Bridging The Gap Between Text and Speech By Hierarchical Variational Inference Using Self-supervised Representations for Speech Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents HierSpeech, a high-quality end-to-end text-to-speech (TTS) system based on a hierarchical conditional variational autoencoder (VAE) utilizing self-supervised speech representations. |
Sang-Hoon Lee; Seung-Bin Kim; Ji-Hyun Lee; Eunwoo Song; Min-Jae Hwang; Seong-Whan Lee; |
398 | Structural Kernel Search Via Bayesian Optimization and Symbolical Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new method for kernel selection for Gaussian processes, where the distance between two GPs is measured using their associated symbolic description of the statistical hypothesis. |
Matthias Bitzer; Mona Meister; Christoph Zimmer; |
399 | Theory and Approximate Solvers for Branched Optimal Transport with Multiple Sources Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We lay out the theory and practice of devising optimal transportation routes with subadditive edge costs as a generalization of optimal transport, encouraging solutions with branched structure. |
Peter Lippmann; Enrique Fita Sanmartín; Fred Hamprecht; |
400 | Understanding Robust Learning Through The Lens of Representation Similarities Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we aim to understand how the properties of representations learned by robust training differ from those obtained from standard, non-robust training. |
Christian Cianfarani; Arjun Nitin Bhagoji; Vikash Sehwag; Ben Zhao; Prateek Mittal; Heather Zheng; |
401 | A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While sample complexities in MDPs with linear optimal value functions can be exponentially large, we give a new method which shows that a surprisingly small amount of expert advice permits sample efficiency. |
Philip Amortila; Nan Jiang; Dean Foster; Dhruv Madeka; |
402 | Delving Into Sequential Patches for Deepfake Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on low-level temporal inconsistency understanding, we identify deepfake videos in a more robust and generalizable way with model designs in a Transformer style. |
Jiazhi Guan; Hang Zhou; Zhibin Hong; Errui Ding; Jingdong Wang; Chengbin Quan; Youjian Zhao; |
403 | Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This ambiguity suggests that a predictive model should have similar probabilistic characteristics to match the data it models. Therefore, we propose a hierarchical latent distribution to enhance one of the most successful deep learning models, the Transformer, to accommodate these sorts of ambiguities and data distributions. |
Jörg Franke; Frederic Runge; Frank Hutter; |
404 | Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore how to design the untargeted backdoor watermark and how to use it for harmless and stealthy dataset copyright protection. |
Yiming Li; Yang Bai; Yong Jiang; Yong Yang; Shu-Tao Xia; Bo Li; |
405 | Synergy-of-Experts: Collaborate to Improve Adversarial Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper further improves the ensemble’s adversarial robustness through a collaboration scheme. |
Sen Cui; Jingfeng ZHANG; Jian Liang; Bo Han; Masashi Sugiyama; Changshui Zhang; |
406 | HSDF: Hybrid Sign and Distance Field for Modeling Surfaces with Arbitrary Topologies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a hybrid sign and distance field for modeling arbitrary shapes with both open and closed surfaces. |
Li Wang; jie Yang; Weikai Chen; Xiaoxu Meng; Bo Yang; Jintao Li; Lin Gao; |
407 | Semi-Supervised Semantic Segmentation Via Gentle Teaching Assistant Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a gentle teaching assistant for semi-supervised semantic segmentation, which assists representation learning through our carefully designed representation knowledge transmission. |
Ying Jin; Jiaqi Wang; Dahua Lin; |
408 | A Scalable Deterministic Global Optimization Algorithm for Training Optimal Decision Tree Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present several structure-exploiting lower and upper bounding methods. |
Kaixun Hua; Jiayang Ren; Yankai Cao; |
409 | HSurf-Net: Normal Estimation for 3D Point Clouds By Learning Hyper Surfaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel normal estimation method called HSurf-Net, which can accurately predict normals from point clouds with noise and density variations. |
Qing Li; Yu-Shen Liu; Jin-San Cheng; Cheng Wang; Yi Fang; Zhizhong Han; |
410 | Decoupled Self-supervised Learning for Non-Homophilous Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the problem of conducting self-supervised learning for node representation learning on non-homophilous graphs. |
Teng Xiao; Zhengyu Chen; Zhimeng Guo; Zeyang Zhuang; Suhang Wang; |
411 | DataMUX: Data Multiplexing for Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present data multiplexing (DataMUX), a technique that enables deep neural networks to process multiple inputs simultaneously using a single compact representation and dramatically improves inference throughput; a toy multiplex/demultiplex sketch follows this entry. |
Vishvak Murahari; Carlos Jimenez; Runzhe Yang; Karthik Narasimhan; |
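A minimal, hedged sketch of the multiplexing idea above (module sizes and choices are illustrative, not the paper's architecture): N inputs are projected with instance-specific transforms, averaged into one representation, pushed through a shared backbone once, and then demultiplexed by N prediction heads.

```python
import torch
import torch.nn as nn

class ToyDataMux(nn.Module):
    """Toy data multiplexing: one backbone forward pass serves n_mux inputs."""
    def __init__(self, dim=64, n_mux=4, n_classes=10):
        super().__init__()
        self.n_mux = n_mux
        self.mux_proj = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_mux)])
        self.backbone = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                      nn.Linear(dim, dim))
        self.demux_heads = nn.ModuleList([nn.Linear(dim, n_classes)
                                          for _ in range(n_mux)])

    def forward(self, xs):
        # xs: list of n_mux tensors, each of shape (batch, dim)
        mixed = torch.stack([p(x) for p, x in zip(self.mux_proj, xs)]).mean(0)
        h = self.backbone(mixed)          # single shared forward pass
        return [head(h) for head in self.demux_heads]

# usage: logits_per_input = ToyDataMux()([torch.randn(8, 64) for _ in range(4)])
```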
412 | Finding Optimal Arms in Non-stochastic Combinatorial Bandits with Semi-bandit Feedback and Finite Budget Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the combinatorial bandits problem with semi-bandit feedback under finite sampling budget constraints, in which the learner can carry out its action only for a limited number of times specified by an overall budget. |
Jasmin Brandt; Björn Haddenhorst; Viktor Bengs; Eyke Hüllermeier; |
413 | Saliency-Aware Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: They treat all data elements as being equally important and therefore lead to suboptimal performance. To address this problem, we propose an end-to-end framework which dynamically detects saliency of input data, reweights data using saliency maps, and searches architectures on saliency-reweighted data. |
Ramtin Hosseini; Pengtao Xie; |
414 | ESCADA: Efficient Safety and Context Aware Dose Allocation for Precision Medicine Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel multi-armed bandit algorithm for the “leveling” task, where the aim is to keep the outcomes close to a target level rather than maximize them, which is a prevalent problem in medicine; a toy leveling loop is sketched below. |
Ilker Demirel; Ahmet Alparslan Celik; Cem Tekin; |
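The "leveling" objective above differs from reward maximization. The following toy loop (not ESCADA itself, which additionally handles safety constraints and contexts) illustrates it by pulling the arm whose optimistic outcome estimate is closest to the target level; `pull(arm)` is a hypothetical environment callback.

```python
import numpy as np

def leveling_bandit(pull, n_arms, target, horizon=500, seed=0):
    """Toy leveling loop: pick the arm whose confidence interval is closest
    to (or still contains) the target outcome, not the highest-mean arm."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    for t in range(1, horizon + 1):
        bonus = np.sqrt(2.0 * np.log(t) / np.maximum(counts, 1))
        # optimistic distance to the target: zero if the interval covers it
        distance = np.maximum(np.abs(means - target) - bonus, 0.0)
        # force one pull of every arm first, then minimize optimistic distance
        arm = int(np.argmin(np.where(counts == 0, -np.inf, distance)))
        outcome = pull(arm)
        counts[arm] += 1
        means[arm] += (outcome - means[arm]) / counts[arm]
    return means, counts
```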
415 | Neuron with Steady Response Leads to Better Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Accordingly, we propose a new regularization method called Neuron Steadiness Regularization (NSR) to reduce neuron intra-class response variance; a minimal sketch of such a penalty follows this entry. |
Qiang Fu; Lun Du; Haitao Mao; Xu Chen; Wei Fang; Shi Han; Dongmei Zhang; |
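A minimal sketch of an intra-class steadiness penalty in the spirit of NSR (the exact regularizer in the paper may differ): for a chosen layer's activations, penalize each neuron's response variance within every class and add the result to the task loss with a small coefficient.

```python
import torch

def neuron_steadiness_penalty(activations, labels):
    """Toy steadiness penalty: average per-neuron response variance within
    each class. activations: (batch, n_neurons), labels: (batch,)."""
    penalty = activations.new_zeros(())
    classes = labels.unique()
    for c in classes:
        acts_c = activations[labels == c]          # responses for one class
        if acts_c.size(0) > 1:
            penalty = penalty + acts_c.var(dim=0, unbiased=False).mean()
    return penalty / len(classes)

# usage (coefficient is illustrative):
#   total_loss = task_loss + 0.01 * neuron_steadiness_penalty(hidden, labels)
```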
416 | Learning Active Camera for Multi-Object Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unlike existing agents that always look forward, we propose an active-camera agent that coordinates the camera moving action and navigation action for efficiently perceiving the environment to solve the multi-object navigation task. |
Peihao Chen; Dongyu Ji; Kunyang Lin; Weiwen Hu; Wenbing Huang; Thomas Li; Mingkui Tan; Chuang Gan; |
417 | SPoVT: Semantic-Prototype Variational Transformer for Dense Point Cloud Semantic Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a Semantic-Prototype Variational Transformer (SPoVT) for dense point cloud semantic completion. |
Sheng Yu Huang; Hao-Yu Hsu; Frank Wang; |
418 | Debiased Self-Training for Semi-Supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We tackle the bias issue in SSL by (1) decoupling the generation and utilization of pseudo labels; (2) estimating the worst case of pseudo labeling and optimizing the representation to avoid the worst case. |
Baixu Chen; Junguang Jiang; Ximei Wang; Pengfei Wan; Jianmin Wang; Mingsheng Long; |
419 | Disentangling The Predictive Variance of Deep Ensembles Through The Neural Tangent Kernel Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By studying deep ensembles in the linear training regime, we can describe their predictive variance through the Neural Tangent Kernel. |
Seijin Kobayashi; Pau Vilimelis Aceituno; Johannes von Oswald; |
420 | Semi-Supervised Video Salient Object Detection Based on Uncertainty-Guided Pseudo Labels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an Uncertainty-Guided Pseudo Label Generator and introduce an adversarial learning strategy to improve the quality of pseudo-labels, finally solving SS-VSOD by using the progressively-enhanced pseudo labels. |
chenyang lu; Yongri Piao; Miao Zhang; Huchuan Lu; |
421 | Hierarchical Channel-spatial Encoding for Communication-efficient Collaborative Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel communication-efficient learning method called stripe-wise group quantization (SGQ), which significantly reduces feature size and communication traffic, while not degrading model accuracy for edge-cloud systems. |
Qihua ZHOU; Song Guo; YI LIU; Jie Zhang; Jiewei Zhang; Tao GUO; Zhenda XU; Zhihao Qu; |
422 | RenyiCL: Contrastive Representation Learning with Skew Renyi Divergence Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel contrastive learning method that uses Rényi divergence to manage harder data augmentations. |
Kyungmin Lee; Jinwoo Shin; |
423 | Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form. This paper introduces Thor, a framework integrating language models and automated theorem provers to overcome this difficulty. |
Albert Qiaochu Jiang; Wenda Li; Szymon Tworkowski; Konrad Czechowski; Tomasz Odrzygóźdź; Piotr Miłoś; Yuhuai Wu; Mateja Jamnik; |
424 | Autoformalization with Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Large language models can be used to do autoformalization, allowing us to achieve a new SOTA on the miniF2F benchmark. |
Yuhuai Wu; Albert Qiaochu Jiang; Wenda Li; Markus N Rabe; Charles Staats; Mateja Jamnik; Christian Szegedy; |
425 | Probabilistic Missing Value Imputation for Mixed Categorical and Ordered Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a probabilistic imputation method using an extended Gaussian copula model that supports both single and multiple imputation. |
Yuxuan Zhao; Alex Townsend; Madeleine Udell; |
426 | TotalSelfScan: Learning Full-body Avatars from Self-Portrait Videos of Faces, Hands, and Bodies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The main reason is that the image region occupied by these parts is very small compared to the body. To solve this problem, we propose TotalSelfScan, which reconstructs the full-body human from several monocular self-rotation videos that focus on the face, hands, and body, respectively. |
Junting Dong; Qi Fang; Yudong Guo; Sida Peng; Qing Shuai; Hujun Bao; Xiaowei Zhou; |
427 | Distributional Reward Estimation for Effective Multi-agent Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce the multi-action-branch reward estimation followed by policy-weighted reward aggregation for stabilized training in multi-agent reinforcement learning with reward uncertainty. |
Jifeng Hu; Yanchao Sun; Hechang Chen; Sili Huang; haiyin piao; Yi Chang; Lichao Sun; |
428 | Unsupervised Cross-Domain Imitation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a scalable framework that enables cross-domain imitation learning without access to additional demonstrations or further domain knowledge. |
Tim Franzmeyer; Philip Torr; João Henriques; |
429 | Function Classes for Identifiable Nonlinear Independent Component Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove identifiability results for nonlinear Independent Component Analysis in constrained function classes. |
Simon Buchholz; Michel Besserve; Bernhard Schölkopf; |
430 | Tree Ensemble Kernels for Bayesian Optimization with Known Constraints Over Mixed-feature Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use tree kernel Gaussian processes for Bayesian optimization to simultaneously incorporate: a reliable uncertainty metric in mixed features and known constraints. |
Alexander Thebelt; Calvin Tsay; Robert Lee; Nathan Sudermann-Merx; David Walz; Behrang Shafei; Ruth Misener; |
431 | OnePose++: Keypoint-Free One-Shot Object Pose Estimation Without CAD Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a keypoint-free one-shot object pose estimation method that handles low-textured objects without knowing CAD models. |
Xingyi He; Jiaming Sun; Yuang Wang; Di Huang; Hujun Bao; Xiaowei Zhou; |
432 | Collaborative Linear Bandits with Adversarial Agents: Near-Optimal Regret Bounds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper studies a collaborative linear bandit setting in the presence of adversaries, proposes several novel robust algorithms, and provides the first set of tight regret bounds for this problem. |
Aritra Mitra; Arman Adibi; George J. Pappas; Hamed Hassani; |
433 | Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: In this paper, we study the episodic reinforcement learning (RL) problem modeled by finite-horizon Markov Decision Processes (MDPs) with constraint on the number of batches. The … |
Zihan Zhang; Yuhang Jiang; Yuan Zhou; Xiangyang Ji; |
434 | Contextual Bandits with Knapsacks for A Conversion Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider a model of contextual bandits with knapsacks where rewards and costs are coupled through binary variables measuring customer conversions. |
Zhen LI; Gilles Stoltz; |
435 | Factuality Enhanced Language Models for Open-Ended Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a factual-nucleus sampling algorithm that dynamically adapts the randomness to improve the factuality of generation while maintaining quality; a toy decoding sketch follows this entry. |
Nayeon Lee; Wei Ping; Peng Xu; Mostofa Patwary; Mohammad Shoeybi; Bryan Catanzaro; |
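One way to read "dynamically adapts the randomness" is a nucleus mass that shrinks as a sentence proceeds and resets at sentence boundaries; the sketch below illustrates that pattern only. The callable `next_logits(prefix)` and all schedule constants are hypothetical, and the paper's actual algorithm may differ in detail.

```python
import numpy as np

def nucleus_sample(logits, p, rng):
    """Standard top-p sampling: draw from the smallest token set whose
    probability mass exceeds p."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(-probs)
    cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1
    keep = order[:cutoff]
    return rng.choice(keep, p=probs[keep] / probs[keep].sum())

def factual_nucleus_decode(next_logits, sentence_end_ids, max_len=50,
                           p0=0.9, decay=0.9, p_min=0.3, seed=0):
    """Toy decoding loop: the nucleus mass p decays token by token (less
    randomness deep into a sentence) and resets to p0 at sentence ends."""
    rng = np.random.default_rng(seed)
    prefix, p = [], p0
    for _ in range(max_len):
        tok = int(nucleus_sample(next_logits(prefix), p, rng))
        prefix.append(tok)
        p = p0 if tok in sentence_end_ids else max(p * decay, p_min)
    return prefix
```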
436 | Learning (Very) Simple Generative Models Is Hard Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove the first computational hardness result for learning pushforwards of Gaussians under one hidden layer ReLU networks of logarithmic size. |
Sitan Chen; Jerry Li; Yuanzhi Li; |
437 | Decomposed Knowledge Distillation for Class-incremental Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a simple yet effective framework that achieves a good trade-off between plasticity and rigidity for class-incremental semantic segmentation. |
Donghyeon Baek; Youngmin Oh; Sanghoon Lee; Junghyup Lee; Bumsub Ham; |
438 | Escaping from The Barren Plateau Via Gaussian Initializations in Deep Variational Quantum Circuits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a Gaussian initialization strategy addressing the vanishing gradient problem in variational quantum circuits with theoretical guarantees. |
Kaining Zhang; Liu Liu; Min-Hsiu Hsieh; Dacheng Tao; |
439 | Predicting from Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study conditions under which the causal effect of performative predictions can be identified from observational data. |
Frances Ding; Yixin Wang; Celestine Mendler-Dünner; |
440 | Learning Generalizable Models for Vehicle Routing Problems Via Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a generic and efficient Adaptive Multi-Distribution Knowledge Distillation (AMDKD) scheme to tackle the cross-distribution generalization issue for learning-to-solve routing problems. |
Jieyi Bi; Yining Ma; Jiahai Wang; Zhiguang Cao; Jinbiao Chen; Yuan Sun; Yeow Meng Chee; |
441 | A Regret-Variance Trade-Off in Online Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We state a regret-variance trade-off in online learning and provide multiple applications. |
Dirk van der Hoeven; Nikita Zhivotovskiy; Nicolò Cesa-Bianchi; |
442 | Learning on The Edge: Online Learning with Stochastic Feedback Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider a generalization of adversarial online learning with a feedback graph and prove matching upper and lower bounds on the regret. |
Emmanuel Esposito; Federico Fusco; Dirk van der Hoeven; Nicolò Cesa-Bianchi; |
443 | NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a method for learning 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video input. |
Yi-Ling Qiao; Alexander Gao; Ming Lin; |
444 | Lethal Dose Conjecture on Data Poisoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Lethal Dose Conjecture, which characterizes the largest amount of poisoned samples any defense can tolerate for a given task, and showcase its implications, including better/easy ways to improve robustness against data poisoning. |
Wenxiao Wang; Alexander Levine; Soheil Feizi; |
445 | Mask Matching Transformer for Few-Shot Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we aim to tackle the challenging few-shot segmentation task from a new perspective. |
siyu jiao; Gengwei Zhang; Shant Navasardyan; Ling Chen; Yao Zhao; Yunchao Wei; Honghui Shi; |
446 | AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose AUTOMATA, a gradient-based subset selection framework for hyper-parameter tuning. |
Krishnateja Killamsetty; Guttu Sai Abhishek; Aakriti Lnu; Alexandre Evfimievski; Lucian Popa; Ganesh Ramakrishnan; Rishabh Iyer; |
447 | Orient: Submodular Mutual Information Measures for Data Subset Selection Under Distribution Shift Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we aim to improve the efficiency of existing supervised domain adaptation (SDA) methods by using a subset of source data that is similar to target data for faster model training. |
Athresh Karanam; Krishnateja Killamsetty; Harsha Kokel; Rishabh Iyer; |
448 | Empirical Gateaux Derivatives for Causal Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study a constructive procedure that approximates Gateaux derivatives for statistical functionals by finite-differencing, with attention to causal inference functionals. |
Michael Jordan; Yixin Wang; Angela Zhou; |
449 | Active Model Adaptation Under Changed Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work mainly discusses how to make a known model adapt to a variety of changed distributions at a relatively small labeling cost. |
Jie-Jing Shao; Lan-Zhe Guo; Xiao-wen Yang; Yu-Feng Li; |
450 | Enhanced Latent Space Blind Model for Real Image Denoising Via Alternative Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel enhanced latent space blind model based deep unfolding network, namely ScaoedNet, for complex real image denoising. |
Chao Ren; Yizhong Pan; Jie Huang; |
451 | Unsupervised Multi-View Object Segmentation Using Radiance Field Propagation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present radiance field propagation (RFP), a novel approach to segmenting objects in 3D during reconstruction given only unlabeled multi-view images of a scene. |
Xinhang Liu; Jiaben Chen; Huai Yu; Yu-Wing Tai; Chi-Keung Tang; |
452 | Trap and Replace: Defending Backdoor Attacks By Trapping Them Into An Easy-to-Replace Subnetwork Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a brand-new backdoor defense strategy, which makes it much easier to remove the harmful influence of backdoor samples from the model. |
Haotao Wang; Junyuan Hong; Aston Zhang; Jiayu Zhou; Zhangyang Wang; |
453 | Preservation of The Global Knowledge By Not-True Distillation in Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper identifies the forgetting of global knowledge in federated learning and proposes distillation-based algorithms to relieve it. |
Gihun Lee; Minchan Jeong; Yongjin Shin; Sangmin Bae; Se-Young Yun; |
454 | Adversarial Training for High-stakes Reliability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We used a safe language generation task (“avoid injuries”) as a testbed for achieving high reliability through adversarial training. |
Daniel Ziegler; Seraphina Nix; Lawrence Chan; Tim Bauman; Peter Schmidt-Nielsen; Tao Lin; Adam Scherlis; Noa Nabeshima; Benjamin Weinstein-Raun; Daniel de Haas; Buck Shlegeris; Nate Thomas; |
455 | Learning Substructure Invariance for Out-of-Distribution Molecular Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We aim to solve the out-of-distribution problem on molecule representation learning tasks from a substructure invariance perspective. |
Nianzu Yang; Kaipeng Zeng; Qitian Wu; Xiaosong Jia; Junchi Yan; |
456 | Confident Adaptive Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce Confident Adaptive Language Modeling (CALM), a framework for dynamically allocating different amounts of compute per input and generation timestep. |
Tal Schuster; Adam Fisch; Jai Gupta; Mostafa Dehghani; Dara Bahri; Vinh Tran; Yi Tay; Donald Metzler; |
457 | On Sample Optimality in Personalized Collaborative and Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the sample complexity of collaboratively minimizing functions held by N different agents: we prove matching lower and upper complexity bounds. |
Mathieu Even; Laurent Massoulié; Kevin Scaman; |
458 | RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present Robust Adversarial Model-Based Offline RL (RAMBO), a novel approach to model-based offline RL. |
Marc Rigter; Bruno Lacerda; Nick Hawes; |
459 | Quo Vadis: Is Trajectory Forecasting The Key Towards Long-Term Multi-Object Tracking? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Intuitively, the longer the occlusion gap, the larger the search space for possible associations. In this paper, we show that even a small yet diverse set of trajectory predictions for moving agents will significantly reduce this search space and thus improve long-term tracking robustness. |
Patrick Dendorfer; Vladimir Yugay; Aljosa Osep; Laura Leal-Taixé; |
460 | Efficient Identification of Informative Features in Simulation-based Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we provide a more efficient approach based on the SBI method neural likelihood estimation (NLE): We show that one can marginalize the trained surrogate likelihood post-hoc before inferring the posterior to assess the contribution of a feature. |
Jonas Beck; Michael Deistler; Yves Bernaerts; Jakob H Macke; Philipp Berens; |
461 | Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To allow asynchronous learning and decision-making, we formulate a set of asynchronous multi-agent actor-critic methods that allow agents to directly optimize asynchronous policies in three standard training paradigms: decentralized learning, centralized learning, and centralized training for decentralized execution. |
Yuchen Xiao; Weihao Tan; Christopher Amato; |
462 | A Scalable Tester for Samplers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Constrained samplers generate samples from hard distributions. We present a tool that can test whether your sampler does actually generate samples from the right distribution. |
Yash Pote; Kuldeep S Meel; |
463 | Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel framework to finetune the connections of speech SSL models, instead of model weights, to empower efficient multilingual and multitask speech processing. |
Yonggan Fu; Yang Zhang; Kaizhi Qian; Zhifan Ye; Zhongzhi Yu; Cheng-I Jeff Lai; Yingyan Lin; |
464 | Unsupervised Learning of Group Invariant and Equivariant Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an unsupervised learning framework to extract separated group invariant and equivariant representations. |
Robin Winter; Marco Bertolini; Tuan Le; Frank Noe; Djork-Arné Clevert; |
465 | [Re] Replication Study of "Fairness and Bias in Online Selection" Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Scope of Reproducibility: This report aims to reproduce the results in the paper ‘Fairness and Bias in Online Selection’. |
Diego van der Mast; Soufiane Ben Haddou; Jacky Chu; Jaap Stefels; |
466 | Focal Modulation Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose focal modulation network (FocalNet in short), where self-attention (SA) is completely replaced by a focal modulation module that is more effective and efficient for modeling token interactions. |
Jianwei Yang; Chunyuan Li; Xiyang Dai; Jianfeng Gao; |
467 | Reinforcement Learning with Neural Radiance Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We learn state representations of scenes using supervision from neural radiance fields, and show that using these in downstream reinforcement learning tasks improves sample efficiency. |
Danny Driess; Ingmar Schubert; Pete Florence; Yunzhu Li; Marc Toussaint; |
468 | MaskPlace: Fast Chip Placement Via Reinforced Visual Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes an RL-based chip placement method MaskPlace based on rich visual representation. |
Yao Lai; Yao Mu; Ping Luo; |
469 | When Is The Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper provides an affirmative answer to this problem for the case of either Lipschitz or smooth convex functions with normal priors. By viewing Langevin algorithm as composite optimization, we develop a new analysis technique that leads to dimension independent convergence rates for such problems. |
Yoav S Freund; Yi-An Ma; Tong Zhang; |
470 | SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce SAVi++, an object-centric video model which is trained to predict depth signals from a slot-based video representation. SAVi++ is able to learn emergent object segmentation and tracking from videos in the real-world Waymo Open dataset. |
Gamaleldin Elsayed; Aravindh Mahendran; Sjoerd van Steenkiste; Klaus Greff; Michael Mozer; Thomas Kipf; |
471 | A Mixture Of Surprises for Unsupervised Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Hence, choosing between the two objectives is a dilemma. We propose a novel yet simple mixture of policies to address this concern, allowing us to optimize an objective that simultaneously maximizes and minimizes the surprise. |
Andrew Zhao; Matthieu Lin; Yangguang Li; Yong-jin Liu; Gao Huang; |
472 | Leveraging The Hints: Adaptive Bidding in Repeated First-Price Auctions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Following a series of recent works in this area, we consider a differentiated setup: we do not make any assumption about other bidders’ maximum bid (i.e. it can be adversarial over time), and instead assume that we have access to a hint that serves as a prediction of other bidders’ maximum bid, where the prediction is learned through some blackbox machine learning model. |
Wei Zhang; Yanjun Han; Zhengyuan Zhou; Aaron Flores; Tsachy Weissman; |
473 | Amortised Inference in Structured Generative Models with Explaining Away Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose structured amortised inference to account for the posterior latent correlations induced by the "explaining away" effect. |
Changmin Yu; Hugo Soulat; Neil Burgess; Maneesh Sahani; |
474 | AD-DROP: Attribution Driven Dropout for Robust Language Model Finetuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate the impact of dropout on self-attention and propose a novel dropout regularizer, AD-DROP, driven by self-attention attribution to reduce overfitting when fine-tuning pre-trained language models. |
Tao Yang; JInghao Deng; Xiaojun Quan; Qifan Wang; Shaoliang Nie; |
475 | Robust and Scalable Manifold Learning Via Landmark Diffusion for Long-term Medical Signal Processing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by analyzing long-term physiological time series, we design a robust and scalable spectral embedding algorithm that we refer to as RObust and Scalable Embedding via LANdmark Diffusion (Roseland). |
Chao Shen; Yu-Ting Lin; Hau-Tieng Wu; |
476 | Toward Robust Spiking Neural Network Against Adversarial Perturbation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The first work that applies certification-based techniques to spiking neural networks. |
LING LIANG; Kaidi Xu; Xing Hu; Lei Deng; Yuan Xie; |
477 | Most Activation Functions Can Win The Lottery Without Excessive Depth Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We generalize lottery ticket existence proofs to almost arbitrary activation functions and show that a source network can have almost the same depth as a target network. |
Rebekka Burkholz; |
478 | LGDN: Language-Guided Denoising Network for Video-Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an efficient and effective model for video-language modeling with salient frame proposal mechanism. |
Haoyu Lu; Mingyu Ding; Nanyi Fei; Yuqi Huo; Zhiwu Lu; |
479 | Online Nonnegative CP-dictionary Learning for Markovian Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce a novel algorithm that learns a CANDECOMP/PARAFAC (CP) basis from a given stream of tensor-valued data under general constraints, including nonnegativity constraints that induce interpretability of the learned CP basis. |
Hanbaek Lyu; Christopher Strohmeier; Deanna Needell; |
480 | Exploring Example Influence in Continual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore example influence in Continual Learning and show how it can be used. |
Qing Sun; Fan Lyu; Fanhua Shang; Wei Feng; Liang Wan; |
481 | Navigating Memory Construction By Global Pseudo-Task Simulation for Continual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We have proposed a novel method Global Pseudo-task Simulation (GPS) to solve the dynamic memory construction problem in the online continual learning setting. |
Yejia Liu; Wang Zhu; Shaolei Ren; |
482 | SCINet: Time Series Modeling and Forecasting with Sample Convolution and Interaction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a novel convolutional neural network for time series forecasting, achieving significant accuracy improvements. |
Minhao LIU; Ailing Zeng; Muxi Chen; Zhijian Xu; Qiuxia LAI; Lingna Ma; Qiang Xu; |
483 | Multi-Agent Multi-Armed Bandits with Limited Communication Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Limited Communication Collaboration – Upper Confidence Bound (LCC-UCB), a doubling-epoch based algorithm where each agent communicates only after the end of the epoch and shares the index of the best arm it knows. |
Mridul Agarwal; Vaneet Aggarwal; Kamyar Azizzadenesheli; |
484 | Concentration of Data Encoding in Parameterized Quantum Circuits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work shows the concentration of data encoding in parameterized quantum circuits and its severe limitations on downstream tasks. |
Guangxi Li; Ruilin Ye; Xuanqiang Zhao; Xin Wang; |
485 | Robust Model Selection and Nearly-Proper Learning for GMMs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give efficient algorithms for robust model selection and nearly-proper learning of Gaussian mixture models. |
Allen Liu; Jerry Li; Ankur Moitra; |
486 | Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and Its Application to Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In particular, its variance-reduced versions have recently gained particular attention. In this paper, we study two variants of this kind, namely the Stochastic Variance Reduced Gradient Langevin Dynamics and the Stochastic Recursive Gradient Langevin Dynamics. |
Yuri Kinoshita; Taiji Suzuki; |
487 | Imbalance Trouble: Revisiting Neural-Collapse Geometry Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we thus ask whether it can be made invariant to class imbalances. Towards this end, we adopt the unconstrained feature model (UFM), a recent theoretical model for studying neural collapse, and introduce Simplex-Encoded-Labels Interpolation (SELI) as an invariant characterization of the neural collapse phenomenon. |
Christos Thrampoulidis; Ganesh Ramachandra Kini; Vala Vakilian; Tina Behnia; |
488 | Understanding and Improving Robustness of Vision Transformers Through Patch-based Negative Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate the robustness of vision transformers (ViTs) through the lens of their special patch-based architectural structure, i.e., they process an image as a sequence of image patches. |
Yao Qin; Chiyuan Zhang; Ting Chen; Balaji Lakshminarayanan; Alex Beutel; Xuezhi Wang; |
489 | Animatable 3D-Aware Face Image Generation for Realistic Video Avatars Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an animatable 3D-aware face image generation method. |
Yue Wu; Yu Deng; Jiaolong Yang; Fangyun Wei; Qifeng Chen; Xin Tong; |
490 | Normalizing Flows for Knockoff-free Controlled Feature Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Using a normalizing flow to fit an arbitrary feature distribution, our method identifies relevant features while controlling false discoveries. |
Derek Hansen; Brian Manzo; Jeffrey Regier; |
491 | Learning to Break The Loop: Analyzing and Mitigating Repetitions for Neural Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We analyze consecutive sentence repetitions in language models and propose a simple and effective method to mitigate them. |
Jin Xu; Xiaojiang Liu; Jianhao Yan; Deng Cai; Huayang Li; Jian Li; |
492 | RCNNs Learn Succinct Learning Algorithms in Polynomial Time Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We describe a natural architecture with recurrent and convolutional weight sharing which, when trained using SGD with random initialization and restarts, can perform as well as all constant sized TMs. |
Surbhi Goel; Cyril Zhang; Sham Kakade; Adam Kalai; |
493 | Hidden Progress in Deep Learning: SGD Learns Parities Near The Computational Limit Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While there are some accounts of how these resources modulate statistical capacity, far less is known about their effect on the computational problem of model training. This work conducts such an exploration through the lens of learning $k$-sparse parities of $n$ bits, a canonical family of problems which pose theoretical computational barriers. |
Boaz Barak; Benjamin Edelman; Surbhi Goel; Sham Kakade; Eran Malach; Cyril Zhang; |
494 | Stimulative Training of Residual Networks: A Social Psychology Perspective of Loafing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper understands and improves residual networks from a social psychology perspective of loafing. |
Peng Ye; Shengji Tang; Baopu Li; Tao Chen; Wanli Ouyang; |
495 | Diffusion Visual Counterfactual Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current approaches for the generation of VCEs are restricted to adversarially robust models and often contain non-realistic artefacts or are restricted to image classification problems with few classes. In this paper we overcome this by generating Diffusion Visual Counterfactual Explanations (DVCEs) for arbitrary ImageNet classifiers via a diffusion process. |
Maximilian Augustin; Valentyn Boreiko; Francesco Croce; Matthias Hein; |
496 | Accelerated Projected Gradient Algorithms for Sparsity Constrained Optimization Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For optimization problems with a sparsity constraint, we propose acceleration methods with provably faster convergence rates and significantly faster empirical speed than the state of the art. |
Jan Harold Alcantara; Ching-pei Lee; |
497 | TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As such, we study the challenging problem of task oriented detection, which aims to find objects that best afford an action indicated by verbs like "sit comfortably on". |
Pengfei Li; Beiwen Tian; Yongliang Shi; Xiaoxue Chen; Hao Zhao; Guyue Zhou; Ya-Qin Zhang; |
498 | DENSE: Data-Free One-Shot Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Despite the low communication cost, existing one-shot FL methods are mostly impractical or face inherent limitations, e.g., a public dataset is required, clients’ models are homogeneous, and additional data/model information needs to be uploaded. To overcome these issues, we propose a novel two-stage Data-frEe oNe-Shot federated lEarning (DENSE) framework, which trains the global model by a data generation stage and a model distillation stage. |
Jie Zhang; Chen Chen; Bo Li; Lingjuan Lyu; Shuang Wu; Shouhong Ding; Chunhua Shen; Chao Wu; |
499 | A Stochastic Linearized Augmented Lagrangian Method for Decentralized Bilevel Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work develops a stochastic linearized augmented Lagrangian method (SLAM) for solving general nonconvex bilevel optimization problems over a graph, where both upper and lower optimization variables are able to achieve a consensus. |
Songtao Lu; Siliang Zeng; Xiaodong Cui; Mark Squillante; Lior Horesh; Brian Kingsbury; Jia Liu; Mingyi Hong; |
500 | Fine-Grained Analysis of Stability and Generalization for Modern Meta Learning Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide fine-grained analysis of stability and generalization for modern meta learning algorithms. |
Jiechao Guan; Yong Liu; Zhiwu Lu; |
501 | Quasi-Newton Methods for Saddle Point Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose random Broyden family updates, which have explicit local superlinear convergence rate of ${\mathcal O}\big(\big(1-1/(n\kappa^2)\big)^{k(k-1)/2}\big)$, where $n$ is the dimension of the problem, $\kappa$ is the condition number and $k$ is the number of iterations. |
Chengchang Liu; Luo Luo; |
502 | Training Language Models to Follow Instructions with Human Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We fine-tune GPT-3 using data collected from human labelers. The resulting model, called InstructGPT, outperforms GPT-3 on a range of NLP tasks. |
Long Ouyang; Jeffrey Wu; Xu Jiang; Diogo Almeida; Carroll Wainwright; Pamela Mishkin; Chong Zhang; Sandhini Agarwal; Katarina Slama; Alex Ray; John Schulman; Jacob Hilton; Fraser Kelton; Luke Miller; Maddie Simens; Amanda Askell; Peter Welinder; Paul Christiano; Jan Leike; Ryan Lowe; |
503 | Deterministic Langevin Monte Carlo with Normalizing Flows for Bayesian Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a general purpose Bayesian inference algorithm for expensive likelihoods, replacing the stochastic term in Langevin equation with a deterministic density gradient term. |
Uros Seljak; Richard Grumitt; Biwei Dai; |
504 | Batch Size-invariance for Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how to make PPO batch size-invariant (changes to the batch size can largely be compensated for by changing other hyperparameters) by decoupling the proximal policy (used for controlling the size of policy updates) from the behavior policy; a hedged sketch of this decoupling follows this entry. |
Jacob Hilton; Karl Cobbe; John Schulman; |
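A hedged sketch of the decoupling described above (interfaces such as `policy.log_prob(obs, actions)` are hypothetical, and this is not the authors' implementation): the proximal policy is kept as an exponential moving average of the learned policy and controls the clipped update, while importance weights against the behavior policy correct for stale data.

```python
import torch

@torch.no_grad()
def update_proximal_policy(proximal_policy, policy, beta=0.99):
    """Keep the proximal policy as an EMA of the learned policy's parameters,
    decoupled from the behavior policy that collected the data."""
    for p_prox, p in zip(proximal_policy.parameters(), policy.parameters()):
        p_prox.mul_(beta).add_(p, alpha=1.0 - beta)

def decoupled_ppo_loss(policy, proximal_policy, obs, actions, advantages,
                       behavior_logp, clip_eps=0.2):
    """PPO-style surrogate: the clipped ratio is taken w.r.t. the proximal
    policy; importance weights correct for the behavior policy."""
    logp = policy.log_prob(obs, actions)            # hypothetical interface
    with torch.no_grad():
        logp_prox = proximal_policy.log_prob(obs, actions)
    ratio_prox = torch.exp(logp - logp_prox)        # controlled by clipping
    iw = torch.exp(logp_prox - behavior_logp)       # off-policy correction
    clipped = torch.clamp(ratio_prox, 1 - clip_eps, 1 + clip_eps)
    return -(iw * torch.min(ratio_prox * advantages,
                            clipped * advantages)).mean()
```

The EMA coefficient `beta` and clip range here are illustrative; the point is only that the update-size control no longer depends on how fresh the behavior data is.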
505 | ACIL: Analytic Class-Incremental Learning with Absolute Memorization and Privacy Protection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by learning of linear problems, we propose an analytic class-incremental learning (ACIL) with absolute memorization of past knowledge while avoiding breaching of data privacy (i.e., without storing historical data). |
HUIPING ZHUANG; Zhenyu Weng; Hongxin Wei; RENCHUNZI XIE; Kar-Ann Toh; Zhiping Lin; |
506 | Rate-Distortion Theoretic Bounds on Generalization Error for Distributed Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that distributed learning setup has a smaller generalization error than the corresponding centralized setup, for certain cases. |
Milad Sefidgaran; Romain Chor; Abdellatif Zaidi; |
507 | Multi-agent Dynamic Algorithm Configuration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose MA-DAC to solve the dynamic configuration of algorithms with multiple types of hyperparameters, where one agent works to handle one type of configuration hyperparameter. |
Ke Xue; Jiacheng Xu; Lei Yuan; Miqing Li; Chao Qian; Zongzhang Zhang; Yang Yu; |
508 | Weakly Supervised Knowledge Distillation for Whole Slide Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an end-to-end weakly supervised knowledge distillation framework (WENO) for WSI classification. |
Linhao Qu; xiaoyuan luo; Manning Wang; Zhijian Song; |
509 | Semantic Exploration from Language Abstractions and Pretrained Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Novelty-based exploration methods can suffer in high-dimensional state spaces, such as continuous partially-observable 3D environments. We address this challenge by defining novelty using semantically meaningful state abstractions, which can be found in learned representations shaped by natural language. |
Allison Tam; Neil Rabinowitz; Andrew Lampinen; Nicholas Roy; Stephanie Chan; DJ Strouse; Jane Wang; Andrea Banino; Felix Hill; |
510 | Curious Exploration Via Structured World Models Yields Zero-Shot Object Manipulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose CEE-US, a method combining the learning of GNNs as structured world models with curiosity-driven, planning-based exploration, that achieves zero-shot downstream task generalization in multi-object manipulation tasks. |
Cansu Sancaktar; Sebastian Blaes; Georg Martius; |
511 | Multi-Objective Bayesian Optimization with Pareto Set Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel Pareto set learning (PSL) method to approximate the whole Pareto set for expensive multi-objective optimization problems. |
Xi Lin; Zhiyuan Yang; Xiaoyuan Zhang; Qingfu Zhang; |
512 | Robust Binary Models By Pruning Randomly-initialized Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a framework to find robust sub-networks from randomly-initialized binary networks without updating the model parameters. |
Chen Liu; Ziqi Zhao; Sabine Süsstrunk; Mathieu Salzmann; |
513 | A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We extend the study of PLM subnetworks to the OOD scenario, investigating whether there exist PLM subnetworks that are both sparse and robust against dataset bias. |
Yuanxin Liu; Fandong Meng; Zheng Lin; Jiangnan Li; Peng Fu; Yanan Cao; Weiping Wang; Jie Zhou; |
514 | Hierarchical Normalization for Robust Monocular Depth Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel multi-scale depth normalization method that hierarchically normalizes the depth representations based on spatial information and depth distributions. |
Chi Zhang; Wei Yin; Billzb Wang; Gang Yu; Chunhua Shen; BIN FU; |
515 | Integral Probability Metrics PAC-Bayes Bounds Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a PAC-Bayes-style generalization bound which enables the replacement of the KL-divergence with a variety of Integral Probability Metrics (IPM). |
Ron Amit; Baruch Epstein; Shay Moran; Ron Meir; |
516 | AutoLink: Self-supervised Learning of Human Skeletons and Object Outlines By Linking Keypoints Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a self-supervised method that learns the common object structure as a graph that links keypoints to skeletons. |
Xingzhe He; Bastian Wandt; Helge Rhodin; |
517 | DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, in practice, GAN-based offline RL methods have not outperformed alternative approaches, perhaps because the generator is trained to both fool the discriminator and maximize return – two objectives that are often at odds with each other. In this paper, we show that the issue of conflicting objectives can be resolved by training two generators: one that maximizes return, with the other capturing the remainder of the data distribution in the offline dataset, such that the mixture of the two is close to the behavior policy. |
Quan Vuong; Aviral Kumar; Sergey Levine; Yevgen Chebotar; |
518 | Embrace The Gap: VAEs Perform Independent Mechanism Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The gap between ELBO and log-likelihood helps variational autoencoders with near-deterministic decoders learn useful representations by performing independent mechanism analysis. |
Patrik Reizinger; Luigi Gresele; Jack Brady; Julius von Kügelgen; Dominik Zietlow; Bernhard Schölkopf; Georg Martius; Wieland Brendel; Michel Besserve; |
519 | Disentangling Causal Effects from Sets of Interventions in The Presence of Unobserved Confounders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We formally characterise the conditions under which single-variable causal effects can be learnt from only observational and multi-variable interventional data — providing identification proofs alongside an estimation method we evaluate empirically. |
Olivier Jeunen; Ciarán Gilligan-Lee; Rishabh Mehrotra; Mounia Lalmas; |
520 | To Update or Not to Update? Neurons at Equilibrium in Deep Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we shift our focus from the single parameters to the behavior of the whole neuron, exploiting the concept of neuronal equilibrium (NEq). |
Andrea Bragagnolo; Enzo Tartaglione; Marco Grangetto; |
521 | An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To leverage the segment structure of piecewise stable context in real-world applications, in this paper, we propose a Segmented Context Belief Augmented Deep (SeCBAD) RL method. |
Xiaoyu Chen; Xiangming Zhu; Yufeng Zheng; Pushi Zhang; Li Zhao; Wenxue Cheng; Peng CHENG; Yongqiang Xiong; Tao Qin; Jianyu Chen; Tie-Yan Liu; |
522 | SHAQ: Incorporating Shapley Value Theory Into Multi-Agent Q-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on the properties of MSV, we derive the Shapley-Bellman optimality equation (SBOE) to evaluate the optimal MSV, which corresponds to an optimal joint deterministic policy. |
Jianhong Wang; Yuan Zhang; Yunjie Gu; Tae-Kyun Kim; |
523 | BMU-MoCo: Bidirectional Momentum Update for Continual Video-Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel cross-modal BMU-MoCo with bidirectional momentum update for continual video-language modeling. |
Yizhao Gao; Nanyi Fei; Haoyu Lu; Zhiwu Lu; Hao Jiang; Yijie Li; Zhao Cao; |
524 | Structural Analysis of Branch-and-Cut and The Learnability of Gomory Mixed Integer Cuts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We conduct a novel structural analysis of branch-and-cut that pins down how every step of the algorithm is affected by changes in the parameters defining the cutting planes added to the input integer program. |
Maria-Florina Balcan; Siddharth Prasad; Tuomas Sandholm; Ellen Vitercik; |
525 | Joint Entropy Search for Multi-Objective Bayesian Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the Joint Entropy Search acquisition function for multi-objective Bayesian optimization and showcase its effectiveness on some practical problems. |
Ben Tu; Axel Gandy; Nikolas Kantas; Behrang Shafei; |
526 | GAR: Generalized Autoregression for Multi-Fidelity Fusion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite the fast developments of multi-fidelity fusion techniques, most existing methods require particular data structures and do not scale well to high-dimensional output. To resolve these issues, we generalize the classic autoregression (AR), which is widely used due to its simplicity, robustness, accuracy, and tractability, and propose generalized autoregression (GAR) using tensor formulation and latent features. |
Yuxin Wang; Zheng Xing; WEI XING; |
527 | Learning The Structure of Large Networked Systems Obeying Conservation Laws Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Motivated by this important problem, we study the estimation of the sparsity structure of the matrix $B^\ast$ from $n$ samples of $Y$ under the assumption that the node injections $X$ follow a Gaussian distribution with a known covariance $\Sigma_X$. We propose a new $\ell_{1}$-regularized maximum likelihood estimator for tackling this problem in the high-dimensional regime where the size of the network may be vastly larger than the number of samples $n$. |
Anirudh Rayas; Rajasekhar Anguluri; Gautam Dasarathy; |
528 | CEIP: Combining Explicit and Implicit Priors for Reinforcement Learning with Demonstrations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a model which uses task-agnostic and task-specific demonstrations both explicitly and implicitly to improve the efficiency of reinforcement learning. |
Kai Yan; Alex Schwing; Yu-Xiong Wang; |
529 | The Neural Testbed: Evaluating Joint Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Predictive distributions quantify uncertainties ignored by point estimates. This paper introduces The Neural Testbed: an open source benchmark for controlled and principled evaluation of agents that generate such predictions. |
Ian Osband; Zheng Wen; Seyed Mohammad Asghari; Vikranth Dwaracherla; Xiuyuan Lu; MORTEZA IBRAHIMI; Dieterich Lawson; Botao Hao; Brendan O’Donoghue; Benjamin Van Roy; |
530 | Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider Follow-the-Perturbed-Leader (FTPL) algorithms for Adversarial Markov Decision Processes (AMDPs) in episodic settings. We also extend them to delayed AMDPs as well as infinite-horizon communicating AMDPs. |
Yan Dai; Haipeng Luo; Liyu Chen; |
531 | On The Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the finite time global convergence to a Nash equilibrium for decentralized softmax gradient play algorithms under the Markov potential game setting. |
Runyu Zhang; Jincheng Mei; Bo Dai; Dale Schuurmans; Na Li; |
532 | Generative Status Estimation and Information Decoupling for Image Rain Removal Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We construct SEIDNet, a generative network equipped with the pixel-wise Status Estimation and the Information Decoupling for rain removal. |
Di Lin; Xin WANG; Jia Shen; Renjie Zhang; Ruonan Liu; Miaohui Wang; Wuyuan Xie; Qing Guo; Ping Li; |
533 | Learning to Compare Nodes in Branch and Bound with Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Branch-and-bound approaches in integer programming require ordering portions of the space to explore next, a problem known as node comparison. We propose a new siamese graph neural network model to tackle this problem, where the nodes are represented as bipartite graphs with attributes. |
Abdel Ghani Labassi; Didier Chetelat; Andrea Lodi; |
534 | Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes Under Non-Parametric Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish the first finite-sample error bounds for OPE in confounded POMDPs under non-parametric models. |
Rui Miao; Zhengling Qi; Xiaoke Zhang; |
535 | UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a unified model for computer vision, which does not require any task-specific components. |
Alexander Kolesnikov; André Susano Pinto; Lucas Beyer; Xiaohua Zhai; Jeremiah Harmsen; Neil Houlsby; |
536 | InterpretDL: Explaining Deep Models in PaddlePaddle Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce InterpretDL, a toolkit of explanation algorithms based on PaddlePaddle, with unified programming interfaces and "plug-and-play" designs. |
Xuhong Li; Haoyi Xiong; Xingjian Li; Xuanyu Wu; Zeyu Chen; Dejing Dou; |
537 | EfficientViT: Vision Transformers at MobileNet Speed Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our work proves that properly designed transformers can reach extremely low latency on mobile devices while maintaining high performance. |
Yanyu Li; Geng Yuan; Yang Wen; Ju Hu; Georgios Evangelidis; Sergey Tulyakov; Yanzhi Wang; Jian Ren; |
538 | Joint Estimation and Inference for Data Integration Problems Based on Multiple Multi-layered Gaussian Graphical Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The rapid development of high-throughput technologies has enabled the generation of data from biological or disease processes that span multiple layers, like genomic, proteomic or metabolomic data, and further pertain to multiple sources, like disease subtypes or experimental conditions. In this work, we propose a general statistical framework based on Gaussian graphical models for horizontal (i.e. across conditions or subtypes) and vertical (i.e. across different layers containing data on molecular compartments) integration of information in such datasets. |
Subhabrata Majumdar; George Michailidis; |
539 | Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We improve 3D-aware GANs by making the discriminator 3D-aware as well, resulting in far more accurate 3D shapes. |
Zifan Shi; Yinghao Xu; Yujun Shen; Deli Zhao; Qifeng Chen; Dit-Yan Yeung; |
540 | Accelerating Sparse Convolution for Efficient Neural Network Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an algorithm-software co-designed sparse convolution based on a novel out-vector-wise (OVW) sparse pattern. |
Yijun Tan; Kai Han; Kang Zhao; Xianzhi Yu; Zidong Du; Yunhe Wang; Jun Yao; Yunji Chen; |
541 | Exploiting The Relationship Between Kendall’s Rank Correlation and Cosine Similarity for Attribution Protection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first show that the expected Kendall’s rank correlation is positively correlated to cosine similarity and then indicate that the direction of attribution is the key to attribution robustness. Based on these findings, we explore the vector space of attribution to explain the shortcomings of attribution defense methods using $\ell_p$ norm and propose integrated gradient regularizer (IGR), which maximizes the cosine similarity between natural and perturbed attributions. |
Fan Wang; Adams Wai Kin Kong; |
542 | Improved Fine-Tuning By Better Leveraging Pre-Training Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose to select and use pre-training data in the fine-tuning stage motivated by our theoretical analysis. |
Ziquan Liu; Yi Xu; Yuanhong Xu; Qi Qian; Hao Li; Xiangyang Ji; Antoni Chan; Rong Jin; |
543 | Addressing Leakage in Concept Bottleneck Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Leakage adversely affects the performance and interpretability of concept bottleneck models. We address the underlying causes. |
Marton Havasi; Sonali Parbhoo; Finale Doshi-Velez; |
544 | Planning for Sample Efficient Imitation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a planning-based method to improve the sample efficiency of imitation learning. |
Zhao-Heng Yin; Weirui Ye; Qifeng Chen; Yang Gao; |
545 | Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, we propose Pre-trained Image Encoder for Generalizable visual reinforcement learning (PIE-G), a simple yet effective framework that can generalize to the unseen visual scenarios in a zero-shot manner. |
Zhecheng Yuan; Zhengrong Xue; Bo Yuan; Xueqian Wang; YI WU; Yang Gao; Huazhe Xu; |
546 | Neural-Symbolic Entangled Framework for Complex Query Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a Neural and Symbolic Entangled framework (ENeSy) for complex query answering, which enables the neural and symbolic reasoning to enhance each other to alleviate the cascading error and KG incompleteness. |
Zezhong Xu; Wen Zhang; Peng Ye; Hui Chen; Huajun Chen; |
547 | MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Different from them, this paper studies a new scenario as unpaired image-text matching, in which paired images and texts are assumed to be unavailable during model training. |
Yan Huang; Yuming Wang; Yunan Zeng; Liang Wang; |
548 | Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a method (BeamCLIP) that can effectively transfer the representations of a large pre-trained multimodal model (CLIP-ViT) into a small target model (e.g., ResNet-18). |
Byoungjip Kim; Sungik Choi; Dasol Hwang; Moontae Lee; Honglak Lee; |
549 | Real-Valued Backpropagation Is Unsuitable for Complex-Valued Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We theoretically show that real-valued backpropagation reduces the training dynamics of complex networks to that of ordinary real networks as the widths grow. |
Zhi-Hao Tan; Yi Xie; Yuan Jiang; Zhi-Hua Zhou; |
550 | Learning Latent Seasonal-Trend Representations for Time Series Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the success of disentangled variational autoencoder in computer vision and classical time series decomposition, we plan to infer a couple of representations that depict seasonal and trend components of time series. To achieve this goal, we propose LaST, which, based on variational inference, aims to disentangle the seasonal-trend representations in the latent space. |
Zhiyuan Wang; Xovee Xu; Goce Trajcevski; Weifeng Zhang; Ting Zhong; Fan Zhou; |
551 | Semi-Discrete Normalizing Flows Through Differentiable Tessellation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We combine Voronoi tessellation with normalizing flows to construct a new invertible transformation that has learnable discrete structure. We construct new tessellation-based dequantization and disjoint mixture modeling approaches. |
Ricky T. Q. Chen; Brandon Amos; Maximilian Nickel; |
552 | Pure Transformers Are Powerful Graph Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that standard Transformers without graph-specific modifications can work well in graph learning both in theory and practice. |
Jinwoo Kim; Dat Nguyen; Seonwoo Min; Sungjun Cho; Moontae Lee; Honglak Lee; Seunghoon Hong; |
553 | Diversity Vs. Recognizability: Human-like Generalization in One-shot Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose and test a new framework to evaluate one-shot image generation models. |
Victor Boutin; Lakshya Singhal; Xavier Thomas; Thomas Serre; |
554 | Optimal-er Auctions Through Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We improve RegretNet, a deep learning based approach to optimal auction design, by increasing its revenue with an attention-based architecture and simplifying its tuning with an interpretable loss function. |
Dmitry Ivanov; Iskander Safiulin; Igor Filippov; Ksenia Balabaeva; |
555 | Graph Learning Assisted Multi-Objective Integer Programming Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a graph neural network based method to solve multi-objective integer programs. |
Yaoxin Wu; Wen Song; Zhiguang Cao; Jie Zhang; Abhishek Gupta; Mingyan Lin; |
556 | Simulation-guided Beam Search for Neural Combinatorial Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simulation-guided beam search method and its combination with EAS (efficient active search) that significantly improve the inference performance of neural approaches for combinatorial optimization. |
Jinho Choo; Yeong-Dae Kwon; Jihoon Kim; Jeongwoo Jae; André Hottung; Kevin Tierney; Youngjune Gwon; |
557 | Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an analytic solution for the optimal metric, based on the analysis of bias and variance. |
Haanvid Lee; Jongmin Lee; Yunseon Choi; Wonseok Jeon; Byung-Jun Lee; Yung-Kyun Noh; Kee-Eung Kim; |
558 | Minimax-Optimal Multi-Agent RL in Zero-Sum Markov Games With A Generative Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper develops a minimax-optimal algorithm for learning the equilibrium of a Markov game in the presence of a generative model. |
Gen Li; Yuejie Chi; Yuxin Chen; Yuting Wei; |
559 | Sharpness-Aware Training for Free Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Sharpness-Aware Training for Free, or SAF, which mitigates the sharp landscape at almost zero additional computational cost over the base optimizer. |
JIAWEI DU; Daquan Zhou; Joey Tianyi Zhou; Jiashi Feng; Vincent Tan; |
560 | Random Sharpness-Aware Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, one-step gradient ascent may not be sufficient and multi-step gradient ascents will cause additional training costs. Based on this observation, we propose a novel random smoothing based SAM (R-SAM) algorithm. |
Yong Liu; Siqi Mai; Minhao Cheng; Xiangning Chen; Cho-Jui Hsieh; Yang You; |
561 | Beyond Mahalanobis Distance for Textual OOD Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new detector called TRUSTED. |
Pierre Colombo; Eduardo Dadalto; Guillaume Staerman; Nathan Noiry; Pablo Piantanida; |
562 | Universality of Group Convolutional Neural Networks Based on Ridgelet Analysis on Groups Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We have obtained an analysis operator, called the ridgelet transform, for a general GCNN, and shown a universal approximation theorem in a unified, direct and constructive manner. |
Sho Sonoda; Isao Ishikawa; Masahiro Ikeda; |
563 | Augmented RBMLE-UCB Approach for Adaptive Control of Linear Quadratic Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of controlling a stochastic linear system with quadratic costs, when its system parameters are not known to the agent — called the adaptive LQ control problem. |
Akshay Mete; Rahul Singh; P. R. Kumar; |
564 | MetricFormer: A Unified Perspective of Correlation Exploring in Similarity Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new method called MetricFormer, which can effectively capture and model the multiple correlations in a unified perspective. |
Jiexi Yan; Erkun Yang; Cheng Deng; Heng Huang; |
565 | Error Correction Code Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel SOTA Neural error correction decoder based on Transformers. |
Yoni Choukroun; Lior Wolf; |
566 | A Unified Diversity Measure for Multiagent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel metric called the Unified Diversity Measure (UDM) that offers a unified view for existing diversity metrics. |
Zongkai Liu; Chao Yu; Yaodong Yang; peng sun; Zifan Wu; Yuan Li; |
567 | Outsourcing Training Without Uploading Data Via Efficient Collaborative Open-Source Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a novel method to sample proximal data from multiple agnostic sources for outsourcing training without uploading data. |
Junyuan Hong; Lingjuan Lyu; Jiayu Zhou; Michael Spranger; |
568 | Dynamic Pricing with Monotonicity Constraint Under Unknown Parametric Demand Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present optimal regret bounds for markdown pricing with unknown parametric demand functions, improving upon previous results without the parametric assumption. |
Su Jia; Andrew Li; R Ravi; |
569 | Make Some Noise: Reliable and Efficient Single-Step Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a novel single-step attack for adversarial training that can prevent catastrophic overfitting while obtaining a 3x speed-up. |
Pau de Jorge Aranda; Adel Bibi; Riccardo Volpi; Amartya Sanyal; Philip Torr; Gregory Rogez; Puneet Dokania; |
570 | Branch & Learn for Recursively and Iteratively Solvable Problems in Predict+Optimize Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes Branch & Learn, a framework for Predict+Optimize to tackle optimization problems containing parameters that are unknown at the time of solving. |
Xinyi Hu; Jasper Lee; Jimmy Lee; Allen Z. Zhong; |
571 | Stochastic Adaptive Activation Function Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The proposed activation function adaptively rectifies its inputs by fine-tuning the threshold potential according to the inputs. |
Kyungsu Lee; Jaeseung Yang; Haeyun Lee; Jae Youn Hwang; |
572 | Generalizing Bayesian Optimization with Decision-theoretic Entropies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a Bayesian optimization procedure based on a decision-theoretic generalization of entropy, which can be tailored to custom optimization and other sequential decision making tasks. |
Willie Neiswanger; Lantao Yu; Shengjia Zhao; Chenlin Meng; Stefano Ermon; |
573 | A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We prove a new generalization bound that shows that, for any class of linear predictors in Gaussian space, the Rademacher complexity of the class and the training error under any continuous loss $\ell$ can control the test error under all Moreau envelopes of the loss $\ell$. |
Lijia Zhou; Frederic Koehler; Pragya Sur; Danica J. Sutherland; Nati Srebro; |
574 | Statistically Meaningful Approximation: A Case Study on Approximating Turing Machines with Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new notion of "statistically meaningful" approximation and show that neural nets can statistically-meaningfully approximate Boolean circuits and Turing machines. |
Colin Wei; Yining Chen; Tengyu Ma; |
575 | Provable General Function Class Representation Learning in Multitask Bandits and MDP Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we extend the analysis to general function class representations. |
Rui Lu; Andrew Zhao; Simon Du; Gao Huang; |
576 | On Learning Fairness and Accuracy on Multiple Subgroups Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an analysis in fair learning that preserves the utility of the data while reducing prediction disparities under the criteria of group sufficiency. |
Changjian Shui; Gezheng Xu; Qi CHEN; Jiaqi Li; Charles Ling; Tal Arbel; Boyu Wang; Christian Gagné; |
577 | VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We combine pretraining, offline RL and online RL into a 3-stage framework that makes Adroit robotic hand learning up to 24x more sample efficient than previous SOTA. |
Che Wang; Xufang Luo; Keith Ross; Dongsheng Li; |
578 | Locally Hierarchical Auto-Regressive Modeling for Image Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a Hierarchical-Quantized Transformer (HQ-Transformer) to model multi-level discrete sequences efficiently and generate novel images of good quality. |
Tackgeun You; Saehoon Kim; Chiheon Kim; Doyup Lee; Bohyung Han; |
579 | Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a completely new language model for sentence scoring. |
Kaitao Song; Yichong Leng; Xu Tan; Yicheng Zou; Tao Qin; Dongsheng Li; |
580 | Mind The Gap: Understanding The Modality Gap in Multi-modal Contrastive Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present modality gap, an intriguing geometric phenomenon of the representation space of multi-modal models. |
Victor Weixin Liang; Yuhui Zhang; Yongchan Kwon; Serena Yeung; James Zou; |
581 | A Rotated Hyperbolic Wrapped Normal Distribution for Hierarchical Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a rotated hyperbolic wrapped normal distribution, a simple yet effective alteration of a hyperbolic wrapped normal distribution, applicable to the representation learning of the data with hierarchy. |
Seunghyuk Cho; Juyong Lee; Jaesik Park; Dongwoo Kim; |
582 | Bootstrapped Transformer for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Bootstrapped Transformer to self-generate more offline data to boost the training of sequence model for offline reinforcement learning. |
Kerong Wang; Hanye Zhao; Xufang Luo; Kan Ren; Weinan Zhang; Dongsheng Li; |
583 | Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an approach that uses relational abstractions for few-shot learning of generalized policies for SSPs that can be used to quickly solve larger SSPs containing more objects while guaranteeing completeness and hierarchical optimality. |
Rushang Karia; Rashmeet Kaur Nayyar; Siddharth Srivastava; |
584 | Towards Consistency in Adversarial Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study calibration and consistency of losses in the adversarial setting. |
Laurent Meunier; Raphael Ettedgui; Rafael Pinot; Yann Chevaleyre; Jamal Atif; |
585 | GLIPv2: Unifying Localization and Vision-Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a region-aware vision-language pre-trained model that serves both localization tasks (e.g., object detection, instance segmentation) and understanding tasks (e.g., VQA, image captioning). |
Haotian Zhang; Pengchuan Zhang; Xiaowei Hu; Yen-Chun Chen; Liunian Li; Xiyang Dai; Lijuan Wang; Lu Yuan; Jenq-Neng Hwang; Jianfeng Gao; |
586 | Distilled Gradient Aggregation: Purify Features for Input Attribution in The Deep Neural Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we design a new input attribution method which adopts the strengths of both local and global attributions. In particular, we propose a novel approach to distill input features using weak and extremely positive contributor masks. |
Giyoung Jeon; Haedong Jeong; Jaesik Choi; |
587 | Optimal Positive Generation Via Latent Transformation for Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Leveraging the remarkable property of pretrained generative models, we propose to generate instance-specific optimal positive samples for contrastive learning. |
Yinqi Li; Hong Chang; Bingpeng MA; Shiguang Shan; Xilin Chen; |
588 | [Re] Replication Study of DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal of the original paper is to create a model that takes as input a biased dataset and outputs a debiased synthetic dataset that can be used to train downstream models to make unbiased predictions both on synthetic and real data. |
Velizar Shulev; Paul Verhagen; Shuai Wang; Jennifer Zhuge; |
589 | Polynomial Time Guarantees for The Burer-Monteiro Method Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The bound on $p$ approaches the celebrated Barvinok-Pataki bound in the limit as $\eta$ goes to zero, beneath which the nonconvex program can be suboptimal. Our main technical contribution, which is key for our tight bound on $p$, is to connect spurious approximately critical points of the nonconvex program to tubular neighborhoods of certain algebraic varieties, and then estimate the volume of such tubes. |
Diego Cifuentes; Ankur Moitra; |
590 | Factored Adaptation for Non-Stationary Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a factored adaptation framework for nonstationary RL and show that learned factored representations improve the rewards and robustness under non-stationarity. |
Fan Feng; Biwei Huang; Kun Zhang; Sara Magliacane; |
591 | Whitening Convergence Rate of Affine Coupling Flows Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show and confirm experimentally an explicit convergence rate for coupling-based normalizing flows for whitening in terms of KL divergence. |
Felix Draxler; Christoph Schnörr; Ullrich Köthe; |
592 | Proximal Learning With Opponent-Learning Awareness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce POLA, a policy parameterization invariant version of LOLA, and empirically show that approximations to POLA learn reciprocity-based cooperation more reliably than LOLA. |
Stephen Zhao; Chris Lu; Roger Grosse; Jakob Foerster; |
593 | Towards Improving Calibration in Object Detection Under Domain Shift Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose new techniques to improve calibration of visual object detection methods including domain-adaptive ones, especially under domain-shift. |
Muhammad Akhtar Munir; Muhammad Haris Khan; M. Sarfraz; Mohsen Ali; |
594 | An Analysis of Ensemble Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive a general analysis template for approximate Thompson sampling, and based on it provide the first rigorous analysis of ensemble sampling. |
Chao Qin; Zheng Wen; Xiuyuan Lu; Benjamin Van Roy; |
595 | M2N: Mesh Movement Networks for PDE Solvers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose, to the best of our knowledge, the first learning-based end-to-end mesh movement framework for PDE solvers. |
Wenbin Song; Mingrui Zhang; Joseph G Wallwork; Junpeng Gao; Zheng Tian; Fanglei Sun; Matthew Piggott; Junqing Chen; Zuoqiang Shi; Xiang Chen; Jun Wang; |
596 | Towards Theoretically Inspired Neural Initialization Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network. |
Yibo Yang; Hong Wang; Haobo Yuan; Zhouchen Lin; |
597 | What Is Where By Looking: Weakly-Supervised Open-World Phrase-Grounding Without Text Inputs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We address the task of weakly-supervised open-world phrase grounding without text inputs. |
Tal Shaharabany; Yoad Tewel; Lior Wolf; |
598 | Algorithms and Hardness for Learning Linear Thresholds from Label Proportions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work provides algorithmic and hardness results for learning linear thresholds from label proportions for bag size >= 3. |
Rishi Saket; |
599 | Confidence-based Reliable Learning Under Dual Noises Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Technically, we develop a confidence-based sample filter to progressively filter out noisy data without the need of pre-specifying noise ratio. |
Peng Cui; Yang Yue; Zhijie Deng; Jun Zhu; |
600 | RecursiveMix: Mixed Learning with History Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a recursive mixed-sample learning paradigm, termed “RecursiveMix” (RM), by exploring a novel training strategy that leverages the historical input-prediction-label triplets. |
Lingfeng Yang; Xiang Li; Borui Zhao; Renjie Song; Jian Yang; |
601 | Improving Self-Supervised Learning By Characterizing Idealized Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We characterize idealized self-supervised representations, which leads to actionable insights for improving SSL algorithms. |
Yann Dubois; Stefano Ermon; Tatsunori Hashimoto; Percy Liang; |
602 | Split-kl and PAC-Bayes-split-kl Inequalities Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new concentration of measure inequality for sums of independent bounded random variables, which we name a split-kl inequality. |
Yi-Shan Wu; Yevgeny Seldin; |
603 | Signal Propagation in Transformers: Theoretical Perspectives and The Role of Rank Collapse Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The question of if and how rank collapse affects training is still largely unanswered, and its investigation is necessary for a more comprehensive understanding of this architecture. In this work, we shed new light on the causes and the effects of this phenomenon. |
Sotiris Anagnostidis; Luca Biggio; Lorenzo Noci; Antonio Orvieto; Sidak Pal Singh; Aurelien Lucchi; |
604 | VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a unified Vision-Language pretrained Model (VLMo) that jointly learns a dual encoder and a fusion encoder with a modular Transformer network. |
Hangbo Bao; Wenhui Wang; Li Dong; Qiang Liu; Owais Khan Mohammed; Kriti Aggarwal; Songhao Piao; Subhojit Som; Furu Wei; |
605 | A Theoretical Framework for Inference Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop a novel theoretical framework for inference learning, a biologically plausible local learning algorithm for deep neural networks. |
Nick Alonso; Beren Millidge; Jeffrey Krichmar; Emre O Neftci; |
606 | Self-Organized Group for Cooperative Multi-agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a spontaneously grouping mechanism to promote generalization ability for multi-agent reinforcement learning. |
Jianzhun Shao; Zhiqiang Lou; Hongchang Zhang; Yuhang Jiang; Shuncheng He; Xiangyang Ji; |
607 | A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel robust elimination-type algorithm that runs in epochs, combines exploration with infrequent switching to select a small subset of actions, and plays each action for multiple time instants. |
Ilija Bogunovic; Zihan Li; Andreas Krause; Jonathan Scarlett; |
608 | First Contact: Unsupervised Human-Machine Co-Adaptation Via Mutual Information Maximization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We train an assistive interface to translate a user’s raw command signals into robot actions in a completely unsupervised manner, by maximizing the mutual information between the user’s commands and the state transitions of the environment. |
Siddharth Reddy; Sergey Levine; Anca Dragan; |
609 | Learning to Branch with Tree MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we propose to learn branching rules from scratch with Reinforcement Learning (RL). |
Lara Scavuzzo; Feng Chen; Didier Chetelat; Maxime Gasse; Andrea Lodi; Neil Yorke-Smith; Karen Aardal; |
610 | Interpreting Operation Selection in Differentiable Architecture Search: A Perspective from Influence-Directed Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The vanilla DARTS assumes the optimized magnitudes reflect the importance of operations, while more recent works find this naive assumption leads to poor generalization and is without any theoretical guarantees. In this work, we leverage influence functions, the functional derivatives of the loss function, to theoretically reveal the operation selection part in DARTS and estimate the candidate operation importance by approximating its influence on the supernet with Taylor expansions. |
Miao Zhang; Wei Huang; Bin Yang; |
611 | FiLM: Frequency Improved Legendre Memory Model for Long-term Time Series Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we design a Frequency improved Legendre Memory model, or FiLM: it applies Legendre polynomial projections to approximate historical information, uses Fourier projection to remove noise, and adds a low-rank approximation to speed up computation. |
Tian Zhou; Ziqing MA; xue wang; Qingsong Wen; Liang Sun; Tao Yao; Wotao Yin; Rong Jin; |
612 | Probing Classifiers Are Unreliable for Concept Removal and Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We theoretically and experimentally demonstrate that even under favorable conditions, probing-based null-space and adversarial removal methods fail to remove the sensitive attribute from latent representation. |
Abhinav Kumar; Chenhao Tan; Amit Sharma; |
613 | Learning Multi-resolution Functional Maps with Spectral Attention for Robust Shape Matching Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our work introduces a novel non-rigid shape matching framework based on multi-resolution functional maps with spectral attention. |
Lei Li; Nicolas Donati; Maks Ovsjanikov; |
614 | Pessimism for Offline Linear Contextual Bandits Using $\ell_p$ Confidence Sets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a family $\{\widehat{\pi}_p\}_{p\ge 1}$ of pessimistic learning rules for offline learning of linear contextual bandits, relying on confidence sets with respect to different $\ell_p$ norms, where $\widehat{\pi}_2$ corresponds to Bellman-consistent pessimism (BCP), while $\widehat{\pi}_\infty$ is a novel generalization of lower confidence bound (LCB) to the linear setting. |
Gene Li; Cong Ma; Nati Srebro; |
615 | Divert More Attention to Vision-Language Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore a different path to achieve SOTA tracking via vision-language multimodal learning instead of complex Transformer. |
Mingzhe Guo; Zhipeng Zhang; Heng Fan; Liping Jing; |
616 | Adversarially Robust Learning: A Generic Minimax Optimal Learner and Characterization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a minimax optimal learner for the problem of learning predictors robust to adversarial examples at test-time. |
Omar Montasser; Steve Hanneke; Nati Srebro; |
617 | Learning General World Models in A Handful of Reward-Free Deployments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new method to learn general world models using a diverse population of self-supervised exploration agents in a handful of reward-free deployments. |
Jack Parker-Holder; Yingchen Xu; Philip Ball; Aldo Pacchiano; Oleh Rybkin; S Roberts; Tim Rocktäschel; Edward Grefenstette; |
618 | End-to-end Stochastic Programming with Energy-based Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, it has poor scalability since it requires solving and differentiating through the optimization problem at every iteration; furthermore, it can only be applied to convex problems. To address these shortcomings, we propose a new end-to-end stochastic programming method with an energy-based model. |
Lingkai Kong; Jiaming Cui; Yuchen Zhuang; Rui Feng; B. Aditya Prakash; Chao Zhang; |
619 | Memorization and Optimization in Deep Neural Networks with Minimum Over-parameterization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that the NTK is well-conditioned for deep neural networks with minimum possible over-parameterization ($\Omega(N)$ parameters and, hence, $\Omega(\sqrt{N})$ neurons — $N$ being the number of training samples). |
Simone Bombari; Mohammad Hossein Amani; Marco Mondelli; |
620 | PKD: General Distillation Framework for Object Detectors Via Pearson Correlation Coefficient Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a general distillation framework for object detectors via Pearson Correlation Coefficient to focus on the relational information from the teacher. |
Weihan Cao; Jianfei Gao; Anda Cheng; Ke Cheng; Yifan Zhang; Jian Cheng; |
621 | Regret Bounds for Multilabel Classification in Sparse Label Regimes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate regret upper and lower bounds for multi-label classification under sparsity constraints. |
Róbert Busa-Fekete; Heejin Choi; Krzysztof Dembczynski; Claudio Gentile; Henry Reeve; Balazs Szorenyi; |
622 | The Power and Limitation of Pretraining-Finetuning for Linear Regression Under Covariate Shift Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the risk bounds of pretraining-finetuning for linear regression under covariate shift. |
Jingfeng Wu; Difan Zou; Vladimir Braverman; Quanquan Gu; Sham Kakade; |
623 | Unsupervised Reinforcement Learning with Contrastive Intrinsic Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Contrastive Intrinsic Control (CIC) uses a novel contrastive loss between states and skills to achieve good performance on the state-based Unsupervised RL Benchmark. |
Michael Laskin; Hao Liu; Xue Bin Peng; Denis Yarats; Aravind Rajeswaran; Pieter Abbeel; |
624 | The Franz-Parisi Criterion and Computational Trade-offs in High Dimensional Statistics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish some rigorous connections between two different frameworks for computational hardness of statistical problems: algebraic methods based on low-degree polynomials and geometric methods rooted in statistical physics. |
Afonso Bandeira; Ahmed El Alaoui; Samuel Hopkins; Tselil Schramm; Alexander Wein; Ilias Zadik; |
625 | On The Role of Overparameterization in Temporal Difference Learning with Linear Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the role of overparameterization in Temporal Difference (TD) learning and how it affects optimization. |
Valentin Thomas; |
626 | OOD Link Prediction Generalization Capabilities of Message-Passing GNNs in Larger Test Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proves bounds on the ability of (structural) node and pairwise message-passing GNNs to inductively predict links OOD when test graphs are larger than training graphs. |
Yangze Zhou; Gitta Kutyniok; Bruno Ribeiro; |
627 | What Are The Best Systems? New Perspectives on NLP Benchmarking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a new procedure to rank systems based on their performance across different tasks. |
Pierre Colombo; Nathan Noiry; Ekhine Irurozki; Stephan Clémençon; |
628 | Scale-invariant Learning By Physics Inversion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We combine scale-invariant physics optimizers with Adam to improve learning for ill-conditioned inverse problems. |
Philipp Holl; Vladlen Koltun; Nils Thuerey; |
629 | On Feature Learning in The Presence of Spurious Correlations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore the quality of representations learned by standard ERM training and specialized group robustness methods in the presence of spurious correlations. |
Pavel Izmailov; Polina Kirichenko; Nate Gruver; Andrew Wilson; |
630 | Tight Analysis of Extra-gradient and Optimistic Gradient Methods For Nonconvex Minimax Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite the established convergence theory of Optimistic Gradient Descent Ascent (OGDA) and Extragradient (EG) methods for the convex-concave minimax problems, little is known about the theoretical guarantees of these methods in nonconvex settings. To bridge this gap, for the first time, this paper establishes the convergence of OGDA and EG methods under the nonconvex-strongly-concave (NC-SC) and nonconvex-concave (NC-C) settings by providing a unified analysis through the lens of single-call extra-gradient methods. |
Pouria Mahdavinia; Yuyang Deng; Haochuan Li; Mehrdad Mahdavi; |
631 | A Fast Scale-Invariant Algorithm for Non-negative Least Squares with Non-negative Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide a fast scale invariant algorithm with a multiplicative error guarantee for non-negative least squares problems with non-negative data. |
Jelena Diakonikolas; Chenghui Li; Swati Padmanabhan; Chaobing Song; |
632 | Detecting Abrupt Changes in Sequential Pairwise Comparison Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose novel and practicable algorithms that can localize change points in pairwise comparison data with time stamps modeled by the Bradley-Terry model and establish consistency rates for our methodology. |
Wanshan Li; Alessandro Rinaldo; Daren Wang; |
633 | How Sampling Impacts The Robustness of Stochastic Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Consequently, a gradient-based adversarial example is calculated based on one set of samples and its classification on another set. In this paper we derive a sufficient condition for such a stochastic prediction to be robust against a given sample-based attack. |
Sina Däubener; Asja Fischer; |
634 | The Computational and Learning Benefits of Daleian Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that Daleian neural networks, despite being significantly more structurally constrained, accurately approximate the computation of arbitrary non-Daleian neural networks, and demonstrate novel computational and learning benefits of Daleian networks. |
Adam Haber; Elad Schneidman; |
635 | On The Symmetries of The Synchronization Problem in Cryo-EM: Multi-Frequency Vector Diffusion Maps on The Projective Plane Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the symmetries of cryo-EM and show that relative poses in O(2) are sufficient to identify the images’ poses. Hence, we extend Vector Diffusion Maps to not only predict viewing directions similarity but also recover images’ poses. |
Gabriele Cesa; Arash Behboodi; Taco Cohen; Max Welling; |
636 | One Positive Label Is Sufficient: Single-Positive Multi-Label Learning with Label Enhancement Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, a novel SPMLL method named SMILE, i.e., Single-positive MultI-label learning with Label Enhancement, is proposed. |
Ning Xu; Congyu Qiao; Jiaqi Lv; Xin Geng; Min-Ling Zhang; |
637 | Learning Interface Conditions in Domain Decomposition Solvers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use graph neural networks and introduce an improved loss function in order to learn interface conditions for optimized Schwarz domain-decomposition algorithms, enabling their use on unstructured grids. |
Ali Taghibakhshi; Nicolas Nytko; Tareq Uz Zaman; Scott MacLachlan; Luke Olson; Matthew West; |
638 | Doubly-Asynchronous Value Iteration: Making Value Iteration Asynchronous in Actions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we establish that DAVI maintains similarly appealing theoretical properties to VI without the need to wait for a full sweep through the entire action space in each update. |
Tian Tian; Kenny Young; Richard Sutton; |
639 | Partial Identification of Treatment Effects with Implicit Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We estimate bounds on the average treatment effect using implicit generative models by estimating partial derivatives of the response function in the interval between two points on the response curve. |
Vahid Balazadeh Meresht; Vasilis Syrgkanis; Rahul Krishnan; |
640 | ShuffleMixer: An Efficient ConvNet for Image Super-Resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simple and effective approach, ShuffleMixer, for lightweight image super-resolution that combines large convolution and channel split-shuffle operation. |
Long Sun; Jinshan Pan; Jinhui Tang; |
641 | Fine-tuning Language Models Over Slow Networks Using Activation Compression with Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose AQ-SGD, a novel activation quantization algorithm for communication-efficient pipeline parallelism training over slow networks. |
Jue WANG; Binhang Yuan; Luka Rimanic; Yongjun He; Tri Dao; Beidi Chen; Christopher Ré; Ce Zhang; |
642 | A Gradient Estimator Via L1-randomization for Online Zero-order Optimization with Two Point Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new gradient estimator for zero-order optimization and study its theoretical and practical aspects. |
Arya Akhavan; Evgenii Chzhen; Massimiliano Pontil; Alexandre Tsybakov; |
643 | Feature Learning in $L_2$-regularized DNNs: Attraction/Repulsion and Sparsity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The loss of L2 regularized DNNs can be reformulated in terms of the hidden representations at every layer, with implications on the sparsity of the optimal network. |
Arthur Jacot; Eugene Golikov; Clement Hongler; Franck Gabriel; |
644 | Off-Policy Evaluation with Deficient Support Using Side Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Exploiting Side Information, we propose estimators more suitable than IPS whenever the "full support" assumption does not hold. |
Nicolò Felicioni; Maurizio Ferrari Dacrema; Marcello Restelli; Paolo Cremonesi; |
645 | Retaining Knowledge for Learning with Dynamic Definition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a first practical and provable solution to LDD. |
Zichang Liu; Benjamin Coleman; Tianyi Zhang; Anshumali Shrivastava; |
646 | Batch Multi-Fidelity Active Learning with Budget Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a budget-aware, batch multi-fidelity active learning algorithm for high-dimensional outputs, which are common in physical simulations and related applications. |
Shibo Li; Jeff M Phillips; Xin Yu; Robert Kirby; Shandian Zhe; |
647 | A Theory of PAC Learnability Under Transformation Invariances Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the PAC sample complexity of learning under transformation invariances and the performance of data augmentation. |
Han Shao; Omar Montasser; Avrim Blum; |
648 | VoiceBox: Privacy Through Real-Time Adversarial Attacks with Audio-to-Audio Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by architectures for audio-to-audio tasks such as denoising and speech enhancement, we propose a neural network model capable of adversarially modifying a user’s audio stream in real-time. |
Patrick O’Reilly; Andreas Bugler; Keshav Bhandari; Max Morrison; Bryan Pardo; |
649 | Robustness in Deep Learning: The Width (good), The Depth (bad), and The Initialization (ugly) Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We explore the interplay of the width, the depth and the initialization(s) on the average robustness of neural networks with new theoretical bounds in an effort to address the apparent contradiction in the literature. |
Zhenyu Zhu; Fanghui Liu; Grigorios Chrysos; Volkan Cevher; |
650 | On Scrambling Phenomena for Randomly Initialized Recurrent Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: New analysis for RNNs at initialization showing how certain phenomena from chaotic dynamical systems emerge. |
Vaggos Chatziafratis; Ioannis Panageas; Clayton Sanford; Stelios Stavroulakis; |
651 | Fixing Neural Networks By Leaving The Right Past Behind Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we point to a formal connection between RNNs and chaotic dynamical systems and prove a qualitatively stronger phenomenon about RNNs than what exploding gradients seem to suggest. |
Ryutaro Tanno; Melanie F. Pradier; Aditya Nori; Yingzhen Li; |
652 | Teacher Forcing Recovers Reward Functions for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We derive a reward function for text generation via the lens of inverse reinforcement learning. |
Yongchang Hao; Yuxin Liu; Lili Mou; |
653 | Are All Losses Created Equal: A Neural Collapse Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A broad family of loss functions leads to neural collapse solutions and is hence equivalent on the training set; moreover, these losses exhibit largely identical performance on test data as well. |
Jinxin Zhou; Chong You; Xiao Li; Kangning Liu; Sheng Liu; Qing Qu; Zhihui Zhu; |
654 | What You See Is What You Get: Distributional Generalization for Algorithm Design in Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop the theoretical connection between differential privacy and distributional generalization, and we leverage our theory to improve empirical performance in privacy, fairness, and distribution robustness applications. |
Bogdan Kulynych; Yao-Yuan Yang; Yaodong Yu; Jarosław Błasiok; Preetum Nakkiran; |
655 | LAMP: Extracting Text from Gradients with Language Model Priors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel attack for text reconstruction from gradients in federated learning based on language model priors. |
Mislav Balunovic; Dimitar Dimitrov; Nikola Jovanović; Martin Vechev; |
656 | Incrementality Bidding Via Reinforcement Learning Under Mixed and Delayed Rewards Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A fundamental difference between our learning problem and standard RL problems is that the realized reward feedback from conversion incrementality is mixed and delayed. To handle this difficulty we propose and analyze a novel pairwise moment-matching algorithm to learn the conversion incrementality, which we believe is of independent interest. |
Ashwinkumar Badanidiyuru Varadaraja; Zhe Feng; Tianxi Li; Haifeng Xu; |
657 | Using Partial Monotonicity in Submodular Maximization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we fill this gap by defining the \emph{monotonicity ratio}, which is a continuous version of the monotonicity property. We then show that for many standard submodular maximization algorithms one can prove new approximation guarantees that depend on the monotonicity ratio, leading to improved approximation ratios for the common machine learning applications of movie recommendation, quadratic programming and image summarization. |
Loay Mualem; Moran Feldman; |
658 | Efficient and Stable Fully Dynamic Facility Location Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide the first algorithm for the dynamic facility location problem that at the same time maintains a constant approximation, uses polylogarithmic time per update, and incurs polylogarithmic recourse per update. |
Sayan Bhattacharya; Silvio Lattanzi; Nikos Parotsidis; |
659 | Perfect Sampling from Pairwise Comparisons Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study how to efficiently obtain perfect samples from a discrete distribution given access only to pairwise comparisons of elements of its support. |
Dimitris Fotakis; Alkis Kalavasis; Christos Tzamos; |
660 | Finding Correlated Equilibrium of Constrained Markov Game: A Primal-Dual Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We proposed correlated equilibrium (CE) for constrained Markov game and developed the first primal-dual algorithm with non-asymptotic convergence to CE. |
Ziyi Chen; Shaocong Ma; Yi Zhou; |
661 | Fair Bayes-Optimal Classifiers Under Predictive Parity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper considers predictive parity, which requires equalizing the probability of success given a positive prediction among different protected groups. |
Xianli Zeng; Edgar Dobriban; Guang Cheng; |
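The predictive-parity criterion in the entry above (no. 661) asks that the success probability given a positive prediction, P(Y=1 | Yhat=1, A=a), be equal across protected groups. A minimal sketch of that per-group quantity, with illustrative variable names:

```python
import numpy as np

def group_precision(y_true, y_pred, group):
    """Precision P(Y=1 | Yhat=1, A=a) for each protected group a."""
    out = {}
    for a in np.unique(group):
        mask = (group == a) & (y_pred == 1)
        out[a] = y_true[mask].mean() if mask.any() else float("nan")
    return out

y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 1, 1, 0, 0, 1])
group  = np.array([0, 0, 1, 1, 1, 1])
print(group_precision(y_true, y_pred, group))  # predictive parity asks these values to match
```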
662 | Alternating Mirror Descent for Constrained Min-Max Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We analyze the regret guarantees of the alternating mirror descent algorithm for constrained min-max games. |
Andre Wibisono; Molei Tao; Georgios Piliouras; |
663 | LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy, which can dynamically group agents with similar abilities into the same subtask. |
Mingyu Yang; Jian Zhao; Xunhan Hu; Wengang Zhou; Jiangcheng Zhu; Houqiang Li; |
664 | Learning to Generate Inversion-Resistant Model Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the first defense framework that mitigates explanation-aware model inversion attacks by teaching a model to suppress inversion-critical features in a given explanation while preserving its functionality. |
Hoyong Jeong; Suyoung Lee; Sung Ju Hwang; Sooel Son; |
665 | Revisiting Injective Attacks on Recommender Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We first revisit current injective attackers on recommender systems and then propose a difficulty-aware and diversity-aware attacker. |
Haoyang LI; Shimin DI; Lei Chen; |
666 | SNN-RAT: Robustness-enhanced Spiking Neural Network Through Regularized Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Experimental and theoretical insights about the robustness of spiking neural networks motivate a robust training scheme. |
Jianhao Ding; Tong Bu; Zhaofei Yu; Jian Liu; Tiejun Huang; |
667 | Training Uncertainty-Aware Classifiers with Conformalized Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper develops a novel loss function and learning algorithm for training uncertainty-aware deep neural classifiers that can lead to smaller conformal prediction sets with more reliable coverage compared to standard state-of-the-art techniques. |
Bat-Sheva Einbinder; Yaniv Romano; Matteo Sesia; Yanfei Zhou; |
668 | Distinguishing Learning Rules with Brain Machine Interfaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a metric for distinguishing biased from unbiased learning rules based on network activity and use simulated data from recurrent neural networks emulating a brain-machine interface task to identify the learning rule used in training. |
Jacob Portes; Christian Schmid; James M Murray; |
669 | Causal Inference with Non-IID Data Using Linear Graphical Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Put differently, we present conditions under which causal effects can be computed with negligible bias by assuming that samples are IID. |
Chi Zhang; Karthika Mohan; Judea Pearl; |
670 | Graph Reordering for Cache-Efficient Near Neighbor Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We speed up SOTA near-neighbor search algorithms by 40% with graph-aware cache optimizations, which we analyze in the ideal cache model. |
Benjamin Coleman; Santiago Segarra; Alexander Smola; Anshumali Shrivastava; |
671 | Factorized-FL: Personalized Federated Learning with Parameter Factorization & Similarity Matching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study label- and domain-heterogeneity in federated learning scenarios and propose a novel method, Factorized FL, which factorizes model parameters and performs similarity matching with the factorized vectors. |
Wonyong Jeong; Sung Ju Hwang; |
672 | HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we present a novel framework that introduces hyperbolic embeddings to represent words and topics. |
Yishi Xu; Dongsheng Wang; Bo Chen; Ruiying Lu; Zhibin Duan; Mingyuan Zhou; |
673 | Diagonal State Spaces Are As Effective As Structured State Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a seq2seq model that uses diagonal state spaces (DSS) for contextualization & delivers state-of-the-art performance on benchmarks requiring long-range reasoning over text, images & audio. |
Ankit Gupta; Albert Gu; Jonathan Berant; |
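For the diagonal state space entry above (no. 673), the key computational point is that a diagonal transition matrix turns the linear recurrence into elementwise updates. The sketch below is a generic diagonal state-space recurrence, not the DSS parameterization or initialization from the paper.

```python
import numpy as np

def diagonal_ssm(u, lam, B, C):
    """Run x_k = diag(lam) x_{k-1} + B u_k, y_k = Re(C x_k) with a diagonal (complex) state."""
    x = np.zeros(lam.shape[0], dtype=complex)
    ys = []
    for u_k in u:
        x = lam * x + B * u_k          # elementwise update: O(n) per step thanks to diagonality
        ys.append(np.real(C @ x))
    return np.array(ys)

rng = np.random.default_rng(0)
n = 8
lam = 0.9 * np.exp(1j * rng.uniform(0, np.pi, n))   # stable complex eigenvalues
B = rng.standard_normal(n) + 0j
C = rng.standard_normal(n) + 0j
print(diagonal_ssm(rng.standard_normal(20), lam, B, C).shape)  # (20,)
```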
674 | Generalised Implicit Neural Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We train implicit neural representations for signals on non-Euclidean domains, showing experiments with biological, social, and meteorological data. |
Daniele Grattarola; Pierre Vandergheynst; |
675 | CyCLIP: Cyclic Contrastive Language-Image Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a framework for cyclic consistency in contrastive language-image pretraining. |
Shashank Goel; Hritik Bansal; Sumit Bhatia; Ryan Rossi; Vishwa Vinay; Aditya Grover; |
676 | CoNT: Contrastive Neural Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we analyse the underlying reasons and propose a new Contrastive Neural Text generation framework, CoNT. |
Chenxin An; Jiangtao Feng; Kai Lv; Lingpeng Kong; Xipeng Qiu; Xuanjing Huang; |
677 | Margin-Based Few-Shot Class-Incremental Learning with Class-Level Overfitting Mitigation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To mitigate class-level overfitting (CO) in margin-based classification for the few-shot class-incremental learning task, we first interpret CO from pattern learning, and then propose a method to mitigate CO and achieve SOTA performance. |
Yixiong Zou; Shanghang Zhang; Yuhua Li; Ruixuan Li; |
678 | Low-rank Optimal Transport: Approximation, Statistics and Debiasing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal of this paper is to advance our knowledge, understanding and practical ability to leverage low-rank factorizations in optimal transport. |
Meyer Scetbon; Marco Cuturi; |
679 | Towards A Standardised Performance Evaluation Protocol for Cooperative MARL Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We conduct a detailed meta-analysis of prior Cooperative MARL work, take inspiration from recent trends in RL and our data-driven insights to propose a standardised performance evaluation protocol for Cooperative MARL. |
Rihab Gorsane; Oumayma Mahjoub; Ruan John de Kock; Roland Dubb; Siddarth Singh; Arnu Pretorius; |
680 | Shield Decentralization for Safe Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We describe a method of shield decomposition to enforce safety constraints in a communication-free multi-agent reinforcement learning setting. |
Daniel Melcer; Stavros Tripakis; Christopher Amato; |
681 | Hardness in Markov Decision Processes: Theory and Practice Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal of this paper is to unlock the practical usefulness of this theory through four main contributions. First, we present a systematic survey of the theory of hardness, which also identifies promising research directions. |
Michelangelo Conserva; Paulo Rauber; |
682 | Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how meta-learning agents can learn human inductive biases through co-training with representations from language descriptions and program induction. |
Sreejan Kumar; Carlos G. Correa; Ishita Dasgupta; Raja Marjieh; Michael Y Hu; Robert Hawkins; Jonathan D Cohen; nathaniel daw; Karthik Narasimhan; Tom Griffiths; |
683 | GREED: A Neural Framework for Learning Graph Distance Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we design a novel siamese graph neural network called Greed, which through a carefully crafted inductive bias, learns GED and SED in a property-preserving manner. |
Rishabh Ranjan; Siddharth Grover; Sourav Medya; Venkatesan Chakaravarthy; Yogish Sabharwal; Sayan Ranu; |
684 | Coarse-to-Fine Vision-Language Pre-training with Fusion in The Backbone Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, most existing end-to-end pre-training approaches either only aim to tackle VL tasks such as image-text retrieval, visual question answering (VQA) and image captioning that test high-level understanding of images, or only target region-level understanding for tasks such as phrase grounding and object detection. We present FIBER (Fusion-In-the-Backbone-based transformER), a new VL model architecture that can seamlessly handle both these types of tasks. |
Zi-Yi Dou; Aishwarya Kamath; Zhe Gan; Pengchuan Zhang; Jianfeng Wang; Linjie Li; Zicheng Liu; Ce Liu; Yann LeCun; Nanyun Peng; Jianfeng Gao; Lijuan Wang; |
685 | Cross-modal Learning for Image-Guided Point Cloud Shape Completion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We developed a framework for image-guided point cloud completion under supervised and self-supervised settings. |
Emanuele Aiello; Diego Valsesia; Enrico Magli; |
686 | Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a unified provably efficient (PAC) RL framework for learning in partially observable dynamical systems including POMDPs such as tabular POMDPs, LQG and Hilbert-space Embedding POMDPs. |
Masatoshi Uehara; Ayush Sekhari; Jason Lee; Nathan Kallus; Wen Sun; |
687 | Rethinking The Compositionality of Point Clouds Through Regularization in The Hyperbolic Space Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We exploit the hyperbolic space to regularize point cloud classification through a part-whole hierarchy. |
Antonio Montanaro; Diego Valsesia; Enrico Magli; |
688 | Redistribution of Weights and Activations for AdderNet Quantization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we first thoroughly analyze the difference on distributions of weights and activations in AdderNet and then propose a new quantization algorithm by redistributing the weights and the activations. |
Ying Nie; Kai Han; Haikang Diao; Chuanjian Liu; Enhua Wu; Yunhe Wang; |
689 | Language Conditioned Spatial Relation Reasoning for 3D Object Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we propose a language-conditioned transformer model for grounding 3D objects and their spatial relations. |
Shizhe Chen; Pierre-Louis Guhur; Makarand Tapaswi; Cordelia Schmid; Ivan Laptev; |
690 | Weakly Supervised Causal Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that causal factors and their causal structure can be identified from low-level data (e.g. pixels) observed before and after interventions. |
Johann Brehmer; Pim de Haan; Phillip Lippe; Taco Cohen; |
691 | Deep Attentive Belief Propagation: Integrating Reasoning and Learning for Solving Constraint Optimization Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, existing BP algorithms treat each variable node’s neighbors equally when composing a new message, which also limits their exploration ability. To address these issues, we seamlessly integrate BP, Gated Recurrent Units (GRUs), and Graph Attention Networks (GATs) within the message-passing framework to reason about dynamic weights and damping factors for composing new BP messages. |
Yanchen Deng; Shufeng Kong; Caihua Liu; Bo An; |
692 | Beyond Accuracy: Generalization Properties of Bio-plausible Temporal Credit Assignment Rules Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study generalization performance of biologically-plausible learning rules in RNNs from a geometric perspective by leveraging theoretical tools from deep learning. |
Yuhan Helena Liu; Arna Ghosh; Blake Richards; Eric Shea-Brown; Guillaume Lajoie; |
693 | BagFlip: A Certified Defense Against Data Poisoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present BagFlip, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks. |
Yuhao Zhang; Aws Albarghouthi; Loris D’Antoni; |
694 | Evaluated CMI Bounds for Meta Learning: Tightness and Expressiveness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we present novel generalization bounds for meta learning in terms of the evaluated CMI (e-CMI). |
Fredrik Hellström; Giuseppe Durisi; |
695 | On The Inability of Gaussian Process Regression to Optimally Learn Compositional Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove information-theoretic lower bounds for convergence rates of Gaussian processes when the true function has a compositional structure. |
Matteo Giordano; Kolyan Ray; Johannes Schmidt-Hieber; |
696 | Adapting to Online Label Shift with Provable Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we formulate and investigate the problem of \emph{online label shift} (OLS): the learner trains an initial model from the labeled offline data and then deploys it to an unlabeled online environment where the underlying label distribution changes over time but the label-conditional density does not. |
Yong Bai; Yu-Jie Zhang; Peng Zhao; Masashi Sugiyama; Zhi-Hua Zhou; |
697 | When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose an algorithm that can efficiently learn to detect and avoid states that are irreversible, and proactively ask for help in case the agent does enter them. |
Annie Xie; Fahim Tajwar; Archit Sharma; Chelsea Finn; |
698 | CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce CLOOB, which uses modern Hopfield networks together with the InfoLOOB objective in multimodal contrastive learning. |
Andreas Fürst; Elisabeth Rumetshofer; Johannes Lehner; Viet Tran; Fei Tang; Hubert Ramsauer; David Kreil; Michael Kopp; Günter Klambauer; Angela Bitto; Sepp Hochreiter; |
699 | Sampling Without Replacement Leads to Faster Rates in Finite-Sum Minimax Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We analyze the convergence rates of stochastic gradient algorithms for smooth finite-sum minimax optimization and show that, for many such algorithms, sampling the data points \emph{without replacement} leads to faster convergence compared to sampling with replacement. |
Aniket Das; Bernhard Schölkopf; Michael Muehlebach; |
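The sampling distinction in the entry above (no. 699) is between drawing indices independently with replacement and sweeping a fresh random permutation each epoch (without replacement). A minimal sketch of the two index schedules, independent of the minimax algorithms analyzed in the paper:

```python
import numpy as np

def with_replacement(n, steps, rng):
    """Independently sampled indices: the same point may repeat within an epoch."""
    return rng.integers(0, n, size=steps)

def without_replacement(n, epochs, rng):
    """Random reshuffling: each epoch visits every index exactly once in random order."""
    return np.concatenate([rng.permutation(n) for _ in range(epochs)])

rng = np.random.default_rng(0)
print(with_replacement(5, 10, rng))
print(without_replacement(5, 2, rng))
```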
700 | Deconfounded Representation Similarity for Comparison of Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We improve the consistency of CKA and RSA w.r.t. functional similarity by removing the input similarity structure (a confounder), without losing their nice properties in comparing NN representations. |
Tianyu Cui; Yogesh Kumar; Pekka Marttinen; Samuel Kaski; |
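As background for the entry above (no. 700), the standard linear CKA that the method adjusts compares two column-centered representation matrices of the same inputs; the deconfounding correction itself is the paper's contribution and is not shown here.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between representations X (n, d1) and Y (n, d2) of the same n inputs."""
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))
print(linear_cka(X, X @ rng.standard_normal((32, 16))))  # high similarity for a linear map of X
```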
701 | Memory Safe Computations with XLA Compiler Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: An extension to the XLA compiler for automatically resolving memory overflows in machine learning programs. The impact of memory optimisations is demonstrated on sparse Gaussian processes. |
Artem Artemev; Tilman Roeder; Mark van der Wilk; |
702 | Supervised Training of Conditional Monge Maps Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To account for and incorporate that context in OT estimation, we introduce \textsc{CondOT}, an approach to estimate OT maps conditioned on a context variable, using several pairs of measures $(\mu_i, \nu_i)$ tagged with a context label~$c_i$. |
Charlotte Bunne; Andreas Krause; Marco Cuturi; |
703 | Local Linear Convergence of Gradient Methods for Subspace Optimization Via Strict Complementarity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove local linear convergence to optimal solutions of several efficient gradient methods for generalized subspace recovery problems under a strict complementarity condition. |
Dan Garber; Ron Fisher; |
704 | Enhance The Visual Representation Via Discrete Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Discrete Adversarial Training (DAT) which transfers the merit of NLP-style adversarial training to vision models, for improving robustness and generalization simultaneously. |
Xiaofeng Mao; YueFeng Chen; Gege Qi; Xiaodan Li; Ranjie Duan; Yao Zhu; shaokai ye; Rong Zhang; Hui Xue’; |
705 | Recipe for A General, Powerful, Scalable Graph Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a 3-part recipe on how to build a general, powerful, scalable graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. |
Ladislav Rampášek; Mikhail Galkin; Vijay Prakash Dwivedi; Anh Tuan Luu; Guy Wolf; Dominique Beaini; |
706 | Is This The Right Neighborhood? Accurate and Query Efficient Model Agnostic Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a simple approach that is robust across neighborhood widths in recovering faithful local explanations. |
Amit Dhurandhar; Karthikeyan Natesan Ramamurthy; Karthikeyan Shanmugam; |
707 | Expediting Large-Scale Vision Transformer for Dense Prediction Without Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In response to the fact that high-resolution representations are necessary for dense prediction, we present two non-parametric operators, a \emph{token clustering layer} to decrease the number of tokens and a \emph{token reconstruction layer} to increase the number of tokens. |
WEICONG LIANG; YUHUI YUAN; Henghui Ding; Xiao Luo; Weihong Lin; Ding Jia; Zheng Zhang; Chao Zhang; Han Hu; |
708 | The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. |
Peter Kocsis; Ismail Elezi; Peter Súkeník; Guillem Braso; Matthias Niessner; Laura Leal-Taixé; |
709 | UniGAN: Reducing Mode Collapse in GANs Using A Uniform Generator Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new type of generative diversity named uniform diversity, which relates to a newly proposed type of mode collapse named $u$-mode collapse where the generative samples distribute nonuniformly over the data manifold. |
Ziqi Pan; Li Niu; Liqing Zhang; |
710 | Boosting Out-of-distribution Detection with Typical Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We delve into the obstacle factors in OOD detection from the perspective of typicality and propose to boost the OOD detection with typical features. |
Yao Zhu; YueFeng Chen; Chuanlong Xie; Xiaodan Li; Rong Zhang; Hui Xue’; Xiang Tian; bolun zheng; Yaowu Chen; |
711 | Efficient Risk-Averse Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a mechanism we call soft risk to bypass it. |
Ido Greenberg; Yinlam Chow; Mohammad Ghavamzadeh; Shie Mannor; |
712 | An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective. |
Xiu-Shen Wei; H.-Y. Xu; Faen Zhang; Yuxin Peng; Wei Zhou; |
713 | A Unified Model for Multi-class Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present UniAD that accomplishes anomaly detection for multiple classes with a unified framework. |
Zhiyuan You; Lei Cui; Yujun Shen; Kai Yang; Xin Lu; Yu Zheng; Xinyi Le; |
714 | Constants of Motion Network Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a neural network that can simultaneously learn the dynamics of the system and the constants of motion from data. |
Muhammad Firmansyah Kasim; Yi Heng Lim; |
715 | One Inlier Is Enough: Towards Efficient Position Encoding for Point Cloud Registration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a simple but efficient position encoding for point cloud registration. |
Fan Yang; Lin Guo; Zhi Chen; Wenbing Tao; |
716 | An Invisible Issue of Task Underspecification in Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This article argues that for reliable downstream decision-making, performance evaluations on a task in DRL should be carried out over a family of MDPs rather than a point MDP, which may be subject to bias. |
Vindula Jayawardana; Catherine Tang; Sirui Li; Dajiang Suo; Cathy Wu; |
717 | PointTAD: Multi-Label Temporal Action Detection with Learnable Query Points Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on the complex multi-label temporal action detection that aims to localize all action instances from a multi-label untrimmed video. |
Jing Tan; Xiaotong Zhao; Xintian Shi; Bin Kang; Limin Wang; |
718 | Non-Markovian Reward Modelling from Trajectory Labels Via Interpretable Multiple Instance Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We generalise reward modelling for reinforcement learning to handle non-Markovian rewards, and propose new interpretable multiple instance learning models for this problem. |
Joseph Early; Tom Bewley; Christine Evers; Sarvapali Ramchurn; |
719 | Cross-dataset Training Transformers for Robust Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper studies learning robust feature representations that can generalize on multiple datasets for action recognition using transformers. |
Junwei Liang; Enwei Zhang; Jun Zhang; Chunhua Shen; |
720 | Linear Tree Shap Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a more efficient and straightforward algorithm: Linear TreeShap. Like TreeShap, Linear TreeShap is exact and requires the same amount of memory. |
Peng Yu; Albert Bifet; Jesse Read; Chao Xu; |
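For the Linear TreeShap entry above (no. 720), it may help to recall what exact Shapley values cost when computed naively: the brute-force enumeration below is exponential in the number of features, which is precisely the cost that TreeShap-style algorithms avoid for tree models. This is a generic definition-level sketch, not the paper's algorithm.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating all subsets (exponential, only viable for small n)."""
    phi = [0.0] * n_features
    features = list(range(n_features))
    for i in features:
        rest = [j for j in features if j != i]
        for k in range(len(rest) + 1):
            for subset in combinations(rest, k):
                s = set(subset)
                weight = factorial(len(s)) * factorial(n_features - len(s) - 1) / factorial(n_features)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Toy additive game: the value of a coalition is the sum of its members' weights.
weights = [1.0, 2.0, 3.0]
print(shapley_values(lambda s: sum(weights[j] for j in s), 3))  # -> [1.0, 2.0, 3.0]
```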
721 | DISCO: Adversarial Defense with Local Implicit Functions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A novel adversarial defense for image classification is proposed with the use of the local implicit functions. |
Chih-Hui Ho; Nuno Vasconcelos; |
722 | Adaptive Bio-Inspired Fish Simulation with Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a bio-inspired fish simulation in which deep reinforcement learning enables the fish to learn efficient schooling behavior in various fish cages and to adapt to changes in the environment. |
Yuko Ishiwaka; Xiao Zeng; Shun Ogawa; Donovan Westwater; Tadayuki Tone; Masaki Nakada; |
723 | Prototypical VoteNet for Few-Shot 3D Point Cloud Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the few-shot 3D object detection problem, we propose Prototypical VoteNet to recognize and localize novel instances, which incorporates two new modules: Prototypical Vote Module (PVM) and Prototypical Head Module (PHM). |
Shizhen Zhao; Xiaojuan Qi; |
724 | Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To achieve this goal, we propose to represent the trigger function as a class-conditional generative model and to inject the backdoor in a constrained optimization framework, where the trigger function learns to generate an optimal trigger pattern to attack any target class at will while simultaneously embedding this generative backdoor into the trained model. |
Khoa D Doan; Yingjie Lao; Ping Li; |
725 | Reinforcement Learning with Logarithmic Regret and Policy Switches Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide instance-dependent regret guarantees for model-based and model-free algorithms in the general function approximation setting, where the underlying function class has bounded eluder dimension. |
Grigoris Velegkas; Zhuoran Yang; Amin Karbasi; |
726 | A Simple Approach to Automated Spectral Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In addition, we often need to choose different AMC methods for different datasets, which still depends on experience. To solve these two challenging problems, in this paper, we present a simple yet effective method for automated spectral clustering. |
Jicong Fan; Yiheng Tu; Zhao Zhang; Mingbo Zhao; Haijun Zhang; |
727 | SeqPATE: Differentially Private Text Generation Via Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose SeqPATE, an extension of PATE on text generation, which aims to protect the privacy of both training samples and sensitive phrases in samples. |
Zhiliang Tian; Yingxiu Zhao; Ziyue Huang; Yu-Xiang Wang; Nevin L. Zhang; He He; |
728 | Sparse Hypergraph Community Detection Thresholds in Stochastic Block Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proves the community detection threshold in the sparse hypergraph stochastic block model. |
Erchuan Zhang; David Suter; Giang Truong; Syed Zulqarnain Gilani; |
729 | Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our first contribution is a new variant of Approximate Policy Iteration (API), which we call Confident Approximate Policy Iteration (CAPI). |
Gellért Weisz; András György; Csaba Szepesvari; |
730 | Self-Explaining Deviations for Coordination Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we focus on a specific subclass of coordination problems in which humans are able to discover self-explaining deviations (SEDs). |
Hengyuan Hu; Samuel Sokota; David Wu; Anton Bakhtin; Andrei Lupu; Brandon Cui; Jakob Foerster; |
731 | Planning to The Information Horizon of BAMDPs Via Epistemic State Abstraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As efficient exploration in BAMDPs hinges upon the judicious acquisition of information, our complexity measure highlights the worst-case difficulty of gathering information and exhausting epistemic uncertainty. To illustrate its significance, we establish a computationally-intractable, exact planning algorithm that takes advantage of this measure to show more efficient planning. |
Dilip Arumugam; Satinder Singh; |
732 | Neural Lyapunov Control of Unknown Nonlinear Systems with Stability Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a learning framework to simultaneously learn and stabilize an unknown nonlinear system with provable guarantees. |
Ruikun Zhou; Thanin Quartz; Hans De Sterck; Jun Liu; |
733 | Fault-Aware Neural Code Rankers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it. |
Jeevana Priya Inala; Chenglong Wang; Mei Yang; Andres Codas; Mark Encarnación; Shuvendu Lahiri; Madanlal Musuvathi; Jianfeng Gao; |
734 | Computationally Efficient Aggregated Kernel Tests Using Incomplete $U$-statistics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose computationally efficient aggregated kernel tests using incomplete U-statistics. |
Antonin Schrab; Ilmun Kim; Benjamin Guedj; Arthur Gretton; |
735 | Learning Enhanced Representation for Tabular Data Via Neighborhood Propagation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to 1) construct a hypergraph from relevant data instance retrieval to model the cross-row and cross-column patterns of those instances, and 2) perform message Propagation to Enhance the target data instance representation for Tabular prediction tasks. |
Kounianhua Du; Weinan Zhang; Ruiwen Zhou; Yangkun Wang; Xilong Zhao; Jiarui Jin; Quan Gan; Zheng Zhang; David P Wipf; |
736 | DigGAN: Discriminator GradIent Gap Regularization for GAN Training with Limited Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In contrast, we propose a Discriminator gradIent Gap regularized GAN (DigGAN) formulation which can be added to any existing GAN. |
Tiantian Fang; Ruoyu Sun; Alex Schwing; |
737 | On The Interpretability of Regularisation for Neural Networks Through Model Gradient Similarity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new framework, Model Gradient Similarity (MGS), that (1) serves as a metric of regularisation, which can be used to monitor neural network training, (2) adds insight into how explicit regularisers, while derived from widely different principles, operate via the same mechanism underneath by increasing MGS, and (3) provides the basis for a new regularisation scheme which exhibits excellent performance, especially in challenging settings such as high levels of label noise or limited sample sizes. |
Vincent Szolnoky; Viktor Andersson; Balazs Kulcsar; Rebecka Jörnsten; |
738 | Global Convergence of Direct Policy Search for State-Feedback $\mathcal{H}_\infty$ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop the global convergence theory of direct policy search for solving the optimal $\mathcal{H}_\infty$ state-feedback control problem. |
Xingang Guo; Bin Hu; |
739 | Statistical, Robustness, and Computational Guarantees for Sliced Wasserstein Distances Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The scalability of sliced optimal transport is quantified via new empirical convergence rates, robust estimation risks, and computational bounds. |
Sloan Nietert; Ziv Goldfeld; Ritwik Sadhu; Kengo Kato; |
740 | Increasing Confidence in Adversarial Robustness Evaluations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a test that enables researchers to find flawed adversarial robustness evaluations. Passing our test produces compelling evidence that the attacks used have sufficient power to evaluate the model’s robustness. |
Roland S. Zimmermann; Wieland Brendel; Florian Tramer; Nicholas Carlini; |
741 | Instance-Dependent Policy Learning for Linear MDPs Via Online Experiment Design Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we show instance-dependent bounds on PAC policy learning in linear MDPs. |
Andrew Wagenmaker; Kevin Jamieson; |
742 | RKHS-SHAP: Shapley Values for Kernel Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By analysing SVs from a functional perspective, we propose RKHS-SHAP, an attribution method for kernel machines that can efficiently compute both Interventional and Observational Shapley values using kernel mean embeddings of distributions. |
Siu Lun Chau; Robert Hu; Javier González; Dino Sejdinovic; |
743 | [Re] Privacy-preserving Collaborative Learning with Automatic Transformation Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal of this study is to: (1) Verify the findings of the authors about the performance of the found policies and the correlation between the reconstruction metric and provided protection. |
Alfonso Taboada Warmerdam; Lodewijk Loerakker; Lucas Meijer; Ole Nissen; |
744 | Learning-Augmented Algorithms for Online Linear and Semidefinite Programming Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a unifying framework for learning-augmented online covering linear programs and online covering semidefinite programs. |
Elena Grigorescu; Young-San Lin; Sandeep Silwal; Maoyuan Song; Samson Zhou; |
745 | Giga-scale Kernel Matrix-Vector Multiplication on GPU Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, as KMVM tends to scale quadratically in both memory and time, applications are often limited by these computational constraints. In this paper, we propose a novel approximation procedure coined \textit{Faster-Fast and Free Memory Method} ($\text{F}^3$M) to address these scaling issues of KMVM for tall~($10^8\sim 10^9$) and skinny~($D\leq7$) data. |
Robert Hu; Siu Lun Chau; Dino Sejdinovic; Joan Glaunès; |
746 | Bayesian Optimization Over Discrete and Mixed Spaces Via Probabilistic Reparameterization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a theoretically-grounded method for Bayesian optimization over discrete and mixed search spaces and demonstrate state-of-the-art performance on a variety of real-world tasks. |
Samuel Daulton; Xingchen Wan; David Eriksson; Maximilian Balandat; Eytan Bakshy; Michael A Osborne; |
747 | Explaining Preferences with Shapley Values Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Pref-SHAP to explain Preference Learning, even when data is not rankable. |
Robert Hu; Siu Lun Chau; Jaime Ferrando Huertas; Dino Sejdinovic; |
748 | Generalized Variational Inference in Function Spaces: Gaussian Measures Meet Bayesian Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a framework for generalized variational inference in infinite-dimensional function spaces and use it to construct a method termed Gaussian Wasserstein inference (GWI). |
Veit David Wild; Robert Hu; Dino Sejdinovic; |
749 | Exploring The Whole Rashomon Set of Sparse Decision Trees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide the first technique for completely enumerating the Rashomon set for sparse decision trees; in fact, our work provides the first complete enumeration of any Rashomon set for a non-trivial problem with a highly nonlinear discrete function class. |
Rui Xin; Chudi Zhong; Zhi Chen; Takuya Takagi; Margo Seltzer; Cynthia Rudin; |
750 | ZSON: Zero-Shot Object-Goal Navigation Using Multimodal Goal Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose using a vision-and-language model to create semantic representations of navigation goals to enable zero-shot open-world ObjectNav. |
Arjun Majumdar; Gunjan Aggarwal; Bhavika Devnani; Judy Hoffman; Dhruv Batra; |
751 | Few-Shot Parameter-Efficient Fine-Tuning Is Better and Cheaper Than In-Context Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new recipe T-Few for parameter-efficient few-shot learning that outperforms GPT-3 in-context learning. |
Haokun Liu; Derek Tam; Mohammed Muqeeth; Jay Mohta; Tenghao Huang; Mohit Bansal; Colin Raffel; |
752 | Rethinking Individual Global Max in Cooperative Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes to introduce imitation learning into hypernetwork-based value decomposition to avoid error accumulation after revealing that Individual Global Max is a lossy decomposition. |
Yaochu Jin; Yitian Hong; Yang Tang; |
753 | Reduction Algorithms for Persistence Diagrams of Networks: CoralTDA and PrunIT Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose two methods to reduce the computational costs of topological data analysis methods on graphs. |
Cuneyt G Akcora; Murat Kantarcioglu; Yulia Gel; Baris Coskunuzer; |
754 | Natural Image Synthesis for The Retina with Variational Information Bottleneck Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: But, is all the visual scene information required to explain the neuronal responses? In this work, we search for answers to this question by developing a joint model of the natural visual input and neuronal responses using the Information Bottleneck (IB) framework that is able to represent features of the input data into a few latent variables that play a role in the prediction of the outputs. |
Babak Rahmani; Demetri Psaltis; Christophe Moser; |
755 | NaturalProver: Grounded Mathematical Proof Generation with Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study large-scale language models on two new generation tasks: suggesting the next step in a mathematical proof, and full proof generation. |
Sean Welleck; Jiacheng Liu; Ximing Lu; Hannaneh Hajishirzi; Yejin Choi; |
756 | Dual-discriminative Graph Neural Network for Imbalanced Graph-level Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a dual-discriminative graph neural network for graph-level anomaly detection, namely iGAD. |
GE ZHANG; Zhenyu Yang; Jia Wu; Jian Yang; Shan Xue; Hao Peng; Jianlin Su; Chuan Zhou; Quan Z. Sheng; Leman Akoglu; Charu Aggarwal; |
757 | An $\alpha$-No-Regret Algorithm For Graphical Bilinear Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This setting reveals a combinatorial NP-hard problem that prevents the use of any existing regret-based algorithm in the (bi-)linear bandit literature. In this paper, we fill this gap and present the first regret-based algorithm for graphical bilinear bandits using the principle of optimism in the face of uncertainty. |
Geovani Rizk; Igor Colin; Albert Thomas; Rida Laraki; Yann Chevaleyre; |
758 | Movement Penalized Bayesian Optimization with Application to Wind Energy Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the episodic CBO with movement costs problem and, based on the online learning approach for metrical task systems of Coester and Lee (2019), propose a novel randomized mirror descent algorithm that makes use of Gaussian Process confidence bounds. |
Shyam Sundhar Ramesh; Pier Giuseppe Sessa; Andreas Krause; Ilija Bogunovic; |
759 | EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an automated reward shaping method for guiding exploration in instruction following settings. |
Thomas Carta; Pierre-Yves Oudeyer; Olivier Sigaud; Sylvain Lamprier; |
760 | Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, instead of performing perturbations on the low-level pixels, we propose to generate attacks through perturbing on the high-level semantics to improve attack transferability. |
Shuai Jia; Bangjie Yin; Taiping Yao; Shouhong Ding; Chunhua Shen; Xiaokang Yang; Chao Ma; |
761 | Collaborative Learning By Detecting Collaboration Partners Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We focus on collaborative learning and propose to learn $K$ models for $N$ heterogeneous clients ($K \ll N$) which are proved to be good approximations of the optimal models. |
Shu Ding; Wei Wang; |
762 | Global Convergence of ResNets: From Finite to Infinite Width Using Linear Parameterization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show convergence of Gradient Descent for the training of infinitely deep Residual Neural Networks. Our linear parameterization of the residuals allows to bridge the gap between finite- and infinite-width models. |
Raphaël Barboni; Gabriel Peyré; Francois-Xavier Vialard; |
763 | Decoupling Features in Hierarchical Propagation for Video Object Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper focuses on developing a more effective method of hierarchical propagation for semi-supervised Video Object Segmentation (VOS). |
Zongxin Yang; Yi Yang; |
764 | Optimal Binary Classification Beyond Accuracy Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We generalize the binary Bayes classifier to any performance metric computed from the confusion matrix, and apply this to derive statistical guarantees for imbalanced classification. |
Shashank Singh; Justin Khim; |
765 | CHIMLE: Conditional Hierarchical IMLE Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an efficient sampling algorithm for conditional IMLE, a mode-covering generative model, and show improved image fidelity. |
Shichong Peng; Seyed Alireza Moazenipourasil; Ke Li; |
766 | Moment Distributionally Robust Tree Structured Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a distributionally robust method for structured prediction of tree-shaped objects with consistency and generalization guarantees. |
Yeshu Li; Danyal Saeed; Xinhua Zhang; Brian Ziebart; Kevin Gimpel; |
767 | On The Theoretical Properties of Noise Correlation in Stochastic Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide theory analyzing the effect of noise correlation in stochastic optimization, using fractional Brownian motion. |
Aurelien Lucchi; Frank Proske; Antonio Orvieto; Francis Bach; Hans Kersting; |
768 | LASSIE: Learning Articulated Shapes from Sparse Image Ensemble Via 3D Part Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We learn to reconstruct high-quality articulated shapes from sparse image collections by discovering 3D neural parts without any shape template or keypoint annotations. |
Chun-Han Yao; Wei-Chih Hung; Yuanzhen Li; Michael Rubinstein; Ming-Hsuan Yang; Varun Jampani; |
769 | GFlowCausal: Generative Flow Networks for Causal Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a plug-and-play module based on transitive closure to ensure efficient sampling. |
Wenqian Li; Yinchuan Li; Shengyu Zhu; Shao Yunfeng; Jianye Hao; Yan Pang; |
770 | Guaranteed Conservation of Momentum for Learning Particle-based Fluid Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use conservation of momentum as an inductive bias in the form of a hard constraint for learning particle-based fluid dynamics. |
Lukas Prantl; Benjamin Ummenhofer; Vladlen Koltun; Nils Thuerey; |
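One simple way to impose conservation of momentum as a hard constraint, in the spirit of the entry above (no. 770), is to project predicted per-particle velocity updates so that their mass-weighted sum vanishes. The projection below is an illustrative assumption, not necessarily the mechanism used in the paper.

```python
import numpy as np

def conserve_momentum(delta_v, mass):
    """Project velocity updates so the total momentum change sum_i m_i * dv_i is zero."""
    drift = (mass[:, None] * delta_v).sum(axis=0) / mass.sum()   # net momentum change per unit mass
    return delta_v - drift                                        # subtract it from every particle

rng = np.random.default_rng(0)
dv = rng.standard_normal((5, 3))        # predicted updates for 5 particles in 3D
mass = rng.uniform(1.0, 2.0, size=5)
corrected = conserve_momentum(dv, mass)
print(np.abs((mass[:, None] * corrected).sum(axis=0)).max())  # ~0: total momentum is preserved
```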
771 | Diffusion Curvature for Estimating Local Curvature in High Dimensional Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new intrinsic measure of local curvature on point-cloud data called diffusion curvature. |
Dhananjay Bhaskar; Kincaid MacDonald; Oluwadamilola Fasina; Dawson Thomas; Bastian Rieck; Ian Adelstein; Smita Krishnaswamy; |
772 | Reducing Confidence Along Adversarial Directions: Maximizing Entropy on Self-Generated Perturbations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we propose a complementary regularization strategy that reduces confidence on self-generated examples. |
Amrith Setlur; Benjamin Eysenbach; Virginia Smith; Sergey Levine; |
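A rough sketch of the idea in the entry above (no. 772): perturb each input along the gradient of the loss and add a term that maximizes predictive entropy on the perturbed copy, thereby reducing confidence along adversarial directions. The step size, weighting, and perturbation form are assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def reduced_confidence_loss(model, x, y, eps=0.1, lam=0.5):
    """Cross-entropy on clean inputs minus predictive entropy on self-generated perturbations."""
    # Self-generate a perturbation along the input gradient of the loss.
    x_req = x.clone().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x_req), y), x_req)[0]
    x_adv = (x + eps * grad.sign()).detach()
    # Encourage low confidence (high entropy) on the perturbed copies.
    probs = F.softmax(model(x_adv), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()
    return F.cross_entropy(model(x), y) - lam * entropy

model = torch.nn.Linear(10, 3)
x, y = torch.randn(4, 10), torch.randint(0, 3, (4,))
print(reduced_confidence_loss(model, x, y).item())
```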
773 | Self-supervised Learning of Brain Dynamics from Broad Neuroimaging Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We devise and evaluate novel self-supervised learning techniques for neuroimaging data inspired by prominent learning frameworks in natural language processing, using one of the broadest neuroimaging datasets used for pre-training to date. |
Armin Thomas; Christopher Ré; Russell Poldrack; |
774 | Stochastic Halpern Iteration with Variance Reduction for Stochastic Monotone Inclusions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide state-of-the art variance reduced guarantees for all standard classes of stochastic Lipschitz monotone inclusion problems, using variants of Halpern iteration. |
Xufeng Cai; Chaobing Song; Cristóbal Guzmán; Jelena Diakonikolas; |
775 | Coordinate Linear Variance Reduction for Generalized Linear Programming Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide a novel variance reduced primal-dual algorithm for generalized linear programs with improved theoretical and empirical performance among primal-dual methods and that is competitive with off-the-shelf solvers on considered datasets. |
Chaobing Song; Cheuk Yin Lin; Stephen Wright; Jelena Diakonikolas; |
776 | How Powerful Are K-hop Message Passing Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we first theoretically analyze the expressive power and the limitation of K-hop message passing graph neural networks. Then, we propose a novel method to improve the K-hop message passing framework. |
Jiarui Feng; Yixin Chen; Fuhai Li; Anindya Sarkar; Muhan Zhang; |
777 | K-LITE: Learning Transferable Visual Models with External Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: K-LITE provides the first strong evidence that external knowledge benefits large-scale task-level visual transfer in image classification and object detection. |
Sheng Shen; Chunyuan Li; Xiaowei Hu; Yujia Xie; Jianwei Yang; Pengchuan Zhang; Zhe Gan; Lijuan Wang; Lu Yuan; Ce Liu; Kurt Keutzer; Trevor Darrell; Anna Rohrbach; Jianfeng Gao; |
778 | Distributionally Robust Optimization Via Ball Oracle Acceleration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop and theoretically analyze algorithms for distributionally robust optimization with group-structured and bounded $f$-divergence uncertainty sets. |
Yair Carmon; Danielle Hausler; |
779 | Statistical Learning and Inverse Problems: A Stochastic Gradient Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider the setup of the Statistical Inverse Problem (SIP) and demonstrate how Stochastic Gradient Descent (SGD) algorithms can be used in the linear SIP setting. |
Yuri Fonseca; |
780 | Convergence Beyond The Over-parameterized Regime Using Rayleigh Quotients Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a new strategy to prove the convergence of Deep Learning architectures to a zero training (or even testing) loss by gradient flow. |
David Robin; Kevin Scaman; Marc Lelarge; |
781 | Decimated Framelet System on Graphs and Fast G-Framelet Transforms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a novel multiscale representation system for graph data, called decimated framelets, which form a localized tight frame on the graph. |
Xuebin Zheng; Bingxin Zhou; Yuguang Wang; Xiaosheng Zhuang; |
782 | Network Change Point Localisation Under Local Differential Privacy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we expand the current static analysis of privatised networks to a dynamic framework by considering a sequence of networks with potential change points. |
Mengchu Li; Tom Berrett; Yi Yu; |
783 | Efficient Submodular Optimization Under Noise: Local Search Is Robust Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper designs a novel local search framework that can handle the effect of noise and achieve near-optimal approximation guarantees for submodular maximization with polynomial queries. |
Lingxiao Huang; Yuyi Wang; Chunxue Yang; Huanjian Zhou; |
784 | [Re] Reproduction and Extension of "Queens Are Powerful Too: Mitigating Gender Bias in Dialogue Generation" Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The main claims we are trying to reproduce are that bias-controlled training, either on its own or combined with counterfactual data augmentation and the positively biased data collected by Dinan et al. [5], yields generated dialogue for the LIGHT dataset in which the percentage of gendered words and the male bias closely match the ground truth. |
Erica Eaton; Pirouz Naghavi; |
785 | Counterfactual Harm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive the first statistical definition of harm, and prove that machine learning algorithms are guaranteed to pursue harmful policies unless we train them with counterfactual objectives. |
Jonathan Richens; Rory Beard; Daniel H. Thompson; |
786 | Improving Generative Adversarial Networks Via Adversarial Learning in Latent Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes to improve the performance of GAN in terms of generative quality and diversity by mining the latent space using adversarial learning. |
Yang Li; Yichuan Mo; Liangliang Shi; Junchi Yan; |
787 | Reconciling Intrinsic Rewards Via Constrained Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We alleviate the performance drop resulting from the bias of intrinsic rewards while preserving their merits. |
Eric Chen; Zhang-Wei Hong; Joni Pajarinen; Pulkit Agrawal; |
788 | Variable-rate Hierarchical CPC Leads to Acoustic Unit Discovery in Speech Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by the fact that speech is often described as a sequence of discrete units unevenly distributed in time, we propose a model in which the output of a low-level CPC module is non-uniformly downsampled to directly minimize the loss of a high-level CPC module. |
Santiago Cuervo; Adrian Lancucki; Ricard Marxer; Paweł Rychlikowski; Jan Chorowski; |
789 | UDC: Unified DNAS for Compressible TinyML Models for Neural Processing Units Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper demonstrates Unified DNAS for Compressible (UDC) NNs, which explores a large search space to generate state-of-the-art compressible NNs for NPU. |
Igor Fedorov; Ramon Matas; Hokchhay Tann; Chuteng Zhou; Matthew Mattina; Paul Whatmough; |
790 | Learning Invariant Graph Representations Under Distribution Shifts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Graph Invariant Learning (GIL) model capable of learning generalized graph representations under distribution shifts. |
Haoyang Li; Ziwei Zhang; Xin Wang; Wenwu Zhu; |
791 | Social-Inverse: Inverse Decision-making of Social Contagion Management with Task Migrations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Such problems, called inverse decision-making with task migrations, are of interest in that the complex and stochastic nature of real-world applications often prevents the agent from completely knowing the underlying system. In this paper, we introduce such a new problem with formal formulations and present a generic framework for addressing decision-making tasks in social contagion management. |
Guangmo Tong; |
792 | Infinite-Fidelity Coregionalization for Physical Simulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an infinite fidelity surrogate model for physical simulations and related applications which can take arbitrary/continuous fidelity’s inputs and outputs. |
Shibo Li; Zheng Wang; Robert Kirby; Shandian Zhe; |
793 | Amplifying Membership Exposure Via Data Poisoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We demonstrate how to use data poisoning attacks to amplify the membership exposure of the targeted class. |
Yufei Chen; Chao Shen; Yun Shen; Cong Wang; Yang Zhang; |
794 | Tensor Program Optimization with Probabilistic Programs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a domain-specific probabilistic language to enable modular construction of search space in automatic tensor program optimization. |
Junru Shao; Xiyou Zhou; Siyuan Feng; Bohan Hou; Ruihang Lai; Hongyi Jin; Wuwei Lin; Masahiro Masuda; Cody Hao Yu; Tianqi Chen; |
795 | Introspective Learning : A Two-Stage Approach for Inference in Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a method that uses neural networks as knowledge bases to reflect on their decisions and thereby make more robust ones. |
Mohit Prabhushankar; Ghassan AlRegib; |
796 | Formalizing Coherence and Consistency Applied to Transfer Learning in Neuro-Symbolic Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce formal definitions for consistency and coherence of neural systems and show that the better a model’s coherence, the better it transfers. |
Harald Strömfelt; Luke Dickens; Artur Garcez; Alessandra Russo; |
797 | Exploring Length Generalization in Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We explore the ability of transformer-based language models to learn from shorter problem instances to generalize to longer ones and identify points of failure and success. |
Cem Anil; Yuhuai Wu; Anders Andreassen; Aitor Lewkowycz; Vedant Misra; Vinay Ramasesh; Ambrose Slone; Guy Gur-Ari; Ethan Dyer; Behnam Neyshabur; |
798 | Sketching Based Representations for Robust Image Classification with Provable Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we theoretically study synthetic images that are composed of a union or intersection of several mathematically specified shapes using thresholded polynomial functions (e.g., ellipses, rectangles). |
Nishanth Dikkala; Sankeerth Rao Karingula; Raghu Meka; Jelani Nelson; Rina Panigrahy; Xin Wang; |
799 | Concept Embedding Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel concept-based interpretable architecture capable of learning meaningful concept embedding representations and supporting test-time concept interventions. |
Mateo Espinosa Zarlenga; Pietro Barbiero; Gabriele Ciravegna; Giuseppe Marra; Francesco Giannini; Michelangelo Diligenti; Zohreh Shams; Frederic Precioso; Stefano Melacci; Adrian Weller; Pietro Lió; Mateja Jamnik; |
800 | Implications of Model Indeterminacy for Explanations of Automated Decisions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, it is well-known that given a dataset and a predictive task, there may be a multiplicity of models that solve the problem (nearly) equally well. In this work, we investigate the implications of this kind of model indeterminacy on the post-hoc explanations of predictive models. |
Marc-Etienne Brunet; Ashton Anderson; Richard Zemel; |
801 | Large (robust) Models from Computational Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that efficient (robust) learning could provably need more parameters than inefficient (robust) learning. |
Sanjam Garg; Somesh Jha; Saeed Mahloujifar; Mohammad Mahmoody; Mingyuan Wang; |
802 | Contextual Dynamic Pricing with Unknown Noise: Explore-then-UCB Strategy and Improved Regrets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider a contextual dynamic pricing problem under a linear customer valuation model with an unknown market noise distribution $F$. |
Yiyun Luo; Will Wei Sun; Yufeng Liu; |
803 | The Importance of Being Correlated: Implications of Dependence in Joint Spectral Inference Across Multiple Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Such inference procedures typically rely heavily on independence assumptions across the multiple network realizations, and even in this case, little attention has been paid to the induced network correlation that can be a consequence of such joint embeddings. In this paper, we present a generalized omnibus embedding methodology and we provide a detailed analysis of this embedding across both independent and correlated networks, the latter of which significantly extends the reach of such procedures, and we describe how this omnibus embedding can itself induce correlation. |
Konstantinos Pantazis; Avanti Athreya; Jesus Arroyo; William N Frost; Evan S Hill; Vince Lyzinski; |
804 | Composition Theorems for Interactive Differential Privacy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study composition properties of differential privacy in concurrent compositions. |
Xin Lyu; |
805 | Improved Coresets for Euclidean $k$-Means Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present improved coresets for the Euclidean k-means problem. |
Vincent Cohen-Addad; Kasper Green Larsen; David Saulpic; Chris Schwiegelshohn; Omar Ali Sheikh-Omar; |
806 | Private Multiparty Perception for Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a framework for navigating through cluttered environments by connecting multiple cameras together while simultaneously preserving privacy via multi-party computation. |
Hui Lu; Mia Chiquier; Carl Vondrick; |
807 | Robust Imitation of A Few Demonstrations with A Backwards Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a backwards model-based imitation learning method that learns optimal policies robust to unobserved initial states. |
Jung Yeon Park; Lawson Wong; |
808 | Quantized Training of Gradient Boosted Decision Trees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate an essential question that has been largely ignored by the previous literature – how many bits are needed to represent gradients when training GBDT? |
Yu Shi; Guolin Ke; Zhuoming Chen; Shuxin Zheng; Tie-Yan Liu; |
809 | Pre-trained Adversarial Perturbations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We design a novel algorithm to generate adversarial samples using pre-trained models which can fool the corresponding fine-tuned ones and thus reveal the safety problem of fine-tuning pre-trained models to do downstream tasks. |
Yuanhao Ban; Yinpeng Dong; |
810 | Decentralized Training of Foundation Models in Heterogeneous Environments Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore how to deploy the training of large-scale foundation models in a decentralized environment. |
Binhang Yuan; Yongjun He; Tianyi Zhang; Jared Davis; Tri Dao; Beidi Chen; Percy Liang; Christopher Ré; Ce Zhang; |
811 | Positive-Unlabeled Learning Using Random Forests Via Recursive Greedy Risk Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose new random forest algorithms for PU-learning. |
Jonathan Wilton; Nan Ye; Miao Xu; Abigail Koay; Ryan Ko; |
812 | Hyper-Representations As Generative Models: Sampling Unseen Neural Network Weights Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We extend hyper-representations for generative use to sample neural network weights for initialization, ensembling and transfer learning. |
Konstantin Schürholt; Boris Knyazev; Xavier Giro-i-Nieto; Damian Borth; |
813 | An Analytical Theory of Curriculum Learning in Teacher-Student Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We analyse a solvable model of curriculum learning and comment on the implications for the ML and the experimental psychology literature. |
Luca Saglietti; Stefano Mannelli; Andrew Saxe; |
814 | Joint Learning of 2D-3D Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel 2D-3D joint framework for learning 2D and 3D weakly supervised semantic segmentation using image- and scene-level classification labels only. |
Hyeokjun Kweon; Kuk-Jin Yoon; |
815 | Injecting Domain Knowledge from Empirical Interatomic Potentials to Neural Networks for Predicting Material Properties Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose two generic strategies that take advantage of unlabeled training instances to inject domain knowledge from conventional EIPs to NNs in order to increase their generalizability. |
Zeren Shui; Daniel Karls; Mingjian Wen; ilia Nikiforov; Ellad Tadmor; George Karypis; |
816 | Towards Understanding The Mixture-of-Experts Layer in Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we formally study how the MoE layer improves the performance of neural network learning and why the mixture model will not collapse into a single model. |
Zixiang Chen; Yihe Deng; Yue Wu; Quanquan Gu; Yuanzhi Li; |
817 | Graph Scattering Beyond Wavelet Shackles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work develops a flexible and mathematically sound framework for the design and analysis of graph scattering networks with variable branching ratios and generic functional calculus filters. Spectrally-agnostic stability guarantees for node- and graph-level perturbations are derived; the vertex-set non-preserving case is treated by utilizing recently developed mathematical-physics based tools. |
Christian Koke; Gitta Kutyniok; |
818 | DARE: Disentanglement-Augmented Rationale Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Since previous works fail to fully exploit the original input and ignore the information of non-selected tokens, in this paper we propose a Disentanglement-Augmented Rationale Extraction (DARE) method, which encapsulates more information from the input to extract rationales. |
Linan Yue; Qi Liu; Yichao Du; Yanqing An; Li Wang; Enhong Chen; |
819 | Differentially Private Learning Needs Hidden State (Or Much Faster Convergence) Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we significantly improve privacy analysis under the hidden state assumption. |
Jiayuan Ye; Reza Shokri; |
820 | A Unifying Framework of Off-Policy General Value Function Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new algorithm called GenTD for off-policy GVFs evaluation and show that GenTD learns multiple interrelated multi-dimensional GVFs as efficiently as a single canonical scalar value function. |
Tengyu Xu; Zhuoran Yang; Zhaoran Wang; Yingbin Liang; |
821 | Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a skill-based multi-task RL technique on heterogeneous datasets that are generated by behavior policies of different quality. |
Minjong Yoo; SangWoo Cho; Honguk Woo; |
822 | Learning Symmetric Rules with SATNet Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present SymSATNet, a variant of SATNet that translates the given symmetries of the target rules to a condition on the parameters of SATNet and requires that the parameters should have a particular parametric form that guarantees the condition. |
SANGHO LIM; Eungyeol Oh; Hongseok Yang; |
823 | Make An Omelette with Breaking Eggs: Zero-Shot Learning for Novel Attribute Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, in this paper, we bring up a new problem scenario: "Can we derive zero-shot learning for novel attribute detectors/classifiers and use them to automatically annotate the dataset for labeling efficiency?" |
Yu-Hsuan Li; Tzu-Yin Chao; Ching-Chun Huang; Pin-Yu Chen; Wei-Chen Chiu; |
824 | Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a Thompson sampling algorithm, termed ExpTS, which uses a novel sampling distribution to avoid the under-estimation of the optimal arm. |
Tianyuan Jin; Pan Xu; Xiaokui Xiao; Anima Anandkumar; |
825 | Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we show that feedforward neural networks corresponding to arbitrary directed acyclic graphs undergo transition to linearity as their “width” approaches infinity. |
Libin Zhu; Chaoyue Liu; Misha Belkin; |
826 | Exact Learning Dynamics of Deep Linear Networks with Prior Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we derive exact solutions to the dynamics of learning with rich prior knowledge in deep linear networks by generalising Fukumizu’s matrix Riccati solution (Fukumizu, 1998). |
Lukas Braun; Clémentine Dominé; James Fitzgerald; Andrew Saxe; |
827 | When Are Offline Two-Player Zero-Sum Markov Games Solvable? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our work serves as an important initial step towards understanding offline multi-agent reinforcement learning. |
Qiwen Cui; Simon Du; |
828 | Top Two Algorithms Revisited Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we provide a general analysis of top-two methods, which identifies desirable properties of the leader, the challenger, and the (possibly non-parametric) distributions of the arms. |
Marc Jourdan; Rémy Degenne; Dorian Baudry; Rianne de Heide; Emilie Kaufmann; |
829 | Invariance Learning Based on Label Hierarchy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the requirement of training data in multiple domains is a strong restriction of using IL, since it demands expensive annotation. We propose a novel IL framework to overcome this problem. |
Shoji Toyota; Kenji Fukumizu; |
830 | Selective Compression Learning of Latent Representations for Variable-rate Image Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we firstly propose a selective compression method that partially encodes the latent representations in a fully generalized manner for deep learning-based variable-rate image compression. |
Jooyoung Lee; Seyoon Jeong; Munchurl Kim; |
831 | Trade-off Between Payoff and Model Rewards in Fair Collaborative Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, we propose desirable properties for achieving a fair adjustment of the payoff flows that can trade off between the model reward’s performance and the payoff reward. |
Quoc Phong Nguyen; Bryan Kian Hsiang Low; Patrick Jaillet; |
832 | LISA: Learning Interpretable Skill Abstractions from Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To encode complex instructions into skills that can generalize to unseen instructions, we propose Learning Interpretable Skill Abstractions (LISA), a hierarchical imitation learning framework that can learn diverse, interpretable skills from language-conditioned demonstrations. |
Divyansh Garg; Skanda Vaidyanath; Kuno Kim; Jiaming Song; Stefano Ermon; |
833 | Your Out-of-Distribution Detection Method Is Not Robust! Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We subsequently propose the Adversarially Trained Discriminator (ATD), which utilizes a pre-trained robust model to extract robust features, and a generator model to create OOD samples. |
Mohammad Azizmalayeri; Arshia Soltani Moakhar; Arman Zarei; Reihaneh Zohrabi; Mohammad Manzuri; Mohammad Hossein Rohban; |
834 | On Neural Network Pruning’s Effect on Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by such a contradiction, we re-examine pruning’s effect on generalization empirically. We demonstrate that pruning’s generalization-improving effect cannot be fully accounted for by weight removal. |
Tian Jin; Daniel M Roy; Michael Carbin; Jonathan Frankle; Gintare Karolina Dziugaite; |
835 | Fast Bayesian Inference of Point Process Intensity As Function of Covariates Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel augmentation of permanental process called \textit{augmented permanental process}, a doubly stochastic point process that uses a Gaussian process on covariate space to describe the Bayesian a priori uncertainty present in the square root of intensity, and derive a fast Bayesian inference algorithm that scales linearly with data size without relying on either domain discretization or Markov Chain Monte Carlo computation. |
Hideaki Kim; Taichi Asami; Hiroyuki Toda; |
836 | Dataset Distillation Using Neural Feature Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Meta-gradient computation is one of the key challenges in this formulation, as differentiating through the inner loop learning procedure introduces significant computation and memory costs. In this paper, we address these challenges using neural Feature Regression with Pooling (FRePo), achieving the state-of-the-art performance with an order of magnitude less memory requirement and two orders of magnitude faster training than previous methods. |
Yongchao Zhou; Ehsan Nezhadarya; Jimmy Ba; |
837 | Beyond Separability: Analyzing The Linear Transferability of Contrastive Representations to Related Subpopulations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper analyzes when and why contrastive representations exhibit linear transferability in a general unsupervised domain adaptation setting. |
Jeff Z. HaoChen; Colin Wei; Ananya Kumar; Tengyu Ma; |
838 | Understanding Hyperdimensional Computing for Parallel Single-Pass Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new theoretical analysis of the limits of HDC via a consideration of what similarity matrices can be expressed by binary vectors, and we show how the limits of HDC can be approached using random Fourier features (RFF). |
Tao Yu; Yichi Zhang; Zhiru Zhang; Christopher De Sa; |
839 | Spectral Bias Outside The Training Set for Deep Networks in The Kernel Regime Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide quantitative bounds measuring the $L^2$ difference in function space between the trajectory of a finite-width network trained on finitely many samples from the idealized kernel dynamics of infinite width and infinite data. |
Benjamin Bowman; Guido Montufar; |
840 | Turbocharging Solution Concepts: Solving NEs, CEs and CCEs with Neural Equilibrium Solvers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the Neural Equilibrium Solver which utilizes a special equivariant neural network architecture to approximately solve the space of all games of fixed shape, buying speed and determinism. |
Luke Marris; Ian Gemp; Thomas Anthony; Andrea Tacchetti; Siqi Liu; Karl Tuyls; |
841 | Gradient Flow Dynamics of Shallow ReLU Networks for Square Loss and Orthogonal Inputs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This article presents, for orthogonal input vectors, a precise description of the gradient flow dynamics of training one-hidden layer ReLU neural networks for the mean squared error at small initialisation. |
Etienne Boursier; Loucas PILLAUD-VIVIEN; Nicolas Flammarion; |
842 | Sampling from Log-Concave Distributions with Infinity-Distance Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We bridge this gap by presenting an algorithm that outputs a point $\varepsilon$-close to $\pi$ in infinity distance that requires at most $\mathrm{poly}(\log \frac{1}{\varepsilon}, d)$ calls to a membership oracle for $K$ and evaluation oracle for $f$, when $f$ is Lipschitz. |
Oren Mangoubi; Nisheeth Vishnoi; |
843 | (Nearly) All Cardinality Estimators Are Differentially Private Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider privacy in the context of streaming algorithms for cardinality estimation. |
Charlie Dickens; Justin Thaler; Daniel Ting; |
844 | Fast Stochastic Composite Minimization and An Accelerated Frank-Wolfe Algorithm Under Parallelization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider the problem of minimizing the sum of two convex functions. |
Benjamin Dubois-Taine; Francis Bach; Quentin Berthet; Adrien Taylor; |
845 | On Viewpoint Robustness of Visual Recognition in The Wild Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel method called ViewFool to find adversarial viewpoints that mislead visual recognition models. |
Yinpeng Dong; Shouwei Ruan; Hang Su; Caixin Kang; Xingxing Wei; Jun Zhu; |
846 | Communication Efficient Federated Learning for Generalized Linear Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a communication-efficient solution framework that employs online regression for local update and offline regression for global update. |
Chuanhao Li; Hongning Wang; |
847 | AZ-whiteness Test: A Test for Signal Uncorrelation on Spatio-temporal Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present the first whiteness hypothesis test for graphs, i.e., a whiteness test for multivariate time series associated with the nodes of a dynamic graph; as such, the test represents an important model assessment tool for graph deep learning, e.g., in forecasting setups. |
Daniele Zambon; Cesare Alippi; |
848 | Generalization Bounds with Minimal Dependency on Hypothesis Class Via Distributionally Robust Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Established approaches to obtain generalization bounds in data-driven optimization and machine learning mostly build on solutions from empirical risk minimization (ERM), which depend crucially on the functional complexity of the hypothesis class. In this paper, we present an alternate route to obtain these bounds on the solution from distributionally robust optimization (DRO), a recent data-driven optimization framework based on worst-case analysis and the notion of ambiguity set to capture statistical uncertainty. |
Yibo Zeng; Henry Lam; |
849 | Fairness Transferability Subject to Bounded Distribution Shift Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the transferability of statistical group fairness for machine learning predictors (i.e., classifiers or regressors) subject to bounded distribution shift, a phenomenon frequently caused by user adaptation to a deployed model or a dynamic environment. |
Yatong Chen; Reilly Raab; Jialu Wang; Yang Liu; |
850 | APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an efficient, effective, and universal module, named as Adaptive Parameter Generation network (APG), which can dynamically generate parameters for deep CTR models on-the-fly based on different instances. |
Bencheng Yan; Pengjie Wang; Kai Zhang; Feng Li; Hongbo Deng; Jian Xu; Bo Zheng; |
851 | Efficient Active Learning with Abstention Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop the first computationally efficient active learning algorithm with abstention. |
Yinglun Zhu; Robert Nowak; |
852 | Redistricting Via Local Fairness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to use the concept of local fairness for auditing and ranking redistricting plans. |
Pankaj Agarwal; Shao-Heng Ko; Kamesh Munagala; Erin Taylor; |
853 | SelecMix: Debiased Learning By Contradicting-pair Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Mainly, we propose a selective mixup scheme, SelecMix, where the mixup is applied to the pairs having (i) the same label but dissimilar biased features, and (ii) different labels but similar biased features. |
Inwoo Hwang; Sangjun Lee; Yunhyeok Kwak; Seong Joon Oh; Damien Teney; Jin-Hwa Kim; Byoung-Tak Zhang; |
854 | Inducing Equilibria Via Incentives: Simultaneous Design-and-Play Ensures Global Convergence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The existing bilevel optimization algorithms raise a dilemma when applied to this problem: anticipating how incentives affect the agents at equilibrium requires solving the equilibrium problem repeatedly, which is computationally inefficient; bypassing the time-consuming step of equilibrium-finding can reduce the computational cost, but may lead the designer to a sub-optimal solution. To address such a dilemma, we propose a method that tackles the designer’s and agents’ problems simultaneously in a single loop. |
Boyi Liu; Jiayang Li; Zhuoran Yang; Hoi-To Wai; Mingyi Hong; Yu Nie; Zhaoran Wang; |
855 | Unsupervised Image-to-Image Translation with Density Changing Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we make a density changing assumption where image patches of high probability density should be mapped to patches of high probability density in another domain. |
Shaoan Xie; Qirong Ho; Kun Zhang; |
856 | Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper aims to improve the generalizability of hypergraph neural networks in the low-label regime by applying the contrastive learning approach from images/graphs. |
Tianxin Wei; Yuning You; Tianlong Chen; Yang Shen; Jingrui He; Zhangyang Wang; |
857 | Gradient Descent: The Ultimate Optimizer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how to \emph{automatically} compute hypergradients with a simple and elegant modification to backpropagation. |
Kartik Chandra; Audrey Xie; Jonathan Ragan-Kelley; ERIK MEIJER; |
858 | Iterative Scene Graph Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, this fixed factorization is not ideal under all scenarios (e.g., for images where an object entailed in interaction is small and not discernible on its own). In this work, we propose a novel framework for scene graph generation that addresses this limitation, as well as introduces dynamic conditioning on the image, using message passing in a Markov Random Field. |
Siddhesh Khandelwal; Leonid Sigal; |
859 | Theoretically Provable Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we theoretically investigate the approximation powers and computational efficiency of spiking neural networks with self-connections, and show that the self-connection structure enables spiking neural networks to approximate continuous dynamical systems within polynomial parameters and time complexities. |
Shao-Qun Zhang; Zhi-Hua Zhou; |
860 | Coresets for Vertical Federated Learning: Regularized Linear Regression and $K$-Means Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified framework by constructing \emph{coresets} in a distributed fashion for communication-efficient VFL. |
Lingxiao Huang; Zhize Li; Jialin Sun; Haoyu Zhao; |
861 | A Composable Machine-learning Approach for Steady-state Simulations on High-resolution Grids Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we show that our Machine Learning (ML) approach, CoMLSim (Composable Machine Learning Simulator), can simulate PDEs on highly-resolved grids with higher accuracy and generalization to out-of-distribution source terms and geometries than traditional ML baselines. |
Rishikesh Ranade; Chris Hill; Lalit Ghule; Jay Pathak; |
862 | Contrastive and Non-Contrastive Self-Supervised Learning Recover Global and Local Spectral Embedding Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a unifying framework under the helm of spectral manifold learning. |
Randall Balestriero; Yann LeCun; |
863 | Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We argue that by using visual clues to bridge large pretrained vision foundation models and language models, we can do so without any extra cross-modal training. |
Yujia Xie; Luowei Zhou; Xiyang Dai; Lu Yuan; Nguyen Bach; Ce Liu; Michael Zeng; |
864 | Non-convex Online Learning Via Algorithmic Equivalence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study an algorithmic equivalence technique between nonconvex gradient descent and convex mirror descent. |
Udaya Ghai; Zhou Lu; Elad Hazan; |
865 | Automatic Differentiation of Nonsmooth Iterative Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We characterize the attractor set of nonsmooth piggyback iterations as a set-valued fixed point that remains within the conservative framework. |
Jerome Bolte; Edouard Pauwels; Samuel Vaiter; |
866 | ALIFE: Adaptive Logit Regularizer and Feature Replay for Incremental Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose in this paper a novel ISS method, dubbed ALIFE, that provides a better compromise between accuracy and efficiency. |
Youngmin Oh; Donghyeon Baek; Bumsub Ham; |
867 | Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Hub-Pathway framework to enable knowledge transfer from a model hub. |
Yang Shu; Zhangjie Cao; Ziyang Zhang; Jianmin Wang; Mingsheng Long; |
868 | On The Double Descent of Random Features Models Trained with SGD Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we derive precise non-asymptotic error bounds for RF regression under both constant and polynomial-decay step-size SGD settings, and observe the double descent phenomenon both theoretically and empirically. |
Fanghui Liu; Johan Suykens; Volkan Cevher; |
869 | On The Consistent Estimation of Optimal Receiver Operating Characteristic (ROC) Curve Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We formally introduce the notion of \textit{optimal ROC curve} over a general model space. |
Renxiong Liu; Yunzhang Zhu; |
870 | ASPiRe: Adaptive Skill Priors for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce ASPiRe (Adaptive Skill Prior for RL), a new approach that leverages prior experience to accelerate reinforcement learning. |
Mengda Xu; Manuela Veloso; Shuran Song; |
871 | Ask4Help: Learning to Leverage An Expert for Embodied Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Embodied AI agents continue to become more capable every year with the advent of new models, environments, and benchmarks, but are still far away from being performant and reliable enough to be deployed in real, user-facing, applications. In this paper, we ask: can we bridge this gap by enabling agents to ask for assistance from an expert such as a human being? |
Kunal Pratap Singh; Luca Weihs; Alvaro Herrasti; Jonghyun Choi; Aniruddha Kembhavi; Roozbeh Mottaghi; |
872 | Heatmap Distribution Matching for Human Pose Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle the task of 2D human pose estimation, the great majority of recent methods regard this task as a heatmap estimation problem and optimize the heatmap prediction using the Gaussian-smoothed heatmap as the optimization objective and the pixel-wise loss (e.g., MSE) as the loss function. In this paper, we show that when the heatmap prediction is optimized in this way, the model performance of body joint localization, which is the intrinsic objective of this task, may not be consistently improved during the optimization process. |
Haoxuan Qu; Li Xu; Yujun Cai; Lin Geng Foo; Jun Liu; |
873 | Object-Category Aware Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Objects of the same category should share similar functionalities; therefore, the category is the most critical property of an object. Following this insight, we propose a novel framework named Object-Category Aware Reinforcement Learning (OCARL), which utilizes the category information of objects to facilitate both perception and reasoning. |
Qi Yi; Rui Zhang; shaohui peng; Jiaming Guo; Xing Hu; Zidong Du; xishan zhang; Qi Guo; Yunji Chen; |
874 | Revisiting Heterophily For Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we first revisit the widely used homophily metrics and point out that their consideration of only graph-label consistency is a shortcoming. Then, we study heterophily from the perspective of post-aggregation node similarity and define new homophily metrics, which are potentially advantageous compared to existing ones. |
Sitao Luan; Chenqing Hua; Qincheng Lu; Jiaqi Zhu; Mingde Zhao; Shuyuan Zhang; Xiao-Wen Chang; Doina Precup; |
875 | Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We analyze feature learning in infinite-width neural networks trained with gradient flow through a self-consistent dynamical field theory. |
Blake Bordelon; Cengiz Pehlevan; |
876 | Learning Chaotic Dynamics in Dissipative Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a machine learning framework to learn the underlying solution operator for dissipative chaotic systems, showing that the resulting learned operator accurately captures short-time trajectories and long-time statistical behavior. |
Zongyi Li; Miguel Liu-Schiaffini; Nikola Kovachki; Kamyar Azizzadenesheli; Burigede Liu; Kaushik Bhattacharya; Andrew Stuart; Anima Anandkumar; |
877 | MultiGuard: Provably Robust Multi-label Classification Against Adversarial Examples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our major theoretical contribution is that we show a certain number of ground truth labels of an input are provably in the set of labels predicted by our MultiGuard when the $\ell_2$-norm of the adversarial perturbation added to the input is bounded. |
Jinyuan Jia; Wenjie Qu; Neil Gong; |
878 | Additive MIL: Intrinsically Interpretable Models for Digital Pathology Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a simple formulation of MIL models, which enables interpretability while maintaining similar predictive performance. |
Syed Ashar Javed; Dinkar Juyal; Harshith Padigela; Amaro Taylor-Weiner; Limin Yu; Aaditya Prakash; |
879 | A Kernelised Stein Statistic for Assessing Implicit Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a principled procedure to assess the quality of a synthetic data generator. |
Wenkai Xu; Gesine D Reinert; |
880 | Deep Model Reassembly Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore a novel knowledge-transfer task, termed as Deep Model Reassembly (DeRy), for general-purpose model reuse. |
Xingyi Yang; Daquan Zhou; Songhua Liu; Jingwen Ye; Xinchao Wang; |
881 | Effective Backdoor Defense By Exploiting Sensitivity of Poisoned Samples Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Given a backdoored model, we observe that the feature representations of poisoned samples with trigger are more sensitive to transformations than those of clean samples. |
Weixin Chen; Baoyuan Wu; Haoqian Wang; |
882 | Why Neural Networks Find Simple Solutions: The Many Regularizers of Geometric Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we develop the notion of geometric complexity, which is a measure of the variability of the model function, computed using a discrete Dirichlet energy. |
Benoit Dherin; Michael Munn; Mihaela Rosca; David Barrett; |
883 | Symplectic Spectrum Gaussian Processes: Learning Hamiltonians from Noisy and Sparse Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a probabilistic model that can learn the dynamics of conservative or dissipative systems from noisy and sparse data. |
Yusuke Tanaka; Tomoharu Iwata; naonori ueda; |
884 | Does Self-supervised Learning Really Improve Reinforcement Learning from Pixels? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We investigate whether self-supervised learning (SSL) can improve online reinforcement learning (RL) from pixels. |
Xiang Li; Jinghuan Shang; Srijan Das; Michael Ryoo; |
885 | Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we establish efficient and \emph{uncoupled} learning dynamics so that, when employed by all players in a general-sum multiplayer game, the \emph{swap regret} of each player after $T$ repetitions of the game is bounded by $O(\log T)$, improving over the prior best bounds of $O(\log^4 (T))$. |
Ioannis Anagnostides; Gabriele Farina; Christian Kroer; Chung-Wei Lee; Haipeng Luo; Tuomas Sandholm; |
886 | Learning Dynamics of Deep Linear Networks with Multiple Pathways Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We use the approximation of deep linear networks with large hidden layer sizes to show that, as the depth of the parallel pathways increases, different features of the training set (defined by the singular values of the input-output correlation) will typically concentrate in one of the pathways. |
Jianghong Shi; Eric Shea-Brown; Michael Buice; |
887 | Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new NeRF-based conditional 3D face synthesis framework, which enables 3D controllability over the generated face images by imposing explicit 3D conditions from 3D face priors. |
Keqiang Sun; Shangzhe Wu; Zhaoyang Huang; Ning Zhang; Quan Wang; Hongsheng Li; |
888 | VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper investigates sparse voxel grids as 3D representation for 3D-aware image synthesis to achieve efficient rendering. |
Katja Schwarz; Axel Sauer; Michael Niemeyer; Yiyi Liao; Andreas Geiger; |
889 | Learning Sparse Features Can Lead to Overfitting in Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, understanding when and how this feature learning improves performance remains a challenge: for example, it is beneficial for modern architectures trained to classify images, whereas it is detrimental for fully-connected networks trained for the same task on the same data. Here we propose an explanation for this puzzle, by showing that feature learning can perform worse than lazy training (via random feature kernel or the NTK) as the former can lead to a sparser neural representation. |
Francesco Cagnetta; Matthieu Wyart; Leonardo Petrini; Eric Vanden-Eijnden; |
890 | An Empirical Study on Disentanglement of Negative-free Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: But its disentanglement property remains unexplored. In this paper, we take different negative-free contrastive learning methods to study the disentanglement property of this genre of self-supervised methods empirically. |
Jinkun Cao; Ruiqian Nai; Qing Yang; Jialei Huang; Yang Gao; |
891 | Exploiting Reward Shifting in Value-Based Deep RL Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study the simple yet universally applicable case of reward shaping in value-based Deep Reinforcement Learning (DRL). |
Hao Sun; Lei Han; Rui Yang; Xiaoteng Ma; Jian Guo; Bolei Zhou; |
892 | PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: PointNeXt boosts the performance of PointNet++ to the state-of-the-art level with improved training and scaling strategies. |
Guocheng Qian; Yuchen Li; Houwen Peng; Jinjie Mai; Hasan Hammoud; Mohamed Elhoseiny; Bernard Ghanem; |
893 | Sampling with Riemannian Hamiltonian Monte Carlo in A Constrained Space Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We demonstrate for the first time that ill-conditioned, non-smooth, constrained distributions in very high dimension, can be sampled efficiently in practice, outperforming existing packages by orders of magnitude. |
Yunbum Kook; Yin-Tat Lee; Ruoqi Shen; Santosh Vempala; |
894 | Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, through extensive empirical analysis, we first identify that the bottleneck behind the severe performance drop is the information distortion of the low-bit quantized self-attention map. We then develop an information rectification module (IRM) and a distribution guided distillation (DGD) scheme for fully quantized vision transformers (Q-ViT) to effectively eliminate such distortion, leading to fully quantized ViTs. |
Yanjing Li; Sheng Xu; Baochang Zhang; Xianbin Cao; Peng Gao; Guodong Guo; |
895 | Stability and Generalization of Kernel Clustering: from Single Kernel to Multiple Kernel Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel method that can efficiently compute the embedding of out-of-sample data with a solid generalization guarantee. |
Weixuan Liang; Xinwang Liu; Yong Liu; sihang zhou; Jun-Jie Huang; Siwei Wang; Jiyuan Liu; Yi Zhang; En Zhu; |
896 | MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although two-stage Vector Quantized (VQ) generative models allow for synthesizing high-fidelity and high-resolution images, their quantization operator encodes similar patches within an image into the same index, resulting in a repeated artifact for similar adjacent regions using existing decoder architectures. To address this issue, we propose to incorporate the spatially conditional normalization to modulate the quantized vectors so as to insert spatially variant information to the embedded index maps, encouraging the decoder to generate more photorealistic images. |
Chuanxia Zheng; Tung-Long Vuong; Jianfei Cai; Dinh Phung; |
897 | Visual Concepts Tokenization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Obtaining the human-like perception ability of abstracting visual concepts from concrete pixels has always been a fundamental and important target in machine learning research fields such as disentangled representation learning and scene decomposition. Towards this goal, we propose an unsupervised transformer-based Visual Concepts Tokenization framework, dubbed VCT, to perceive an image into a set of disentangled visual concept tokens, with each concept token responding to one type of independent visual concept. |
Tao Yang; Yuwang Wang; Yan Lu; Nanning Zheng; |
898 | EcoFormer: Energy-Saving Attention with Linear Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a new binarization paradigm customized to high-dimensional softmax attention via kernelized hashing, called EcoFormer, to map the original queries and keys into low-dimensional binary codes in Hamming space. |
Jing Liu; Zizheng Pan; Haoyu He; Jianfei Cai; Bohan Zhuang; |
899 | SGAM: Building A Virtual 3D World Through Simultaneous Generation and Mapping Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present simultaneous generation and mapping (SGAM), a novel 3D scene generation algorithm. |
Yuan Shen; Wei-Chiu Ma; Shenlong Wang; |
900 | Rank Diminishing in Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By virtue of our numerical tools, we provide the first empirical analysis of the per-layer behavior of network ranks in realistic settings, i.e., ResNets, deep MLPs, and Transformers on ImageNet. |
Ruili Feng; Kecheng Zheng; Yukun Huang; Deli Zhao; Michael Jordan; Zheng-Jun Zha; |
901 | Continual Learning: A Feature Extraction Formalization, An Efficient Algorithm, and Fundamental Obstructions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a framework for continual learning based on feature extraction, namely one in which features, as well as a classifier, are trained with each environment. |
Binghui Peng; Andrej Risteski; |
902 | Thinking Outside The Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider linear prediction with a convex Lipschitz loss, or more generally, stochastic convex optimization problems of generalized linear form, i.e.~where each instantaneous loss is a scalar convex function of a linear function. |
Idan Amir; Roi Livni; Nati Srebro; |
903 | Lottery Tickets on A Data Diet: Finding Initializations with Sparse Trainable Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the same does not hold at step 0, i.e. random initialization. In this work, we seek to understand how this early phase of pre-training leads to a good initialization for IMP both through the lens of the data distribution and the loss landscape geometry. |
Mansheej Paul; Brett Larsen; Surya Ganguli; Jonathan Frankle; Gintare Karolina Dziugaite; |
904 | [Re] Reproducibility Study of “Counterfactual Generative Networks” Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Scope of Reproducibility In this work, we study the reproducibility of the paper Counterfactual Generative Networks (CGN) by Sauer and Geiger to verify their main claims, which state that (i) their proposed model can reliably generate high-quality counterfactual images by disentangling the shape, texture and background of the image into independent mechanisms, (ii) each independent mechanism has to be considered, and jointly optimizing all of them end-to-end is needed for high-quality images, and (iii) despite being synthetic, these counterfactual images can improve out-of-distribution performance of classifiers by making them invariant to spurious signals. |
Piyush Bagad; Paul Hilders; Jesse Maas; Danilo de Goede; |
905 | [Re] Replication Study of ‘Data-Driven Methods for Balancing Fairness and Efficiency in Ride-Pooling’ Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We evaluate the following claims related to fairness-based objective functions presented in the original work: (1) For the four objective functions, the success rate in the worst-off neighborhood increases monotonically with respect to the overall success rate. (2) The proposed objective functions do not lead to a higher income for the lowest-earning drivers, nor a higher total income, compared to a request-maximizing objective function. (3) The driver-side fairness objective can outperform a request-maximizing objective in terms of overall success rate and success rate in the worst-off neighborhood. |
Vera Neplenbroek; Sabijn Perdijk; Victor Prins; |
906 | Recovery and Generalization in Over-Realized Dictionary Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work characterizes the surprising phenomenon that dictionary recovery can be facilitated by searching over the space of larger over-realized models. |
Jeremias Sulam; Chong You; Zhihui Zhu; |
907 | Optimality and Stability in Non-Convex Smooth Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper aims to provide a comprehensive analysis of local minimax points, such as their relation with other solution concepts and their optimality conditions. |
Guojun Zhang; Pascal Poupart; Yaoliang Yu; |
908 | IALE: Imitating Active Learner Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose IALE, an imitation learning scheme that imitates the selection of the best-performing expert heuristic at each stage of the learning cycle in a batch-mode pool-based setting. |
Christoffer Löffler; Christopher Mutschler; |
909 | Foolish Crowds Support Benign Overfitting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove a lower bound on the excess risk of sparse interpolating procedures for linear regression with Gaussian data in the overparameterized regime. |
Niladri S. Chatterji; Philip Long; |
910 | Equivariant Graph Hierarchy-based Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Equivariant Hierarchy-based Graph Networks (EGHNs), which consist of three key components: generalized Equivariant Matrix Message Passing (EMMP), E-Pool and E-UnPool. |
Jiaqi Han; Yu Rong; Tingyang Xu; Wenbing Huang; |
911 | Supervised Dimensionality Reduction and Visualization Using Centroid-Encoder Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new tool for visualizing complex, and potentially large and high-dimensional, data sets called Centroid-Encoder (CE). |
Tomojit Ghosh; Michael Kirby; |
912 | Deep Limits and A Cut-Off Phenomenon for Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider dynamical and geometrical aspects of deep learning. |
Benny Avelin; Anders Karlsson; |
913 | Sufficient Reductions in Regression with Mixed Predictors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the general regression problem of inferring on a variable of interest based on high dimensional mixed continuous and binary predictors. |
Efstathia Bura; Liliana Forzani; Rodrigo García Arancibia; Pamela Llop; Diego Tomassi; |
914 | A Bregman Learning Framework for Sparse Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a learning framework based on stochastic Bregman iterations, also known as mirror descent, to train sparse neural networks with an inverse scale space approach. |
Leon Bungert; Tim Roith; Daniel Tenbrinck; Martin Burger; |
915 | Sparse Additive Gaussian Process Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we introduce a novel model for Gaussian process (GP) regression in the fully Bayesian setting. |
Hengrui Luo; Giovanni Nattino; Matthew Pratola; |
916 | TaSIL: Taylor Series Imitation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control. |
Daniel Pfrommer; Thomas Zhang; Stephen Tu; Nikolai Matni; |
917 | FourierFormer: Transformer Meets Generalized Fourier Integral Theorem Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the FourierFormer, a new class of transformers in which the pair-wise dot product kernels are replaced by the novel generalized Fourier integral kernels to efficiently capture the dependency of the features of data. |
Tan Nguyen; Minh Pham; Tam Nguyen; Khai Nguyen; Stanley Osher; Nhat Ho; |
918 | Posterior and Computational Uncertainty in Gaussian Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we develop a new class of methods that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended. |
Jonathan Wenger; Geoff Pleiss; Marvin Pförtner; Philipp Hennig; John Cunningham; |
919 | COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present Energy-based Constrained Decoding with Langevin Dynamics (Cold), a decoding framework which unifies constrained generation as specifying constraints through an energy function, then performing efficient differentiable reasoning over the constraints through gradient-based sampling. |
Lianhui Qin; Sean Welleck; Daniel Khashabi; Yejin Choi; |
920 | Efficient Graph Similarity Computation with Alignment Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the graph similarity computation (GSC) task based on graph edit distance (GED) estimation. |
Wei Zhuo; Guang Tan; |
921 | Reconsidering Deep Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Recent work suggests that deep ensembles may offer distinct benefits beyond predictive power: namely, uncertainty quantification and robustness to dataset shift. In this work, we demonstrate limitations to these purported benefits, and show that a single (but larger) neural network can replicate these qualities. |
Taiga Abe; Estefany Kelly Buchanan; Geoff Pleiss; Richard Zemel; John Cunningham; |
922 | Refining Low-Resource Unsupervised Translation By Language Disentanglement of Multilingual Translation Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a simple refinement procedure to separate languages from a pre-trained multilingual UMT model for it to focus on only the target low-resource task. |
Xuan-Phi Nguyen; Shafiq Joty; Kui Wu; Ai Ti Aw; |
923 | Old Can Be Gold: Better Gradient Flow Can Make Vanilla-GCNs Great Again Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we provide a new perspective of gradient flow to understand the substandard performance of deep GCNs and hypothesize that by facilitating healthy gradient flow, we can significantly improve their trainability and performance. |
AJAY JAISWAL; Peihao Wang; Tianlong Chen; Justin Rousseau; Ying Ding; Zhangyang Wang; |
924 | Policy Optimization for Markov Games: Unified Framework and Faster Convergence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish the first faster convergence rates for symmetric optimistic policy optimization algorithms in Markov games, and provide a unified framework for similar algorithms and analyses. |
Runyu Zhang; Qinghua Liu; Huan Wang; Caiming Xiong; Na Li; Yu Bai; |
925 | Systematic Improvement of Neural Network Quantum States Using Lanczos Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a symmetry-projected variational solution in the form of linear combinations of simple restricted Boltzmann machines. |
Hongwei Chen; Douglas Hendry; Phillip Weinberg; Adrian Feiguin; |
926 | List-Decodable Sparse Mean Estimation Via Difference-of-Pairs Filtering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop a novel, conceptually simpler technique for list-decodable mean estimation. |
Ilias Diakonikolas; Daniel Kane; Sushrut Karmalkar; Ankit Pensia; Thanasis Pittas; |
927 | No-regret Learning in Games with Noisy Feedback: Faster Rates and Adaptivity Via Learning Rate Separation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We devise algorithms that enjoy constant regret for learning in games with noisy feedback. |
Yu-Guan Hsieh; Kimon Antonakopoulos; Volkan Cevher; Panayotis Mertikopoulos; |
928 | Lower Bounds on Randomly Preconditioned Lasso Via Robust Sparse Designs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove a stronger lower bound that rules out randomized preconditioners. |
Jonathan Kelner; Frederic Koehler; Raghu Meka; Dhruv Rohatgi; |
929 | Coreset for Line Sets Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove that every such input set $\mathcal{L}$ has a small $\varepsilon$-coreset, and provide the first coreset construction for this problem and its variants. |
Sagi Lotan; Ernesto Evgeniy Sanches Shayda; Dan Feldman; |
930 | When Does Dough Become A Bagel? Analyzing The Remaining Mistakes on ImageNet Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We focus on the multi-label subset evaluation of ImageNet, where today’s best models achieve upwards of 97% top-1 accuracy. |
Vijay Vasudevan; Benjamin Caine; Raphael Gontijo Lopes; Sara Fridovich-Keil; Rebecca Roelofs; |
931 | First Hitting Diffusion Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a family of First Hitting Diffusion Models (FHDM), deep generative models that generate data with a diffusion process that terminates at a random first hitting time. |
Mao Ye; Lemeng Wu; Qiang Liu; |
932 | Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The best-known algorithms in this setting are limited in that they either are computationally inefficient or require a strong assumption on the corruption, or their regret is at least $C$ times worse than the regret without corruption. In this paper, to overcome these limitations, we propose a new algorithm based on the principle of optimism in the face of uncertainty. |
Jiafan He; Dongruo Zhou; Tong Zhang; Quanquan Gu; |
933 | Adaptation Accelerating Sampling-based Bayesian Inference in Attractor Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we explore how latent stimulus sampling can be accelerated in neural circuits. |
Xingsi Dong; Zilong Ji; Tianhao Chu; Tiejun Huang; Wenhao Zhang; Si Wu; |
934 | Learning to Discover and Detect Objects Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a two-stage object detection network, Region-based NCDL (RNCDL), that uses a region proposal network to localize potential objects and classify them. |
Vladimir Fomenko; Ismail Elezi; Deva Ramanan; Aljosa Osep; Laura Leal-Taixé; |
935 | Neural Circuit Architectural Priors for Embodied Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we ask what advantages biologically inspired network architecture can provide in the context of motor control. |
Nikhil Bhattasali; Anthony M Zador; Tatiana Engel; |
936 | Non-Convex Bilevel Games with Critical Point Selection Maps Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: When the latter is non-convex, multiple critical points may be present, leading to an ambiguous definition of the problem. In this paper, we introduce a key ingredient for resolving this ambiguity through the concept of a selection map which allows one to choose a particular solution to the lower-level problem. |
Michael Arbel; Julien Mairal; |
937 | Self-Similarity Priors: Neural Collages As Differentiable Fractal Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Said patterns commonly appear in natural and artificial objects, such as molecules, shorelines, galaxies, and even images. In this work, we investigate the role of learning in the automated discovery of self-similarity and in its utilization for downstream tasks. |
Michael Poli; Winnie Xu; Stefano Massaroli; Chenlin Meng; Kuno Kim; Stefano Ermon; |
938 | Bounded-Regret MPC Via Perturbation Analysis: Prediction Error, Constraints, and Nonlinearity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study Model Predictive Control (MPC) and propose a general analysis pipeline to bound its dynamic regret. |
Yiheng Lin; Yang Hu; Guannan Qu; Tongxin Li; Adam Wierman; |
939 | A Reparametrization-Invariant Sharpness Measure Based on Information Geometry Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on an information geometric analysis of the neural network parameter space, in this paper we propose a reparametrization-invariant sharpness measure that captures the change in loss with respect to changes in the probability distribution modeled by neural networks, rather than with respect to changes in the parameter values. |
Cheongjae Jang; Sungyoon Lee; Yung-Kyun Noh; Frank Park; |
940 | Stability Analysis and Generalization Bounds of Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the robust overfitting issue of adversarial training by using tools from uniform stability. |
Jiancong Xiao; Yanbo Fan; Ruoyu Sun; Jue Wang; Zhi-Quan Luo; |
941 | Mining Multi-Label Samples from Single Positive Labels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To generate multi-label data in the single positive setting, we propose a novel sampling approach called single-to-multi-label (S2M) sampling, based on the Markov chain Monte Carlo method. |
Youngin Cho; Daejin Kim; MOHAMMAD AZAM KHAN; Jaegul Choo; |
942 | Decomposable Non-Smooth Convex Optimization with Nearly-Linear Gradient Oracle Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main technical contribution is an adaptive procedure to select an $f_i$ term at every iteration via a novel combination of cutting-plane and interior-point methods. |
Sally Dong; Haotian Jiang; Yin Tat Lee; Swati Padmanabhan; Guanghao Ye; |
943 | Distributed Learning of Conditional Quantiles in The Reproducing Kernel Hilbert Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study distributed learning of nonparametric conditional quantiles with Tikhonov regularization in a reproducing kernel Hilbert space (RKHS). |
Heng Lian; |
944 | Wasserstein Iterative Networks for Barycenter Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an algorithm to approximate the Wasserstein-2 barycenters of continuous measures via a generative model. |
Alexander Korotin; Vage Egiazarian; Lingxiao Li; Evgeny Burnaev; |
945 | Decoupled Context Processing for Context Augmented Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we examined a simple yet effective architecture for incorporating external context into language models based on decoupled $\texttt{Encoder-Decoder}$ architecture. |
Zonglin Li; Ruiqi Guo; Sanjiv Kumar; |
946 | Transformers from An Optimization Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Deep learning models such as the Transformer are often constructed by heuristics and experience. To provide a complementary foundation, in this work we study the following problem: Is it possible to find an energy function underlying the Transformer model, such that descent steps along this energy correspond with the Transformer forward pass? |
Yongyi Yang; zengfeng Huang; David P Wipf; |
947 | Unsupervised Skill Discovery Via Recurrent Skill Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although impressive results have been provided, we found that the parallel training procedure can sometimes block exploration when the states visited by different skills overlap, which leads to poor state coverage and restricts the diversity of learned skills. In this paper, we take a deeper look into this phenomenon and propose a novel framework to address this issue, which we call Recurrent Skill Training (ReST). |
Zheyuan Jiang; Jingyue Gao; Jianyu Chen; |
948 | Para-CFlows: $C^k$-universal Diffeomorphism Approximators As Superior Neural Surrogates Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The approximation universality for CFlows is of paramount importance to ensure the model expressiveness. In this paper, we prove that CFlows can approximate any diffeomorphism in $C^k$-norm if its layers can approximate certain single-coordinate transforms. |
Junlong Lyu; Zhitang Chen; Chang Feng; Wenjing Cun; Shengyu Zhu; Yanhui Geng; ZHIJIE XU; Chen Yongwei; |
949 | SoLar: Sinkhorn Label Refinery for Imbalanced Partial-Label Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We subsequently propose SoLar, a novel Optimal Transport-based framework that allows refining the disambiguated labels towards matching the marginal class prior distribution. |
Haobo Wang; Mingxuan Xia; Yixuan Li; Yuren Mao; Lei Feng; Gang Chen; Junbo Zhao; |
950 | A Data-Augmentation Is Worth A Thousand Samples Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Data-Augmentation (DA) is known to improve performance across tasks and datasets. We propose a method to theoretically analyze the effect of DA and study questions such as: how many augmented samples are needed to correctly estimate the information encoded by that DA? |
Randall Balestriero; Ishan Misra; Yann LeCun; |
951 | Sub-exponential Time Sum-of-Squares Lower Bounds for Principal Components Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study the limits of the powerful Sum of Squares (SoS) family of algorithms for Sparse PCA. |
Aaron Potechin; GOUTHAM RAJENDRAN; |
952 | A Closer Look at The Adversarial Robustness of Deep Equilibrium Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we observe that an adversarially trained DEQ requires more forward steps to arrive at the equilibrium state, or even violates its fixed-point structure. |
Zonghan Yang; Tianyu Pang; Yang Liu; |
953 | Bessel Equivariant Networks for Inversion of Transmission Effects in Multi-Mode Optical Fibres Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a new type of model for solving the task of inverting the transmission effects of multi-mode optical fibres through the construction of an $\mathrm{SO}^{+}(2,1)$-equivariant neural network. |
Joshua Mitton; Simon Mekhail; Miles Padgett; Daniele Faccio; Marco Aversa; Roderick Murray-Smith; |
954 | Scalable Distributional Robustness in A Class of Non-Convex Optimization with Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We endeavor to provide DRO solutions for a class of sum-of-fractionals, non-convex optimization problems used for decision making in prominent areas such as facility location and security games. |
Avinandan Bose; Arunesh Sinha; Tien Mai; |
955 | CLIPDraw: Exploring Text-to-Drawing Synthesis Through Language-Image Encoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: CLIPDraw is an algorithm that synthesizes novel drawings from natural language input. |
Kevin Frans; Olaf Witkowski; Lisa Soros; |
956 | On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The widely held manifold hypothesis speculates that real-world data sets, such as natural images and signals, exhibit low-dimensional geometric structures. In this paper, we take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold. |
Biraj Dahal; Alexander Havrilla; Minshuo Chen; Tuo Zhao; Wenjing Liao; |
957 | If Influence Functions Are The Answer, Then What Is The Question? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks. In this work, we investigate the specific factors that cause this discrepancy by decomposing it into five separate terms. |
Juhan Bae; Nathan Ng; Alston Lo; Marzyeh Ghassemi; Roger Grosse; |
958 | InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose InsNet, an expressive insertion-based text generator with efficient training and flexible decoding (parallel or sequential). |
Sidi Lu; Tao Meng; Nanyun Peng; |
959 | A Single Self-Supervised Model for Many Speech Modalities Enables Zero-Shot Modality Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present u-HuBERT, a self-supervised pre-training framework that can leverage both multimodal and unimodal speech with a unified masked cluster prediction objective. |
Wei-Ning Hsu; Bowen Shi; |
960 | Expected Improvement for Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce and study the EI technique as a new tool for the contextual bandit problem which is a generalization of the standard bandit. |
Hung Tran-The; Sunil Gupta; Santu Rana; Tuan Truong; Long Tran-Thanh; Svetha Venkatesh; |
961 | Toward Understanding Privileged Features Distillation in Learning-to-Rank Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first study PFD empirically on three public ranking datasets and an industrial-scale ranking problem derived from Amazon’s logs. We show that PFD outperforms several baselines (no-distillation, pretraining-finetuning, self-distillation, and generalized distillation) on all these datasets. |
Shuo Yang; Sujay Sanghavi; Holakou Rahmanian; Jan Bakus; Vishwanathan S. V. N.; |
962 | Hyperbolic Feature Augmentation Via Distribution Estimation and Infinite Sampling on Manifolds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a hyperbolic feature augmentation method that generates diverse and discriminative features in the hyperbolic space to combat overfitting. |
Zhi Gao; Yuwei Wu; Yunde Jia; Mehrtash Harandi; |
963 | Pushing The Limits of Fairness Impossibility: Who’s The Fairest of Them All? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Rather than follow suit, in this paper we present a framework that pushes the limits of the impossibility theorem in order to satisfy all three metrics to the best extent possible. |
Brian Hsu; Rahul Mazumder; Preetam Nandy; Kinjal Basu; |
964 | SInGE: Sparsity Via Integrated Gradients Estimation of Neuron Relevance Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we discuss the limitations of existing pruning heuristics, among which are magnitude- and gradient-based methods. |
Edouard YVINEC; Arnaud Dapogny; Matthieu Cord; Kevin Bailly; |
965 | Spartan: Differentiable Sparsity Via Regularized Transportation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Spartan, a method for training sparse neural network models with a predetermined level of sparsity. |
Kai Sheng Tai; Taipeng Tian; Ser Nam Lim; |
966 | Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: This study considers online learning with general directed feedback graphs. For this problem, we present best-of-both-worlds algorithms that achieve nearly tight regret bounds for … |
Shinji Ito; Taira Tsuchiya; Junya Honda; |
967 | Average Sensitivity of Euclidean K-Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In practical situations, the clustering result must be stable against points missing in the input data so that we can make trustworthy and consistent decisions. To address this issue, we consider the average sensitivity of Euclidean $(k,\ell)$-clustering, which measures the stability of the output in total variation distance against deleting a random point from the input data. |
Yuichi Yoshida; Shinji Ito; |
968 | Single Loop Gaussian Homotopy Method for Non-convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel single loop framework for GH methods (SLGH) that updates the parameter $t$ and the optimization decision variables at the same time. |
Hidenori Iwakiri; Yuhang Wang; Shinji Ito; Akiko Takeda; |
969 | Causality Preserving Chaotic Transformation and Classification Using Neurochaos Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, a recently proposed brain-inspired learning algorithm, namely Neurochaos Learning (NL), is used for the classification of cause-effect from coupled autoregressive processes, coupled 1D chaotic skew tent maps, coupled 1D chaotic logistic maps and a real-world prey-predator system. |
Harikrishnan N B; Aditi Kathpalia; Nithin Nagaraj; |
970 | A First Approach to Universal Second-Order Acceleration for Convex Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a universal and adaptive second-order method for minimization of second-order smooth, convex functions. |
Ali Kavis; Kimon Antonakopoulos; Volkan Cevher; |
971 | Provably Tuning The ElasticNet Across Instances Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of tuning the regularization parameters of Ridge regression, LASSO, and the ElasticNet across multiple problem instances, a setting that encompasses both cross-validation and multi-task hyperparameter optimization. |
Maria-Florina Balcan; Misha Khodak; Dravyansh Sharma; Ameet Talwalkar; |
972 | Trimmed Maximum Likelihood Estimation for Robust Generalized Linear Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of learning generalized linear models under adversarial corruptions. |
Weihao Kong; Rajat Sen; Pranjal Awasthi; Abhimanyu Das; |
973 | On Optimal Learning Under Targeted Data Poisoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we aim to characterize the smallest achievable error $\epsilon=\epsilon(\eta)$ by the learner in the presence of such an adversary in both realizable and agnostic settings. |
Idan Mehalel; Steve Hanneke; Shay Moran; Mohammad Mahmoody; Amin Karbasi; |
974 | FNeVR: Neural Volume Rendering for Face Animation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, it remains a critical challenge to generate identity preserving and photo-realistic images due to the sophisticated motion deformation and complex facial detail modeling. To address these problems, we propose a Face Neural Volume Rendering (FNeVR) network to fully explore the potential of 2D motion warping and 3D volume rendering in a unified framework. |
Bohan Zeng; Boyu Liu; Hong Li; Xuhui Liu; Jianzhuang Liu; Dapeng Chen; Wei Peng; Baochang Zhang; |
975 | Rethinking Variational Inference for Probabilistic Programs with Stochastic Support Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Support Decomposition Variational Inference (SDVI), a new variational inference (VI) approach for probabilistic programs with stochastic support. |
Tim Reichelt; Luke Ong; Thomas Rainforth; |
976 | Finite-Time Last-Iterate Convergence for Learning in Multi-Player Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the question of last-iterate convergence rate of the extragradient algorithm by Korpelevich [1976] and the optimistic gradient algorithm by Popov [1980] in multi-player games. |
Yang Cai; Argyris Oikonomou; Weiqiang Zheng; |
977 | Efficient Methods for Non-stationary Online Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $O(\log T)$ to 1. |
Peng Zhao; Yan-Feng Xie; Lijun Zhang; Zhi-Hua Zhou; |
978 | Semantic Uncertainty Intervals for Disentangled Latent Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we provide principled uncertainty intervals that are guaranteed to contain the true semantic factors for any underlying generative model. |
Swami Sankaranarayanan; Anastasios Angelopoulos; Stephen Bates; Yaniv Romano; Phillip Isola; |
979 | Policy Gradient With Serial Markov Chain Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new framework that performs decision-making in reinforcement learning (RL) as an iterative reasoning process. |
Edoardo Cetin; Oya Celiktutan; |
980 | Imitating Past Successes Can Be Very Suboptimal Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our aim is not to develop an entirely new method, but rather to explain how a variant of outcome-conditioned imitation learning can be used to maximize rewards. |
Benjamin Eysenbach; Soumith Udatha; Russ Salakhutdinov; Sergey Levine; |
981 | Semantic Difference Convolution for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an efficient boundary-aware convolution operator to boost the boundary modeling capacity for semantic segmentation, named Semantic Difference Convolution (SDC). |
Haoru Tan; Sitong Wu; Jimin Pi; |
982 | Thinned Random Measures for Sparse Graphs with Overlapping Communities Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a framework for thinning edges from realizations of GGP random graphs that models observed links via nodes’ overall propensity to interact, as well as the similarity of node memberships within a large set of latent communities. |
Federica Zoe Ricci; Michele Guindani; Erik Sudderth; |
983 | A Communication-efficient Algorithm with Linear Convergence for Federated Minimax Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study a large-scale multi-agent minimax optimization problem, which models many interesting applications in statistical learning and game theory, including Generative Adversarial Networks (GANs). |
Zhenyu Sun; Ermin Wei; |
984 | Queue Up Your Regrets: Achieving The Dynamic Capacity Region of Multiplayer Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main observation is that the gap between $\lambda_{n}t$ and the accumulated reward of agent $n$, which we call the QoS regret, behaves like a queue. Inspired by this observation, we propose a distributed algorithm that aims to learn a max-weight matching of agents to actions. |
Ilai Bistritz; Nicholas Bambos; |
985 | Trading Off Resource Budgets For Improved Regret Bounds Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: We consider a variant of adversarial online learning where in each round one picks $B$ arms and incurs cost equal to the minimum of the costs of each arm chosen. We study the … |
Thomas Orton; Damon Falck; |
986 | Neural Matching Fields: Implicit Representation of Matching Cost for Semantic Correspondence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the complexity and high dimensionality of a 4D matching field are the major hindrances. To address them, we propose a cost embedding network consisting of convolution and self-attention layers to process the coarse cost volume to obtain a cost feature representation, which is used as guidance for establishing a high-precision matching field through the following fully-connected network. |
Sunghwan Hong; Seungryong Kim; Dongbo Min; Sangryul Jeon; Seokju Cho; Susung Hong; Jisu Nam; |
987 | On-Demand Sampling: Learning Optimally from Multiple Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In each of these settings, a learner seeks to minimize its worst-case loss over a set of $n$ predefined distributions, while using as few samples as possible. In this paper, we establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity. |
Nika Haghtalab; Michael Jordan; Eric Zhao; |
988 | Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: First, for the setting of CMAB with probabilistically triggered arms (CMAB-T), we discover a novel (directional) triggering probability and variance modulated (TPVM) condition that can replace the previously-used smoothness condition for various applications, such as cascading bandits, online network exploration and online influence maximization. Under this new condition, we propose a BCUCB-T algorithm with variance-aware confidence intervals and conduct regret analysis which reduces the $O(K)$ factor to $O(\log K)$ or $O(\log^2 K)$ in the regret bound, significantly improving the regret bounds for the above applications. |
Xutong Liu; Jinhang Zuo; Siwei Wang; Carlee Joe-Wong; John C.S. Lui; Wei Chen; |
989 | Adversarial Robustness Is at Odds with Lazy Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we take one step further and show that one single gradient step can find adversarial examples for networks trained in the so-called lazy regime. |
Yunjuan Wang; Enayat Ullah; Poorya Mianjy; Raman Arora; |
990 | Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by an observation of a technical defect (i.e., inappropriate usage of Sigmoid function) commonly used in two representative GCL works, DGI and MVGRL, we revisit GCL and introduce a new learning paradigm for self-supervised graph representation learning, namely, Group Discrimination (GD), and propose a novel GD-based method called Graph Group Discrimination (GGD). |
YIZHEN ZHENG; Shirui Pan; Vincent CS Lee; Yu Zheng; Philip S Yu; |
991 | List-Decodable Sparse Mean Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main contribution is the first polynomial-time algorithm that enjoys sample complexity $O\big(\mathrm{poly}(k, \log d)\big)$, i.e. poly-logarithmic in the dimension. |
Shiwei Zeng; Jie Shen; |
992 | Non-monotonic Resource Utilization in The Bandits with Knapsacks Problem Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we introduce a natural generalization of the stochastic BwK problem that allows non-monotonic resource utilization. |
Raunak Kumar; Robert Kleinberg; |
993 | Is $L^2$ Physics Informed Loss Always Suitable for Training Physics Informed Neural Network? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on the theoretical insight, we develop a novel PINN training algorithm to minimize the $L^{\infty}$ loss for HJB equations, which is in a similar spirit to adversarial training. |
Chuwei Wang; Shanda Li; Di He; Liwei Wang; |
994 | Defending Against Adversarial Attacks Via Neural Dynamic System Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Some recent works have accordingly proposed to enhance the robustness of DNNs from a dynamical-system perspective. Following this line of inquiry, and inspired by the asymptotic stability of the general nonautonomous dynamical system, we propose to make each clean instance be the asymptotically stable equilibrium point of a slowly time-varying system in order to defend against adversarial attacks. |
Xiyuan Li; Zou Xin; Weiwei Liu; |
995 | Graph Neural Networks with Adaptive Readouts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose differentiable and adaptive readouts for graph neural networks which replace standard operators such as sum or max, then discuss the benefits and trade-offs with an extensive empirical analysis. |
David Buterez; Jon Paul Janet; Steven J Kiddle; Dino Oglic; Pietro Liò; |
996 | Dataset Factorization for Condensation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study dataset distillation (DD) from a novel perspective and introduce a dataset factorization approach, termed HaBa, which is a plug-and-play strategy portable to any existing DD baseline. |
Songhua Liu; Kai Wang; Xingyi Yang; Jingwen Ye; Xinchao Wang; |
997 | Precise Learning Curves and Higher-Order Scalings for Dot-product Kernel Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish precise closed-form formulas for sample-wise learning curves for dot-product kernel ridge regression in the polynomial scaling regime. |
Lechao Xiao; Jeffrey Pennington; Theodor Misiakiewicz; Hong Hu; Yue Lu; |
998 | Regret Bounds for Information-Directed Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop novel information-theoretic tools to bound the information ratio and cumulative information gain about the learning target. |
Botao Hao; Tor Lattimore; |
999 | Trustworthy Monte Carlo Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The technique typically enables massively parallel computation, however, with the risk that some of the delegated computations contain spontaneous or adversarial errors. We present an orchestration of the computations such that the outcome is accompanied with a proof of correctness. |
Juha Harviainen; Mikko Koivisto; Petteri Kaski; |
1000 | Explicit Tradeoffs Between Adversarial and Natural Distributional Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In practice, however, models need to enjoy both types of robustness to ensure reliability. In this work, we bridge this gap and show that in fact, explicit tradeoffs exist between adversarial and natural distributional robustness. |
Mazda Moayeri; Kiarash Banihashem; Soheil Feizi; |
1001 | Distilling Representations from GAN Generator Via Squeeze and Span Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to distill knowledge from GAN generators by squeezing and spanning their representations. |
Yu Yang; Xiaotian Cheng; Chang Liu; Hakan Bilen; Xiangyang Ji; |
1002 | Remember The Past: Distilling Datasets Into Addressable Memories for Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an algorithm that compresses the critical information of a large dataset into compact addressable memories. |
Zhiwei Deng; Olga Russakovsky; |
1003 | Bayesian Subset Selection and Variable Importance for Interpretable Prediction and Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Given any Bayesian predictive model M, we extract a family of near-optimal subsets of variables for linear prediction or classification. This strategy deemphasizes the role of a single “best” subset and instead advances the broader perspective that often many subsets are highly competitive. |
Daniel R. Kowal; |
1004 | Relational Language-Image Pre-training for Human-Object Interaction Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the design of an appropriate pre-training strategy for this task remains underexplored by existing approaches. To address this gap, we propose $\textit{Relational Language-Image Pre-training}$ (RLIP), a strategy for contrastive pre-training that leverages both entity and relation descriptions. |
Hangjie Yuan; Jianwen Jiang; Samuel Albanie; Tao Feng; Ziyuan Huang; Dong Ni; Mingqian Tang; |
1005 | BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes BEER, which adopts communication compression with gradient tracking, and shows it converges at a faster rate of $O(1/T)$. |
Haoyu Zhao; Boyue Li; Zhize Li; Peter Richtarik; Yuejie Chi; |
1006 | Robust Reinforcement Learning Using Offline Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a robust RL algorithm called Robust Fitted Q-Iteration (RFQI), which uses only an offline dataset to learn the optimal robust policy. |
Kishan Panaganti; Zaiyan Xu; Dileep Kalathil; Mohammad Ghavamzadeh; |
1007 | Flamingo: A Visual Language Model for Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research. We introduce Flamingo, a family of Visual Language Models (VLM) with this ability. |
Jean-Baptiste Alayrac; Jeff Donahue; Pauline Luc; Antoine Miech; Iain Barr; Yana Hasson; Karel Lenc; Arthur Mensch; Katherine Millican; Malcolm Reynolds; Roman Ring; Eliza Rutherford; Serkan Cabi; Tengda Han; Zhitao Gong; Sina Samangooei; Marianne Monteiro; Jacob L Menick; Sebastian Borgeaud; Andy Brock; Aida Nematzadeh; Sahand Sharifzadeh; Mikołaj Bińkowski; Ricardo Barreira; Oriol Vinyals; Andrew Zisserman; Karen Simonyan; |
1008 | DOMINO: Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Multiple confounders can influence the transition dynamics, making it challenging to infer accurate context for decision-making. This paper addresses such a challenge by decomposed mutual information optimization (DOMINO) for context learning, which explicitly learns a disentangled context to maximize the mutual information between the context and historical trajectories while minimizing the state transition prediction error. |
Yao Mu; Yuzheng Zhuang; Fei Ni; Bin Wang; Jianyu Chen; Jianye Hao; Ping Luo; |
1009 | Align Then Fusion: Generalized Large-scale Multi-view Clustering with Anchor Matching Correspondences Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Under multi-view scenarios, generating correct correspondences could be extremely difficult since anchors are not consistent in feature dimensions. To solve this challenging issue, we propose the first study of a generalized and flexible anchor graph fusion framework termed Fast Multi-View Anchor-Correspondence Clustering (FMVACC). |
Siwei Wang; Xinwang Liu; Suyuan Liu; Jiaqi Jin; Wenxuan Tu; Xinzhong Zhu; En Zhu; |
1010 | Differentiable Analog Quantum Computing for Optimization and Control Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We formulate the first differentiable analog quantum computing framework with specific parameterization design at the analog signal (pulse) level to better exploit near-term quantum devices via variational methods. |
Jiaqi Leng; Yuxiang Peng; Yi-Ling Qiao; Ming Lin; Xiaodi Wu; |
1011 | Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a Long-Form Video-Language Pre-training (LF-VLP) model, and train it on a large-scale long-form video and paragraph dataset constructed from an existing public dataset. |
Yuchong Sun; Bei Liu; Hongwei Xue; Ruihua Song; Huan Yang; Jianlong Fu; |
1012 | Understanding Neural Architecture Search: Convergence and Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study the generalization properties of NAS under a unifying framework enabling (deep) layer skip connection search and activation function search. |
Zhenyu Zhu; Fanghui Liu; Grigorios Chrysos; Volkan Cevher; |
1013 | Symmetry-induced Disentanglement on Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Disentanglement has been formalized using a symmetry-centric notion for unstructured spaces; however, graphs have eluded a similarly rigorous treatment. We fill this gap with a new notion of conditional symmetry-based disentanglement, and leverage tools from Lie algebras to encode graph properties into subgroups using suitable adaptations of generative models such as Variational Autoencoders. |
Giangiacomo Mercatali; Andre Freitas; Vikas Garg; |
1014 | Efficient Adversarial Training Without Attacking: Worst-Case-Aware Robust Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a strong and efficient robust training framework for RL, named Worst-case-aware Robust RL (WocaR-RL), that directly estimates and optimizes the worst-case reward of a policy under bounded $\ell_p$ attacks without requiring extra samples for learning an attacker. |
Yongyuan Liang; Yanchao Sun; Ruijie Zheng; Furong Huang; |
1015 | On The Epistemic Limits of Personalized Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a metric to evaluate the worst-case accuracy gain across groups called the benefit of personalization (BoP). |
Lucas Monteiro Paes; Carol Long; Berk Ustun; Flavio Calmon; |
1016 | Black-box Pseudodata Variational Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: So far, both approaches are limited by complexities in evaluating their objectives for general purpose models, and require generating samples from a typically intractable posterior over the coreset throughout inference and testing. In this work, we present a black-box variational inference algorithm for coresets that overcomes these constraints and enables principled application of variational coresets to intractable models, such as Bayesian neural networks. |
Dionysis Manousakas; Hippolyt Ritter; Theofanis Karaletsos; |
1017 | Modular Flows: Differential Molecular Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose generative graph normalizing flow models, based on PDEs, for high quality molecular generation. |
Yogesh Verma; Samuel Kaski; Markus Heinonen; Vikas Garg; |
1018 | The Phenomenon of Policy Churn Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We identify and study the phenomenon of policy churn, that is, the rapid change of the greedy policy in value-based reinforcement learning. |
Tom Schaul; Andre Barreto; John Quan; Georg Ostrovski; |
1019 | Extreme Compression for Pre-trained Transformers Made Simple and Efficient Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on our study, we propose a simple yet effective compression pipeline for extreme compression. |
Xiaoxia Wu; Zhewei Yao; Minjia Zhang; Conglong Li; Yuxiong He; |
1020 | DreamShard: Generalizable Embedding Table Placement for Recommender Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we present DreamShard, a reinforcement learning (RL) approach for embedding table placement. |
Daochen Zha; Louis Feng; Qiaoyu Tan; Zirui Liu; Kwei-Herng Lai; Bhargav Bhushanam; Yuandong Tian; Arun Kejariwal; Xia Hu; |
1021 | D-GCCA: Decomposition-based Generalized Canonical Correlation Analysis for Multi-view High-dimensional Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel decomposition method for this model, called decomposition-based generalized canonical correlation analysis (D-GCCA). |
Hai Shu; Zhe Qu; Hongtu Zhu; |
1022 | Sequential Information Design: Learning to Persuade in The Dark Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study a repeated information design problem faced by an informed sender who tries to influence the behavior of a self-interested receiver. |
Martino Bernasconi; Matteo Castiglioni; Alberto Marchesi; Nicola Gatti; Francesco Trovò; |
1023 | GAL: Gradient Assisted Learning for Decentralized Multi-Organization Collaborations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose Gradient Assisted Learning (GAL), a new method for multiple organizations to assist each other in supervised learning tasks without sharing local data, models, and objective functions. |
Enmao Diao; Jie Ding; Vahid Tarokh; |
1024 | SAPA: Similarity-Aware Point Affiliation for Feature Upsampling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By rethinking point affiliation, we present a generic formulation for generating upsampling kernels. |
Hao Lu; Wenze Liu; Zixuan Ye; Hongtao Fu; Yuliang Liu; Zhiguo Cao; |
1025 | Outlier-Robust Sparse Estimation Via Non-Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. |
Yu Cheng; Ilias Diakonikolas; Rong Ge; Shivam Gupta; Daniel Kane; Mahdi Soltanolkotabi; |
1026 | Chaotic Regularization and Heavy-Tailed Limits for Deterministic Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we incorporate a chaotic component into GD in a controlled manner, and introduce multiscale perturbed GD (MPGD), a novel optimization framework where the GD recursion is augmented with chaotic perturbations that evolve via an independent dynamical system. |
Soon Hoe Lim; Yijun Wan; Umut Simsekli; |
1027 | When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of multi-armed bandits with $\epsilon$-global Differential Privacy (DP). |
Achraf Azize; Debabrota Basu; |
1028 | [Re] Reproduction Study of Variational Fair Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: VFC is capable of handling large datasets and offers a mechanism that allows for a trade-off between fairness and clustering quality. We run a series of experiments to evaluate the major claims made by the authors. Specifically, that VFC is on par with SOTA clustering objectives, that it is scalable, that it has a trade-off control, and that it is compatible with both prototype-based and graph-based clustering algorithms. |
Floor Eijkjelboom; Mark Fokkema; Anna Lau; Luuk Verheijen; |
1029 | Estimating Noise Transition Matrix with Label Correlations for Noisy Multi-Label Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new method to estimate the transition matrices by exploiting label correlations for noisy multi-label learning. |
Shikun Li; Xiaobo Xia; Hansong Zhang; Yibing Zhan; Shiming Ge; Tongliang Liu; |
1030 | Semi-infinitely Constrained Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a generalization of constrained Markov decision processes (CMDPs) that we call the semi-infinitely constrained Markov decision process (SICMDP). |
Liangyu Zhang; Yang Peng; Wenhao Yang; Zhihua Zhang; |
1031 | WaveBound: Dynamically Bounding Error for Stable Time Series Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce the dynamic error bounds on training loss to address the overfitting issue in time series forecasting. |
Youngin Cho; Daejin Kim; DONGMIN KIM; MOHAMMAD AZAM KHAN; Jaegul Choo; |
1032 | Multi-objective Deep Data Generation with Correlated Property Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We address these challenges by proposing a novel deep generative framework that recovers semantics and correlation of properties through disentangled latent vectors. |
Shiyu Wang; Xiaojie Guo; Xuanyang Lin; Bo Pan; Yuanqi Du; Yinkai Wang; Yanfang Ye; Ashley Petersen; Austin Leitgeb; Saleh Alkhalifa; Kevin Minbiole; William M. Wuest; Amarda Shehu; Liang Zhao; |
1033 | Semi-supervised Semantic Segmentation with Prototype-based Consistency Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This diversity makes label propagation from pixel to pixel difficult. To address this problem, we propose a novel approach to regularize the distribution of within-class features to ease label propagation difficulty. |
Haiming Xu; Lingqiao Liu; Qiuchen Bian; Zhen Yang; |
1034 | Recursive Reasoning in Minimax Games: A Level $k$ Gradient Play Method Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In order to stabilize the learning dynamics in minimax games, we propose a novel recursive reasoning algorithm: Level $k$ Gradient Play (Lv. |
Zichu Liu; Lacra Pavel; |
1035 | Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we show that we can learn highly informative posteriors from the source task, through supervised or self-supervised approaches, which then serve as the basis for priors that modify the whole loss surface on the downstream task. |
Ravid Shwartz-Ziv; Micah Goldblum; Hossein Souri; Sanyam Kapoor; Chen Zhu; Yann LeCun; Andrew Wilson; |
1036 | Causal Discovery in Linear Latent Variable Models Subject to Measurement Error Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on our theoretical results, we propose causal structure learning methods for both models, and evaluate their performance on synthetic data. |
Yuqin Yang; AmirEmad Ghassami; Mohamed Nafea; Negar Kiyavash; Kun Zhang; Ilya Shpitser; |
1037 | DeepTOP: Deep Threshold-Optimal Policy for MDPs and RMABs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider the problem of learning the optimal threshold policy for control problems. |
Khaled Nakhleh; I-Hong Hou; |
1038 | Advancing Model Pruning Via Bi-level Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we formulate the pruning problem from a fresh and novel viewpoint, bi-level optimization (BLO). |
Yihua Zhang; Yuguang Yao; Parikshit Ram; pu zhao; Tianlong Chen; Mingyi Hong; Yanzhi Wang; Sijia Liu; |
1039 | Action-modulated Midbrain Dopamine Activity Arises from Distributed Control Policies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Animal behavior is driven by multiple brain regions working in parallel with distinct control policies. We present a biologically plausible model of off-policy reinforcement learning in the basal ganglia, which enables learning in such an architecture. |
Jack Lindsey; Ashok Litwin-Kumar; |
1040 | VTC-LFC: Vision Transformer Compression with Low-Frequency Components Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by recent findings that self-attention is a low-pass filter and low-frequency signals/components are more informative to ViTs, this paper proposes compressing ViTs with low-frequency components. |
Zhenyu Wang; Hao Luo; Pichao WANG; Feng Ding; Fan Wang; Hao Li; |
1041 | Zeroth-Order Hard-Thresholding: Gradient Error Vs. Expansivity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, first-order gradients of the objective function may be either unavailable or expensive to calculate in many real-world problems, where zeroth-order (ZO) gradients could be a good surrogate. Unfortunately, whether ZO gradients can work with the hard-thresholding operator is still an unsolved problem. To solve this puzzle, in this paper we focus on $\ell_0$-constrained black-box stochastic optimization problems, and propose a new stochastic zeroth-order gradient hard-thresholding (SZOHT) algorithm with a general ZO gradient estimator powered by a novel random support sampling. |
William de Vazelhes; Hualin Zhang; Huimin Wu; Xiaotong Yuan; Bin Gu; |
1042 | A Boosting Approach to Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we take a further step: we reduce reinforcement learning to a sequence of weak learning problems. |
Nataly Brukhim; Elad Hazan; Karan Singh; |
1043 | Deep Generative Model for Periodic Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address them, this paper proposes Periodical-Graph Disentangled Variational Auto-encoder (PGD-VAE), a new deep generative model for periodic graphs that can automatically learn, disentangle, and generate local and global graph patterns. |
Shiyu Wang; Xiaojie Guo; Liang Zhao; |
1044 | Finding Second-Order Stationary Points in Nonconvex-Strongly-Concave Minimax Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes efficient algorithms for finding second-order stationary points of the nonconvex-strongly-concave minimax problems. |
Luo Luo; Yujun Li; Cheng Chen; |
1045 | Alignment As A Multi-agent Intrinsic Reward Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Current multi-agent algorithms struggle to learn in the alternative setup of decentralized training or sparse rewards. To address these issues, we propose a self-supervised intrinsic reward called alignment, inspired by the self-organization principle in zoology. |
Zixian Ma; Rose Wang; Michael Bernstein; Fei-Fei Li; Ranjay Krishna; |
1046 | Recruitment Strategies That Take A Chance Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previous work has considered algorithms that make offers sequentially and are subject to a hard budget constraint. We argue that these modeling choices may be inconsistent with the practice of academic recruitment. |
Gregory Kehne; Ariel Procaccia; Jingyan Wang; |
1047 | Active Learning of Classifiers with Label and Seed Queries Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that, under a generalized notion of margin, combining two types of queries in active multi-class classification yields a polynomial time algorithm with exponential savings in query complexity. |
Marco Bressan; Nicolò Cesa-Bianchi; Silvio Lattanzi; Andrea Paudice; Maximilian Thiessen; |
1048 | MetaTeacher: Coordinating Multi-Model Domain Adaptation for Medical Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a novel MetaTeacher framework with three key components: (1) A learnable coordinating scheme for adaptive domain adaptation of individual source models, (2) A mutual feedback mechanism between the target model and source models for more coherent learning, and (3) A semi-supervised bilevel optimization algorithm for consistently organizing the adaption of source models and the learning of target model. |
Zhenbin Wang; Mao Ye; Xiatian Zhu; Liuhan Peng; Liang Tian; Yingying Zhu; |
1049 | What Makes A "Good" Data Augmentation in Knowledge Distillation – A Statistical Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we ask: Why do some DA schemes (e.g., CutMix) inherently perform much better than others in KD? |
Huan Wang; Suhas Lohit; Michael Jones; Yun Fu; |
1050 | Low-rank Lottery Tickets: Finding Efficient Low-rank Neural Networks Via Matrix Differential Equations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel algorithm to find efficient low-rank subnetworks. |
Steffen Schotthöfer; Emanuele Zangrando; Jonas Kusch; Gianluca Ceruti; Francesco Tudisco; |
1051 | On Computing Probabilistic Explanations for Decision Trees Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: Formal XAI (explainable AI) is a growing area that focuses on computing explanations with mathematical guarantees for the decisions made by ML models. Inside formal XAI, one of … |
Marcelo Arenas; Pablo Barceló; Miguel Romero Orth; Bernardo Subercaseaux; |
1052 | Global Optimal K-Medoids Clustering of One Million Samples Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proposes a branch and bound (BB) scheme, in which a tailored Lagrangian relaxation method proposed in the 1970s is used to provide a lower bound at each BB node. |
Jiayang Ren; Kaixun Hua; Yankai Cao; |
1053 | SemiFL: Semi-Supervised Federated Learning for Unlabeled Clients with Alternate Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose SemiFL to address the problem of combining communication efficient FL like FedAvg with Semi-Supervised Learning (SSL). |
Enmao Diao; Jie Ding; Vahid Tarokh; |
1054 | Disentangling Transfer in Continual Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The ability of continual learning systems to transfer knowledge from previously seen tasks in order to maximize forward transfer is a significant challenge for the field, limiting the applicability of continual learning solutions to realistic scenarios. Consequently, this study aims to broaden our understanding of transfer and its driving forces in the specific case of continual reinforcement learning. |
Maciej Wolczyk; Michał Zając; Razvan Pascanu; Łukasz Kuciński; Piotr Miłoś; |
1055 | Weak-shot Semantic Segmentation Via Dual Similarity Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we focus on the problem named weak-shot semantic segmentation, where the novel classes are learnt from cheaper image-level labels with the support of base classes having off-the-shelf pixel-level labels. |
Junjie Chen; Li Niu; Siyuan Zhou; Jianlou Si; Chen Qian; Liqing Zhang; |
1056 | Towards Safe Reinforcement Learning with A Safety Editor Policy Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present SEditor, a two-policy approach that learns a safety editor policy transforming potentially unsafe actions proposed by a utility maximizer policy into safe ones. |
Haonan Yu; Wei Xu; Haichao Zhang; |
1057 | Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a preconditioned version of SGD that preserves all the favorable practical qualities of SGD for huge-scale online optimization while also making it agnostic to $\kappa$. |
Jialun Zhang; Hong-Ming Chiu; Richard Y Zhang; |
1058 | Maximum Common Subgraph Guided Graph Retrieval: Late and Early Interaction Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose both late and early interaction neural MCES and MCCS formulations. |
Indradyumna Roy; Soumen Chakrabarti; Abir De; |
1059 | Boosting The Performance of Generic Deep Neural Network Frameworks with Log-supermodular CRFs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a generic combined approach in which a log-supermodular CRF acts as a regularizer to encourage similarity between outputs in a structured prediction task. |
Hao Xiong; Yangxiao Lu; Nicholas Ruozzi; |
1060 | Error Analysis of Tensor-Train Cross Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To our knowledge, existing results only provide element-wise approximation accuracy guarantees, which lead to a very loose bound when extended to the entire tensor. In this paper, we bridge this gap by providing accuracy guarantees in terms of the entire tensor for both exact and noisy measurements. |
Zhen Qin; Alexander Lidiak; Zhexuan Gong; Gongguo Tang; Michael B Wakin; Zhihui Zhu; |
1061 | Language Models with Image Descriptors Are Strong Few-Shot Video-Language Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The goal of this work is to build flexible video-language models that can generalize to various video-to-text tasks from few examples. |
Zhenhailong Wang; Manling Li; Ruochen Xu; Luowei Zhou; Jie Lei; Xudong Lin; Shuohang Wang; Ziyi Yang; Chenguang Zhu; Derek Hoiem; Shih-Fu Chang; Mohit Bansal; Heng Ji; |
1062 | Beyond The Best: Distribution Functional Estimation in Infinite-Armed Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose unified meta algorithms for the online and offline settings and derive matching lower bounds using different Wasserstein distances. |
Yifei Wang; Tavor Baharav; Yanjun Han; Jiantao Jiao; David Tse; |
1063 | Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a general framework to design posterior sampling methods for model-based RL. |
Alekh Agarwal; Tong Zhang; |
1064 | Learning Interacting Dynamical Systems with Latent Gaussian Process ODEs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new model that decomposes independent dynamics of single objects accurately from their interactions. |
Çağatay Yıldız; Melih Kandemir; Barbara Rakitsch; |
1065 | Semi-Parametric Neural Image Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our work questions the underlying paradigm of compressing large training data into ever growing parametric representations. We rather present an orthogonal, semi-parametric approach. |
Andreas Blattmann; Robin Rombach; Kaan Oktay; Jonas Müller; Björn Ommer; |
1066 | Data-Driven Conditional Robust Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an integrated framework that designs the conditional uncertainty set by jointly learning the partitions in the covariate data space and simultaneously constructing partition specific deep uncertainty sets for the random vector that perturbs the CRO problem. |
Abhilash Reddy Chenreddy; Nymisha Bandi; Erick Delage; |
1067 | Adaptive Sampling for Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study a sequential decision-making problem, called Adaptive Sampling for Discovery (ASD). |
Ziping Xu; Eunjae Shim; Ambuj Tewari; Paul Zimmerman; |
1068 | Efficient Architecture Search for Diverse Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As less-studied domains are precisely those where we expect AutoML to have the greatest impact, in this work we study NAS for efficiently solving diverse problems. |
Junhong Shen; Misha Khodak; Ameet Talwalkar; |
1069 | Causal Identification Under Markov Equivalence: Calculus, Algorithm, and Completeness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we assume as the input of the task a less informative structure known as a partial ancestral graph (PAG), which represents a Markov equivalence class of causal diagrams, learnable from observational data. |
Amin Jaber; Adele Ribeiro; Jiji Zhang; Elias Bareinboim; |
1070 | Single Model Uncertainty Estimation Via Stochastic Data Centering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a striking new finding that an ensemble of neural networks with the same weight initialization, trained on datasets that are shifted by a constant bias gives rise to slightly inconsistent trained models, where the differences in predictions are a strong indicator of epistemic uncertainties. |
Jayaraman Thiagarajan; Rushil Anirudh; Vivek Sivaraman Narayanaswamy; Timo Bremer; |
1071 | Models Out of Line: A Fourier Lens on Distribution Shift Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Thus, to aid future research into the OOD puzzle, we address the gap in publicly-available models with effective robustness by introducing a set of pretrained CIFAR-10 models—RobustNets—with varying levels of OOD robustness. |
Sara Fridovich-Keil; Brian Bartoldson; James Diffenderfer; Bhavya Kailkhura; Timo Bremer; |
1072 | Global Convergence and Stability of Stochastic Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we demonstrate the restrictiveness of these assumptions using three canonical models in machine learning, then we develop novel theoretical tools to address this shortcoming in two ways. |
Vivak Patel; Shushu Zhang; Bowen Tian; |
1073 | Training Spiking Neural Networks with Local Tandem Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL). |
Qu Yang; Jibin Wu; Malu Zhang; Yansong Chua; Xinchao Wang; Haizhou Li; |
1074 | Robust Calibration with Multi-domain Temperature Scaling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a systematic calibration model to handle distribution shifts by leveraging data from multiple domains. |
Yaodong Yu; Stephen Bates; Yi Ma; Michael Jordan; |
1075 | Roadblocks for Temporarily Disabling Shortcuts and Learning New Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although recent studies have shown some characteristics of shortcuts, there are few investigations into how to help deep learning models overcome shortcut problems. This paper proposes a framework that addresses this issue by setting up roadblocks on shortcuts. |
Hongjing Niu; Hanting Li; Feng Zhao; Bin Li; |
1076 | A Primer for Neural Arithmetic Logic Modules Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Focusing on the shortcomings of the NALU, we provide an in-depth analysis to reason about design choices of recent modules. |
Bhumika Mistry; Katayoun Farrahi; Jonathon Hare; |
1077 | In What Ways Are Deep Neural Networks Invariant and How Should We Measure This? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we explore the nature of invariance and equivariance of deep learning models with the goal of better understanding the ways that they actually capture these concepts on a formal level. We introduce a family of invariance and equivariance metrics that allow us to quantify these properties in a way that disentangles them from other metrics such as loss or accuracy. |
Henry Kvinge; Tegan Emerson; Grayson Jorgenson; Scott Vasquez; Tim Doster; Jesse Lew; |
1078 | 3DILG: Irregular Latent Grids for 3D Generative Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new representation for encoding 3D shapes as neural fields. |
Biao Zhang; Matthias Niessner; Peter Wonka; |
1079 | Deep Combinatorial Aggregation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explore a combinatorial generalization of deep ensemble called deep combinatorial aggregation (DCA). |
Yuesong Shen; Daniel Cremers; |
1080 | Subspace Clustering in High-dimensions: Phase Transitions \& Statistical-to-Computational Gap Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model where the cluster means are sparse vectors. Here we provide an exact asymptotic characterization of the statistically optimal reconstruction error in this model in the high-dimensional regime with extensive sparsity, i.e. when the fraction of non-zero components of the cluster means $\rho$, as well as the ratio $\alpha$ between the number of samples and the dimension are fixed, while the dimension diverges. |
Luca Pesce; Bruno Loureiro; Florent Krzakala; Lenka Zdeborová; |
1081 | Sound and Complete Incorporation of Local Causal Background Knowledge with Latent Variables Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As an application of the orientation rules, we propose the first general active learning framework for causal discovery in the presence of latent confounders, aiming to recover the true ancestral graph with as few interventions as possible. |
Tian-Zuo Wang; Tian Qin; Zhi-Hua Zhou; |
1082 | Diverse Weight Averaging for Out-of-Distribution Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Diverse Weight Averaging (DiWA), a new WA strategy whose main motivation is to increase the functional diversity across averaged models. |
Alexandre Rame; Matthieu Kirchmeyer; Thibaud Rahier; Alain Rakotomamonjy; Patrick Gallinari; Matthieu Cord; |
1083 | Sampling in Constrained Domains with Orthogonal-Space Variational Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new variational framework with a designed orthogonal-space gradient flow (O-Gradient) for sampling on a manifold $\mathcal{G}_0$ defined by general equality constraints. |
Ruqi Zhang; Qiang Liu; Xin Tong; |
1084 | Generalization Bounds for Stochastic Gradient Descent Via Localized $\varepsilon$-Covers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new covering technique localized for the trajectories of SGD. |
Sejun Park; Umut Simsekli; Murat Erdogdu; |
1085 | Multi-Fidelity Best-Arm Identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the multi-fidelity variant of best-arm identification, where observing the outcome of a given arm is expensive, but multiple and biased approximations (i.e. fidelity) are available at a cheaper cost. |
Riccardo Poiani; Alberto Maria Metelli; Marcello Restelli; |
1086 | An Efficient Graph Generative Model for Navigating Ultra-large Combinatorial Synthesis Libraries Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To overcome these challenges, we propose the Combinatorial Synthesis Library Variational Auto-Encoder (CSLVAE). The proposed generative model represents such libraries as a differentiable, hierarchically-organized database. |
Aryan Pedawi; Pawel Gniewek; Chaoyi Chang; Brandon Anderson; Henry van den Bedem; |
1087 | Not Too Little, Not Too Much: A Theoretical Analysis of Graph (over)smoothing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider simplified linear GNNs, and rigorously analyze two examples for which a finite number of mean aggregation steps provably improves the learning performance, before oversmoothing kicks in. |
Nicolas Keriven; |
1088 | Adam Can Converge Without Any Modification On Update Rules Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove that, when the 2nd-order momentum parameter $\beta_2$ is large and 1st-order momentum parameter $\beta_1 < \sqrt{\beta_2}<1$, Adam converges to the neighborhood of critical points. |
Yushun Zhang; Congliang Chen; Naichen Shi; Ruoyu Sun; Zhi-Quan Luo; |
1089 | Towards Efficient Post-training Quantization of Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, they suffer from slow training, large memory overhead, and data accessibility issues. In this paper, we study post-training quantization (PTQ) of PLMs, and propose module-wise reconstruction error minimization (MREM), an efficient solution to mitigate these issues. |
Haoli Bai; Lu Hou; Lifeng Shang; Xin Jiang; Irwin King; Michael R Lyu; |
1090 | On The Robustness of Deep Clustering Models: Adversarial Attacks and Defenses Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Through this work, we thus aim to motivate the need for truly robust deep clustering models. |
Anshuman Chhabra; Ashwin Sekhari; Prasant Mohapatra; |
1091 | Bidirectional Learning for Offline Infinite-width Model-based Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This methodology frequently suffers from the out-of-distribution problem where the proxy function often returns adversarial designs. To mitigate this problem, we propose BiDirectional learning for offline Infinite-width model-based optimization (BDI). |
Can Chen; Yingxueff Zhang; Jie Fu; Xue (Steve) Liu; Mark Coates; |
1092 | Path Independent Equilibrium Networks Can Better Exploit Test-Time Computation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we reproduce the performance of the prior art using a broader class of architectures called equilibrium models, and find that stronger generalization performance on harder examples (which require more iterations of inference to get correct) strongly correlates with the path independence of the system—its ability to converge to the same attractor (or limit cycle) regardless of initialization, given enough computation. |
Cem Anil; Ashwini Pokle; Kaiqu Liang; Johannes Treutlein; Yuhuai Wu; Shaojie Bai; J. Zico Kolter; Roger Grosse; |
1093 | On The Convergence of Policy Gradient Methods to Nash Equilibria in General Stochastic Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive the rate of convergence of policy gradient methods to deterministic and/or second-order stationary Nash policies in general stochastic games. |
Angeliki Giannou; Kyriakos Lotidis; Panayotis Mertikopoulos; Emmanouil-Vasileios Vlatakis-Gkaragkounis; |
1094 | Conformal Frequency Estimation with Sketched Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A flexible conformal inference method is developed to construct confidence intervals for the frequencies of queried objects in very large data sets, based on a much smaller sketch of those data. |
Matteo Sesia; Stefano Favaro; |
1095 | Reproducibility in Optimization: Theoretical Framework and Limits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We initiate a formal study of reproducibility in optimization. |
Kwangjun Ahn; Prateek Jain; Ziwei Ji; Satyen Kale; Praneeth Netrapalli; Gil I Shamir; |
1096 | Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper answers an open question in this literature: For the classification setting, what solution does mirror descent (MD) converge to? Specifically, motivated by its efficient implementation, we consider the family of mirror descent algorithms with potential function chosen as the $p$-th power of the $\ell_p$-norm, which is an important generalization of GD. |
Haoyuan Sun; Kwangjun Ahn; Christos Thrampoulidis; Navid Azizan; |
1097 | Deep Surrogate Assisted Generation of Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Deep Surrogate Assisted Generation of Environments (DSAGE), a sample-efficient QD environment generation algorithm that maintains a deep surrogate model for predicting agent behaviors in new environments. |
Varun Bhatt; Bryon Tjanaka; Matthew Fontaine; Stefanos Nikolaidis; |
1098 | Constraining Gaussian Processes to Systems of Linear Ordinary Differential Equations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a novel algorithmic and symbolic construction for covariance functions of Gaussian Processes (GPs) with realizations strictly following a system of linear homogeneous ODEs with constant coefficients, which we call LODE-GPs. |
Andreas Besginow; Markus Lange-Hegermann; |
1099 | Predictive Querying for Autoregressive Neural Sequence Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we introduce a general typology for predictive queries in autoregressive sequence models and show that such queries can be systematically represented by sets of elementary building blocks. |
Alex Boyd; Samuel Showalter; Stephan Mandt; Padhraic Smyth; |
1100 | Joint Model-Policy Optimization of A Lower Bound for Model-Based RL Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As noted in prior work, there is an objective mismatch: models are useful if they yield good policies, but they are trained to maximize their accuracy, rather than the performance of the policies that result from them. In this work, we propose a single objective for jointly training the model and the policy, such that updates to either component increase a lower bound on expected return. |
Benjamin Eysenbach; Alexander Khazatsky; Sergey Levine; Russ Salakhutdinov; |
1101 | Analyzing Sharpness Along GD Trajectory: Progressive Sharpening and Edge of Stability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper aims to analyze the GD dynamics and the sharpness along the optimization trajectory. Our analysis naturally divides the GD trajectory into four phases depending on the change of the sharpness. |
Zixuan Wang; Zhouzi Li; Jian Li; |
1102 | MoCoDA: Model-based Counterfactual Data Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Knowing the local structure also allows us to predict which unseen states and actions this dynamics model will generalize to. We propose to leverage these observations in a novel Model-based Counterfactual Data Augmentation (MoCoDA) framework. |
Silviu Pitis; Elliot Creager; Ajay Mandlekar; Animesh Garg; |
1103 | [Re] Transparent Object Tracking Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The tracking performance was reproduced in terms of success, precision, and normalized precision, and the reported value is in the 95 percent confidence interval, which supports the paper’s conclusion that TransATOM significantly outperforms other state-of-the-art algorithms on TOTB database. |
Žiga Trojer; |
1104 | ALMA: Hierarchical Learning for Composite Multi-Agent Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This decomposed decision making provides a strong structural inductive bias, significantly reduces agent observation spaces, and encourages subtask-specific policies to be reused and composed during training, as opposed to treating each new composition of subtasks as unique. We introduce ALMA, a general learning method for taking advantage of these structured tasks. |
Shariq Iqbal; Robby Costales; Fei Sha; |
1105 | Coordinates Are Not Lonely – Codebook Prior Helps Implicit Neural 3D Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To relieve the over-dependence on massive calibrated images and enrich the coordinate-based feature representation, we explore injecting prior information into the coordinate-based network and introduce a novel coordinate-based model, CoCo-INR, for implicit neural 3D representation. |
Fukun Yin; Wen Liu; Zilong Huang; Pei Cheng; Tao Chen; Gang Yu; |
1106 | Cluster Randomized Designs for One-Sided Bipartite Experiments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study interference in the setting of one-sided bipartite experiments in which the experimental units—where treatments are randomized and outcomes are measured—do not interact directly. |
Jennifer Brennan; Vahab Mirrokni; Jean Pouget-Abadie; |
1107 | Scalable and Efficient Training of Large Convolutional Neural Networks with Differential Privacy Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Large convolutional neural networks (CNN) can be difficult to train in the differentially private (DP) regime, since the optimization algorithms require a computationally expensive operation, known as the per-sample gradient clipping. We propose an efficient and scalable implementation of this clipping on convolutional layers, termed as the mixed ghost clipping, that significantly eases the private training in terms of both time and space complexities, without affecting the accuracy. |
Zhiqi Bu; Jialin Mao; Shiyun Xu; |
1108 | Exploring Evolution-based & -free Protein Language Models As Protein Function Predictors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we aim to answer the following key questions: (1) Does the Evoformer trained as part of AlphaFold produce representations amenable to predicting protein function? |
Mingyang Hu; Fajie Yuan; Kevin Yang; Fusong Ju; Jin Su; Hui Wang; Fei Yang; Qiuyang Ding; |
1109 | Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective, which can be optimised without human supervision or validation data. |
Alexander Immer; Tycho van der Ouderaa; Gunnar Rätsch; Vincent Fortuin; Mark van der Wilk; |
1110 | MGNNI: Multiscale Graph Neural Networks with Implicit Layers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce and justify two weaknesses of implicit GNNs: the constrained expressiveness due to their limited effective range for capturing long-range dependencies, and their lack of ability to capture multiscale information on graphs at multiple resolutions. |
Juncheng Liu; Bryan Hooi; Kenji Kawaguchi; Xiaokui Xiao; |
1111 | Anonymous Bandits for Multi-User Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present and study a new framework for online learning in systems with multiple users that provide user anonymity. |
Hossein Esfandiari; Vahab Mirrokni; Jon Schneider; |
1112 | Amortized Mixing Coupling Processes for Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose cluster-wise amortized mixing coupling processes (AMCP), which is able to achieve efficient amortized clustering in a well-defined non-parametric Bayesian posterior. |
Huafeng Liu; Liping Jing; |
1113 | Continual Learning In Environments With Polynomial Mixing Times Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we characterize problems that are of long-term interest to the development of continual RL, which we call scalable MDPs, through the lens of mixing times. |
Matthew Riemer; Sharath Chandra Raparthy; Ignacio Cases; Gopeshh Subbaraj; Maximilian Puelma Touzel; Irina Rish; |
1114 | On The Adversarial Robustness of Mixture of Experts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This raises an interesting open question, do—and can—functions with more parameters, but not necessarily more computational cost, have better robustness? We study this question for sparse Mixture of Expert models (MoEs), that make it possible to scale up the model size for a roughly constant computational cost. |
Joan Puigcerver; Rodolphe Jenatton; Carlos Riquelme; Pranjal Awasthi; Srinadh Bhojanapalli; |
1115 | First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish convergence guarantees for Riemannian corrected extra-gradient and gradient descent-ascent on geodesically convex-concave min-max problems over manifolds. |
Michael Jordan; Tianyi Lin; Emmanouil-Vasileios Vlatakis-Gkaragkounis; |
1116 | TREC: Transient Redundancy Elimination-based Convolution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper gives a principled method to detect and avoid transient redundancy, a type of redundancy existing in input data or activation maps and hence changing across inferences. |
Jiawei Guan; Feng Zhang; Jiesong Liu; Hsin-Hsuan Sung; Ruofan Wu; Xiaoyong Du; Xipeng Shen; |
1117 | Toward Efficient Robust Training Against Union of $\ell_p$ Threat Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show that by carefully choosing the objective function used for robust training, it is possible to achieve similar, or improved worst-case performance over a union of threat models while utilizing only single-step attacks, thereby achieving a significant reduction in computational resources necessary for training. |
Gaurang Sriramanan; Maharshi Gor; Soheil Feizi; |
1118 | Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple Yet Strong Baseline Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current end-to-end autonomous driving methods either run a controller based on a planned trajectory or perform control prediction directly, which have spanned two separately studied lines of research. Seeing their potential mutual benefits to each other, this paper takes the initiative to explore the combination of these two well-developed worlds. |
Penghao Wu; Xiaosong Jia; Li Chen; Junchi Yan; Hongyang Li; Yu Qiao; |
1119 | On The Detrimental Effect of Invariances in The Likelihood for Variational Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes. In this work, we show that invariances in the likelihood function of over-parametrised models contribute to this phenomenon because these invariances complicate the structure of the posterior by introducing discrete and/or continuous modes which cannot be well approximated by Gaussian mean-field distributions. |
Richard Kurle; Ralf Herbrich; Tim Januschowski; Yuyang (Bernie) Wang; Jan Gasthaus; |
1120 | Truncated Proposals for Scalable and Hassle-free Simulation-based Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce an efficient and testable simulation-based inference method that can scale to complex models with many parameters. |
Michael Deistler; Pedro Goncalves; Jakob H Macke; |
1121 | Spectral Bias in Practice: The Role of Function Frequency in Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose methodologies for measuring spectral bias in modern image classification networks on CIFAR-10 and ImageNet. |
Sara Fridovich-Keil; Raphael Gontijo Lopes; Rebecca Roelofs; |
1122 | Geometry-aware Two-scale PIFu Representation for Human Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel geometry-aware two-scale PIFu for 3D human reconstruction from sparse, noisy inputs. |
Zheng Dong; Ke Xu; Ziheng Duan; Hujun Bao; Weiwei Xu; Rynson Lau; |
1123 | ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present an efficient and affordable post-training quantization approach to compress large Transformer-based models, termed ZeroQuant. |
Zhewei Yao; Reza Yazdani Aminabadi; Minjia Zhang; Xiaoxia Wu; Conglong Li; Yuxiong He; |
1124 | ResQ: A Residual Q Function-based Approach for Multi-Agent Reinforcement Learning Value Factorization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing studies are limited by their representation capability, sample efficiency, and approximation error. To address these challenges, we propose ResQ, a MARL value function factorization method, which can find the optimal joint policy for any state-action value function through residual functions. |
Siqi SHEN; Mengwei Qiu; Jun Liu; Weiquan Liu; Yongquan Fu; Xinwang Liu; Cheng Wang; |
1125 | Understanding Cross-Domain Few-Shot Learning Based on Domain Similarity and Few-Shot Difficulty Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present six findings for CD-FSL, which are supported by extensive experiments and analyses on three source and eight target benchmark datasets with varying levels of domain similarity and few-shot difficulty. We further design two pre-training schemes, mixed-supervised and two-stage learning, that improve performance. |
Jaehoon Oh; Sungnyun Kim; Namgyu Ho; Jin-Hwa Kim; Hwanjun Song; Se-Young Yun; |
1126 | The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite the apparent similarity between value-based RL and distributional RL, our study reveals intriguing and fundamental differences between the two cases in the multi-step setting. |
Yunhao Tang; Remi Munos; Mark Rowland; Bernardo Avila Pires; Will Dabney; Marc Bellemare; |
1127 | Improving Barely Supervised Learning By Discriminating Unlabeled Samples with Super-Class Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing SSL methods suffer from failures in barely-supervised learning (BSL), where only one or two labels per class are available, as the insufficient labels make the discriminative information difficult or even infeasible to learn. To bridge this gap, we investigate a simple yet effective way to leverage unlabeled samples for discriminative learning, and propose a novel discriminative information learning module to benefit model training. |
Guan Gui; Zhen Zhao; Lei Qi; Luping Zhou; Lei Wang; Yinghuan Shi; |
1128 | Look Where You Look! Saliency-guided Q-networks for Visual RL Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a generic method improving generalization for visual reinforcement learning based on attribution maps. |
David Bertoin; Adil Zouitine; Mehdi Zouitine; Emmanuel Rachelson; |
1129 | Scalable Neural Video Representations with Learnable Positional Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To meet all requirements (a), (b), and (c) simultaneously, we propose neural video representations with learnable positional features (NVP), a novel CNR by introducing "learnable positional features" that effectively amortize a video as latent codes. |
Subin Kim; Sihyun Yu; Jaeho Lee; Jinwoo Shin; |
1130 | Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. |
Ilias Diakonikolas; Daniel Kane; Jasper Lee; Ankit Pensia; |
1131 | Posted Pricing and Dynamic Prior-independent Mechanisms with Value Maximizers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study posted price auctions and dynamic prior-independent mechanisms for (ROI-constrained) value maximizers. |
Yuan Deng; Vahab Mirrokni; Hanrui Zhang; |
1132 | Stars: Tera-Scale Graph Building for Clustering and Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present \emph{Stars}: a highly scalable method for building extremely sparse graphs via two-hop spanners, which are graphs where similar points are connected by a path of length at most two. |
CJ Carey; Jonathan Halcrow; Rajesh Jayaram; Vahab Mirrokni; Warren Schudy; Peilin Zhong; |
1133 | Subspace Recovery from Heterogeneous Data with Non-isotropic Noise Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Recovering linear subspaces from data is a fundamental and important task in statistics and machine learning. Motivated by heterogeneity in Federated Learning settings, we study a basic formulation of this problem: the principal component analysis (PCA), with a focus on dealing with irregular noise. |
John Duchi; Vitaly Feldman; Lunjia Hu; Kunal Talwar; |
1134 | Deep Invariant Networks with Differentiable Augmentation Layers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we investigate new ways of learning invariances only from the training data. |
Cédric ROMMEL; Thomas Moreau; Alexandre Gramfort; |
1135 | Autoinverse: Uncertainty Aware Inversion of Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Autoinverse, a highly automated approach for inverting neural network surrogates. |
Navid Ansari; Hans-peter Seidel; Nima Vahidi Ferdowsi; Vahid Babaei; |
1136 | CutFreq: Cut-and-Swap Frequency Components for Low-Level Vision Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Cut-and-Swap Frequency Components (CutFreq) method for low-level vision, which aims to preserve high-level representations with directionality and improve image synthesis quality. |
Hongyang Chen; Kaisheng Ma; |
1137 | Performative Power Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the notion of performative power, which measures the ability of a firm operating an algorithmic system, such as a digital content recommendation platform, to steer a population. |
Moritz Hardt; Meena Jagadeesan; Celestine Mendler-Dünner; |
1138 | Fixed-Distance Hamiltonian Monte Carlo Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a variation of the Hamiltonian Monte Carlo sampling (HMC) where the equations of motion are simulated for a fixed traversed distance rather than the conventional fixed simulation time. |
Hadi Mohasel Afshar; Sally Cripps; |
1139 | Geo-Neus: Geometry-Consistent Neural Implicit Surfaces Learning for Multi-view Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, one key challenge remains: existing approaches lack explicit multi-view geometry constraints, hence usually fail to generate geometry consistent surface reconstruction. To address this challenge, we propose geometry-consistent neural implicit surfaces learning for multi-view reconstruction. |
Qiancheng Fu; Qingshan Xu; Yew Soon Ong; Wenbing Tao; |
1140 | Distributional Reinforcement Learning for Risk-Sensitive Policies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose modifications to the existing algorithms that include a new distributional Bellman operator and show that the proposed strategy greatly expands the utility of distributional RL in learning and representing CVaR-optimized policies. |
Shiau Hong Lim; ILYAS MALIK; |
1141 | Neural Estimation of Submodular Functions with Applications to Differentiable Subset Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose FlexSubNet, a family of flexible neural models for both monotone and non-monotone submodular functions. |
Abir De; Soumen Chakrabarti; |
1142 | Exploration Via Elliptical Episodic Bonuses Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In recent years, a number of reinforcement learning (RL) methods have been proposed to explore complex environments which differ across episodes. In this work, we show that the effectiveness of these methods critically relies on a count-based episodic term in their exploration bonus. |
Mikael Henaff; Roberta Raileanu; Minqi Jiang; Tim Rocktäschel; |
1143 | SIXO: Smoothing Inference with Twisted Objectives Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce SIXO, a variational method that learns a sequence of target distributions that approximate the smoothing distributions, incorporating information from all observations, jointly with the model and proposal. |
Dieterich Lawson; Allan Raventós; andrew warrington; Scott Linderman; |
1144 | The Gyro-Structure of Some Matrix Manifolds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the gyrovector space structure (gyro-structure) of matrix manifolds. |
Xuan Son Nguyen; |
1145 | Influencing Long-Term Behavior in Multiagent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a principled framework for considering the limiting policies of other agents as time approaches infinity. |
Dong-Ki Kim; Matthew Riemer; Miao Liu; Jakob Foerster; Michael Everett; Chuangchuang Sun; Gerald Tesauro; Jonathan How; |
1146 | The Query Complexity of Cake Cutting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the query complexity of cake cutting and give lower and upper bounds for computing approximately envy-free, perfect, and equitable allocations with the minimum number of cuts. |
Simina Branzei; Noam Nisan; |
1147 | Rethinking Generalization in Few-Shot Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we take a closer look at the implications in the context of few-shot learning. |
Markus Hiller; Rongkai Ma; Mehrtash Harandi; Tom Drummond; |
1148 | P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, it is non-trivial to promote such a pretraining-tuning paradigm to 3D vision, given the limited training data that are relatively inconvenient to collect. In this paper, we propose a new perspective of leveraging pre-trained 2D knowledge in the 3D domain to tackle this problem, tuning pre-trained image models with the novel Point-to-Pixel prompting for point cloud analysis. |
Ziyi Wang; Xumin Yu; Yongming Rao; Jie Zhou; Jiwen Lu; |
1149 | Estimating Graphical Models for Count Data with Applications to Single-cell Gene Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a two-step method PLNet to estimate the precision matrix. |
Feiyi Xiao; Junjie Tang; Huaying Fang; Ruibin Xi; |
1150 | Zero-Shot 3D Drug Design By Sketching and Generating Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we propose the zero-shot drug design method DESERT (Drug dEsign by SkEtching and geneRaTing). |
Siyu Long; Yi Zhou; Xinyu Dai; Hao Zhou; |
1151 | Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study the convergence of GD and SGD when training mildly parameterized neural networks starting from random initialization. |
Mingze Wang; Chao Ma; |
1152 | Unsupervised Causal Generative Understanding of Images Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel framework for unsupervised object-centric 3D scene understanding that generalizes robustly to out-of-distribution images. |
Titas Anciukevicius; Patrick Fox-Roberts; Edward Rosten; Paul Henderson; |
1153 | Efficient Scheduling of Data Augmentation for Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish a framework to mitigate the interference between data augmentation (DA) and deep RL by separating them in time and scheduling them adaptively. |
Byungchan Ko; Jungseul Ok; |
1154 | General Cutting Planes for Bound-Propagation-Based Neural Network Verification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we generalize the bound propagation procedure to allow the addition of arbitrary cutting plane constraints, including those involving relaxed integer variables that do not appear in existing bound propagation formulations. |
Huan Zhang; Shiqi Wang; Kaidi Xu; Linyi Li; Bo Li; Suman Jana; Cho-Jui Hsieh; J. Zico Kolter; |
1155 | Latent Planning Via Expansive Tree Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we formulate latent planning as search to discover paths between far distant states in high-dimensional and long-horizon goal-reaching scenarios. |
Robert Gieselmann; Florian T. Pokorny; |
1156 | Biologically-plausible Backpropagation Through Arbitrary Timespans Via Local Neuromodulators Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we propose that extra-synaptic diffusion of local neuromodulators such as neuropeptides may afford an effective mode of backpropagation lying within the bounds of biological plausibility. |
Yuhan Helena Liu; Stephen Smith; Stefan Mihalas; Eric Shea-Brown; Uygar Sümbül; |
1157 | Learning Low-dimensional Generalizable Natural Features from Retina Using A U-net Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work focuses on using the retinal encoding of natural movies to determine the presumably behaviorally-relevant features that the brain represents. |
Siwei Wang; Benjamin Hoshal; Elizabeth de Laittre; Thierry Mora; Michael Berry; Stephanie Palmer; |
1158 | Faster Deep Reinforcement Learning with Slower Online Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the vicinity of the target network. |
Kavosh Asadi; Rasool Fakoor; Omer Gottesman; Taesup Kim; Michael Littman; Alexander Smola; |
1159 | PolarMix: A General Data Augmentation Technique for LiDAR Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents PolarMix, a point cloud augmentation technique that is simple and generic but can mitigate the data constraint effectively across various perception tasks and scenarios. |
Aoran Xiao; Jiaxing Huang; Dayan Guan; Kaiwen Cui; Shijian Lu; Ling Shao; |
1160 | Masked Generative Adversarial Networks Are Robust Generation Learners Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper shows that the masked generative adversarial network (MaskedGAN) is a robust image generation learner with limited training data. |
Jiaxing Huang; Kaiwen Cui; Dayan Guan; Aoran Xiao; Fangneng Zhan; Shijian Lu; Shengcai Liao; Eric Xing; |
1161 | PlasticityNet: Learning to Simulate Metal, Sand, and Snow for Optimization Time Integration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a neural network-based approach for learning to represent the behavior of plastic solid materials ranging from rubber and metal to sand and snow. |
Xuan Li; Yadi Cao; Minchen Li; Yin Yang; Craig Schroeder; Chenfanfu Jiang; |
1162 | Template Based Graph Neural Network with Optimal Transport Distances Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose in this work a novel point of view, which places distances to some learnable graph templates at the core of the graph representation. |
Cédric Vincent-Cuaz; Rémi Flamary; Marco Corneli; Titouan Vayer; Nicolas Courty; |
1163 | Decision-based Black-box Attack Against Vision Transformers Via Patch-wise Adversarial Removal Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we theoretically analyze the limitations of existing decision-based attacks from the perspective of noise sensitivity difference between regions of the image, and propose a new decision-based black-box attack against ViTs, termed Patch-wise Adversarial Removal (PAR). |
Yucheng Shi; Yahong Han; Yu-an Tan; Xiaohui Kuang; |
1164 | Sequence-to-Set Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a sequence-to-set method that can transform any sequence generative model based on maximum likelihood to a set generative model where we can evaluate the utility/probability of any set. |
Longtao Tang; Ying Zhou; Yu Yang; |
1165 | Ensemble of Averages: Improving Model Selection and Boosting Performance in Domain Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A simple, hyper-parameter-free strategy of using a moving average of model parameters during training, combined with ensembling, achieves SOTA on domain generalization benchmarks and can be explained using the bias-variance trade-off. |
Devansh Arpit; Huan Wang; Yingbo Zhou; Caiming Xiong; |
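The entry above describes maintaining a simple moving average of model parameters during training and then ensembling the averaged models. The sketch below shows only the generic parameter-averaging step, not the paper's exact recipe; the toy model, synthetic data, and optimizer settings are placeholders.

```python
# Minimal sketch (not the paper's exact recipe): maintain a simple moving
# average of model parameters during training. The model and data below are
# stand-ins for illustration only.
import copy
import torch


def update_parameter_average(avg_model, model, num_updates):
    """Update a running (simple moving) average of model parameters in place."""
    with torch.no_grad():
        for p_avg, p in zip(avg_model.parameters(), model.parameters()):
            p_avg.mul_(num_updates / (num_updates + 1)).add_(p / (num_updates + 1))


model = torch.nn.Linear(16, 4)            # stand-in for a real network
avg_model = copy.deepcopy(model)          # holds the parameter average
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):
    x = torch.randn(32, 16)               # synthetic batch
    y = torch.randint(0, 4, (32,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    update_parameter_average(avg_model, model, num_updates=step)

# At evaluation time, `avg_model` (or an ensemble of several independently
# trained averaged models) would be used instead of the last iterate.
```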
1166 | Neurosymbolic Deep Generative Models for Sequence Data with Relational Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel approach for incorporating global structure in the form of relational constraints between different subcomponents of an example (e.g., lines of a poem or measures of music). |
Halley Young; Maxwell Du; Osbert Bastani; |
1167 | One for All: Simultaneous Metric and Preference Learning Over Multiple Users Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper investigates simultaneous preference and metric learning from a crowd of respondents. |
Gregory Canal; Blake Mason; Ramya Korlakai Vinayak; Robert Nowak; |
1168 | Asymptotics of Smoothed Wasserstein Distances in The Small Noise Regime Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the behavior of the Wasserstein-$2$ distance between discrete measures $\mu$ and $\nu$ in $\mathbb{R}^d$ when both measures are smoothed by small amounts of Gaussian noise. |
Yunzi Ding; Jonathan Niles-Weed; |
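For reference, the quantity the entry above studies can be written as the Wasserstein-2 distance between the two discrete measures after each is convolved with a small isotropic Gaussian; the notation below (noise level sigma, dimension d) is assumed here for illustration.

```latex
% Gaussian-smoothed Wasserstein-2 distance, analyzed in the small-noise
% regime (notation assumed for illustration).
\[
  W_2\bigl(\mu * \mathcal{N}(0,\sigma^{2} I_d),\;
           \nu * \mathcal{N}(0,\sigma^{2} I_d)\bigr),
  \qquad \mu,\nu \ \text{discrete measures on } \mathbb{R}^{d},
  \qquad \sigma \to 0 .
\]
```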
1169 | Towards A Unified Framework for Uncertainty-aware Nonlinear Variable Selection with Theoretical Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a simple and unified framework for nonlinear variable importance estimation that incorporates uncertainty in the prediction function and is compatible with a wide range of machine learning models (e.g., tree ensembles, kernel methods, neural networks, etc). |
Wenying Deng; Beau Coker; Rajarshi Mukherjee; Jeremiah Liu; Brent Coull; |
1170 | Noise Attention Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a simple yet effective method that automatically distinguishes the mislabeled samples and prevents the model from memorizing them, named Noise Attention Learning. |
Yangdi Lu; Yang Bo; Wenbo He; |
1171 | Unified Optimal Transport Framework for Universal Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Moreover, they cannot recognize different categories among target-private samples as these private samples are treated as a whole. In this paper, we propose to use Optimal Transport (OT) to handle these issues under a unified framework, namely UniOT. |
Wanxing Chang; Ye Shi; Hoang Tuan; Jingya Wang; |
1172 | Learning Individualized Treatment Rules with Many Treatments: A Supervised Clustering Approach Using Adaptive Fusion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For computation, we propose an efficient algorithm based on accelerated proximal gradient and further develop a novel group-lasso based algorithm for variable selection to boost the performance. |
Haixu Ma; Donglin Zeng; Yufeng Liu; |
1173 | Tensor Wheel Decomposition and Its Tensor Completion Application Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel TN decomposition, dubbed tensor wheel (TW) decomposition, in which a high-order tensor is represented by a set of latent factors mapped into a specific wheel topology. |
Zhong-Cheng Wu; Ting-Zhu Huang; Liang-Jian Deng; Hong-Xia Dou; Deyu Meng; |
1174 | Rethinking Value Function Learning for Generalization in Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce Delayed-Critic Policy Gradient (DCPG), which implicitly penalizes the value estimates by training the value function less frequently with more training data compared to the policy. |
Seungyong Moon; JunYeong Lee; Hyun Oh Song; |
1175 | Transferring Fairness Under Distribution Shifts Via Fair Consistency Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a theory-guided algorithm to transfer fairness under distribution shifts. |
Bang An; Zora Che; Mucong Ding; Furong Huang; |
1176 | SPD: Synergy Pattern Diversifying Oriented Unsupervised Multi-agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Synergy Pattern Diversifying Oriented Unsupervised Multi-agent Reinforcement Learning (SPD) to learn generic coordination policies for agents with no extrinsic reward. |
Yuhang Jiang; Jianzhun Shao; Shuncheng He; Hongchang Zhang; Xiangyang Ji; |
1177 | Lifting The Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the Bayesian regret of the renowned Thompson Sampling algorithm in contextual bandits with binary losses and adversarially-selected contexts. |
Gergely Neu; Iuliia Olkhovskaia; Matteo Papini; Ludovic Schwartz; |
1178 | Active Labeling: Streaming Stochastic Gradients Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: After formalizing the “active labeling” problem, which focuses on active learning with partial supervision, we provide a streaming technique that provably minimizes the ratio of generalization error over the number of samples. We illustrate our technique in depth for robust regression. |
Vivien Cabannes; Francis Bach; Vianney Perchet; Alessandro Rudi; |
1179 | Private Isotonic Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we consider the problem of differentially private (DP) algorithms for isotonic regression. |
Badih Ghazi; Pritish Kamath; Ravi Kumar; Pasin Manurangsi; |
1180 | Anonymized Histograms in Intermediate Privacy Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide an algorithm with a nearly matching error guarantee of $\tilde{O}_\varepsilon(\sqrt{n})$ in the shuffle DP and pan-private models. |
Badih Ghazi; Pritish Kamath; Ravi Kumar; Pasin Manurangsi; |
1181 | Learning to Mitigate AI Collusion on Economic Platforms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we demonstrate that reinforcement learning (RL) can also be used by platforms to learn buy box rules that are effective in preventing collusion by RL sellers. |
Eric Mibuari; Gianluca Brero; David Parkes; Nicolas Lepore; |
1182 | Attention-based Neural Cellular Automata Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by Transformer-based architectures, our work presents a new class of attention-based NCAs formed using a spatially localized—yet globally organized—self-attention scheme. |
Mattie Tesfaldet; Derek Nowrouzezahrai; Chris Pal; |
1183 | Learning A Condensed Frame for Memory-Efficient Video Class-Incremental Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, only a few bulky videos can be stored due to the limited memory. To address this problem, we propose FrameMaker, a memory-efficient video class-incremental learning approach that learns to produce a condensed frame for each selected video. |
Yixuan Pei; Zhiwu Qing; Jun CEN; Xiang Wang; Shiwei Zhang; Yaxiong Wang; Mingqian Tang; Nong Sang; Xueming Qian; |
1184 | Regret Bounds of Cooperative Thompson Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the concurrent reinforcement learning problem where $n$ agents simultaneously learn to make decisions in the same environment by sharing experience (i.e. data) with each other. |
Yan Chen; Qinxun Bai; Perry Dong; Maria Dimakopoulou; Wei Xu; Zhengyuan Zhou; |
1185 | Certifying Some Distributional Fairness with Subpopulation Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a general framework for certifying the distributional fairness of a trained model based on a fairness-constrained distribution. |
Mintong Kang; Linyi Li; Maurice Weber; Yang Liu; Ce Zhang; Bo Li; |
1186 | End-to-End Learning to Index and Search in Large Output Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel method ELIAS which relaxes the tree-based index to a specialized weighted graph based index which is learned end-to-end with the final task objective. |
Nilesh Gupta; Patrick Chen; Hsiang-Fu Yu; Cho-Jui Hsieh; Inderjit Dhillon; |
1187 | The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such practice is often brittle and leads to a so-called stability-efficiency dilemma: increasing the batch sizes and learning rates leads to better training efficiency but can also result in training instability, leading to poor generalization accuracy or failed runs. To better understand this phenomenon, we conduct an in-depth analysis of large-scale pre-training experiments replicating the GPT-2 model with a public dataset. |
Conglong Li; Minjia Zhang; Yuxiong He; |
1188 | I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose I2DFormer, a novel transformer-based ZSL framework that jointly learns to encode images and documents by aligning both modalities in a shared embedding space. |
Muhammad Ferjad Naeem; Yongqin Xian; Luc V Gool; Federico Tombari; |
1189 | On The Complexity of Adversarial Decision Making Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider a general adversarial decision making framework that encompasses (structured) bandit problems with adversarial rewards and reinforcement learning problems with adversarial dynamics. |
Dylan J Foster; Alexander Rakhlin; Ayush Sekhari; Karthik Sridharan; |
1190 | Point Transformer V2: Grouped Vector Attention and Improved Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inheriting the advantages of both learnable weight vector and multi-head attention, we present a highly effective implementation of grouped vector attention with a novel grouped weight encoding layer. |
Xiaoyang Wu; Yixing Lao; Li Jiang; Xihui Liu; Hengshuang Zhao; |
1191 | Improving Certified Robustness Via Statistical Learning with Logical Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Given that existing pure data-driven statistical approaches have reached a bottleneck, in this paper, we propose to integrate statistical ML models with knowledge (expressed as logical rules) as a reasoning component using Markov logic networks (MLN), so as to further improve the overall certified robustness. |
Zhuolin Yang; Zhikuan Zhao; Boxin Wang; Jiawei Zhang; Linyi Li; Hengzhi Pei; Bojan Karlaš; Ji Liu; Heng Guo; Ce Zhang; Bo Li; |
1192 | Rethinking Image Restoration for Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we present an ADAM-like adversarial attack to generate pseudo ground truth for restoration training. |
Shangquan Sun; Wenqi Ren; Tao Wang; Xiaochun Cao; |
1193 | Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we analyze the class-imbalanced learning problem through the lens of loss landscape. |
Harsh Rangwani; Sumukh K Aithal; Mayank Mishra; Venkatesh Babu R; |
1194 | Few-Shot Audio-Visual Learning of Environment Acoustics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Towards that goal, we introduce a transformer-based method that uses self-attention to build a rich acoustic context, then predicts RIRs of arbitrary query source-receiver locations through cross-attention. |
Sagnik Majumder; Changan Chen; Ziad Al-Halah; Kristen Grauman; |
1195 | Understanding The Eluder Dimension Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For Boolean-valued function classes, we obtain a characterization of the eluder dimension in terms of star number and threshold dimension, quantities which are relevant in active learning and online learning respectively. |
Gene Li; Pritish Kamath; Dylan J Foster; Nati Srebro; |
1196 | Adaptive Oracle-Efficient Online Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: But despite the benefits of computational feasibility, most oracle-efficient algorithms exhibit one major limitation: while performing well in worst-case settings, they do not adapt well to friendly environments. In this paper we consider two such friendly scenarios, (a) "small-loss" problems and (b) IID data. |
Guanghui Wang; Zihao Hu; Vidya Muthukumar; Jacob Abernethy; |
1197 | Meta-Reinforcement Learning with Self-Modifying Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: On the contrary, biological synaptic plasticity is persistent and manifold, and has been hypothesized to play a key role in executive functions such as working memory and cognitive flexibility, potentially supporting more efficient and generic learning abilities. Inspired by this, we propose to build networks with dynamic weights, able to continually perform self-reflexive modification as a function of their current synaptic state and action-reward feedback, rather than a fixed network configuration. |
Mathieu Chalvidal; Thomas Serre; Rufin VanRullen; |
1198 | Efficient and Effective Optimal Transport-Based Biclustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we leverage optimal transport (OT) which has gained momentum in the machine learning community to propose a novel and scalable biclustering model that generalizes several classical biclustering approaches. |
Chakib Fettal; Lazhar Labiod; Mohamed Nadif; |
1199 | Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an Anchor-changing Regularized Natural Policy Gradient (ARNPG) framework, which can systematically incorporate ideas from well-performing first-order methods into the design of policy optimization algorithms for multi-objective MDP problems. |
Ruida Zhou; Tao Liu; Dileep Kalathil; P. R. Kumar; Chao Tian; |
1200 | On The Effectiveness of Fine-tuning Versus Meta-reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This calls into question the benefits of meta-learning approaches in reinforcement learning as well, which typically come at the cost of high complexity. We therefore investigate meta-RL approaches in a variety of vision-based benchmarks, including Procgen, RLBench, and Atari, where evaluations are made on completely novel tasks. |
Mandi Zhao; Pieter Abbeel; Stephen James; |
1201 | Differentially Private Covariance Revisited Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Abstract: In this paper, we present two new algorithms for covariance estimation under concentrated differential privacy (zCDP). The first algorithm achieves a Frobenius error of … |
Wei Dong; Yuting Liang; Ke Yi; |
1202 | Gradient Methods Provably Converge to Non-Robust Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we identify natural settings where depth-$2$ ReLU networks trained with gradient flow are provably non-robust (susceptible to small adversarial $\ell_2$-perturbations), even when robust networks that classify the training dataset correctly exist. |
Gal Vardi; Gilad Yehudai; Ohad Shamir; |
1203 | On The Frequency-bias of Coordinate-MLPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate this behavior through a Fourier lens and uncover that as the bandwidth of a coordinate-MLP is enhanced, lower frequencies tend to get suppressed unless a suitable prior is provided explicitly. Based on these insights, we propose a simple regularization technique that can mitigate the above problem, which can be incorporated into existing networks without any architectural modifications. |
Sameera Ramasinghe; Lachlan E. MacDonald; Simon Lucey; |
1204 | VeriDark: A Large-Scale Benchmark for Authorship Verification on The Dark Web Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address these issues, we release VeriDark: a benchmark comprised of three large scale authorship verification datasets and one authorship identification dataset obtained from user activity from either Dark Web related Reddit communities or popular illicit Dark Web market forums. |
Andrei Manolache; Florin Brad; Antonio Barbalau; Radu Tudor Ionescu; Marius Popescu; |
1205 | Fair and Efficient Allocations Without Obvious Manipulations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the interplay of fairness and efficiency under a relaxation of truthfulness called non-obvious manipulability (NOM), recently proposed by Troyan and Morrill (2020). |
Alexandros Psomas; Paritosh Verma; |
1206 | Structure-Preserving 3D Garment Modeling with Neural Sewing Machines Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel Neural Sewing Machine (NSM), a learning-based framework for structure-preserving 3D garment modeling, which is capable of modeling and learning representations for garments with diverse shapes and topologies and is successfully applied to 3D garment reconstruction and controllable manipulation. |
Xipeng Chen; Guangrun Wang; Dizhong Zhu; Xiaodan Liang; Philip Torr; Liang Lin; |
1207 | ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Prior approaches are either based on multi-stage optimization schemes, impacting the predictive performance of the model, or produce explanations that are not transparent or trustworthy, or that do not capture the diversity of the data. To address these shortcomings, we propose ProtoVAE, a variational autoencoder-based framework that learns class-specific prototypes in an end-to-end manner and enforces trustworthiness and diversity by regularizing the representation space and introducing an orthonormality constraint. |
Srishti Gautam; Ahcène Boubekki; Stine Hansen; Suaiba Salahuddin; Robert Jenssen; Marina Höhne; Michael Kampffmeyer; |
1208 | VisFIS: Improved Visual Feature Importance Supervision with Right-for-Right-Reason Objectives Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that model FI supervision can meaningfully improve VQA model accuracy as well as performance on several Right-for-the-Right-Reason (RRR) metrics by optimizing for four key model objectives: (1) accurate predictions given limited but sufficient information (Sufficiency); (2) max-entropy predictions given no important information (Uncertainty); (3) invariance of predictions to changes in unimportant features (Invariance); and (4) alignment between model FI explanations and human FI explanations (Plausibility). |
Zhuofan Ying; Peter Hase; Mohit Bansal; |
1209 | Joint Entropy Search For Maximally-Informed Bayesian Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Joint Entropy Search (JES), a novel information-theoretic acquisition function that considers an entirely new quantity, namely the entropy over the joint optimal probability density over both input and output space. |
Carl Hvarfner; Frank Hutter; Luigi Nardi; |
1210 | Instability and Local Minima in GAN Training with Kernel Discriminators Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite their empirical success, the training of GANs is not fully understood due to the joint training of the generator and discriminator. This paper analyzes these joint dynamics when the true samples, as well as the generated samples, are discrete, finite sets, and the discriminator is kernel-based. |
Evan Becker; Parthe Pandit; Sundeep Rangan; Alyson Fletcher; |
1211 | Deep Counterfactual Estimation with Categorical Background Variables Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Nevertheless, in this work, we show that under the assumption that the main latent contributors to the treatment responses are categorical, the counterfactuals can still be reliably predicted. |
Edward De Brouwer; |
1212 | Beyond Adult and COMPAS: Fairness in Multi-Class Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider the problem of producing fair probabilistic classifiers for multi-class classification tasks. |
Wael Alghamdi; Hsiang Hsu; Haewon Jeong; Hao Wang; Peter Michalak; Shahab Asoodeh; Flavio Calmon; |
1213 | Consistent Interpolating Ensembles Via The Manifold-Hilbert Kernel Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We devise an ensemble classification method that simultaneously interpolates the training data, and is consistent for a broad class of data distributions. |
Yutong Wang; Clay Scott; |
1214 | PRO: Patch-level Rendering and Optimization for Infinite Visual Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, since they fail to model global dependencies between patches, the quality and consistency of the generation can be limited. To address this issue, we propose PRO, a patch-level “render-and-optimize” strategy for infinite visual synthesis. |
Jian Liang; Chenfei Wu; Xiaowei Hu; Zhe Gan; Jianfeng Wang; Lijuan Wang; Zicheng Liu; Yuejian Fang; Nan Duan; |
1215 | Bridge The Gap Between Architecture Spaces Via A Cross-Domain Predictor Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Particularly, we propose a progressive subspace adaptation strategy to address the domain discrepancy between the source architecture space and the target space. |
Yuqiao Liu; Yehui Tang; Zeqiong Lv; Yunhe Wang; Yanan Sun; |
1216 | Parametrically Retargetable Decision-Makers Tend To Seek Power Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider a range of models of AI decision-making, from optimal, to random, to choices informed by learning and interacting with an environment. |
Alex Turner; Prasad Tadepalli; |
1217 | A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present the first stepsize schedule for the Newton method resulting in fast global and local convergence guarantees. |
Slavomír Hanzely; Dmitry Kamzolov; Dmitry Pasechnyuk; Alexander Gasnikov; Peter Richtarik; Martin Takac; |
1218 | Adjoint-aided Inference of Gaussian Process Driven Differential Equations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we show how the adjoint of a linear system can be used to efficiently infer forcing functions modelled as GPs, after using a truncated basis expansion of the GP kernel. |
Paterne GAHUNGU; Christopher Lanyon; Mauricio A Álvarez; Engineer Bainomugisha; Michael T Smith; Richard Wilkinson; |
1219 | Trust Region Policy Optimization with Optimal Transport Discrepancies: Duality and Algorithm for Continuous Actions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we explore optimal transport discrepancies (which include the Wasserstein distance) to define trust regions, and we propose a novel algorithm – Optimal Transport Trust Region Policy Optimization (OT-TRPO) – for continuous state-action spaces. |
Antonio Terpin; Nicolas Lanzetti; Batuhan Yardim; Giorgia Ramponi; Florian Dorfler; |
1220 | 360-MLC: Multi-view Layout Consistency for Self-training and Hyper-parameter Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present 360-MLC, a self-training method based on multi-view layout consistency for finetuning monocular room-layout models using unlabeled 360-images only. |
Bolivar Solarte; Chin-Hsuan Wu; Yueh-Cheng Liu; Yi-Hsuan Tsai; Min Sun; |
1221 | Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we introduce a novel combination of learning rules and Neural ODEs to build continuous-time sequence processing nets that learn to manipulate short-term memory in rapidly changing synaptic connections of other nets. |
Kazuki Irie; Francesco Faccio; Jürgen Schmidhuber; |
1222 | SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE). |
Samar Khanna; Yezhen Cong; Chenlin Meng; Patrick Liu; Erik Rozi; Yutong He; Marshall Burke; David Lobell; Stefano Ermon; |
1223 | Receding Horizon Inverse Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents Receding Horizon Inverse Reinforcement Learning (RHIRL), a new IRL algorithm for high-dimensional, noisy, continuous systems with black-box dynamic models. |
Yiqing Xu; Wei Gao; David Hsu; |
1224 | Mask-based Latent Reconstruction for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose a simple yet effective self-supervised method, Mask-based Latent Reconstruction (MLR), to predict the complete state representations in the latent space from the observations with spatially and temporally masked pixels. |
Tao Yu; Zhizheng Zhang; Cuiling Lan; Yan Lu; Zhibo Chen; |
1225 | Robustness to Label Noise Depends on The Shape of The Noise Distribution in Feature Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Machine learning classifiers have been demonstrated, both empirically and theoretically, to be robust to label noise under certain conditions — notably the typical assumption is that label noise is independent of the features given the class label. We provide a theoretical framework that generalizes beyond this typical assumption by modeling label noise as a distribution over feature space. |
Diane Oyen; Michal Kucer; Nicolas Hengartner; Har Simrat Singh; |
1226 | Contrastive Neural Ratio Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a multiclass framework free from the bias inherent to NRE-B at the optimum, leaving us in a position to run diagnostics that practitioners depend on. |
Benjamin K Miller; Christoph Weniger; Patrick Forré; |
1227 | Kernel Attractor Networks: A Unifying Framework for Memory Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of training a neural network to store a set of patterns with maximal noise robustness. |
Georgios Iatropoulos; Johanni Brea; Wulfram Gerstner; |
1228 | Multi-agent Performative Prediction with Greedy Deployment and Consensus Seeking Agents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we formulate Multi-PfD as a decentralized optimization problem that minimizes a sum of loss functions, where each loss function is based on a distribution influenced by the local decision vector. |
Qiang LI; Chung-Yiu Yau; Hoi-To Wai; |
1229 | Improved Utility Analysis of Private CountSketch Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we consider the classical CountSketch, made differentially private with the Gaussian mechanism, and give an improved analysis of its estimation error. |
Rasmus Pagh; Mikkel Thorup; |
1230 | CogView2: Faster and Better Text-to-Image Generation Via Hierarchical Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we put forward a solution based on hierarchical transformers and local parallel autoregressive generation. |
Ming Ding; Wendi Zheng; Wenyi Hong; Jie Tang; |
1231 | Fused Orthogonal Alternating Least Squares for Tensor Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a multi-modes tensor clustering method that implements a fused version of the alternating least squares algorithm (Fused-Orth-ALS) for simultaneous tensor factorization and clustering. |
Jiacheng Wang; Dan Nicolae; |
1232 | Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes an algorithm, Bayesian optimistic optimization (BOO), which adopts a dynamic weighting technique for enforcing the constraint rather than explicitly solving a constrained optimization problem. |
Chenyang Wu; Tianci Li; Zongzhang Zhang; Yang Yu; |
1233 | Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability. |
Yunwen Lei; Rong Jin; Yiming Ying; |
1234 | Training Spiking Neural Networks with Event-driven Backpropagation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first analyze the commonly used temporal backpropagation training approach and prove that the sum of gradients remains unchanged between fully-connected and convolutional layers. Secondly, we show that the max pooling layer meets the above invariance rule, while the average pooling layer does not; it suffers from the gradient vanishing problem but can be revised to meet the requirement. |
Yaoyu Zhu; Zhaofei Yu; Wei Fang; Xiaodong Xie; Tiejun Huang; Timothée Masquelier; |
1235 | Why Do Artificially Generated Data Help Adversarial Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide statistical insights to explain why the artificially generated data improve adversarial training. |
Yue Xing; Qifan Song; Guang Cheng; |
1236 | Learning Partial Equivariances From Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce Partial G-CNNs: G-CNNs able to learn layer-wise levels of partial and full equivariance to discrete, continuous groups and combinations thereof during training. |
David W. Romero; Suhas Lohit; |
1237 | Fast Neural Kernel Embeddings for General Activations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Moreover, most prior works on neural kernels have focused on the ReLU activation, mainly due to its popularity but also due to the difficulty of computing such kernels for general activations. In this work, we overcome such difficulties by providing methods to work with general activations. |
Insu Han; Amir Zandieh; Jaehoon Lee; Roman Novak; Lechao Xiao; Amin Karbasi; |
1238 | Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a new compression framework which covers both weight pruning and quantization in a unified setting, is time- and space-efficient, and considerably improves upon the practical performance of existing post-training methods. |
Elias Frantar; Dan Alistarh; |
1239 | Off-Policy Evaluation for Action-Dependent Non-stationary Environments Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose, OPEN, an algorithm that uses a double application of counterfactual reasoning and a novel importance-weighted instrument-variable regression to obtain both a lower bias and a lower variance estimate of the structure in the changes of a policy’s past performances. |
Yash Chandak; Shiv Shankar; Nathaniel Bastian; Bruno da Silva; Emma Brunskill; Philip Thomas; |
1240 | SparCL: Sparse Continual Learning on The Edge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel framework called Sparse Continual Learning (SparCL), which is the first study that leverages sparsity to enable cost-effective continual learning on edge devices. |
Zifeng Wang; Zheng Zhan; Yifan Gong; Geng Yuan; Wei Niu; Tong Jian; Bin Ren; Stratis Ioannidis; Yanzhi Wang; Jennifer Dy; |
1241 | REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we observe in most state-of-the-art knowledge-based VQA methods: 1) visual features are extracted either from the whole image or in a sliding window manner for retrieving knowledge, and the important relationship within/among object regions is neglected; 2) visual features are not well utilized in the final answering model, which is counter-intuitive to some extent. Based on these observations, we propose a new knowledge-based VQA method REVIVE, which tries to utilize the explicit information of object regions not only in the knowledge retrieval stage but also in the answering model. |
Yuanze Lin; Yujia Xie; Dongdong Chen; Yichong Xu; Chenguang Zhu; Lu Yuan; |
1242 | Physics-Embedded Neural Networks: Graph Neural PDE Solvers with Mixed Boundary Conditions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present our approach, termed physics-embedded neural networks, which considers boundary conditions and predicts the state after a long time using an implicit method. |
Masanobu Horie; Naoto Mitsume; |
1243 | Neural Payoff Machines: Predicting Fair and Stable Payoff Allocations Among Team Members Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show how cooperative game-theoretic solutions can be distilled into a learned model by training neural networks to propose fair and stable payoff allocations. |
Daphne Cornelisse; Thomas Rood; Yoram Bachrach; Mateusz Malinowski; Tal Kachman; |
1244 | Best of Both Worlds Model Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of model selection in bandit scenarios in the presence of nested policy classes, with the goal of obtaining simultaneous adversarial and stochastic (best of both worlds) high-probability regret guarantees. |
Aldo Pacchiano; Christoph Dann; Claudio Gentile; |
1245 | Independence Testing for Bounded Degree Bayesian Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the following independence testing problem: given access to samples from a distribution $P$ over $\{0,1\}^n$, decide whether $P$ is a product distribution or whether it is $\varepsilon$-far in total variation distance from any product distribution. |
Arnab Bhattacharyya; Clément L Canonne; Qiping Yang; |
1246 | VisCo Grids: Surface Reconstruction with Viscosity and Coarea Grids Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal of this work is to show that replacing neural networks with simple grid functions, along with two novel geometric priors, achieves results comparable to INRs, with instant inference and improved training times. |
Albert Pumarola; Artsiom Sanakoyeu; Lior Yariv; Ali Thabet; Yaron Lipman; |
1247 | Combining Implicit and Explicit Regularization for Efficient Learning in Deep Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For deep linear neural networks, it has been shown that gradient descent/flow implicitly regularizes toward low-rank solutions in matrix completion/factorization tasks, similar to an accelerative pre-conditioning, whose effects become more pronounced with increased depth. In light of this, we propose an explicit penalty that mirrors the rank minimization behavior and generalization performance independently of depth, but interestingly only takes effect with Adam and some of its close variants—it outperforms many approaches in matrix completion and is robust to a wide range of parameter and data regimes. |
Dan Zhao; |
1248 | The Pitfalls of Regularization in Off-Policy TD Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a series of new counterexamples to show that the instability and unbounded error of TD methods is not solved by regularization. |
Gaurav Manek; J. Zico Kolter; |
1249 | Fairness Without Demographics Through Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such constraints could be too strict to achieve the expected improvement in group fairness, and could lead to a large decrease in accuracy. In light of these limitations, in this paper, we propose to solve the problem from a new perspective, i.e., through knowledge distillation. |
Junyi Chai; Taeuk Jang; Xiaoqian Wang; |
1250 | Recurrent Memory Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose and study a memory-augmented segment-level recurrent Transformer (RMT). |
Aydar Bulatov; Yury Kuratov; Mikhail Burtsev; |
1251 | Neural Attentive Circuits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a general purpose, yet modular neural architecture called Neural Attentive Circuits (NACs) that jointly learns the parameterization and a sparse connectivity of neural modules without using domain knowledge. |
Martin Weiss; Nasim Rahaman; Francesco Locatello; Chris Pal; Yoshua Bengio; Bernhard Schölkopf; Li Erran Li; Nicolas Ballas; |
1252 | Self-supervised Surround-view Depth Estimation with Volumetric Feature Fusion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a self-supervised depth estimation method using a unified volumetric feature encoded from surround-view. |
Jung-Hee Kim; Junhwa Hur; Tien Phuoc Nguyen; Seong-Gyun Jeong; |
1253 | Algorithms with Prediction Portfolios Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study the use of multiple predictors for a number of fundamental problems, including matching, load balancing, and non-clairvoyant scheduling, which have been well-studied in the single predictor setting. For each of these problems we introduce new algorithms that take advantage of multiple predictors, and prove bounds on the resulting performance. |
Michael Dinitz; Sungjin Im; Thomas Lavastida; Benjamin Moseley; Sergei Vassilvitskii; |
1254 | Multi-fidelity Monte Carlo: A Pseudo-marginal Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we describe a class of asymptotically exact multi-fidelity MCMC algorithms for the setting where a sequence of likelihoods of increasing fidelity can be computed that approximates the high-fidelity likelihood. |
Diana Cai; Ryan Adams; |
1255 | Fine-tuning Language Models to Find Consensus Among Humans with Diverse Preferences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We fine-tune a 70 billion parameter language model to generate statements that align with small groups of humans with diverse opinions. |
Michiel Bakker; Martin Chadwick; Hannah Sheahan; Michael Tessler; Lucy Campbell-Gillingham; Jan Balaguer; Nat McAleese; Amelia Glaese; John Aslanides; Matt Botvinick; Christopher Summerfield; |
1256 | Few-shot Relational Reasoning Via Pretraining of Connection Subgraph Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we propose Connection Subgraph Reasoner (CSR), which can make predictions for the target few-shot task directly without the need for pre-training on the human curated set of training tasks. |
Qian Huang; Hongyu Ren; Jure Leskovec; |
1257 | Transformer Memory As A Differentiable Search Index Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce the Differentiable Search Index (DSI), a new paradigm that learns a text-to-text model that maps string queries directly to relevant docids; in other words, a DSI model answers queries directly using only its parameters, dramatically simplifying the whole retrieval process. |
Yi Tay; Vinh Tran; Mostafa Dehghani; Jianmo Ni; Dara Bahri; Harsh Mehta; Zhen Qin; Kai Hui; Zhe Zhao; Jai Gupta; Tal Schuster; William Cohen; Donald Metzler; |
1258 | On Non-Linear Operators for Geometric Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View |
Grégoire Sergeant-Perthuis; Jakob Maier; Joan Bruna; Edouard Oyallon; |
1259 | Task Discovery: Finding The Tasks That Neural Networks Generalize on Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: What do they look like, or do they show anything? This is the question we address in this paper. We propose a task discovery framework that automatically finds examples of such tasks via optimizing a generalization-based quantity called agreement score. |
Andrei Atanov; Andrey Filatov; Teresa Yeo; Ajay Sohmshetty; Amir Zamir; |
1260 | Identification, Amplification and Measurement: A Bridge to Gaussian Differential Privacy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an efficient method for GDP algorithms to narrow down possible values of an optimal privacy measurement, $\mu$, with an arbitrarily small and quantifiable margin of error. |
Yi Liu; Ke Sun; Bei Jiang; Linglong Kong; |
1261 | Learning Optimal Flows for Non-Equilibrium Importance Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Many applications in computational sciences and statistical inference require the computation of expectations with respect to complex high-dimensional distributions with unknown normalization constants, as well as the estimation of these constants. Here we develop a method to perform these calculations based on generating samples from a simple base distribution, transporting them along the flow generated by a velocity field, and performing averages along these flowlines. |
Yu Cao; Eric Vanden-Eijnden; |
1262 | Incentivizing Combinatorial Bandit Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove that Thompson Sampling, when applied to combinatorial semi-bandits, is incentive-compatible when initialized with a sufficient number of samples of each arm (where this number is determined in advance by the Bayesian prior). |
Xinyan Hu; Dung Ngo; Aleksandrs Slivkins; Steven Wu; |
1263 | Instance-optimal PAC Algorithms for Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on the stochastic bandit problem in the $(\epsilon,\delta)$-PAC setting: given a policy class $\Pi$ the goal of the learner is to return a policy $\pi\in \Pi$ whose expected reward is within $\epsilon$ of the optimal policy with probability greater than $1-\delta$. |
Zhaoqi Li; Lillian Ratliff; Houssam Nassif; Kevin Jamieson; Lalit Jain; |
1264 | Object Representations As Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This often requires the use of iterative refinement procedures that break symmetries among equally plausible explanations for the data, but most prior works differentiate through the unrolled refinement process, which can make optimization exceptionally challenging. In this work, we observe that such iterative refinement methods can be made differentiable by means of the implicit function theorem, and develop an implicit differentiation approach that improves the stability and tractability of training such models by decoupling the forward and backward passes. |
Michael Chang; Tom Griffiths; Sergey Levine; |
1265 | A Closer Look at Offline RL Agents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the evaluation results, a novel offline RL algorithm is proposed by a simple modification of IQL and achieves SOTA performance. |
Yuwei Fu; Di Wu; Benoit Boulet; |
1266 | Efficient Multi-agent Communication Via Self-supervised Information Aggregation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To that end, we propose Multi-Agent communication via Self-supervised Information Aggregation (MASIA), with which agents can ground the received message into compact representations and extract the most relevant part to augment the local policy. |
Cong Guan; Feng Chen; Lei Yuan; Chenghe Wang; Hao Yin; Zongzhang Zhang; Yang Yu; |
1267 | Subgame Solving in Adversarial Team Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we extend the successful approach of solving huge two-player zero-sum games, where a blueprint strategy is computed offline by using an abstract version of the game and then refined online, that is, during a playthrough. |
Brian Zhang; Luca Carminati; Federico Cacciamani; Gabriele Farina; Pierriccardo Olivieri; Nicola Gatti; Tuomas Sandholm; |
1268 | Near-Optimal No-Regret Learning Dynamics for General Convex Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Whether $O(\mathrm{polylog} T)$ regret bounds can be obtained for general convex and compact strategy sets—as is the case in many fundamental models in economics and multiagent systems—while retaining efficient strategy updates is an important open question. In this paper, we answer this in the positive by establishing the first uncoupled learning algorithm with $O(\log T)$ per-player regret in general convex games, that is, games with concave utility functions supported on arbitrary convex and compact strategy sets. |
Gabriele Farina; Ioannis Anagnostides; Haipeng Luo; Chung-Wei Lee; Christian Kroer; Tuomas Sandholm; |
1269 | Learning to Reconstruct Missing Data from Spatiotemporal Graphs with Sparse Observations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In particular, we propose a novel class of attention-based architectures that, given a set of highly sparse discrete observations, learn a representation for points in time and space by exploiting a spatiotemporal diffusion architecture aligned with the imputation task. |
Ivan Marisca; Andrea Cini; Cesare Alippi; |
1270 | Self-Supervised Pretraining for Large-Scale Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new self-supervised pretraining method that targets large-scale 3D scenes. |
Zaiwei Zhang; Min Bai; Li Erran Li; |
1271 | Sharing Knowledge for Meta-learning with Feature Descriptions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a meta-learning method that shares knowledge across supervised learning tasks using feature descriptions written in natural language, which have not been used in the existing meta-learning methods. |
Tomoharu Iwata; Atsutoshi Kumagai; |
1272 | Simple Unsupervised Object-Centric Learning for Complex and Naturalistic Videos Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose STEVE, an unsupervised model for object-centric learning in videos. |
Gautam Singh; Yi-Fu Wu; Sungjin Ahn; |
1273 | Active Learning with Safety Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an adaptive experimental design-based algorithm, which we show efficiently trades off between the difficulty of showing an arm is unsafe vs suboptimal. |
Romain Camilleri; Kevin Jamieson; Jamie Morgenstern; Lalit Jain; Andrew Wagenmaker; |
1274 | Constrained GPI for Zero-Shot Transfer in Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal of this work is to improve the transfer by further bounding the value approximation errors of successor features on the new target tasks. |
Jaekyeom Kim; Seohong Park; Gunhee Kim; |
1275 | A Unified Hard-Constraint Framework for Solving Geometrically Complex PDEs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a unified hard-constraint framework for solving geometrically complex PDEs with neural networks, where the most commonly used Dirichlet, Neumann, and Robin boundary conditions (BCs) are considered. |
Songming Liu; Hao Zhongkai; Chengyang Ying; Hang Su; Jun Zhu; Ze Cheng; |
1276 | Rethinking Knowledge Graph Evaluation Under The Open-World Assumption Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study KGC evaluation under a more realistic setting, namely the open-world assumption, where unknown triplets are considered to include many missing facts not included in the training or test sets. |
Haotong Yang; Zhouchen Lin; Muhan Zhang; |
1277 | So3krates – Self-attention for Higher-order Geometric Interactions on Arbitrary Length-scales Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proposes a modified attention mechanism adapted to the underlying physics, which makes it possible to recover the relevant non-local effects. |
Thorben Frank; Oliver Unke; Klaus-Robert Müller; |
1278 | Group Meritocratic Fairness in Linear Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a notion of fairness that states that the agent’s policy is fair when it selects a candidate with the highest relative rank, which measures how good the reward is compared to candidates from the same group. |
Riccardo Grazzi; Arya Akhavan; Massimiliano Pontil; John IF Falk; Leonardo Cella; |
1279 | Beyond IID: Data-driven Decision-making in Heterogeneous Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new framework in which historical samples are generated from unknown and different distributions, which we dub heterogeneous environments. |
Omar Besbes; Will Ma; Omar Mouchtaki; |
1280 | Generalization Bounds for Gradient Methods Via Discrete and Continuous Prior Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove new generalization bounds for FGD (a variant of GD) and SGLD. |
Jian Li; Xuanyuan Luo; |
1281 | Improving Intrinsic Exploration with Language Abstractions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we explore natural language as a general medium for highlighting relevant abstractions in an environment. |
Jesse Mu; Victor Zhong; Roberta Raileanu; Minqi Jiang; Noah Goodman; Tim Rocktäschel; Edward Grefenstette; |
1282 | Log-Concave and Multivariate Canonical Noise Distributions for Differential Privacy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider the existence and construction of log-concave CNDs as well as multivariate CNDs. |
Jordan Awan; Jinshuo Dong; |
1283 | Benign Overfitting in Two-layer Convolutional Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the benign overfitting phenomenon in training a two-layer convolutional neural network (CNN). |
Yuan Cao; Zixiang Chen; Misha Belkin; Quanquan Gu; |
1284 | Using Mixup As A Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that the effectiveness of the well-celebrated Mixup (Zhang et al., 2018) can be further improved if, instead of using it as the sole learning objective, it is utilized as an additional regularizer to the standard cross-entropy loss. |
Francesco Pinto; Harry Yang; Ser Nam Lim; Philip Torr; Puneet Dokania; |
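The highlight above treats mixup as an auxiliary regularizer added to the standard cross-entropy loss rather than as the sole objective. The following is a minimal sketch of that general idea; the Beta parameter, regularizer weight, and toy model are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch (assumed alpha and weighting): add a mixup term to the
# standard cross-entropy on clean examples instead of replacing it.
import numpy as np
import torch
import torch.nn.functional as F


def mixup_regularized_loss(model, x, y, alpha=1.0, reg_weight=1.0):
    # Standard cross-entropy on the clean batch.
    clean_loss = F.cross_entropy(model(x), y)

    # Mixup term: convex combination of shuffled inputs and labels.
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    logits_mix = model(x_mix)
    mix_loss = lam * F.cross_entropy(logits_mix, y) + \
        (1.0 - lam) * F.cross_entropy(logits_mix, y[perm])

    return clean_loss + reg_weight * mix_loss


model = torch.nn.Linear(32, 10)            # stand-in classifier
x = torch.randn(64, 32)                    # synthetic batch
y = torch.randint(0, 10, (64,))
loss = mixup_regularized_loss(model, x, y)
loss.backward()
```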
1285 | The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: In this paper, we revisit the smooth and strongly-convex-strongly-concave minimax optimization problem. Zhang et al. (2021) and Ibrahim et al. (2020) established the lower bound … |
Dmitry Kovalev; Alexander Gasnikov; |
1286 | Hiding Images in Deep Probabilistic Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we describe a different computational framework to hide images in deep probabilistic models. |
Haoyu Chen; Linqi Song; Zhenxing Qian; Xinpeng Zhang; Kede Ma; |
1287 | Capturing Failures of Large Language Models Via Human Cognitive Biases Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Large language models generate complex, open-ended outputs: instead of outputting a class label they write summaries, generate dialogue, or produce working code. In order to assess the reliability of these open-ended generation systems, we aim to identify qualitative categories of erroneous behavior, beyond identifying individual errors. |
Erik Jones; Jacob Steinhardt; |
1288 | Provably Feedback-Efficient Reinforcement Learning Via Active Reward Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, despite achieving great empirical successes, HiL RL usually requires \emph{too much} feedback from a human teacher and also suffers from insufficient theoretical understanding. In this paper, we focus on addressing this issue from a theoretical perspective, aiming to provide provably feedback-efficient algorithmic frameworks that take human-in-the-loop to specify rewards of given tasks. |
Dingwen Kong; Lin Yang; |
1289 | Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present the first algorithms that achieve near-optimal $\sqrt{K + D}$ regret, where $K$ is the number of episodes and $D = \sum_{k=1}^K d^k$ is the total delay, significantly improving upon the best known regret bound of $(K + D)^{2/3}$. |
Tiancheng Jin; Tal Lancewicki; Haipeng Luo; Yishay Mansour; Aviv Rosenberg; |
1290 | AutoMS: Automatic Model Selection for Novelty Detection with Error Rate Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, due to the absence of labeled data for model evaluation and comparison, there is a lack of systematic approaches that are able to select a “best” model/detector (i.e., the algorithm as well as its hyperparameters) and achieve certain error rate control simultaneously. In this paper, we introduce a unified data-driven procedure to address this issue. |
Yifan Zhang; Haiyan Jiang; Haojie Ren; Changliang Zou; Dejing Dou; |
1291 | Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the first computationally efficient horizon-free algorithm for linear mixture MDPs, which achieves the optimal $\tilde O(d\sqrt{K} +d^2)$ regret up to logarithmic factors. |
Dongruo Zhou; Quanquan Gu; |
1292 | Active Learning Helps Pretrained Models Learn The Intended Task Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate whether pretrained models are better active learners, capable of disambiguating between the possible tasks a user may be trying to specify. |
Alex Tamkin; Dat Nguyen; Salil Deshpande; Jesse Mu; Noah Goodman; |
1293 | Understanding Deep Contrastive Learning Via Coordinate-wise Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a unified formulation for a broad family of contrastive losses, including InfoNCE, propose novel losses, and show contrastive learning with deep linear network can be equivalent to PCA. |
Yuandong Tian; |
1294 | Mixture-of-Experts with Expert Choice Routing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Prior work allocates a fixed number of experts to each token using a top-k function regardless of the relative importance of different tokens. To address this, we propose a heterogeneous mixture-of-experts employing an expert choice method. |
Yanqi Zhou; Tao Lei; Hanxiao Liu; Nan Du; Yanping Huang; Vincent Zhao; Andrew Dai; zhifeng Chen; Quoc V Le; James Laudon; |
1295 | Left Heavy Tails and The Effectiveness of The Policy and Value Networks in DNN-based Best-first Search for Sokoban Planning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To further understand the phenomena, we studied the cost distribution of the search algorithms and found that Sokoban instances can have heavy-tailed runtime distributions, with tails both on the left and right-hand sides. |
Dieqiao Feng; Carla Gomes; Bart Selman; |
1296 | Beyond The Return: Off-policy Function Estimation Under User-specified Error-measuring Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide guarantees for off-policy function estimation under only realizability, by imposing proper regularization on the MIS objectives. |
Audrey Huang; Nan Jiang; |
1297 | MinVIS: A Minimal Video Instance Segmentation Framework Without Video-based Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose MinVIS, a minimal video instance segmentation (VIS) framework that achieves state-of-the-art VIS performance with neither video-based architectures nor training procedures. |
De-An Huang; Zhiding Yu; Anima Anandkumar; |
1298 | Signal Processing for Implicit Neural Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a pilot study on the question: how to directly modify an INR without explicit decoding? |
Dejia Xu; Peihao Wang; Yifan Jiang; Zhiwen Fan; Zhangyang Wang; |
1299 | Scalable Algorithm Synthesis with Recurrent Networks: Extrapolation Without Overthinking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten. |
Arpit Bansal; Avi Schwarzschild; Eitan Borgnia; Zeyad Emam; Furong Huang; Micah Goldblum; Tom Goldstein; |
1300 | GraB: Finding Provably Better Data Permutations Than Random Reshuffling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To reduce the memory overhead, we leverage discrepancy minimization theory to propose an online Gradient Balancing algorithm (GraB) that enjoys the same rate as herding, while reducing the memory usage from $O(nd)$ to just $O(d)$ and computation from $O(n^2)$ to $O(n)$, where $d$ denotes the model dimension. |
Yucheng Lu; Wentao Guo; Christopher De Sa; |
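To make the gradient-balancing idea referenced in this highlight concrete, here is a toy sketch of a herding-style sign assignment: greedily sign each centered gradient so that a running sum stays small, then use the signs to build the next epoch's order. This is only a sketch of the general idea under my own reading; GraB itself performs this online during training with $O(d)$ memory and $O(n)$ computation, and its exact ordering rule may differ.

```python
import numpy as np

def balanced_order(grads):
    """Toy herding-style balancing: greedily choose a sign for each centered
    gradient so the running signed sum stays small, then place positively
    signed examples at the front and negatively signed ones at the back
    (reversed). A sketch of the idea only, not the paper's online algorithm."""
    grads = np.asarray(grads, dtype=float)
    centered = grads - grads.mean(axis=0)
    running = np.zeros(grads.shape[1])
    front, back = [], []
    for i, c in enumerate(centered):
        if np.linalg.norm(running + c) <= np.linalg.norm(running - c):
            running += c
            front.append(i)
        else:
            running -= c
            back.append(i)
    return front + back[::-1]

print(balanced_order(np.random.default_rng(0).standard_normal((6, 3))))
```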
1301 | Linear Label Ranking with Bounded Noise Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We focus on the fundamental case of learning linear sorting functions (LSFs) under Gaussian marginals: $x$ is sampled from the $d$-dimensional standard normal and the ground truth ranking $\sigma^\star(x)$ is the ordering induced by sorting the coordinates of the vector $W^\star x$, where $W^\star \in \mathbb{R}^{k \times d}$ is unknown. |
Dimitris Fotakis; Alkis Kalavasis; Vasilis Kontonis; Christos Tzamos; |
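The linear sorting function in this highlight has a direct one-line implementation: the ranking is the argsort of $W x$. A tiny sketch under made-up dimensions (the noise model and the learning algorithm are not shown):

```python
import numpy as np

# A linear sorting function (LSF) as described in the highlight: the ranking
# sigma*(x) is the ordering induced by sorting the coordinates of W @ x.
rng = np.random.default_rng(0)
d, k = 5, 3
W_star = rng.standard_normal((k, d))

def lsf_ranking(W, x):
    scores = W @ x
    return np.argsort(-scores)  # label indices from highest to lowest score

x = rng.standard_normal(d)
print(lsf_ranking(W_star, x))
```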
1302 | Re-Analyze Gauss: Bounds for Private Matrix Approximation Via Dyson Brownian Motion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Given a symmetric matrix $M$ and a vector $\lambda$, we present new bounds on the Frobenius-distance utility of the Gaussian mechanism for approximating $M$ by a matrix whose spectrum is $\lambda$, under $(\varepsilon,\delta)$-differential privacy. |
Oren Mangoubi; Nisheeth Vishnoi; |
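One plausible reading of the mechanism analyzed in this highlight, sketched below: add symmetric Gaussian noise to $M$, keep the noisy eigenvectors, and impose the target spectrum $\lambda$. The noise calibration shown is the classical $(\varepsilon,\delta)$ Gaussian-mechanism formula and the `sensitivity` argument is an assumption; the paper's sharper utility bounds via Dyson Brownian motion are not reproduced.

```python
import numpy as np

def gaussian_mechanism_with_spectrum(M, lam, sensitivity, eps, delta, rng=None):
    """Add symmetric Gaussian noise to M (classical Gaussian-mechanism
    calibration), then return the matrix with the noisy eigenvectors and the
    prescribed spectrum lam. Hedged sketch, not the paper's exact procedure."""
    rng = np.random.default_rng() if rng is None else rng
    d = M.shape[0]
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noise = rng.normal(scale=sigma, size=(d, d))
    noise = (noise + noise.T) / np.sqrt(2.0)          # make the perturbation symmetric
    _, eigvecs = np.linalg.eigh(M + noise)            # private (noisy) eigenbasis
    lam_sorted = np.sort(np.asarray(lam, dtype=float))  # match eigh's ascending order
    return eigvecs @ np.diag(lam_sorted) @ eigvecs.T
```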
1303 | FeLMi : Few Shot Learning with Hard Mixup Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose Few shot Learning with hard Mixup (FeLMi) using manifold mixup to synthetically generate samples which helps in mitigating the data scarcity issue. |
Aniket Roy; Anshul Shah; Ketul Shah; Prithviraj Dhar; Anoop Cherian; Rama Chellappa; |
1304 | Inherently Explainable Reinforcement Learning in Natural Language Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We focus on the task of creating a reinforcement learning agent that is inherently explainable—with the ability to produce immediate local explanations by thinking out loud while performing a task and analyzing entire trajectories post-hoc to produce temporally extended explanations. |
Xiangyu Peng; Mark Riedl; Prithviraj Ammanabrolu; |
1305 | Diagnosing Failures of Fairness Transfer Across Distribution Shift in Real-world Medical Settings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we adopt a causal framing to motivate conditional independence tests as a key tool for characterizing distribution shifts. |
Jessica Schrouff; Natalie Harris; Sanmi Koyejo; Ibrahim Alabdulmohsin; Eva Schnider; Krista Opsahl-Ong; Alexander Brown; Subhrajit Roy; Diana Mincu; Christina Chen; Awa Dieng; Yuan Liu; Vivek Natarajan; Alan Karthikesalingam; Katherine Heller; Silvia Chiappa; Alexander D’Amour; |
1306 | Unknown-Aware Domain Adversarial Learning for Open-Set Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Therefore, we propose Unknown-Aware Domain Adversarial Learning (UADAL), which $\textit{aligns}$ the source and the target-$\textit{known}$ distribution while simultaneously $\textit{segregating}$ the target-$\textit{unknown}$ distribution in the feature alignment procedure. |
JoonHo Jang; Byeonghu Na; Dong Hyeok Shin; Mingi Ji; Kyungwoo Song; Il-chul Moon; |
1307 | Improving Variational Autoencoders with Density Gap-based Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce new training objectives that tackle both problems through a novel regularization based on the probabilistic density gap between the aggregated posterior distribution and the prior distribution. |
Jianfei Zhang; Jun Bai; Chenghua Lin; Yanmeng Wang; Wenge Rong; |
1308 | Biologically-Plausible Determinant Maximization Neural Networks for Blind Separation of Correlated Sources Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Previous work on biologically-plausible BSS algorithms assumed that observed signals are linear mixtures of statistically independent or uncorrelated sources, limiting the domain of applicability of these algorithms. To overcome this limitation, we propose novel biologically-plausible neural networks for the blind separation of potentially dependent/correlated sources. |
Bariscan Bozkurt; Cengiz Pehlevan; Alper Erdogan; |
1309 | What’s The Harm? Sharp Bounds on The Fraction Negatively Affected By Treatment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We derive the tightest-possible bounds on the fraction with negative individual treatment effect, an unknowable quantity due to the fundamental problem of causal inference, and we develop an efficient and robust method for inference on these bounds. |
Nathan Kallus; |
1310 | An Online Algorithm for Data Deletion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unlike prior memory efficient unlearning algorithms, we target ERM trained models that minimize objectives with non-smooth regularizers, such as the commonly used $\ell_1$, elastic net, or nuclear norm penalties. |
Vinith Suriyakumar; Ashia Wilson; |
1311 | Module-Aware Optimization for Auxiliary Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The proposed approach considers the module-level influence through the learnable module-level auxiliary importance, i.e., the importance of each auxiliary loss to each module. |
Hong Chen; Xin Wang; Yue Liu; Yuwei Zhou; Chaoyu Guan; Wenwu Zhu; |
1312 | Fairness Reprogramming Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new generic fairness learning paradigm, called FairReprogram, which incorporates the model reprogramming technique. |
Guanhua Zhang; Yihua Zhang; Yang Zhang; Wenqi Fan; Qing Li; Sijia Liu; Shiyu Chang; |
1313 | ZooD: Exploiting Model Zoo for Out-of-Distribution Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose ZooD, a paradigm for PTMs ranking and ensemble with feature selection. |
Qishi Dong; Awais Muhammad; Fengwei Zhou; Chuanlong Xie; Tianyang Hu; Yongxin Yang; Sung-Ho Bae; Zhenguo Li; |
1314 | SU-SSL: Maximize Performance in Unseen Classes and Maintain Safeness in Seen Classes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a new SSL approach, called Safe Unseen classification Semi-Supervised Learning (SU-SSL), which can not only classify unseen classes automatically but also maintain safeness on seen classes. |
Yi-Ge Zhang; Lan-Zhe Guo; Zhi-Fan Wu; Jie-Jing Shao; Yu-Feng Li; |
1315 | Polynomial-Time Optimal Equilibria with A Mediator in Extensive-Form Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate both notions in extensive-form games from a computational lens. |
Brian Zhang; Tuomas Sandholm; |
1316 | Flowification: Everything Is A Normalizing Flow Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a method that can be used to turn any multi-layer perceptron or convolutional network into a normalizing flow. |
Bálint Máté; Samuel Klein; Tobias Golling; François Fleuret; |
1317 | A Quadrature Rule Combining Control Variates and Adaptive Importance Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Within the standard adaptive importance sampling framework, a simple weighted least squares approach is proposed to improve the procedure with control variates. |
Rémi Leluc; François Portier; Johan Segers; Aigerim Zhuman; |
1318 | Tempo: Accelerating Transformer-Based Model Training Through Memory Footprint Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Transformer-based models, which have recently seen a surge in popularity due to their good performance and applicability to a variety of tasks, have a similar problem. To remedy this issue, we propose Tempo, a new approach to efficiently use accelerator (e.g., GPU) memory resources for training Transformer-based models. |
Muralidhar Andoorveedu; Zhanda Zhu; Bojian Zheng; Gennady Pekhimenko; |
1319 | Revisiting Graph Contrastive Learning from The Perspective of Graph Spectrum Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Then we theoretically prove, via a contrastive invariance theorem, that GCL is able to learn invariance information; together with our GAME rule, this uncovers for the first time that the representations learned by GCL essentially encode low-frequency information, which explains why GCL works. Guided by this rule, we propose a spectral graph contrastive learning module (SpCo), which is a general and GCL-friendly plug-in. |
Nian Liu; Xiao Wang; Deyu Bo; Chuan Shi; Jian Pei; |
1320 | INRAS: Implicit Neural Representation for Audio Scenes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a novel method for Implicit Neural Representation for Audio Scenes (INRAS) which renders high fidelity time-domain impulse responses at any arbitrary emitter-listener positions using neural network parameterization. |
Kun Su; Mingfei Chen; Eli Shlizerman; |
1321 | Tree Mover’s Distance: Bridging Graph Metrics and Stability of Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Identifying such a metric is particularly challenging for non-Euclidean data such as graphs. Here, we propose a pseudometric for attributed graphs, the Tree Mover’s Distance (TMD), and study its relation to generalization. |
Ching-Yao Chuang; Stefanie Jegelka; |
1322 | Learning Equivariant Segmentation with Instance-Unique Querying Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we devise a new training framework that boosts query-based models through discriminative query embedding learning. |
Wenguan Wang; James Liang; Dongfang Liu; |
1323 | Improving Policy Learning Via Language Dynamics Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, for environments with complex language abstractions, learning how to ground language to observations is difficult due to sparse, delayed rewards. We propose Language Dynamics Distillation (LDD), which pretrains a model to predict environment dynamics given demonstrations with language descriptions, and then fine-tunes these language-aware pretrained representations via reinforcement learning (RL). |
Victor Zhong; Jesse Mu; Luke Zettlemoyer; Edward Grefenstette; Tim Rocktäschel; |
1324 | In Defense of The Unitary Scalarization for Deep Multi-Task Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that a basic multi-task learning optimizer performs on par with specialized algorithms and suggest a possible explanation based on regularization. |
Vitaly Kurin; Alessandro De Palma; Ilya Kostrikov; Shimon Whiteson; Pawan K Mudigonda; |
1325 | Recovering Private Text in Federated Learning of Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel attack method FILM (Federated Inversion attack for Language Models) for federated learning of language models—for the first time, we show the feasibility of recovering text from large batch sizes of up to 128 sentences. |
Samyak Gupta; Yangsibo Huang; Zexuan Zhong; Tianyu Gao; Kai Li; Danqi Chen; |
1326 | Generalised Mutual Information for Discriminative Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We replace the Kullback-Leibler divergence inside the mutual information with other distances, such as the Wasserstein metric, thereby improving the clustering performance of deep models. |
Louis Ohl; Pierre-Alexandre Mattei; Charles Bouveyron; Warith Harchaoui; Arnaud Droit; Mickaël Leclercq; Frederic Precioso; |
1327 | Why So Pessimistic? Estimating Uncertainties for Offline RL Through Ensembles, and Why Their Independence Matters Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Motivated by the success of ensembles for uncertainty estimation in supervised learning, we take a renewed look at how ensembles of $Q$-functions can be leveraged as the primary source of pessimism for offline reinforcement learning (RL). |
Seyed Kamyar Seyed Ghasemipour; Shixiang (Shane) Gu; Ofir Nachum; |
1328 | Self-Supervised Learning with An Information Maximization Criterion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this article, we argue that a straightforward application of information maximization among alternative latent representations of the same input naturally solves the collapse problem and achieves competitive empirical results. We propose a self-supervised learning method, CorInfoMax, that uses a second-order statistics-based mutual information measure that reflects the level of correlation among its arguments. |
Serdar Ozsoy; Shadi Hamdan; Sercan Arik; Deniz Yuret; Alper Erdogan; |
1329 | GMMSeg: Gaussian Mixture Based Generative Semantic Segmentation Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Though straightforward, this de facto paradigm neglects the underlying data distribution p(pixel feature|class), and struggles to identify out-of-distribution data. Going beyond this, we propose GMMSeg, a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature,class). |
Chen Liang; Wenguan Wang; Jiaxu Miao; Yi Yang; |
1330 | Active Surrogate Estimators: An Active Learning Approach to Label-Efficient Model Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Active Surrogate Estimators (ASEs), a new method for label-efficient model evaluation. |
Jannik Kossen; Sebastian Farquhar; Yarin Gal; Thomas Rainforth; |
1331 | Reinforcement Learning with Automated Auxiliary Loss Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a principled and universal method for learning better representations with auxiliary loss functions, named Automated Auxiliary Loss Search (A2LS), which automatically searches for top-performing auxiliary loss functions for RL. |
Tairan He; Yuge Zhang; Kan Ren; Che Wang; Minghuan Liu; Weinan Zhang; Dongsheng Li; Yuqing Yang; |
1332 | Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We characterize the minimax error rate (up to a logarithmic factor) of mean estimation given samples from a high-dimensional binary Markov Gaussian mixture model. |
Yihan Zhang; Nir Weinberger; |
1333 | A Permutation-free Kernel Two-sample Test Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the cross-MMD, a new quadratic time MMD test statistic based on sample-splitting and studentization. |
Shubhanshu Shekhar; Ilmun Kim; Aaditya Ramdas; |
1334 | Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the asynchronous stochastic gradient descent algorithm, for distributed training over $n$ workers that might be heterogeneous. |
Anastasiia Koloskova; Sebastian Stich; Martin Jaggi; |
1335 | Beyond Spectral Gap: The Role of The Topology in Decentralized Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper aims to paint an accurate picture of sparsely-connected distributed optimization when workers share the same data distribution. |
Thijs Vogels; Hadrien Hendrikx; Martin Jaggi; |
1336 | Robust Continual Test-time Adaptation: Instance-aware BN and Prediction-balanced Memory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We discover that most existing TTA methods fail dramatically under such scenarios. Motivated by this, we present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams. |
Taesik Gong; Jongheon Jeong; Taewon Kim; Yewon Kim; Jinwoo Shin; Sung-Ju Lee; |
1337 | STaR: Bootstrapping Reasoning With Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a technique to iteratively leverage a small number of rationale examples and a large dataset without rationales, to bootstrap the ability to perform successively more complex reasoning. |
Eric Zelikman; Yuhuai Wu; Jesse Mu; Noah Goodman; |
1338 | Constrained Langevin Algorithms with L-mixing External Random Variables Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we obtain a deviation of $O(T^{-1/2} \log T)$ in $1$-Wasserstein distance for non-convex losses with $L$-mixing data variables and polyhedral constraints (which are not necessarily bounded). |
Yuping Zheng; Andrew Lamperski; |
1339 | Offline Multi-Agent Reinforcement Learning with Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce an offline multi-agent reinforcement learning (offline MARL) framework that utilizes previously collected data without additional online data collection. |
Wei-Cheng Tseng; Tsun-Hsuan Johnson Wang; Yen-Chen Lin; Phillip Isola; |
1340 | Simultaneous Missing Value Imputation and Structure Learning with Groups Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose VISL, a novel scalable structure learning approach that can simultaneously infer structures between groups of variables under missing data and perform missing value imputations with deep learning. |
Pablo Morales-Alvarez; Wenbo Gong; Angus Lamb; Simon Woodhead; Simon Peyton Jones; Nick Pawlowski; Miltiadis Allamanis; Cheng Zhang; |
1341 | Mining Unseen Classes Via Regional Objectness: A Simple Baseline for Incremental Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: For class incremental semantic segmentation, such a phenomenon often becomes much worse due to the semantic shift of the background class, i.e., some concepts learned at previous stages are assigned to the background class at the current training stage, significantly reducing the performance of these old concepts. To address this issue, we propose a simple yet effective method in this paper, named Mining unseen Classes via Regional Objectness (MicroSeg). |
Zekang Zhang; Yunchao Wei; Zhiyuan Fang; Jianbo Jiao; |
1342 | Exponential Separations in Symmetric Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we demonstrate a novel separation between symmetric neural network architectures. |
Aaron Zweig; Joan Bruna; |
1343 | A Reduction to Binary Approach for Debiasing Multiclass Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel reduction-to-binary (R2B) approach that enforces demographic parity for multiclass classification with non-binary sensitive attributes via a reduction to a sequence of binary debiasing tasks. |
Ibrahim Alabdulmohsin; Jessica Schrouff; Sanmi Koyejo; |
1344 | Dynamic Pricing and Assortment Under A Contextual MNL Demand Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a randomized dynamic pricing policy based on a variant of the Online Newton Step algorithm (ONS) that achieves a $O(d\sqrt{T}\log(T))$ regret guarantee under an adversarial arrival model. |
Noemie Perivier; Vineet Goyal; |
1345 | On The Difficulty of Learning Chaotic Dynamics with RNNs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: It is particularly problematic in scientific applications where one aims to reconstruct the underlying dynamical system. Here we offer a comprehensive theoretical treatment of this problem by relating the loss gradients during RNN training to the Lyapunov spectrum of RNN-generated orbits. |
Jonas Mikhaeil; Zahra Monfared; Daniel Durstewitz; |
1346 | Revisiting Neural Scaling Laws in Language and Vision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To predict the benefit of scale empirically, we argue for a more rigorous methodology based on the extrapolation loss, instead of reporting the best-fitting (interpolating) parameters. |
Ibrahim Alabdulmohsin; Behnam Neyshabur; Xiaohua Zhai; |
1347 | Why Robust Generalization in Deep Learning Is Difficult: Perspective of Expressive Power Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, although the robust training error can be near zero via some methods, all existing algorithms lead to a high robust generalization error. In this paper, we provide a theoretical understanding of this puzzling phenomenon from the perspective of expressive power for deep neural networks. |
Binghui Li; Jikai Jin; Han Zhong; John Hopcroft; Liwei Wang; |
1348 | Online Algorithms for The Santa Claus Problem Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The Santa Claus problem is a fundamental problem in {\em fair division}: the goal is to partition a set of {\em heterogeneous} items among {\em heterogeneous} agents so as to maximize the minimum value of items received by any agent. In this paper, we study the online version of this problem where the items are not known in advance and have to be assigned to agents as they arrive over time. |
Max Springer; MohammadTaghi Hajiaghayi; Debmalya Panigrahi; Mohammad Khani; |
1349 | Video PreTraining (VPT): Learning to Act By Watching Unlabeled Online Videos Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We extend the internet-scale pretraining paradigm to sequential decision domains through semi-supervised imitation learning wherein agents learn to act by watching online unlabeled videos. |
Bowen Baker; Ilge Akkaya; Peter Zhokov; Joost Huizinga; Jie Tang; Adrien Ecoffet; Brandon Houghton; Raul Sampedro; Jeff Clune; |
1350 | Beyond Rewards: A Hierarchical Perspective on Offline Multiagent Behavioral Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a model-agnostic method for discovery of behavior clusters in multiagent domains, using variational inference to learn a hierarchy of behaviors at the joint and local agent levels. |
Shayegan Omidshafiei; Andrei Kapishnikov; Yannick Assogba; Lucas Dixon; Been Kim; |
1351 | ElasticMVS: Learning Elastic Part Representation for Self-supervised Multi-view Stereopsis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that geometric proximity such as surface connectedness and occlusion boundaries implicitly inferred from images could serve as reliable guidance for pixel-wise multi-view correspondences. With this insight, we present a novel elastic part representation which encodes physically-connected part segmentations with elastically-varying scales, shapes and boundaries. |
Jinzhi Zhang; Ruofan Tang; Zheng Cao; Jing Xiao; Ruqi Huang; LU FANG; |
1352 | LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This is because the gradient computation for the trainable parameters still requires backpropagation through the large pre-trained backbone model. To address this, we propose Ladder Side-Tuning (LST), a new PETL technique that can also reduce training memory requirements by more substantial amounts. |
Yi-Lin Sung; Jaemin Cho; Mohit Bansal; |
1353 | Beyond Time-Average Convergence: Near-Optimal Uncoupled Online Learning Via Clairvoyant Multiplicative Weights Update Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we provide a novel and simple algorithm, Clairvoyant Multiplicative Weights Updates (CMWU) for regret minimization in general games. |
Georgios Piliouras; Ryann Sim; EFSTRATIOS SKOULAKIS; |
1354 | Drawing Out of Distribution with Neuro-Symbolic Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Drawing out of Distribution (DooD), a neuro-symbolic generative model of stroke-based drawing that can learn such general-purpose representations. |
Yichao Liang; Josh Tenenbaum; Tuan Anh Le; Siddharth N; |
1355 | DivBO: Diversity-aware CASH for Ensemble Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In the framework, we propose to use a diversity surrogate to predict the pair-wise diversity of two unseen configurations. |
Yu Shen; Yupeng Lu; Yang Li; Yaofeng Tu; Wentao Zhang; Bin CUI; |
1356 | ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce the numEricaL reASoning with adapTive symbolIc Compiler (ELASTIC) model, which is constituted of the RoBERTa as the Encoder and a Compiler with four modules: Reasoning Manager, Operator Generator, Operands Generator, and Memory Register. |
Jiaxin Zhang; Yashar Moshfeghi; |
1357 | Rashomon Capacity: A Metric for Predictive Multiplicity in Probabilistic Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new measure of predictive multiplicity in probabilistic classification called Rashomon capacity. |
Hsiang Hsu; Flavio Calmon; |
1358 | Cache-Augmented Inbatch Importance Resampling for Training Recommender Retriever Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Cache-Augmented Inbatch Importance Resampling (XIR) for training recommender retrievers, which not only offers different negatives to user queries with inbatch items, but also adaptively achieves a more accurate estimation of the softmax distribution. |
Jin Chen; Defu Lian; Yucheng Li; Baoyun Wang; Kai Zheng; Enhong Chen; |
1359 | Towards Robust Blind Face Restoration with Codebook Lookup Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we demonstrate that the uncertainty and ambiguity of the mapping can be largely reduced by casting face restoration as a code prediction task in a small, finite proxy feature space. Under this paradigm, we propose a Transformer-based prediction network, named \textit{CodeFormer}, to exploit global contexts of the input for \textit{code prediction}, enabling the discovery of a natural face that closely approximates the target high-quality image even when the input is severely degraded. |
Shangchen Zhou; Kelvin Chan; Chongyi Li; Chen Change Loy; |
1360 | Fast Mixing of Stochastic Gradient Descent with Normalization and Weight Decay Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide a partial proof of the Fast Equilibrium conjecture: normalized networks trained by SGD+WD mix in $O(1/(\text{LR} \cdot \text{WD}))$ time. |
Zhiyuan Li; Tianhao Wang; Dingli Yu; |
1361 | Fully Sparse 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To enable efficient long-range LiDAR-based object detection, we build a fully sparse 3D object detector (FSD). |
Lue Fan; Feng Wang; Naiyan Wang; ZHAO-XIANG ZHANG; |
1362 | VICRegL: Self-Supervised Learning of Local Visual Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A new method called VICRegL is proposed that learns good global and local features simultaneously, yielding excellent performance on detection and segmentation tasks while maintaining good performance on classification tasks. |
Adrien Bardes; Jean Ponce; Yann LeCun; |
1363 | Learning With An Evolving Class Ontology Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper formalizes a protocol for studying the problem of $\textit{Learning with Evolving Class Ontology}$ (LECO). |
Zhiqiu Lin; Yu-Xiong Wang; Deepak Pathak; Deva Ramanan; Shu Kong; |
1364 | Optimal Gradient Sliding and Its Application to Optimal Distributed Optimization Under Similarity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study structured convex optimization problems, with additive objective $r:=p + q$, where $r$ is ($\mu$-strongly) convex, $q$ is $L_q$-smooth and convex, and $p$ is $L_p$-smooth, possibly nonconvex. For such a class of problems, we propose an inexact accelerated gradient sliding method that can skip the gradient computation for one of these components while still achieving optimal complexity of gradient calls of $p$ and $q$, that is, $\mathcal{O}(\sqrt{L_p/\mu})$ and $\mathcal{O}(\sqrt{L_q/\mu})$, respectively. |
Dmitry Kovalev; Aleksandr Beznosikov; Ekaterina Borodich; Alexander Gasnikov; Gesualdo Scutari; |
1365 | Masked Autoencoders As Spatiotemporal Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our study suggests that the general framework of masked autoencoding (BERT, MAE, etc.) can be a unified methodology for representation learning with minimal domain knowledge. |
Christoph Feichtenhofer; haoqi fan; Yanghao Li; Kaiming He; |
1366 | AdaFocal: Calibration-aware Adaptive Focal Loss Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a calibration-aware adaptive focal loss called AdaFocal that utilizes the calibration properties of the focal (and inverse-focal) loss and adaptively modifies $\gamma_t$ for different groups of samples based on (1) $\gamma_{t-1}$ from the previous step and (2) the magnitude of the model’s under/over-confidence. |
Arindam Ghosh; Thomas Schaaf; Matthew Gormley; |
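For context, the standard focal loss that the highlight's $\gamma_t$ controls is sketched below, together with a hypothetical per-group update that raises $\gamma$ when a group is over-confident and lowers it when under-confident; the exact update rule, bounds, and group definition used by AdaFocal are not reproduced here.

```python
import math
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma):
    """Standard focal loss: -(1 - p_t)^gamma * log(p_t), averaged over the batch."""
    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-(1.0 - pt).pow(gamma) * log_pt).mean()

def update_gamma(gamma_prev, confidence, accuracy, lr=1.0, gamma_min=1e-3, gamma_max=20.0):
    """Hypothetical calibration-aware update: over-confidence (confidence > accuracy)
    increases gamma, under-confidence decreases it. Not the paper's exact rule."""
    gap = confidence - accuracy
    return min(max(gamma_prev * math.exp(lr * gap), gamma_min), gamma_max)
```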
1367 | Training Scale-Invariant Neural Networks on The Sphere Can Happen in Three Regimes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the properties of training scale-invariant neural networks directly on the sphere using a fixed ELR. |
Maxim Kodryan; Ekaterina Lobacheva; Maksim Nakhodnov; Dmitry Vetrov; |
1368 | Approaching Quartic Convergence Rates for Quasi-Stochastic Approximation with Application to Gradient-Free Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It is shown in this paper that through design it is possible to obtain far faster convergence, of order $O(n^{-4+\delta})$, with $\delta>0$ arbitrary. |
Caio Kalil Lauand; Sean Meyn; |
1369 | Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce GLRklUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. |
Lilian Besson; Emilie Kaufmann; Odalric-Ambrym Maillard; Julien Seznec; |
1370 | Uncertainty Estimation for Multi-view Data: The Power of Seeing The Whole Picture Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, most existing works are designed for unimodal data, whereas multi-view uncertainty estimation has not been sufficiently investigated. Therefore, we propose a new multi-view classification framework for better uncertainty estimation and out-of-domain sample detection, where we associate each view with an uncertainty-aware classifier and combine the predictions of all the views in a principled way. |
Myong Chol Jung; He Zhao; Joanna Dipnall; Belinda Gabbe; Lan Du; |
1371 | Attraction-Repulsion Spectrum in Neighbor Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we empirically show that changing the balance between the attractive and the repulsive forces in t-SNE using the exaggeration parameter yields a spectrum of embeddings, which is characterized by a simple trade-off: stronger attraction can better represent continuous manifold structures, while stronger repulsion can better represent discrete cluster structures and yields higher kNN recall. |
Jan Niklas Böhm; Philipp Berens; Dmitry Kobak; |
1372 | A Unified Statistical Learning Model for Rankings and Scores with Application to Grant Panel Review Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Numerous models exist to study data of each type separately, but no unified statistical model captures both data types simultaneously without first performing data conversion. We propose the Mallows-Binomial model to close this gap, which combines a Mallows $\phi$ ranking model with Binomial score models through shared parameters that quantify object quality, a consensus ranking, and the level of consensus among judges. |
Michael Pearce; Elena A. Erosheva; |
1373 | Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, existing randomized smoothing certificates for GNNs are overly pessimistic since they treat the model as a black box, ignoring the underlying architecture. To remedy this, we propose novel gray-box certificates that exploit the message-passing principle of GNNs: We randomly intercept messages and carefully analyze the probability that messages from adversarially controlled nodes reach their target nodes. |
Yan Scholten; Jan Schuchardt; Simon Geisler; Aleksandar Bojchevski; Stephan Günnemann; |
1374 | Task-Agnostic Graph Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: They are also unable to provide explanations in cases where the GNN is trained in a self-supervised manner, and the resulting representations are used in future downstream tasks. To address these limitations, we propose a Task-Agnostic GNN Explainer (TAGE) that is independent of downstream models and trained under self-supervision with no knowledge of downstream tasks. |
Yaochen Xie; Sumeet Katariya; Xianfeng Tang; Edward Huang; Nikhil Rao; Karthik Subbian; Shuiwang Ji; |
1375 | Structural Knowledge Distillation for Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a replacement for the pixel-wise independent $\ell_{p}$-norm based on the structural similarity (SSIM). |
Philip de Rijk; Lukas Schneider; Marius Cordts; Dariu Gavrila; |
1376 | Differentially Private Linear Sketches: Efficient Implementations and Applications Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that linear sketches can ensure privacy and maintain their unique properties with a small amount of noise added at initialization. |
Fuheng Zhao; Dan Qiao; Rachel Redberg; Divyakant Agrawal; Amr El Abbadi; Yu-Xiang Wang; |
1377 | Learning Viewpoint-Agnostic Visual Representations By Recovering Tokens in 3D Space Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a 3D Token Representation Layer (3DTRL) that estimates the 3D positional information of the visual tokens and leverages it for learning viewpoint-agnostic representations. |
Jinghuan Shang; Srijan Das; Michael Ryoo; |
1378 | Robustness to Unbounded Smoothness of Generalized SignSGD Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that clipping is not indispensable for Adam-type algorithms in tackling such scenarios: we theoretically prove that a generalized SignSGD algorithm can obtain similar convergence rates as SGD with clipping but does not need explicit clipping at all. |
Michael Crawshaw; Mingrui Liu; Francesco Orabona; Wei Zhang; Zhenxun Zhuang; |
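As background for this highlight, a plain SignSGD step is sketched below; the generalized, momentum-normalized variant analyzed in the paper is not reproduced, and `lr` is a placeholder value.

```python
import torch

@torch.no_grad()
def signsgd_step(params, lr=1e-3):
    """One step of plain SignSGD: move each parameter by -lr * sign(grad).
    Background sketch only; the paper studies a generalized variant."""
    for p in params:
        if p.grad is not None:
            p.add_(p.grad.sign(), alpha=-lr)
```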
1379 | Sparse Fourier Backpropagation in Cryo-EM Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an approach where the VAE reconstruction is expressed on a volumetric grid, and demonstrate how such a model can be trained efficiently through a novel backpropagation protocol that exploits the sparsity of the projection operation in Fourier space. |
Dari Kimanius; Kiarash Jamali; Sjors Scheres; |
1380 | Back Razor: Memory-Efficient Transfer Learning By Self-Sparsified Backpropagation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a novel and general framework called Back Razor that can be applied in a plug-and-play fashion to any pre-trained network without changing its architecture. |
Ziyu Jiang; Xuxi Chen; Xueqin Huang; Xianzhi Du; Denny Zhou; Zhangyang Wang; |
1381 | M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a model-accelerator co-design framework to enable efficient on-device MTL that tackles both training and inference bottlenecks. |
hanxue liang; Zhiwen Fan; Rishov Sarkar; Ziyu Jiang; Tianlong Chen; Kai Zou; Yu Cheng; Cong Hao; Zhangyang Wang; |
1382 | Near-Optimal Randomized Exploration for Tabular Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study algorithms using randomized value functions for exploration in reinforcement learning. |
Zhihan Xiong; Ruoqi Shen; Qiwen Cui; Maryam Fazel; Simon Du; |
1383 | Learning Robust Dynamics Through Variational Sparse Gating Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce Variational Sparse Gating (VSG), where model states are sparsely updated through a stochastic gating mechanism. |
Arnav Kumar Jain; Shivakanth Sujit; Shruti Joshi; Vincent Michalski; Danijar Hafner; Samira Ebrahimi Kahou; |
1384 | HyperTree Proof Search for Neural Theorem Proving Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an online training procedure for a transformer-based automated theorem prover. |
Guillaume Lample; Timothee Lacroix; Marie-Anne Lachaux; Aurelien Rodriguez; Amaury Hayat; Thibaut Lavril; Gabriel Ebner; Xavier Martinet; |
1385 | Random Normalization Aggregation for Adversarial Defense Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on our theoretical analysis, we propose a simple yet effective module named Random Normalization Aggregation (RNA) which replaces the batch normalization layers in the networks and aggregates different selected normalization types to form a huge random space. |
Minjing Dong; Xinghao Chen; Yunhe Wang; Chang Xu; |
1386 | BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose to improve the performance of binary MLP (BiMLP) model by enriching the representation ability of binary FC layers. |
Yixing Xu; Xinghao Chen; Yunhe Wang; |
1387 | Learning Efficient Vision Transformers Via Fine-Grained Manifold Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we fully utilize the patch-level information and propose a fine-grained manifold distillation method for transformer-based networks. |
Zhiwei Hao; Jianyuan Guo; Ding Jia; Kai Han; Yehui Tang; Chao Zhang; Han Hu; Yunhe Wang; |
1388 | Efficient and Modular Implicit Differentiation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose automatic implicit differentiation, an efficient and modular approach for implicit differentiation of optimization problems. |
Mathieu Blondel; Quentin Berthet; Marco Cuturi; Roy Frostig; Stephan Hoyer; Felipe Llinares-Lopez; Fabian Pedregosa; Jean-Philippe Vert; |
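To illustrate the principle behind this highlight (not the library's API), the sketch below differentiates a ridge-regression solution with respect to its regularization strength via the implicit function theorem and checks the result against finite differences; the problem, dimensions, and variable names are made up for the example.

```python
import numpy as np

# Implicit differentiation of a ridge-regression solution x*(theta), where x*(theta)
# satisfies the optimality condition F(x, theta) = A^T (A x - b) + theta * x = 0.
# The implicit function theorem gives dx*/dtheta = -(A^T A + theta I)^{-1} x*.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
theta = 0.5

def solve(theta):
    return np.linalg.solve(A.T @ A + theta * np.eye(5), A.T @ b)

x_star = solve(theta)
implicit_grad = -np.linalg.solve(A.T @ A + theta * np.eye(5), x_star)

# Sanity check against a central finite difference.
eps = 1e-5
fd_grad = (solve(theta + eps) - solve(theta - eps)) / (2 * eps)
print(np.allclose(implicit_grad, fd_grad, atol=1e-6))
```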
1389 | Dynamic Learning in Large Matching Markets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study a sequential matching problem faced by "large" centralized platforms where "jobs" must be matched to "workers" subject to uncertainty about worker skill proficiencies. |
Anand Kalvit; Assaf Zeevi; |
1390 | Learning and Covering Sums of Independent Random Variables with Unbounded Support Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we address two questions: (i) Are there general families of SIIRVs with infinite support that can be learned with sample complexity independent of both $n$ and the maximal element of the support? |
Alkis Kalavasis; Konstantinos Stavropoulos; Emmanouil Zampetakis; |
1391 | Manifold Interpolating Optimal-Transport Flows for Trajectory Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a method called Manifold Interpolating Optimal-Transport Flow (MIOFlow) that learns stochastic, continuous population dynamics from static snapshot samples taken at sporadic timepoints. |
Guillaume Huguet; Daniel Sumner Magruder; Oluwadamilola Fasina; Alexander Tong; Manik Kuchroo; Guy Wolf; Smita Krishnaswamy; |
1392 | Sample-Efficient Learning of Correlated Equilibria in Extensive-Form Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents the first sample-efficient algorithm for learning the EFCE from bandit feedback. |
Ziang Song; Song Mei; Yu Bai; |
1393 | Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose \textsc{BanditSRL}, a representation learning algorithm that combines a novel constrained optimization problem to learn a realizable representation with good spectral properties with a generalized likelihood ratio test to exploit the recovered representation and avoid excessive exploration. |
Andrea Tirinzoni; Matteo Papini; Ahmed Touati; Alessandro Lazaric; Matteo Pirotta; |
1394 | Global Linear and Local Superlinear Convergence of IRLS for Non-Smooth Robust Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We advance both the theory and practice of robust $\ell_p$-quasinorm regression for $p \in (0,1]$ by using novel variants of iteratively reweighted least-squares (IRLS) to solve the underlying non-smooth problem. |
Liangzu Peng; Christian Kümmerle; Rene Vidal; |
1395 | Improved Differential Privacy for SGD Via Optimal Private Linear Operators on Adaptive Streams Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We prove fundamental theoretical results on the applicability of matrix factorizations to the adaptive streaming setting, and provide a new parameter-free fixed-point algorithm for computing optimal factorizations. |
Sergey Denisov; H. Brendan McMahan; John Rush; Adam Smith; Abhradeep Guha Thakurta; |
1396 | Supervising The Multi-Fidelity Race of Hyperparameter Configurations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce DyHPO, a Bayesian Optimization method that learns to decide which hyperparameter configuration to train further in a dynamic race among all feasible configurations. |
Martin Wistuba; Arlind Kadra; Josif Grabocka; |
1397 | Independence Testing-Based Approach to Causal Discovery Under Measurement Error and Linear Non-Gaussian Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose the Transformed Independent Noise (TIN) condition, which checks for independence between a specific linear transformation of some measured variables and certain other measured variables. |
Haoyue Dai; Peter Spirtes; Kun Zhang; |
1398 | Learning to Accelerate Partial Differential Equations Via Latent Global Evolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: LE-PDE achieves speedup by having a much smaller latent dimension to update during long rollout as compared to updating in the input space. We introduce new learning objectives to effectively learn such latent dynamics to ensure long-term stability. |
Tailin Wu; Takashi Maruyama; Jure Leskovec; |
1399 | ZeroC: A Neuro-Symbolic Model for Zero-shot Concept Recognition and Acquisition at Inference Time Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce Zero-shot Concept Recognition and Acquisition (ZeroC), a neuro-symbolic architecture that can recognize and acquire novel concepts in a zero-shot way. |
Tailin Wu; Megan Tjandrasuwita; Zhengxuan Wu; Xuelin Yang; Kevin Liu; Rok Sosic; Jure Leskovec; |
1400 | Extrapolative Continuous-time Bayesian Neural Network for Fast Training-free Test-time Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose to formulate internal predictive modeling as a continuous-time Bayesian filtering problem within the context of a stochastic dynamical system. |
Hengguan Huang; Xiangming Gu; Hao Wang; Chang Xiao; Hongfu Liu; Ye Wang; |
1401 | A Projection-free Algorithm for Constrained Stochastic Multi-level Composition Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a projection-free conditional gradient-type algorithm for smooth stochastic multi-level composition optimization, where the objective function is a nested composition of $T$ functions and the constraint set is a closed convex set. |
Tesi Xiao; Krishnakumar Balasubramanian; Saeed Ghadimi; |
1402 | Theseus: A Library for Differentiable Nonlinear Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Theseus, an efficient application-agnostic open source library for differentiable nonlinear least squares (DNLS) optimization built on PyTorch, providing a common framework for end-to-end structured learning in robotics and vision. |
Luis Pineda; Taosha Fan; Maurizio Monge; Shobha Venkataraman; Paloma Sodhi; Ricky T. Q. Chen; Joseph Ortiz; Daniel DeTone; Austin Wang; Stuart Anderson; Jing Dong; Brandon Amos; Mustafa Mukadam; |
1403 | BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we formulate the synthesis process from a different perspective by decomposing the binaural audio into a common part that shared by the left and right channels as well as a specific part that differs in each channel. Accordingly, we propose BinauralGrad, a novel two-stage framework equipped with diffusion models to synthesize them respectively. |
Yichong Leng; Zehua Chen; Junliang Guo; Haohe Liu; Jiawei Chen; Xu Tan; Danilo Mandic; Lei He; Xiangyang Li; Tao Qin; Sheng Zhao; Tie-Yan Liu; |
1404 | VideoMAE: Masked Autoencoders Are Data-Efficient Learners for Self-Supervised Video Pre-Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that video masked autoencoders (VideoMAE) are data-efficient learners for self-supervised video pre-training (SSVP). |
Zhan Tong; Yibing Song; Jue Wang; Limin Wang; |
1405 | Efficient Coding, Channel Capacity, and The Emergence of Retinal Mosaics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we use efficient coding theory to present a comprehensive account of mosaic organization in the case of natural videos as the retinal channel capacity—the number of neurons available for encoding—is varied. |
Na Young Jun; Greg Field; John Pearson; |
1406 | Gradient Descent Is Optimal Under Lower Restricted Secant Inequality And Upper Error Bound Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Recent work argues that strong convexity and smoothness, popular assumptions in the literature, lead to a pathological definition of the condition number. Motivated by this result, we focus on the class of functions satisfying a lower restricted secant inequality and an upper error bound. |
Charles Guille-Escuret; Adam Ibrahim; Baptiste Goujaud; Ioannis Mitliagkas; |
1407 | Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a no free lunch theorem for explanation methods which demonstrates that no single method can perform optimally across all neighbourhoods and calls for choosing among methods. |
Tessa Han; Suraj Srinivas; Himabindu Lakkaraju; |
1408 | Near-Isometric Properties of Kronecker-Structured Random Tensor Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give uniform concentration inequality for random tensor acting on rank-1 Kronecker structured signals, which parallels a Gordon-type inequality for this class of tensor structured data. |
Qijia Jiang; |
1409 | Stochastic Multiple Target Sampling Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To answer this question, we propose Stochastic Multiple Target Sampling Gradient Descent (MT-SGD), enabling us to sample from multiple unnormalized target distributions. |
Hoang Phan; Ngoc Tran; Trung Le; Toan Tran; Nhat Ho; Dinh Phung; |
1410 | Neural Networks with Hadamard Product: Separation on Extrapolation and Spectral Bias Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks. |
Yongtao Wu; Zhenyu Zhu; Fanghui Liu; Grigorios Chrysos; Volkan Cevher; |
1411 | Orthogonal Transformer: An Efficient Vision Transformer Backbone with Token Orthogonalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a general vision transformer backbone, called Orthogonal Transformer, in pursuit of both efficiency and effectiveness. |
Huaibo Huang; Xiaoqiang Zhou; Ran He; |
1412 | On-Device Training Under 256KB Memory Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an algorithm-system co-design framework to make training neural networks possible with only 256KB of memory. |
Ji Lin; Ligeng Zhu; Wei-Ming Chen; Wei-Chen Wang; Chuang Gan; Song Han; |
1413 | Data Distributional Properties Drive Emergent In-Context Learning in Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This observation raises the question: what aspects of the training regime lead to this emergent behavior? Here, we show that this behavior is driven by the distributions of the training data itself. |
Stephanie Chan; Adam Santoro; Andrew Lampinen; Jane Wang; Aaditya Singh; Pierre Richemond; James McClelland; Felix Hill; |
1414 | Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. |
Chitwan Saharia; William Chan; Saurabh Saxena; Lala Li; Jay Whang; Emily Denton; Seyed Kamyar Seyed Ghasemipour; Raphael Gontijo Lopes; Burcu Karagol Ayan; Tim Salimans; Jonathan Ho; David Fleet; Mohammad Norouzi; |
1415 | Evaluating Graph Generative Models with Contrastively Learned Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: So far, most techniques use either traditional metrics based on subgraph counting, or the representations of randomly initialized Graph Neural Networks (GNNs). We propose using representations from contrastively trained GNNs, rather than random GNNs, and show this gives more reliable evaluation metrics. |
Hamed Shirzad; Kaveh Hassani; Danica J. Sutherland; |
1416 | Fast Distance Oracles for Any Symmetric Norm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a nearly-optimal dynamic data structure for the distance oracle problem for any symmetric norm. |
Yichuan Deng; Zhao Song; OMRI WEINSTEIN; Ruizhe Zhang; |
1417 | Regularized Molecular Conformation Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce RMCF, a novel framework to generate a diverse set of low-energy molecular conformations through sampling from a regularized molecular conformation field. |
Lihao Wang; Yi Zhou; Yiqun Wang; Xiaoqing Zheng; Xuanjing Huang; Hao Zhou; |
1418 | Learning Single-index Models with Shallow Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient descent. |
Alberto Bietti; Joan Bruna; Clayton Sanford; Min Jae Song; |
1419 | DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents DetCLIP, a paralleled visual-concept pre-training method for open-world detection by resorting to knowledge enrichment from a designed concept dictionary. |
Lewei Yao; Jianhua Han; Youpeng Wen; Xiaodan Liang; Dan Xu; Wei Zhang; Zhenguo Li; Chunjing XU; Hang Xu; |
1420 | SageMix: Saliency-Guided Mixup for Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose SageMix, a saliency-guided Mixup for point clouds to preserve salient local structures. |
Sanghyeok Lee; Minkyu Jeon; Injae Kim; Yunyang Xiong; Hyunwoo Kim; |
1421 | Efficient $\Phi$-Regret Minimization in Extensive-Form Games Via Online Mirror Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This approach enables us to directly translate state-of-the-art techniques and analyses in NFGs to learning EFGs, but typically suffers from computational intractability due to the exponential blow-up of the game size introduced by the conversion. In this paper, we address this problem in natural and important setups for the $\Phi$-Hedge algorithm—a generic algorithm capable of learning a large class of equilibria for NFGs. |
Yu Bai; Chi Jin; Song Mei; Ziang Song; Tiancheng Yu; |
1422 | Two-layer Neural Network on Infinite Dimensional Data: Global Optimization Guarantee in The Mean-field Regime Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a new mean-field analysis of two-layer neural networks in an infinite-dimensional parameter space. |
Naoki Nishikawa; Taiji Suzuki; Atsushi Nitanda; Denny Wu; |
1423 | UniCLIP: Unified Framework for Contrastive Language-Image Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, as these works define inter-domain (image-text) contrastive loss and intra-domain (image-image) contrastive loss in individual spaces, many feasible combinations of supervision are overlooked. To overcome this issue, we propose UniCLIP, a Unified framework for Contrastive Language-Image Pre-training. |
Janghyeon Lee; Jongsuk Kim; Hyounguk Shon; Bumsoo Kim; Seung Hwan Kim; Honglak Lee; Junmo Kim; |
1424 | Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, our work highlights the importance of performing multiple data splits to produce more reliable algorithm-hyperparameter selection: while this is a common approach in supervised learning, to our knowledge, this has not been discussed in detail in the offline RL setting, and we show it can have substantial impacts when the dataset is small. |
Allen Nie; Yannis Flet-Berliac; Deon Jordan; William Steenbergen; Emma Brunskill; |
1425 | Learning to Configure Computer Networks with Neural Algorithmic Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new method for scaling automatic configuration of computer networks. |
Luca Beurer-Kellner; Martin Vechev; Laurent Vanbever; Petar Veličković; |
1426 | Weighted Mutual Learning with Diversity-Driven Model Compression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address the two challenges, this paper presents a framework called Weighted Mutual Learning with Diversity-Driven Model Compression (WML) for online distillation. |
Miao Zhang; Li Wang; David Campos; Wei Huang; Chenjuan Guo; Bin Yang; |
1427 | Identifiability of Deep Generative Models Under Mixture Priors Without Auxiliary Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: At the same time, several works have empirically observed that this doesn’t seem to be necessary in practice. In this work, we explain this behavior by showing that for a broad class of generative (i.e. unsupervised) models with universal approximation capabilities, the side information $u$ is not necessary: We prove identifiability of the entire generative model where we do not observe $u$ and only observe the data $x$. |
Bohdan Kivva; GOUTHAM RAJENDRAN; Pradeep Ravikumar; Bryon Aragam; |
1428 | On A Mallows-type Model For (Ranked) Choices Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study a distance-based distribution over rankings that aggregates into a simple (ranked) choice model, is easy to estimate, and demonstrates promising performance on real data. |
Yifan Feng; Yuxuan Tang; |
1429 | Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study an underexplored hidden cost of overparameterization: the fact that overparameterized models are more vulnerable to privacy attacks, in particular the membership inference attack that predicts the (potentially sensitive) examples used to train a model. |
Jasper Tan; Blake Mason; Hamid Javadi; Richard Baraniuk; |
1430 | NeurOLight: A Physics-Agnostic Neural Operator Enabling Parametric Photonic Device Simulation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, for the first time, a physics-agnostic neural operator-based framework, dubbed NeurOLight, is proposed to learn a family of frequency-domain Maxwell PDEs for ultra-fast parametric photonic device simulation. |
Jiaqi Gu; Zhengqi Gao; Chenghao Feng; Hanqing Zhu; Ray Chen; Duane Boning; David Pan; |
1431 | A Deep Learning Dataloader with Shared Data Preparation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a dependent sampling algorithm (DSA) and domain-specific cache policy to relax the constraints. |
jian xie; Jingwei Xu; Guochang Wang; Yuan Yao; Zenan Li; Chun Cao; Hanghang Tong; |
1432 | Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we present SAC-w that selects wrongly classified normal samples as model inputs and calculates the mean correlation among their model outputs. |
Jiyang Guan; Jian Liang; Ran He; |
1433 | Large-scale Optimization of Partial AUC in A Range of False Positive Rates Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although partial AUC optimization in a range of FPRs has been studied, existing algorithms are not scalable to big data and not applicable to deep learning. To address this challenge, we cast the problem into a non-smooth difference-of-convex (DC) program for any smooth predictive functions (e.g., deep neural networks), which allows us to develop an efficient approximate gradient descent method based on the Moreau envelope smoothing technique, inspired by recent advances in non-smooth DC optimization. |
Yao Yao; Qihang Lin; Tianbao Yang; |
1434 | Improve Task-Specific Generalization in Few-Shot Learning Via Adaptive Vicinal Risk Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Thereupon, we derive the resulting vicinal loss function over the vicinities of all training samples and minimize it instead of the conventional empirical loss over the training samples only, conveniently avoiding exhaustive sampling of all vicinal samples. |
Long-Kai Huang; Ying Wei; |
1435 | Test-Time Training with Masked Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how applying masked autoencoding to train on each unlabeled test sample before making a prediction improves generalization. |
Yossi Gandelsman; Yu Sun; Xinlei Chen; Alexei Efros; |
1436 | A Unified Sequence Interface for Vision Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we show that a diverse set of “core” computer vision tasks can also be unified if formulated in terms of a shared pixel-to-sequence interface. |
Ting Chen; Saurabh Saxena; Lala Li; Tsung-Yi Lin; David Fleet; Geoffrey E Hinton; |
1437 | Learning Representations Via A Robust Behavioral Metric for Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on the analysis, we propose a new behavioral distance, the RAP distance, and develop a practical representation learning algorithm on top of it. |
Jianda Chen; Sinno Pan; |
1438 | Learning from Small Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful. |
Tao Liu; P. R. Kumar; Ruida Zhou; Xi Liu; |
1439 | DMAP: A Distributed Morphological Attention Policy for Learning to Locomote with A Changing Body Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce DMAP, a biologically-inspired, attention-based policy network architecture. |
Alberto Silvio Chiappa; Alessandro Marin Vargas; Alexander Mathis; |
1440 | Learning Infinite-Horizon Average-Reward Restless Multi-Action Bandits Via Index Awareness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we advocate index-aware reinforcement learning (RL) solutions to design RL algorithms operating on a much smaller dimensional subspace by exploiting the inherent structure in restless bandits. |
GUOJUN XIONG; Shufan Wang; Jian Li; |
1441 | Pluralistic Image Completion with Probabilistic Mixture-of-Experts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we introduce a unified probabilistic graph model that represents the complex interactions in image completion. |
Xiaobo Xia; Wenhao Yang; Jie Ren; Yewen Li; Yibing Zhan; Bo Han; Tongliang Liu; |
1442 | Micro and Macro Level Graph Modeling for Graph Variational Auto-Encoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new micro-macro training objective for graph generation that combines node-level and graph-level losses. |
Kiarash Zahirnia; Parmis Naddaf; Oliver Schulte; Ke Li; |
1443 | CodeRL: Mastering Code Generation Through Pretrained Models and Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce CodeRL, a novel framework for program synthesis through large-scale pretrained models and deep reinforcement learning, obtaining new SOTA results on both the challenging APPS and MBPP benchmarks. |
Hung Le; Yue Wang; Akhilesh Deepak Gotmare; Silvio Savarese; Steven Chu Hong Hoi; |
1444 | GAGA: Deciphering Age-path of Generalized Self-paced Regularizer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, current age-path algorithms are either limited to the simplest regularizer or lack both solid theoretical understanding and computational efficiency. To address this challenge, we propose a novel Generalized Age-path Algorithm (GAGA) for SPL with various self-paced regularizers, based on ordinary differential equations (ODEs) and sets control, which can learn the entire solution spectrum w.r.t. a range of age parameters. |
Xingyu Qu; Diyang Li; Xiaohan Zhao; Bin Gu; |
1445 | Understanding The Evolution of Linear Regions in Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We seek to understand how observed region counts and their densities evolve during deep reinforcement learning using empirical results that span a range of continuous control tasks and policy network dimensions. |
Setareh Cohan; Nam Hee Kim; David Rolnick; Michiel van de Panne; |
1446 | Plan To Predict: Learning An Uncertainty-Foreseeing Model For Model-Based Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose Plan To Predict (P2P), an MBRL framework that treats the model rollout process as a sequential decision making problem by reversely considering the model as a decision maker and the current policy as the dynamics. |
Zifan Wu; Chao Yu; Chen Chen; Jianye Hao; Hankz Hankui Zhuo; |
1447 | Egocentric Video-Language Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We pioneer Egocentric Video-Language Pretraining across pretraining dataset, model, and development benchmark; the resulting pretrained model exhibits strong performance on five downstream tasks across three egocentric datasets. |
Kevin Qinghong Lin; Jinpeng Wang; Mattia Soldan; Michael Wray; Rui Yan; Eric Z. XU; Denial Gao; Rong-Cheng Tu; Wenzhe Zhao; Weijie Kong; Chengfei Cai; WANG HongFa; Dima Damen; Bernard Ghanem; Wei Liu; Mike Zheng Shou; |
1448 | A Universal Error Measure for Input Predictions Applied to Online Graph Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a novel measure for quantifying the error in input predictions. |
Giulia Bernardini; Alexander Lindermayr; Alberto Marchetti-Spaccamela; Nicole Megow; Leen Stougie; Michelle Sweering; |
1449 | Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide an extensive single- and multi-task parameter-efficient benchmark and examine existing parameter-efficient fine-tuning NLP methods for vision tasks. |
Yen-Cheng Liu; CHIH-YAO MA; Junjiao Tian; Zijian He; Zsolt Kira; |
1450 | On The Relationship Between Variational Inference and Auto-associative Memory Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this article, we propose a variational inference formulation of memory retrieval in auto-associative memories, allowing us to combine memory retrieval with perceptual inference into the same mathematical framework. |
Louis Annabi; Alexandre Pitti; Mathias Quoy; |
1451 | Online Learning and Pricing for Network Revenue Management with Reusable Resources Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose novel batched bandit learning algorithms for finding near-optimal pricing policies, and show that they admit a near-optimal cumulative regret bound of $\tilde{O}(J\sqrt{XT})$, where $J$, $X$, and $T$ are the numbers of products, candidate prices, and service periods, respectively. |
Huiwen Jia; Cong Shi; Siqian Shen; |
1452 | Distributed Online Convex Optimization with Compressed Communication Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We design provably no-regret distributed online algorithms that work with compressors. |
Zhipeng Tu; Xi Wang; Yiguang Hong; Lei Wang; Deming Yuan; Guodong Shi; |
1453 | Molecule Generation By Principal Subgraph Mining and Assembling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we define a novel notion, the principal subgraph, which is closely related to the informative pattern within molecules. |
Xiangzhe Kong; Wenbing Huang; Zhixing Tan; Yang Liu; |
1454 | LIFT: Language-Interfaced FineTuning for Non-language Machine Learning Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the language interface framework for fine-tuning language models for non-language tasks without making any changes in architecture or loss function. |
Tuan Dinh; Yuchen Zeng; Ruisu Zhang; Ziqian Lin; Michael Gira; Shashank Rajput; Jy-yong Sohn; Dimitris Papailiopoulos; Kangwook Lee; |
1455 | Generalization Analysis on Learning with A Concurrent Verifier Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper gives a generalization analysis of learning with a concurrent verifier. |
Masaaki Nishino; Kengo Nakamura; Norihito Yasuda; |
1456 | Size and Depth of Monotone Neural Networks: Interpolation and Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the interpolation problem for monotone data sets: The input is a monotone data set with $n$ points, and the goal is to find a size- and depth-efficient monotone neural network with non-negative parameters and threshold units that interpolates the data set. |
Dan Mikulincer; Daniel Reichman; |
1457 | Neural Approximation of Extended Persistent Homology on Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by recent success in neural algorithmic reasoning, we propose a novel graph neural network to estimate extended persistence diagrams (EPDs) on graphs efficiently. |
Zuoyu Yan; Tengfei Ma; Liangcai Gao; Zhi Tang; Yusu Wang; Chao Chen; |
1458 | Decentralized Gossip-Based Stochastic Bilevel Optimization Over Communication Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper studies the problem of distributed bilevel optimization over a network where agents can only communicate with neighbors, including examples from multi-task learning, multi-agent learning, and federated learning. We propose a gossip-based distributed bilevel learning algorithm that allows networked agents to solve both the inner and outer optimization problems in a single timescale and share information via network propagation. |
Shuoguang Yang; Xuezhou Zhang; Mengdi Wang; |
1459 | Communication Efficient Distributed Learning for Kernelized Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, instead of assuming the existence of a linear reward mapping from the features to the expected rewards, we consider non-linear reward mappings, by letting agents collaboratively search in a reproducing kernel Hilbert space (RKHS). |
Chuanhao Li; Huazheng Wang; Mengdi Wang; Hongning Wang; |
1460 | Exploration-Guided Reward Shaping for Reinforcement Learning Under Sparse Rewards Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel framework, Exploration-Guided Reward Shaping (ExploRS), that operates in a fully self-supervised manner and can accelerate an agent’s learning even in sparse-reward environments. |
Rati Devidze; Parameswaran Kamalaruban; Adish Singla; |
1461 | Bayesian Spline Learning for Equation Discovery of Nonlinear Dynamics with Quantified Uncertainty Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The proposed method utilizes spline basis to handle the data scarcity and measurement noise, upon which a group of derivatives can be accurately computed to form a library of candidate model terms. |
Luning Sun; Daniel Huang; Hao Sun; Jian-Xun Wang; |
1462 | Mildly Conservative Q-Learning for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper explores mild yet sufficient conservatism for offline learning without harming generalization. We propose Mildly Conservative Q-learning (MCQ), where OOD actions are actively trained by assigning them proper pseudo Q values. |
Jiafei Lyu; Xiaoteng Ma; Xiu Li; Zongqing Lu; |
1463 | Data-Driven Model-Based Optimization Via Invariant Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of data-driven model-based optimization, where the goal is to find the optimal design, provided access to only a static dataset, with no active data collection. |
Han Qi; Yi Su; Aviral Kumar; Sergey Levine; |
1464 | SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore a potential visual analogue of words, i.e., semantic parts, and we integrate semantic information into the training process of MAE by proposing a Semantic-Guided Masking strategy. |
Gang Li; Heliang Zheng; Daqing Liu; Chaoyue Wang; Bing Su; Changwen Zheng; |
1465 | Fast Bayesian Inference with Batch Bayesian Quadrature Via Kernel Recombination Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a parallelised (batch) BQ method, employing techniques from kernel quadrature, that possesses an empirically exponential convergence rate. |
Masaki Adachi; Satoshi Hayakawa; Martin Jørgensen; Harald Oberhauser; Michael A Osborne; |
1466 | Distinguishing Discrete and Continuous Behavioral Variability Using Warped Autoregressive HMMs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present two versions of warped ARHMM in which the warping variable affects the dynamics of each syllable either linearly or nonlinearly. |
Julia Costacurta; Lea Duncker; Blue Sheffer; Alex Williams; Winthrop Gillis; Caleb Weinreb; Jeffrey Markowitz; Sandeep R Datta; Scott Linderman; |
1467 | Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an adversarial training strategy to achieve information-based stochastic dropout. |
Djordje Miladinovic; Kumar Shridhar; Kushal Jain; Max Paulus; Joachim M Buhmann; Carl Allen; |
1468 | Diffusion-based Molecule Generation with Informative Prior Bridges Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a simple and novel approach to steer the training of diffusion-based generative models with physical and statistical prior information. |
Lemeng Wu; Chengyue Gong; Xingchao Liu; Mao Ye; Qiang Liu; |
1469 | Understanding The Generalization Benefit of Normalization Layers: Sharpness Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the long-held belief that flatter minima lead to better generalization, this paper gives mathematical analysis and supporting experiments suggesting that normalization (together with accompanying weight-decay) encourages GD to reduce the sharpness of loss surface. |
Kaifeng Lyu; Zhiyuan Li; Sanjeev Arora; |
1470 | Coresets for Relational Data and The Applications Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel approach called “aggregation tree with pseudo-cube” that can build a coreset from the bottom up. |
Jiaxiang Chen; Qingyuan Yang; Ruomin Huang; Hu Ding; |
1471 | Archimedes Meets Privacy: On Privately Estimating Quantiles in High Dimensions Under Minimal Assumptions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show how one can privately, and with polynomially many samples, (a) output an approximate interior point of the FB — e.g., "a typical user" in a high-dimensional database — by leveraging the robustness of the Steiner point of the FB; and at the expense of polynomially many more samples, (b) produce an approximate uniform sample from the FB, by constructing a private noisy projection oracle. |
Omri Ben-Eliezer; Dan Mikulincer; Ilias Zadik; |
1472 | Forecasting Human Trajectory from Scene History Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In other words, a person’s subsequent trajectory has likely been traveled by others. Based on this hypothesis, we propose to forecast a person’s future trajectory by learning from the implicit scene regularities. |
Mancheng Meng; Ziyan Wu; Terrence Chen; Dinggang Shen; Fan Yang; |
1473 | A General Approximation Lower Bound in $L^p$ Norm, with Applications to Feedforward Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the fundamental limits to the expressive power of neural networks. |
El Mehdi Achour; Armand Foucault; Sébastien Gerchinovitz; François Malgouyres; |
1474 | Structure-Aware Image Segmentation with Homotopy Warping Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By focusing on these critical locations, we propose a new homotopy warping loss to train deep image segmentation networks for better topological accuracy. To efficiently identify these topologically critical locations, we propose a new algorithm exploiting the distance transform. |
Xiaoling Hu; |
1475 | Do Current Multi-Task Optimization Methods in Deep Learning Even Help? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We perform a large-scale study of the effects of specialized optimization methods for deep multi-task models. |
Derrick Xin; Behrooz Ghorbani; Justin Gilmer; Ankush Garg; Orhan Firat; |
1476 | ConfounderGAN: Protecting Image Data Privacy with Causal Confounder Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, it’s important and necessary to develop a method or tool to prevent unauthorized data exploitation. In this paper, we propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners. |
Qi Tian; Kelu Jiang; Kun Kuang; Furui Liu; Zhihua Wang; Fei Wu; |
1477 | A Fair Comparison of Two Popular Flat-Minima Optimizers: Stochastic Weight Averaging Vs. Sharpness-Aware Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We fill this gap here by comparing the loss surfaces of the models trained with each method and through a broad benchmarking across computer vision, natural language processing, and graph representation learning tasks. |
Jean Kaddour; Linqing Liu; Ricardo Silva; Matt Kusner; |
1478 | On The Parameterization and Initialization of Diagonal State Space Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work seeks to systematically understand how to parameterize and initialize diagonal state space models. |
Albert Gu; Karan Goel; Ankit Gupta; Christopher Ré; |
1479 | Single-Stage Visual Relationship Learning Using Conditional Queries Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Transformers with conditional queries for SGG, namely TraCQ, with a new formulation for SGG that avoids the multi-task learning problem and the combinatorial entity pair distribution. |
Alakh Desai; Tz-Ying Wu; Subarna Tripathi; Nuno Vasconcelos; |
1480 | Optimizing Data Collection for Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new paradigm for modeling the data collection workflow as a formal optimal data collection problem that allows designers to specify performance targets, collection costs, a time horizon, and penalties for failing to meet the targets. |
Rafid Mahmood; James Lucas; Jose M. Alvarez; Sanja Fidler; Marc Law; |
1481 | Convergent Representations of Computer Programs in Human and Artificial Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We first evaluate a selection of static and dynamic code properties, such as abstract syntax tree (AST)-related and runtime-related metrics. Then, to learn whether brain representations encode fine-grained information about computer programs, we train a probe to align brain recordings with representations learned by a suite of ML models. |
Shashank Srikant; Ben Lipkin; Anna Ivanova; Evelina Fedorenko; Una-May O’Reilly; |
1482 | Maximum A Posteriori Natural Scene Reconstruction from Retinal Ganglion Cells with Deep Denoiser Priors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a method for approximate MAP reconstruction of natural images from large populations of experimentally recorded retinal ganglion cells, and show that the method is comparable to or better than current ad hoc reconstruction methods. |
Eric Wu; Alexander Sher; Alan Litke; Eero Simoncelli; E.J. Chichilnisky; Nora Brackbill; |
1483 | Conformalized Fairness Via Quantile Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To fulfill this great need and advocate the significance of quantile fairness, we propose a novel framework to learn a real-valued quantile function under the fairness requirement of Demographic Parity with respect to sensitive attributes, such as race or gender, and thereby derive a reliable fair prediction interval. |
Meichen Liu; Lei Ding; Dengdeng Yu; Wulong Liu; Linglong Kong; Bei Jiang; |
1484 | TreeMoCo: Contrastive Neuron Morphology Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A quantitative, informative, and comprehensive representation of neuron morphology is largely absent but desired. To fill this gap, in this work, we adopt a Tree-LSTM network to encode neuron morphology and introduce a self-supervised learning framework named TreeMoCo to learn features without the need for labels. |
Hanbo Chen; Jiawei Yang; Daniel Iascone; Lijuan Liu; Lei He; Hanchuan Peng; Jianhua Yao; |
1485 | A General Framework for Auditing Differentially Private Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a framework to statistically audit the privacy guarantee conferred by a differentially private machine learner in practice. |
Fred Lu; Joseph Munoz; Maya Fuchs; Tyler LeBlond; Elliott Zaresky-Williams; Edward Raff; Francis Ferraro; Brian Testa; |
1486 | Refined Dimension-Dependent Analysis for Private Convex Learning and Implications for Fine-Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We identify that the magnitude of gradients projected onto subspaces is a key factor that determines performance. |
Xuechen Li; Daogao Liu; Tatsunori Hashimoto; Huseyin A. Inan; Janardhan Kulkarni; Yin-Tat Lee; Abhradeep Guha Thakurta; |
1487 | A Contrastive Rule for Meta-learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we present a general-purpose, biologically-plausible meta-learning rule which estimates gradients with respect to the parameters of an underlying learning algorithm by simply running it twice. |
Nicolas Zucchet; Simon Schug; Johannes von Oswald; Dominic Zhao; João Sacramento; |
1488 | Posterior Refinement Improves Sample Efficiency in Bayesian Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, previous methods for obtaining accurate posterior approximations are expensive and non-trivial to implement. We, therefore, propose to refine Gaussian approximate posteriors with normalizing flows. |
Agustinus Kristiadi; Runa Eschenhagen; Philipp Hennig; |
1489 | Sample Complexity of Learning Heuristic Functions for Greedy-Best-First and A* Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While heuristic functions have been handcrafted using domain knowledge, recent studies demonstrate that learning heuristic functions from data is effective in many applications. Motivated by this emerging approach, we study the sample complexity of learning heuristic functions for GBFS and A*. |
Shinsaku Sakaue; Taihei Oki; |
1490 | Enhanced Meta Reinforcement Learning Via Demonstrations in Sparse Reward Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the situation where some data, possibly generated by a sub-optimal agent, is available for each task. |
Desik Rengarajan; Sapana Chaudhary; Jaewon Kim; Dileep Kalathil; Srinivas Shakkottai; |
1491 | Discrete-Convex-Analysis-Based Framework for Warm-Starting Algorithms with Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a principled discrete-convex-analysis-based framework for warm-starting algorithms with predictions to improve time complexity bounds. |
Shinsaku Sakaue; Taihei Oki; |
1492 | MAgNet: Mesh Agnostic Neural PDE Solver Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we leverage the recent advances in Implicit Neural Representations (INR) to design a novel architecture that predicts the spatially continuous solution of a PDE given a spatial position query. |
Oussama Boussif; Yoshua Bengio; Loubna Benabbou; Dan Assouline; |
1493 | Eliciting Thinking Hierarchy Without A Prior Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To the best of our knowledge, we are the first to propose a thinking hierarchy model with empirical validations in the general problem-solving scenarios; and the first to propose a practical open-response-based crowdsourcing approach that beats plurality voting without any prior. |
Yuqing Kong; Yunqi Li; Yubo Zhang; Zhihuan Huang; Jinzhao Wu; |
1494 | Human-Robotic Prosthesis As Collaborating Agents for Symmetrical Walking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This is the first attempt at considering human influence in the reinforcement learning control of a robotic lower limb prosthesis toward symmetrical walking in real-world situations. We propose a collaborative multi-agent reinforcement learning (cMARL) solution framework for this highly complex and challenging human-prosthesis collaboration (HPC) problem. |
Ruofan Wu; Junmin Zhong; Brent Wallace; Xiang Gao; He Huang; Jennie Si; |
1495 | Sparse Structure Search for Parameter-Efficient Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast to the manual designation, we explore constructing PET modules in an automatic manner. |
Shengding Hu; Zhen Zhang; Ning Ding; Yadao Wang; Yasheng Wang; Zhiyuan Liu; Maosong Sun; |
1496 | Transformer-based Working Memory for Multiagent Reinforcement Learning with Action Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previous efforts alleviate partial observability with historical hidden states from Recurrent Neural Networks; however, they do not consider the multiagent characteristics, namely that either the multiagent observation consists of a number of object entities or the action space shows clear entity interactions. To tackle these issues, we propose the Agent Transformer Memory (ATM) network with a transformer-based memory. |
Yaodong Yang; Guangyong Chen; Weixun Wang; Xiaotian Hao; Jianye Hao; Pheng-Ann Heng; |
1497 | Improved Surface Reconstruction Using High-frequency Details Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel method to improve the quality of surface reconstruction in neural rendering. |
Yiqun Wang; Ivan Skorokhodov; Peter Wonka; |
1498 | Chaotic Dynamics Are Intrinsic to Neural Network Training with SGD Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper exploits the theoretical connection between the curvature of the loss landscape and chaotic dynamics in neural network training to propose a modified SGD ensuring non-chaotic training dynamics to study the importance thereof in NN training. |
Luis Herrmann; Maximilian Granz; Tim Landgraf; |
1499 | On Convergence of FedProx: Local Dissimilarity Invariant Bounds, Non-smoothness and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We contribute to derive several new and deeper theoretical insights into the FedProx algorithm under milder conditions. |
Xiaotong Yuan; Ping Li; |
1500 | Large Language Models Are Zero-Shot Reasoners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a single zero-shot prompt that elicits effective chain of thought reasoning across diverse benchmarks that require multi-step thinking. |
Takeshi Kojima; Shixiang (Shane) Gu; Machel Reid; Yutaka Matsuo; Yusuke Iwasawa; |
1501 | Human-AI Collaborative Bayesian Optimisation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new method for human-AI collaboration in Bayesian optimisation where the optimum is mainly pursued by the Bayesian optimisation algorithm following complex computation, whilst receiving occasional help from an accompanying expert who has deeper knowledge of the underlying physical phenomenon. |
Arun Kumar Anjanapura Venkatesh; Santu Rana; Alistair Shilton; Svetha Venkatesh; |
1502 | MExMI: Pool-based Active Model Extraction Crossover Membership Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that ME and MI can reinforce each other through a chained and iterative reaction, which can significantly boost ME attack accuracy and improve MI by saving the query cost. |
Yaxin Xiao; Qingqing Ye; Haibo Hu; Huadi Zheng; Chengfang Fang; Jie Shi; |
1503 | Adversarial Task Up-sampling for Meta-learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we seek an approach that up-samples meta-training tasks from the task representation via a task up-sampling network. |
Yichen WU; Long-Kai Huang; Ying Wei; |
1504 | Policy Optimization with Linear Temporal Logic Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints. |
Cameron Voloshin; Hoang Le; Swarat Chaudhuri; Yisong Yue; |
1505 | Post-hoc Estimators for Learning to Defer to An Expert Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To achieve this, a central issue studied in prior work is the design of a coherent loss function for both mechanisms. In this work, we demonstrate that existing losses have two subtle limitations: they can encourage underfitting when there is a high cost of deferring, and the deferral function can have a weak dependence on the base model predictions. |
Harikrishna Narasimhan; Wittawat Jitkrittum; Aditya Menon; Ankit Rawat; Sanjiv Kumar; |
1506 | Explainability Via Causal Self-Talk Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We describe an effective way to satisfy all the desiderata: train the AI system to build a causal model of itself. |
Nicholas Roy; Junkyung Kim; Neil Rabinowitz; |
1507 | MultiScan: Scalable RGBD Scanning for 3D Environments with Articulated Objects Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce MultiScan, a scalable RGBD dataset construction pipeline leveraging commodity mobile devices to scan indoor scenes with articulated objects and web-based semantic annotation interfaces to efficiently annotate object and part semantics and part mobility parameters. |
Yongsen Mao; Yiming Zhang; Hanxiao Jiang; Angel Chang; Manolis Savva; |
1508 | RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose RTFormer, an efficient transformer for real-time semantic segmentation, which achieves a better trade-off between performance and efficiency than CNN-based models. |
Jian Wang; Chenhui Gou; Qiman Wu; Haocheng Feng; Junyu Han; Errui Ding; Jingdong Wang; |
1509 | HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing state-of-the-art models do not perform well at this task due to their autoregressive structure. We propose HYPRO, a hybridly normalized probabilistic model that naturally fits this task: its first part is an autoregressive base model that learns to propose predictions; its second part is an energy function that learns to reweight the proposals such that more realistic predictions end up with higher probabilities. |
Siqiao Xue; Xiaoming Shi; James Zhang; Hongyuan Mei; |
1510 | Maximum Class Separation As Inductive Bias in One Matrix Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a simple alternative: encoding maximum separation as an inductive bias in the network by adding one fixed matrix multiplication before computing the softmax activations. |
Tejaswi Kasarla; Gertjan Burghouts; Max van Spengler; Elise van der Pol; Rita Cucchiara; Pascal Mettes; |
1511 | HumanLiker: A Human-like Object Detector Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Empirically, humans take two steps to locate an object bounding box manually: 1) click the mouse at the top-left corner of the object, and then drag the mouse to the bottom-right corner; 2) refine the corner positions to make the bounding box more precise, if necessary. Inspired by this manual labeling process, we propose a novel human-like detector, termed HumanLiker, which is devised as a two-stage end-to-end detector to simulate the two aforementioned steps. |
Haoran Wei; Ping Guo; Yangguang Zhu; Chenglong Liu; Peng Wang; |
1512 | Sparse Probabilistic Circuits Via Pruning and Growing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose two operations: pruning and growing, that exploit the sparsity of PC structures. |
Meihua Dang; Anji Liu; Guy Van den Broeck; |
1513 | Resolving The Data Ambiguity for Periodic Crystals Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces new invariants, called Pointwise Distance Distributions (PDD), that are crystal descriptors without false negatives. |
Daniel Widdowson; Vitaliy Kurlin; |
1514 | GENIE: Higher-Order Denoising Diffusion Solvers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose Higher-Order Denoising Diffusion Solvers (GENIE): Based on truncated Taylor methods, we derive a novel higher-order solver that significantly accelerates synthesis. |
Tim Dockhorn; Arash Vahdat; Karsten Kreis; |
1515 | LION: Latent Point Diffusion Models for 3D Shape Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce the hierarchical Latent Point Diffusion Model (LION) for 3D shape generation. |
xiaohui zeng; Arash Vahdat; Francis Williams; Zan Gojcic; Or Litany; Sanja Fidler; Karsten Kreis; |
1516 | Large-Scale Retrieval for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We leverage fast, approximate nearest neighbor techniques in order to retrieve relevant data from a set of tens of millions of expert demonstration states. |
Peter Humphreys; Arthur Guez; Olivier Tieleman; Laurent Sifre; Theophane Weber; Timothy Lillicrap; |
1517 | Matching in Multi-arm Bandit with Collision Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider the matching version of the multi-agent multi-armed bandit problem, i.e., while agents prefer arms with higher expected reward, arms also have preferences over agents. |
YiRui Zhang; Siwei Wang; Zhixuan Fang; |
1518 | Information Bottleneck Theory of High-dimensional Regression: Relevancy, Efficiency and Optimality Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Information efficient learning algorithms minimize residual information while maximizing the relevant bits, which are predictive of the unknown generative models. We solve this optimization to obtain the information content of optimal algorithms for a linear regression problem and compare it to that of randomized ridge regression. |
Vudtiwat Ngampruetikorn; David Schwab; |
1519 | Uncovering The Structural Fairness in Graph Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we surprisingly find out that representations obtained by GCL methods are already fairer to degree bias than those learned by GCN. |
Ruijia Wang; Xiao Wang; Chuan Shi; Le Song; |
1520 | Sustainable Online Reinforcement Learning for Auto-bidding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we argue that there exist significant gaps between VAS and RAS, making the RL process in the VAS suffer from the problem of inconsistency between online and offline (IBOO). |
Zhiyu Mou; Yusen Huo; Rongquan Bai; Mingzhou Xie; Chuan Yu; Jian Xu; Bo Zheng; |
1521 | Dense Interspecies Face Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new task of cross-domain face understanding, and propose a dense interspecies face embedding (DIFE) learned in an unsupervised manner by our multi-teacher knowledge distillation and pseudo-paired data synthesis. |
Sejong Yang; Subin Jeon; Seonghyeon Nam; Seon Joo Kim; |
1522 | Off-Team Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By training in an off-team manner, we can mitigate the training- and testing-time covariate shift of off-belief learning, resulting in near-optimal zero-shot coordination, and also mitigate covariate shift in ad-hoc teamplay and proxy human-AI settings. |
Brandon Cui; Hengyuan Hu; Samuel Sokota; Andrei Lupu; Jakob Foerster; |
1523 | Learning Consistency-Aware Unsigned Distance Functions Progressively from Raw Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel method to learn consistency-aware unsigned distance functions directly from raw point clouds. |
Junsheng Zhou; Baorui Ma; Yu-Shen Liu; Yi Fang; Zhizhong Han; |
1524 | Amortized Inference for Causal Structure Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to amortize causal structure learning. |
Lars Lorch; Scott Sussex; Jonas Rothfuss; Andreas Krause; Bernhard Schölkopf; |
1525 | Semi-supervised Active Linear Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a novel formulation for active learning where the learner has access to an a priori labeled dataset. We show optimal instance-dependent sample complexity bounds dependent on a new parameter we introduce, the “reduced rank”. |
Nived Rajaraman; Fnu Devvrit; Pranjal Awasthi; |
1526 | Active Learning Polynomial Threshold Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We initiate the study of active learning polynomial threshold functions (PTFs). |
Omri Ben-Eliezer; Max Hopkins; Chutong Yang; Hantao Yu; |
1527 | The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the fundamental open question of finding the optimal high-order algorithm for solving smooth convex minimization problems. |
Dmitry Kovalev; Alexander Gasnikov; |
1528 | A Policy-Guided Imitation Approach for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we propose an alternative approach, inheriting the training stability of imitation-style methods while still allowing logical out-of-distribution generalization. |
Haoran Xu; Li Jiang; Li Jianxiong; Xianyuan Zhan; |
1529 | Meta Reinforcement Learning with Finite Training Tasks – A Density Estimation Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a different approach: directly learn the task distribution, using density estimation techniques, and then train a policy on the learned task distribution. |
Zohar Rimon; Aviv Tamar; Gilad Adler; |
1530 | Spherization Layer: Representation Using Only Angles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we propose a spherization layer to represent all information through angular similarity. |
Hoyong Kim; kangil kim; |
1531 | Recurrent Video Restoration Transformer with Guided Deformable Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we attempt to integrate the advantages of the two cases by proposing a recurrent video restoration transformer, namely RVRT. |
Jingyun Liang; Yuchen Fan; Xiaoyu Xiang; Rakesh Ranjan; Eddy Ilg; Simon Green; Jiezhang Cao; Kai Zhang; Radu Timofte; Luc V Gool; |
1532 | Neural Shape Deformation Priors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we introduce transformer-based deformation networks that represent a shape deformation as a composition of local surface deformations. |
Jiapeng Tang; Lev Markhasin; Bi Wang; Justus Thies; Matthias Niessner; |
1533 | Tiered Reinforcement Learning: Pessimism in The Face of Uncertainty and Constant Regret Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We identify the tiered structure in many real-world applications, and prove that by leveraging such structure with pessimism-based algorithms, one can achieve constant regret in online learning. |
Jiawei Huang; Li Zhao; Tao Qin; Wei Chen; Nan Jiang; Tie-Yan Liu; |
1534 | Annihilation of Families of Spurious Minima in Two-Layer ReLU Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the optimization problem associated with fitting two-layer ReLU neural networks with respect to the squared loss, where labels are generated by a target network. |
Yossi Arjevani; Michael Field; |
1535 | The Curse of Unrolling: Rate of Differentiating Through Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work provides a non-asymptotic convergence-rate analysis of this approach on quadratic objectives for gradient descent and the Chebyshev method. |
Damien Scieur; Gauthier Gidel; Quentin Bertrand; Fabian Pedregosa; |
1536 | Learning to Sample and Aggregate: Few-shot Reasoning Over Temporal Knowledge Graph Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We correspondingly propose a novel Meta Temporal Knowledge Graph Reasoning (MetaTKGR) framework. |
Ruijie Wang; Zheng Li; Dachun Sun; Shengzhong Liu; Jinning Li; Bing Yin; Tarek Abdelzaher; |
1537 | Perturbation Learning Based Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a simple yet effective method for anomaly detection. |
Jinyu Cai; Jicong Fan; |
1538 | Neural Stochastic Control Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose two novel frameworks of neural stochastic control to stabilize ODEs and SDEs, and these two controllers can complement each other in terms of convergence rate and training time. |
Jingdong Zhang; Qunxi Zhu; Wei LIN; |
1539 | Distributionally Robust Optimization with Data Geometry Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, to further constrain the uncertainty set, we incorporate data geometric properties into the design of distance metrics, obtaining our novel Geometric Wasserstein DRO (GDRO). |
Jiashuo Liu; Jiayun Wu; Bo Li; Peng Cui; |
1540 | Learning NP-Hard Joint-Assignment Planning Using GNN: Inference on A Random Graph and Provable Auction-Fitted Q-iteration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a theory of inference on a random graph using graph neural networks (GNN) and illustrate its capability to solve NP-hard scheduling problems. |
HYUNWOOK KANG; Taehwan Kwon; James R. Morrison; Jinkyoo Park; |
1541 | LECO: Learnable Episodic Count for Task-Specific Intrinsic Reward Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, the interference from task-irrelevant observations in the episodic count may cause its intrinsic motivation to overlook important task-related changes of states, and episodic novelty can lead to repeatedly revisiting familiar states across episodes. To resolve these issues, we propose a learnable hash-based episodic count, which we name LECO, that efficiently serves as a task-specific intrinsic reward in hard exploration problems. |
Daejin Jo; Sungwoong Kim; Daniel Nam; Taehwan Kwon; Seungeun Rho; Jongmin Kim; Donghoon Lee; |
1542 | What Can Transformers Learn In-Context? A Case Study of Simple Function Classes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While large language models such as GPT-3 exhibit some ability to perform in-context learning, it is unclear what the relationship is between tasks on which this succeeds and what is present in the training data. To investigate this, we consider the problem of training a model to in-context learn a function class (e.g., linear functions): given data derived from some functions in the class, can we train a model (e.g., a Transformer) to in-context learn most functions from that class? |
Shivam Garg; Dimitris Tsipras; Gregory Valiant; Percy Liang; |
1543 | Conditional Independence Testing with Heteroskedastic Data and Applications to Causal Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We frame heteroskedasticity in a structural causal model framework and present an adaptation of the partial correlation CI test that works well in the presence of heteroskedastic noise, given that expert knowledge about the heteroskedastic relationships is available. |
Wiebke Günther; Urmi Ninad; Jonas Wahl; Jakob Runge; |
1544 | Environment Diversification with Multi-head Neural Network for Invariant Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proposes an invariant learning framework containing a multi-head neural network to absorb data biases. |
Bo-Wei Huang; Keng-Te Liao; Chang-Sheng Kao; Shou-De Lin; |
1545 | On Image Segmentation With Noisy Labels: Characterization and Volume Properties of The Optimal Solutions to Accuracy and Dice Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study two of the most popular performance metrics in medical image segmentation, Accuracy and Dice, when the target labels are noisy. |
Marcus Nordstrom; Henrik Hult; Fredrik Löfman; Jonas Söderberg; |
1546 | TabNAS: Rejection Sampling for Neural Architecture Search on Tabular Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper develops TabNAS, a new and more effective approach to handle resource constraints in tabular NAS using an RL controller motivated by the idea of rejection sampling. |
Chengrun Yang; Gabriel Bender; Hanxiao Liu; Pieter-Jan Kindermans; Madeleine Udell; Yifeng Lu; Quoc V Le; Da Huang; |
1547 | FedPop: A Bayesian Approach for Personalised Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a novel methodology coined FedPop by recasting personalised FL into the population modeling paradigm where clients’ models involve fixed common population parameters and random individual ones, aiming at explaining data heterogeneity. |
Nikita Kotelevskii; Maxime Vono; Alain Durmus; Eric Moulines; |
1548 | VaiPhy: A Variational Inference Based Algorithm for Phylogeny Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose VaiPhy, a remarkably fast VI-based algorithm for approximate posterior inference in an augmented tree space. |
Hazal Koptagel; Oskar Kviman; Harald Melin; Negar Safinianaini; Jens Lagergren; |
1549 | Video Diffusion Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To generate long and higher resolution videos we introduce a new conditional sampling technique for spatial and temporal video extension that performs better than previously proposed methods. |
Jonathan Ho; Tim Salimans; Alexey Gritsenko; William Chan; Mohammad Norouzi; David Fleet; |
1550 | Coded Residual Transform for Generalizable Deep Metric Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A fundamental challenge in deep metric learning is the generalization capability of the feature embedding network model, since the embedding network learned on training classes needs to be evaluated on new test classes. To address this challenge, in this paper we introduce a new method called coded residual transform (CRT) for deep metric learning to significantly improve its generalization capability. |
Shichao Kan; Yixiong Liang; Min Li; Yigang Cen; Jianxin Wang; Zhihai He; |
1551 | Masked Autoencoding for Scalable and Generalizable Decision Making Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, this paper presents masked decision prediction (MaskDP), a simple and scalable self-supervised pretraining method for reinforcement learning (RL), and behavioral cloning (BC). |
Fangchen Liu; Hao Liu; Aditya Grover; Pieter Abbeel; |
1552 | Time Dimension Dances with Simplicial Complexes: Zigzag Filtration Curve Based Supra-Hodge Convolution Networks for Time-series Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Yet, such GNN models predominantly capture only lower-order interactions, that is, pairwise relations among nodes, and also largely ignore intrinsic time-conditioned information on the underlying topology of multivariate time series. To address these limitations, we propose a new time-aware GNN architecture which amplifies the power of the recently emerged simplicial neural networks with a time-conditioned topological knowledge representation in the form of zigzag persistence. |
Yuzhou Chen; Yulia Gel; H. Vincent Poor; |
1553 | Learning to Find Proofs and Theorems By Learning to Refine Search Strategies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new approach to automated theorem proving and deductive program synthesis where an AlphaZero-style agent is self-training to refine a high-level expert strategy expressed as a nondeterministic program. |
Jonathan Laurent; André Platzer; |
1554 | Flexible Neural Image Compression Via Code Editing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we propose Code Editing, a highly flexible coding method for NIC based on semi-amortized inference. |
Chenjian Gao; Tongda Xu; Dailan He; Yan Wang; Hongwei Qin; |
1555 | Redundant Representations Help Generalization in Wide Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we study the last hidden layer representations of various state-of-the-art convolutional neural networks and find that if the last hidden representation is wide enough, its neurons tend to split into groups which carry identical information and differ from each other only by statistically independent noise. |
Diego Doimo; Aldo Glielmo; Sebastian Goldt; Alessandro Laio; |
1556 | Shadow Knowledge Distillation: Bridging Offline and Online Knowledge Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, this fine-tuning process still consumes a substantial training budget. To alleviate this dilemma, we propose SHAKE, a simple yet effective SHAdow KnowlEdge transfer framework that bridges offline and online distillation and trades off accuracy against efficiency. |
Lujun Li; ZHE JIN; |
1557 | Challenging Common Assumptions in Convex Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, we prove that erroneously optimizing the infinite trials objective in place of the actual finite trials one, as it is usually done, can lead to a significant approximation error. |
Mirco Mutti; Riccardo De Santi; Piersilvio De Bartolomeis; Marcello Restelli; |
1558 | Co-Modality Imbalanced Graph Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We design a co-modality graph contrastive learning model with network pruning to learn graph representations on imbalanced data. |
Yiyue Qian; Chunhui Zhang; Yiming Zhang; Qianlong Wen; Yanfang Ye; Chuxu Zhang; |
1559 | Self-Supervised Learning Through Efference Copies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that the brain’s motor commands could theoretically also offer supervision to the learning process of sensory representations, a framework that also unifies various self-supervised machine-learning methods, extends them, and improves performance. |
Franz Scherr; Qinghai Guo; Timoleon Moraitis; |
1560 | On The Generalization Power of The Overfitted Three-Layer Neural Tangent Kernel Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the generalization performance of overparameterized 3-layer NTK models. |
Peizhong Ju; Xiaojun Lin; Ness Shroff; |
1561 | Minimax Regret for Cascading Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This sharply contrasts with the standard (non-cascading) bandit setting, where the variance-aware algorithms only improve constants. In light of this and as an additional contribution, we propose a variance-aware algorithm for the structured case of linear rewards and show its regret strictly improves the state-of-the-art. |
Daniel Vial; Sujay Sanghavi; Sanjay Shakkottai; R. Srikant; |
1562 | Sample-Efficient Reinforcement Learning of Partially Observable Markov Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper considers the challenging tasks of Multi-Agent Reinforcement Learning (MARL) under partial observability, where each agent only sees her own individual observations and actions, which reveal incomplete information about the underlying state of the system. |
Qinghua Liu; Csaba Szepesvari; Chi Jin; |
1563 | Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle this problem, we first propose a two-layer bandit structure for auto-tuning the exploration parameter and further generalize it to the Syndicated Bandits framework, which can learn multiple hyper-parameters dynamically in the contextual bandit environment. We derive the regret bounds of our proposed Syndicated Bandits framework and show that it avoids regret that depends exponentially on the number of hyper-parameters to be tuned. |
QIN DING; Yue Kang; Yi-Wei Liu; Thomas Chun Man Lee; Cho-Jui Hsieh; James Sharpnack; |
1564 | Convergence for Score-based Generative Modeling with Polynomial Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give the first fully polynomial convergence guarantees for score-based generative models. |
Holden Lee; Jianfeng Lu; Yixin Tan; |
1565 | The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The logit outputs of a feedforward neural network at initialization are conditionally Gaussian, given a random covariance matrix defined by the penultimate layer. In this work, we study the distribution of this random matrix. |
Mufan Li; Mihai Nica; Daniel M Roy; |
1566 | Understanding Self-Supervised Graph Representation Learning from A Data-Centric Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This raises the question: how do graph SSL methods, and in particular, contrastive learning (CL), work well? To systematically probe this question, we perform a generalization analysis for CL when using generic graph augmentations (GGAs) based on dataset recoverability and separability constraints, yielding insights into task-relevant augmentations. |
Puja Trivedi; Ekdeep S Lubana; Mark Heimann; Danai Koutra; Jayaraman Thiagarajan; |
1567 | A Unified Framework for Alternating Offline Model Training and Policy Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we address the issue by developing an iterative offline MBRL framework, where we maximize a lower bound of the true expected return, by alternating between dynamic model training and policy learning. |
Shentao Yang; Yihao Feng; Shujian Zhang; Mingyuan Zhou; |
1568 | Sketch-GNN: Scalable Graph Neural Networks with Sublinear Training Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a sketch-based algorithm whose training time and memory grow sublinearly with respect to graph size by training GNNs atop a few compact sketches of graph adjacency and node embeddings. |
Mucong Ding; Tahseen Rabbani; Bang An; Evan Wang; Furong Huang; |
1569 | Learning Mixed Multinomial Logits with Provable Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We developed an algorithm that learns a mixture of MNL models with provable guarantees. |
Yiqun Hu; David Simchi-Levi; Zhenzhen Yan; |
1570 | Assessing Representation Quality in Self-Supervised Learning By Measuring Eigenspectrum Decay Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We measure the decay of the eigenspectrum of the representation covariance matrix and show, both analytically and empirically, that it is indicative of generalization. |
Kumar K Agrawal; Arnab Mondal; Arna Ghosh; Blake Richards; |
1571 | On Uncertainty, Tempering, and Data Augmentation in Bayesian Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our work shows that explicitly accounting for aleatoric uncertainty significantly improves the performance of Bayesian neural networks. |
Sanyam Kapoor; Wesley Maddox; Pavel Izmailov; Andrew Wilson; |
1572 | An Empirical Analysis of Compute-optimal Large Language Model Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget. |
Jordan Hoffmann; Sebastian Borgeaud; Arthur Mensch; Elena Buchatskaya; Trevor Cai; Eliza Rutherford; Diego de Las Casas; Lisa Anne Hendricks; Johannes Welbl; Aidan Clark; Thomas Hennigan; Eric Noland; Katherine Millican; George van den Driessche; Bogdan Damoc; Aurelia Guy; Simon Osindero; Karen Simonyan; Erich Elsen; Jack Rae; Oriol Vinyals; Laurent Sifre; |
1573 | FedRolex: Model-Heterogeneous Federated Learning with Rolling Submodel Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, in real-world scenarios, such a requirement acts as a constraint that restricts the outreach to clients with heterogeneous device resources and unfairly excludes users with low-end devices who would otherwise benefit from FL. In this work, we propose a simple yet effective model-heterogeneous FL method named FedRolex to tackle this constraint. |
Samiul Alam; Luyang Liu; Ming Yan; Mi Zhang; |
1574 | Operator Splitting Value Iteration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce new planning and reinforcement learning algorithms for discounted MDPs that can utilize an approximate model of the environment to accelerate the convergence of the value function. Inspired by the splitting approach in numerical linear algebra, we introduce Operator Splitting Value Iteration (OS-VI) for both policy evaluation and control problems. |
Amin Rakhsha; Andrew Wang; Mohammad Ghavamzadeh; Amir-massoud Farahmand; |
1575 | Trajectory Balance: Improved Credit Assignment in GFlowNets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We thus propose a new learning objective for GFlowNets, trajectory balance, as a more efficient alternative to previously used objectives. |
Nikolay Malkin; Moksh Jain; Emmanuel Bengio; Chen Sun; Yoshua Bengio; |
1576 | Memory Efficient Continual Learning with Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we devise a method to incrementally train a model on a sequence of tasks using pre-trained Transformers and extending them with Adapters. |
Beyza Ermis; Giovanni Zappella; Martin Wistuba; Aditya Rawal; Cedric Archambeau; |
1577 | Continuous MDP Homomorphisms and Homomorphic Policy Gradient Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We derive a policy gradient theorem on the abstract MDP, which allows us to leverage approximate symmetries of the environment for policy optimization. Based on this theorem, we propose an actor-critic algorithm that is able to learn the policy and the MDP homomorphism map simultaneously, using the lax bisimulation metric. |
Sahand Rezaei-Shoshtari; Rosie Zhao; Prakash Panangaden; David Meger; Doina Precup; |
1578 | Audio-Driven Co-Speech Gesture Image Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a novel framework, Audio-driveN Gesture Image gEneration (ANGIE), to effectively capture the reusable co-speech gesture patterns as well as fine-grained rhythmic movements. |
Xian Liu; Qianyi Wu; Hang Zhou; Yuanqi Du; Wayne Wu; Dahua Lin; Ziwei Liu; |
1579 | Robust Streaming PCA Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider streaming principal component analysis (PCA) when the stochastic data-generating model is subject to perturbations. |
Daniel Bienstock; Minchan Jeong; Apurv Shukla; Se-Young Yun; |
1580 | Active Learning with Neural Networks: Insights from Nonparametric Statistics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Under common low noise conditions, we show that active learning with neural networks can provably achieve the minimax label complexity, up to disagreement coefficient and other logarithmic terms. |
Yinglun Zhu; Robert Nowak; |
1581 | A Multi-Resolution Framework for U-Nets with Applications to Hierarchical VAEs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we formulate a multi-resolution framework which identifies U-Nets as finite-dimensional truncations of models on an infinite-dimensional function space. |
Fabian Falck; Christopher Williams; Dominic Danks; George Deligiannidis; Christopher Yau; Chris C Holmes; Arnaud Doucet; Matthew Willetts; |
1582 | Quantum Algorithms for Sampling Log-Concave Distributions and Estimating Normalizing Constants Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop quantum algorithms for sampling logconcave distributions and for estimating their normalizing constants, with polynomial quantum speedup in the condition number $\kappa$, dimension $d$, and error $\epsilon$ in various scenarios. |
Andrew M. Childs; Tongyang Li; Jin-Peng Liu; Chunhao Wang; Ruizhe Zhang; |
1583 | Provable Subspace Identification Under Post-Nonlinear Mixtures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, the post-nonlinear (PNL) mixture model is revisited, where unknown element-wise nonlinear functions are imposed after a linear mixture, making the identification problem even more ill-posed. |
Qi Lyu; Xiao Fu; |
1584 | Constrained Stochastic Nonconvex Optimization with State-dependent Markov Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study stochastic optimization algorithms for constrained nonconvex stochastic optimization problems with Markovian data. |
Abhishek Roy; Krishnakumar Balasubramanian; Saeed Ghadimi; |
1585 | Sparsity in Continuous-Depth Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a regularization for continuous-depth neural networks that reduces input-output interactions for better generalization. |
Hananeh Aliee; Till Richter; Mikhail Solonin; Ignacio Ibarra; Fabian Theis; Niki Kilbertus; |
1586 | Robust On-Policy Sampling for Data-Efficient Policy Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a method called Robust On-policy Sampling and demonstrate theoretically and empirically that it produces data that converges faster to the expected on-policy distribution compared to on-policy sampling. |
Rujie Zhong; Duohan Zhang; Lukas Schäfer; Stefano Albrecht; Josiah Hanna; |
1587 | Graph Few-shot Learning with Task-specific Structures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Since the class sets are different across meta-tasks, node representations should be task-specific to promote classification performance. Therefore, to adaptively learn node representations across meta-tasks, we propose a novel framework that learns a task-specific structure for each meta-task. |
Song Wang; Chen Chen; Jundong Li; |
1588 | Second Thoughts Are Best: Learning to Re-Align With Human Values from Text Edits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Second Thoughts, a new learning paradigm that enables language models (LMs) to re-align with human values. |
Ruibo Liu; Chenyan Jia; Ge Zhang; Ziyu Zhuang; Tony Liu; Soroush Vosoughi; |
1589 | Watermarking for Out-of-distribution Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, existing methods largely ignore the reprogramming property of deep models and thus may not fully unleash their intrinsic strength: without modifying parameters of a well-trained deep model, we can reprogram this model for a new purpose via data-level manipulation (e.g., adding a specific feature perturbation). This property motivates us to reprogram a classification model to excel at OOD detection (a new task), and thus we propose a general methodology named watermarking in this paper. |
Qizhou Wang; Feng Liu; Yonggang Zhang; Jing Zhang; Chen Gong; Tongliang Liu; Bo Han; |
1590 | On The Sample Complexity of Stabilizing LTI Systems on A Single Trajectory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel algorithm based on spectral decomposition that only needs to learn “a small part” of the dynamical matrix acting on its unstable subspace. |
Yang Hu; Adam Wierman; Guannan Qu; |
1591 | Uncalibrated Models Can Improve Human-AI Collaboration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an initial exploration that suggests showing AI models as more confident than they actually are, even when the original AI is well-calibrated, can improve human-AI performance (measured as the accuracy and confidence of the human’s final prediction after seeing the AI advice). |
Kailas Vodrahalli; Tobias Gerstenberg; James Zou; |
1592 | Single-phase Deep Learning in Cortico-cortical Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these models are either unable to effectively backpropagate error signals across multiple layers or require a multi-phase learning process, neither of which is reminiscent of learning in the brain. Here, we introduce a new model, bursting cortico-cortical networks (BurstCCN), which solves these issues by integrating known properties of cortical networks, namely bursting activity, short-term plasticity (STP), and dendrite-targeting interneurons. |
Will Greedy; Heng Wei Zhu; Joseph Pemberton; Jack Mellor; Rui Ponte Costa; |
1593 | Acceleration in Distributed Sparse Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new accelerated distributed algorithm suitable for high-dimensions. |
Marie Maros; Gesualdo Scutari; |
1594 | DGD^2: A Linearly Convergent Distributed Algorithm For High-dimensional Statistical Recovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our technical contribution is a novel convergence analysis that resembles (albeit differs from) algorithmic stability arguments, extended to high dimensions and the distributed setting, which is of independent interest. |
Marie Maros; Gesualdo Scutari; |
1595 | CoPur: Certifiably Robust Collaborative Inference Via Feature Purification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The compromised agent either does not send embedded features to the FC, or sends arbitrarily embedded features. To address this, we propose a certifiably robust COllaborative inference framework via feature PURification (CoPur), by leveraging the block-sparse nature of adversarial perturbations on the feature vector, as well as exploring the underlying redundancy across the embedded features (by assuming the overall features lie on an underlying lower dimensional manifold). |
Jing Liu; Chulin Xie; Sanmi Koyejo; Bo Li; |
1596 | Self-Supervised Pretraining for 3D Vision Tasks By Cross-View Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by MIM, we propose an unsupervised representation learning task trained from pairs of images showing the same scene from different viewpoints. |
Philippe Weinzaepfel; Vincent Leroy; Thomas Lucas; Romain BRÉGIER; Yohann Cabon; Vaibhav ARORA; Leonid Antsfeld; Boris Chidlovskii; Gabriela Csurka; Jerome Revaud; |
1597 | On The SDEs and Scaling Rules for Adaptive Gradient Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper derives the SDE approximations for RMSprop and Adam, giving theoretical guarantees of their correctness as well as experimental validation of their applicability to common large-scale vision and language settings. |
Sadhika Malladi; Kaifeng Lyu; Abhishek Panigrahi; Sanjeev Arora; |
1598 | Density-driven Regularization for Out-of-distribution Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Other reported approaches either impose strong unproven parametric assumptions to estimate OOD sample density or develop empirical detectors lacking clear theoretical motivations. To address these issues, we propose a theoretical probabilistic framework for OOD detection in deep classification networks, in which two regularization constraints are constructed to reliably estimate sample density to identify OOD. |
Wenjian Huang; Hao Wang; Jiahao Xia; Chengyan Wang; Jianguo Zhang; |
1599 | On Batch Teaching with Sample Complexity Bounded By VCD Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the first model of batch teaching whose sample complexity is upper-bounded by VCD, and discuss which desirable properties of teaching models can be fulfilled by such a model. |
Farnam Mansouri; Hans Simon; Adish Singla; Sandra Zilles; |
1600 | Brain Network Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study Transformer-based models for brain network analysis. |
Xuan Kan; Wei Dai; Hejie Cui; Zilong Zhang; Ying Guo; Carl Yang; |
1601 | SnAKe: Bayesian Optimization with Pathwise Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper investigates the problem and introduces ‘Sequential Bayesian Optimization via Adaptive Connecting Samples’ (SnAKe), which provides a solution by considering large batches of queries and preemptively building optimization paths that minimize input costs. |
Jose Pablo Folch; Shiqiang Zhang; Robert Lee; Behrang Shafei; David Walz; Calvin Tsay; Mark van der Wilk; Ruth Misener; |
1602 | Pruning Neural Networks Via Coresets and Convex Geometry: Towards No Assumptions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we suggest a novel and robust framework for computing such coresets under mild assumptions on the model’s weights and without any assumption on the training data. |
Murad Tukan; Loay Mualem; Alaa Maalouf; |
1603 | DOPE: Doubly Optimistic and Pessimistic Exploration for Safe Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the approach is so pessimistic that empirical results indicate an inordinately long learning process that keeps on applying sequences of the safe baseline policy. Our key insight is that such excessive pessimism hinders exploration, and needs to be combated by optimism with respect to the model. |
Archana Bura; Aria HasanzadeZonuzy; Dileep Kalathil; Srinivas Shakkottai; Jean-Francois Chamberland; |
1604 | Learning to Share in Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the problem of networked multi-agent reinforcement learning (MARL), where a number of agents are deployed as a partially connected network and each interacts only with nearby agents. |
Yuxuan Yi; Ge Li; Yaowei Wang; Zongqing Lu; |
1605 | Domain Generalization By Learning and Removing Domain-specific Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new approach that aims to explicitly remove domain-specific features for domain generalization. |
Yu Ding; Lei Wang; Bin Liang; Shuming Liang; Yang Wang; Fang Chen; |
1606 | On The Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural networks with a single hidden layer in a binary classification setting. |
Itay Safran; Gal Vardi; Jason Lee; |
1607 | Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an adaptive variance-reduction method, called AdaSpider, for minimization of $L$-smooth, non-convex functions with a finite-sum structure. |
Ali Kavis; EFSTRATIOS SKOULAKIS; Kimon Antonakopoulos; Leello Tadesse Dadi; Volkan Cevher; |
1608 | Conformal Off-Policy Prediction in Contextual Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In addition, particularly in safety-critical settings, stronger guarantees than asymptotic correctness may be required. To address these limitations, we consider a novel application of conformal prediction to contextual bandits. |
Muhammad Faaiz Taufiq; Jean-Francois Ton; Rob Cornish; Yee Whye Teh; Arnaud Doucet; |
1609 | Pre-Trained Language Models for Interactive Decision-Making Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an approach for using LMs to scaffold learning and generalization in general sequential decision-making problems. |
Shuang Li; Xavier Puig; Chris Paxton; Yilun Du; Clinton Wang; Linxi Fan; Tao Chen; De-An Huang; Ekin Akyürek; Anima Anandkumar; Jacob Andreas; Igor Mordatch; Antonio Torralba; Yuke Zhu; |
1610 | Local-Global MCMC Kernels: The Best of Both Worlds Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In the present paper we study an Explore-Exploit Markov chain Monte Carlo strategy ($\operatorname{Ex^2MCMC}$) that combines local and global samplers showing that it enjoys the advantages of both approaches. |
Sergey Samsonov; Evgeny Lagutin; Marylou Gabrié; Alain Durmus; Alexey Naumov; Eric Moulines; |
1611 | Provably Sample-efficient RL with Side Information About Latent Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Crucially, we assume no prior knowledge about the structure of observations in the target domain except that they can be used to identify the latent states (but the decoding map is unknown). Under these assumptions, we present an algorithm, called TASID, that learns a robust policy in the target domain, with sample complexity that is polynomial in the horizon, and independent of the number of states, which is not possible without access to some prior knowledge. |
Yao Liu; Dipendra Misra; Miro Dudik; Robert Schapire; |
1612 | A Direct Approximation of AIXI Using Logical State Abstractions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a practical integration of logical state abstraction with AIXI, a Bayesian optimality notion for reinforcement learning agents, to significantly expand the model class that AIXI agents can be approximated over to complex history-dependent and structured environments. |
Samuel Yang-Zhao; Tianyu Wang; Kee Siong Ng; |
1613 | Delving Into OOD Detection with Vision-Language Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Particularly, we propose Maximum Concept Matching (MCM), a simple yet effective zero-shot OOD detection method based on aligning visual features with textual concepts. |
Yifei Ming; Ziyang Cai; Jiuxiang Gu; Yiyou Sun; Wei Li; Yixuan Li; |
1614 | A Sharp NMF Result with Applications in Network Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While most existing works focused on the case of $m = 0$, our primary interest is in the case of general $m$. With new proof ideas we develop, we present sharp results on when the NMF problem is solvable, which significantly extend existing results on this topic. |
Jiashun Jin; |
1615 | What Makes A Good Code? A New Look At LSH From Random Fourier Features Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we revisit binary hashing from RFF, and study SignRFF, a simple strategy to extract RFF-based binary codes. |
Xiaoyun Li; Ping Li; |
1616 | Wasserstein Logistic Regression with Mixed Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that distributionally robust logistic regression with mixed (i.e., numerical and categorical) features, despite amounting to an optimization problem of exponential size, admits a polynomial-time solution scheme. |
Aras Selvi; Mohammad Reza Belbasi; Martin Haugh; Wolfram Wiesemann; |
1617 | Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we design an Intermediate Prototype Mining Transformer (IPMT) to learn the prototype in an iterative way. |
YUANWEI LIU; Nian Liu; Xiwen Yao; Junwei Han; |
1618 | Obj2Seq: Formatting Objects As Sequences with Class Prompt for Visual Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an object-centric vision framework, Obj2Seq. |
Zhiyang Chen; Yousong Zhu; Zhaowen Li; Fan Yang; Wei Li; Haixin Wang; Chaoyang Zhao; Liwei Wu; Rui Zhao; Jinqiao Wang; Ming Tang; |
1619 | AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent real-world super-resolution approaches typically rely on degradation simulation using basic operators without any learning capability, such as blur, noise, and compression. In this work, we propose to learn such basic operators from real low-quality animation videos, and incorporate the learned ones into the degradation generation pipeline. |
Yanze Wu; Xintao Wang; GEN LI; Ying Shan; |
1620 | Semantic Probabilistic Layers for Neuro-Symbolic Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We design a predictive layer for structured-output prediction (SOP) that can be plugged into any neural network, guaranteeing that its predictions are consistent with a set of predefined symbolic constraints. |
Kareem Ahmed; Stefano Teso; Kai-Wei Chang; Guy Van den Broeck; Antonio Vergari; |
1621 | Versatile Multi-stage Graph Neural Network for Circuit Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel way to integrate the multiple information sources under a unified heterogeneous graph named Circuit Graph, where topological and geometrical information is well integrated. |
shuwen yang; Zhihao Yang; Dong Li; Yingxueff Zhang; Zhanguang Zhang; Guojie Song; Jianye Hao; |
1622 | Video-based Human-Object Interaction Detection from Tubelet Tokens Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel vision Transformer, named TUTOR, which learns tubelet tokens that serve as highly abstracted spatio-temporal representations for video-based human-object interaction (V-HOI) detection. |
Danyang Tu; Wei Sun; Xiongkuo Min; Guangtao Zhai; Wei Shen; |
1623 | Towards Improving Faithfulness in Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a Faithfulness Enhanced Summarization model (FES), which is designed for addressing these two problems and improving faithfulness in abstractive summarization. |
Xiuying Chen; Mingzhe Li; Xin Gao; Xiangliang Zhang; |
1624 | Explaining A Reinforcement Learning Agent Via Prototyping Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a prototype-based post-hoc policy explainer, ProtoX, that explains a black-box agent by prototyping the agent’s behaviors into scenarios, each represented by a prototypical state. |
Ronilo Ragodos; Qihang Lin; Xun Zhou; Tong Wang; |
1625 | Faster Stochastic Algorithms for Minimax Optimization Under Polyak-Łojasiewicz Condition Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: This paper considers stochastic first-order algorithms for minimax optimization under Polyak-Łojasiewicz (PL) conditions. We propose SPIDER-GDA for solving the finite-sum … |
Lesi Chen; Boyuan Yao; Luo Luo; |
1626 | Causally Motivated Multi-shortcut Identification and Removal Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Instead, we propose a two step approach to (1) efficiently identify relevant shortcuts, and (2) leverage the identified shortcuts to build models that are robust to distribution shifts. |
Jiayun Zheng; Maggie Makar; |
1627 | Non-stationary Transformers: Rethinking The Stationarity in Time Series Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle the dilemma between series predictability and model capability, we propose Non-stationary Transformers as a generic framework with two interdependent modules: Series Stationarization and De-stationary Attention. |
Yong Liu; Haixu Wu; Jianmin Wang; Mingsheng Long; |
1628 | A Theory of Weight Distribution-constrained Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The emerging high-quality large structural datasets raise the question of what general functional principles can be gleaned from them. Motivated by this question, we developed a statistical mechanical theory of learning in neural networks that incorporates structural information as constraints. |
Weishun Zhong; Ben Sorscher; Daniel Lee; Haim Sompolinsky; |
1629 | Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a simple yet effective fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes, termed FCOS-LiDAR. |
Zhi Tian; Xiangxiang Chu; Xiaoming Wang; Xiaolin Wei; Chunhua Shen; |
1630 | Counterfactual Fairness with Partially Known Causal Graph Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a general method to achieve the notion of counterfactual fairness when the true causal graph is unknown. |
Aoqi Zuo; Susan Wei; Tongliang Liu; Bo Han; Kun Zhang; Mingming Gong; |
1631 | Look More But Care Less in Video Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing action recognition methods typically sample a few frames to represent each video to avoid the enormous computation, which often limits the recognition performance. To tackle this problem, we propose Ample and Focal Network (AFNet), which is composed of two branches to utilize more frames but with less computation. |
Yitian Zhang; Yue Bai; Huan Wang; Yi Xu; Yun Fu; |
1632 | Optimistic Mirror Descent Either Converges to Nash or to Strong Coarse Correlated Equilibria in Bimatrix Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that when both players in a general-sum game employ optimistic mirror descent, either the dynamics lead to a Nash equilibrium, or both players experience regret that decays linearly. |
Ioannis Anagnostides; Gabriele Farina; Ioannis Panageas; Tuomas Sandholm; |
1633 | Unsupervised Learning Under Latent Label Shift Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce unsupervised learning under Latent Label Shift (LLS), where the label marginals $p_d(y)$ shift but the class conditionals $p(x|y)$ do not. |
Manley Roberts; Pranav Mani; Saurabh Garg; Zachary Lipton; |
1634 | A Spectral Approach to Item Response Theory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new spectral method for the item estimation problem under the Rasch model, one of the most fundamental models in item response theory; our algorithm enjoys favorable theoretical guarantees and achieves competitive numerical performance. |
Duc Nguyen; Anderson Ye Zhang; |
1635 | On Privacy and Personalization in Cross-Silo Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we instead consider the more realistic notion of silo-specific item-level privacy, where silos set their own privacy targets for local examples. |
Ken Liu; Shengyuan Hu; Steven Wu; Virginia Smith; |
1636 | TUSK: Task-Agnostic Unsupervised Keypoints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We thus propose a novel method to learn Task-agnostic, UnSupervised Keypoints (TUSK) which can deal with multiple instances. |
Yuhe Jin; Weiwei Sun; Jan Hosang; Eduard Trulls; Kwang Moo Yi; |
1637 | Fair and Optimal Decision Trees: A Dynamic Programming Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, state-of-the-art algorithms for fair and optimal decision trees have scalability issues, often requiring several hours to find such trees even for small datasets. In contrast to these state-of-the-art methods that use mixed integer programming, we propose a method that exploits the tree structure using dynamic programming. |
Jacobus van der Linden; Mathijs de Weerdt; Emir Demirović; |
1638 | Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Instance-Based Uncertainty estimation for Gradient-boosted regression trees (IBUG), a simple method for extending any GBRT point predictor to produce probabilistic predictions. |
Jonathan Brophy; Daniel Lowd; |
1639 | Predicting Label Distribution from Multi-label Ranking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the problem of predicting label distribution from multi-label ranking which is a compromise w.r.t. annotation cost but has good guarantees for performance. |
Yunan Lu; Xiuyi Jia; |
1640 | Representing Spatial Trajectories As Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a representation learning framework for spatial trajectories. |
Didac Suris Coll-Vinent; Carl Vondrick; |
1641 | Weisfeiler and Leman Go Walking: Random Walk Kernels Revisited Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We give a unified view on both classes of graph kernels. |
Nils M. Kriege; |
1642 | RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we observe that the singular value distributions of the in-distribution (ID) and OOD features are quite different: the OOD feature matrix tends to have a larger dominant singular value than the ID feature, and the class predictions of OOD samples are largely determined by it. |
Yue Song; Nicu Sebe; Wei Wang; |
1643 | Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that a prior-free autonomous data augmentation’s objective can be derived from a representation learning principle that aims to preserve the minimum sufficient information of the labels. |
Kaiwen Yang; Yanchao Sun; Jiahao Su; Fengxiang He; Xinmei Tian; Furong Huang; Tianyi Zhou; Dacheng Tao; |
1644 | Towards Disentangling Information Paths with Coded ResNeXt Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a neural network architecture for classification in which the information that is relevant to each class flows through specific paths. |
Apostolos Avranas; Marios Kountouris; |
1645 | In The Eye of The Beholder: Robust Prediction with Causal User Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a learning framework for relevance prediction that is robust to changes in the data distribution. |
Amir Feder; Guy Horowitz; Yoav Wald; Roi Reichart; Nir Rosenfeld; |
1646 | LSAR: Efficient Leverage Score Sampling Algorithm for The Analysis of Big Time Series Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. |
Ali Eshragh; Fred Roosta; Asef Nazari; Michael Mahoney; |
1647 | Generalized Delayed Feedback Model with Post-Click Information in Recommender Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a generalized delayed feedback model (GDFM) that unifies both post-click behaviors and early conversions as stochastic post-click information, which could be utilized to train GDFM in a streaming manner efficiently. |
Jiaqi Yang; De-Chuan Zhan; |
1648 | Efficient Learning of Nonlinear Prediction Models with Time-series Privileged Information Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide new insights into this analysis and generalize it to nonlinear prediction tasks in latent dynamical systems, extending theoretical guarantees to the case where the map connecting latent variables and observations is known up to a linear transform. |
Bastian Jung; Fredrik Johansson; |
1649 | Matryoshka Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our main contribution is Matryoshka Representation Learning (MRL) which encodes information at different granularities and allows a single embedding to adapt to the computational constraints of downstream tasks. |
Aditya Kusupati; Gantavya Bhatt; Aniket Rege; Matthew Wallingford; Aditya Sinha; Vivek Ramanujan; William Howard-Snyder; Kaifeng Chen; Sham Kakade; Prateek Jain; Ali Farhadi; |
1650 | 4D Unsupervised Object Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose 4D unsupervised object discovery, jointly discovering objects from 4D data: 3D point clouds and 2D RGB images with temporal information. |
Yuqi Wang; Yuntao Chen; ZHAO-XIANG ZHANG; |
1651 | TransTab: Learning Transferable Tabular Transformers Across Tables Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We are the first to fulfill pretraining, transfer learning, feature incremental learning, and zero-shot predictions across tabular datasets based on transferable tabular transformers (TransTab). |
Zifeng Wang; Jimeng Sun; |
1652 | How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new theoretical understanding of how MAE works and why the choice of mask ratio is so important for MAE from a graph perspective. |
Qi Zhang; Yifei Wang; Yisen Wang; |
1653 | Self-Supervised Aggregation of Diverse Experts for Test-Agnostic Long-Tailed Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study a more practical yet challenging task, called test-agnostic long-tailed recognition, where the training class distribution is long-tailed while the test class distribution is agnostic and not necessarily uniform. |
Yifan Zhang; Bryan Hooi; Lanqing Hong; Jiashi Feng; |
1654 | Phase Transition from Clean Training to Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, numerous empirical results show a great performance degradation from clean training to adversarial training (e.g., 90+% vs 67% testing accuracy on the CIFAR-10 dataset), which does not match the theoretical guarantee delivered by the existing studies. Such a gap inspires us to explore the existence of a phase transition phenomenon with respect to the attack strength: adversarial training is as well behaved as clean training in the small-attack regime, but there is a sharp transition from clean training to adversarial training in the large-attack regime. |
Yue Xing; Qifan Song; Guang Cheng; |
1655 | SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present SegNeXt, a simple convolutional network architecture for semantic segmentation. |
Meng-Hao Guo; Cheng-Ze Lu; Qibin Hou; Zhengning Liu; Ming-Ming Cheng; Shi-min Hu; |
1656 | Self-Supervised Visual Representation Learning with Semantic Grouping Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning. |
Xin Wen; Bingchen Zhao; Anlin Zheng; Xiangyu Zhang; Xiaojuan Qi; |
1657 | Online Neural Sequence Detection with Hierarchical Dirichlet Point Process Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle these limitations, we propose a hierarchical Dirichlet point processes model for efficient neural sequence detection. |
Weihan Li; Yu Qi; Gang Pan; |
1658 | Modeling Transitivity and Cyclicity in Directed Graphs Via Binary Code Box Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose binary code box embeddings, where a learned binary code selects a subset of graphs for intersection. |
Dongxu Zhang; Michael Boratko; Cameron Musco; Andrew McCallum; |
1659 | Structural Pruning Via Latency-Saliency Knapsack Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Latency-Aware Structural Pruning (LASP) that formulates structural pruning as a global resource allocation optimization problem, aiming at maximizing the accuracy while constraining latency under a predefined budget. |
Maying Shen; Hongxu Yin; Pavlo Molchanov; Lei Mao; Jianna Liu; Jose M. Alvarez; |
1660 | $k$-Sliced Mutual Information: A Quantitative Study of Scalability with Dimension Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Using a new result on the continuity of differential entropy in the 2-Wasserstein metric, we derive sharp bounds on the error of Monte Carlo (MC)-based estimates of $k$-SMI, with explicit dependence on $k$ and the ambient dimension, revealing their interplay with the number of samples. |
Ziv Goldfeld; Kristjan Greenewald; Theshani Nuradha; Galen Reeves; |
1661 | On Infinite Separations Between Simple and Optimal Mechanisms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We characterize the structure of (correlated) distributions that witness an infinite separation between simple and optimal mechanisms. |
Alexandros Psomas; Ariel Schvartzman Cohenca; S. Weinberg; |
1662 | Transfer Learning in Information Criteria-based Feature Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a procedure that combines transfer learning with Mallows’ Cp (TLCp) and prove that it outperforms the conventional Mallows’ Cp criterion in terms of accuracy and stability. |
Shaohan Chen; Nikolaos V Sahinidis; Chuanhou Gao; |
1663 | Biologically Plausible Solutions for Spiking Networks with Efficient Coding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we revisit the theory of efficient coding with spikes to develop spiking neural networks that are closer to biological circuits. |
Veronika Koren; Stefano Panzeri; |
1664 | A Theoretical View on Sparsely Activated Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As our first contribution, we present a formal model of data-dependent sparse networks that captures salient aspects of popular architectures. |
Cenk Baykal; Nishanth Dikkala; Rina Panigrahy; Cyrus Rashtchian; Xin Wang; |
1665 | [Re] Projection-based Algorithm for Updating The TruncatedSVD of Evolving Matrices Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We found that our implementation of the original experiments was able to closely match the results of the paper with regards to accuracy and runtime. |
Andy Chen; Shion Matsumoto; Rohan Sinha Varma; |
1666 | Predicting Single-Cell Perturbation Responses for Unseen Drugs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a model for counterfactual modelling of perturbation responses for unseen drugs and evaluate a transfer learning scheme for improved generalisation in the single-cell setting. |
Leon Hetzel; Simon Boehm; Niki Kilbertus; Stephan Günnemann; mohammad lotfollahi; Fabian Theis; |
1667 | Score-Based Models Detect Manifolds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We find precise conditions under which SGMs are able to produce samples from an underlying (low-dimensional) data manifold $\mathcal{M}$. |
Jakiw Pidstrigach; |
1668 | SAGDA: Achieving $\mathcal{O}(\epsilon^{-2})$ Communication Complexity in Federated Min-Max Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address this challenge, in this paper, we propose a new algorithmic framework called stochastic sampling averaging gradient descent ascent (SAGDA), which i) assembles stochastic gradient estimators from randomly sampled clients as control variates and ii) leverages two learning rates on both server and client sides. |
Haibo Yang; Zhuqing Liu; Xin Zhang; Jia Liu; |
1669 | FP8 Quantization: The Power of The Exponent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, low-bit floating point numbers have an extra degree of freedom, assigning some bits to work on an exponential scale instead. This paper exhaustively investigates this benefit of the floating point format for neural network inference. |
Andrey Kuzmin; Mart van Baalen; Yuwei Ren; Markus Nagel; Jorn Peters; Tijmen Blankevoort; |
1670 | Out-of-Distribution Detection Via Conditional Kernel Independence Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we probe an alternative hypothesis on OOD detection by constructing a novel latent variable model based on independent component analysis (ICA) techniques. |
Yu Wang; Jingjing Zou; Jingyang Lin; Qing Ling; Yingwei Pan; Ting Yao; Tao Mei; |
1671 | Neural Abstractions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a novel method for the safety verification of non-linear dynamical systems. |
Alessandro Abate; Alec Edwards; Mirco Giacobbe; |
1672 | Information-Theoretic Safe Exploration with Gaussian Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate. |
Alessandro Bottero; Carlos Luis; Julia Vinogradska; Felix Berkenkamp; Jan Peters; |
1673 | Provably Adversarially Robust Detection of Out-of-Distribution Data (Almost) for Free Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We slightly modify the architecture of neural network classifiers such that one can obtain provable guarantees on adversarially robust OOD detection without any loss in accuracy. |
Alexander Meinke; Julian Bitterwolf; Matthias Hein; |
1674 | Benefits of Additive Noise in Composing Classes with Bounded Capacity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a theory showing that adding a little bit of noise can effectively control the capacity of composite classes. |
Alireza Fathollah Pour; Hassan Ashtiani; |
1675 | Knowledge Distillation Improves Graph Structure Augmentation for Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we take an important graph property, namely graph homophily, to analyze the distribution shifts between the two graphs and thus measure the severity of an augmentation algorithm suffering from negative augmentation. |
Lirong Wu; Haitao Lin; Yufei Huang; Stan Z. Li; |
1676 | Identifiability and Generalizability from Multiple Experts in Inverse Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work starts by showing an equivalent identifiability statement from multiple experts in tabular MDPs based on a rank condition, which is easily verifiable and is shown to be also necessary. |
Paul Rolland; Luca Viano; Norman Schürhoff; Boris Nikolov; Volkan Cevher; |
1677 | Learning Deep Input-Output Stable Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a method to learn nonlinear systems guaranteeing the input-output stability. |
Ryosuke Kojima; Yuji Okamoto; |
1678 | PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, in real scenarios, this assumption can be difficult to hold: the text description, obtained by crawling the affiliated metadata of the image, often suffers from semantic mismatch and mutual compatibility issues. To address these issues, we introduce PyramidCLIP, which constructs an input pyramid with different semantic levels for each modality, and aligns visual elements and linguistic elements in the form of hierarchy via peer-level semantics alignment and cross-level relation alignment. |
Yuting Gao; Jinfeng Liu; Zihan Xu; Jun Zhang; Ke Li; Rongrong Ji; Chunhua Shen; |
1679 | Convexity Certificates from Hessians Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we implement this approach for a class of functions that is rich enough to support classical machine learning. |
Joachim Giesen; Julien Klaus; Sören Laue; Niklas Merk; Konstantin Wiedom; |
1680 | RISE: Robust Individualized Decision Learning with Sensitive Variables Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces RISE, a robust individualized decision learning framework with sensitive variables, where sensitive variables are collectible data and important to the intervention decision, but their inclusion in decision making is prohibited due to reasons such as delayed availability or fairness concerns. |
Xiaoqing Tan; Zhengling Qi; Christopher Seymour; Lu Tang; |
1681 | Neural Temporal Walks: Motif-Aware Representation Learning on Continuous-Time Dynamic Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a novel method, Neural Temporal Walks (NeurTWs), for representation learning on continuous-time dynamic graphs. |
Ming Jin; Yuan-Fang Li; Shirui Pan; |
1682 | Intrinsic Dimensionality Estimation Using Normalizing Flows Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel method to estimate the intrinsic dimensionality using Normalizing Flows that scale to large datasets and high dimensions. |
Christian Horvat; Jean-Pascal Pfister; |
1683 | Learning Distributed and Fair Policies for Network Load Balancing As Markov Potential Game Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A fully distributed MARL algorithm is proposed to approximate the Nash equilibrium of the game. |
Zhiyuan Yao; Zihan Ding; |
1684 | Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a new hidden trigger attack, Sleeper Agent, which employs gradient matching, data selection, and target model re-training during the crafting process. |
Hossein Souri; Liam Fowl; Rama Chellappa; Micah Goldblum; Tom Goldstein; |
1685 | Optimal Weak to Strong Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new algorithm that constructs a strong learner from a weak learner, but uses less training data than AdaBoost and all other weak to strong learners to achieve the same generalization bounds. |
Kasper Green Larsen; Martin Ritzert; |
1686 | Deformable Vision Transformer Based Single-Stage Pedestrian Detector Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a single-stage anchor-free pedestrian detector with enhanced spatial and multi-scale features based on the deformable vision transformer aiming to achieve the balance between speed and accuracy. |
Jing Yuan; Panagiotis Barmpoutis; Tania Stathaki; |
1687 | On The Robustness of Graph Neural Diffusion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we explore the robustness properties of graph neural PDEs. |
Yang Song; Qiyu Kang; Sijie Wang; Kai Zhao; Wee Peng Tay; |
1688 | Graph Neural Network Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the bandit optimization problem with the reward function defined over graph-structured data. |
Parnian Kassraie; Andreas Krause; Ilija Bogunovic; |
1689 | What Makes Graph Neural Networks Miscalibrated? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we conduct a systematic study on the calibration qualities of GNN node predictions. |
Hans Hao-Hsun Hsu; Yuesong Shen; Christian Tomani; Daniel Cremers; |
1690 | Neural Network Architecture Beyond Width and Depth Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a new neural network architecture by introducing an additional dimension called height beyond width and depth. |
Shijun Zhang; Zuowei Shen; Haizhao Yang; |
1691 | The Hessian Screening Rule Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we present a new screening rule for solving the lasso path: the Hessian Screening Rule. |
Johan Larsson; Jonas Wallin; |
1692 | Graph Convolution Network Based Recommender Systems: Learning Guarantee and Item Mixture Powered Strategy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we take a first step towards establishing a generalization guarantee for GCN-based recommendation models under inductive and transductive learning. |
Leyan Deng; Defu Lian; Chenwang Wu; Enhong Chen; |
1693 | Combinatorial Bandits with Linear Constraints: Beyond Knapsacks and Fairness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our model generalizes and unifies several prominent lines of work, including bandits with fairness constraints, bandits with knapsacks (BwK), etc. We propose an upper-confidence bound LP-style algorithm for this problem, called UCB-LP, and prove that it achieves a logarithmic problem-dependent regret bound and zero constraint violations in expectation. |
Qingsong Liu; Weihang Xu; Siwei Wang; Zhixuan Fang; |
1694 | Unifying and Boosting Gradient-Based Training-Free Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, this paper presents a unified theoretical analysis of gradient-based training-free NAS, which allows us to (a) theoretically study their relationships, (b) theoretically guarantee their generalization performances, and (c) exploit our unified theoretical understanding to develop a novel framework named hybrid NAS (HNAS) which consistently boosts training-free NAS in a principled way. |
YAO SHU; Zhongxiang Dai; Zhaoxuan Wu; Bryan Kian Hsiang Low; |
1695 | Robust Feature-Level Adversaries Are Interpretability Tools Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We produce feature-level adversarial attacks using a deep image generator. They have a wide range of capabilities, and they are effective for studying feature/class (mis)associations in networks. |
Stephen Casper; Max Nadeau; Dylan Hadfield-Menell; Gabriel Kreiman; |
1696 | On Elimination Strategies for Bandit Fixed-Confidence Identification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that adaptive methods can be modified to use elimination in both their stopping and sampling rules, hence obtaining the best of these two worlds: the algorithms (1) remain fully adaptive, (2) suffer a sample complexity that is never worse than that of their non-elimination counterpart, and (3) provably eliminate certain wrong answers early. |
Andrea Tirinzoni; Rémy Degenne; |
1697 | The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present our empirical and theoretical discoveries on non-contrastive self-supervised learning. |
Zixin Wen; Yuanzhi Li; |
1698 | Quantum Speedups of Optimizing Approximately Convex Functions with Applications to Logarithmic Regret Stochastic Convex Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We initiate the study of quantum algorithms for optimizing approximately convex functions and prove a polynomial quantum speedup in dimension $n$. This can be applied to zeroth-order stochastic convex bandits with an exponential speedup in the iteration number $T$. |
Tongyang Li; Ruizhe Zhang; |
1699 | “Why Not Other Classes?”: Towards Class-Contrastive Back-Propagation Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To capture features with true class-discriminative power, we should instead ask “why is the input classified into this class, but not others?” To answer this question, we propose a weighted contrastive framework for explaining DNNs. |
Yipei Wang; Xiaoqian Wang; |
1700 | Lower Bounds and Nearly Optimal Algorithms in Distributed Learning with Communication Compression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In the smooth and non-convex stochastic regime, this paper establishes a lower bound for distributed algorithms whether using unbiased or contractive compressors in unidirection or bidirection. |
Xinmeng Huang; Yiming Chen; Wotao Yin; Kun Yuan; |
1701 | Continuous Deep Q-Learning in Optimal Control Problems: Normalized Advantage Functions Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider reinforcement learning problems obtained by the discretization of certain optimal control problems. Based on the idea of NAF, we present a new family of quadratic functions and prove its suitable approximation properties. |
Anton Plaksin; Stepan Martyanov; |
1702 | Distributed Distributionally Robust Optimization with Non-Convex Objectives Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose an asynchronous distributed algorithm, named Asynchronous Single-looP alternatIve gRadient projEction (ASPIRE) algorithm with the itErative Active SEt method (EASE) to tackle the distributed distributionally robust optimization (DDRO) problem. |
Yang Jiao; Kai Yang; Dongjin Song; |
1703 | Quantile Constrained Reinforcement Learning: A Reinforcement Learning Framework Constraining Outage Probability Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a framework, named Quantile Constrained RL (QCRL), to constrain the quantile of the distribution of the cumulative sum cost, which is a necessary and sufficient condition to satisfy the outage constraint. |
Whiyoung Jung; Myungsik Cho; Jongeui Park; Youngchul Sung; |
1704 | Generative Neural Articulated Radiance Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These 3D GANs, however, have not been demonstrated for human bodies and the generated radiance fields of existing frameworks are not directly editable, limiting their applicability in downstream tasks. We propose a solution to these challenges by developing a 3D GAN framework that learns to generate radiance fields of human bodies or faces in a canonical pose and warp them using an explicit deformation field into a desired body pose or facial expression. |
Alexander Bergman; Petr Kellnhofer; Wang Yifan; Eric Chan; David Lindell; Gordon Wetzstein; |
1705 | Exposing and Exploiting Fine-Grained Block Structures for Fast and Accurate Sparse Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an algorithm that keeps adapting the sparse model while maintaining the active parameters in shuffled blocks. |
Peng Jiang; Lihan Hu; Shihui Song; |
1706 | Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we find that the performance of Transformer models as sentence encoders can be improved by training with multi-modal multi-task losses, using unpaired examples from another modality (e.g., sentences and unrelated image/audio data). |
Yiren Jian; Chongyang Gao; Soroush Vosoughi; |
1707 | A Combinatorial Perspective on The Optimization of Shallow ReLU Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The NP-hard problem of optimizing a shallow ReLU network can be characterized as a combinatorial search over the activation pattern for each training example followed by a constrained convex problem given a fixed set of activation patterns. We explore the implications of this combinatorial aspect of ReLU optimization in this work. |
Michael S Matena; Colin Raffel; |
1708 | Physics-Informed Implicit Representations of Network Flows Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an implicit neural network layer that incorporates two fundamental physical laws: conservation of mass, and the existence of a constitutive relationship between edge flows and nodal states (e.g., Ohm’s law). |
Kevin D. Smith; Francesco Seccamonte; Ananthram Swami; Francesco Bullo; |
1709 | Rapidly Mixing Multiple-try Metropolis Algorithms for Model Selection Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We prove that MTM can achieve a mixing time bound smaller than that of MH by a factor of the number of trials under a general setting applicable to high-dimensional model selection problems. |
Hyunwoong Chang; Changwoo Lee; Zhao Tang Luo; Huiyan Sang; Quan Zhou; |
1710 | Falconn++: A Locality-sensitive Filtering Approach for Approximate Nearest Neighbor Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Falconn++, a novel locality-sensitive filtering (LSF) approach for approximate nearest neighbor search on angular distance. |
Ninh Pham; Tao Liu; |
1711 | Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the generalized low-rank matrix bandit problem, which was recently proposed by Lu et al. (2021) under the Generalized Linear Model (GLM) framework. |
Yue Kang; Cho-Jui Hsieh; Thomas Chun Man Lee; |
1712 | Knowledge Distillation: Bad Models Can Be Good Role Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that a model trained on noisy data can be a good teacher when unlabeled data is ample even when the teacher has noisy predictions. |
Gal Kaplun; Eran Malach; Preetum Nakkiran; Shai Shalev-Shwartz; |
1713 | Spatial Mixture-of-Experts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Further, many works that do incorporate locality fail to capture fine-grained structure. To address this, we introduce the Spatial Mixture-of-Experts (SMoE) layer, a sparsely-gated layer that learns spatial structure in the input domain and routes experts at a fine-grained level to utilize it. |
Nikoli Dryden; Torsten Hoefler; |
1714 | EF-BV: A Unified Theory of Error Feedback and Variance Reduction Mechanisms for Biased and Unbiased Compression in Distributed Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These two classes of compression schemes and algorithms are distinct, with different analyses and proof techniques. In this paper, we unify them into a single framework and propose a new algorithm, recovering DIANA and EF21 as particular cases. |
Laurent Condat; Kai Yi; Peter Richtarik; |
1715 | Depth Is More Powerful Than Width in Deep Forest Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we analyze the influence of depth and width on the consistency of cascade forests. |
Shen-Huan Lyu; Yi-Xiao He; Zhi-Hua Zhou; |
1716 | [Re] Explaining in Style: Training A GAN to Explain A Classifier in StyleSpace Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our experimental results support the claims posed in the original paper – the attributes detected by StylEx are identifiable by humans to a certain degree, distinct and sufficient. |
Chase van de Geijn; Victor Kyriacou; Irene Papadopoulou; Vasiliki Vasileiou; |
1717 | Rate-Optimal Online Convex Optimization in Adaptive Linear Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present the first computationally-efficient algorithm that attains an optimal $\sqrt{T}$-regret rate compared to the best stabilizing linear controller in hindsight, while avoiding stringent assumptions on the costs such as strong convexity. |
Asaf Benjamin Cassel; Alon Peled-Cohen; Tomer Koren; |
1718 | Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The main result here is a complete characterization of this phenomenon under a notion termed commuting parametrization, which encompasses all the previous results in this setting. |
Zhiyuan Li; Tianhao Wang; Jason Lee; Sanjeev Arora; |
1719 | Learning with Little Mixing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study square loss in a realizable time-series framework with martingale difference noise. |
Ingvar Ziemann; Stephen Tu; |
1720 | Globally Optimal Algorithms for Fixed-Budget Best Arm Identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the fixed-budget best arm identification problem where the goal is to find the arm with the largest mean using a fixed number of samples. |
Junpei Komiyama; Taira Tsuchiya; Junya Honda; |
1721 | Evaluating Robustness to Dataset Shift Via Parametric Robustness Sets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We give a method for proactively identifying small, plausible shifts in distribution which lead to large differences in model performance. |
Michael Oberst; Nikolaj Thams; David Sontag; |
1722 | Incorporating Bias-aware Margins Into Contrastive Loss for Collaborative Filtering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To reduce the negative impact of popularity bias on CF models, we incorporate Bias-aware margins into Contrastive loss and propose a simple yet effective BC Loss, where the margin is tailored quantitatively to the bias degree of each user-item interaction. |
An Zhang; Wenchang Ma; Xiang Wang; Tat-Seng Chua; |
1723 | Hierarchical Graph Transformer with Adaptive Node Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we identify the main deficiencies of current graph transformers: (1) Existing node sampling strategies in Graph Transformers are agnostic to the graph characteristics and the training process. |
ZAIXI ZHANG; Qi Liu; Qingyong Hu; Chee-Kong Lee; |
1724 | Mutual Information Divergence: A Unified Metric for Multimodal Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on a recent trend in which multimodal generative evaluations exploit a vision-and-language pre-trained model, we propose the negative Gaussian cross-mutual information using the CLIP features as a unified metric, coined Mutual Information Divergence (MID). |
Jin-Hwa Kim; Yunji Kim; Jiyoung Lee; Kang Min Yoo; Sang-Woo Lee; |
1725 | Are Two Heads The Same As One? Identifying Disparate Treatment in Fair Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The paper identifies disparate treatment in fair deep neural networks. |
Michael Lohaus; Matthäus Kleindessner; Krishnaram Kenthapadi; Francesco Locatello; Chris Russell; |
1726 | Fast and Robust Rank Aggregation Against Model Misspecification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose CoarsenRank, which possesses robustness against model misspecification. |
YUANGANG PAN; Ivor W. Tsang; Weijie Chen; Gang Niu; Masashi Sugiyama; |
1727 | Self-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a multi-granularity map, which contains both object fine-grained details (e.g., color, texture) and semantic classes, to represent objects more comprehensively. |
Peihao Chen; Dongyu Ji; Kunyang Lin; Runhao Zeng; Thomas Li; Mingkui Tan; Chuang Gan; |
1728 | Two-Stream Network for Sign Language Recognition and Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To learn more meaningful representations and incorporate domain knowledge, such as handshape and facial expressions, we introduce a dual visual encoder containing two separate streams to model both the raw videos and the keypoint sequences generated by an off-the-shelf keypoint estimator. |
Yutong Chen; Ronglai Zuo; Fangyun Wei; Yu Wu; Shujie LIU; Brian Mak; |
1729 | TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a novel nearest neighbor search algorithm achieving TPU (Google Tensor Processing Unit) peak performance, outperforming state-of-the-art GPU algorithms with similar level of recall. |
Felix Chern; Blake Hechtman; Andy Davis; Ruiqi Guo; David Majnemer; Sanjiv Kumar; |
1730 | Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The contributions of this paper are two-fold. First, we establish the relationship between the celebrated Goldstein subdifferential (Goldstein, 1977) and uniform smoothing, thereby providing the basis and intuition for the design of gradient-free methods that guarantee finite-time convergence to a set of Goldstein stationary points. Second, we propose the gradient-free method (GFM) and stochastic GFM for solving a class of nonsmooth nonconvex optimization problems and prove that both of them can return a $(\delta,\epsilon)$-Goldstein stationary point of a Lipschitz function $f$ at an expected convergence rate of $O(d^{3/2}\delta^{-1}\epsilon^{-4})$, where $d$ is the problem dimension. |
Tianyi Lin; Zeyu Zheng; Michael Jordan; |
1731 | Flexible Diffusion Modeling of Long Videos Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a framework for video modeling based on denoising diffusion probabilistic models that produces long-duration video completions in a variety of realistic environments. |
William Harvey; Saeid Naderiparizi; Vaden Masrani; Christian Weilbach; Frank Wood; |
1732 | Basis Encoded Polynomial Neural Fields for Subband Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new class of neural fields called basis-encoded polynomial neural fields (PNFs). |
Guandao Yang; Sagie Benaim; Varun Jampani; Kyle Genova; Jonathan Barron; Thomas Funkhouser; Serge Belongie; Bharath Hariharan; |
1733 | Subsidiary Prototype Alignment for Universal Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We address negative-transfer in Universal DA with BoW-inspired word-prototypes and subsidiary alignment via a word-related pretext task. |
Jogendra Nath Kundu; Suvaansh Bhambri; Akshay R Kulkarni; Hiran Sarkar; Varun Jampani; Venkatesh Babu R; |
1734 | SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image Collections Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a joint optimization framework to estimate the shape, BRDF, and per-image camera pose and illumination. |
Mark Boss; Andreas Engelhardt; Abhishek Kar; Yuanzhen Li; Deqing Sun; Jonathan Barron; Hendrik PA Lensch; Varun Jampani; |
1735 | Understanding Square Loss in Training Overparametrized Neural Network Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we contribute to the theoretical understanding of square loss in classification by systematically investigating how it performs for overparametrized neural networks in the neural tangent kernel (NTK) regime. |
Tianyang Hu; Jun WANG; Wenjia Wang; Zhenguo Li; |
1736 | Bridging The Gap: Unifying The Training and Evaluation of Neural Network Binary Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a unifying approach to training neural network binary classifiers that combines a differentiable approximation of the Heaviside function with a probabilistic view of the typical confusion matrix values using soft sets. |
Nathan Tsoi; Kate Candon; Deyuan Li; Yofti Milkessa; Marynel Vázquez; |
1737 | Communication-efficient Distributed Eigenspace Estimation with Arbitrary Node Failures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop an eigenspace estimation algorithm for distributed environments with arbitrary node failures, where a subset of computing nodes can return structurally valid but otherwise arbitrarily chosen responses. |
Vasileios Charisopoulos; Anil Damle; |
1738 | Hierarchical Classification at Multiple Operating Points Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an efficient algorithm to produce operating characteristic curves for any method that assigns a score to every class in the hierarchy. |
Jack Valmadre; |
1739 | Aligning Human and Machine Vision Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Across 85 different DNNs and three independent datasets measuring human visual strategies on ImageNet, we find a trade-off between DNN top-1 categorization accuracy and their alignment with humans. |
Thomas FEL; Ivan F Rodriguez Rodriguez; Drew A Linsley; Thomas Serre; |
1740 | PopArt: Efficient Sparse Regression and Experimental Design for Optimal Sparse Linear Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we devise a simple, novel sparse linear estimation method called $\textrm{PopArt}$ that enjoys a tighter $\ell_1$ recovery guarantee compared to Lasso (Tibshirani, 1996). |
Kyoungseok Jang; Chicheng Zhang; Kwang-Sung Jun; |
1741 | Deep Equilibrium Approaches to Diffusion Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we look at diffusion models through a different perspective, that of a (deep) equilibrium (DEQ) fixed point model. |
Ashwini Pokle; Zhengyang Geng; J. Zico Kolter; |
1742 | Geometric Order Learning for Rank Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A novel approach to rank estimation, called geometric order learning (GOL), is proposed in this paper. |
Seon-Ho Lee; Nyeong Ho Shin; Chang-Su Kim; |
1743 | Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present novel analyses that improve their regret bounds significantly. |
Yeoneung Kim; Insoon Yang; Kwang-Sung Jun; |
1744 | Rethinking Alignment in Video Super-Resolution Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Nevertheless, such designs will dramatically increase the computational burden, and cannot deal with large motions. Therefore, we propose a new and efficient alignment method called patch alignment, which aligns image patches instead of pixels. |
Shuwei Shi; Jinjin Gu; Liangbin Xie; Xintao Wang; Yujiu Yang; Chao Dong; |
1745 | Exploring The Latent Space of Autoencoders with Interventional Assays Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a framework, called latent responses, which exploits the locally contractive behavior exhibited by variational autoencoders to explore the learned manifold. |
Felix Leeb; Stefan Bauer; Michel Besserve; Bernhard Schölkopf; |
1746 | Embodied Scene-aware Human Pose Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose embodied scene-aware human pose estimation where we estimate 3D poses based on a simulated agent’s proprioception and scene awareness, along with external third-person observations. |
Zhengyi Luo; Shun Iwase; Ye Yuan; Kris Kitani; |
1747 | Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel stochastic estimator, which can track multiple functional mappings with stochastic samples of only O(1) functional mappings at each iteration. |
Wei Jiang; Gang Li; Yibo Wang; Lijun Zhang; Tianbao Yang; |
1748 | SecureFedYJ: A Safe Feature Gaussianization Protocol for Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the problem of applying the YJ transformation in a cross-silo Federated Learning setting under privacy constraints. |
Tanguy Marchand; Boris Muzellec; Constance Béguier; Jean Ogier du Terrail; Mathieu Andreux; |
1749 | Smoothed Online Convex Optimization Based on Discounted-Normal-Predictor Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate an online prediction strategy named as Discounted-Normal-Predictor (Kapralov and Panigrahy, 2010) for smoothed online convex optimization (SOCO), in which the learner needs to minimize not only the hitting cost but also the switching cost. |
Lijun Zhang; Wei Jiang; Jinfeng Yi; Tianbao Yang; |
1750 | Online Decision Mediation from Scratch Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of learning to mediate between (oracle) expert behavior and (imperfect) human behavior with abstentive feedback. |
Daniel Jarrett; Alihan Hüyük; Mihaela van der Schaar; |
1751 | Assaying Out-Of-Distribution Generalization in Transfer Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we take a unified view of previous work, highlighting message discrepancies that we address empirically, and providing recommendations on how to measure the robustness of a model and how to improve it. |
Florian Wenzel; Andrea Dittadi; Peter Gehler; Carl-Johann Simon-Gabriel; Max Horn; Dominik Zietlow; David Kernert; Chris Russell; Thomas Brox; Bernt Schiele; Bernhard Schölkopf; Francesco Locatello; |
1752 | Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper tackles the polyphone disambiguation problem from a concise and novel perspective: we propose Dict-TTS, a semantic-aware generative text-to-speech model with an online website dictionary (the existing prior information in the natural language). |
Ziyue Jiang; Zhe Su; Zhou Zhao; Qian Yang; Yi Ren; Jinglin Liu; 振辉 叶; |
1753 | Rethinking Resolution in The Context of Efficient Video Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we empirically study how to make the most of low-resolution frames for efficient video recognition. |
Chuofan Ma; Qiushan Guo; Yi Jiang; Zehuan Yuan; Ping Luo; Xiaojuan Qi; |
1754 | CATER: Intellectual Property Protection on Text Generation APIs Via Conditional Watermarks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel Conditional wATERmarking framework (CATER) for protecting the IP rights of text generation APIs against imitation attacks. |
Xuanli He; Qiongkai Xu; Yi Zeng; Lingjuan Lyu; Fangzhao Wu; Jiwei Li; Ruoxi Jia; |
1755 | Learning Energy Networks with Generalized Fenchel-Young Losses Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, building upon a generalized notion of conjugate function, which replaces the usual bilinear pairing with a general energy function, we propose generalized Fenchel-Young losses, a natural loss construction for learning energy networks. |
Mathieu Blondel; Felipe Llinares-Lopez; Robert Dadashi; Leonard Hussenot; Matthieu Geist; |
1756 | DaDA: Distortion-aware Domain Adaptation for Unsupervised Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a distortion-aware domain adaptation (DaDA) framework that boosts the unsupervised segmentation performance. |
Sujin Jang; Joohan Na; Dokwan Oh; |
1757 | A Deep Reinforcement Learning Framework for Column Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose RLCG, the first Reinforcement Learning (RL) approach for CG. |
Cheng Chi; Amine Aboussalah; Elias Khalil; Juyoung Wang; Zoha Sherkat-Masoumi; |
1758 | Searching for Better Spatio-temporal Alignment in Few-Shot Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to improve the performance of matching and alignment from the end-to-end design of models. |
Yichao Cao; Xiu Su; Qingfei Tang; Shan You; Xiaobo Lu; Chang Xu; |
1759 | Finite-Sample Maximum Likelihood Estimation of Location Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider 1-dimensional location estimation, where we estimate a parameter $\lambda$ from $n$ samples $\lambda + \eta_i$, with each $\eta_i$ drawn i.i.d. from a known distribution $f$. |
Shivam Gupta; Jasper Lee; Eric Price; Paul Valiant; |
1760 | Thompson Sampling Efficiently Learns to Control Diffusion Processes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish that the popular Thompson sampling algorithm learns optimal actions fast, incurring only a square-root of time regret, and also stabilizes the system in a short time period. |
Mohamad Kazem Shirani Faradonbeh; Mohamad Sadegh Shirani Faradonbeh; Mohsen Bayati; |
1761 | Stability and Generalization for Markov Chain Stochastic Gradient Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we provide a comprehensive generalization analysis of MC-SGMs for both minimization and minimax problems through the lens of algorithmic stability in the framework of statistical learning theory. |
Puyu Wang; Yunwen Lei; Yiming Ying; Ding-Xuan Zhou; |
1762 | Transformers Meet Stochastic Blockmodels: Attention with Data-Adaptive Sparsity and Cost Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose SBM-Transformer, a model that resolves both problems by endowing each attention head with a mixed-membership Stochastic Block Model (SBM). |
Sungjun Cho; Seonwoo Min; Jinwoo Kim; Moontae Lee; Honglak Lee; Seunghoon Hong; |
1763 | FreGAN: Exploiting Frequency Components for Training GANs Under Limited Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To fully utilize the frequency information of limited data, this paper proposes FreGAN, which raises the model’s frequency awareness and draws more attention to synthesising high-frequency signals, facilitating high-quality generation. |
mengping yang; Zhe Wang; Ziqiu Chi; Yanbing Zhang; |
1764 | Expansion and Shrinkage of Localization for Weakly-Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The original CAM method usually produces incomplete and inaccurate localization maps. To tackle this issue, this paper proposes an Expansion and Shrinkage scheme based on offset learning in deformable convolution, to sequentially improve the \textbf{recall} and \textbf{precision} of the located object in the two respective stages. |
JINLONG LI; Zequn Jie; Xu Wang; Xiaolin Wei; Lin Ma; |
1765 | KSD Aggregated Goodness-of-fit Test Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a strategy to construct a test, called KSDAgg, which aggregates multiple tests with different kernels. |
Antonin Schrab; Benjamin Guedj; Arthur Gretton; |
1766 | Where to Pay Attention in Sparse Training for Feature Selection? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a new efficient unsupervised method for feature selection based on sparse autoencoders. |
Ghada Sokar; Zahra Atashgahi; Mykola Pechenizkiy; Decebal Constantin Mocanu; |
1767 | Dynamic Sparse Network for Time Series Classification: Learning What to “See” Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a dynamic sparse network (DSN) with sparse connections for TSC, which can learn to cover various receptive fields (RF) without cumbersome hyper-parameter tuning. |
Qiao Xiao; Boqian Wu; Yu Zhang; Shiwei Liu; Mykola Pechenizkiy; Elena Mocanu; Decebal Constantin Mocanu; |
1768 | UnfoldML: A Cost-Aware 2-D Dynamic Prediction Pipeline for Multi-Stage Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Through the combination of this 2-D query propagation mechanism and a set of policies for dynamic model selection, UnfoldML is able to (1) navigate an accuracy/cost tradeoff space, (2) reduce the spatio-temporal cost of inference by orders of magnitude, and (3) enable early prediction on succeeding stages. UnfoldML can benefit a wide range of real-world problems. |
Yanbo Xu; Alind Khare; Glenn Matlin; Monish Ramadoss; Rishikesan Kamaleswaran; Chao Zhang; Alexey Tumanov; |
1769 | Scalable Infomin Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Drawing on recent advances in slicing techniques, we propose a new infomin learning approach, which uses a novel proxy metric for mutual information. |
Yanzhi Chen; weihao sun; Yingzhen Li; Adrian Weller; |
1770 | Iron: Private Inference on Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main contribution is to provide several new secure protocols for matrix multiplication and complex non-linear functions like Softmax, GELU activations, and LayerNorm, which are critical components of Transformers. |
Meng Hao; Hongwei Li; Hanxiao Chen; Pengzhi Xing; Guowen Xu; Tianwei Zhang; |
1771 | Descent Steps of A Relation-Aware Energy Produce Heterogeneous Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Moreover, the complexity of this trade-off is compounded in the heterogeneous graph case due to the disparate heterophily relationships between nodes of different types. To address these issues, we propose a novel heterogeneous GNN architecture in which layers are derived from optimization steps that descend a relation-aware energy function. |
Hongjoon Ahn; Yongyi Yang; Quan Gan; David P Wipf; Taesup Moon; |
1772 | Towards Practical Few-shot Query Sets: Transductive Minimum Description Length Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As expected, our setting incurs performance drops for state-of-the-art methods. Motivated by these observations, we introduce a primal-dual minimum description length (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task, under supervision constraints from the support set. |
Ségolène Martin; Malik Boudiaf; Emilie Chouzenoux; Jean-Christophe Pesquet; Ismail Ayed; |
1773 | [Re] Value Alignment Verification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We explore value alignment verification for gridworlds incorporating a non-linear feature reward mapping as well as an extended action space. |
Siba Smarak Panigrahi; Sohan Patnaik; |
1774 | What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, it is not yet known: (1) how useful these methods are in real-world scenarios and (2) how well theoretical measures predict the usefulness of these methods for practical use by a human. To fill this gap, we conducted human psychophysics experiments at scale to evaluate the ability of human participants (n=1,150) to leverage representative attribution methods to predict the decisions of different image classifiers. |
Julien Colin; Thomas FEL; Remi Cadene; Thomas Serre; |
1775 | QueryPose: Sparse Multi-Person Pose Regression Via Spatial-Aware Part-Level Query Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a sparse end-to-end multi-person pose regression framework, termed QueryPose, which can directly predict multi-person keypoint sequences from the input image. |
Yabo Xiao; Xiaojuan Wang; Kai Su; Dongdong Yu; Lei Jin; Mingshu He; Zehuan Yuan; |
1776 | KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose KERPLE, a framework that generalizes relative position embedding for extrapolation by kernelizing positional differences. |
Ta-Chung Chi; Ting-Han Fan; Peter J Ramadge; Alexander Rudnicky; |
1777 | Cross-Image Context for Single Image Inpainting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Cross-Image Context Memory (CICM) for learning and using the cross-image context to recover the corrupted regions. |
Tingliang Feng; Wei Feng; Weiqi Li; Di Lin; |
1778 | Deep Learning Methods for Proximal Inference Via Maximum Moment Restriction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a flexible and scalable method based on a deep neural network to estimate causal effects in the presence of unmeasured confounding using proximal inference. |
Benjamin Kompa; David Bellamy; Tom Kolokotrones; james m robins; Andrew Beam; |
1779 | InsPro: Propagating Instance Query and Proposal for Online Video Instance Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we design a simple, fast and yet effective query-based framework for online VIS. |
Fei He; Naiyu Gao; Jian Jia; Haoyang Zhang; Yanhu Shan; Xin Zhao; Kaiqi Huang; |
1780 | GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In our work, we aim to train performant 3D generative models that synthesize textured meshes which can be directly consumed by 3D rendering engines, thus immediately usable in downstream applications. |
Jun Gao; Tianchang Shen; Zian Wang; Wenzheng Chen; Kangxue Yin; Daiqing Li; Or Litany; Zan Gojcic; Sanja Fidler; |
1781 | LobsDICE: Offline Learning from Observation Via Stationary Distribution Correction Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present LobsDICE, an offline LfO algorithm that learns to imitate the expert policy via optimization in the space of stationary distributions. |
Geon-Hyeong Kim; Jongmin Lee; Youngsoo Jang; Hongseok Yang; Kee-Eung Kim; |
1782 | End-to-end Symbolic Regression with Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: One can subsequently refine the predicted constants by feeding them to the non-convex optimizer as an informed initialization. We present ablations to show that this end-to-end approach yields better results, sometimes even without the refinement step. |
Pierre-alexandre Kamienny; Stéphane d’Ascoli; Guillaume Lample; Francois Charton; |
1783 | [Re] An Implementation of Fair Robust Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work attempts to reproduce the results of the 2021 ICML paper ‘To be Robust or to be Fair: Towards Fairness in Adversarial Training.’ |
Ian Hardy; |
1784 | Teach Less, Learn More: On The Undistillable Classes in Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A counter-intuitive observation is that a more expansive teacher does not make a better student, but the reasons for this phenomenon remain unclear. In this paper, we demonstrate that this is directly attributed to the presence of \textit{undistillable classes}: when trained with distillation, the teacher’s knowledge of some classes is incomprehensible to the student model. |
Yichen Zhu; Ning Liu; Zhiyuan Xu; Xin Liu; Weibin Meng; Louis Wang; Zhicai Ou; Jian Tang; |
1785 | Hypothesis Testing for Differentially Private Linear Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we design differentially private hypothesis tests for the following problems in the general linear model: testing a linear relationship and testing for the presence of mixtures. |
Daniel Alabi; Salil Vadhan; |
1786 | Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, this paper introduces a simple but effective point cloud cross-modality training (PointCMT) strategy, which utilizes view-images, i.e., rendered or projected 2D images of the 3D object, to boost point cloud classification. |
Xu Yan; Heshen Zhan; Chaoda Zheng; Jiantao Gao; Ruimao Zhang; Shuguang Cui; Zhen Li; |
1787 | Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we introduce deep multi-modal structural equations, a generative model in which confounders are latent variables and unstructured data are proxy variables. |
Shachi Deshpande; Kaiwen Wang; Dhruv Sreenivas; Zheng Li; Volodymyr Kuleshov; |
1788 | ClimbQ: Class Imbalanced Quantization Enabling Robustness on Efficient Inferences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing approaches have focused on balanced datasets, while imbalanced data is pervasive in the real world. Therefore, in this study, we investigate the realistic problem of quantization on class-imbalanced data. |
Ting-An Chen; Ming-syan Chen; |
1789 | Blackbox Attacks Via Surrogate Ensemble Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel method for blackbox attacks via surrogate ensemble search (BASES) that can generate highly successful blackbox attacks using an extremely small number of queries. |
Zikui Cai; Srikanth Krishnamurthy; Chengyu Song; Amit Roy-Chowdhury; Salman Asif; |
1790 | An Asymptotically Optimal Batched Algorithm for The Dueling Bandit Problem Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we ask: is there a solution using only a few adaptive rounds that matches the asymptotic regret bounds of the best sequential algorithms for $K$-armed dueling bandits? |
Arpit Agarwal; Rohan Ghuge; viswanath nagarajan; |
1791 | Collaborative Learning of Distributions Under Heterogeneity and Communication Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel two-stage method named SHIFT: First, the users collaborate by communicating with the server to learn a central distribution, relying on methods from robust statistics. |
Xinmeng Huang; Donghwan Lee; Edgar Dobriban; Hamed Hassani; |
1792 | Latent Hierarchical Causal Structure Discovery with Rank Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an estimation procedure that can efficiently locate latent variables, determine their cardinalities, and identify the latent hierarchical structure, by leveraging rank deficiency constraints over the measured variables. |
Biwei Huang; Charles Jia Han Low; Feng Xie; Clark Glymour; Kun Zhang; |
1793 | Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show how to decompose large networked systems of many agents into multiple local components such that we can build separate simulators that run independently and in parallel. |
Miguel Suau de Castro; Jinke He; Mustafa Mert Çelikok; Matthijs Spaan; Frans Oliehoek; |
1794 | SizeShiftReg: A Regularization Method for Improving Size-Generalization in Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we consider the scenario in which we only have access to the training data, and we propose a regularization strategy that can be applied to any GNN to improve its generalization capabilities from smaller to larger graphs without requiring access to the test data. |
Davide Buffelli; Pietro Lió; Fabio Vandin; |
1795 | Neural Topological Ordering for Computation Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an end-to-end machine learning based approach for topological ordering using an encoder-decoder framework. |
Mukul Gagrani; Corrado Rainone; Yang Yang; Harris Teague; Wonseok Jeon; Roberto Bondesan; Herke van Hoof; Christopher Lott; Weiliang Zeng; Piero Zappi; |
1796 | Personalized Federated Learning Towards Communication Efficiency, Robustness and Fairness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a personalized FL method based on a shared-and-fixed low-dimensional random subspace projection and infimal convolution; our method aims for communication efficiency, robustness, and fairness. |
Shiyun Lin; Yuze Han; Xiang Li; Zhihua Zhang; |
1797 | Learning Causally Invariant Representations for Out-of-Distribution Generalization on Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Accordingly, we propose an information-theoretic objective to extract the desired subgraphs that maximally preserve the invariant intra-class information. |
Yongqiang Chen; Yonggang Zhang; Yatao Bian; Han Yang; MA Kaili; Binghui Xie; Tongliang Liu; Bo Han; James Cheng; |
1798 | Missing Data Imputation and Acquisition with Deep Hierarchical Models and Hamiltonian Monte Carlo Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, within this specific application domain, existing VAE methods are restricted by using only one layer of latent variables and strictly Gaussian posterior approximations. To address these limitations, we present HH-VAEM, a Hierarchical VAE model for mixed-type incomplete data that uses Hamiltonian Monte Carlo with automatic hyper-parameter tuning for improved approximate inference. |
Ignacio Peis; Chao Ma; José Miguel Hernández-Lobato; |
1799 | Exact Shape Correspondence Via 2D Graph Convolution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In addition, existing methods that attempt to address the non-isometric shape problem (e.g., GRAMPA) are generally computationally expensive and do not generalise to nearly-isometric shapes. To address these two problems, we propose a 2D graph convolution-based framework called 2D-GEM. |
Barakeel Fanseu Kamhoua; Lin Zhang; Yongqiang Chen; Han Yang; MA Kaili; Bo Han; Bo Li; James Cheng; |
1800 | VAEL: Bridging Variational Autoencoders and Probabilistic Logic Programming Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present VAEL, a neuro-symbolic generative model integrating variational autoencoders (VAE) with the reasoning capabilities of probabilistic logic (L) programming. |
Eleonora Misino; Giuseppe Marra; Emanuele Sansone; |
1801 | Support Recovery in Sparse PCA with Incomplete Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study a practical algorithm for sparse principal component analysis (PCA) of incomplete and noisy data. |
Hanbyul Lee; Qifan Song; Jean Honorio; |
1802 | Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel Multi-Granularity Cross-modal Alignment (MGCA) framework for generalized medical visual representation learning by harnessing the naturally exhibited semantic correspondences between medical images and radiology reports at three different levels, i.e., pathological region-level, instance-level, and disease-level. |
Fuying Wang; Yuyin Zhou; Shujun WANG; Varut Vardhanabhuti; Lequan Yu; |
1803 | OST: Improving Generalization of DeepFake Detection Via One-Shot Test-Time Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a new learning paradigm specially designed for the generalizable deepfake detection task. |
Liang Chen; Yong Zhang; Yibing Song; Jue Wang; Lingqiao Liu; |
1804 | Controllable Text Generation with Neurally-Decomposed Oracle Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a general and efficient framework to control auto-regressive generation models with NeurAlly-Decomposed Oracle (NADO). |
Tao Meng; Sidi Lu; Nanyun Peng; Kai-Wei Chang; |
1805 | Lipschitz Bandits with Batched Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study Lipschitz bandit problems with batched feedback, where the expected reward is Lipschitz and the reward observations are communicated to the player in batches. We introduce a novel landscape-aware algorithm, called Batched Lipschitz Narrowing (BLiN), that optimally solves this problem. |
Yasong Feng; zengfeng Huang; Tianyu Wang; |
1806 | Physical Design Using Differentiable Learned Simulators Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This framework produces high-quality designs by propagating gradients through trajectories of hundreds of steps, even when using models that were pre-trained for single-step predictions on data substantially different from the design tasks. |
Kelsey Allen; Tatiana Lopez-Guevara; Kimberly Stachenfeld; Alvaro Sanchez Gonzalez; Peter Battaglia; Jessica Hamrick; Tobias Pfaff; |
1807 | Positively Weighted Kernel Quadrature Via Subsampling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study kernel quadrature rules with convex weights. |
Satoshi Hayakawa; Harald Oberhauser; Terry Lyons; |
1808 | Consistency of Constrained Spectral Clustering Under Graph Induced Fair Planted Partitions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider a setting where sensitive attributes indirectly manifest in an auxiliary representation graph rather than being directly observed. |
Shubham Gupta; Ambedkar Dukkipati; |
1809 | Active Ranking Without Strong Stochastic Transitivity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a $\delta$-correct algorithm, Probe-Rank, that actively learns the ranking of the items from noisy pairwise comparisons. |
Hao Lou; Tao Jin; Yue Wu; Pan Xu; Quanquan Gu; Farzad Farnoud; |
1810 | Deep Compression of Pre-trained Transformer Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce methods to deeply compress pre-trained transformer models across three major application domains: NLP, speech, and vision. |
Naigang Wang; Chi-Chun (Charlie) Liu; Swagath Venkataramani; Sanchari Sen; Chia-Yu Chen; Kaoutar El Maghraoui; Vijayalakshmi (Viji) Srinivasan; Leland Chang; |
1811 | Learning Neural Set Functions Under The Optimal Subset Oracle Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a principled yet practical maximum likelihood learning framework, termed as EquiVSet, that simultaneously meets the following desiderata of learning set functions under the OS oracle: i) permutation invariance of the set mass function being modeled; ii) permission of varying ground set; iii) minimum prior and iv) scalability. |
Zijing Ou; Tingyang Xu; Qinliang Su; Yingzhen Li; Peilin Zhao; Yatao Bian; |
1812 | Collaborative Decision Making Using Action Suggestions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose two methods that use suggested actions and demonstrate the approach through simulated experiments. |
Dylan Asmar; Mykel J Kochenderfer; |
1813 | Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It is difficult for these methods to alleviate video-text correspondence ambiguity by describing a video using only one feature, which is required to be matched with multiple different text features at the same time. To address this problem, we propose a Text-Adaptive Multiple Visual Prototype Matching Model. |
Chengzhi Lin; Ancong Wu; Junwei Liang; Jun Zhang; Wenhang Ge; Wei-Shi Zheng; Chunhua Shen; |
1814 | Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we consider the scenario where agent limitations may entirely preclude identifying an exactly value-equivalent model, immediately giving rise to a trade-off between identifying a model that is simple enough to learn and incurring only bounded sub-optimality. |
Dilip Arumugam; Benjamin Van Roy; |
1815 | Mind Reader: Reconstructing Complex Images from Brain Activities Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Unlike previous works that reconstruct images with single objects or simple shapes, our work aims to reconstruct image stimuli that are rich in semantics, closer to everyday scenes, and can reveal more perspectives. |
Sikun Lin; Thomas Sprague; Ambuj K Singh; |
1816 | Transform Once: Efficient Operator Learning in Frequency Domain Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Transform Once (T1), a deep frequency-domain architecture for efficient learning of long-range correlations in space or time that is 3x to 20x faster than Fourier Neural Operators and improves on predictive accuracy. |
Michael Poli; Stefano Massaroli; Federico Berto; Jinkyoo Park; Tri Dao; Christopher Ré; Stefano Ermon; |
1817 | Holomorphic Equilibrium Propagation Computes Exact Gradients Through Finite Size Oscillations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce ‘holomorphic equilibrium propagation’, which outperforms the classic equilibrium propagation on ImageNet32 by solving its infinitesimal teaching signal requirement, as well as its need for separate phases. |
Axel Laborieux; Friedemann Zenke; |
1818 | TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose TA-MoE, a topology-aware routing strategy for large-scale MoE training, from a model-system co-design perspective, which can dynamically adjust the MoE dispatch pattern according to the network topology. |
Chang Chen; Min Li; Zhihua Wu; Dianhai Yu; Chao Yang; |
1819 | Learning Dynamical Systems Via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study a class of dynamical systems modelled as stationary Markov chains that admit an invariant distribution via the corresponding transfer or Koopman operator. |
Pietro Novelli; Vladimir Kostic; Massimiliano Pontil; Andreas Maurer; Carlo Ciliberto; Lorenzo Rosasco; |
1820 | Cooperative Distribution Alignment Via JSD Upper Bound Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present empirical results of our framework on both simulated and real-world datasets to demonstrate the benefits of our approach. |
Wonwoong Cho; ZIYU GONG; David Inouye; |
1821 | Semi-Supervised Learning with Decision Trees: Graph Laplacian Tree Alternating Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This has been very successful with models ranging from kernel machines to neural networks, but has remained inapplicable to decision trees, for which the optimization problem is much harder. We solve this based on a reformulation of the problem which requires iteratively solving two simpler problems: a supervised tree learning problem, which can be solved by the Tree Alternating Optimization algorithm; and a label smoothing problem, which can be solved through a sparse linear system. |
Arman Zharmagambetov; Miguel A. Carreira-Perpinan; |
1822 | Multi-block Min-max Bilevel Optimization with Applications in Multi-task Deep AUC Maximization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study multi-block min-max bilevel optimization problems, where the upper level is a non-convex strongly-concave minimax objective and the lower level is a strongly convex objective, and there are multiple blocks of dual variables and lower level problems. |
Quanqi Hu; YONGJIAN ZHONG; Tianbao Yang; |
1823 | CLEAR: Generative Counterfactual Explanations on Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the problem of counterfactual explanation generation on graphs. |
Jing Ma; Ruocheng Guo; Saumitra Mishra; Aidong Zhang; Jundong Li; |
1824 | VectorAdam for Rotation Equivariant Geometry Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we observe that naively applying Adam to optimize vector-valued data is not rotation equivariant, due to per-coordinate moment updates, and in fact this leads to significant artifacts and biases in practice. We propose to resolve this deficiency with VectorAdam, a simple modification which makes Adam rotation-equivariant by accounting for the vector structure of optimization variables. |
Selena Zihan Ling; Nicholas Sharp; Alec Jacobson; |
1825 | NSNet: A General Neural Probabilistic Framework for Satisfiability Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the Neural Satisfiability Network (NSNet), a general neural framework that models satisfiability problems as probabilistic inference and meanwhile exhibits proper explainability. |
Zhaoyu Li; Xujie Si; |
1826 | BayesPCN: A Continually Learnable Predictive Coding Associative Memory Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present BayesPCN, a hierarchical associative memory capable of performing continual one-shot memory writes without meta-learning. |
Jinsoo Yoo; Frank Wood; |
1827 | Measuring Model Inversion Defences in Edge–Cloud Collaborative Inference Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we take the first step towards measuring the robustness of those state-of-the-art defence countermeasures with respect to MI attacks. |
Mengda Yang; Juan Wang; Hongxin Hu; Ao Ren; Ziang Li; Xiaoyang Xu; Wenzhe Yi; |
1828 | Compositional Generalization in Unsupervised Representation Learning: From Disentanglement to Emergent Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Deep learning models struggle with compositional generalization, i.e., the ability to recognize or generate novel combinations of observed elementary concepts. In hopes of enabling compositional generalization, various unsupervised learning algorithms have been proposed with inductive biases that aim to induce compositional structure in learned representations (e.g., disentangled representation and emergent language learning). In this work, we evaluate these unsupervised learning algorithms in terms of how well they enable compositional generalization. |
Zhenlin Xu; Marc Niethammer; Colin Raffel; |
1829 | Towards Optimal Communication Complexity in Distributed Non-Convex Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose and analyze a new algorithm that improves existing methods by requiring fewer and lighter variance reduction operations. |
Kumar Kshitij Patel; Lingxiao Wang; Blake Woodworth; Brian Bullins; Nati Srebro; |
1830 | Few-Shot Non-Parametric Learning with Deep Latent Variable Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV), a learning framework for any dataset with abundant unlabeled data but very few labeled ones. |
Zhiying Jiang; Yiqin Dai; Ji Xin; Ming Li; Jimmy Lin; |
1831 | GULP: A Prediction-based Metric Between Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce GULP, a family of distance measures between representations that is explicitly motivated by downstream predictive tasks. |
Enric Boix-Adsera; Hannah Lawrence; George Stepaniants; Philippe Rigollet; |
1832 | On The Non-universality of Deep Learning: Quantifying The Cost of Symmetry Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove a general computational lower bound for learning with neural networks trained by noisy gradient descent (GD). |
Emmanuel Abbe; Enric Boix-Adsera; |
1833 | Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing approaches mainly treat this complicated task as a parallel frame-grounding problem and thus suffer from two types of inconsistency drawbacks: feature alignment inconsistency and prediction inconsistency. In this paper, we present an end-to-end one-stage framework, termed Spatio-Temporal Consistency-Aware Transformer (STCAT), to alleviate these issues. |
Yang Jin; yongzhi li; Zehuan Yuan; Yadong Mu; |
1834 | Optimal Dynamic Regret in LQR Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop an algorithm with optimal dynamic regret for LQR control. |
Dheeraj Baby; Yu-Xiang Wang; |
1835 | The Implicit Delta Method Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of the predictive model to automatically assess downstream uncertainty. |
Nathan Kallus; James McInerney; |
1836 | Meta-ticket: Finding Optimal Subnetworks for Few-shot Learning Within Randomly Initialized Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel meta-learning approach, called Meta-ticket, to find optimal sparse subnetworks for few-shot learning within randomly initialized NNs. |
Daiki Chijiwa; Shin’ya Yamaguchi; Atsutoshi Kumagai; Yasutoshi Ida; |
1837 | Finding and Listing Front-door Adjustment Sets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present algorithms for finding and enumerating possible sets satisfying the FD criterion in a given causal diagram. |
Hyunchai Jeong; Jin Tian; Elias Bareinboim; |
1838 | A Simple But Strong Baseline for Online Continual Learning: Repeated Augmented Rehearsal Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper revisits the rehearsal dynamics in online settings. |
Yaqian Zhang; Bernhard Pfahringer; Eibe Frank; Albert Bifet; Nick Jin Sean Lim; Yunzhe Jia; |
1839 | Learning Graph-embedded Key-event Back-tracing for Object Tracking in Event Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, to process a non-uniformly distributed large-scale event cloud efficiently, we propose a simple yet effective density-insensitive downsampling strategy to sample a subset called key-events. |
Zhiyu Zhu; Junhui Hou; Xianqiang Lyu; |
1840 | Sample-Then-Optimize Batch Neural Thompson Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these works suffer from the limitations of the requirement to invert an extremely large parameter matrix and the restriction to the sequential (rather than batch) setting. To overcome these limitations, we introduce two algorithms based on the Thompson sampling (TS) policy named Sample-Then-Optimize Batch Neural TS (STO-BNTS) and STO-BNTS-Linear. |
Zhongxiang Dai; YAO SHU; Bryan Kian Hsiang Low; Patrick Jaillet; |
1841 | Fine-Tuning Pre-Trained Language Models Effectively By Optimizing Subnetworks Adaptively Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a Dynamic Parameter Selection (DPS) algorithm for the large-scale pre-trained models during fine-tuning, which adaptively selects a more promising subnetwork to perform staging updates based on gradients of back-propagation. |
Haojie Zhang; Ge Li; Jia Li; Zhongjin Zhang; YUQI ZHU; Zhi Jin; |
1842 | Nonnegative Tensor Completion Via Integer Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper develops a new algorithm for the special case of completion for nonnegative tensors. |
Caleb Bugg; Chen Chen; Anil Aswani; |
1843 | Equivariant Networks for Crystal Structures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a class of models that are equivariant with respect to crystalline symmetry groups. |
Oumar Kaba; Siamak Ravanbakhsh; |
1844 | Simple and Optimal Greedy Online Contention Resolution Schemes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present simple $1/e$-selectable greedy OCRSs for the single-item setting, partition matroids, and transversal matroids. |
Vasilis Livanos; |
1845 | From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that under mild assumptions, whenever gradient flow works on the population loss, stochastic gradient descent succeeds at learning. |
Christopher De Sa; Satyen Kale; Jason Lee; Ayush Sekhari; Karthik Sridharan; |
1846 | LieGG: Studying Learned Lie Group Generators Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We depart from the position that when the symmetries are not built into a model a priori, it is advantageous for robust networks to learn the symmetries directly from the data to fit the task function. In this paper, we present a method to extract symmetries learned by the neural network and to evaluate the degree to which the network is invariant to them. |
Anna Sepliarskaia; Artem Moskalev; Ivan Sosnovik; Arnold Smeulders; |
1847 | Neural Stochastic PDEs: Resolution-Invariant Learning of Continuous Spatiotemporal Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on the notion of mild solution of an SPDE, we introduce a novel neural architecture to learn solution operators of PDEs with (possibly stochastic) forcing from partially observed data. |
Cristopher Salvi; Maud Lemercier; Andris Gerasimovics; |
1848 | Fairness in Federated Learning Via Core-Stability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Currently popular egalitarian and weighted equity-based fairness measures suffer from the aforementioned pitfall. In this work, we aim to formally represent this problem and address these fairness issues using concepts from co-operative game theory and social choice theory. |
Bhaskar Ray Chaudhury; Linyi Li; Mintong Kang; Bo Li; Ruta Mehta; |
1849 | AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, training big models requires strong distributed system expertise to carefully design model-parallel execution strategies that suit the model architectures and cluster setups. In this paper, we develop AMP, a framework to automatically derive such strategies. |
Dacheng Li; Hongyi Wang; Eric Xing; Hao Zhang; |
1850 | New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by the idea of soundness from logic systems, this paper provides a new dimension for intrinsic evaluations of saliency methods. |
Arushi Gupta; Nikunj Saunshi; Dingli Yu; Kaifeng Lyu; Sanjeev Arora; |
1851 | Hand-Object Interaction Image Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we are dedicated to a new task, i.e., hand-object interaction image generation, which aims to conditionally generate the hand-object image under the given hand, object and their interaction status. |
Hezhen Hu; Weilun Wang; Wengang Zhou; Houqiang Li; |
1852 | Structuring Representations Using Geometric Invariants Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Different geometries can be defined by their invariants; for example, distances and angles are preserved in Euclidean and Conformal geometries, respectively. We propose to structure equivariant representations using loss functions based on these invariants. |
Mehran Shakerinava; Arnab Mondal; Siamak Ravanbakhsh; |
1853 | Root Cause Analysis of Failures in Microservices Through Causal Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a scalable algorithm for quickly detecting the root cause of failure in complex microservice architectures. |
Azam Ikram; Sarthak Chakraborty; Subrata Mitra; Shiv Saini; Saurabh Bagchi; Murat Kocaoglu; |
1854 | Streaming Radiance Fields for 3D Video Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an explicit-grid based method for efficiently reconstructing streaming radiance fields for novel view synthesis of real world dynamic scenes. |
Lingzhi LI; Zhen Shen; zhongshu wang; Li Shen; Ping Tan; |
1855 | A Unified Convergence Theorem for Stochastic Optimization Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide a fundamental unified convergence theorem used for deriving expected and almost sure convergence results for a series of stochastic optimization methods. |
Xiao Li; Andre Milzarek; |
1856 | Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we address both difficulties for set functions, which capture many important discrete problems. |
Nikolaos Karalias; Joshua Robinson; Andreas Loukas; Stefanie Jegelka; |
1857 | Counterfactual Temporal Point Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these models lack the ability to answer counterfactual questions, which are increasingly relevant as these models are being used to inform targeted interventions. In this work, our goal is to fill this gap. |
Kimia Noorbakhsh; Manuel Rodriguez; |
1858 | Quality Not Quantity: On The Interaction Between Dataset Design and Robustness of CLIP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce a testbed of six publicly available data sources—YFCC, LAION, Conceptual Captions, WIT, RedCaps, Shutterstock—to investigate how pre-training distributions induce robustness in CLIP. |
Thao Nguyen; Gabriel Ilharco; Mitchell Wortsman; Sewoong Oh; Ludwig Schmidt; |
1859 | Amortized Proximal Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a framework for online meta-optimization of parameters that govern optimization, called Amortized Proximal Optimization (APO). |
Juhan Bae; Paul Vicol; Jeff Z. HaoChen; Roger Grosse; |
1860 | Near-Optimal Correlation Clustering with Privacy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a simple and computationally efficient algorithm for the correlation clustering problem with provable privacy guarantees. |
Vincent Cohen-Addad; Chenglin Fan; Silvio Lattanzi; Slobodan Mitrovic; Ashkan Norouzi-Fard; Nikos Parotsidis; Jakub Tarnawski; |
1861 | Conditional Meta-Learning of Linear Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The effectiveness of these methods is often limited when the nuances of the tasks’ distribution cannot be captured by a single representation. In this work we overcome this issue by inferring a conditioning function, mapping the tasks’ side information (such as the tasks’ training dataset itself) into a representation tailored to the task at hand. |
Giulia Denevi; Massimiliano Pontil; Carlo Ciliberto; |
1862 | Latent-Variable Advantage-Weighted Policy Optimization for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unfortunately, such datasets often contain action distributions with multiple modes and, in some cases, lack a sufficient number of high-reward trajectories, which render offline policy training inefficient. To address this challenge, we propose to leverage a latent-variable generative model to represent high-advantage state-action pairs, leading to better adherence to the data distributions that contribute to solving the task, while maximizing reward via a policy over the latent variable. |
Xi Chen; Ali Ghadirzadeh; Tianhe Yu; Alex Yuan Gao; Jianhao Wang; Wenzhe Li; Liang Bin; Chelsea Finn; Chongjie Zhang; |
1863 | Luckiness in Multiscale Online Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Modern algorithms in addition provide tighter guarantees outside the maximally adversarial regime, most notably in the form of constant (pseudo)-regret bounds under statistical margin assumptions. We investigate the multiscale extension of the setting, where the loss ranges of the various experts are vastly different, and the regret w.r.t. each expert needs to scale with its range, instead of the maximum overall range. |
Wouter Koolen; Muriel Pérez; |
1864 | Random Rank: The One and Only Strategyproof and Proportionally Fair Randomized Facility Location Mechanism Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In our work, we propose a concept called Strong Proportionality, which ensures that when there are two groups of agents at different locations, both groups incur the same total cost. |
Haris Aziz; Alexander Lam; Mashbat Suzuki; Toby Walsh; |
1865 | A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a method that can improve and evaluate the multiplicity of probabilistic embedding in noisy cross-modal datasets. |
Hao Li; Jingkuan Song; Lianli Gao; Pengpeng Zeng; Haonan Zhang; Gongfu Li; |
1866 | The Privacy Onion Effect: Memorization Is Relative Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We demonstrate and analyze an Onion Effect of memorization: removing the "layer" of outlier points that are most vulnerable to a privacy attack exposes a new layer of previously-safe points to the same attack. |
Nicholas Carlini; Matthew Jagielski; Chiyuan Zhang; Nicolas Papernot; Andreas Terzis; Florian Tramer; |
1867 | Hyperbolic Embedding Inference for Structured Multi-Label Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider a structured multi-label prediction problem where the labels are organized under implication and mutual exclusion constraints. |
Bo Xiong; Michael Cochez; Mojtaba Nayyeri; Steffen Staab; |
1868 | DNA: Proximal Policy Optimization with A Dual Network Architecture Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper explores the problem of simultaneously learning a value function and policy in deep actor-critic reinforcement learning models. |
Matthew Aitchison; Penny Sweetser; |
1869 | Block-Recurrent Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce the Block-Recurrent Transformer, which applies a transformer layer in a recurrent fashion along a sequence, and has linear complexity with respect to sequence length. |
DeLesley Hutchins; Imanol Schlag; Ethan Dyer; Behnam Neyshabur; Yuhuai Wu; |
1870 | Denoising Diffusion Restoration Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work addresses these issues by introducing Denoising Diffusion Restoration Models (DDRM), an efficient, unsupervised posterior sampling method. |
Bahjat Kawar; Michael Elad; Stefano Ermon; Jiaming Song; |
1871 | WeightedSHAP: Analyzing and Improving Shapley Based Feature Attributions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show the suboptimality of the Shapley-based feature attribution and propose WeightedSHAP, a generalization of the Shapley value which is more flexible. |
Yongchan Kwon; James Zou; |
1872 | You Only Live Once: Single-Life Reinforcement Learning Via Learned Reward Shaping Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We formalize the single-life RL problem setting, where given prior data, an agent must complete a novel task autonomously in a single trial, and propose an algorithm (QWALE) that leverages the prior data as guidance to complete the desired task. |
Annie Chen; Archit Sharma; Sergey Levine; Chelsea Finn; |
1873 | Amortized Inference for Heterogeneous Reconstruction in Cryo-EM Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an amortized method, validated on experimental cryo-EM datasets, to reconstruct the 3D structures of biomolecules and analyze their deformations. |
Axel Levy; Gordon Wetzstein; Julien N.P Martel; Frederic Poitevin; Ellen Zhong; |
1874 | Learning from Future: A Novel Self-Training Framework for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel self-training framework, which helps the student to learn from the future, and achieve state-of-the-art performance on the task of unsupervised domain adaptive semantic segmentation. |
Ye Du; Yujun Shen; Haochen Wang; Jingjing Fei; Wei Li; Liwei Wu; Rui Zhao; Zehua Fu; Qingjie LIU; |
1875 | SALSA: Attacking Lattice Cryptography with Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we train transformers to perform modular arithmetic and mix half-trained models and statistical cryptanalysis techniques to propose SALSA: a machine learning attack on LWE-based cryptographic schemes. |
Emily Wenger; Mingjie Chen; Francois Charton; Kristin E. Lauter; |
1876 | Smooth Fictitious Play in Stochastic Games with Perturbed Payoffs and Unknown Transitions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, those decentralized algorithms need the players to know exactly the model (the transition probabilities and their payoffs at every stage). To overcome these strong assumptions, our paper introduces regularizations of the recent algorithms which are, moreover, model-free (players don’t know the transitions and their payoffs are perturbed at every stage). |
Lucas Baudin; Rida Laraki; |
1877 | Uncertainty-Aware Reinforcement Learning for Risk-Sensitive Player Evaluation in Sports Game Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We demonstrate how a distributional Bellman operator and a feature-space density model can capture these uncertainties. Based on such uncertainty estimation, we propose a Risk-sensitive Game Impact Metric (RiGIM) that measures players’ performance over a season by conditioning on a specific confidence level. |
Guiliang Liu; Yudong Luo; Oliver Schulte; Pascal Poupart; |
1878 | SCL-WC: Cross-Slide Contrastive Learning for Weakly-Supervised Whole-Slide Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To obtain more discriminative features, we propose a novel weakly-supervised classification method based on cross-slide contrastive learning (called SCL-WC), which depends on task-agnostic self-supervised feature pre-extraction and task-specific weakly-supervised feature refinement and aggregation for WSI-level prediction. |
Xiyue Wang; Jinxi Xiang; Jun Zhang; Sen Yang; Zhongyi Yang; Ming-Hui Wang; Jing Zhang; Wei Yang; Junzhou Huang; Xiao Han; |
1879 | Iterative Structural Inference of Directed Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a variational model, iterative Structural Inference of Directed Graphs (iSIDG), to infer the existence of directed interactions from observational agents’ features over a time period in a dynamical system. |
Aoran Wang; Jun Pang; |
1880 | A Geometric Perspective on Variational Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces a new interpretation of the Variational Autoencoder framework by taking a fully geometric point of view. |
Clément Chadebec; Stephanie Allassonniere; |
1881 | Distributional Convergence of The Sliced Wasserstein Process Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Different choices of how to aggregate these projected distances (averaging, random sampling, maximizing) give rise to different distances, requiring different statistical analyses. We define the Sliced Wasserstein Process, a stochastic process defined by the empirical Wasserstein distance between projections of empirical probability measures to all one-dimensional subspaces, and prove a uniform distributional limit theorem for this process. |
Jiaqi Xi; Jonathan Niles-Weed; |
1882 | Exploration With A Finite Brain Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Humans, on the other hand, seem to manage the trade-off between exploration and exploitation effortlessly. In the present article, we put forward the hypothesis that they accomplish this by making optimal use of limited computational resources. |
Marcel Binz; Eric Schulz; |
1883 | A Fully Adaptive Trust-region Method Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unfortunately, subsequent adaptive second-order methods are built on their ideas, and therefore also suffer from this issue. We present the first adaptive second-order method which circumvents this issue and requires at most $O( \Delta_f L^{1/2} \epsilon^{-3/2}) + \tilde{O}(1)$ iterations to find an $\epsilon$-approximate stationary point, matching the optimal iteration bound up to an additive logarithmic term. |
Fadi Hamad; Oliver Hinder; |
1884 | Posterior Matching for Arbitrary Conditioning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simple and general framework, coined Posterior Matching, that enables Variational Autoencoders (VAEs) to perform arbitrary conditioning, without modification to the VAE itself. |
Ryan Strauss; Junier B Oliva; |
1885 | Tracking Functional Changes in Nonstationary Signals with Evolutionary Ensemble Bayesian Model for Robust Neural Decoding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel evolutionary ensemble framework (EvoEnsemble) to dynamically cope with changes in neural signals by evolving the decoder model accordingly. |
Xinyun Zhu; Yu Qi; Gang Pan; Yueming Wang; |
1886 | What You See Is What You Classify: Black Box Attributions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Most existing approaches find such attributions either using activations and gradients or by repeatedly perturbing the input. We instead address this challenge by training a second deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum. |
Steven Stalder; Nathanael Perraudin; Radhakrishna Achanta; Fernando Perez-Cruz; Michele Volpi; |
1887 | Cost-efficient Gaussian Tensor Network Embeddings for Tensor-structured Inputs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work discusses tensor network embeddings, which are random matrices ($S$) with tensor network structure. |
Linjian Ma; Edgar Solomonik; |
1888 | GAUDI: A Neural Architect for Immersive 3D Scene Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera. |
Miguel Angel Bautista; Pengsheng Guo; Samira Abnar; Walter Talbott; Alexander Toshev; Zhuoyuan Chen; Laurent Dinh; Shuangfei Zhai; Hanlin Goh; Daniel Ulbricht; Afshin Dehghan; Joshua Susskind; |
1889 | MoGDE: Boosting Mobile Monocular 3D Object Detection with Ground Depth Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the insight that the depth of an object can be well determined according to the depth of the ground where it stands, in this paper, we propose a novel Mono3D framework, called MoGDE, which constantly estimates the corresponding ground depth of an image and then utilizes the estimated ground depth information to guide Mono3D. |
Yunsong Zhou; Quan Liu; Hongzi Zhu; Yunzhe Li; Shan Chang; Minyi Guo; |
1890 | Federated Submodel Optimization for Hot and Cold Data Features Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We thus propose federated submodel averaging (FedSubAvg), which introduces the number of feature-related clients as the metric of feature heat to correct the aggregation of submodel updates. |
Yucheng Ding; Chaoyue Niu; Fan Wu; Shaojie Tang; Chengfei Lyu; yanghe feng; Guihai Chen; |
1891 | Towards Reliable Simulation-Based Inference with Balanced Neural Ratio Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce Balanced Neural Ratio Estimation (BNRE), a variation of the NRE algorithm designed to produce posterior approximations that tend to be more conservative, hence improving their reliability, while sharing the same Bayes optimal solution. |
Arnaud Delaunoy; Joeri Hermans; François Rozet; Antoine Wehenkel; Gilles Louppe; |
1892 | Gold-standard Solutions to The Schrödinger Equation Using Deep Learning: How Much Physics Do We Need? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a novel deep-learning architecture that solves the Schrödinger equation with 40-70% lower energy error at 8x lower computational cost compared to the state of the art. |
Leon Gerard; Michael Scherbela; Philipp Marquetand; Philipp Grohs; |
1893 | Frank-Wolfe-based Algorithms for Approximating Tyler’s M-estimator Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we propose, to the best of our knowledge, the first Frank-Wolfe-based algorithms for computing Tyler’s estimator. |
Dan Garber; Lior Danon; |
1894 | Fuzzy Learning Machine Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, a new learning machine, fuzzy learning machine (FLM), is proposed from the perspective of concept cognition. |
Junbiao Cui; Jiye Liang; |
1895 | Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simple but effective source-free domain adaptation (SFDA) method. |
Shiqi Yang; yaxing wang; kai wang; Shangling Jui; Joost van de Weijer; |
1896 | Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes an implicit model-based multi-agent reinforcement learning method based on value decomposition methods. |
Zhiwei Xu; dapeng li; Bin Zhang; Yuan Zhan; Yunpeng Baiia; Guoliang Fan; |
1897 | Hamiltonian Latent Operators for Content and Motion Disentanglement in Image Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Halo — a deep generative model utilising HAmiltonian Latent Operators to disentangle content and motion information in image sequences reliably. |
Asif Khan; Amos Storkey; |
1898 | TA-GATES: An Encoding Scheme for Neural Network Architectures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, this work proposes a new encoding scheme for neural architectures, the Training-Analogous Graph-based ArchiTecture Encoding Scheme (TA-GATES). |
Xuefei Ning; Zixuan Zhou; Junbo Zhao; Tianchen Zhao; Yiping Deng; Changcheng Tang; Shuang Liang; Huazhong Yang; Yu Wang; |
1899 | Learning Contrastive Embedding in Low-Dimensional Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a novel framework called contrastive learning with low-dimensional reconstruction (CLLR), which adopts a regularized projection layer to reduce the dimensionality of the feature embedding. |
Shuo Chen; Chen Gong; Jun Li; Jian Yang; Gang Niu; Masashi Sugiyama; |
1900 | The Trade-offs of Model Size in Large Recommendation Models: A 10000$\times$ Compressed Criteo-tb DLRM Model (100 GB Parameters to Mere 10MB) Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper analyzes and extensively evaluates a generic parameter sharing setup (PSS) for compressing DLRM models. |
Aditya Desai; Anshumali Shrivastava; |
1901 | Generic Bounds on The Approximation Error for Physics-informed (and) Operator Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a very general framework for deriving rigorous bounds on the approximation error for physics-informed neural networks (PINNs) and operator learning architectures such as DeepONets and FNOs as well as for physics-informed operator learning. |
Tim De Ryck; Siddhartha Mishra; |
1902 | Active Learning for Multiple Target Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We describe and explore a novel setting of active learning (AL), where there are multiple target models to be learned simultaneously. |
Ying-Peng Tang; Sheng-Jun Huang; |
1903 | Neural Surface Reconstruction of Dynamic Scenes with Monocular RGB-D Camera Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Neural-DynamicReconstruction (NDR), a template-free method to recover high-fidelity geometry and motions of a dynamic scene from a monocular RGB-D camera. |
Hongrui Cai; Wanquan Feng; Xuetao Feng; Yan Wang; Juyong Zhang; |
1904 | Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we make one of the first attempts to examine the perceptual robustness of NR-IQA models. |
Weixia Zhang; Dingquan Li; Xiongkuo Min; Guangtao Zhai; Guodong Guo; Xiaokang Yang; Kede Ma; |
1905 | GT-GAN: General Purpose Time Series Synthesis with Generative Adversarial Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, there are no existing generative models that show good performance for both types without any model changes. Therefore, we present a general purpose model capable of synthesizing regular and irregular time series data. |
Jinsung Jeon; JEONGHAK KIM; Haryong Song; Seunghyeon Cho; Noseong Park; |
1906 | Fast Instrument Learning with Faster Rates Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simple algorithm which combines kernelized IV methods and an arbitrary, adaptive regression algorithm, accessed as a black box. |
Ziyu Wang; Yuhao Zhou; Jun Zhu; |
1907 | Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to augment the offline dataset by using trained bidirectional dynamics models and rollout policies with double check. |
Jiafei Lyu; Xiu Li; Zongqing Lu; |
1908 | Learning Distributions Generated By Single-Layer ReLU Networks in The Presence of Arbitrary Outliers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our goal is to estimate the parameters (weight matrix and bias vector) of the neural network, assuming the bias vector to be non-negative. |
Saikiran Bulusu; Geethu Joseph; M. Cenk Gursoy; Pramod Varshney; |
1909 | Alleviating The Sampling Bias of Few Shot Data By Removing Projection to The Centroid Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper reveals one such phenomenon: the classification boundary is very sensitive to the position of support samples if they are in the vicinity of the data centroid for a given task (which we call the task centroid); in that case, degenerated and unstable results are usually observed. |
Jing Xu; Xu Luo; Xinglin Pan; Yanan Li; Wenjie Pei; Zenglin Xu; |
1910 | Don’t Roll The Dice, Ask Twice: The Two-Query Distortion of Matching Problems and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A recent array of works put forward the agenda of designing mechanisms that learn the values of the agents for a small number of alternatives via queries, and use this limited extra information to make better-informed decisions, thus improving distortion. Following this agenda, in this work we focus on a class of combinatorial problems that includes most well-known matching problems and several of their generalizations, such as One-Sided Matching, Two-Sided Matching, General Graph Matching, and k-Constrained Resource Allocation. |
Georgios Amanatidis; Georgios Birmpas; Aris Filos-Ratsikas; Alexandros Voudouris; |
1911 | Generalization Bounds for Estimating Causal Effects of Continuous Treatments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We theoretically provide and prove a generalization bound for estimating the ADRF to guide the learning of adaptive weights that alleviate selection bias in the continuous treatment setting. |
Xin Wang; Shengfei Lyu; Xingyu Wu; Tianhao Wu; Huanhuan Chen; |
1912 | MaskTune: Mitigating Spurious Correlations By Forcing to Explore Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work proposes MaskTune, a masking strategy that prevents over-reliance on spurious (or a limited number of) features. |
Saeid Asgari; Aliasghar Khani; Fereshte Khani; Ali Gholami; Linh Tran; Ali Mahdavi-Amiri; Ghassan Hamarneh; |
1913 | Rotation-Equivariant Conditional Spherical Neural Fields for Learning A Natural Illumination Prior Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a conditional neural field representation based on a variational auto-decoder with a SIREN network and, extending Vector Neurons, build equivariance directly into the network. |
James Gardner; Bernhard Egger; William Smith; |
1914 | Will Bilevel Optimizers Benefit from Loops Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper provides a unified convergence theory to capture the computational differences among different implementations in bilevel optimization. |
Kaiyi Ji; Mingrui Liu; Yingbin Liang; Lei Ying; |
1915 | Neural Conservation Laws: A Divergence-Free Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We hence propose an approach to building divergence-free neural networks through the concept of differential forms, and with the aid of automatic differentiation, realize two practical constructions with differing trade offs. |
Jack Richter-Powell; Yaron Lipman; Ricky T. Q. Chen; |
1916 | Neural Correspondence Prior for Effective Unsupervised Shape Matching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Neural Correspondence Prior (NCP), a new paradigm for computing correspondences between 3D shapes. |
Souhaib Attaiki; Maks Ovsjanikov; |
1917 | Learning Audio-Visual Dynamics Using Scene Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: There exists an unequivocal distinction between the sound produced by a static agent and that produced by a moving one, especially when the agent moves towards or away from the microphone. In this paper, we propose to use this connection between audio and visual dynamics for solving two challenging tasks simultaneously, namely: (i) separating audio sources from a mixture using visual cues, and (ii) predicting the 3D visual motion of a sounding source only using its separated audio. |
Moitreya Chatterjee; Narendra Ahuja; Anoop Cherian; |
1918 | CARD: Classification and Regression Diffusion Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce classification and regression diffusion (CARD) models, which combine a denoising diffusion-based conditional generative model and a pre-trained conditional mean estimator, to accurately predict the distribution of $y$ given $x$. |
Xizewen Han; Huangjie Zheng; Mingyuan Zhou; |
1919 | Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an efficient and effective training scheme coined as Sparse SAM (SSAM), which achieves sparse perturbation by a binary mask. |
Peng Mi; Li Shen; Tianhe Ren; Yiyi Zhou; Xiaoshuai Sun; Rongrong Ji; Dacheng Tao; |
1920 | DropCov: A Simple Yet Effective Method for Improving Deep Architectures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, we show for the first time that effective post-normalization can make a good trade-off between representation decorrelation and information preservation for GCP, which are crucial for alleviating over-fitting and increasing the representation ability of deep GCP networks, respectively. |
Qilong Wang; Mingze Gao; Zhaolin Zhang; Jiangtao Xie; Peihua Li; Qinghua Hu; |
1921 | Online Bipartite Matching with Advice: Tight Robustness-Consistency Tradeoffs for The Two-Stage Model Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose an algorithm that is $R$-robust and $C$-consistent for any $(R,C)$ with $0 \leq R \leq \frac{3}{4}$ and $\sqrt{1-R} + \sqrt{1-C} = 1$, and prove that no other algorithm can achieve a better tradeoff. |
Billy Jin; Will Ma; |
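As a quick numerical reading of the tradeoff stated above (taking the quoted constraint $\sqrt{1-R} + \sqrt{1-C} = 1$ with $0 \leq R \leq \frac{3}{4}$ at face value; the points below are illustrative arithmetic only, not results from the paper), a few points on the frontier work out to:

```latex
% Illustrative points on the robustness-consistency frontier \sqrt{1-R} + \sqrt{1-C} = 1.
\begin{align*}
R = 0            &\;\Longrightarrow\; \sqrt{1-C} = 1 - 1 = 0 \;\Longrightarrow\; C = 1,\\
R = \tfrac{1}{2} &\;\Longrightarrow\; \sqrt{1-C} = 1 - \tfrac{1}{\sqrt{2}} \approx 0.293 \;\Longrightarrow\; C \approx 0.914,\\
R = \tfrac{3}{4} &\;\Longrightarrow\; \sqrt{1-C} = 1 - \tfrac{1}{2} = \tfrac{1}{2} \;\Longrightarrow\; C = \tfrac{3}{4}.
\end{align*}
```

In words: demanding more robustness lowers the attainable consistency, and the curve ends at $R = C = 3/4$.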
1922 | Few-Shot Fast-Adaptive Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This direction is, however, unsatisfying, as it would require modeling the normal distribution of every task that comes along, which entails tedious data collection. In this paper, we address these issues. |
Ze Wang; Yipin Zhou; Rui Wang; Tsung-Yu Lin; Ashish Shah; Ser Nam Lim; |
1923 | Recursive Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Recursive Q-learning—a model-free RL algorithm for RMDPs—and prove that it converges for finite, single-exit and deterministic multi-exit RMDPs under mild assumptions. |
Mateo Perez; Ernst Moritz Hahn; Sven Schewe; Fabio Somenzi; Ashutosh Trivedi; Dominik Wojtczak; |
1924 | On The Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These methods collect samples from previously encountered data distributions in a small memory buffer; subsequently, they repeatedly optimize on the latter to prevent catastrophic forgetting. This work draws attention to a hidden pitfall of this widespread practice: repeated optimization on a small pool of data inevitably leads to tight and unstable decision boundaries, which are a major hindrance to generalization. |
Lorenzo Bonicelli; Matteo Boschini; Angelo Porrello; Concetto Spampinato; SIMONE CALDERARA; |
1925 | Tractable Latent State Inference for Hidden Continuous-Time Semi-Markov Chains Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that the non-sampling-based latent state inference used in HSMMs can be generalized to latent Continuous-Time semi-Markov Chains (CTSMCs). |
Nicolai Engelmann; Heinz Koeppl; |
1926 | Asymptotics of $\ell_2$ Regularized Network Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This includes methods such as DeepWalk and node2vec, which learn embeddings by optimizing stochastic losses formed over subsamples of the graph at each iteration of stochastic gradient descent. In this paper, we study the effects of adding an $\ell_2$ penalty of the embedding vectors to the training loss of these types of methods. |
Andrew Davison; |
1927 | On Translation and Reconstruction Guarantees of The Cycle-Consistent Generative Adversarial Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The involvement of two unalike data spaces and the existence of multiple solution maps between them are some of the facets that make such architectures unique. In this study, we investigate the statistical properties of such unpaired data translator networks between distinct spaces, bearing the additional responsibility of cycle-consistency. |
Anish Chakrabarty; Swagatam Das; |
1928 | Better SGD Using Second-order Momentum Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point using the optimal $O(\epsilon^{-3})$ number of stochastic gradient and Hessian-vector product computations. |
Hoang Tran; Ashok Cutkosky; |
1929 | Differentially Private Online-to-batch for Smooth Losses Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a new reduction that converts any online convex optimization algorithm suffering $O(\sqrt{T})$ regret into an $\epsilon$-differentially private stochastic convex optimization algorithm with the optimal convergence rate $\tilde O(1/\sqrt{T} + 1/\epsilon T)$ on smooth losses in linear time, forming a direct analogy to the classical non-private “online-to-batch” conversion. |
Qinzi Zhang; Ashok Cutkosky; Hoang Tran; |
1930 | All You Need Is A Good Functional Prior for Bayesian Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This poses a challenge because modern neural networks are characterized by a large number of parameters, and the choice of these priors has an uncontrolled effect on the induced functional prior, which is the distribution of the functions obtained by sampling the parameters from their prior distribution. We argue that this is a hugely limiting aspect of Bayesian deep learning, and this work tackles this limitation in a practical and effective way. |
Ba-Hien Tran; Simone Rossi; Dimitrios Milios; Maurizio Filippone; |
1931 | Sparse Gaussian Process Hyperparameters: Optimize or Integrate? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we propose an algorithm for sparse Gaussian process regression which leverages MCMC to sample from the hyperparameter posterior within the variational inducing point framework of (Titsias, 2009). |
Vidhi Lalchand; Wessel Bruinsma; David Burt; Carl Edward Rasmussen; |
1932 | DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we utilize the strong alignment of textual and visual features pretrained with millions of auxiliary image-text pairs and propose \textit{Dual Context Optimization} (DualCoOp) as a unified framework for partial-label MLR and zero-shot MLR. |
Ximeng Sun; Ping Hu; Kate Saenko; |
1933 | Conservative Dual Policy Optimization for Efficient Model-Based Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Conservative Dual Policy Optimization (CDPO) that involves a Referential Update and a Conservative Update. |
Shenao Zhang; |
1934 | The Sample Complexity of One-Hidden-Layer Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study norm-based uniform convergence bounds for neural networks, aiming at a tight understanding of how these are affected by the architecture and type of norm constraint, for the simple class of scalar-valued one-hidden-layer networks, and inputs bounded in Euclidean norm. |
Gal Vardi; Ohad Shamir; Nati Srebro; |
1935 | On Margin Maximization in Linear and ReLU Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, that leaves open the question of whether this point will generally be an actual optimum of the max margin problem. In this paper, we study this question in detail, for several neural network architectures involving linear and ReLU activations. |
Gal Vardi; Ohad Shamir; Nati Srebro; |
1936 | Exponential Family Model-Based Reinforcement Learning Via Score Matching Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an optimistic model-based algorithm, dubbed SMRL, for finite-horizon episodic reinforcement learning (RL) when the transition model is specified by exponential family distributions with $d$ parameters and the reward is bounded and known. |
Gene Li; Junbo Li; Anmol Kabra; Nati Srebro; Zhaoran Wang; Zhuoran Yang; |
1937 | S4ND: Modeling Images and Videos As Multidimensional Signals with State Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a multidimensional version of S4 for modeling visual data. |
Eric Nguyen; Karan Goel; Albert Gu; Gordon Downs; Preey Shah; Tri Dao; Stephen Baccus; Christopher Ré; |
1938 | Tikhonov Regularization Is Optimal Transport Robust Under Martingale Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we find that Tikhonov regularization is distributionally robust in an optimal transport sense (i.e. if an adversary chooses distributions in a suitable optimal transport neighborhood of the empirical measure), provided that suitable martingale constraints are also imposed. |
Jiajin Li; Sirui Lin; Jose Blanchet; Viet Anh Nguyen; |
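For readers unfamiliar with the terminology, Tikhonov regularization here refers to the standard ridge-type penalty; the least-squares form below is only a reminder of that textbook objective (with a hypothetical regularization weight $\lambda > 0$), not the paper's distributionally robust formulation:

```latex
% Textbook Tikhonov (ridge) regularized least squares.
\min_{\beta}\; \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - x_i^{\top}\beta\bigr)^2 \;+\; \lambda \lVert \beta \rVert_2^2, \qquad \lambda > 0.
```

The highlight's claim is that this familiar penalty coincides with worst-case risk over an optimal-transport neighborhood of the empirical measure once suitable martingale constraints are imposed on the adversary.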
1939 | A Nonconvex Framework for Structured Dynamic Covariance Recovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a flexible, yet interpretable model for high-dimensional data with time-varying second-order statistics, motivated and applied to functional neuroimaging data. |
Katherine Tsai; Mladen Kolar; Sanmi Koyejo; |
1940 | Optimal Parameter-free Online Learning with Switching Cost Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on a novel dual space scaling strategy, we propose a simple yet powerful algorithm for Online Linear Optimization (OLO) with switching cost, which improves the existing suboptimal regret bound [ZCP22a] to the optimal rate. |
Zhiyu Zhang; Ashok Cutkosky; Yannis Paschalidis; |
1941 | Nonlinear Sufficient Dimension Reduction with A Stochastic Neural Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new type of stochastic neural network under a rigorous probabilistic framework and show that it can be used for sufficient dimension reduction for large-scale data. |
SIQI LIANG; Yan Sun; Faming Liang; |
1942 | Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: In this paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we … |
Feihu Huang; Shangqian Gao; Jian Pei; Heng Huang; |
1943 | Scalable Sensitivity and Uncertainty Analyses for Causal-Effect Estimates of Continuous-Valued Interventions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we develop a continuous treatment-effect marginal sensitivity model (CMSM) and derive bounds that agree with the observed data and a researcher-defined level of hidden confounding. We introduce a scalable algorithm and uncertainty-aware deep models to derive and estimate these bounds for high-dimensional, large-sample observational data. |
Andrew Jesson; Alyson Douglas; Peter Manshausen; Nicolai Meinshausen; Philip Stier; Yarin Gal; Uri Shalit; |
1944 | Enhanced Bilevel Optimization Via Bregman Distance Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we thus propose a class of enhanced bilevel optimization methods that use Bregman distance to solve bilevel optimization problems, where the outer subproblem is nonconvex and possibly nonsmooth, and the inner subproblem is strongly convex. |
Feihu Huang; Junyi Li; Shangqian Gao; Heng Huang; |
1945 | Contact-aware Human Motion Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, by contrast, we propose to explicitly model the human-scene contacts. |
Wei Mao; miaomiao Liu; Richard I Hartley; Mathieu Salzmann; |
1946 | On The Limitations of Stochastic Pre-processing Defenses Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: An example of such a defense is to apply a random transformation to inputs prior to feeding them to the model. In this paper, we empirically and theoretically investigate such stochastic pre-processing defenses and demonstrate that they are flawed. |
Yue Gao; I Shumailov; Kassem Fawaz; Nicolas Papernot; |
1947 | Approximate Value Equivalence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a theory of approximate value equivalence and use it to prove performance bounds for many value equivalent classes. |
Christopher Grimm; Andre Barreto; Satinder Singh; |
1948 | The Price of Ignorance: How Much Does It Cost to Forget Noise Structure in Low-rank Matrix Estimation? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main technical contribution is the rigorous analysis of a Bayes estimator and of an approximate message passing (AMP) algorithm, both of which incorrectly assume a Gaussian setup. |
Jean Barbier; TianQi Hou; Marco Mondelli; Manuel Saenz; |
1949 | Local Spatiotemporal Representation Learning for Longitudinally-consistent Neuroimage Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Given longitudinal neuroimages with scarce annotation, this paper develops a self-supervised spatiotemporal representation learning method and a consistency-regularization term for image-to-image networks. |
Mengwei Ren; Neel Dey; Martin Styner; Kelly Botteron; Guido Gerig; |
1950 | Score-Based Diffusion Meets Annealed Importance Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We here leverage recent progress in score-based generative modeling (SGM) to approximate the optimal extended target distribution for AIS proposals corresponding to the discretization of Langevin and Hamiltonian dynamics. We demonstrate these novel, differentiable, AIS procedures on a number of synthetic benchmark distributions and variational auto-encoders. |
Arnaud Doucet; Will Grathwohl; Alexander Matthews; Heiko Strathmann; |
1951 | HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the Recursive Gated Convolution ($g^n$Conv) that performs high-order spatial interactions with gated convolutions and recursive designs. |
Yongming Rao; Wenliang Zhao; Yansong Tang; Jie Zhou; Ser Nam Lim; Jiwen Lu; |
1952 | M$^4$I: Multi-modal Models Membership Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work studies the privacy leakage of multi-modal models through the lens of membership inference attack, the process of determining whether a data record was involved in the model training process or not. |
Pingyi Hu; Zihan Wang; Ruoxi Sun; Hu Wang; Minhui Xue; |
1953 | Green Hierarchical Vision Transformer for Masked Image Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an efficient approach for Masked Image Modeling (MIM) with hierarchical Vision Transformers (ViTs), e.g., Swin Transformer, allowing the hierarchical ViTs to discard masked patches and operate only on the visible ones. |
Lang Huang; Shan You; Mingkai Zheng; Fei Wang; Chen Qian; Toshihiko Yamasaki; |
1954 | Deep Generalized Schrödinger Bridge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we aim at solving a challenging class of MFGs in which the differentiability of these interacting preferences need not be available to the solver, and the population is urged to converge exactly to some desired distribution. |
Guan-Horng Liu; Tianrong Chen; Oswin So; Evangelos Theodorou; |
1955 | One-shot Neural Backdoor Erasing Via Adversarial Weight Masking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Adversarial Weight Masking (AWM), a novel method capable of erasing the neural backdoors even in the one-shot setting. |
Shuwen Chai; Jinghui Chen; |
1956 | A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By introducing a simple but novel deviation control mechanism, we propose a near-optimal primal-dual learning algorithm called DPDL. |
Fan Chen; Junyu Zhang; Zaiwen Wen; |
1957 | Deep Hierarchical Planning from Pixels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Director, a practical method for learning hierarchical behaviors directly from pixels by planning inside the latent space of a learned world model. |
Danijar Hafner; Kuang-Huei Lee; Ian Fischer; Pieter Abbeel; |
1958 | CASA: Category-agnostic Skeletal Animal Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast, humans can easily infer the articulation structure of an unknown character by associating it with a seen articulated object in their memory. Inspired by this fact, we present CASA, a novel category-agnostic articulated animal reconstruction method. |
Yuefan Wu; Zeyuan Chen; Shaowei Liu; Zhongzheng Ren; Shenlong Wang; |
1959 | When Does SGD Favor Flat Minima? A Quantitative Characterization Via Linear Stability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The observation that stochastic gradient descent (SGD) tends to select flat minima has played a fundamental role in understanding implicit regularization of SGD and guiding the tuning of hyperparameters. In this paper, we provide a quantitative explanation of this phenomenon by relating the particular noise structure of SGD to its \emph{linear stability} (Wu et al., 2018). |
Lei Wu; Mingze Wang; Weijie Su; |
1960 | Muffliato: Peer-to-Peer Privacy Amplification for Decentralized Optimization and Averaging Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce pairwise network differential privacy, a relaxation of LDP that captures the fact that the privacy leakage from a node $u$ to a node $v$ may depend on their relative position in the graph. |
Edwige Cyffers; Mathieu Even; Aurélien Bellet; Laurent Massoulié; |
1961 | Estimating and Explaining Model Performance When Both Covariates and Labels Shift Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new distribution shift model, Sparse Joint Shift (SJS), which considers the joint shift of both labels and a few features. |
Lingjiao Chen; Matei Zaharia; James Zou; |
1962 | Self-Supervised Learning Via Maximum Entropy Coding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we argue that existing pretext tasks inevitably introduce biases into the learned representation, which in turn leads to biased transfer performance on various downstream tasks. |
Xin Liu; Zhongdao Wang; Ya-Li Li; Shengjin Wang; |
1963 | SHINE: SubHypergraph Inductive Neural NEtwork Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To achieve accurate inductive subgraph prediction, we propose SubHypergraph Inductive Neural nEtwork (SHINE). |
Yuan Luo; |
1964 | Learning Optical Flow From Continuous Spike Streams Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a tailored network, Spike2Flow that extracts information from binary spikes with temporal-spatial representation based on the differential of spike firing time and spatial information aggregation. |
Rui Zhao; Ruiqin Xiong; Jing Zhao; Zhaofei Yu; Xiaopeng Fan; Tiejun Huang; |
1965 | Unsupervised Learning of Algebraic Structure from Stationary Time Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that equivariance can be learned from a set of sequences with different but constant accelerations/velocities, and show that disentanglement emerges when we train the model to be able to linearly predict the future in the latent space. |
Takeru Miyato; Masanori Koyama; Kenji Fukumizu; |
1966 | A Conditional Randomization Test for Sparse Logistic Regression in High-Dimension Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite the core role of sparse logistic regression in statistics and machine learning, it still lacks a good solution for accurate inference in the regime where the number of features $p$ is as large as or larger than the number of samples $n$. Here we tackle this problem by improving the Conditional Randomization Test (CRT). |
Binh T. Nguyen; Bertrand Thirion; Sylvain Arlot; |
1967 | SegViT: Semantic Segmentation with Plain Vision Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose the Attention-to-Mask (ATM) module, in which the similarity maps between a set of learnable class tokens and the spatial feature maps are transferred to the segmentation masks. |
Bowen Zhang; Zhi Tian; Quan Tang; Xiangxiang Chu; Xiaolin Wei; Chunhua Shen; Yifan liu; |
1968 | VITA: Video Instance Segmentation Via Object Token Association Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a novel paradigm for offline Video Instance Segmentation (VIS), based on the hypothesis that explicit object-oriented information can be a strong clue for understanding the context of the entire sequence. |
Miran Heo; Sukjun Hwang; Seoung Wug Oh; Joon-Young Lee; Seon Joo Kim; |
1969 | Resource-Adaptive Federated Learning with All-In-One Neural Composition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Such system heterogeneity results in an inevitable trade-off between model complexity and data accessibility as a bottleneck. To avoid such a dilemma and achieve resource-adaptive federated learning, we introduce a simple yet effective mechanism, termed All-In-One Neural Composition, to systematically support training complexity-adjustable models with flexible resource adaption. |
Yiqun Mei; Pengfei Guo; Mo Zhou; Vishal Patel; |
1970 | Behavior Transformers: Cloning $k$ Modes with One Stone Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present Behavior Transformer (BeT), a new technique to model unlabeled demonstration data with multiple modes. |
Nur Muhammad Shafiullah; Zichen Cui; Ariuntuya Altanzaya; Lerrel Pinto; |
1971 | Single-pass Streaming Lower Bounds for Multi-armed Bandits Exploration with Instance-sensitive Sample Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by applications to processing massive datasets, we study streaming algorithms for pure exploration in Stochastic Multi-Armed Bandits (MABs). |
Sepehr Assadi; Chen Wang; |
1972 | Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce new concept-level evaluation tools to the RL community, and illustrate how evaluations other than task performance can be used to provide a more complete picture of a model’s strengths and weaknesses using AlphaZero and the game of Hex. |
Charles Lovering; Jessica Forde; George Konidaris; Ellie Pavlick; Michael Littman; |
1973 | Bridging Central and Local Differential Privacy in Data Acquisition Mechanisms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the design of optimal Bayesian data acquisition mechanisms for a platform interested in estimating the mean of a distribution by collecting data from privacy-conscious users. |
Alireza Fallah; Ali Makhdoumi; azarakhsh malekian; Asuman Ozdaglar; |
1974 | Intra-agent Speech Permits Zero-shot Task Acquisition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Through both social language use and internal processes of rehearsal and practice, language learners are able to build high-level, semantic representations that explain their perceptions. Here, we take inspiration from such processes of "inner speech" in humans (Vygotsky, 1934) to better understand the role of intra-agent speech in embodied behavior. |
Chen Yan; Federico Carnevale; Petko I Georgiev; Adam Santoro; Aurelia Guy; Alistair Muldal; Chia-Chun Hung; Joshua Abramson; Timothy Lillicrap; Gregory Wayne; |
1975 | PALMER: Perception-Action Loop with Memory Reorganization for Planning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Using action-informed perceptual representations, we develop a memory-based model of the environment that enables planning for long horizon tasks. |
Onur Beker; Mohammad Mohammadi; Amir Zamir; |
1976 | Effective Dimension in Bandit Problems Under Censorship Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study both multi-armed and contextual bandit problems in censored environments. |
Gauthier Guinet; Saurabh Amin; Patrick Jaillet; |
1977 | LBD: Decouple Relevance and Observation for Individual-Level Unbiased Learning to Rank Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It is difficult to ravel out this coupled effect and thus obtain a correct relevance model from click data. To address this issue, we first present the concept of coupling effect for individual-level ULTR. Then, we develop the novel Lipschitz and Bernoulli Decoupling (LBD) model to decouple the effects on relevance and observation at the individual level. |
Mouxiang Chen; Chenghao Liu; Zemin Liu; Jianling Sun; |
1978 | What Can The Neural Tangent Kernel Tell Us About Adversarial Robustness? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study adversarial examples through the lens of the NTK, introduce a new set of induced features to uncover the role of robust/non-robust features in classification, and study the kernel dynamics during adversarial training. |
Nikolaos Tsilivis; Julia Kempe; |
1979 | Formulating Robustness Against Unforeseen Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our key contribution is to formally define the problem of learning and generalization with an unforeseen adversary, which helps us reason about the increase in adversarial risk from the conventional perspective of a known adversary. |
Sihui Dai; Saeed Mahloujifar; Prateek Mittal; |
1980 | GRASP: Navigating Retrosynthetic Planning with Goal-driven Policy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a Goal-dRiven Actor-critic retroSynthetic Planning (GRASP) framework, where we identify the policy that performs goal-driven retrosynthesis navigation toward a user-demand objective. |
Yemin Yu; Ying Wei; Kun Kuang; Zhengxing Huang; Huaxiu Yao; Fei Wu; |
1981 | Retrospective Adversarial Replay for Continual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Continual learning addresses the problem where models quickly fit the most recently trained-on data and are prone to catastrophic forgetting due to distribution shifts; replay-based methods mitigate this by maintaining a small historical replay buffer. To avoid these problems, this paper proposes a method, “Retrospective Adversarial Replay (RAR)”, that synthesizes adversarial samples near the forgetting boundary. |
Lilly Kumari; Shengjie Wang; Tianyi Zhou; Jeff A Bilmes; |
1982 | Domain Adaptation Under Open Set Label Shift Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce the problem of domain adaptation under Open Set Label Shift (OSLS), in which the label distribution can change arbitrarily and a new class may arrive during deployment, but the class-conditional distributions $p(x|y)$ are domain-invariant. |
Saurabh Garg; Sivaraman Balakrishnan; Zachary Lipton; |
1983 | Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study the dynamics of multi-pass SGD on high-dimensional convex quadratics and establish an asymptotic equivalence to a stochastic differential equation, which we call homogenized stochastic gradient descent (HSGD), whose solutions we characterize explicitly in terms of a Volterra integral equation. |
Courtney Paquette; Elliot Paquette; Ben Adlam; Jeffrey Pennington; |
1984 | Exploring Linear Feature Scalability of Vision Transformer for Parameter-efficient Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we introduce a novel task-specific linear feature scalability adaptation layer delicately designed for each block, which takes into account the linear transformation of the features. |
Dongze Lian; Daquan Zhou; Jiashi Feng; Xinchao Wang; |
1985 | Online Training Through Time for Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning by tracking presynaptic activities and leveraging instantaneous loss and gradients. |
Mingqing Xiao; Qingyan Meng; Zongpeng Zhang; Di He; Zhouchen Lin; |
1986 | When Combinatorial Thompson Sampling Meets Approximation Regret Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the behavior of the Combinatorial Thompson Sampling policy (CTS) for combinatorial multi-armed bandit problems (CMAB), within an approximation regret setting. |
Pierre Perrault; |
1987 | Improving Diffusion Models for Inverse Problems Using Manifold Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By studying the generative sampling path, here we show that current solvers throw the sample path off the data manifold, and hence the error accumulates. To address this, we propose an additional correction term inspired by the manifold constraint, which can be used synergistically with the previous solvers to make the iterations close to the manifold. |
Hyungjin Chung; Byeongsu Sim; Dohoon Ryu; Jong Chul Ye; |
1988 | Active Bayesian Causal Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning, i.e., for jointly inferring a posterior over causal models and queries of interest. |
Christian Toth; Lars Lorch; Christian Knoll; Andreas Krause; Franz Pernkopf; Robert Peharz; Julius von Kügelgen; |
1989 | Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Meta-Reward-Net (MRN), a data-efficient PbRL framework that incorporates bi-level optimization for both reward and policy learning. |
Runze Liu; Fengshuo Bai; Yali Du; Yaodong Yang; |
1990 | Vision GNN: An Image Is Worth Graph of Nodes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to represent the image as a graph structure and introduce a new \emph{Vision GNN} (ViG) architecture to extract graph-level feature for visual tasks. |
Kai Han; Yunhe Wang; Jianyuan Guo; Yehui Tang; Enhua Wu; |
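As a rough illustration of the “image as a graph of nodes” idea, the sketch below connects each patch embedding to its nearest neighbours in feature space to form an adjacency matrix that a GNN could consume. This is a minimal, assumption-laden sketch (the function name, neighbour count, and random embeddings are hypothetical), not the ViG architecture itself:

```python
import numpy as np

def knn_patch_graph(patch_feats: np.ndarray, k: int = 9) -> np.ndarray:
    """Connect each image patch (node) to its k nearest neighbours in feature space.

    patch_feats: (N, D) array with one row per patch embedding (hypothetical inputs).
    Returns an (N, N) binary adjacency matrix.
    """
    # Pairwise squared Euclidean distances between patch embeddings.
    sq_norms = (patch_feats ** 2).sum(axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * patch_feats @ patch_feats.T
    np.fill_diagonal(dists, np.inf)  # exclude self-matches from the k-NN step

    # Indices of the k closest patches for every node.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    n = patch_feats.shape[0]
    adj = np.zeros((n, n), dtype=np.float32)
    adj[np.repeat(np.arange(n), k), neighbours.ravel()] = 1.0
    return adj

# Example: a 14x14 grid of patches with 64-dim embeddings -> a 196-node graph.
feats = np.random.randn(196, 64).astype(np.float32)
adjacency = knn_patch_graph(feats, k=9)
```

Graph convolutions over such an adjacency then replace the fixed grid or sequence structure assumed by CNNs and Transformers.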
1991 | CCCP Is Frank-Wolfe in Disguise Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper shows that the well-known convex-concave procedure (CCCP) and its generalization to constrained problems are both special cases of the Frank-Wolfe method. |
Alp Yurtsever; Suvrit Sra; |
1992 | Kernel Multimodal Continuous Attention Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Farinhas et al. (2021) extended this to multimodality via Gaussian mixture attention densities. In this paper, we extend this to kernel exponential families (Canu and Smola, 2006) and our new sparse counterpart, kernel deformed exponential families. |
Alexander Moreno; Zhenke Wu; Supriya Nagesh; Walter Dempsey; James Rehg; |
1993 | Stochastic Window Transformer for Image Restoration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we introduce the window partition with stochastic shift to replace the original fixed window partition for training and elaborate the layer expectation propagation algorithm to efficiently approximate the expectation of the induced stochastic transformer for testing. |
Jie Xiao; Xueyang Fu; Feng Wu; Zheng-Jun Zha; |
1994 | Scalable and Efficient Non-adaptive Deterministic Group Testing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To avoid all the above-mentioned drawbacks, for Quantitative Group Testing (QGT), where the result of a query is the size of its intersection with the hidden set, we present the first efficient and scalable non-adaptive deterministic algorithms for constructing queries and decoding a hidden set K from the query results – these solutions do not use any randomization, adaptiveness or unlimited computational power. |
Dariusz Kowalski; Dominik Pajak; |
1995 | Privacy Induces Robustness: Information-Computation Gaps and Sparse Mean Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We establish a simple connection between robust and differentially-private algorithms: private mechanisms which perform well with very high probability are automatically robust in the sense that they retain accuracy even if a constant fraction of the samples they receive are adversarially corrupted. |
Kristian Georgiev; Samuel Hopkins; |
1996 | Learning Physics Constrained Dynamics Using Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of estimating states (e.g., position and velocity) and physical parameters (e.g., friction, elasticity) from a sequence of observations when provided a dynamic equation that describes the behavior of the system. |
Tsung-Yen Yang; Justinian Rosca; Karthik Narasimhan; Peter J Ramadge; |
1997 | Envy-free Policy Teaching to Multiple Agents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: An important question in this setting concerns how a teaching program can be designed so that the agents think that they are treated fairly. We adopt the fairness notion of envy-freeness (EF) to formalize this question and define three different EF notions, each imposing stronger requirements than the previous one. |
Jiarui Gan; R Majumdar; Adish Singla; Goran Radanovic; |
1998 | Continuously Tempered PDMP Samplers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce an extended distribution defined over the state of the posterior distribution and an inverse temperature, which interpolates between a tractable distribution when the inverse temperature is 0 and the posterior when the inverse temperature is 1. |
Matthew Sutton; Robert Salomone; Augustin Chevallier; Paul Fearnhead; |
1999 | An Adaptive Kernel Approach to Federated Learning of Heterogeneous Causal Effects Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new causal inference framework to learn causal effects from multiple, decentralized data sources in a federated setting. |
Thanh Vinh Vo; Arnab Bhattacharyya; Young Lee; Tze-Yun Leong; |
2000 | DevFly: Bio-Inspired Development of Binary Connections for Locality Preserving Sparse Codes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we explore a bio-inspired development process to form the connections in a network used for locality sensitive hashing. |
Tianqi Wei; Rana Alkhoury Maroun; Qinghai Guo; Barbara Webb; |
2001 | PhysGNN: A Physics–Driven Graph Neural Network Based Model for Predicting Soft Tissue Deformation in Image–Guided Neurosurgery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, this work proposes a novel framework, PhysGNN, a data-driven model that approximates the solution of FEA by leveraging graph neural networks (GNNs), which are capable of accounting for the mesh structural information and inductive learning over unstructured grids and complex topological structures. |
Yasmin Salehi; Dennis Giannacopoulos; |
2002 | Better Uncertainty Calibration Via Proper Scores for Classification and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce the framework of \textit{proper calibration errors}, which relates every calibration error to a proper score and provides a respective upper bound with optimal estimation properties. |
Sebastian Gruber; Florian Buettner; |
2003 | MLA: MultiLingual Acquisition on Multimodal Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a MultiLingual Acquisition (MLA) framework that can easily empower a monolingual Vision-Language Pre-training (VLP) model with multilingual capability. |
Liang Zhang; Anwen Hu; Qin Jin; |
2004 | Keypoint-Guided Optimal Transport with Applications in Heterogeneous Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel KeyPoint-Guided model by ReLation preservation (KPG-RL) that searches for the matching guided by the keypoints in OT. |
Xiang Gu; Yucheng Yang; Wei Zeng; Jian Sun; Zongben Xu; |
2005 | A Gradient Sampling Method with Complexity Guarantees for Lipschitz Functions in High and Low Dimensions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we obtain the same efficiency guarantee with a standard subgradient oracle, thus making our algorithm efficiently implementable. |
Damek Davis; Dmitriy Drusvyatskiy; Yin Tat Lee; Swati Padmanabhan; Guanghao Ye; |
2006 | On The Representation Collapse of Sparse Mixture of Experts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to estimate the routing scores between tokens and experts on a low-dimensional hypersphere. |
Zewen Chi; Li Dong; Shaohan Huang; Damai Dai; Shuming Ma; Barun Patra; Saksham Singhal; Payal Bajaj; XIA SONG; Xian-Ling Mao; Heyan Huang; Furu Wei; |
2007 | Certifying Robust Graph Classification Under Orthogonal Gromov-Wasserstein Threats Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although certificates of robustness have been recently developed, their threat model only counts local and global edge perturbations, which effectively ignores important graph structures such as isomorphism. To address this issue, we propose measuring the perturbation with the orthogonal Gromov-Wasserstein discrepancy, and building its Fenchel biconjugate to facilitate convex optimization. |
Hongwei Jin; Zishun Yu; Xinhua Zhang; |
2008 | Randomized Sketches for Clustering: Fast and Optimal Kernel $k$-Means Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: More precisely, we propose a unified randomized sketches framework to kernel $k$-means and investigate its excess risk bounds, obtaining the state-of-the-art risk bound with only a fraction of computations. |
Rong Yin; Yong Liu; Weiping Wang; Dan Meng; |
2009 | Learning Rigid Body Dynamics with Lagrangian Graph Neural Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we present a Lagrangian graph neural network (LGNN) that can learn the dynamics of rigid bodies by exploiting their topology. |
Ravinder Bhattoo; Sayan Ranu; N M Anoop Krishnan; |
2010 | Relaxing Equivariance Constraints with Non-stationary Continuous Filters Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a parameter-efficient relaxation of equivariance that can effectively interpolate between (i) a non-equivariant linear product, (ii) a strictly equivariant convolution, and (iii) a strictly invariant mapping. |
Tycho van der Ouderaa; David W. Romero; Mark van der Wilk; |
2011 | Subquadratic Kronecker Regression with Applications to Tensor Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a subquadratic-time algorithm for the Kronecker product regression problem. |
Mehrdad Ghadiri; Matthew Fahrbach; Gang Fu; |
2012 | CascadeXML: End-to-end Multi-Resolution Learning for Extreme Multi-Label Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We thus propose CascadeXML, an end-to-end multi-resolution learning pipeline, which can harness the multi-layered architecture of a transformer model for attending to different label resolutions with separate feature representations. |
Siddhant Kharbanda; Atmadeep Banerjee; Erik Schultheis; Rohit Babbar; |
2013 | Okapi: Generalising Better By Making Statistical Matches Match Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Okapi, a simple, efficient, and general method for robust semi-supervised learning based on online statistical matching. |
Myles Bartlett; Sara Romiti; Viktoriia Sharmanska; Novi Quadrianto; |
2014 | Improved Techniques for Deterministic L2 Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce several new techniques that lead to significant improvements on CIFAR-10 and CIFAR-100 for both standard and provable robust accuracy, and establish a new state of the art. |
Sahil Singla; Soheil Feizi; |
2015 | Unsupervised Adaptation from Repeated Traversals for Autonomous Driving Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While extensive research has been done on such an unsupervised domain adaptation problem, one fundamental problem lingers: there is no reliable signal in the target domain to supervise the adaptation process. To overcome this issue we observe that it is easy to collect unsupervised data from multiple traversals of repeated routes. |
Yurong You; Cheng Perng Phoo; Katie Luo; Travis Zhang; Wei-Lun Chao; Bharath Hariharan; Mark Campbell; Kilian Weinberger; |
2016 | Byzantine Spectral Ranking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of rank aggregation where the goal is to obtain a global ranking by aggregating pair-wise comparisons of voters over a set of items. |
Arnhav Datar; Arun Rajkumar; John Augustine; |
2017 | On Measuring Excess Capacity in Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study the excess capacity of deep networks in the context of supervised classification. |
Florian Graf; Sebastian Zeng; Bastian Rieck; Marc Niethammer; Roland Kwitt; |
2018 | GPT3.int8(): 8-bit Matrix Multiplication for Transformers at Scale Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop methods for Int8 matrix multiplication for transformer multi-layer perceptron (MLP) and attention projection layers, which cut the required memory for inference by half while retaining full precision performance. |
Tim Dettmers; Mike Lewis; Luke Zettlemoyer; |
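To make the idea concrete, the sketch below performs a matrix multiplication with vector-wise absmax Int8 quantization (row-wise scales for the activations, column-wise scales for the weights). This is a simplified illustration under assumptions, not the paper's method; in particular it omits the mixed-precision handling of outlier feature dimensions that the actual approach relies on to retain full-precision performance:

```python
import numpy as np

def int8_matmul(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Approximate x @ w with vector-wise absmax int8 quantization (illustrative only)."""
    # Row-wise scales for x and column-wise scales for w (absmax quantization to [-127, 127]).
    sx = np.abs(x).max(axis=1, keepdims=True) / 127.0   # shape (m, 1)
    sw = np.abs(w).max(axis=0, keepdims=True) / 127.0   # shape (1, n)
    xq = np.clip(np.round(x / sx), -127, 127).astype(np.int8)
    wq = np.clip(np.round(w / sw), -127, 127).astype(np.int8)

    # Integer matmul accumulated in int32, then dequantized with the outer product of the scales.
    acc = xq.astype(np.int32) @ wq.astype(np.int32)
    return acc.astype(np.float32) * (sx * sw)

x = np.random.randn(4, 16).astype(np.float32)
w = np.random.randn(16, 8).astype(np.float32)
print(np.abs(int8_matmul(x, w) - x @ w).max())  # small quantization error on well-behaved inputs
```

Storing the weight matrices in int8 rather than 16-bit is what roughly halves the memory needed at inference time.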
2019 | Bounding and Approximating Intersectional Fairness Through Marginal Fairness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, our primary goal is to understand in detail the relationship between marginal and intersectional fairness through statistical analysis. |
Mathieu Molina; Patrick Loiseau; |
2020 | A Multilabel Classification Framework for Approximate Nearest Neighbor Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We formulate approximate nearest neighbor search as a multilabel classification problem and provide a sufficient condition for consistency of partitioning classifiers under this formulation. |
Ville Hyvönen; Elias Jääsaari; Teemu Roos; |
2021 | Power and Limitations of Single-qubit Native Quantum Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we formulate a theoretical framework for the expressive ability of data re-uploading quantum neural networks that consist of interleaved encoding circuit blocks and trainable circuit blocks. |
Zhan Yu; Hongshun Yao; Mujin Li; Xin Wang; |
2022 | Local Identifiability of Deep ReLU Neural Networks: The Theory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Is a sample rich enough to determine, at least locally, the parameters of a neural network? To answer this question, we introduce a new local parameterization of a given deep ReLU neural network by fixing the values of some of its weights. |
Joachim Bona-Pellissier; François Malgouyres; Francois Bachoc; |
2023 | Proximal Point Imitation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work develops new algorithms with rigorous efficiency guarantees for infinite horizon imitation learning (IL) with linear function approximation without restrictive coherence assumptions. |
Luca Viano; Angeliki Kamoutsi; Gergely Neu; Igor Krawczuk; Volkan Cevher; |
2024 | Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite recent advances in meta-RL, most existing methods require access to the environmental reward function of new tasks to infer the task objective, which is not realistic in many practical applications. To bridge this gap, we study the problem of few-shot adaptation in the context of human-in-the-loop reinforcement learning. |
Zhizhou Ren; Anji Liu; Yitao Liang; Jian Peng; Jianzhu Ma; |
2025 | Generalization Analysis of Message Passing Neural Networks on Large Random Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the generalization error of MPNNs in graph classification and regression. |
Sohir Maskey; Ron Levie; Yunseok Lee; Gitta Kutyniok; |
2026 | Isometric 3D Adversarial Examples in The Physical World Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel $\epsilon$-isometric ($\epsilon$-ISO) attack method to generate natural and robust 3D adversarial examples in the physical world by considering the geometric properties of 3D objects and the invariance to physical transformations. |
yibo miao; Yinpeng Dong; Jun Zhu; Xiao-Shan Gao; |
2027 | Exploitability Minimization in Games and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop fast exploitability-minimization methods that compute exact or approximate GNE in pseudo-games with jointly convex constraints. |
Denizalp Goktas; Amy Greenwald; |
2028 | Bring Your Own Algorithm for Optimal Differentially Private Stochastic Minimax Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study differentially private (DP) algorithms for smooth stochastic minimax optimization, with stochastic minimization as a byproduct. |
Liang Zhang; Kiran Thekumparampil; Sewoong Oh; Niao He; |
2029 | HyperDomainNet: Universal Domain Adaptation for Generative Adversarial Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a novel domain-modulation technique that allows optimizing only a 6-thousand-dimensional vector instead of the 30 million weights of StyleGAN2 to adapt to a target domain. |
Aibek Alanov; Vadim Titov; Dmitry Vetrov; |
2030 | Measures of Information Reflect Memorization Patterns Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we hypothesize—and subsequently show—that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization. |
Rachit Bansal; Danish Pruthi; Yonatan Belinkov; |
2031 | Unlabelled Sample Compression Schemes for Intersection-Closed Classes and Extremal Classes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we simplify and extend their proof technique to deal with so-called extremal classes of VC dimension $d$ which contain maximum classes of VC dimension $d-1$. |
Joachim Rubinstein; Benjamin Rubinstein; |
2032 | Batch Bayesian Optimisation Via Density-ratio Estimation with Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a theoretical analysis of BORE’s regret and an extension of the algorithm with improved uncertainty estimates. |
Rafael Oliveira; Louis Tiao; Fabio Ramos; |
2033 | Asynchronous SGD Beats Minibatch SGD Under Arbitrary Delays Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove much better theoretical guarantees for asynchronous SGD, which depend on the number of workers rather than the delays. |
Blake Woodworth; Mathieu Even; Konstantin Mishchenko; Francis Bach; |
2034 | An In-depth Study of Stochastic Backpropagation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we provide an in-depth study of Stochastic Backpropagation (SBP) when training deep neural networks for standard image classification and object detection tasks. |
Jun Fang; Mingze Xu; Hao Chen; Bing Shuai; Zhuowen Tu; Joseph Tighe; |
2035 | A Time-resolved Theory of Information Encoding in Recurrent Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we develop a non-stationary dynamic mean-field theory that transparently explains how tight balance of excitatory currents by recurrent inhibition improves information encoding. |
Rainer Engelken; Sven Goedeke; |
2036 | Private Set Generation with Discriminative Information Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our work provides an alternative view for differentially private generation of high-dimensional data and introduces a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches. |
Dingfan Chen; Raouf Kerkouche; Mario Fritz; |
2037 | Equivariant Networks for Zero-Shot Coordination Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel equivariant network architecture for use in Dec-POMDPs that prevents the agent from learning policies which break symmetries, doing so more effectively than prior methods. |
Darius Muglich; Christian Schroeder de Witt; Elise van der Pol; Shimon Whiteson; Jakob Foerster; |
2038 | Grounded Reinforcement Learning: Learning to Win The Game Under Human Commands Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of building a reinforcement learning (RL) agent that can both accomplish non-trivial tasks, like winning a real-time strategy game, and strictly follow high-level language commands from humans, like “attack”, even if a command is sub-optimal. |
Shusheng Xu; Huaijie Wang; YI WU; |
2039 | Tractable Optimality in Episodic Latent MABs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show that learning with {\em polynomial} samples in $A$ is possible. |
Jeongyeol Kwon; Yonathan Efroni; Constantine Caramanis; Shie Mannor; |
2040 | Phase Diagram of Stochastic Gradient Descent in High-dimensional Two-layer Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our work builds on a deterministic description of SGD in high-dimensions from statistical physics, which we extend and for which we provide rigorous convergence rates. |
Rodrigo Veiga; Ludovic Stephan; Bruno Loureiro; Florent Krzakala; Lenka Zdeborová; |
2041 | Patching Open-vocabulary Models By Interpolating Weights Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study model patching, where the goal is to improve accuracy on specific tasks (i.e., patching tasks) without degrading accuracy on tasks where performance is already adequate (i.e., supported tasks). |
Gabriel Ilharco; Mitchell Wortsman; Samir Yitzhak Gadre; Shuran Song; Hannaneh Hajishirzi; Simon Kornblith; Ali Farhadi; Ludwig Schmidt; |
2042 | On Leave-One-Out Conditional Mutual Information For Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive information theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI). |
Mohamad Rida Rammal; Alessandro Achille; Suhas Diggavi; Stefano Soatto; Aditya Golatkar; |
2043 | Truly Deterministic Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a policy gradient method that avoids exploratory noise injection and performs policy search over the deterministic landscape, with the goal of improving learning with long horizons and non-local rewards. |
Ehsan Saleh; Saba Ghaffari; Tim Bretl; Matthew West; |
2044 | DiSC: Differential Spectral Clustering of Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In many applications, the features of interest form clusters with similar effects on the data at hand. To recover such clusters we develop DiSC, a data-driven approach for detecting groups of features that differentiate between conditions. |
Ram Dyuthi Sristi; Gal Mishne; Ariel Jaffe; |
2045 | You Can’t Count on Luck: Why Decision Transformers Fail in Stochastic Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we describe the limitations of RvS approaches in stochastic environments and propose a solution. |
Keiran Paster; Sheila McIlraith; Jimmy Ba; |
2046 | No Free Lunch from Deep Learning in Neuroscience: A Case Study Through Models of The Entorhinal-Hippocampal Circuit Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The central claims of recent deep learning-based models of brain circuits are that they make novel predictions about neural phenomena or shed light on the fundamental functions being optimized. We show, through the case-study of grid cells in the entorhinal-hippocampal circuit, that one may get neither. |
Rylan Schaeffer; Mikail Khona; Ila Fiete; |
2047 | Conditional Diffusion Process for Inverse Halftoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the first generative halftoning method in the literature, which regards the black pixels in halftones as physically moving particles, and makes the randomly distributed particles move under some certain guidance through the reverse diffusion process, so as to obtain the desired halftone patterns. |
Hao Jiang; Yadong Mu; |
2048 | Analyzing Lottery Ticket Hypothesis from PAC-Bayesian Theory Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, since the initial large learning rate generally helps the optimizer to converge to flatter minima, we hypothesize that the winning tickets have relatively sharp minima, which is considered a disadvantage in terms of generalization ability. In this paper, we confirm this hypothesis and show that the PAC-Bayesian theory can provide an explicit understanding of the relationship between LTH and generalization behavior. |
Keitaro Sakamoto; Issei Sato; |
2049 | Non-Stationary Bandits Under Recharging Payoffs: Improved Planning with Sublinear Regret Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We first focus on the setting where the mean payoff functions are known. In this setting, we significantly improve the best-known guarantees for the planning problem by developing a polynomial-time $(1-{1}/{e})$-approximation algorithm (asymptotically and in expectation), based on a novel combination of randomized LP rounding and a time-correlated (interleaved) scheduling method. |
Orestis Papadigenopoulos; Constantine Caramanis; Sanjay Shakkottai; |
2050 | Emergence of Hierarchical Layers in A Single Sheet of Self-Organizing Spiking Neurons Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we propose new self-organization principles that allow for the formation of hierarchical cortical regions (i.e. layers) in a completely unsupervised manner without requiring any predefined architecture. |
Paul Bertens; Seong-Whan Lee; |
2051 | Mix and Reason: Reasoning Over Semantic Topology with Data Mixing for Domain Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Mix and Reason (MiRe), a new DG framework that learns semantic representations via enforcing the structural invariance of semantic topology. |
Chaoqi Chen; Luyao Tang; Feng Liu; Gangming Zhao; Yue Huang; Yizhou Yu; |
2052 | Modeling Neural Population Activity with Spatiotemporal Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we introduce SpatioTemporal Neural Data Transformer (STNDT), an NDT-based architecture that explicitly models responses of individual neurons in the population across time and space to uncover their underlying firing rates. |
Trung Le; Eli Shlizerman; |
2053 | Double Bubble, Toil and Trouble: Enhancing Certified Robustness Through Transitivity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While proofs exist demonstrating best-possible robustness for $L_2$-norm bounded attacks, these certificates are still lower bounds on the distance between a point of interest and its nearest adversarial example. In this work, we demonstrate how these best-possible certificates can be improved upon by exploiting both the transitivity of certifications and the geometry of the input space, giving rise to what we have called Geometrically Informed Certified Robustness. |
Andrew Cullen; Paul Montague; Shijie Liu; Sarah Erfani; Benjamin Rubinstein; |
2054 | Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To obtain a policy that optimizes the worst-case performance, we propose an off-policy actor-critic approach called the Max-Min Twin Delayed Deep Deterministic Policy Gradient Algorithm (M2TD3), which solves a max-min optimization problem using a simultaneous gradient ascent descent approach. |
Takumi Tanabe; Rei Sato; Kazuto Fukuchi; Jun Sakuma; Youhei Akimoto; |
2055 | Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We demonstrate the advantage of a distribution-matching algorithm over IRM and ERM under a concrete nontrivial data model for domain generalization. |
Yining Chen; Elan Rosenfeld; Mark Sellke; Tengyu Ma; Andrej Risteski; |
2056 | Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We use Neural Architecture Search (NAS) to automatically distill several compressed students with variable cost from a large model. |
Dongkuan (DK) Xu; Subhabrata Mukherjee; Xiaodong Liu; Debadeepta Dey; Wenhui Wang; Xiang Zhang; Ahmed Awadallah; Jianfeng Gao; |
2057 | PAC: Assisted Value Factorisation with Counterfactual Predictions in Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that in partially observable MARL problems, an agent’s ordering over its own actions could impose concurrent constraints (across different states) on the representable function class, causing significant estimation error during training. We tackle this limitation and propose PAC, a new framework leveraging Assistive information generated from Counterfactual Predictions of optimal joint action selection, which enables explicit assistance to value function factorization through a novel counterfactual loss. |
Hanhan Zhou; Tian Lan; Vaneet Aggarwal; |
2058 | Oscillatory Tracking of Continuous Attractor Neural Networks Account for Phase Precession and Procession of Hippocampal Place Cells Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we propose a neural circuit model to elucidate the generation of both kinds of phase shift in place cells’ firing. |
Tianhao Chu; Zilong Ji; Junfeng Zuo; Wenhao Zhang; Tiejun Huang; Yuanyuan Mi; Si Wu; |
2059 | A Variational Edge Partition Model for Supervised Graph Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces a graph generative process to model how the observed edges are generated by aggregating the node interactions over a set of overlapping node communities, each of which contributes to the edges via a logical OR mechanism. |
Yilin He; Chaojie Wang; Hao Zhang; Bo Chen; Mingyuan Zhou; |
2060 | A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the first unified theoretical analysis of mixed sample data augmentation (MSDA). |
Chanwoo Park; Sangdoo Yun; Sanghyuk Chun; |
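For entry 2060, a generic mixup-style mixed sample data augmentation sketch, included only to make the object of the paper's unified loss-function analysis concrete; the Beta parameter, batch layout, and one-hot label mixing below are illustrative, and the analysis itself is not reproduced.

```python
# Hedged sketch of mixup-style MSDA: each example is convexly combined with a
# randomly permuted partner, and the labels are mixed with the same weight.
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=np.random.default_rng(0)):
    """Mix each example in the batch with a randomly chosen partner."""
    lam = rng.beta(alpha, alpha)            # mixing weight drawn from Beta
    perm = rng.permutation(len(x))          # partner assignment
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix

x = np.random.rand(8, 32)                      # toy batch of 8 inputs
y = np.eye(3)[np.random.randint(0, 3, 8)]      # one-hot labels for 3 classes
x_mix, y_mix = mixup_batch(x, y)
```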
2061 | Diffusion-LM Improves Controllable Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While recent works have demonstrated successes on controlling simple sentence attributes (e.g., sentiment), there has been little progress on complex, fine-grained controls (e.g., syntactic structure). To address this challenge, we develop a new non-autoregressive language model based on continuous diffusions that we call Diffusion-LM. |
Xiang Li; John Thickstun; Ishaan Gulrajani; Percy Liang; Tatsunori Hashimoto; |
2062 | Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For antibody optimization, we propose a special sampling scheme that first perturbs the given antibody and then denoises it. |
Shitong Luo; Yufeng Su; Xingang Peng; Sheng Wang; Jian Peng; Jianzhu Ma; |
2063 | Is Sortition Both Representative and Fair? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper studies sortition (random selection of democratic representatives), proposes a quantitative measure of how well the population is represented, and characterizes its tradeoff with fairness, resulting in novel algorithms. |
Soroush Ebadian; Gregory Kehne; Evi Micha; Ariel Procaccia; Nisarg Shah; |
2064 | SoftCore: Unsupervised Anomaly Detection with Noisy Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper considers label-level noise in sensory anomaly detection for the first time. To solve this problem, we propose a memory-based unsupervised AD method, SoftCore, which efficiently denoises the data at the patch level. |
Xi Jiang; Jianlin Liu; Jinbao Wang; Qiang Nie; Kai WU; Yong Liu; Chengjie Wang; Feng Zheng; |
2065 | Weighted Distillation with Unlabeled Examples Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Naturally, the success of the approach depends on the quality of the teacher’s labels, since the student could be confused if trained on inaccurate data. This paper proposes a principled approach for addressing this issue based on an importance reweighting scheme tailored to the distillation training paradigm. |
Fotis Iliopoulos; Cenk Baykal; Vasilis Kontonis; Gaurav Menghani; Khoa Trinh; Erik Vee; |
2066 | OPEN: Orthogonal Propagation with Ego-Network Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A novel Orthogonal Propagation with Ego-Network modeling (OPEN) is proposed by modeling relevances between propagations. |
Liang Yang; Lina Kang; Qiuliang Zhang; Mengzhe Li; bingxin niu; Dongxiao He; Zhen Wang; Chuan Wang; Xiaochun Cao; Yuanfang Guo; |
2067 | Rethinking The Reverse-engineering of Trojan Triggers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We observe that both input-space and feature-space Trojans are associated with feature space hyperplanes. Based on this observation, we design a novel reverse-engineering method that exploits the feature space constraint to reverse-engineer Trojan triggers. |
Zhenting Wang; Kai Mei; Hailun Ding; Juan Zhai; Shiqing Ma; |
2068 | Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel continual learning method to facilitate backward knowledge transfer, which can even achieve positive backward transfer on standard CL benchmarks for the first time. |
Sen Lin; Li Yang; Deliang Fan; Junshan Zhang; |
2069 | Offline Goal-Conditioned Reinforcement Learning Via $f$-Advantage Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Goal-conditioned $f$-Advantage Regression (GoFAR), a novel regression-based offline GCRL algorithm derived from a state-occupancy matching perspective; the key intuition is that the goal-reaching task can be formulated as a state-occupancy matching problem between a dynamics-abiding imitator agent and an expert agent that directly teleports to the goal. |
Jason Yecheng Ma; Jason Yan; Dinesh Jayaraman; Osbert Bastani; |
2070 | Pre-Trained Model Reusability Evaluation for Small-Data Transfer Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a metric-based approach named synergistic learning for evaluating pre-trained model reusability with small data. |
Yao-Xiang Ding; Xi-Zhu Wu; Kun Zhou; Zhi-Hua Zhou; |
2071 | Using Embeddings for Causal Estimation of Peer Influence in Social Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a nonparametric method of causally estimating peer influence from observational data, in the presence of unobserved confounding. |
Irina Cristali; Victor Veitch; |
2072 | Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a Thompson Sampling-guided Directed Evolution (TS-DE) framework for sequence optimization, where the sequence-to-function mapping is unknown and querying a single value is subject to costly and noisy measurements. |
Hui Yuan; Chengzhuo Ni; Huazheng Wang; Xuezhou Zhang; Le Cong; Csaba Szepesvari; Mengdi Wang; |
2073 | Kernel Similarity Matching with Hebbian Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive a neural network with Hebbian learning rules such that the similarities between outputs match a kernel function of the input similarities. |
Kyle Luther; Sebastian Seung; |
2074 | Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the task of training machine learning models with data-dependent constraints. |
Songkai Xue; Yuekai Sun; Mikhail Yurochkin; |
2075 | Universally Expressive Communication in Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we consider the question of whether a given communication protocol can express an arbitrary policy. |
Matthew Morris; Thomas D Barrett; Arnu Pretorius; |
2076 | Accelerated Training of Physics Informed Neural Networks (PINNs) Using Meshless Discretizations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new technique for the accelerated training of PINNs that combines modern scientific computing techniques with machine learning: discretely-trained PINNs (DT-PINNs). |
Ramansh Sharma; Varun Shankar; |
2077 | Conformal Prediction with Temporal Quantile Adjustments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop Temporal Quantile Adjustment (TQA), a general method to construct efficient and valid prediction intervals (PIs) for regression on cross-sectional time series data. |
Zhen Lin; Shubhendu Trivedi; Jimeng Sun; |
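For entry 2077, a minimal split-conformal sketch for regression showing the generic calibration step that such prediction intervals build on; TQA's temporal quantile adjustment for cross-sectional time series is not shown, and the residuals and test prediction below are hypothetical.

```python
# Hedged sketch of split conformal prediction intervals for regression:
# take a finite-sample-corrected quantile of absolute calibration residuals
# and use it as a symmetric interval half-width around the point prediction.
import numpy as np

def split_conformal_interval(residuals_cal, y_pred_test, alpha=0.1):
    """Prediction interval at miscoverage level alpha from calibration residuals."""
    n = len(residuals_cal)
    q_level = np.ceil((1.0 - alpha) * (n + 1)) / n   # finite-sample correction
    q = np.quantile(np.abs(residuals_cal), q_level)
    return y_pred_test - q, y_pred_test + q

cal_resid = np.random.randn(200)                     # hypothetical calibration residuals
lo, hi = split_conformal_interval(cal_resid, y_pred_test=1.5)
```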
2078 | Debiased, Longitudinal and Coordinated Drug Recommendation Through Multi-Visit Clinic Records Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose DrugRec, a causal inference based drug recommendation model. |
Hongda Sun; Shufang Xie; Shuqi Li; Yuhan Chen; Ji-Rong Wen; Rui Yan; |
2079 | Where Do Models Go Wrong? Parameter-Space Saliency Maps for Explainability Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a parameter saliency method to identify and analyze the network parameters which are responsible for erroneous decisions. |
Roman Levin; Manli Shu; Eitan Borgnia; Furong Huang; Micah Goldblum; Tom Goldstein; |
2080 | Score-based Generative Modeling Secretly Minimizes The Wasserstein Distance Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recently, Song et al. showed that the training objective of score-based generative models is equivalent to minimizing the Kullback-Leibler divergence of the generated distribution from the data distribution. In this work, we show that score-based models also minimize the Wasserstein distance between them. |
Dohyun Kwon; Ying Fan; Kangwook Lee; |
2081 | Hedging As Reward Augmentation in Probabilistic Graphical Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a decision-theoretic view of hedging based on augmenting a probabilistic graphical model — specifically a Bayesian network or an influence diagram — with a reward. |
Debarun Bhattacharjya; Radu Marinescu; |
2082 | Multiview Human Body Reconstruction from Uncalibrated Cameras Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new method to reconstruct 3D human body pose and shape by fusing visual features from multiview images captured by uncalibrated cameras. |
Zhixuan Yu; Linguang Zhang; Yuanlu Xu; Chengcheng Tang; LUAN TRAN; Cem Keskin; Hyun Soo Park; |
2083 | An Efficient Framework for Computing Tight Lipschitz Constants of Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop an efficient framework for computing the $\ell_\infty$ local Lipschitz constant of a neural network by tightly upper bounding the norm of Clarke Jacobian. |
Zhouxing Shi; Yihan Wang; Huan Zhang; J. Zico Kolter; Cho-Jui Hsieh; |
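For entry 2083, a deliberately crude contrast: the product of per-layer spectral norms (estimated by power iteration) gives a simple but typically loose global $\ell_2$ Lipschitz upper bound for a feedforward network, whereas the paper tightly bounds the local $\ell_\infty$ constant by bounding the norm of the Clarke Jacobian. The weight shapes below are arbitrary.

```python
# Hedged sketch: spectral norm of each weight matrix via power iteration;
# multiplying the per-layer norms upper-bounds the network's l2 Lipschitz
# constant (ignoring 1-Lipschitz activations). This is NOT the paper's method.
import numpy as np

def spectral_norm(W, iters=100, rng=np.random.default_rng(0)):
    """Estimate the largest singular value of W by power iteration."""
    v = rng.standard_normal(W.shape[1])
    for _ in range(iters):
        v = W.T @ (W @ v)
        v /= np.linalg.norm(v)
    return np.linalg.norm(W @ v)

W1, W2 = np.random.randn(64, 32), np.random.randn(10, 64)   # toy two-layer net
lipschitz_upper = spectral_norm(W1) * spectral_norm(W2)      # loose global bound
```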
2084 | Dynamic Tensor Product Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We initiate the study of Dynamic Tensor Product Regression and related problems and provide algorithms to solve them efficiently. |
Aravind Reddy; Zhao Song; Lichen Zhang; |
2085 | DReS-FL: Dropout-Resilient Secure Federated Learning for Non-IID Clients Via Secret Data Sharing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a Dropout-Resilient Secure Federated Learning (DReS-FL) framework based on Lagrange coded computing (LCC) to tackle both the non-IID and dropout problems. |
Jiawei Shao; Yuchang Sun; Songze Li; Jun Zhang; |
2086 | Towards Lightweight Black-Box Attack Against Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As it is hard to mitigate the approximation error with few available samples, we propose Error TransFormer (ETF) for lightweight attacks. |
Chenghao Sun; Yonggang Zhang; Wan Chaoqun; Qizhou Wang; Ya Li; Tongliang Liu; Bo Han; Xinmei Tian; |
2087 | Globally Gated Deep Linear Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a novel gating architecture, named Globally Gated Deep Linear Networks (GGDLNs) where gating units are shared among all processing units in each layer, thereby decoupling the architecture of the nonlinear but unlearned gating and the learned linear processing motifs. |
Qianyi Li; Haim Sompolinsky; |
2088 | Star Temporal Classification: Sequence Modeling with Partially Labeled Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop an algorithm which can learn from partially labeled and unsegmented sequential data. |
Vineel Pratap; Awni Hannun; Gabriel Synnaeve; Ronan Collobert; |
2089 | MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Moreover, most MPNNs only pass two-body messages leading to an intricate relationship between the number of layers and the expressivity of the features. This work introduces MACE, a new equivariant MPNN model that uses higher order messages, and demonstrates that this leads to an improved learning law. |
Ilyes Batatia; David P Kovacs; Gregor Simm; Christoph Ortner; Gabor Csanyi; |
2090 | Learning Two-Player Markov Games: Neural Function Approximation and Correlated Equilibrium Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel online learning algorithm to find a Nash equilibrium by minimizing the duality gap. |
Chris Junchi Li; Dongruo Zhou; Quanquan Gu; Michael Jordan; |
2091 | Probable Domain Generalization Via Quantile Risk Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Quantile Risk Minimization for achieving *probable* domain generalization, where predictors are trained to generalize with a desired probability. |
Cian Eastwood; Alexander Robey; Shashank Singh; Julius von Kügelgen; Hamed Hassani; George J. Pappas; Bernhard Schölkopf; |
2092 | Embed and Emulate: Learning to Estimate Parameters of Dynamical Systems with Uncertainty Quantification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper describes a novel framework for learning feature embeddings of observed dynamics jointly with an emulator that can replace high-cost simulators. |
Ruoxi Jiang; Rebecca Willett; |
2093 | Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address the issue, we propose an effective image generation framework of \emph{Draft-and-Revise} with \emph{Contextual RQ-transformer} to consider global contexts during the generation process. |
Doyup Lee; Chiheon Kim; Saehoon Kim; Minsu Cho; WOOK SHIN HAN; |
2094 | Variational Model Perturbation for Source-Free Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We aim for source-free domain adaptation, where the task is to deploy a model pre-trained on source domains to target domains. |
Mengmeng Jing; Xiantong Zhen; Jingjing Li; Cees Snoek; |
2095 | Optimal Rates for Regularized Conditional Mean Embedding Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We address the consistency of a kernel ridge regression estimate of the conditional mean embedding (CME), which is an embedding of the conditional distribution of $Y$ given $X$ into a target reproducing kernel Hilbert space $\mathcal{H}_Y$. |
Zhu Li; Dimitri Meunier; Arthur Gretton; |
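For entry 2095, a small numerical sketch of the kernel ridge regression form of the conditional mean embedding estimator that the paper analyzes: the embedding of $P(Y \mid X = x)$ is a weighted combination of the training outputs' feature maps with weights $(K_X + n\lambda I)^{-1} k_X(x)$. The Gaussian kernel, bandwidth, and regularization value are illustrative choices.

```python
# Hedged sketch: compute the CME weights over training points for a query x.
# Applying these weights to the feature maps k_Y(., y_i) would give the
# estimated embedding of the conditional distribution of Y given X = x.
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-stacked point sets A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def cme_weights(X_train, x_query, lam=1e-2):
    """Weights (K_X + n*lam*I)^{-1} k_X(x) of the KRR-form CME estimator."""
    n = len(X_train)
    K = gaussian_kernel(X_train, X_train)
    k_x = gaussian_kernel(X_train, x_query[None, :])
    return np.linalg.solve(K + n * lam * np.eye(n), k_x)[:, 0]

X = np.random.randn(50, 2)                 # toy covariates
w = cme_weights(X, x_query=np.zeros(2))    # weights over the training outputs
```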
2096 | A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We give a thorough theoretical analysis to understand the gradient bias in meta-reinforcement learning. |
Bo Liu; Xidong Feng; Jie Ren; Luo Mai; Rui Zhu; Haifeng Zhang; Jun Wang; Yaodong Yang; |
2097 | Wavelet Feature Maps Compression for Image-to-Image CNNs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Wavelet Compressed Convolution (WCC)—a novel approach for high-resolution activation maps compression integrated with point-wise convolutions, which are the main computational cost of modern architectures. |
Shahaf E. Finder; Yair Zohav; Maor Ashkenazi; Eran Treister; |
2098 | Paraphrasing Is All You Need for Novel Object Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a result, we present Paraphrasing-to-Captioning (P2C), a two-stage learning framework for NOC, which would heuristically optimize the output captions via paraphrasing. |
Cheng-Fu Yang; Yao-Hung Hubert Tsai; Wan-Cyuan Fan; Russ Salakhutdinov; Louis-Philippe Morency; Frank Wang; |
2099 | Blessing of Nonconvexity in Deep Linear Models: Depth Flattens The Optimization Landscape Around The True Solution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work characterizes the effect of depth on the optimization landscape of linear regression, showing that, despite their nonconvexity, deeper models have more desirable optimization landscapes. |
Jianhao Ma; Salar Fattahi; |
2100 | Fast Algorithms for Packing Proportional Fairness and Its Dual Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This problem, defined as the constrained maximization of $\sum_i \log x_i$, is known as the packing proportional fairness problem when the feasible set is defined by positive linear constraints and $x \in \mathbb{R}_{\geq 0}^n$. In this work, we present a distributed accelerated first-order method for this problem which improves upon previous approaches. |
Francisco Criado; David Martinez-Rubio; Sebastian Pokutta; |
2101 | Insights Into Pre-training Via Simpler Synthetic Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Notably, recent work shows that even pre-training on synthetic tasks can achieve significant gains in downstream tasks. In this work, we perform three experiments that iteratively simplify pre-training and show that the simplifications still retain much of its gains. |
Yuhuai Wu; Felix Li; Percy Liang; |
2102 | An Information-Theoretic Framework for Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel information-theoretic framework with its own notions of regret and sample complexity for analyzing the data requirements of machine learning. |
Hong Jun Jeon; Benjamin Van Roy; |
2103 | AVLEN: Audio-Visual-Language Embodied Navigation in 3D Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we present AVLEN — an interactive agent for Audio-Visual-Language Embodied Navigation. |
Sudipta Paul; Amit Roy-Chowdhury; Anoop Cherian; |
2104 | A Consistent, Scalable, and Differentiable Lp Canonical Calibration Error Estimator Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a remedy, we propose a low-bias, trainable calibration error estimator based on Dirichlet kernel density estimates, which asymptotically converges to the true $L_p$ calibration error. |
Teodora Popordanoska; Raphael Sayer; Matthew Blaschko; |
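For entry 2104, a standard binned ECE estimator is shown purely as a point of contrast; the paper's estimator instead uses Dirichlet kernel density estimates to obtain a consistent, differentiable $L_p$ canonical calibration error, which this histogram version does not implement.

```python
# Hedged sketch: classical binned expected calibration error (top-label,
# equal-width bins). Shown only as the familiar baseline that kernel-based
# canonical calibration estimators improve upon.
import numpy as np

def binned_ece(confidences, correct, n_bins=10):
    """Weighted average gap between accuracy and confidence per bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

conf = np.random.rand(1000)
corr = (np.random.rand(1000) < conf).astype(float)   # toy, roughly calibrated data
print(binned_ece(conf, corr))
```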
2105 | The Burer-Monteiro SDP Method Can Fail Even Above The Barvinok-Pataki Bound Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We prove that the Burer-Monteiro method can fail for the Max-Cut SDP on $n$ vertices when the rank is above the Barvinok-Pataki bound ($r \gtrsim \sqrt{2n}$). |
Liam O’Carroll; Vaidehi Srinivas; Aravindan Vijayaraghavan; |
2106 | Learning in Congestion Games with Bandit Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a centralized algorithm for Markov congestion games, whose sample complexity again has only polynomial dependence on all relevant problem parameters, but not the action size. |
Qiwen Cui; Zhihan Xiong; Maryam Fazel; Simon Du; |
2107 | Lifelong Neural Predictive Coding: Learning Cumulatively Online Without Forgetting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the popular back-propagation of errors. |
Alex Ororbia; Ankur Mali; C Lee Giles; Daniel Kifer; |
2108 | Generalized One-shot Domain Adaption of Generative Adversarial Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Besides, to realize cross-domain correspondence, we propose the variational Laplacian regularization to constrain the smoothness of the adapted generator. |
Zicheng Zhang; Yinglu Liu; Congying Han; Tiande Guo; Ting Yao; Tao Mei; |
2109 | Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We revisit the problem of stochastic online learning with feedback graphs, with the goal of devising algorithms that are optimal, up to constants, both asymptotically and in finite time. |
Teodor Vanislavov Marinov; Mehryar Mohri; Julian Zimmert; |
2110 | Rare Gems: Finding Lottery Tickets at Initialization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Finding lottery tickets that train to better accuracy compared to simple baselines remains an open problem. In this work, we resolve this open problem by proposing Gem-Miner which finds lottery tickets at initialization that beat current baselines. |
Kartik Sreenivasan; Jy-yong Sohn; Liu Yang; Matthew Grinde; Alliot Nagle; Hongyi Wang; Eric Xing; Kangwook Lee; Dimitris Papailiopoulos; |
2111 | Optimal Transport of Classifiers to Fairness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, we hypothesize that such information is relevant in quantifying the unfairness of a given classifier. To validate this hypothesis, we introduce Optimal Transport to Fairness (OTF), a method that quantifies the violation of fairness constraints as the smallest Optimal Transport cost between a probabilistic classifier and any score function that satisfies these constraints. |
Maarten Buyl; Tijl De Bie; |
2112 | AutoML Two-Sample Test Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use a simple test that takes the mean discrepancy of a witness function as the test statistic and prove that minimizing a squared loss leads to a witness with optimal testing power. |
Jonas Kübler; Vincent Stimper; Simon Buchholz; Krikamol Muandet; Bernhard Schölkopf; |
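For entry 2112, a sketch of the mean-discrepancy test statistic with a fixed witness function $f$, together with a permutation p-value; how the witness is selected and fit by minimizing a squared loss (the AutoML part of the paper) is not shown, and the identity witness below is only for illustration.

```python
# Hedged sketch: two-sample test whose statistic is the mean discrepancy of a
# witness function f, i.e. mean(f(X)) - mean(f(Y)); significance is assessed
# with a one-sided permutation p-value.
import numpy as np

def witness_test(x, y, f, n_perm=1000, rng=np.random.default_rng(0)):
    stat = f(x).mean() - f(y).mean()
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        xp, yp = pooled[perm[:len(x)]], pooled[perm[len(x):]]
        if f(xp).mean() - f(yp).mean() >= stat:
            count += 1
    return stat, (count + 1) / (n_perm + 1)

x = np.random.randn(100)
y = np.random.randn(100) + 0.5
stat, p_value = witness_test(x, y, f=lambda z: z)   # identity witness, toy data
```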
2113 | Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). |
Jayneel Parekh; Sanjeel Parekh; Pavlo Mozharovskyi; Florence d’Alché-Buc; Gaël Richard; |
2114 | Temporally-Consistent Survival Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study survival analysis in the dynamic setting: We seek to model the time to an event of interest given sequences of states. |
Lucas Maystre; Daniel Russo; |
2115 | Black-Box Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide the first generalization error analysis for black-box learning through derivative-free optimization. |
Konstantinos Nikolakakis; Farzin Haddadpour; Dionysis Kalogerias; Amin Karbasi; |
2116 | PDSketch: Integrated Domain Programming, Learning, and Planning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new domain definition language, named PDSketch. |
Jiayuan Mao; Tomás Lozano-Pérez; Josh Tenenbaum; Leslie Kaelbling; |
2117 | Contrastive Graph Structure Learning Via Information Bottleneck for Recommendation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we propose a Contrastive Graph Structure Learning via Information Bottleneck (CGI) for recommendation, which adaptively learns whether to drop an edge or node to obtain optimized graph structures in an end-to-end manner. |
Chunyu Wei; Jian Liang; Di Liu; Fei Wang; |
2118 | Self-Supervised Fair Representation Learning Without Demographics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Can we improve fair classification without sensitive information and without labels? To tackle the problem, in this paper, we propose a novel reweighing-based contrastive learning method. |
Junyi Chai; Xiaoqian Wang; |
2119 | Graph Self-supervised Learning with Accurate Discrepancy Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While contrastive learning can capture global graph-level similarities, its objective of maximizing the similarity between two differently perturbed graphs may result in representations that cannot discriminate two similar graphs with different properties. To tackle such limitations, we propose a framework that aims to learn the exact discrepancy between the original and the perturbed graphs, coined as Discrepancy-based Self-supervised LeArning (D-SLA). |
Dongki Kim; Jinheon Baek; Sung Ju Hwang; |
2120 | Torsional Diffusion for Molecular Conformer Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose torsional diffusion, a novel diffusion framework that operates on the space of torsion angles via a diffusion process on the hypertorus and an extrinsic-to-intrinsic score model. |
Bowen Jing; Gabriele Corso; Jeffrey Chang; Regina Barzilay; Tommi Jaakkola; |
2121 | Adaptive Multi-stage Density Ratio Estimation for Learning Latent Space Energy-based Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Instead, we propose to use noise contrastive estimation (NCE) to discriminatively learn the EBM through density ratio estimation between the latent prior density and latent posterior density. |
Zhisheng Xiao; Tian Han; |
2122 | On The Symmetries of Deep Learning Models and Their Internal Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family’s internal representation of data. |
Charles Godfrey; Davis Brown; Tegan Emerson; Henry Kvinge; |
2123 | Brownian Noise Reduction: Maximizing Privacy Subject to Accuracy Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we generalize noise reduction to the setting of Gaussian noise, introducing the Brownian mechanism. |
Justin Whitehouse; Aaditya Ramdas; Steven Wu; Ryan Rogers; |
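For entry 2123, a single-release Gaussian mechanism sketch for background; the Brownian mechanism generalizes this by releasing a path of progressively less noisy values so the analyst can stop as soon as an accuracy constraint is met, which a one-shot release like the one below cannot do. The sensitivity and noise scale are hypothetical.

```python
# Hedged sketch: the classical Gaussian mechanism releases a statistic plus
# Gaussian noise scaled to its sensitivity. This is background only; it does
# not implement the Brownian noise-reduction path described in the paper.
import numpy as np

def gaussian_mechanism(value, sensitivity, sigma, rng=np.random.default_rng(0)):
    """Release value + N(0, (sensitivity * sigma)^2) noise."""
    return value + rng.normal(0.0, sensitivity * sigma)

# Hypothetical example: a mean over 1000 records bounded in [0, 1].
private_mean = gaussian_mechanism(value=0.42, sensitivity=1.0 / 1000, sigma=5.0)
```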
2124 | Is Integer Arithmetic Enough for Deep Learning Training? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a fully integer training pipeline (i.e. forward propagation, back-propagation, stochastic gradient descent (SGD)) for deep learning models. |
Alireza Ghaffari; Marzieh S. Tahaei; Mohammadreza Tayaranian; Masoud Asgharian; Vahid Partovi Nia; |
2125 | Asymptotic Properties for Bayesian Neural Network in Besov Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In other words, we propose a practical Bayesian neural network with guaranteed asymptotic properties. |
Kyeongwon Lee; Jaeyong Lee; |
2126 | Class-Dependent Label-Noise Learning with Cycle-Consistency Regularization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to estimate the transition matrix under a forward-backward cycle-consistency regularization, which greatly reduces the dependence of estimating the transition matrix T on the noisy class posterior. |
De Cheng; Yixiong Ning; Nannan Wang; Xinbo Gao; Heng Yang; Yuxuan Du; Bo Han; Tongliang Liu; |
2127 | Finite Sample Analysis Of Dynamic Regression Parameter Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we study the global system operator — the operator that maps the noise vectors to the output. |
Mark Kozdoba; Edward Moroshko; Shie Mannor; Yacov Crammer; |
2128 | Heterogeneous Skill Learning for Multi-agent Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce the concept of the skill to explore the ability of heterogeneous behaviours. |
Yuntao Liu; Yuan Li; Xinhai Xu; Yong Dou; Donghong Liu; |
2129 | FedAvg with Fine Tuning: Local Updates Lead to Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This surprising performance of such a simple method, however, is not fully understood from a theoretical point of view. In this paper, we formally investigate this phenomenon in the multi-task linear representation setting. |
Liam Collins; Hamed Hassani; Aryan Mokhtari; Sanjay Shakkottai; |
2130 | NeuroSchedule: A Novel Effective GNN-based Scheduling Method for High-level Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes NeuroSchedule, an efficient and effective GNN-based scheduling method for high-level synthesis with both fast runtime and enhanced solution quality. |
Jun Zeng; Mingyang Kou; Hailong Yao; |
2131 | FourierNets Enable The Design of Highly Non-local Optical Encoders for Computational Imaging Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that existing deep network decoders have a locality bias which prevents the optimization of such highly non-local optical encoders. We address this with a decoder based on a shallow neural network architecture using global kernel Fourier convolutional neural networks (FourierNets). |
Diptodip Deb; Zhenfei Jiao; Ruth Sims; Alex Chen; Michael Broxton; Misha B Ahrens; Kaspar Podgorski; Srinivas C Turaga; |
2132 | Quantifying Statistical Significance of Neural Network-based Image Segmentation By Selective Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce a conditional selective inference (SI) framework—a new statistical inference framework for data-driven hypotheses that has recently received considerable attention—to compute exact (non-asymptotic) valid p-values for the segmentation results. |
Vo Nguyen Le Duy; Shogo Iwazaki; Ichiro Takeuchi; |
2133 | Beyond L1: Faster and Better Sparse Models with Skglm Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new fast algorithm to estimate any sparse generalized linear model with convex or non-convex separable penalties. |
Quentin Bertrand; Quentin Klopfenstein; Pierre-Antoine Bannier; Gauthier Gidel; Mathurin Massias; |
2134 | Towards Practical Computation of Singular Values of Convolutional Layers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity. |
Alexandra Senderovich; Ekaterina Bulatova; Anton Obukhov; Maxim Rakhuba; |
2135 | EvenNet: Ignoring Odd-Hop Neighbors Improves Robustness of Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose EvenNet, a spectral GNN corresponding to an even-polynomial graph filter. |
Runlin Lei; Zhen Wang; Yaliang Li; Bolin Ding; Zhewei Wei; |
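For entry 2135, a sketch of an even-polynomial spectral filter on node features, in which only even powers of the normalized adjacency are applied so odd-hop neighbors never contribute directly; the fixed coefficients and the tiny path graph below are illustrative stand-ins for EvenNet's learned filter.

```python
# Hedged sketch: y = w0*x + w1*A^2 x + w2*A^4 x, using only even powers of the
# symmetrically normalized adjacency, so information propagates two hops at a
# time and odd-hop neighbors are ignored.
import numpy as np

def even_poly_filter(adj_norm, x, w=(1.0, 0.5, 0.25)):
    """Apply a fixed even-polynomial graph filter to node features x."""
    a2 = adj_norm @ adj_norm        # two-hop propagation operator
    out, power = w[0] * x, x
    for coeff in w[1:]:
        power = a2 @ power          # advance by two hops each step
        out = out + coeff * power
    return out

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)   # path graph on 3 nodes
d = A.sum(1)
A_norm = A / np.sqrt(np.outer(d, d))                            # symmetric normalization
y = even_poly_filter(A_norm, np.eye(3))
```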
2136 | Learning White Noises in Neural Stochastic Differential Equations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a generalized white noise to existing models and propose an efficient approximation of noise sample paths based on classical integration methods and sparse Gaussian process. |
Anh Tong Hoang; Thanh Nguyen-Tang; Toan Tran; Jaesik Choi; |
2137 | Enhancing and Scaling Cross-Modality Alignment for Contrastive Multimodal Pre-Training Via Gradient Harmonization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We explore the gradient conflicts in modality-agnostic contrastive multimodal pre-training and mitigate them via a series of gradient harmonization techniques. |
Junru Wu; Yi Liang; feng han; Hassan Akbari; Zhangyang Wang; Cong Yu; |
2138 | Modeling The Machine Learning Multiverse Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Amid mounting concern about the reliability and credibility of machine learning research, we present a principled framework for making robust and generalizable claims: the Multiverse Analysis. |
Samuel J. Bell; Onno Kampman; Jesse Dodge; Neil Lawrence; |
2139 | Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we find that interference among different tasks and modalities is the main factor to this phenomenon. To mitigate such interference, we introduce the Conditional Mixture of Experts (Conditional MoEs) to generalist models. |
Jinguo Zhu; Xizhou Zhu; Wenhai Wang; Xiaohua Wang; Hongsheng Li; Xiaogang Wang; Jifeng Dai; |
2140 | How to Talk to Your Model: Instructions, Descriptions, and Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Yet today, we lack computational models explaining such language use. To address this challenge, we formalize learning from language in a contextual bandit setting and ask how a human might communicate preferences over behaviors. |
Theodore Sumers; Robert Hawkins; Mark Ho; Tom Griffiths; Dylan Hadfield-Menell; |
2141 | Geometric Distillation for Graph Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study a new paradigm of knowledge transfer in the context of geometric deep learning, which aims at distilling knowledge from a teacher graph neural network (GNN) model trained on a large graph to a student GNN model operating on a smaller graph. |
Chenxiao Yang; Qitian Wu; Junchi Yan; |
2142 | Fair Rank Aggregation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study ranking and rank aggregation problems from a fairness or diversity perspective, where the candidates (to be ranked) may belong to different groups and each group should have a fair representation in the final ranking. |
Diptarka Chakraborty; Syamantak Das; Arindam Khan; Aditya Subramanian; |
2143 | Consistent Sufficient Explanations and Minimal Local Rules for Explaining The Decision of Any Classifier or Regressor Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The crux of P-SE is to compute the conditional probability of maintaining the same prediction. Therefore, we introduce an accurate and fast estimator of this probability via random Forests for any data $(\boldsymbol{X}, Y)$ and show its efficiency through a theoretical analysis of its consistency. |
Salim I. Amoukou; Nicolas Brunel; |
2144 | PaCo: Parameter-Compositional Multi-task Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the gaps between contents and difficulties of different tasks bring us challenges on both which tasks should share the parameters and what parameters should be shared, as well as the optimization challenges due to parameter sharing. In this work, we introduce a parameter-compositional approach (PaCo) as an attempt to address these challenges. |
Lingfeng Sun; Haichao Zhang; Wei Xu; Masayoshi TOMIZUKA; |
2145 | Bayesian Clustering of Neural Spiking Activity Using A Mixture of Dynamic Poisson Factor Analyzers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop an efficient sampling approach to do Bayesian clustering with a mixture of dynamic Poisson factor analyzers model and show how this model can be used to identify separate populations of neurons based on their spiking activity. |
Ganchao Wei; Ian H Stevenson; Xiaojing Wang; |
2146 | Provable Benefit of Multitask Representation Learning in Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we theoretically characterize the benefit of representation learning under the low-rank Markov decision process (MDP) model. |
Yuan Cheng; Songtao Feng; Jing Yang; Hong Zhang; Yingbin Liang; |
2147 | QUARK: Controllable Text Generation with Reinforced Unlearning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a reward function that quantifies an (un)wanted property, while not straying too far from the original model. |
Ximing Lu; Sean Welleck; Liwei Jiang; Jack Hessel; Lianhui Qin; Peter West; Prithviraj Ammanabrolu; Yejin Choi; |
2148 | ReCo: Retrieve and Co-segment for Zero-shot Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new framework for zero-shot transfer semantic segmentation, which retrieves a set of unlabelled images of a concept using a language-image pre-trained model and co-segments the category regions using modern image representations. |
Gyungin Shin; Weidi Xie; Samuel Albanie; |
2149 | Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study a Markov matching market involving a planner and a set of strategic agents on the two sides of the market. At each step, the agents are presented with a dynamical context, where the contexts determine the utilities. |
Yifei Min; Tianhao Wang; Ruitu Xu; Zhaoran Wang; Michael Jordan; Zhuoran Yang; |
2150 | FasterRisk: Fast and Accurate Interpretable Risk Scores Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce an approach for efficiently producing a collection of high-quality risk scores learned from data. |
Jiachang Liu; Chudi Zhong; Boxuan Li; Margo Seltzer; Cynthia Rudin; |
2151 | Taming Fat-Tailed (“Heavier-Tailed” with Potentially Infinite Variance) Noise in Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This motivates us to fill this gap in this paper by proposing an algorithmic framework called $\mathsf{FAT}$-$\mathsf{Clipping}$ (federated averaging with two-sided learning rates and clipping), which contains two variants: $\mathsf{FAT}$-$\mathsf{Clipping}$ per-round ($\mathsf{FAT}$-$\mathsf{Clipping}$-$\mathsf{PR}$) and $\mathsf{FAT}$-$\mathsf{Clipping}$ per-iteration ($\mathsf{FAT}$-$\mathsf{Clipping}$-$\mathsf{PI}$). |
Haibo Yang; Peiwen Qiu; Jia Liu; |
2152 | Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we theoretically characterize the impact of connectivity patterns on the convergence of DNNs under gradient descent training in fine granularity. |
Wuyang Chen; Wei Huang; Xinyu Gong; Boris Hanin; Zhangyang Wang; |
2153 | Task-level Differentially Private Meta Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of meta-learning with task-level differential privacy. |
Xinyu Zhou; Raef Bassily; |
2154 | A Simple and Provably Efficient Algorithm for Asynchronous Federated Contextual Linear Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a simple algorithm named FedLinUCB based on the principle of optimism. |
Jiafan He; Tianhao Wang; Yifei Min; Quanquan Gu; |
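For entry 2154, a single-agent LinUCB selection step as background for the optimism principle the highlight refers to; the federated, asynchronous communication protocol that defines FedLinUCB is not sketched, and the exploration coefficient below is an arbitrary choice.

```python
# Hedged sketch: standard LinUCB arm selection. Each arm's score is its
# estimated reward plus an optimism bonus proportional to the uncertainty of
# the estimate in that context direction.
import numpy as np

def linucb_choose(contexts, A, b, beta=1.0):
    """Pick the arm (context vector) with the highest optimistic reward estimate."""
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b
    scores = [x @ theta_hat + beta * np.sqrt(x @ A_inv @ x) for x in contexts]
    return int(np.argmax(scores))

d = 5
A = np.eye(d)                    # regularized Gram matrix: sum of x x^T plus identity
b = np.zeros(d)                  # sum of reward-weighted contexts
arms = np.random.randn(4, d)     # 4 candidate context vectors
chosen = linucb_choose(arms, A, b)
```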
2155 | OTKGE: Multi-modal Knowledge Graph Embeddings Via Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the critical challenge along this course lies in that the multi-modal embedding spaces are usually heterogeneous, and direct fusion will destroy the inherent spatial structure of different modal embeddings, which may harm the interaction of multi-modal knowledge. To overcome this challenge, we innovatively revisit multi-modal KGE from a geometric perspective and propose optimal transport knowledge graph embeddings (OTKGE). |
Zongsheng Cao; Qianqian Xu; Zhiyong Yang; Yuan He; Xiaochun Cao; Qingming Huang; |
2156 | Tsetlin Machine for Solving Contextual Bandit Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces an interpretable contextual bandit algorithm using Tsetlin Machines, which solves complex pattern recognition tasks using propositional (Boolean) logic. |
Raihan Seraj; Jivitesh Sharma; Ole-Christoffer Granmo; |
2157 | QC-StyleGAN – Quality Controllable Image Generation and Manipulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing models are built upon high-quality (HQ) data as desired outputs, making them unfit for in-the-wild low-quality (LQ) images, which are common inputs for manipulation. In this work, we bridge this gap by proposing a novel GAN structure that allows for generating images with controllable quality. |
Dat Viet Thanh Nguyen; Phong Tran The; Tan M. Dinh; Cuong Pham; Anh Tran; |
2158 | Causal Imitation Learning with Unobserved Contexts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider imitation learning problems where the expert has access to a per-episode context that is hidden from the learner, both in the demonstrations and at test-time. |
Gokul Swamy; Sanjiban Choudhury; J. Bagnell; Steven Wu; |
2159 | Differentially Private Learning with Margin Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a series of new differentially private (DP) algorithms with dimension-independent margin guarantees. |
Raef Bassily; Mehryar Mohri; Ananda Theertha Suresh; |
2160 | Oracle-Efficient Online Learning for Smoothed Adversaries Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish that under smoothed analysis, there are computationally efficient online algorithms (given access to an offline optimization oracle) whose sublinear regret depends only on the VC dimension and the smoothness parameter. |
Nika Haghtalab; Yanjun Han; Abhishek Shetty; Kunhe Yang; |
2161 | Foundation Posteriors for Approximate Probabilistic Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we formulate inference as masked language modeling: given a program, we generate a supervised dataset of variables and assignments, and randomly mask a subset of the assignments. |
Mike Wu; Noah Goodman; |
2162 | A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a best-of-both-worlds algorithm in adversarial and stochastic regimes for the multi-armed bandit problem with arbitrary delays in feedback. |
Saeed Masoudian; Julian Zimmert; Yevgeny Seldin; |
2163 | Variational Inference Via Wasserstein Gradient Flows Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose principled methods for VI, in which $\hat \pi$ is taken to be a Gaussian or a mixture of Gaussians, which rest upon the theory of gradient flows on the Bures–Wasserstein space of Gaussian measures. |
Marc Lambert; Sinho Chewi; Francis Bach; Silvère Bonnabel; Philippe Rigollet; |
2164 | Momentum Aggregation for Private Non-convex ERM Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: We introduce new algorithms and convergence guarantees for privacy-preserving non-convex Empirical Risk Minimization (ERM) on smooth $d$-dimensional objectives. We develop an … |
Hoang Tran; Ashok Cutkosky; |
2165 | Parallel Tempering With A Variational Reference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, in the typical case where the prior and posterior are nearly mutually singular, PT methods are computationally prohibitive. In this work we address this challenge by constructing a generalized annealing path connecting the posterior to an adaptively tuned variational reference. |
Nikola Surjanovic; Saifuddin Syed; Alexandre Bouchard-Côté; Trevor Campbell; |
2166 | Data Augmentation MCMC for Bayesian Inference from Privatized Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms. |
Nianqiao Ju; Jordan Awan; Ruobin Gong; Vinayak Rao; |
2167 | Triangulation Candidates for Bayesian Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we propose using candidates based on a Delaunay triangulation of the existing input design. |
Robert Gramacy; Annie Sauer; Nathan Wycoff; |
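For entry 2167, a sketch of generating Bayesian optimization candidates from a Delaunay triangulation of the existing design; taking simplex centroids as the candidate rule is an illustrative simplification of the paper's construction, and the random 2D design is a toy stand-in.

```python
# Hedged sketch: triangulate the existing input design and propose one
# candidate per simplex (here, its centroid). An acquisition function would
# then be evaluated only on this geometry-aware candidate set.
import numpy as np
from scipy.spatial import Delaunay

X = np.random.rand(12, 2)                       # existing design points in 2D
tri = Delaunay(X)
candidates = X[tri.simplices].mean(axis=1)      # one centroid per simplex
```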
2168 | Langevin Autoencoders for Learning Deep Latent Variable Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes the amortized Langevin dynamics (ALD), wherein datapoint-wise MCMC iterations are entirely replaced with updates of an encoder that maps observations into latent variables. |
Shohei Taniguchi; Yusuke Iwasawa; Wataru Kumagai; Yutaka Matsuo; |
2169 | Improving Zero-Shot Generalization in Offline Reinforcement Learning Using Generalized Similarity Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new theoretically-motivated framework called Generalized Similarity Functions (GSF), which uses contrastive learning to train an offline RL agent to aggregate observations based on the similarity of their expected future behavior, where we quantify this similarity using generalized value functions. |
Bogdan Mazoure; Ilya Kostrikov; Ofir Nachum; Jonathan Tompson; |
2170 | On The Statistical Efficiency of Reward-Free Exploration in Non-Linear RL Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions. |
Jinglin Chen; Aditya Modi; Akshay Krishnamurthy; Nan Jiang; Alekh Agarwal; |
2171 | VCT: A Video Compression Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how transformers can be used to vastly simplify neural video compression. |
Fabian Mentzer; George D Toderici; David Minnen; Sergi Caelles; Sung Jin Hwang; Mario Lucic; Eirikur Agustsson; |
2172 | When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This brings up a new question: is it possible to combine learning from limited real data in offline RL and unrestricted exploration through imperfect simulators in online RL to address the drawbacks of both approaches? In this study, we propose the Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning (H2O) framework to provide an affirmative answer to this question. |
Haoyi Niu; shubham sharma; Yiwen Qiu; Ming Li; Guyue Zhou; Jianming HU; Xianyuan Zhan; |
2173 | VICE: Variational Interpretable Concept Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces Variational Interpretable Concept Embeddings (VICE), an approximate Bayesian method for embedding object concepts in a vector space using data collected from humans in a triplet odd-one-out task. |
Lukas Muttenthaler; Charles Zheng; Patrick McClure; Robert Vandermeulen; Martin N Hebart; Francisco Pereira; |
2174 | Monte Carlo Tree Search Based Variable Selection for High Dimensional Bayesian Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a variable selection method MCTS-VS based on Monte Carlo tree search (MCTS), to iteratively select and optimize a subset of variables. |
Lei Song; Ke Xue; Xiaobin Huang; Chao Qian; |
2175 | Segmenting Moving Objects Via An Object-Centric Layered Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The objective of this paper is a model that is able to discover, track and segment multiple moving objects in a video. |
Junyu Xie; Weidi Xie; Andrew Zisserman; |
2176 | House of Cans: Covert Transmission of Internal Datasets Via Capacity-Aware Neuron Steganography Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a capacity-aware neuron steganography scheme (i.e., Cans) to covertly transmit multiple private machine learning (ML) datasets via a scheduled-to-publish deep neural network (DNN) as the carrier model. |
Xudong Pan; Shengyao Zhang; Mi Zhang; Yifan Yan; Min Yang; |
2177 | Provable Defense Against Backdoor Policies in Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a provable defense mechanism against backdoor policies in reinforcement learning. |
Shubham Bharti; Xuezhou Zhang; Adish Singla; Jerry Zhu; |
2178 | Riemannian Score-Based Generative Modelling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce here \emph{Riemannian Score-based Generative Models} (RSGMs), a class of generative models extending SGMs to Riemannian manifolds. |
Valentin De Bortoli; Emile Mathieu; Michael Hutchinson; James Thornton; Yee Whye Teh; Arnaud Doucet; |
2179 | Multitasking Models Are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show, theoretically and experimentally, that multitask learning increases robustness to structural perturbations. |
Giannis Daras; Negin Raoof; Zoi Gkalitsiou; Alex Dimakis; |
2180 | Bringing Efficiency and Interpretability to Learned TCP Congestion Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a novel two-stage solution to achieve the best of both worlds: first to train a deep RL agent, then distill its (over-)parameterized NN policy into white-box, light-weight rules in the form of symbolic expressions that are much easier to understand and to implement in constrained environments. |
S P Sharan; Wenqing Zheng; Kuo-Feng Hsu; Jiarong Xing; Ang Chen; Zhangyang Wang; |
2181 | Logical Activation Functions: Logit-space Equivalents of Probabilistic Boolean Operators Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive logit-space operators equivalent to probabilistic Boolean logic-gates AND, OR, and XNOR for independent probabilities. |
Scott Lowe; Robert Earle; Jason d’Eon; Thomas Trappenberg; Sageev Oore; |
2182 | Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Museformer, a Transformer with a novel fine- and coarse-grained attention for symbolic music generation. |
Botao Yu; Peiling Lu; Rui Wang; Wei Hu; Xu Tan; Wei Ye; Shikun Zhang; Tao Qin; Tie-Yan Liu; |
2183 | Differentiable Hierarchical and Surrogate Gradient Search for Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we develop a differentiable hierarchical search framework for spiking neurons, where spike-based computation is realized on both the cell and the layer level search space. |
Kaiwei Che; Kaixuan Zhang; Luziwei Leng; Jianguo Zhang; Qinghu Meng; Jie Cheng; Qinghai Guo; Jianxing Liao; |
2184 | Could Giant Pre-trained Image Models Extract Universal Representations? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a study of frozen pretrained models when applied to diverse and representative computer vision tasks, including object detection, semantic segmentation and video action recognition. |
Yutong Lin; Ze Liu; Zheng Zhang; Han Hu; Nanning Zheng; Stephen Lin; Yue Cao; |
2185 | On Learning and Refutation in Noninteractive Local Differential Privacy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study two basic statistical tasks in non-interactive local differential privacy (LDP): *learning* and *refutation*. Learning requires finding a concept that best fits an unknown target function (from labelled samples drawn from a distribution), whereas refutation requires distinguishing between data distributions that are well-correlated with some concept in the class, versus distributions where the labels are random. |
Alexander Edmonds; Aleksandar Nikolov; Toniann Pitassi; |
2186 | A Neural Corpus Indexer for Document Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we aim to show that an end-to-end deep neural network unifying training and indexing stages can significantly improve the recall performance of traditional methods. |
Yujing Wang; Haonan Wang; Yingyan Hou; Ziming Miao; Shibin Wu; Hao Sun; Qi Chen; Yuqing Xia; Chengmin Chi; Guoshuai Zhao; Zheng Liu; Xing Xie; Hao Sun; Weiwei Deng; Qi Zhang; Mao Yang; |
2187 | Learning Tractable Probabilistic Models from Inconsistent Local Estimates Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider the problem of improving a tractable model using local probability estimates for a small subset of variables (given observations) that are either available from experts or via an external process. |
Shasha Jin; Vasundhara Komaragiri; Tahrima Rahman; Vibhav Gogate; |
2188 | Non-Linear Coordination Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the first non-linear coordination graph by extending CG value decomposition beyond the linear case. |
Yipeng Kang; Tonghan Wang; Qianlan Yang; Chongjie Zhang; |
2189 | Homomorphic Matrix Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a homomorphic matrix completion algorithm for privacy-preserving data completion. |
Zechu Li; Xiao-Yang Liu; Xiaodong Wang; |
2190 | [Re] A Cluster-based Approach for Improving Isotropy in Contextual Embedding Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To verify these claims, we reproduce all experiments described in the paper. |
Benjamin Džubur; |
2191 | Evolving Zero Cost Proxies For Neural Architecture Scoring Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a genetic programming framework to automate the discovery of zero-cost proxies for neural architecture scoring. |
Yash Akhauri; Juan Munoz; Nilesh Jain; Ravishankar Iyer; |
2192 | Deep Active Learning By Leveraging Training Dynamics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, by exploring the connection between the generalization performance and the training dynamics, we propose a theory-driven deep active learning method (dynamicAL) which selects samples to maximize training dynamics. |
Haonan Wang; Wei Huang; Ziwei Wu; Hanghang Tong; Andrew J Margenot; Jingrui He; |
2193 | Agreement-on-the-line: Predicting The Performance of Neural Networks Under Distribution Shift Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that a similar surprising phenomenon also holds for the agreement between pairs of neural network classifiers: whenever accuracy-on-the-line holds, we observe that the OOD agreement between the predictions of any two pairs of neural networks (with potentially different architectures) also exhibits a strong linear correlation with their ID agreement. |
Christina Baek; Yiding Jiang; Aditi Raghunathan; J. Zico Kolter; |
2194 | Functional Indirection Neural Estimator for Better Out-of-distribution Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by these mechanisms, we hypothesize that OOD generalization may be achieved by performing analogy-making and indirection in the functional space instead of the data space as in current methods. To realize this, we design FINE (Functional Indirection Neural Estimator), a neural framework that learns to compose functions that map data input to output on-the-fly. |
Kha Pham; Thai Hung Le; Man Ngo; Truyen Tran; |
2195 | Memorization Without Overfitting: Analyzing The Training Dynamics of Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We empirically study exact memorization in causal and masked language modeling, across model sizes and throughout the training process. |
Kushal Tirumala; Aram Markosyan; Luke Zettlemoyer; Armen Aghajanyan; |
2196 | Zonotope Domains for Lagrangian Neural Network Verification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A key drawback of the latter technique is that each neuron is treated independently, thereby ignoring important neuron interactions. We provide an approach that merges these two threads and uses zonotopes within a Lagrangian decomposition. |
Matt Jordan; Jonathan Hayase; Alex Dimakis; Sewoong Oh; |
2197 | Solving Quantitative Reasoning Problems with Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We train a large Transformer language model on mathematical data and achieve strong performance on quantitative reasoning tasks, including state of the art performance on the MATH dataset. |
Aitor Lewkowycz; Anders Andreassen; Vinay Ramasesh; Henryk Michalewski; David Dohan; Cem Anil; Ambrose Slone; Imanol Schlag; Theo Gutman-Solo; Yuhuai Wu; Ethan Dyer; Guy Gur-Ari; Behnam Neyshabur; Vedant Misra; |
2198 | Diversified Recommendations for Agents with Adaptive Preferences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The Recommender’s primary objective is typically to encourage content consumption which optimizes some reward, such as ad revenue, but it often additionally aims to ensure that a sufficiently wide variety of content is consumed by the Agent over time. We formalize this problem as an adversarial bandit task. |
William Brown; Arpit Agarwal; |
2199 | MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We demonstrate that depth and normal cues, predicted by general-purpose monocular estimators, significantly improve reconstruction quality and optimization time. |
Zehao Yu; Songyou Peng; Michael Niemeyer; Torsten Sattler; Andreas Geiger; |
2200 | Detection and Localization of Changes in Conditional Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paired data scenario is in contrast to the standard setting where a sequentially observed variable is analyzed for potential changes in the marginal distribution. We propose new methodology for solving this problem, by starting from a simpler task that analyzes changes in conditional expectation, and generalizing the tools developed for that task to conditional distributions. |
Lizhen Nie; Dan Nicolae; |
2201 | Exploring Through Random Curiosity with General Value Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we propose random curiosity with general value functions (RC-GVF), a novel intrinsic reward function that draws upon connections between these distinct approaches. |
Aditya Ramesh; Louis Kirsch; Sjoerd van Steenkiste; Jürgen Schmidhuber; |
2202 | SIREN: Shaping Representations for OOD Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper bridges the gap by addressing two key challenges—representation learning and OOD detection—in one coherent framework. |
Xuefeng Du; Gabriel Gozum; Yifei Ming; Yixuan Li; |
2203 | Shape And Structure Preserving Differential Privacy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Focusing on the problem of sanitizing the Fréchet mean of a sample of points on a manifold, we exploit the characterization of the mean as the minimizer of an objective function comprised of the sum of squared distances and develop a K-norm gradient mechanism on Riemannian manifolds that favors values that produce gradients close to the zero of the objective function. |
Carlos Soto; Karthik Bharath; Matthew Reimherr; Aleksandra Slavković; |
2204 | Bellman Residual Orthogonalization for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study a reinforcement learning principle that approximates the Bellman equations by enforcing their validity only along an user-defined space of test functions. |
Andrea Zanette; Martin J Wainwright; |
2205 | Non-asymptotic and Accurate Learning of Nonlinear Dynamical Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of learning a nonlinear dynamical system governed by a nonlinear state equation $h_{t+1}=\phi(h_t,u_t;\theta)+w_t$. |
Yahya Sattar; Samet Oymak; |
2206 | Reinforcement Learning with Non-Exponential Discounting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a theory for continuous-time reinforcement learning generalized to arbitrary discount functions. |
Matthias Schultheis; Constantin Rothkopf; Heinz Koeppl; |
2207 | HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To fill in the gap, we propose a large-scale and semantic-rich synthetic HSI dataset, denoted as HUMANISE, by aligning the captured human motion sequences with various 3D indoor scenes. |
Zan Wang; Yixin Chen; Tengyu Liu; Yixin Zhu; Wei Liang; Siyuan Huang; |
2208 | A Simple Decentralized Cross-Entropy Method Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Cross-Entropy Method (CEM) is commonly used for planning in model-based reinforcement learning (MBRL) where a centralized approach is typically utilized to update the sampling distribution based on only the top-$k$ operations’ results on samples. In this paper, we show that such a centralized approach makes CEM vulnerable to local optima, thus impairing its sample efficiency. |
Zichen Zhang; Jun Jin; Martin Jagersand; Jun Luo; Dale Schuurmans; |
2209 | Evolution of Neural Tangent Kernels Under Benign and Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we perform an empirical study of the evolution of the NTK under standard and adversarial training, aiming to disambiguate the effect of adversarial training on kernel learning and lazy training. |
Noel Loo; Ramin Hasani; Alexander Amini; Daniela Rus; |
2210 | Efficient Dataset Distillation Using Random Feature Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Today’s best performing algorithm, \textit{Kernel Inducing Points} (KIP), which makes use of the correspondence between infinite-width neural networks and kernel-ridge regression, is prohibitively slow due to the exact computation of the neural tangent kernel matrix, scaling $O(|S|^2)$, with $|S|$ being the coreset size. To improve this, we propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel which reduces the kernel matrix computation to $O(|S|)$. |
Noel Loo; Ramin Hasani; Alexander Amini; Daniela Rus; |
2211 | A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Focusing on this objective and after surveying tools that can be used to this end, we propose a neural pre-conditioning (NPC) algorithm, based on a neural tangent kernel (NTK) analysis, that valuates unlabeled data based on how they would contribute upon inclusion to the training set. |
Seo Taek Kong; Soomin Jeon; Dongbin Na; Jaewon Lee; Hong-Seok Lee; Kyu-Hwan Jung; |
2212 | Chain of Thought Imitation with Procedure Cloning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To properly leverage expert procedure information without relying on the privileged tools the expert may have used to perform the procedure, we propose procedure cloning, which applies supervised sequence prediction to imitate the complete series of expert computations. |
Mengjiao (Sherry) Yang; Dale Schuurmans; Pieter Abbeel; Ofir Nachum; |
2213 | Multi-Game Decision Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In the subfields of vision and language, this was largely achieved by scaling up transformer-based models and training them on large, diverse datasets. Motivated by this progress, we investigate whether the same strategy can be used to produce generalist reinforcement learning agents. |
Kuang-Huei Lee; Ofir Nachum; Mengjiao (Sherry) Yang; Lisa Lee; Daniel Freeman; Sergio Guadarrama; Ian Fischer; Winnie Xu; Eric Jang; Henryk Michalewski; Igor Mordatch; |
2214 | Controlled Sparsity Via Constrained Optimization Or: How I Learned to Stop Tuning Penalties and Love Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we focus on the task of controlling the level of sparsity when performing sparse learning. |
Jose Gallego-Posada; Juan Ramirez; Akram Erraqabi; Yoshua Bengio; Simon Lacoste-Julien; |
2215 | Nonparametric Uncertainty Quantification for Single Deterministic Neural Network Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a fast and scalable method for uncertainty quantification of machine learning models’ predictions. |
Nikita Kotelevskii; Aleksandr Artemenkov; Kirill Fedyanin; Fedor Noskov; Alexander Fishkov; Artem Shelmanov; Artem Vazhentsev; Aleksandr Petiushko; Maxim Panov; |
2216 | Boosting The Transferability of Adversarial Attacks with Reverse Adversarial Perturbation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples, which can produce erroneous predictions by injecting imperceptible perturbations. In this work, we study the transferability of adversarial examples, which is significant due to its threat to real-world applications where model architecture or parameters are usually unknown. |
Zeyu Qin; Yanbo Fan; Yi Liu; Li Shen; Yong Zhang; Jue Wang; Baoyuan Wu; |
2217 | Knowledge Distillation from A Stronger Teacher Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Unlike existing knowledge distillation methods, which focus on baseline settings where the teacher models and training strategies are not as strong and competitive as state-of-the-art approaches, this paper presents a method dubbed DIST to distill better from a stronger teacher. |
Tao Huang; Shan You; Fei Wang; Chen Qian; Chang Xu; |
2218 | BooNTK: Convexifying Federated Learning Using Bootstrapped Neural Tangent Kernels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For neural networks, even when centralized SGD easily finds a solution that is simultaneously performant for all clients, current federated optimization methods fail to converge to a comparable solution. We show that this performance disparity can largely be attributed to optimization challenges presented by non-convexity. |
Yaodong Yu; Alexander Wei; Sai Praneeth Karimireddy; Yi Ma; Michael Jordan; |
2219 | Revisiting Sparse Convolutional Model for Visual Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper revisits the sparse convolutional modeling for image classification and bridges the gap between good empirical performance (of deep learning) and good interpretability (of sparse convolutional models). |
xili dai; Mingyang Li; Pengyuan Zhai; Shengbang Tong; Xingjian Gao; Shao-Lun Huang; Zhihui Zhu; Chong You; Yi Ma; |
2220 | Category-Level 6D Object Pose Estimation in The Wild: A Semi-Supervised Learning Approach and A New Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new model, called Rendering for Pose estimation network (RePoNet), that is jointly trained using the free ground-truths with the synthetic data, and a silhouette matching objective function on the real-world data. |
Yang Fu; Xiaolong Wang; |
2221 | 🏘️ ProcTHOR: Large-Scale Embodied AI Using Procedural Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work presents a platform to enable similar success stories in Embodied AI. |
Matt Deitke; Eli VanderBilt; Alvaro Herrasti; Winson Han; Luca Weihs; Kiana Ehsani; Jordi Salvador; Eric Kolve; Aniruddha Kembhavi; Roozbeh Mottaghi; |
2222 | Generalizing Consistent Multi-Class Classification with Rejection to Be Compatible with Arbitrary Losses Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we derive a novel formulation for CwR that can be equipped with arbitrary loss functions while maintaining the theoretical guarantees. |
Yuzhou Cao; Tianchi Cai; Lei Feng; Lihong Gu; Jinjie GU; Bo An; Gang Niu; Masashi Sugiyama; |
2223 | Text Classification with Born’s Rule Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a text classification algorithm inspired by the notion of superposition of states in quantum physics. |
Emanuele Guidotti; Alfio Ferrara; |
2224 | Multi-view Subspace Clustering on Topological Manifold Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Orthogonal to existing works, in this paper, we argue that it is beneficial to explore the implied data manifold by learning the topological relationship between data points. |
Shudong Huang; Hongjie Wu; Yazhou Ren; Ivor Tsang; Zenglin Xu; Jiancheng Lv; Wentao Feng; |
2225 | Learning Recourse on Instance Environment to Enhance Prediction Accuracy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a model called {\em RecourseNet} that learns to apply recourse on the space of environments so that the recoursed instances are amenable to better predictions by the classifier. |
Lokesh N; Guntakanti Sai Koushik; Abir De; Sunita Sarawagi; |
2226 | Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, many applications of interest require sampling from a specific region of the generative model’s output space or evenly over a range of characteristics. To allow efficient sampling in these scenarios, we propose Generative Visual Prompt (PromptGen), a framework for distributional control over pre-trained generative models by incorporating knowledge of arbitrary off-the-shelf models. |
Chen Henry Wu; Saman Motamed; Shaunak Srivastava; Fernando D De la Torre; |
2227 | ULNeF: Untangled Layered Neural Fields for Mix-and-Match Virtual Try-On Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a neural model that untangles layered neural fields to represent collision-free garment surfaces. |
Igor Santesteban; Miguel Otaduy; Nils Thuerey; Dan Casas; |
2228 | Practical Adversarial Multivalid Conformal Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We give a simple, generic conformal prediction method for sequential prediction that achieves target empirical coverage guarantees on adversarial data. |
Osbert Bastani; Varun Gupta; Christopher Jung; Georgy Noarov; Ramya Ramalingam; Aaron Roth; |
2229 | Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present the first theoretically grounded distributed methods for solving variational inequalities and saddle point problems using compressed communication: MASHA1 and MASHA2. |
Aleksandr Beznosikov; Peter Richtarik; Michael Diskin; Max Ryabinin; Alexander Gasnikov; |
2230 | FiLM-Ensemble: Probabilistic Deep Learning Via Feature-wise Linear Modulation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation (FiLM). |
Mehmet Ozgur Turkoglu; Alexander Becker; Hüseyin Anil Gündüz; Mina Rezaei; Bernd Bischl; Rodrigo Caye Daudt; Stefano D’Aronco; Jan D. Wegner; Konrad Schindler; |
2231 | Beyond Neural Scaling Laws: Beating Power Law Scaling Via Data Pruning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Overall, our work suggests that the discovery of good data-pruning metrics may provide a viable path forward to substantially improved neural scaling laws, thereby reducing the resource costs of modern deep learning. |
Ben Sorscher; Robert Geirhos; Shashank Shekhar; Surya Ganguli; Ari Morcos; |
2232 | Bayesian Active Learning with Fully Bayesian Gaussian Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on active learning with Gaussian Processes (GPs). |
Christoffer Riis; Francisco Antunes; Frederik Hüttel; Carlos Lima Azevedo; Francisco Pereira; |
2233 | CoVariance Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Principal component analysis (PCA) involves the projection of data on the eigenspace of the covariance matrix and draws similarities with the graph convolutional filters in GNNs. Motivated by this observation, we study a GNN architecture, called coVariance neural network (VNN), that operates on sample covariance matrices as graphs. |
Saurabh Sihag; Gonzalo Mateos; Corey McMillan; Alejandro Ribeiro; |
2234 | Rethinking Lipschitz Neural Networks for Certified L-infinity Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we bridge the gap by studying certified $\ell_\infty$ robustness from a novel perspective of representing Boolean functions. |
Bohang Zhang; Du Jiang; Di He; Liwei Wang; |
2235 | Autoregressive Search Engines: Generating Substrings As Document Identifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Previous work has explored ways to partition the search space into hierarchical structures and retrieve documents by autoregressively generating their unique identifier. In this work we propose an alternative that doesn’t force any structure in the search space: using all ngrams in a passage as its possible identifiers. |
Michele Bevilacqua; Giuseppe Ottaviano; Patrick Lewis; Scott Yih; Sebastian Riedel; Fabio Petroni; |
2236 | Increasing The Scope As You Learn: Adaptive Bayesian Optimization in Nested Subspaces Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes BAxUS that leverages a novel family of nested random subspaces to adapt the space it optimizes over to the problem. |
Leonard Papenmeier; Matthias Poloczek; Luigi Nardi; |
2237 | On Embeddings for Numerical Features in Tabular Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we argue that embeddings for numerical features are an underexplored degree of freedom in tabular DL, which allows constructing more powerful DL models and competing with gradient boosted decision trees (GBDT) on some GBDT-friendly benchmarks (that is, where GBDT outperforms conventional DL models). |
Yury Gorishniy; Ivan Rubachev; Artem Babenko; |
2238 | BadPrompt: Backdoor Attacks on Continuous Prompts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we conduct the first study on the vulnerability of the continuous prompt learning algorithm to backdoor attacks. |
Xiangrui Cai; Haidong Xu; Sihan Xu; Ying ZHANG; Yuan xiaojie; |
2239 | GlanceNets: Interpretable, Leak-proof Concept-based Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new class of self-explainable models based on interpretable concepts. |
Emanuele Marconato; Andrea Passerini; Stefano Teso; |
2240 | Multi-Scale Adaptive Network for Single Image Denoising Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, existing methods treat different scale features equally without considering their scale-specific characteristics, \textit{i.e.}, the within-scale characteristics are ignored. In this paper, we reveal this missing piece for multi-scale architecture design and accordingly propose a novel Multi-Scale Adaptive Network (MSANet) for single image denoising. |
Yuanbiao Gou; Peng Hu; Jiancheng Lv; Joey Tianyi Zhou; Xi Peng; |
2241 | An $\alpha$-regret Analysis of Adversarial Bilateral Trade Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we consider gain from trade which is harder to approximate than social welfare. |
Yossi Azar; Amos Fiat; Federico Fusco; |
2242 | Linear-Time Gaussian Processes Using Binary Tree Kernels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new kernel that allows for Gaussian process regression in $O((n+m)\log(n+m))$ time. |
Michael Cohen; Samuel Daulton; Michael A Osborne; |
2243 | Differentially Private Model Compression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we initiate the study of differentially private model compression and propose frameworks for achieving 50% sparsity levels while maintaining nearly full performance. |
FatemehSadat Mireshghallah; Arturs Backurs; Huseyin A. Inan; Lukas Wutschitz; Janardhan Kulkarni; |
2244 | Promising or Elusive? Unsupervised Object Segmentation from Real-world Single Images Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By training more than 200 models, we demonstrate that current unsupervised methods cannot segment generic objects from real-world single images, unless the complex objectness biases are removed. |
Yafei YANG; Bo Yang; |
2245 | A Quantitative Geometric Approach to Neural Network Smoothness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide a unified theoretical framework, a quantitative geometric approach, to address the Lipschitz constant estimation. |
Zi Wang; Gautam Prakriya; Somesh Jha; |
2246 | SNAKE: Shape-aware Neural 3D Keypoint Field Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Nevertheless, the idea of incorporating shape reconstruction into 3D keypoint detection is under-explored. We argue that this is restricted by former problem formulations. To this end, a novel unsupervised paradigm named SNAKE is proposed, which is short for shape-aware neural 3D keypoint field. |
Chengliang Zhong; Peixing You; Xiaoxue Chen; Hao Zhao; Fuchun Sun; Guyue Zhou; Xiaodong Mu; Chuang Gan; Wenbing Huang; |
2247 | Sound and Complete Verification of Polynomial Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we devise a new bounding method, equipped with BaB for global convergence guarantees, called VPN. |
Elias Abad Rocamora; Mehmet Fatih Sahin; Fanghui Liu; Grigorios Chrysos; Volkan Cevher; |
2248 | Respecting Transfer Gap in Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although the gap offers the student model external knowledge from the machine domain, the imbalanced teacher knowledge would make us incorrectly estimate how much to transfer from teacher to student per sample on the non-IID transfer set. To tackle this challenge, we propose Inverse Probability Weighting Distillation (IPWD) that estimates the propensity of a training sample belonging to the machine domain, and assigns its inverse amount to compensate for under-represented samples. |
Yulei Niu; Long Chen; Hanwang Zhang; Chang Zhou; |
2249 | AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Although pre-trained Vision Transformers (ViTs) have achieved great success in computer vision, adapting a ViT to various image and video tasks is challenging because of its heavy computation and storage burdens, where each model needs to be independently and comprehensively fine-tuned for different tasks, limiting its transferability to different domains. To address this challenge, we propose an effective adaptation approach for Transformer, namely AdaptFormer, which can adapt the pre-trained ViTs into many different image and video tasks efficiently. |
Shoufa Chen; Chongjian GE; Zhan Tong; Jiangliu Wang; Yibing Song; Jue Wang; Ping Luo; |
2250 | Recommender Forest for Efficient Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite the reduction of running time, the representation learning is independent of ANNs index construction; thus, the two operations can be incompatible, which results in potential loss of recommendation accuracy. To overcome the above problem, we propose the Recommender Forest (a.k.a., RecForest), which jointly learns latent embedding and index for efficient and high-fidelity recommendation. |
Chao Feng; Wuchao Li; Defu Lian; Zheng Liu; Enhong Chen; |
2251 | Self-Supervised Image Restoration with Blurry and Noisy Pairs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, real-world training pairs are difficult to collect, and self-supervised methods that rely merely on blurry or noisy images are limited in performance. In this work, we tackle this problem by jointly leveraging the short-exposure noisy image and the long-exposure blurry image for better image restoration. |
Zhilu Zhang; RongJian Xu; Ming Liu; Zifei Yan; Wangmeng Zuo; |
2252 | On The Effect of Pre-training for Transformer in Different Modality on Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We empirically investigate how pre-training on data of different modalities, such as language and vision, affects fine-tuning of Transformer-based models to offline reinforcement learning tasks. |
Shiro Takagi; |
2253 | Sparse Interaction Additive Networks Via Feature Interaction Detection and Sparse Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop a tractable selection algorithm to efficiently identify the necessary feature combinations by leveraging techniques in feature interaction detection. |
James Enouen; Yan Liu; |
2254 | Efficient and Effective Multi-task Grouping Via Meta Learning on Task Combinations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Nonetheless, identifying which tasks should be learned together is still a challenging fundamental problem because the possible task combinations grow exponentially with the number of tasks, and existing solutions that rely heavily on heuristics may lead to ineffective groupings with severe performance degradation. To bridge this gap, we develop a systematic multi-task grouping framework with a new meta-learning problem on task combinations, which is to predict the per-task performance gains of multi-task learning over single-task learning for any combination. |
Xiaozhuang Song; Shun Zheng; Wei Cao; James Yu; Jiang Bian; |
2255 | Generating Long Videos of Dynamic Scenes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a video generation model that accurately reproduces object motion, changes in camera viewpoint, and new content that arises over time. |
Tim Brooks; Janne Hellsten; Miika Aittala; Ting-Chun Wang; Timo Aila; Jaakko Lehtinen; Ming-Yu Liu; Alexei Efros; Tero Karras; |
2256 | CEBaB: Estimating The Causal Effects of Real-World Concepts on NLP Model Behavior Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce CEBaB, a new benchmark dataset for assessing concept-based explanation methods in Natural Language Processing (NLP). |
Eldar D Abraham; Karel D'Oosterlinck; Amir Feder; Yair Gat; Atticus Geiger; Christopher Potts; Roi Reichart; Zhengxuan Wu; |
2257 | Adaptively Exploiting D-Separators with Causal Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Ideally, an algorithm should be adaptive; that is, perform nearly as well as an algorithm with oracle knowledge of the presence or absence of a d-separator. In this work, we formalize and study this notion of adaptivity, and provide a novel algorithm that simultaneously achieves (a) optimal regret when a d-separator is observed, improving on classical minimax algorithms, and (b) significantly smaller regret than recent causal bandit algorithms when the observed variables are not a d-separator. |
Blair Bilodeau; Linbo Wang; Daniel M Roy; |
2258 | Private Synthetic Data for Multitask Learning and Marginal Queries Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide a differentially private algorithm for producing synthetic data simultaneously useful for multiple tasks: marginal queries and multitask machine learning (ML). |
Giuseppe Vietri; Cedric Archambeau; Sergul Aydore; William Brown; Michael Kearns; Aaron Roth; Ankit Siva; Shuai Tang; Steven Wu; |
2259 | Peer Prediction for Learning Agents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore the dynamics of sequential peer prediction mechanisms when participants are learning agents. |
Shi Feng; Fang-Yi Yu; Yiling Chen; |
2260 | Maximizing Revenue Under Market Shrinkage and Market Uncertainty Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we formulate the first formal model of shrinking markets in multi-item settings, and study how mechanism design and machine learning can help preserve revenue in an uncertain, shrinking market. |
Maria-Florina Balcan; Siddharth Prasad; Tuomas Sandholm; |
2261 | AttCAT: Explaining Transformers Via Attentive Class Activation Tokens Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel Transformer explanation technique via attentive class activation tokens, leveraging encoded features, their gradients, and their attention weights to generate faithful and confident explanations. |
Yao Qiang; Deng Pan; Chengyin Li; Xin Li; Rhongho Jang; Dongxiao Zhu; |
2262 | A New Family of Generalization Bounds Using Samplewise Evaluated CMI Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new family of information-theoretic generalization bounds, in which the training loss and the population loss are compared through a jointly convex function. |
Fredrik Hellström; Giuseppe Durisi; |
2263 | Exploring Non-Monotonic Latent Alignments for Non-Autoregressive Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We explore non-monotonic latent alignments for NAT to compensate for the weakness of monotonic assumption in CTC. |
Chenze Shao; Yang Feng; |
2264 | Meta-Auto-Decoder for Solving Parametric Partial Differential Equations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Meta-Auto-Decoder (MAD), a mesh-free and unsupervised deep learning method that enables the pre-trained model to be quickly adapted to equation instances by implicitly encoding (possibly heterogeneous) PDE parameters as latent vectors. |
Xiang Huang; Zhanhong Ye; Hongsheng Liu; Shi Ji; Zidong Wang; Kang Yang; Yang Li; Min Wang; Haotian CHU; Fan Yu; Bei Hua; Lei Chen; Bin Dong; |
2265 | Theoretical Analysis of Deep Neural Networks for Temporally Dependent Observations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite the widespread use of neural networks in such settings, most theoretical developments of deep neural networks are under the assumption of independent observations, and theoretical results for temporally dependent observations are scarce. To bridge this gap, we study theoretical properties of deep neural networks on modeling non-linear time series data. |
Mingliang Ma; Abolfazl Safikhani; |
2266 | Online Reinforcement Learning for Mixed Policy Scopes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the online reinforcement learning setting for optimizing the policy space with mixed scopes. |
Junzhe Zhang; Elias Bareinboim; |
2267 | Distributionally Robust Weighted K-nearest Neighbors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study a minimax distributionally robust formulation of weighted k-nearest neighbors, which aims to find the optimal weighted k-NN classifiers that hedge against feature uncertainties. |
Shixiang Zhu; Liyan Xie; Minghe Zhang; Rui Gao; Yao Xie; |
2268 | On The Spectral Bias of Convolutional Neural Tangent and Gaussian Process Kernels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the properties of various over-parameterized convolutional neural architectures through their respective Gaussian process and neural tangent kernels. |
Amnon Geifman; Meirav Galun; David Jacobs; Basri Ronen; |
2269 | A Characterization of Semi-Supervised Adversarially Robust PAC Learnability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of learning an adversarially robust predictor to test time attacks in the semi-supervised PAC model. |
Idan Attias; Steve Hanneke; Yishay Mansour; |
2270 | SAViT: Structure-Aware Vision Transformer Pruning Via Collaborative Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce joint importance, which integrates essential structure-aware interactions between components for the first time, to perform collaborative pruning. |
Chuanyang Zheng; zheyang li; Kai Zhang; Zhi Yang; Wenming Tan; Jun Xiao; Ye Ren; Shiliang Pu; |
2271 | Online Deep Equilibrium Learning for Regularization By Denoising Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose ODER as a new strategy for improving the efficiency of DEQ through stochastic approximations of the measurement models. |
Jiaming Liu; Xiaojian Xu; Weijie Gan; shirin shoushtari; Ulugbek Kamilov; |
2272 | Exploring Figure-Ground Assignment Mechanism in Perceptual Organization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we present a novel Figure-Ground-Aided (FGA) module to learn the configural statistics of the visual scene and leverage it for the reduction of visual ambiguity. |
Wei Zhai; Yang Cao; Jing Zhang; Zheng-Jun Zha; |
2273 | Near-Optimal Multi-Agent Learning for Safe Coverage Control Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we aim to efficiently learn the density to approximately solve the coverage problem while preserving the agents’ safety. |
Manish Prajapat; Matteo Turchetta; Melanie Zeilinger; Andreas Krause; |
2274 | SQ Lower Bounds for Learning Single Neurons with Massart Noise Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish the first SQ lower bounds for learning single neurons (including ReLUs) with Massart noise. |
Ilias Diakonikolas; Daniel Kane; Lisheng Ren; Yuxin Sun; |
2275 | Regret Bounds for Risk-Sensitive Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives including the popular CVaR objective. |
Osbert Bastani; Jason Yecheng Ma; Estelle Shen; Wanqiao Xu; |
2276 | Relational Proxies: Emergent Relationships As Fine-Grained Discriminators Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Relational Proxies, a novel approach that leverages the relational information between the global and local views of an object for encoding its semantic label. |
ABHRA CHAUDHURI; Massimiliano Mancini; Zeynep Akata; Anjan Dutta; |
2277 | Washing The Unwashable: On The (Im)possibility of Fairwashing Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate the issue of fairwashing, in which model explanation techniques are manipulated to rationalize decisions taken by an unfair black-box model using deceptive surrogate models. |
Ali Shahin Shamsabadi; Mohammad Yaghini; Natalie Dullerud; Sierra Wyllie; Ulrich Aïvodji; Aisha Alaagib; Sébastien Gambs; Nicolas Papernot; |
2278 | Cross-Linked Unified Embedding for Cross-modality Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Jointly learning from multi-modal data enables global integration of both shared and modality-specific information, but current strategies often fail when observations from certain modalities are incomplete/missing for part of the subjects. To learn comprehensive representations based on such modality-incomplete data, we present a semi-supervised neural network model called CLUE (Cross-Linked Unified Embedding). |
Xinming Tu; Zhi-Jie Cao; xia chenrui; Sara Mostafavi; Ge Gao; |
2279 | Large-batch Optimization for Dense Visual Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, although the current advanced algorithms such as LARS and LAMB succeed in classification models, the complicated pipelines of dense visual predictions such as object detection and segmentation still suffer from the heavy performance drop in the large-batch training regime. To address this challenge, we propose a simple yet effective algorithm, named Adaptive Gradient Variance Modulator (AGVM), which can train dense visual predictors with very large batch size, enabling several benefits more appealing than prior arts. |
Zeyue Xue; Jianming Liang; Guanglu Song; Zhuofan Zong; Liang Chen; Yu Liu; Ping Luo; |
2280 | Few-Shot Continual Active Learning By A Robot Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, in this paper, we consider a challenging but realistic continual learning problem, Few-Shot Continual Active Learning (FoCAL), where a CL agent is provided with unlabeled data for a new or a previously learned task in each increment and the agent only has limited labeling budget available. |
Ali Ayub; Carter Fendley; |
2281 | SAPipe: Staleness-Aware Pipeline for Data Parallel DNN Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose SAPipe, a performant system that pushes the training speed of data parallelism to its fullest extent. |
Yangrui Chen; Cong Xie; Meng Ma; Juncheng Gu; Yanghua Peng; Haibin Lin; Chuan Wu; Yibo Zhu; |
2282 | RepLAI: Self-supervised Representation Learning from Videos of Audible Interactions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a self-supervised algorithm to learn representations from egocentric video data. |
Himangi Mittal; Pedro Morgado; Unnat Jain; Abhinav Gupta; |
2283 | Tackling Overfitting and Silence in Unsupervised Audio-Visual Source Localization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This is of course an unrealistic assumption, and thus better metrics are necessary to capture the model’s performance on (negative) samples with no visible sound sources. To accomplish this, we extend the test set of popular benchmarks, Flickr SoundNet and VGG-Sound Sources, in order to include negative samples, and measure performance using metrics that balance precision and recall. |
Shentong Mo; Pedro Morgado; |
2284 | Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a novel training scheme, Sym-NCO, which is a regularizer-based training scheme that leverages universal symmetricities in various CO problems and solutions. |
Minsu Kim; Junyoung Park; Jinkyoo Park; |
2285 | Toward Equation of Motion for Deep Neural Networks: Continuous-time Gradient Descent and Discretization Error Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We 1) derive a differential equation that precisely describes the discrete dynamics of DNNs, 2) provide its discretization error, and 3) highlight differences between continuous and discrete gradient descent. |
Taiki Miyagawa; |
2286 | A Contrastive Framework for Neural Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a contrastive solution: (i) SimCTG, a contrastive training objective to calibrate the model’s representation space, and (ii) a decoding method—contrastive search—to encourage diversity while maintaining coherence in the generated text. |
Yixuan Su; Tian Lan; Yan Wang; Dani Yogatama; Lingpeng Kong; Nigel Collier; |
2287 | Non-deep Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This begs the question — is it possible to build high-performing “non-deep” neural networks? We show that it is. |
Ankit Goyal; Alexey Bochkovskiy; Jia Deng; Vladlen Koltun; |
2288 | Approximate Secular Equations for The Cubic Regularization Subproblem Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose and analyze a novel CRS solver based on an approximate secular equation, which requires only some of the Hessian eigenvalues and is therefore much more efficient. |
Yihang Gao; Man-Chung Yue; Michael Ng; |
2289 | Private Estimation with Public Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We initiate the study of differentially private (DP) estimation with access to a small amount of public data. |
Alex Bie; Gautam Kamath; Vikrant Singhal; |
2290 | Robust Testing in High-Dimensional Sparse Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models. |
Anand Jerry George; Clément L Canonne; |
2291 | Learning Physical Dynamics with Subequivariant Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Subequivariant GNN for learning physical dynamics with desirable symmetry and object information. |
Jiaqi Han; Wenbing Huang; Hengbo Ma; Jiachen Li; Josh Tenenbaum; Chuang Gan; |
2292 | Rethinking Training of 3D GANs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we take a different route to 3D synthesis and develop a non-upsampler-based generator with state-of-the-art image quality, high-resolution geometry and which trains $2.5 \times$ \textit{faster}. |
Ivan Skorokhodov; Sergey Tulyakov; Yiqun Wang; Peter Wonka; |
2293 | Optimal and Adaptive Monteiro-Svaiter Acceleration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a variant of the Monteiro-Svaiter (MS) acceleration framework that removes the need to solve an expensive implicit equation at every iteration. |
Yair Carmon; Danielle Hausler; Arun Jambulapati; Yujia Jin; Aaron Sidford; |
2294 | Parameter-Efficient Image-to-Video Transfer Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing attempts typically focus on downstream tasks from the same modality (e.g., image understanding) as the pre-trained model. This creates a limit because in some specific modalities (e.g., video understanding), such a strong pre-trained model with sufficient knowledge is less available or not available at all. In this work, we investigate such a novel cross-modality transfer learning setting, namely parameter-efficient image-to-video transfer learning. |
Junting Pan; Ziyi Lin; Xiatian Zhu; Jing Shao; Hongsheng Li; |
2295 | ConvMAE: Masked Convolution Meets Masked Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, our ConvMAE framework demonstrates that multi-scale hybrid convolution-transformer can learn more discriminative representations via the mask auto-encoding scheme. |
Peng Gao; Teli Ma; Hongsheng Li; Ziyi Lin; Jifeng Dai; Yu Qiao; |
2296 | [Re] Reproducibility Report: Contrastive Learning of Socially-aware Motion Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The following paper is a reproducibility report for ‘Social NCE: Contrastive Learning of Socially-aware Motion Representations’ published in ICCV 2021 as part of the ML Reproducibility Challenge 2021. |
Roopsa Sen; Sidharth Sinha; Animesh Jha; Parv Maheshwari; |
2297 | Decentralized Local Stochastic Extra-Gradient for Variational Inequalities Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider distributed stochastic variational inequalities (VIs) on unbounded domains with the problem data that is heterogeneous (non-IID) and distributed across many devices. |
Aleksandr Beznosikov; Pavel Dvurechenskii; Anastasiia Koloskova; Valentin Samokhin; Sebastian Stich; Alexander Gasnikov; |
2298 | Multi-Instance Causal Representation Learning for Instance Label Prediction and Out-of-Distribution Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the CausalMIL algorithm, which not only excels at instance label prediction but also provides robustness to distribution change by synergistically integrating MIL with identifiable variational autoencoder. |
Weijia Zhang; Xuanhui Zhang; hanwen deng; Min-Ling Zhang; |
2299 | When to Intervene: Learning Optimal Intervention Policies for Critical Events Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, for the first time, we formulate the problem of finding an optimally timed intervention (OTI) policy as minimizing the expected residual time to event, subject to a constraint on the probability of missing the event. |
Niranjan Damera Venkata; Chiranjib Bhattacharyya; |
2300 | MixReg: A Simple Way to Improve Generalization in Regression for Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a simple yet powerful algorithm, mixReg, to improve generalization on regression tasks. |
Huaxiu Yao; Yiping Wang; Linjun Zhang; James Zou; Chelsea Finn; |
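For readers who want a concrete picture of mixup-style regularization in the regression setting, the sketch below interpolates both the inputs and the continuous targets within a batch. It is only a minimal illustration of the general idea under assumed defaults; the paper's exact mixing strategy and hyperparameters may differ, and the `mixreg_batch` helper is a hypothetical name.

```python
import torch

def mixreg_batch(x, y, alpha=0.2):
    """Mixup-style interpolation applied to a regression batch.

    Minimal sketch of the general idea behind mixReg: sample a mixing
    coefficient from a Beta distribution and interpolate both inputs
    and continuous targets with a shuffled copy of the batch.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix

# usage inside a standard training loop (hypothetical model):
# x_mix, y_mix = mixreg_batch(x, y)
# loss = torch.nn.functional.mse_loss(model(x_mix), y_mix)
```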
2301 | Panchromatic and Multispectral Image Fusion Via Alternating Reverse Filtering Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a simple yet effective alternating reverse filtering network for pan-sharpening. |
Keyu Yan; Man Zhou; Jie Huang; Chengjun Xie; Feng Zhao; Chongyi Li; Danfeng Hong; |
2302 | Reinforced Genetic Algorithm for Structure-based Drug Design Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To achieve a more stable and efficient SBDD, we propose Reinforced Genetic Algorithm (RGA) that uses neural models to prioritize the profitable design steps and suppress random-walk behavior. |
Tianfan Fu; Wenhao Gao; Connor Coley; Jimeng Sun; |
2303 | Predictive Coding Beyond Gaussian Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These methods, however, fail to keep up with modern neural networks, as they are unable to replicate the dynamics of complex layers and activation functions. In this work, we solve this problem by generalizing PC to arbitrary probability distributions, enabling the training of architectures, such as transformers, that are hard to approximate with only Gaussian assumptions. |
Luca Pinchetti; Tommaso Salvatori; Yordan Yordanov; Beren Millidge; Yuhang Song; Thomas Lukasiewicz; |
2304 | Learning Distinct and Representative Modes for Image Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Affected by it, the generated captions are limited in diversity and usually less informative than natural image descriptions made by humans. In this paper, we seek to avoid this problem by proposing a Discrete Mode Learning (DML) paradigm for image captioning. |
Qi Chen; Chaorui Deng; Qi Wu; |
2305 | Riemannian Neural SDE: Learning Stochastic Representations on Manifolds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, it typically loses expressivity when the data representation is manifold-valued. To address this issue, we suggest a principled method for expressing the stochastic representation with the Riemannian neural SDE (RNSDE), which extends the conventional Euclidean NSDE. |
Sung Woo Park; Hyomin Kim; Kyungjae Lee; Junseok Kwon; |
2306 | When Does Group Invariant Learning Survive Spurious Correlations? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we reveal the insufficiency of existing group invariant learning methods in preventing classifiers from depending on spurious correlations in the training set. Specifically, we propose two criteria for judging such sufficiency. |
Yimeng Chen; Ruibin Xiong; Zhi-Ming Ma; Yanyan Lan; |
2307 | Last-Iterate Convergence of Optimistic Gradient Method for Monotone Variational Inequalities Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, by introducing a novel analysis through potential functions, we show that (i) this $O(1/N)$ last-iterate convergence can be achieved without any assumption on the Jacobian of the operator, and (ii) it can be extended to the constrained case, which was not derived before even under Lipschitzness of the Jacobian. |
Eduard Gorbunov; Adrien Taylor; Gauthier Gidel; |
2308 | A Fast Post-Training Pruning Framework for Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To retain high accuracy without retraining, we introduce three novel techniques: (i) a lightweight mask search algorithm that finds which heads and filters to prune based on the Fisher information; (ii) mask rearrangement that complements the search algorithm; and (iii) mask tuning that reconstructs the output activations for each layer. |
Woosuk Kwon; Sehoon Kim; Michael Mahoney; Joseph Hassoun; Kurt Keutzer; Amir Gholami; |
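As a rough illustration of how Fisher information can drive a mask search, the sketch below scores per-unit masks in a toy MLP by the accumulated squared gradient of the loss with respect to each mask entry (a diagonal empirical Fisher). This is a simplified, filter-level analogue under stated assumptions, not the paper's head/filter search, mask rearrangement, or mask tuning; `MaskedMLP` and `fisher_importance` are hypothetical names.

```python
import torch
import torch.nn as nn

class MaskedMLP(nn.Module):
    """Toy model with a per-hidden-unit mask so unit importance can be scored."""
    def __init__(self, d_in, d_hidden, d_out):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)
        self.mask = nn.Parameter(torch.ones(d_hidden))  # kept at 1; only its gradients are used

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)) * self.mask)

def fisher_importance(model, data_loader, loss_fn):
    """Empirical Fisher score per masked unit: sum over calibration batches
    of the squared gradient of the loss w.r.t. each mask entry."""
    scores = torch.zeros_like(model.mask)
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        scores += model.mask.grad.detach() ** 2
    return scores

# pruning sketch: keep the k hidden units with the largest Fisher scores
# scores = fisher_importance(model, calib_loader, nn.CrossEntropyLoss())
# keep = scores.argsort(descending=True)[:k]
```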
2309 | Locating and Editing Factual Associations in GPT Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We locate and edit the mechanisms underlying factual association within the activations and weights of large pretrained GPT models. |
Kevin Meng; David Bau; Alex Andonian; Yonatan Belinkov; |
2310 | Spatial Pruned Sparse Convolution for 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we analyze major components of existing sparse 3D CNNs and find that 3D CNNs ignore the redundancy of data and further amplify it in the down-sampling process, which introduces a large amount of extra and unnecessary computational overhead. Inspired by this, we propose a new convolution operator named spatial pruned sparse convolution (SPS-Conv), which includes two variants, spatial pruned submanifold sparse convolution (SPSS-Conv) and spatial pruned regular sparse convolution (SPRS-Conv), both of which are based on the idea of dynamically determining crucial areas for computation to reduce redundancy. |
Jianhui Liu; Yukang Chen; Xiaoqing Ye; Zhuotao Tian; Xiao Tan; Xiaojuan Qi; |
2311 | Neural Reflectance Field from Shading and Shadow Under A Fixed Viewpoint Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we address the "dual problem" of multi-view scene reconstruction in which we utilize single-view images captured under different point lights to learn a neural scene representation. |
Wenqi Yang; Guanying Chen; Chaofeng Chen; Zhenfang Chen; Kwan-Yee K. Wong; |
2312 | Neur2SP: Neural Two-Stage Stochastic Programming Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we tackle two-stage stochastic programs (2SPs), the most widely applied and studied stochastic programming models. |
Rahul Mihir Patel; Justin Dumouchelle; Elias Khalil; Merve Bodur; |
2313 | Prune and Distill: Similar Reformatting of Image Information Along Rat Visual Cortex and Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex and look for them in the other system. |
Paolo Muratore; Sina Tafazoli; Eugenio Piasini; Alessandro Laio; Davide Zoccolan; |
2314 | Palm Up: Playing in The Latent Manifold for Unsupervised Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we aim to bring the best of both worlds and propose an algorithm that exhibits an exploratory behavior whilst it utilizes large diverse datasets. |
Hao Liu; Tom Zahavy; Volodymyr Mnih; Satinder Singh; |
2315 | FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a fair vertical federated learning framework (FairVFL), which can improve the fairness of VFL models. |
Tao Qi; Fangzhao Wu; Chuhan Wu; Lingjuan Lyu; Tong Xu; Hao Liao; Zhongliang Yang; Yongfeng Huang; Xing Xie; |
2316 | Test Time Adaptation Via Conjugate Pseudo-labels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we start by presenting a surprising phenomenon: if we attempt to $\textit{meta-learn}$ the “best” possible TTA loss over a wide class of functions, then we recover a function that is $\textit{remarkably}$ similar to (a temperature-scaled version of) the softmax-entropy employed by TENT. |
Sachin Goyal; Mingjie Sun; Aditi Raghunathan; J. Zico Kolter; |
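The highlight refers to the softmax-entropy objective used by TENT. As background, here is a minimal sketch of that baseline: minimize prediction entropy on an unlabeled test batch while updating only normalization-layer affine parameters. This is not the paper's conjugate pseudo-label construction, and the helper names and defaults below are assumptions.

```python
import torch
import torch.nn as nn

def softmax_entropy(logits):
    """Mean entropy of the softmax predictions over a batch."""
    p = logits.softmax(dim=-1)
    return -(p * logits.log_softmax(dim=-1)).sum(dim=-1).mean()

def adapt_on_batch(model, x, lr=1e-3):
    """One TENT-style adaptation step: minimize prediction entropy on the
    unlabeled test batch, updating only normalization-layer affine params."""
    params = []
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.LayerNorm)):
            params += [p for p in m.parameters() if p.requires_grad]
    opt = torch.optim.SGD(params, lr=lr)
    opt.zero_grad()
    softmax_entropy(model(x)).backward()
    opt.step()
    return model
```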
2317 | Learning Options Via Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, there are often many different solutions that maximize the likelihood equally well, including degenerate solutions. To address this underspecification, we propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills. |
Yiding Jiang; Evan Liu; Benjamin Eysenbach; J. Zico Kolter; Chelsea Finn; |
2318 | Signal Recovery with Non-Expansive Generative Network Priors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It has remained open whether the expansivity can be relaxed, allowing for networks with contractive layers (as often the case of real generators). In this work we answer this question, proving that a signal in the range of a Gaussian generative network can be recovered from few linear measurements provided that the width of the layers is proportional to the input layer size (up to log factors). |
Jorio Cocola; |
2319 | A Continuous Time Framework for Discrete Denoising Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide the first complete continuous time framework for denoising diffusion models of discrete data. |
Andrew Campbell; Joe Benton; Valentin De Bortoli; Thomas Rainforth; George Deligiannidis; Arnaud Doucet; |
2320 | Supported Policy Optimization for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents Supported Policy OpTimization (SPOT), which is directly derived from the theoretical formalization of the density-based support constraint. |
Jialong Wu; Haixu Wu; Zihan Qiu; Jianmin Wang; Mingsheng Long; |
2321 | Learning on Arbitrary Graph Topologies Via Predictive Coding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show how predictive coding (PC), a theory of information processing in the cortex, can be used to perform inference and learning on arbitrary graph topologies. |
Tommaso Salvatori; Luca Pinchetti; Beren Millidge; Yuhang Song; Tianyi Bao; Rafal Bogacz; Thomas Lukasiewicz; |
2322 | Can Push-forward Generative Models Fit Multimodal Distributions? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Among these models are the Variational Autoencoders and the Generative Adversarial Networks. In this work, we call them "push-forward" models and study their expressivity. |
Antoine Salmona; Valentin De Bortoli; Julie Delon; Agnes Desolneux; |
2323 | Inverse Game Theory for Stackelberg Games: The Blessing of Bounded Rationality Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our work relaxes the perfectly rational agent assumption to the classic quantal response model, a more realistic behavioral model of bounded rationality. Interestingly, we show that the smoothness brought by such a bounded rationality model actually leads to provably more efficient learning of the follower utility parameters in general Stackelberg games. |
Jibang Wu; Weiran Shen; Fei Fang; Haifeng Xu; |
2324 | Trajectory of Mini-Batch Momentum: Batch Size Saturation and Convergence in High Dimensions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide exact dynamics for SGD with momentum (SGD+M) at large scale and show that SGD+M converges faster than SGD in the large-batch setting. |
Kiwon Lee; Andrew Cheng; Elliot Paquette; Courtney Paquette; |
2325 | Posterior Collapse of A Linear Latent Variable Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For a general linear latent variable model that includes linear variational autoencoders as a special case, we precisely identify the nature of posterior collapse to be the competition between the likelihood and the regularization of the mean due to the prior. |
Zihao Wang; Liu Ziyin; |
2326 | DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present DeVRF, a novel representation to accelerate learning dynamic radiance fields. |
Jia-Wei Liu; Yan-Pei Cao; Weijia Mao; Wenqiao Zhang; David Junhao Zhang; Jussi Keppo; Ying Shan; Xiaohu Qie; Mike Zheng Shou; |
2327 | Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, in this paper, we propose a novel Multi-modal Grouping Network, namely MGN, for explicitly semantic-aware grouping. |
Shentong Mo; Yapeng Tian; |
2328 | Understanding Benign Overfitting in Gradient-Based Meta Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While the conventional statistical learning theory suggests that overparameterized models tend to overfit, empirical evidence reveals that overparameterized meta learning methods still work well – a phenomenon often called “benign overfitting.” In an attempt to understand this phenomenon, we focus on the meta learning settings with a challenging bilevel structure that we term the gradient-based meta learning, and analyze its generalization performance under an overparameterized meta linear regression model. |
Lisha Chen; Songtao Lu; Tianyi Chen; |
2329 | Experimental Design for Linear Functionals in Reproducing Kernel Hilbert Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate optimal design of experiments for {\em estimation of linear functionals in reproducing kernel Hilbert spaces (RKHSs)}. |
Mojmir Mutny; Andreas Krause; |
2330 | Product Ranking for Revenue Maximization with Multiple Purchases Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we assume that each customer can purchase multiple products at will. |
Renzhe Xu; Xingxuan Zhang; Bo Li; Yafeng Zhang; Xiaolong Chen; Peng Cui; |
2331 | MsSVT: Mixed-scale Sparse Voxel Transformer for 3D Object Detection on Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Recent detectors leverage the power of window-based transformers to model long-range dependencies but tend to blur out fine-grained details. To mitigate this gap, we present a novel Mixed-scale Sparse Voxel Transformer, named MsSVT, which can capture both types of information simultaneously via a divide-and-conquer philosophy. |
Shaocong Dong; lihe Ding; Haiyang Wang; Tingfa Xu; Xinli Xu; Jie Wang; Ziyang Bian; Ying Wang; Jianan Li; |
2332 | A2: Efficient Automated Attacker for Boosting Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an efficient automated attacker called A2 to boost AT by generating the optimal perturbations on-the-fly during training. |
Zhuoer Xu; Guanghui Zhu; Changhua Meng; shiwen cui; Zhenzhe Ying; Weiqiang Wang; Ming GU; Yihua Huang; |
2333 | Bridging The Gap Between Object and Image-level Representations for Open-Vocabulary Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We note that both these modes of supervision are \emph{not} optimally aligned for the detection task: CLIP is trained with image-text pairs and lacks precise localization of objects while the image-level supervision has been used with heuristics that do not accurately specify local object regions. In this work, we propose to address this problem by performing object-centric alignment of the language embeddings from the CLIP model. |
Hanoona Bangalath; Muhammad Maaz; Muhammad Uzair Khattak; Salman Khan; Fahad Shahbaz Khan; |
2334 | Measuring and Reducing Model Update Regression in Structured Prediction for NLP Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We find that model update regression is a severe and widespread problem in NLP structured prediction and explore several mitigation methods including a novel, simple, and effective approach called backward-congruent reranking. |
Deng Cai; Elman Mansimov; Yi-An Lai; Yixuan Su; Lei Shu; Yi Zhang; |
2335 | CoNSoLe: Convex Neural Symbolic Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Convex Neural Symbolic Learning (CoNSoLe) to seek convexity under mild conditions. |
Haoran Li; Yang Weng; Hanghang Tong; |
2336 | Variance Reduced ProxSkip: Algorithm, Theory and Application to Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study distributed optimization methods based on the {\em local training (LT)} paradigm, i.e., methods which achieve communication efficiency by performing richer local gradient-based training on the clients before (expensive) parameter averaging is allowed to take place. |
Grigory Malinovsky; Kai Yi; Peter Richtarik; |
2337 | 3D Concept Grounding on Neural Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we address the challenging problem of 3D concept grounding (i.e., segmenting and learning visual concepts) by looking at RGBD images and reasoning about paired questions and answers. |
Yining Hong; Yilun Du; Chunru Lin; Josh Tenenbaum; Chuang Gan; |
2338 | A Coupled Design of Exploiting Record Similarity for Practical Vertical Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we design a novel coupled training paradigm, FedSim, that integrates one-to-many linkage into the training process. |
Zhaomin Wu; Qinbin Li; Bingsheng He; |
2339 | Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we prove the first high-probability complexity results with logarithmic dependence on the confidence level for stochastic methods for solving monotone and structured non-monotone VIPs with non-sub-Gaussian (heavy-tailed) noise and unbounded domains. |
Eduard Gorbunov; Marina Danilova; David Dobre; Pavel Dvurechenskii; Alexander Gasnikov; Gauthier Gidel; |
2340 | Phase Transitions in When Feedback Is Useful Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we offer a theory that accounts for both feedforward and feedback costs, and noise in all computations. |
Lokesh Boominathan; Xaq Pitkow; |
2341 | Geodesic Graph Neural Network for Efficient Graph Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an efficient GNN framework called Geodesic GNN (GD-GNN). |
Lecheng Kong; Muhan Zhang; Yixin Chen; |
2342 | Unsupervised Multi-Object Segmentation By Predicting Probable Motion Patterns Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new approach to learn to segment multiple image objects without manual supervision. |
Laurynas Karazija; Subhabrata Choudhury; Iro Laina; Christian Rupprecht; Andrea Vedaldi; |
2343 | Decoupling Classifier for Boosting Few-shot Object Detection and Instance Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our analysis suggests that the standard classification head of most FSOD or FSIS models needs to be decoupled to mitigate the classification bias. Therefore, we propose an embarrassingly simple but effective method that decouples the standard classifier into two heads. |
Bin-Bin Gao; Xiaochen Chen; Zhongyi Huang; Congchong Nie; Jun Liu; Jinxiang Lai; GUANNAN JIANG; Xi Wang; Chengjie Wang; |
2344 | Don’t Throw Your Model Checkpoints Away Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we make an intriguing observation that an intermediate model, i.e., a checkpoint in the middle of the training procedure, often serves as a better teacher compared to the fully converged model, although the former has much lower accuracy. |
Chaofei Wang; Qisen Yang; Rui Huang; Shiji Song; Gao Huang; |
2345 | Distribution-Informed Neural Networks for Domain Adaptation Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the problem of domain adaptation regression, which learns a regressor for a target domain by leveraging the knowledge from a relevant source domain. |
Jun Wu; Jingrui He; Sheng Wang; Kaiyu Guan; Elizabeth Ainsworth; |
2346 | Improved Algorithms for Neural Active Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In particular, we introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work. |
Yikun Ban; Yuheng Zhang; Hanghang Tong; Arindam Banerjee; Jingrui He; |
2347 | DAGMA: Learning DAGs Via M-matrices and A Log-Determinant Acyclicity Characterization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a \emph{fundamentally different} acyclicity characterization based on the log-determinant (log-det) function, which leverages the nilpotency property of DAGs. |
Kevin Bello; Bryon Aragam; Pradeep Ravikumar; |
2348 | Exploring The Algorithm-Dependent Generalization of AUPRC Optimization with List Stability Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present the first trial in the single-query generalization of stochastic AUPRC optimization. |
Peisong Wen; Qianqian Xu; Zhiyong Yang; Yuan He; Qingming Huang; |
2349 | LOT: Layer-wise Orthogonal Training on Improving L2 Certified Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a layer-wise orthogonal training method (LOT) to effectively train 1-Lipschitz convolution layers via parametrizing an orthogonal matrix with an unconstrained matrix. |
Xiaojun Xu; Linyi Li; Bo Li; |
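One standard way to obtain an orthogonal matrix from an unconstrained one is the matrix exponential of a skew-symmetric matrix; the sketch below shows that generic construction for a dense linear layer. It is meant only to illustrate the idea of unconstrained orthogonal parametrization and is not LOT's specific construction for 1-Lipschitz convolution layers; the class name is hypothetical.

```python
import torch
import torch.nn as nn

class OrthogonalLinear(nn.Module):
    """Linear layer whose weight is orthogonal by construction:
    W = expm(A - A^T) is orthogonal for any unconstrained square A,
    so the layer is 1-Lipschitz in the L2 norm."""
    def __init__(self, dim):
        super().__init__()
        self.A = nn.Parameter(torch.randn(dim, dim) * 0.01)

    def weight(self):
        skew = self.A - self.A.t()      # skew-symmetric matrix
        return torch.matrix_exp(skew)    # orthogonal matrix

    def forward(self, x):
        return x @ self.weight().t()

# sanity check: W @ W^T should be (numerically) the identity
# W = OrthogonalLinear(8).weight()
# print(torch.allclose(W @ W.t(), torch.eye(8), atol=1e-5))
```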
2350 | Emergent Graphical Conventions in A Visual Communication Game Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Primarily focusing on the latter, recent studies of emergent communication overlook the sketches; they do not account for the evolution process through which symbolic sign systems emerge in the trade-off between iconicity and symbolicity. In this work, we take the very first step to model and simulate this process via two neural agents playing a visual communication game; the sender communicates with the receiver by sketching on a canvas. |
Shuwen Qiu; Sirui Xie; Lifeng Fan; Tao Gao; Jungseock Joo; Song-Chun Zhu; Yixin Zhu; |
2351 | A Consolidated Cross-Validation Algorithm for Support Vector Machines Via Data Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a consolidated cross-validation (CV) algorithm for training and tuning the support vector machines (SVM) on reproducing kernel Hilbert spaces. |
Boxiang Wang; Archer Yang; |
2352 | Byzantine-tolerant Federated Gaussian Process Regression for Streaming Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider Byzantine-tolerant federated learning for streaming data using Gaussian process regression (GPR). |
Xu Zhang; Zhenyuan Yuan; Minghui Zhu; |
2353 | CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previous works learn to straightforwardly align the sentence embedding and pixel-level embedding for highlighting the referred objects, but ignore the semantic consistency of pixels within the same object, leading to incomplete masks and localization errors in predictions. To tackle this problem, we propose CoupAlign, a simple yet effective multi-level visual-semantic alignment method, to couple sentence-mask alignment with word-pixel alignment to enforce object mask constraint for achieving more accurate localization and segmentation. |
Zicheng Zhang; Yi Zhu; Jianzhuang Liu; Xiaodan Liang; Wei Ke; |
2354 | Asymptotic Behaviors of Projected Stochastic Approximation: A Jump Diffusion Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a stochastic approximation algorithm, named LPSA, with probabilistic projections to ensure feasibility, so that projections are performed with probability $p_n$ at the $n$-th iteration. |
Jiadong Liang; Yuze Han; Xiang Li; Zhihua Zhang; |
2355 | Enhancing Safe Exploration Using Safety State Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Often the safety cost is sparse and unknown, which unavoidably leads to constraint violations – a phenomenon ideally to be avoided in safety-critical applications. We tackle this problem by augmenting the state-space with a safety state, which is nonnegative if and only if the constraint is satisfied. |
Aivar Sootla; Alexander Cowen-Rivers; Jun Wang; Haitham Bou Ammar; |
2356 | [Re] Solving Phase Retrieval With A Learned Reference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we assume that a known (learned) reference is added to the signal before capturing the Fourier amplitude measurements. |
Nick Rucks; Tobias Uelwer; Stefan Harmeling; |
2357 | [Re] Strategic Classification Made Practical: Reproduction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, the paper Strategic Classification Made Practical is evaluated through a reproduction study. |
Guilly Kolkman; Jan Athmer; Alex Labro; Maksymilian Kulicki; |
2358 | [Re] GANSpace: Discovering Interpretable GAN Controls Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In our study, we primarily focus on reproducing results on the StyleGAN and StyleGAN2 models. |
Vishnu Dasu; Midhush Manohar Thevendria Karthic; |
2359 | [Re] Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our results show that the proposed method did perform better, but in some experiments the uncertainty was too large, so some claims were only partially confirmed. |
Urša Zrimšek; |
2360 | CS-Shapley: Class-wise Shapley Values for Data Valuation in Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose CS-Shapley, a Shapley value with a new value function that discriminates between training instances’ in-class and out-of-class contributions. |
Stephanie Schoch; Haifeng Xu; Yangfeng Ji; |
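For context, the sketch below shows a generic Monte Carlo (permutation-sampling) estimator of data Shapley values with a plain validation-accuracy value function. CS-Shapley's contribution is a class-wise value function that separates in-class and out-of-class contributions, which is not implemented here; `fit_and_score` and `make_model` are hypothetical helpers.

```python
import numpy as np

def fit_and_score(train_idx, X_tr, y_tr, X_val, y_val, make_model):
    """Utility of a training subset: validation accuracy of a model fit on it.
    Assumes the model can be fit on any non-empty subset; CS-Shapley replaces
    this with a class-wise value function."""
    if len(train_idx) == 0:
        return 0.0
    model = make_model()
    model.fit(X_tr[train_idx], y_tr[train_idx])
    return model.score(X_val, y_val)

def monte_carlo_shapley(X_tr, y_tr, X_val, y_val, make_model, n_perm=50, seed=0):
    """Permutation-sampling estimate of each training point's Shapley value."""
    rng = np.random.default_rng(seed)
    n = len(y_tr)
    values = np.zeros(n)
    for _ in range(n_perm):
        perm = rng.permutation(n)
        prev = fit_and_score([], X_tr, y_tr, X_val, y_val, make_model)
        for k in range(n):
            cur = fit_and_score(perm[: k + 1], X_tr, y_tr, X_val, y_val, make_model)
            values[perm[k]] += cur - prev   # marginal contribution of point perm[k]
            prev = cur
    return values / n_perm
```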
2361 | [Re] Learning to Count Everything Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We could not reproduce the original density maps, but we produced similar ones by modifying some of the parameters. We exactly reproduced the results on the paper’s dataset. |
Maša Kljun; Matija Teršek; Domen Vreš; |
2362 | Self-Supervised Contrastive Pre-Training For Time Series Via Time-Frequency Consistency Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we posit that time-frequency consistency (TF-C) — embedding a time-based neighborhood of a particular example close to its frequency-based neighborhood and back — is desirable for pre-training. |
Xiang Zhang; Ziyuan Zhao; Theodoros Tsiligkaridis; Marinka Zitnik; |
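A rough sketch of the consistency idea: embed the raw series and its FFT magnitude with separate encoders and apply an InfoNCE-style loss that pulls the two views of the same example together. The paper's full objective (augmentations, per-domain contrastive terms, and the specific consistency loss) is richer; the encoders below are assumed, hypothetical modules.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.2):
    """InfoNCE-style loss between two batches of embeddings matched by index."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature          # (B, B) similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

def tf_consistency_loss(x, time_encoder, freq_encoder):
    """Pull each series' time-domain embedding toward the embedding of its
    frequency-domain view (FFT magnitude), in the spirit of TF-C."""
    z_t = time_encoder(x)                        # x: (B, T) raw series
    x_f = torch.fft.rfft(x, dim=-1).abs()        # (B, T//2 + 1) spectrum magnitude
    z_f = freq_encoder(x_f)
    return info_nce(z_t, z_f)
```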
2363 | A Unifying Framework for Online Optimization with Long-Term Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal of the decision maker is to maximize their total reward, while at the same time achieving small cumulative constraints violations across the $T$ rounds. We present the first best-of-both-world type algorithm for this general class of problems, with no-regret guarantees both in the case in which rewards and constraints are selected according to an unknown stochastic model, and in the case in which they are selected at each round by an adversary. |
Matteo Castiglioni; Andrea Celli; Alberto Marchesi; Giulia Romano; Nicola Gatti; |
2364 | Lost in Latent Space: Examining Failures of Disentangled Models at Combinatorial Generalisation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Additionally, it is not clear if the reported failures are due to (a) encoders failing to map novel combinations to the proper regions of the latent space, or (b) novel combinations being mapped correctly but the decoder is unable to render the correct output for the unseen combinations. We investigate these alternatives by testing several models on a range of datasets and training settings. |
Milton Montero; Jeffrey Bowers; Rui Ponte Costa; Casimir Ludwig; Gaurav Malhotra; |
2365 | Distributed Optimization for Overparameterized Problems: Achieving Optimal Dimension Independent Communication Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we address the following open research question: To train an overparameterized model over a set of distributed nodes, what is the {\it minimum} communication overhead (in terms of the bits exchanged) that the system needs to sustain, while still achieving (near) zero training loss? |
Bingqing Song; Ioannis Tsaknakis; Chung-Yiu Yau; Hoi-To Wai; Mingyi Hong; |
2366 | Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we devise a general-purpose framework called Masked Conditional Video Diffusion (MCVD) for all of these video synthesis tasks using a probabilistic conditional score-based denoising diffusion model, conditioned on past and/or future frames. |
Vikram Voleti; Alexia Jolicoeur-Martineau; Chris Pal; |
2367 | DIMES: A Differentiable Meta Solver for Combinatorial Optimization Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper addresses the scalability challenge in large-scale combinatorial optimization by proposing a novel approach, namely, DIMES. |
Ruizhong Qiu; Zhiqing Sun; Yiming Yang; |
2368 | Fairness-Aware PAC Learning from Corrupted Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we consider fairness-aware learning under worst-case data manipulations. |
Nikola Konstantinov; Christoph Lampert; |
2369 | Tntorch: Tensor Network Learning with PyTorch Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present tntorch, a tensor learning framework that supports multiple decompositions (including Candecomp/Parafac, Tucker, and Tensor Train) under a unified interface. |
Mikhail Usvyatsov; Rafael Ballester-Ripoll; Konrad Schindler; |
2370 | (f,Gamma)-Divergences: Interpolating Between F-Divergences and Integral Probability Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop a rigorous and general framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs), such as the $1$-Wasserstein distance. |
Jeremiah Birrell; Paul Dupuis; Markos A. Katsoulakis; Yannis Pantazis; Luc Rey-Bellet; |
2371 | On The Approximation of Cooperative Heterogeneous Multi-Agent Reinforcement Learning (MARL) Using Mean Field Control (MFC) Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work considers a collection of $N_{\mathrm{pop}}$ heterogeneous agents that can be segregated into $K$ classes such that the $k$-th class contains $N_k$ homogeneous agents. |
Washim Mondal; Mridul Agarwal; Vaneet Aggarwal; Satish Ukkusuri; |
2372 | Generalization Bounds for Equivariant Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study how equivariance relates to generalization error utilizing PAC Bayesian analysis for equivariant networks, where the transformation laws of feature spaces are determined by group representations. |
Arash Behboodi; Gabriele Cesa; Taco Cohen; |
2373 | Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we develop a novel {\em single-loop} algorithm for IRL that does not compromise reward estimation accuracy. |
Siliang Zeng; Chenliang Li; Alfredo Garcia; Mingyi Hong; |
2374 | A Variant of Anderson Mixing with Minimal Memory Size Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite its numerical success in various applications, the memory requirement in AM remains a bottleneck when solving large-scale optimization problems on a resource-limited machine. To address this problem, we propose a novel variant of the AM method, called Min-AM, that stores only one vector pair, which is the minimal memory requirement in AM. |
Fuchao Wei; Chenglong Bao; Yang Liu; Guangwen Yang; |
2375 | Optimizing Relevance Maps of Vision Transformers Improves Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: It has been observed that visual classification models often rely mostly on the image background, neglecting the foreground, which hurts their robustness to distribution changes. To alleviate this shortcoming, we propose to monitor the model’s relevancy signal and manipulate it such that the model is focused on the foreground object. |
Hila Chefer; Idan Schwartz; Lior Wolf; |
2376 | Hierarchical Lattice Layer for Partially Monotone Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel neural network layer, the hierarchical lattice layer (HLL), as an extension of the lattice layer so that we can use a standard neural network algorithm to train HLL while satisfying monotonicity constraints and so that it can receive a high-dimensional input vector. |
Hiroki Yanagisawa; Kohei Miyaguchi; Takayuki Katsuki; |
2377 | MORA: Improving Ensemble Robustness Evaluation with Model Reweighing Attack Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Yet, we observe that an ensemble can still be fooled even when most of its sub-models are correct. We therefore introduce MORA, a model-reweighing attack that steers adversarial example synthesis by reweighing the importance of sub-model gradients. |
yunrui yu; Xitong Gao; Cheng-Zhong Xu; |
2378 | FOF: Learning Fourier Occupancy Field for Monocular Real-time Human Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Fourier Occupancy Field (FOF), a novel, powerful, efficient, and flexible 3D representation, for real-time and accurate monocular human reconstruction. |
Qiao Feng; Yebin Liu; Yu-Kun Lai; Jingyu Yang; Kun Li; |
2379 | Model-Based Imitation Learning for Urban Driving Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present MILE: a Model-based Imitation LEarning approach for autonomous driving that scales to the complexity of urban driving scenes. |
Anthony Hu; Gianluca Corrado; Nicolas Griffiths; Zachary Murez; Corina Gurau; Hudson Yeo; Alex Kendall; Roberto Cipolla; Jamie Shotton; |
2380 | PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop a compression approach based on quantizing neural network parameters in a linear subspace, profoundly improving on previous results to provide state-of-the-art generalization bounds on a variety of tasks, including transfer learning. |
Sanae Lotfi; Sanyam Kapoor; Marc Finzi; Andres Potapczynski; Micah Goldblum; Andrew Wilson; |
2381 | Fine-Grained Semantically Aligned Vision-Language Pre-Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce LOUPE, a fine-grained semantically aLigned visiOn-langUage PrE-training framework, which learns fine-grained semantic alignment from the novel perspective of game-theoretic interactions. |
Juncheng Li; XIN HE; Longhui Wei; Long Qian; Linchao Zhu; Lingxi Xie; Yueting Zhuang; Qi Tian; Siliang Tang; |
2382 | Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, it is based on the pair-wise formulation of AUC, which suffers from limited scalability w.r.t. sample size and a slow convergence rate, especially for TPAUC. To address this issue, we present a simpler reformulation of the problem in an asymptotically unbiased and instance-wise manner. |
HuiYang Shao; Qianqian Xu; Zhiyong Yang; Shilong Bao; Qingming Huang; |
2383 | Set-based Meta-Interpolation for Few-Task Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While Manifold Mixup based task augmentation methods are domain-agnostic, we empirically find them ineffective on non-image domains. To tackle these limitations, we propose a novel domain-agnostic task augmentation method, Meta-Interpolation, which utilizes expressive neural set functions to densify the meta-training task distribution using bilevel optimization. |
Seanie Lee; Bruno Andreis; Kenji Kawaguchi; Juho Lee; Sung Ju Hwang; |
2384 | In Differential Privacy, There Is Truth: on Vote-Histogram Leakage in Ensemble Private Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The mechanism adds noise to attain a differential privacy guarantee with respect to the teachers’ training data. In this work, we observe that this use of noise, which makes PATE predictions stochastic, enables new forms of leakage of sensitive information. |
JIAQI WANG; Roei Schuster; I Shumailov; David Lie; Nicolas Papernot; |
2385 | Sparse2Dense: Learning to Densify 3D Features to Boost 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Sparse2Dense, a new framework to efficiently boost 3D detection performance by learning to densify point clouds in latent space. |
Tianyu Wang; Xiaowei Hu; Zhengzhe LIU; Chi-Wing Fu; |
2386 | Provably Efficient Offline Multi-agent Reinforcement Learning Via Strategy-wise Bonus Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the strategy-wise concentration principle which directly builds a confidence interval for the joint strategy, in contrast to the point-wise concentration principle which builds a confidence interval for each point in the joint action space. |
Qiwen Cui; Simon Du; |
2387 | Interpolation and Regularization for Causal Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, despite increasing interest in learning models with good causal properties, there is no understanding of whether such interpolators can also achieve causal generalization. To address this gap, we study causal learning from observational data through the lens of interpolation and its counterpart—regularization. |
Leena Chennuru Vankadara; Luca Rendsburg; Ulrike Luxburg; Debarghya Ghoshdastidar; |
2388 | Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs in heterophilic settings and their oversmoothing behaviour. |
Cristian Bodnar; Francesco Di Giovanni; Benjamin Chamberlain; Pietro Lió; Michael Bronstein; |
2389 | A Classification of $G$-invariant Shallow Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Before we can consider the optimization problem itself, we must understand the search space, the architectures in it, and how they relate to one another. In this paper, we take a first step towards this goal; we prove a theorem that gives a classification of all $G$-invariant single-hidden-layer or "shallow" neural network ($G$-SNN) architectures with ReLU activation for any finite orthogonal group $G$. |
Devanshu Agrawal; James Ostrowski; |
2390 | Practical Adversarial Attacks on Spatiotemporal Traffic Forecasting Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, instead of simultaneously attacking all geo-distributed data sources, an iterative gradient guided node saliency method is proposed to identify the time-dependent set of victim nodes. |
Fan LIU; Hao Liu; Wenzhao Jiang; |
2391 | Energy-Based Contrastive Learning of Visual Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we explore Energy-Based Contrastive Learning (EBCLR) that leverages the power of generative learning by combining contrastive learning with Energy-Based Models (EBMs). |
Beomsu Kim; Jong Chul Ye; |
2392 | Symmetry Teleportation for Accelerated Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study a different approach, symmetry teleportation, that allows the parameters to travel a large distance on the loss level set, in order to improve the convergence speed in subsequent steps. |
Bo Zhao; Nima Dehmamy; Robin Walters; Rose Yu; |
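As a toy illustration of moving along a loss level set via a network symmetry, the sketch below uses the positive per-unit rescaling symmetry of a two-layer ReLU network and optimizes the scales to increase the gradient norm while leaving the loss value unchanged. This is an assumed, simplified instance of the idea, not the paper's general teleportation algorithm; all names are hypothetical.

```python
import torch

def grad_norm_sq(W1, W2, X, y):
    """Squared gradient norm of the MSE loss of f(x) = relu(x W1^T) W2^T."""
    pred = torch.relu(X @ W1.t()) @ W2.t()
    loss = ((pred - y) ** 2).mean()
    g1, g2 = torch.autograd.grad(loss, (W1, W2), create_graph=True)
    return (g1 ** 2).sum() + (g2 ** 2).sum()

def teleport(W1, W2, X, y, steps=100, lr=0.05):
    """Move along the positive rescaling symmetry W1 -> D W1, W2 -> W2 D^{-1}
    (which leaves the ReLU network's output, hence the loss, unchanged)
    so as to increase the gradient norm at the current loss level.
    W1, W2 are plain (detached) weight tensors."""
    log_d = torch.zeros(W1.shape[0], requires_grad=True)
    opt = torch.optim.Adam([log_d], lr=lr)
    for _ in range(steps):
        d = torch.exp(log_d)
        W1t = d[:, None] * W1        # scale hidden unit i by d_i
        W2t = W2 / d[None, :]        # undo the scaling on the next layer
        opt.zero_grad()
        (-grad_norm_sq(W1t, W2t, X, y)).backward()   # maximize gradient norm
        opt.step()
    d = torch.exp(log_d).detach()
    return (d[:, None] * W1).detach(), (W2 / d[None, :]).detach()
```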
2393 | Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We leverage 2D clues, synthetically generated from 3D point clouds, that empirically show their aptitude to boost the quality of the 3D learned visual representations. |
eslam mohamed; Yasmeen Alsaedy; Mohamed Elhoseiny; |
2394 | SKFlow: Learning Optical Flow with Super Kernel Sizes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Super Kernel Flow Network (SKFlow), a CNN architecture to ameliorate the impacts of occlusions on optical flow estimation. |
SHANGKUN SUN; Yuanqi Chen; Ge Li; Yu Zhu; Guodong Guo; |
2395 | Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an optimistic posterior sampling algorithm for reinforcement learning (OPSRL), a simple variant of posterior sampling that only needs a number of posterior samples logarithmic in $H$, $S$, $A$, and $T$ per state-action pair. |
Daniil Tiapkin; Denis Belomestny; Daniele Calandriello; Eric Moulines; Remi Munos; Alexey Naumov; Mark Rowland; Michal Valko; Pierre Ménard; |
2396 | Unpacking Reward Shaping: Understanding The Benefits of Reward Engineering on Sample Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we build on the framework of novelty-based exploration to provide a simple scheme for incorporating shaped rewards into RL along with an analysis tool to show that particular choices of reward shaping provably improve sample efficiency. |
Abhishek Gupta; Aldo Pacchiano; Yuexiang Zhai; Sham Kakade; Sergey Levine; |
2397 | Merging Models with Fisher-Weighted Averaging Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Averaging the parameters of models that have the same architecture and initialization can provide a means of combining their respective capabilities. In this paper, we take the perspective that this ”merging” operation can be seen as choosing parameters that approximately maximize the joint likelihood of the posteriors of the models’ parameters. |
Michael S Matena; Colin Raffel; |
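A minimal sketch of Fisher-weighted parameter averaging for two models with identical architectures: estimate a diagonal empirical Fisher for each model and take the per-parameter weighted average. Details such as Fisher normalization and scaling follow the paper; the helpers below are simplified assumptions.

```python
import torch

def diagonal_fisher(model, data_loader, loss_fn):
    """Diagonal empirical Fisher: per-parameter accumulated squared gradients."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return fisher

def fisher_merge(model_a, model_b, fisher_a, fisher_b, eps=1e-8):
    """Per-parameter weighted average:
    theta = (F_a * theta_a + F_b * theta_b) / (F_a + F_b)."""
    merged = {}
    params_b = dict(model_b.named_parameters())
    for n, pa in model_a.named_parameters():
        fa, fb = fisher_a[n], fisher_b[n]
        merged[n] = (fa * pa.detach() + fb * params_b[n].detach()) / (fa + fb + eps)
    return merged  # load with model.load_state_dict(merged, strict=False); buffers unchanged
```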
2398 | Risk Bounds of Multi-Pass SGD for Least Squares in The Interpolation Regime Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The goal of this paper is to provide an instance-dependent excess risk bound of multi-pass SGD for least squares in the interpolation regime, which is expressed as a function of the iteration number, stepsize, and data covariance. |
Difan Zou; Jingfeng Wu; Vladimir Braverman; Quanquan Gu; Sham Kakade; |
2399 | Adapting Self-Supervised Vision Transformers By Probing Attention-Conditioned Masking Consistency Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we shift focus to adapting modern architectures for object recognition — the increasingly popular Vision Transformer (ViT) — initialized with modern pretraining based on self-supervised learning (SSL). |
Viraj Prabhu; Sriram Yenamandra; Aaditya Singh; Judy Hoffman; |
2400 | Towards Debiased Learning and Out-of-Distribution Detection for Graph Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose GraphDE, a probabilistic generative framework for debiased learning and OOD detection on graph data. |
Zenan Li; Qitian Wu; Fan Nie; Junchi Yan; |
2401 | Robust $\phi$-Divergence MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a novel solution framework for robust MDPs with $s$-rectangular ambiguity sets that decomposes the problem into a sequence of robust Bellman updates and simplex projections. |
Chin Pang Ho; Marek Petrik; Wolfram Wiesemann; |
2402 | Understanding and Extending Subgraph GNNs By Rethinking Their Symmetries Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the most prominent form of subgraph methods, which employs node-based subgraph selection policies such as ego-networks or node marking and deletion. |
Fabrizio Frasca; Beatrice Bevilacqua; Michael Bronstein; Haggai Maron; |
2403 | On Kernelized Multi-Armed Bandits with Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our ultimate goal is to study how to utilize the nature of soft constraints to attain a finer complexity-regret-constraint trade-off in the kernelized bandit setting. |
Xingyu Zhou; Bo Ji; |
2404 | First Is Better Than Last for Language Data Influence Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The cancellation effect lowers the discriminative power of the influence score, and deleting influential examples according to this measure often does not change the model’s behavior by much. To mitigate this, we propose a technique called TracIn-WE that modifies a method called TracIn to operate on the word embedding layer instead of the last layer, where the cancellation effect is less severe. |
Chih-Kuan Yeh; Ankur Taly; Mukund Sundararajan; Frederick Liu; Pradeep Ravikumar; |
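TracIn scores the influence of a training example on a test example as a sum over checkpoints of learning-rate-weighted gradient dot products; TracIn-WE restricts those gradients to the word-embedding layer. The sketch below shows that restriction in a generic form, with `loss_fn` and `get_embedding` as hypothetical hooks; the paper's word-level decomposition and approximations are not reproduced.

```python
import torch

def embedding_grad(model, loss_fn, example, embedding_layer):
    """Gradient of the example's loss w.r.t. the word-embedding weights."""
    model.zero_grad()
    loss_fn(model, example).backward()
    return embedding_layer.weight.grad.detach().clone()

def tracin_we(checkpoints, lrs, loss_fn, train_example, test_example, get_embedding):
    """TracIn-style influence restricted to the embedding layer:
    sum over checkpoints of lr * <grad_train, grad_test>."""
    score = 0.0
    for model, lr in zip(checkpoints, lrs):
        emb = get_embedding(model)    # e.g. the model's nn.Embedding module
        g_train = embedding_grad(model, loss_fn, train_example, emb)
        g_test = embedding_grad(model, loss_fn, test_example, emb)
        score += lr * torch.sum(g_train * g_test).item()
    return score
```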
2405 | Spatially Sparse Inference for Deep Generative Image Editing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on our algorithm, we propose Sparse Incremental Generative Engine (SIGE) to convert the theoretical computation reduction to latency reduction on commonly-used hardware. |
Muyang Li; Ji Lin; Chenlin Meng; Stefano Ermon; Song Han; Jun-Yan Zhu; |
2406 | Moderate-fitting As A Natural Backdoor Defender for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The vulnerability of PLMs to backdoor attacks has been demonstrated by increasing evidence in the literature. In this paper, we present several simple yet effective training strategies that can defend against such attacks. |
Biru Zhu; Yujia Qin; Ganqu Cui; Yangyi Chen; Weilin Zhao; Chong Fu; Yangdong Deng; Zhiyuan Liu; Jingang Wang; Wei Wu; Maosong Sun; Ming Gu; |
2407 | Learning Probabilistic Models from Generator Latent Spaces with Hat EBM Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work proposes a method for using any generator network as the foundation of an Energy-Based Model (EBM). |
Mitch Hill; Erik Nijkamp; Bo Pang; Jonathan Mitchell; Song-Chun Zhu; |
2408 | DP-PCA: Statistically Optimal and Differentially Private PCA Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give the first statistically optimal PCA algorithm under approximate differential privacy, which is also computationally efficient. |
Xiyang Liu; Weihao Kong; Prateek Jain; Sewoong Oh; |
2409 | Label-invariant Augmentation for Semi-Supervised Graph Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We conjecture the inferior performance of graph contrastive learning might result from the violation of the label-invariant augmentation assumption. In light of this, we propose a label-invariant augmentation for graph-structured data to address this challenge. |
Han Yue; Chunhui Zhang; Chuxu Zhang; Hongfu Liu; |
2410 | Associating Objects and Their Effects in Video Through Coordination Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We explore a feed-forward approach for decomposing a video into layers, where each layer contains an object of interest along with its associated shadows, reflections, and other visual effects. |
Erika Lu; Forrester Cole; Weidi Xie; Tali Dekel; Bill Freeman; Andrew Zisserman; Michael Rubinstein; |
2411 | Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: We initiate the study of dynamic regret minimization for goal-oriented reinforcement learning modeled by a non-stationary stochastic shortest path problem with changing cost and … |
Liyu Chen; Haipeng Luo; |
2412 | Learning Object Parts from Multiple Views for Low-shot Category Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we aim to learn discriminative object part representations for low-shot category recognition without requiring any category labels. |
Stefan Stojanov; Anh Thai; Zixuan Huang; James Rehg; |
2413 | Human-AI Shared Control Via Policy Dissection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by the neuroscience approach to investigate the motor cortex in primates, we develop a simple yet effective frequency-based approach called Policy Dissection to align the intermediate representation of the learned neural controller with the kinematic attributes of the agent behavior. |
Quanyi Li; Zhenghao Peng; Haibin Wu; Lan Feng; Bolei Zhou; |
2414 | Revisiting Non-Parametric Matching Cost Volumes for Robust and Generalizable Stereo Matching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: First, it is unclear whether stereo matching DNNs really learn to perform matching well. This paper studies this problem through the lens of adversarial attacks. |
Kelvin Cheng; Tianfu Wu; Zhebin Zhang; Hongyu Sun; Christopher Healey; |
2415 | Benefits of Permutation-Equivariance in Auction Mechanisms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider the popular {\it additive valuation} and {\it symmetric valuation} setting; {\it i.e.}, the valuation for a set of items is defined as the sum of all items’ valuations in the set, and the valuation distribution is invariant when the bidders and/or the items are permuted. |
Tian Qin; Fengxiang He; Dingfeng Shi; Wenbing Huang; Dacheng Tao; |
2416 | Contrastive Adapters for Foundation Model Group Robustness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We thus propose contrastive adapting, which contrastively trains adapters to bring sample embeddings close to both their ground-truth class embeddings and same-class sample embeddings. |
Michael Zhang; Christopher Ré; |
2417 | Learning Gradient Fields for Object Arrangement Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper aims to learn a policy from only a set of examples drawn from a target distribution, instead of a handcrafted reward function. |
Mingdong Wu; Fangwei Zhong; Yulong Xia; Hao Dong; |
2418 | GLIF: A Unified Gated Leaky Integrate-and-Fire Neuron for Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose GLIF, a unified spiking neuron, to fuse different bio-features in different neuronal behaviors, enlarging the representation space of spiking neurons. |
Xingting Yao; Fanrong Li; Zitao Mo; Jian Cheng; |
2419 | Concrete Score Matching: Generalized Score Matching for Discrete Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose an analogous score function called the “Concrete score”, a generalization of the (Stein) score for discrete settings. |
Chenlin Meng; Kristy Choi; Jiaming Song; Stefano Ermon; |
2420 | Scalable Design of Error-Correcting Output Codes Using Discrete Optimization with Graph Coloring Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of scalable design of Error-Correcting Output Codes (ECOC) for multi-class classification. |
Samarth Gupta; Saurabh Amin; |
2421 | Automatic Differentiation of Programs with Discrete Randomness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this manuscript we develop a new reparameterization methodology that allows for generating programs whose expectation is the derivative of the expectation of the original program. |
Gaurav Arya; Moritz Schauer; Frank Schäfer; Christopher Rackauckas; |
2422 | Open-Ended Reinforcement Learning with Neural Reward Functions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a different approach that uses reward functions encoded by neural networks. |
Robert Meier; Asier Mujika; |
2423 | MABSplit: Faster Forest Training Using Multi-Armed Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an algorithm that accelerates the training of random forest and other popular tree-based learning methods. |
Mo Tiwari; Ryan Kang; Jaeyong Lee; Chris Piech; Ilan Shomorony; Sebastian Thrun; Martin Zhang; |
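One plausible bandit-style reading of split selection: treat each candidate split as an arm, estimate its impurity on growing subsamples, and successively eliminate arms whose confidence interval is dominated. The sketch below is an assumed illustration of that idea with Hoeffding-style radii, not the paper's exact estimators, sampling scheme, or guarantees; all names are hypothetical.

```python
import numpy as np

def gini(y):
    """Gini impurity of an integer label vector."""
    if len(y) == 0:
        return 0.0
    p = np.bincount(y) / len(y)
    return 1.0 - np.sum(p ** 2)

def split_impurity(X, y, feature, threshold):
    """Weighted Gini impurity of the two children induced by a split."""
    mask = X[:, feature] <= threshold
    left, right = y[mask], y[~mask]
    n = len(y)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

def bandit_best_split(X, y, candidates, batch=64, delta=0.01, seed=0):
    """Successive elimination over candidate (feature, threshold) splits using
    subsampled impurity estimates instead of a full pass over the node's data."""
    rng = np.random.default_rng(seed)
    n = len(y)
    alive = list(range(len(candidates)))
    means = np.zeros(len(candidates))
    counts = np.zeros(len(candidates), dtype=int)
    while len(alive) > 1 and counts[alive[0]] < n:
        idx = rng.choice(n, size=min(batch, n), replace=False)
        for a in alive:
            f, t = candidates[a]
            val = split_impurity(X[idx], y[idx], f, t)
            counts[a] += len(idx)
            means[a] += (val - means[a]) * len(idx) / counts[a]  # running mean
        # Hoeffding-style confidence radius, shrinking as more samples are drawn
        radius = np.sqrt(np.log(2.0 / delta) / (2.0 * np.maximum(counts, 1)))
        best_ucb = min(means[a] + radius[a] for a in alive)
        alive = [a for a in alive if means[a] - radius[a] <= best_ucb]
    return candidates[min(alive, key=lambda a: means[a])]
```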
2424 | Cross Aggregation Transformer for Image Restoration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these methods lack direct interaction among different windows, which limits the establishment of long-range dependencies. To address the above issue, we propose a new image restoration model, Cross Aggregation Transformer (CAT). |
Zheng Chen; Yulun Zhang; Jinjin Gu; yongbing zhang; Linghe Kong; Xin Yuan; |
2425 | Meta-Complementing The Semantics of Short Texts in Neural Topic Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Some other previous works assume that all documents are short, and leverage external auxiliary data, e.g., pretrained word embeddings and document connectivity. Orthogonal to existing works, we remedy this problem within the corpus itself by proposing a Meta-Complement Topic Model, which improves topic quality of short texts by transferring the semantic knowledge learned on long documents to complement semantically limited short texts. |
Ce Zhang; Hady Lauw; |
2426 | TANKBind: Trigonometry-Aware Neural NetworKs for Drug-Protein Binding Structure Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Trigonometry-Aware Neural networKs for binding structure prediction, TANKBind, that builds trigonometry constraint as a vigorous inductive bias into the model and explicitly attends to all possible binding sites for each protein by segmenting the whole protein into functional blocks. |
Wei Lu; Qifeng Wu; Jixian Zhang; Jiahua Rao; Chengtao Li; Shuangjia Zheng; |
2427 | Provably Efficient Model-Free Constrained RL with Linear Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the constrained reinforcement learning problem, in which an agent aims to maximize the expected cumulative reward subject to a constraint on the expected total value of a utility function. |
Arnob Ghosh; Xingyu Zhou; Ness Shroff; |
2428 | Retrieve, Reason, and Refine: Generating Accurate and Faithful Patient Instructions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, writing an appropriate PI can be extremely time-consuming for (potentially overworked) physicians, and the result is prone to being incomplete or erroneous. Therefore, we propose a new task that provides an objective means of avoiding incompleteness while reducing clinical workload: the automatic generation of the PI, envisioned as a document that the clinician can review, modify, and approve as necessary (rather than taking the human "out of the loop"). |
Fenglin Liu; Bang Yang; Chenyu You; Xian Wu; Shen Ge; Zhangdaihong Liu; Xu Sun; Yang Yang; David Clifton; |
2429 | Exploration Via Planning for Information About The Optimal Trajectory Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we develop a method that allows us to plan for exploration while taking both the task and the current knowledge about the dynamics into account. |
Viraj Mehta; Ian Char; Joseph Abbate; Rory Conlin; Mark Boyer; Stefano Ermon; Jeff Schneider; Willie Neiswanger; |
2430 | Robust Bayesian Regression Via Hard Thresholding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on the combination of prior information and robust regression via hard thresholding, this paper proposes an algorithm that improves the breakdown point when facing adaptive adversarial attacks. |
Fan zheyi; Qingpei Hu; Zhaohui Li; |
2431 | Class-Aware Generative Adversarial Transformers for Medical Image Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present CASTformer, a novel type of adversarial transformers, for 2D medical image segmentation. |
Chenyu You; Ruihan Zhao; Fenglin Liu; Siyuan Dong; Sandeep Chinchali; Ufuk Topcu; Lawrence Staib; James Duncan; |
2432 | Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Expectation-Maximization Contrastive Learning (EMCL) to learn compact video-and-language representations. |
Peng Jin; Fa Jin Huang; Fenglin Liu; Xian Wu; Shen Ge; Guoli Song; David Clifton; Jie Chen; |
2433 | Population Geometry Enables Fast Sampling in Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we propose to leverage the population geometry, controlled by the neural code and the neural dynamics, to implement fast samplers in spiking neural networks. |
Paul Masset; Jacob Zavatone-Veth; J. Patrick Connor; Venkatesh Murthy; Cengiz Pehlevan; |
2434 | Image Inpainting Models Are Few-Shot Learners (Given The Right Data) Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Taking inspiration from prompting in NLP systems, this paper investigates Visual Prompting: given input-output image example(s) of a new task at test time and a new input image, the goal is to automatically produce the correct output image, consistent with the proposed task. |
Amir Bar; Yossi Gandelsman; Trevor Darrell; Amir Globerson; Alexei Efros; |
2435 | Handcrafted Backdoors in Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We argue that a supply-chain attacker has more attack techniques available by introducing a handcrafted attack that directly manipulates a model’s weights. |
Sanghyun Hong; Nicholas Carlini; Alexey Kurakin; |
2436 | Deep Differentiable Logic Gate Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose differentiable logic gate networks, an architecture that combines real-valued logics and a continuously parameterized relaxation of the network. |
Felix Petersen; Christian Borgelt; Hilde Kuehne; Oliver Deussen; |
2437 | Domain Adaptation Meets Individual Fairness. And They Get Along Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For example, machine learning (ML) models often perform worse on demographic groups that are underrepresented in the training data. In this paper, we leverage this connection between algorithmic fairness and distribution shifts to show that algorithmic fairness interventions can help ML models overcome distribution shifts, and that domain adaptation methods (for overcoming distribution shifts) can mitigate algorithmic biases. |
Debarghya Mukherjee; Felix Petersen; Mikhail Yurochkin; Yuekai Sun; |
2438 | BYOL-Explore: Exploration By Bootstrapped Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven exploration in visually complex environments. |
Zhaohan Guo; Shantanu Thakoor; Miruna Pislar; Bernardo Avila Pires; Florent Altché; Corentin Tallec; Alaa Saade; Daniele Calandriello; Jean-Bastien Grill; Yunhao Tang; Michal Valko; Remi Munos; Mohammad Gheshlaghi Azar; Bilal Piot; |
2439 | GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes GenerSpeech, a text-to-speech model towards high-fidelity zero-shot style transfer of OOD custom voice. |
Rongjie Huang; Yi Ren; Jinglin Liu; Chenye Cui; Zhou Zhao; |
2440 | Leveraging Inter-Layer Dependency for Post-Training Quantization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel Network-Wise Quantization (NWQ) approach to fully leveraging inter-layer dependency. |
Changbao Wang; DanDan Zheng; Yuanliu Liu; Liang Li; |
2441 | Spectrum Random Masking for Generalization in Image-based Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we argue for two principles for augmentations in RL: First, the augmented observations should facilitate learning a universal policy, which is robust to various distribution shifts. |
Yangru Huang; Peixi Peng; Yifan Zhao; Guangyao Chen; Yonghong Tian; |
2442 | MetaMask: Revisiting Dimensional Confounder for Self-Supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel self-supervised learning method to jointly address the dimensional redundancy and confounder issues by performing a meta-learning technique. |
Jiangmeng Li; Wenwen Qiang; Yanan Zhang; Wenyi Mo; Changwen Zheng; Bing Su; Hui Xiong; |
2443 | Degradation-Aware Unfolding Half-Shuffle Transformer for Spectral Compressive Imaging Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By plugging HST into DAUF, we establish the first Transformer-based deep unfolding method, Degradation-Aware Unfolding Half-Shuffle Transformer (DAUHST), for HSI reconstruction. |
Yuanhao Cai; Jing Lin; Haoqian Wang; Xin Yuan; Henghui Ding; Yulun Zhang; Radu Timofte; Luc V Gool; |
2444 | Compressible-composable NeRF Via Rank-residual Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, its implicit representation causes difficulty in manipulating the models like the explicit mesh representation. Several recent advances in NeRF manipulation are usually restricted by a shared renderer network, or suffer from large model size. To circumvent the hurdle, in this paper, we present a neural field representation that enables efficient and convenient manipulation of models. To achieve this goal, we learn a hybrid tensor rank decomposition of the scene without neural networks. |
Jiaxiang Tang; Xiaokang Chen; Jingbo Wang; Gang Zeng; |
2445 | Accelerated Linearized Laplace Approximation for Bayesian Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These approximations are likely to harm the fidelity of learning outcomes. To tackle this issue, inspired by the connections between LLA and neural tangent kernels (NTKs), we develop a Nyström approximation to NTKs to accelerate LLA. |
Zhijie Deng; Feng Zhou; Jun Zhu; |
2446 | Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, vanilla prompt learning may struggle to utilize atypical instances by rote during fully-supervised training or overfit shallow patterns with low-shot data. To alleviate such limitations, we develop RetroPrompt with the motivation of decoupling knowledge from memorization to help the model strike a balance between generalization and memorization. |
Xiang Chen; Lei Li; Ningyu Zhang; Xiaozhuan Liang; Shumin Deng; Chuanqi Tan; Fei Huang; Luo Si; Huajun Chen; |
2447 | Improving Out-of-distribution Robustness By Adversarial Training with Structured Priors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This provides us with clues that the adversarial perturbations with universal (low dimensional) structures can enhance the robustness to large data distribution shifts which are common in OOD scenarios. Inspired by this, we propose two AT variants with low-rank structures to train OOD-robust models. |
Qixun Wang; Yifei Wang; Hong Zhu; Yisen Wang; |
2448 | Width and Depth Guidelines for Deep Q-Learning: A Function Approximation Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on value-based algorithms with $\epsilon$-greedy exploration under Besov and Barron function spaces, aiming at approximating an $\alpha$-smooth Q-function in a $d$-dimensional feature space with $T$ episodes. |
Fanghui Liu; Luca Viano; Volkan Cevher; |
2449 | Unsupervised Learning for Combinatorial Optimization with Principled Objective Design Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our key contribution is the observation that if the relaxed objective satisfies entry-wise concavity, a low optimization loss guarantees the quality of the obtained integral solutions. |
Haoyu Peter Wang; Nan Wu; Hang Yang; Cong Hao; Pan Li; |
2450 | Differentially Private Generalized Linear Models Revisited Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: We study the problem of $(\epsilon,\delta)$-differentially private learning of linear predictors with convex losses. We provide results for two subclasses of loss functions. The … |
Raman Arora; Raef Bassily; Cristóbal Guzmán; Michael Menart; Enayat Ullah; |
2451 | NeuForm: Adaptive Overfitting for Neural Shape Editing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose NeuForm to combine the advantages of both overfitted and generalizable representations by adaptively overfitting a generalizable representation to regions where reliable data is available, while using the generalizable representation everywhere else. |
Connor Lin; Niloy Mitra; Gordon Wetzstein; Leonidas Guibas; Paul Guerrero; |
2452 | S3GC: Scalable Self-Supervised Graph Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose S3GC which uses contrastive learning along with Graph Neural Networks and node features to learn clusterable features. |
Fnu Devvrit; Aditya Sinha; Inderjit Dhillon; Prateek Jain; |
2453 | Sequencer: Deep LSTM for Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Against this background, there is growing interest in what inductive bias is suitable for computer vision. Here we propose Sequencer, a novel and competitive architecture alternative to ViT that provides a new perspective on these issues. |
Yuki Tatsunami; Masato Taki; |
2454 | Improving Multi-Task Generalization Via Regularizing Spurious Correlation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We point out the unique challenges of spurious correlation problem in multi-task learning and propose MT-CRL framework to improve multi-task generalization via regularizing spurious correlation. |
Ziniu Hu; Zhe Zhao; Xinyang Yi; Tiansheng Yao; Lichan Hong; Yizhou Sun; Ed Chi; |
2455 | Falsification Before Extrapolation in Causal Effect Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Given a set of observational estimates (e.g., from multiple studies), we propose a meta-algorithm that attempts to reject observational estimates that are biased. |
Michael Oberst; Zeshan M Hussain; Ming-Chieh Shih; David Sontag; |
2456 | Improved Bounds on Neural Complexity for Representing Piecewise Linear Functions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These existing results seem to indicate that the cost of representing a CPWL function is high. In this paper, we propose much tighter bounds and establish a polynomial time algorithm to find a network satisfying these bounds for any given CPWL function. |
Kuan-Lin Chen; Harinath Garudadri; Bhaskar D Rao; |
2457 | Improving Transformer with An Admixture of Attention Heads Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the Transformer with a Finite Admixture of Shared Heads (FiSHformers), a novel class of efficient and flexible transformers that allow the sharing of attention matrices between attention heads. |
Tan Nguyen; Tam Nguyen; Hai Do; Khai Nguyen; Vishwanath Saragadam; Minh Pham; Khuong Duy Nguyen; Nhat Ho; Stanley Osher; |
2458 | UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers Trained Via Conditional GANs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an approach to quantifying both aleatoric and epistemic uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs). |
Philipp Oberdiek; Gernot Fink; Matthias Rottmann; |
2459 | Submodular Maximization in Clean Linear Time Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we provide the first deterministic algorithm that achieves $1/2$-approximation for submodular maximization subject to a knapsack constraint, while making a number of queries that scales only linearly with the size of the ground set $n$. |
Wenxin Li; Moran Feldman; Ehsan Kazemi; Amin Karbasi; |
2460 | Fair Infinitesimal Jackknife: Mitigating The Influence of Biased Training Data Points Without Refitting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Focusing on demographic parity and equality of opportunity, in this paper we propose an algorithm that improves the fairness of a pre-trained classifier by simply dropping carefully selected training data points. |
Prasanna Sattigeri; Soumya Ghosh; Inkit Padhi; Pierre Dognin; Kush Varshney; |
2461 | A Simple and Optimal Policy Design for Online Learning with Safety Against Heavy-tailed Risk Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the classical multi-armed bandit problem and design simple-to-implement new policies that simultaneously enjoy two properties: worst-case optimality for the expected regret, and safety against heavy-tailed risk for the regret distribution. |
Feng Zhu; Zeyu Zheng; David Simchi-Levi; |
2462 | Robust Learning Against Relational Adversaries Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the insights, we propose normalize-and-predict, a learning framework that leverages input normalization to achieve provable robustness. |
Yizhen Wang; Mohannad Alhanahnah; Xiaozhu Meng; Ke Wang; Mihai Christodorescu; Somesh Jha; |
2463 | Minimax Optimal Online Imitation Learning Via Replay Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a minimax-optimal extension of moment matching algorithms for imitation learning and validate it empirically. |
Gokul Swamy; Nived Rajaraman; Matt Peng; Sanjiban Choudhury; J. Bagnell; Steven Wu; Jiantao Jiao; Kannan Ramchandran; |
2464 | Tight Lower Bounds on Worst-Case Guarantees for Zero-Shot Learning with Attributes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a rigorous mathematical analysis of zero-shot learning with attributes. |
Alessio Mazzetto; Cristina Menghini; Andrew Yuan; Eli Upfal; Stephen Bach; |
2465 | Universal Rates for Interactive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We provide a complete characterization of the optimal universal learning rates achievable by an interactive learning algorithm that can ask arbitrary binary queries. |
Steve Hanneke; Amin Karbasi; Shay Moran; Grigoris Velegkas; |
2466 | Robust Anytime Learning of Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We continuously learn the transition probabilities of an MDP in a robust anytime-learning approach that combines a dedicated Bayesian inference scheme with the computation of robust policies. |
Marnix Suilen; Thiago D. Simão; Nils Jansen; David Parker; |
2467 | How and Why to Manipulate Your Own Agent: Modeling Games Between Users of Learning Agents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose to view the outcomes of the agents’ dynamics as inducing a "meta-game" between the users. |
Yoav Kolumbus; Noam Nisan; |
2468 | One Layer Is All You Need Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we are the first to investigate the representative potential of fixed random weights with limited unique values by iteratively learning different masks, leading to a new paradigm for model compression to diminish the model size. |
Yue Bai; Huan Wang; Xu Ma; Yitian Zhang; Zhiqiang Tao; Yun Fu; |
2469 | Model-based Lifelong Reinforcement Learning with Bayesian Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a model-based lifelong reinforcement-learning approach that estimates a hierarchical Bayesian posterior distilling the common structure shared across different tasks. |
Haotian Fu; Shangqun Yu; Michael Littman; George Konidaris; |
2470 | Oracle Inequalities for Model Selection in Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose, to our knowledge, the first model selection algorithm for offline RL that achieves minimax rate-optimal oracle inequalities up to logarithmic factors. |
Jonathan N Lee; George Tucker; Ofir Nachum; Bo Dai; Emma Brunskill; |
2471 | GAMA: Generative Adversarial Multi-Object Scene Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Because perturbations produced by generative models transfer strongly to unknown models, this paper presents the first approach that uses generative models for adversarial attacks on multi-object scenes. |
Abhishek Aich; Calvin-Khang Ta; Akash Gupta; Chengyu Song; Srikanth Krishnamurthy; Salman Asif; Amit Roy-Chowdhury; |
2472 | Chain of Thought Prompting Elicits Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We explore how generating a chain of thought—a series of intermediate reasoning steps—significantly improves the ability of large language models to perform complex reasoning. |
Jason Wei; Xuezhi Wang; Dale Schuurmans; Maarten Bosma; brian ichter; Fei Xia; Ed Chi; Quoc V Le; Denny Zhou; |
2473 | SketchBoost: Fast Gradient Boosted Decision Tree for Multioutput Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose novel methods aiming to accelerate the training process of GBDT in the multioutput scenario. |
Leonid Iosipoi; Anton Vakhrushev; |
2474 | Label-Aware Global Consistency for Multi-Label Learning with Single Positive Labels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to solve SPML problems by designing a Label-Aware global Consistency (LAC) regularization, which leverages the manifold structure information to enhance the recovery of potential positive labels. |
Ming-Kun Xie; Jiahao Xiao; Sheng-Jun Huang; |
2475 | Communication-Efficient Topologies for Decentralized Learning with $O(1)$ Consensus Rate Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new family of network topologies with (almost) constant degree and a network-size-independent consensus rate, which enables decentralized learning with faster communication and a better convergence rate. |
Zhuoqing Song; Weijian Li; Kexin Jin; Lei Shi; Ming Yan; Wotao Yin; Kun Yuan; |
2476 | Why GANs Are Overkill for NLP Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, on sequential data such as text, maximum-likelihood approaches are significantly more utilized than GANs. We show that, while it may seem that maximizing likelihood is inherently different than minimizing distinguishability, this distinction is largely artificial and only holds for limited models. |
David Alvarez-Melis; Vikas Garg; Adam Kalai; |
2477 | Surprise Minimizing Multi-Agent Learning with Energy-based Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We explore surprise minimization in multi-agent learning by utilizing the free energy across all agents in a multi-agent system. |
Karush Suri; |
2478 | Learning Predictions for Algorithms with Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a general design approach for algorithms that learn predictors: (1) identify a functional dependence of the performance measure on the prediction quality, and (2) apply techniques from online learning to learn predictors against adversarial instances, tune robustness-consistency trade-offs, and obtain new statistical guarantees. |
Misha Khodak; Maria-Florina Balcan; Ameet Talwalkar; Sergei Vassilvitskii; |
2479 | Fair Ranking with Noisy Protected Attributes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a fair-ranking framework that incorporates group fairness requirements along with probabilistic information about perturbations in socially-salient attributes. |
Anay Mehrotra; Nisheeth Vishnoi; |
2480 | Provably Expressive Temporal Graph Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, novel constructions reveal the inadequacy of MP-TGNs and WA-TGNs, proving that neither category subsumes the other. We extend the 1-WL (Weisfeiler-Leman) test to temporal graphs, and show that the most powerful MP-TGNs should use injective updates, as in this case they become as expressive as the temporal WL. |
Amauri Souza; Diego Mesquita; Samuel Kaski; Vikas Garg; |
2481 | Learning Expressive Meta-Representations with Mixture of Expert Neural Processes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, most NP variants place a strong emphasis on a global latent variable. This weakens the approximation power and restricts the scope of applications using NP variants, especially when data generative processes are complicated. To resolve these issues, we propose to combine the Mixture of Expert models with Neural Processes to develop more expressive exchangeable stochastic processes, referred to as Mixture of Expert Neural Processes (MoE-NPs). |
Qi Wang; Herke van Hoof; |
2482 | Fast Bayesian Coresets Via Subsampling and Quasi-Newton Refinement Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a Bayesian coreset construction algorithm that first selects a uniformly random subset of data, and then optimizes the weights using a novel quasi-Newton method. |
Cian Naik; Judith Rousseau; Trevor Campbell; |
2483 | Improving GANs with A Dynamic Discriminator Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proposes to dynamically adjust the capacity of the discriminator during GAN training, which improves synthesis performance without incurring any additional computation cost. |
Ceyuan Yang; Yujun Shen; Yinghao Xu; Deli Zhao; Bo Dai; Bolei Zhou; |
2484 | Truncated Emphatic Temporal Difference Methods for Prediction and Control Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we address those two open problems simultaneously via using truncated followon traces in emphatic TD methods. |
Shangtong Zhang; Shimon Whiteson; |
2485 | Pseudo-Riemannian Graph Convolutional Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider a larger class of pseudo-Riemannian manifolds that generalize hyperboloid and sphere. |
Bo Xiong; Shichao Zhu; Nico Potyka; Shirui Pan; Chuan Zhou; Steffen Staab; |
2486 | A Unified Framework for Deep Symbolic Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a strategy to integrate five disparate solution strategies for symbolic regression into a unified framework, resulting in a new state-of-the-art on SRBench benchmarks. |
Mikel Landajuela; Chak Shing Lee; Jiachen Yang; Ruben Glatt; Claudio P Santiago; Ignacio Aravena; Terrell Mundhenk; Garrett Mulcahy; Brenden K Petersen; |
2487 | S2P: State-conditioned Image Synthesis for Data Augmentation in Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To exploit this benefit in image-based RL as well, we first propose a generative model, S2P (State2Pixel), which synthesizes the raw pixels of the agent from its corresponding state. |
Daesol Cho; Dongseok Shim; H. Jin Kim; |
2488 | Discovery of Single Independent Latent Variable Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our goal is to recover the hidden component. For this purpose, we propose an autoencoder equipped with a discriminator. Unlike the standard nonlinear ICA problem, which was shown to be non-identifiable, in the special case of ICA we consider here, we show that our approach can recover the component of interest up to entropy-preserving transformation. |
Uri Shaham; Jonathan Svirsky; Ori Katz; Ronen Talmon; |
2489 | NeMF: Neural Motion Fields for Kinematic Animation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an implicit neural representation to learn the spatio-temporal space of kinematic motions. |
Chengan He; Jun Saito; James Zachary; Holly Rushmeier; Yi Zhou; |
2490 | Rethinking and Improving Robustness of Convolutional Neural Networks: A Shapley Value-based Approach in Frequency Domain Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce Shapley value, a metric of cooperative game theory, into the frequency domain and propose to quantify the positive (negative) impact of every frequency component of data on CNNs. |
Yiting Chen; Qibing Ren; Junchi Yan; |
2491 | On The Convergence of Stochastic Multi-Objective Gradient Alteration and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we point out that the stochastic gradient alteration algorithms may fail to converge to Pareto optimal solutions. We summarize these seemingly different algorithms into a unified algorithmic framework, where the descent direction is given by the composition of the gradients w.r.t. the multiple objectives. |
Shiji Zhou; Wenpeng Zhang; Jiyan Jiang; Wenliang Zhong; Jinjie Gu; Wenwu Zhu; |
2492 | Better Best of Both Worlds Bounds for Bandits with Switching Costs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce an algorithm that improves previous results in the best-of-both-worlds algorithms for bandits with switching cost domain, accompanied by an adequate lower bound. |
Idan Amir; Guy Azov; Tomer Koren; Roi Livni; |
2493 | Effects of Data Geometry in Early Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By extending recent advances in the theoretical understanding of neural networks, we study how a randomly initialized neural network with piecewise linear activation splits the data manifold into regions where the neural network behaves as a linear function. |
Saket Tiwari; George Konidaris; |
2494 | Infinite Recommendation Networks: A Data-Centric Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We leverage the Neural Tangent Kernel and its equivalence to training infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with infinitely-wide bottleneck layers. |
Noveen Sachdeva; Mehak Dhaliwal; Carole-Jean Wu; Julian Mcauley; |
2495 | Benign, Tempered, or Catastrophic: Toward A Refined Taxonomy of Overfitting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we argue that, while benign overfitting has been instructive to study, real interpolating methods like deep networks do not fit benignly. |
Neil Mallinar; James Simon; Amirhesam Abedsoltan; Parthe Pandit; Misha Belkin; Preetum Nakkiran; |
2496 | Picking on The Same Person: Does Algorithmic Monoculture Lead to Outcome Homogenization? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While sharing offers advantages like amortizing effort, it also has risks. We introduce and formalize one such risk, outcome homogenization, defined here as the extent to which particular individuals or groups experience the same outcomes across different deployments. |
Rishi Bommasani; Kathleen A. Creel; Ananya Kumar; Dan Jurafsky; Percy Liang; |
2497 | Robust Imitation Via Mirror Descent Inverse Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by a first-order optimization method called mirror descent, this paper proposes to predict a sequence of reward functions, which are iterative solutions for a constrained convex problem. |
Dong-Sig Han; Hyunseo Kim; Hyundo Lee; JeHwan Ryu; Byoung-Tak Zhang; |
2498 | A Probabilistic Graph Coupling View of Dimension Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To that extent, we introduce a unifying statistical framework based on the coupling of hidden graphs using cross entropy. |
Hugues Van Assel; Thibault Espinasse; Julien Chiquet; Franck Picard; |
2499 | DTMD: Learning Improvement of Spiking Neural Networks with Dynamic Thresholding Neurons and Moderate Dropout Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Some researchers try to use specified parametric models in different network layers or regions, but most still use preset or suboptimal parameters. Inspired by the neuroscience observation that different neuronal mechanisms exist in disparate brain regions, we propose a new spiking neuronal mechanism, named dynamic thresholding, to address this issue. |
Siqi Wang; Tee Hiang Cheng; Meng-Hiot Lim; |
2500 | Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the phase diagram for two-layer ReLU NNs with infinite width (Luo et al., 2021), we make a step towards drawing a phase diagram for three-layer ReLU NNs with infinite width. |
Hanxu Zhou; Zhou Qixuan; Zhenyuan Jin; Tao Luo; Yaoyu Zhang; Zhi-Qin Xu; |
2501 | Towards Understanding The Condensation of Neural Networks at Initial Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we illustrate the formation of the condensation in multi-layer fully connected NNs and show that the maximal number of condensed orientations in the initial training stage is twice the multiplicity of the activation function, where “multiplicity” refers to the multiplicity of the root of the activation function at the origin. |
Hanxu Zhou; Zhou Qixuan; Tao Luo; Yaoyu Zhang; Zhi-Qin Xu; |
2502 | Learning to Constrain Policy Optimization with Virtual Trust Region Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a constrained optimization method for policy gradient reinforcement learning, which uses two trust regions to regulate each policy update. |
Thai Hung Le; Thommen Karimpanal George; Majid Abdolshah; Dung Nguyen; Kien Do; Sunil Gupta; Svetha Venkatesh; |
2503 | Semantic Field of Words Represented As Non-Linear Potential Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: State-of-the-art word embeddings presume a linear vector space, but this approach does not easily incorporate the nonlinearity that is necessary to represent polysemy. We thus propose a novel semantic FIeld REpresentation, called FIRE, which is a $D$-dimensional field in which every word is represented as a set of its locations and a nonlinear function covering the field. |
Xin Du; Kumiko Tanaka-Ishii; |
2504 | VF-PS: How to Select Important Participants in Vertical Federated Learning, Efficiently and Securely? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our first technical contribution is VF-MINE, a Vertically Federated Mutual INformation Estimator, which uses Fagin’s algorithm, one of the most celebrated algorithms in database theory, as a building block. |
Jiawei Jiang; Lukas Burkhalter; Fangcheng Fu; Bolin Ding; Bo Du; Anwar Hithnawi; Bo Li; Ce Zhang; |
2505 | Exploring The Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explore domain-adaptive training to reduce the toxicity of language models. |
Boxin Wang; Wei Ping; Chaowei Xiao; Peng Xu; Mostofa Patwary; Mohammad Shoeybi; Bo Li; Anima Anandkumar; Bryan Catanzaro; |
2506 | GBA: A Tuning-free Approach to Switch Between Synchronous and Asynchronous Training for Recommendation Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Global Batch gradients Aggregation (GBA) over PS, which aggregates and applies gradients with the same global batch size as the synchronous training. |
Wenbo Su; Yuanxing Zhang; Yufeng Cai; Kaixu Ren; Pengjie Wang; Huimin Yi; Yue Song; Jing Chen; Hongbo Deng; Jian Xu; Lin Qu; Bo Zheng; |
2507 | CalFAT: Calibrated Federated Adversarial Training with Label Skewness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the problem of FAT under label skewness, and first reveal one root cause of the training instability and natural accuracy degradation issues: skewed labels lead to non-identical class probabilities and heterogeneous local models. |
Chen Chen; Yuchen Liu; Xingjun Ma; Lingjuan Lyu; |
2508 | Functional Ensemble Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate how to best distill an ensemble’s predictions using an efficient model. |
Coby Penso; Idan Achituve; Ethan Fetaya; |
2509 | PALBERT: Teaching ALBERT to Ponder Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose improving PonderNet with a novel deterministic Q-exit criterion and a revisited model architecture. |
Nikita Balagansky; Daniil Gavrilov; |
2510 | Provable Generalization of Overparameterized Meta-learning Trained with SGD Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper studies the generalization of a widely used meta-learning approach, Model-Agnostic Meta-Learning (MAML), which aims to find a good initialization for fast adaptation to new tasks. |
Yu Huang; Yingbin Liang; Longbo Huang; |
2511 | Monte Carlo Tree Descent for Black-Box Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose new Monte Carlo Tree Search methods that balance local descent and Bayesian optimization for black-box optimization problems. |
Yaoguang Zhai; Sicun Gao; |
2512 | BILCO: An Efficient Algorithm for Joint Alignment of Time Series Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this report, we present BIdirectional pushing with Linear Component Operations (BILCO), a novel algorithm that solves the joint alignment max-flow problems efficiently and exactly. |
Xuelong Mi; Mengfan Wang; Alex Chen; Jing-Xuan Lim; Yizhi Wang; Misha B Ahrens; Guoqiang Yu; |
2513 | Pyramid Attention For Source Code Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a multi-granularity method for the task of source code summarization, which generates a concise functional description for the given code snippet. |
Lei Chai; Ming LI; |
2514 | Reconstruction on Trees and Low-Degree Polynomials Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We pose related open questions about low-degree polynomials and the Kesten-Stigum threshold. |
Frederic Koehler; Elchanan Mossel; |
2515 | Neural Collapse with Normalized Features: A Geometric Analysis Over The Riemannian Manifold Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As feature normalization in the last layer becomes a common practice in modern representation learning, in this work we theoretically justify the neural collapse phenomenon for normalized features. |
Can Yaras; Peng Wang; Zhihui Zhu; Laura Balzano; Qing Qu; |
2516 | Multiclass Learnability Beyond The PAC Framework: Universal Rates and Partial Concept Classes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we study the problem of multiclass classification with a bounded number of different labels $k$, in the realizable setting. |
Alkis Kalavasis; Grigoris Velegkas; Amin Karbasi; |
2517 | Chromatic Correlation Clustering, Revisited Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that there exists a $2.5$-approximation to the CCC problem based on a Linear Programming (LP) approach, thus improving the best-known approximation ratio of 3 achieved by Klodt et al. [21]. |
Qing Xiu; Kai Han; Jing Tang; Shuang Cui; He Huang; |
2518 | Learning from Stochastically Revealed Preference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we consider two settings for the underlying utility distribution: a Gaussian setting where the customer utility follows the von Mises-Fisher distribution, and a $\delta$-corruption setting where the customer utility distribution concentrates on one fixed vector with high probability and is arbitrarily corrupted otherwise. |
John Birge; Xiaocheng Li; Chunlin Sun; |
2519 | A Single-timescale Analysis for Stochastic Approximation with Multiple Coupled Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the finite-time convergence of nonlinear SA with multiple coupled sequences. |
Han Shen; Tianyi Chen; |
2520 | Context-Based Dynamic Pricing with Partially Linear Demand Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper combines these two approaches by studying the context-based dynamic pricing with online learning, where the unknown expected demand admits a semi-parametric partially linear structure. |
Jinzhi Bu; David Simchi-Levi; Chonghuan Wang; |
2521 | Approximation with CNNs in Sobolev Space: with Applications to Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We derive a novel approximation error bound with explicit prefactor for Sobolev-regular functions using deep convolutional neural networks (CNNs). |
Jian Huang; GUOHAO SHEN; Yuling Jiao; Yuanyuan Lin; |
2522 | Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Such distribution shift could be large if the generator and the student are trained adversarially, causing the student to forget the knowledge it acquired at the previous steps. To alleviate this problem, we propose a simple yet effective method called Momentum Adversarial Distillation (MAD) which maintains an exponential moving average (EMA) copy of the generator and uses synthetic samples from both the generator and the EMA generator to train the student. |
Kien Do; Thai Hung Le; Dung Nguyen; Dang Nguyen; Haripriya Harikumar; Truyen Tran; Santu Rana; Svetha Venkatesh; |
2523 | Characterization of Excess Risk for Locally Strongly Convex Population Risk Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We establish upper bounds for the expected excess risk of models trained by proper iterative algorithms which approximate the local minima. |
Mingyang Yi; Ruoyu Wang; Zhi-Ming Ma; |
2524 | Stochastic Second-Order Methods Provably Beat SGD For Gradient-Dominated Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of functions satisfying gradient dominance property which holds in a wide range of applications in machine learning and signal processing. |
Mohammadsaeed Masiha; Saber Salehkaleybar; Niao He; Negar Kiyavash; Patrick Thiran; |
2525 | Direct Advantage Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we show that the advantage function can be interpreted as causal effects and shares similar properties with causal representations. Based on this insight, we propose Direct Advantage Estimation (DAE), a novel method that can model the advantage function and estimate it directly from on-policy data while simultaneously minimizing the variance of the return without requiring the (action-)value function. |
Hsiao-Ru Pan; Nico Gürtler; Alexander Neitz; Bernhard Schölkopf; |
2526 | Dual-Curriculum Contrastive Multi-Instance Learning for Cancer Prognosis Analysis with Whole Slide Images Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Additionally, existing methods generally aggregate instance representations as bag ones for prognosis prediction and have no consideration of intra-bag redundancy and inter-bag discrimination. To address these issues, we propose a dual-curriculum contrastive MIL method for cancer prognosis analysis with WSIs. |
Chao Tu; Yu Zhang; Zhenyuan Ning; |
2527 | Towards Effective Multi-Modal Interchanges in Zero-Resource Sounding Object Localization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop an effective transfer-based two-stream architecture for zero-resource sounding object localization. |
Yang Zhao; Chen Zhang; Haifeng Huang; Haoyuan Li; Zhou Zhao; |
2528 | RNNs of RNNs: Recursive Construction of Stable Assemblies of Recurrent Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Many properties of single RNNs are well characterized theoretically, but experimental neuroscience has moved in the direction of studying multiple interacting areas, and RNN theory needs to be likewise extended. We take a constructive approach towards this problem, leveraging tools from nonlinear control theory and machine learning to characterize when combinations of stable RNNs will themselves be stable. |
Leo Kozachkov; Michaela Ennis; Jean-Jacques Slotine; |
2529 | Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the first (nearly) matching upper and lower bounds on the sample complexity of PAC RL in deterministic episodic MDPs with finite state and action spaces. |
Andrea Tirinzoni; Aymen Al Marjani; Emilie Kaufmann; |
2530 | Adversarial Training with Complementary Labels: On The Benefit of Gradually Informative Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we further explore the phenomenon and identify the underlying challenges of AT with CLs as intractable adversarial optimization and low-quality adversarial examples. To address the above problems, we propose a new learning strategy using gradually informative attacks, which consists of two critical components: 1) Warm-up Attack (Warm-up) gently raises the adversarial perturbation budgets to ease the adversarial optimization with CLs; 2) Pseudo-Label Attack (PLA) incorporates the progressively informative model predictions into a corrected complementary loss. |
Jianan Zhou; Jianing Zhu; Jingfeng ZHANG; Tongliang Liu; Gang Niu; Bo Han; Masashi Sugiyama; |
2531 | Motion Forecasting Transformer with Global Intention Localization and Local Movement Refinement Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Motion TRansformer (MTR) framework that models motion prediction as the joint optimization of global intention localization and local movement refinement. |
Shaoshuai Shi; Li Jiang; Dengxin Dai; Bernt Schiele; |
2532 | OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a language-powered paradigm for ordinal regression. |
Wanhua Li; Xiaoke Huang; Zheng Zhu; Yansong Tang; Xiu Li; Jie Zhou; Jiwen Lu; |
2533 | CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D. |
Haiyang Wang; Lihe Ding; Shaocong Dong; Shaoshuai Shi; Aoxue Li; Jianan Li; Zhenguo Li; Liwei Wang; |
2534 | Doubly Robust Counterfactual Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a doubly-robust nonparametric estimator for our counterfactual classifier where we can incorporate flexible constraints. |
Kwangho Kim; Edward Kennedy; Jose Zubizarreta; |
2535 | Equivariant Representation in Recurrent Networks with A Continuous Manifold of Attractors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We found that a continuous attractor network (CAN), a canonical neural circuit model, self-consistently generates a continuous family of stationary population responses (attractors) that represents the stimulus equivariantly. |
Wenhao Zhang; Ying Nian Wu; Si Wu; |
2536 | On Gap-dependent Bounds for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a systematic study on gap-dependent sample complexity in offline reinforcement learning. |
Xinqi Wang; Qiwen Cui; Simon Du; |
2537 | DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To reduce bias for estimating Natural Direct and Indirect Effects in mediation analysis, we propose a new method called DeepMed that uses deep neural networks (DNNs) to cross-fit the infinite-dimensional nuisance functions in the efficient influence functions. |
Siqi Xu; Lin Liu; Zhonghua Liu; |
2538 | High-dimensional Limit Theorems for SGD: Effective Dynamics and Critical Scaling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the scaling limits of stochastic gradient descent (SGD) with constant step-size in the high-dimensional regime. |
Gerard Ben Arous; Reza Gheissari; Aukosh Jagannath; |
2539 | Decomposing NeRF for Editing Via Feature Field Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we tackle the problem of semantic scene decomposition of NeRFs to enable query-based local editing of the represented 3D scenes. |
Sosuke Kobayashi; Eiichi Matsumoto; Vincent Sitzmann; |
2540 | On Analyzing Generative and Denoising Capabilities of Diffusion-based Deep Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that diffusion-based generative models are composed of generative and denoising parts. |
Kamil Deja; Anna Kuzina; Tomasz Trzcinski; Jakub Tomczak; |
2541 | Towards Efficient 3D Object Detection with Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we explore the potential of knowledge distillation (KD) for developing efficient 3D object detectors, focusing on popular pillar- and voxel-based detectors. |
Jihan Yang; Shaoshuai Shi; Runyu Ding; Zhe Wang; Xiaojuan Qi; |
2542 | Inception Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent studies show that transformer has strong capability of building long-range dependencies, yet is incompetent in capturing high frequencies that predominantly convey local information. To tackle this issue, we present a novel and general-purpose Inception Transformer, or iFormer for short, that effectively learns comprehensive features with both high- and low-frequency information in visual data. |
Chenyang Si; Weihao Yu; Pan Zhou; Yichen Zhou; Xinchao Wang; Shuicheng Yan; |
2543 | Inference and Sampling for Archimax Copulas Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Building on the stochastic representation of Archimax copulas, we develop a non-parametric inference method and sampling algorithm. |
Yuting Ng; Ali Hasan; Vahid Tarokh; |
2544 | Parameter Tuning and Model Selection in Optimal Transport with Semi-dual Brenier Formulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: So far, no quantitative criterion has yet been put forward to tune the parameter of these models and select maps that best approximate the ground truth. To perform this task, we propose to leverage the Brenier formulation of OT. |
Adrien Vacher; Francois-Xavier Vialard; |
2545 | Fast Vision Transformers with HiLo Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thus, we propose to use the direct speed evaluation on the target platform as the design principle for efficient ViTs. |
Zizheng Pan; Jianfei Cai; Bohan Zhuang; |
2546 | Adaptive Distribution Calibration for Few-Shot Learning with Hierarchical Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a novel distribution calibration method by learning the adaptive weight matrix between novel samples and base classes, which is built upon a hierarchical Optimal Transport (H-OT) framework. |
Dandan Guo; Long Tian; He Zhao; Mingyuan Zhou; Hongyuan Zha; |
2547 | Poisson Flow Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new "Poisson flow" generative model (PFGM) that maps a uniform distribution on a high-dimensional hemisphere into any data distribution. |
Yilun Xu; Ziming Liu; Max Tegmark; Tommi Jaakkola; |
2548 | Extracting Computational Mechanisms from Neural Data Using Low-rank RNNs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we propose a new method called Low-rank Inference from Neural Trajectories (LINT), based on a class of low-rank recurrent neural networks (lrRNNs) for which a link between connectivity and dynamics has been previously demonstrated. |
Adrian Valente; Jonathan Pillow; Srdjan Ostojic; |
2549 | Learning Operators with Coupled Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel operator learning method, LOCA (Learning Operators with Coupled Attention), motivated from the recent success of the attention mechanism. |
Georgios Kissas; Jacob Seidman; Leonardo Ferreira Guilhoto; Victor M. Preciado; George J. Pappas; Paris Perdikaris; |
2550 | CageNeRF: Cage-based Neural Radiance Field for Generalized 3D Deformation and Animation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel framework for deforming and animating the neural radiance field learned on arbitrary objects. |
Yicong Peng; Yichao Yan; Shengqi Liu; Yuhao Cheng; Shanyan Guan; Bowen Pan; Guangtao Zhai; Xiaokang Yang; |
2551 | Stability and Scalability of Node Perturbation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our study highlights the limitations and potential of node perturbation as a biologically plausible learning rule in the brain. |
Naoki Hiratani; Yash Mehta; Timothy Lillicrap; Peter E Latham; |
2552 | NOMAD: Nonlinear Manifold Decoders for Operator Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we present NOMAD, a novel operator learning framework with a nonlinear decoder map capable of learning finite dimensional representations of nonlinear submanifolds in function spaces. |
Jacob Seidman; Georgios Kissas; Paris Perdikaris; George J. Pappas; |
2553 | Generating Training Data with Language Models: Towards Zero-Shot Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks without requiring any task-specific data: A unidirectional PLM generates class-conditioned texts guided by prompts, which are used as the training data for fine-tuning a bidirectional PLM. |
Yu Meng; Jiaxin Huang; Yu Zhang; Jiawei Han; |
2554 | Inductive Logical Query Answering in Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study the inductive query answering task where inference is performed on a graph containing new entities with queries over both seen and unseen entities. |
Mikhail Galkin; Zhaocheng Zhu; Hongyu Ren; Jian Tang; |
2555 | Towards Understanding Grokking: An Effective Theory of Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We aim to understand grokking, a phenomenon where models generalize long after overfitting their training set. |
Ziming Liu; Ouail Kitouni; Niklas S Nolte; Eric Michaud; Max Tegmark; Mike Williams; |
2556 | Revisit Last-iterate Convergence of MSGD Under Milder Requirement on Step Size Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we relax this requirement by studying an alternate step size for the mSGD. |
Ruinan Jin; Xingkang He; Lang Chen; Difei Cheng; Vijay Gupta; |
2557 | Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes the Virtual MCTS (V-MCTS), a variant of MCTS that spends more search time on harder states and less search time on simpler states adaptively. |
Weirui Ye; Pieter Abbeel; Yang Gao; |
2558 | On The Identifiability of Nonlinear ICA: Sparsity and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show the identifiability of nonlinear ICA with unconditional priors. |
Yujia Zheng; Ignavier Ng; Kun Zhang; |
2559 | Robust Models Are Less Over-Confident Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we analyze a variety of adversarially trained models that achieve high robust accuracies when facing state-of-the-art attacks and we show that AT has an interesting side-effect: it leads to models that are significantly less overconfident with their decisions even on clean data than non-robust models. |
Julia Grabinski; Paul Gavrikov; Janis Keuper; Margret Keuper; |
2560 | Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view. |
Dandan Guo; Zhuo Li; Meixi Zheng; He Zhao; Mingyuan Zhou; Hongyuan Zha; |
2561 | Adversarial Attack on Attackers: Post-Process to Mitigate Black-Box Score-Based Query Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Nonetheless, we note that if the loss trend of the outputs is slightly perturbed, SQAs could be easily misled and thereby become much less effective. Following this idea, we propose a novel defense, namely Adversarial Attack on Attackers (AAA), to confound SQAs towards incorrect attack directions by slightly modifying the output logits. |
Sizhe Chen; Zhehao Huang; Qinghua Tao; Yingwen Wu; Cihang Xie; Xiaolin Huang; |
2562 | Surface Coverage Optimization in Unknown Environments By Volumetric Integration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a Next-Best-View method for free-motion depth sensors in large-scale environments. |
Antoine Guedon; Vincent Lepetit; Pascal Monasse; |
2563 | Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present Counterfactual Simulation Testing, a framework for studying the robustness of neural networks to naturalistic variations: by building realistic synthetic scenes, we can pose counterfactual questions to the models, such as "Would your classification still be correct if the object were viewed from the top?" |
Nataniel Ruiz; Cihang Xie; Sarah Bargal; Kate Saenko; Stan Sclaroff; |
2564 | High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves The Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a two-layer neural network: $f(\boldsymbol{x}) = … |
Jimmy Ba; Murat Erdogdu; Taiji Suzuki; Zhichao Wang; Denny Wu; Greg Yang; |
2565 | OmniVL: One Foundation Model for Image-Language and Video-Language Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents OmniVL, a new foundation model to support both image-language and video-language tasks using one universal architecture. |
Junke Wang; Dongdong Chen; Zuxuan Wu; Chong Luo; Luowei Zhou; Yucheng Zhao; Yujia Xie; Ce Liu; Yu-Gang Jiang; Lu Yuan; |
2566 | Model-based Safe Deep Reinforcement Learning Via A Constrained Proximal Policy Optimization Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an On-policy Model-based Safe Deep RL algorithm in which we learn the transition dynamics of the environment in an online manner as well as find a feasible optimal policy using Lagrangian Relaxation-based Proximal Policy Optimization. |
Ashish K Jayant; Shalabh Bhatnagar; |
2567 | Finite-Time Analysis of Adaptive Temporal Difference Learning with Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper establishes the finite-time analysis for the adaptive TD with multi-layer ReLU network approximation whose samples are generated from a Markov decision process. |
Tao Sun; Dongsheng Li; Bao Wang; |
2568 | LogiGAN: Learning Logical Reasoning Via Adversarial Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present LogiGAN, an unsupervised adversarial pre-training framework for improving logical reasoning abilities of language models. |
Xinyu Pi; Wanjun Zhong; Yan Gao; Nan Duan; Jian-Guang Lou; |
2569 | Bridging The Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose Dynamic Hybrid Vision Transformer (DHVT) as the solution to enhance the two inductive biases. |
Zhiying Lu; Hongtao Xie; Chuanbin Liu; Yongdong Zhang; |
2570 | Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study a new local algorithm called Bias-Variance Reduced Local Perturbed SGD (BVR-L-PSGD), that combines the existing bias-variance reduced gradient estimator with parameter perturbation to find second-order optimal points in centralized nonconvex distributed optimization. |
Tomoya Murata; Taiji Suzuki; |
2571 | Implicit Warping for Animation with Image Sets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new implicit warping framework for image animation using sets of source images through the transfer of motion of a driving video. |
Arun Mallya; Ting-Chun Wang; Ming-Yu Liu; |
2572 | Biologically Inspired Dynamic Thresholds for Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing work in the machine learning community does not employ bioplausible spiking threshold schemes. This work aims at bridging this gap by introducing a novel bioinspired dynamic energy-temporal threshold (BDETT) scheme for spiking neural networks (SNNs). |
Jianchuan Ding; Bo Dong; Felix Heide; Yufei Ding; Yunduo Zhou; Baocai Yin; Xin Yang; |
2573 | Machine Learning on Graphs: A Model and Comprehensive Taxonomy Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we aim to bridge the gap between network embedding, graph regularization and graph neural networks. |
Ines Chami; Sami Abu-El-Haija; Bryan Perozzi; Christopher Ré; Kevin Murphy; |
2574 | CUP: Critic-Guided Policy Reuse Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, training these components induces either optimization non-stationarity or heavy sampling cost, significantly impairing the effectiveness of transfer. To tackle this problem, we propose a novel policy reuse algorithm called Critic-gUided Policy reuse (CUP), which avoids training any extra components and efficiently reuses source policies. |
Jin Zhang; Siyuan Li; Chongjie Zhang; |
2575 | Zero-shot Transfer Learning on Heterogeneous Graphs Via Knowledge Transfer Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a zero-shot transfer learning module for HGNNs called a Knowledge Transfer Network (KTN) that transfers knowledge from label-abundant node types to zero-labeled node types through rich relational information given in the HG. |
Minji Yoon; John Palowitch; Dustin Zelle; Ziniu Hu; Ruslan Salakhutdinov; Bryan Perozzi; |
2576 | Procedural Image Programs for Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing work focuses on a handful of generative processes which are hard to integrate together to scale up. To overcome this, we propose training with a large dataset of twenty-one thousand programs, each one generating a diverse set of synthetic images. |
Manel Baradad; Richard Chen; Jonas Wulff; Tongzhou Wang; Rogerio Feris; Antonio Torralba; Phillip Isola; |
2577 | Maximizing and Satisficing in Multi-armed Bandits with Graph Information Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In particular, we consider the problem of finding the arm with the maximum reward (i.e., the maximizing problem) or one that has sufficiently high reward (i.e., the satisficing problem) under this model. We propose novel algorithms GRUB (GRaph based UcB) and zeta-GRUB for these problems and provide theoretical characterization of their performance which specifically elicits the benefit of the graph side information. |
Parth Thaker; Mohit Malu; Nikhil Rao; Gautam Dasarathy; |
2578 | Reduced Representation of Deformation Fields for Effective Non-rigid Shape Matching Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we present a novel approach for computing correspondences between non-rigid objects, by exploiting a reduced representation of deformation fields. |
Ramana Subramanyam Sundararaman; Riccardo Marin; Emanuele Rodolà; Maks Ovsjanikov; |
2579 | ZARTS: On Zero-order Optimization for Neural Architecture Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, our in-depth empirical results show that the approximation often distorts the loss landscape, leading to a biased optimization objective and, in turn, inaccurate gradient estimation for the architecture parameters. This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, to search without enforcing the above approximation. |
Xiaoxing Wang; Wenxuan Guo; Jianlin Su; Xiaokang Yang; Junchi Yan; |
2580 | Learning Best Combination for Efficient N:M Sparsity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current implementation on N:M sparsity requires a tedious pre-training phase or computationally heavy from-scratch training. To circumvent these problems, this paper presents an efficient solution for achieving N:M fine-grained sparsity from scratch. |
Yuxin Zhang; Mingbao Lin; ZhiHang Lin; Yiting Luo; Ke Li; Fei Chao; Yongjian Wu; Rongrong Ji; |
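For background on the entry above, a minimal hedged sketch of what an N:M fine-grained sparsity pattern is (in every group of M consecutive weights, only the N largest-magnitude entries are kept); it illustrates the pattern itself, not the paper's method for learning the best combination from scratch. All names are illustrative.

```python
import numpy as np

def nm_sparsity_mask(w, n=2, m=4):
    """Hedged sketch of the N:M sparsity pattern (not the paper's training
    scheme): in every group of m consecutive weights, keep the n entries
    with the largest magnitude and zero out the rest."""
    flat = w.reshape(-1, m)                       # assumes w.size is divisible by m
    keep = np.argsort(-np.abs(flat), axis=1)[:, :n]
    mask = np.zeros_like(flat)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return mask.reshape(w.shape)

# toy usage: 2:4 sparsity on a small weight matrix
w = np.random.randn(4, 8)
mask = nm_sparsity_mask(w, n=2, m=4)
print((mask.reshape(-1, 4).sum(axis=1) == 2).all())   # every group keeps exactly 2 weights
```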
2581 | ResT V2: Simpler, Faster and Stronger Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes ResTv2, a simpler, faster, and stronger multi-scale vision Transformer for visual recognition. |
Qinglong Zhang; Yu-Bin Yang; |
2582 | Federated Learning from Pre-Trained Models: A Contrastive Learning Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, excessive computation and communication demands pose challenges to current FL frameworks, especially when training large-scale models. To prevent these issues from hindering the deployment of FL systems, we propose a lightweight framework where clients jointly learn to fuse the representations generated by multiple fixed pre-trained models rather than training a large-scale model from scratch. |
Yue Tan; Guodong Long; Jie Ma; LU LIU; Tianyi Zhou; Jing Jiang; |
2583 | A Lagrangian Duality Approach to Active Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider the batch active learning problem, where only a subset of the training data is labeled, and the goal is to query a batch of unlabeled samples to be labeled so as to maximally improve model performance. |
Juan Elenter; Navid Naderializadeh; Alejandro Ribeiro; |
2584 | Dynamic Fair Division with Partial Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the fundamental problem of fairly and efficiently allocating $T$ indivisible items among $n$ agents with additive preferences. |
Gerdus Benade; Daniel Halpern; Alexandros Psomas; |
2585 | Truncated Matrix Power Iteration for Differentiable DAG Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: On the contrary, we discover that large coefficients on higher-order terms are beneficial for DAG learning when the spectral radii of the adjacency matrices are small, and that such larger coefficients approximate the DAG constraints much better than small ones. Based on this, we propose a novel DAG learning method with efficient truncated matrix power iteration to approximate geometric series based DAG constraints. |
Zhen Zhang; Ignavier Ng; Dong Gong; Yuhang Liu; Ehsan Abbasnejad; Mingming Gong; Kun Zhang; Javen Qinfeng Shi; |
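As context for the entry above, a minimal hedged sketch of a truncated polynomial acyclicity penalty of the general form the highlight describes, $h(W)=\mathrm{trace}\big(\sum_{k=1}^{K} c_k (W\circ W)^k\big)$; the coefficients, truncation order, and implementation below are illustrative and do not reproduce the paper's efficient matrix power iteration.

```python
import numpy as np

def truncated_poly_dag_penalty(W, K=8, coef=None):
    """Hedged sketch of a truncated polynomial acyclicity penalty:
    h(W) = trace( sum_{k=1..K} c_k * (W ∘ W)^k ).
    Each trace term counts weighted closed walks of length k, so the penalty
    is zero whenever W is the weighted adjacency matrix of a DAG.
    The coefficients c_k here are illustrative, not the paper's choice."""
    A = W * W                                  # non-negative surrogate adjacency
    if coef is None:
        coef = [1.0] * K
    P = np.eye(A.shape[0])
    h = 0.0
    for k in range(1, K + 1):
        P = P @ A                              # P = A^k
        h += coef[k - 1] * np.trace(P)
    return h

# toy usage: a DAG incurs zero penalty, a 2-cycle does not
dag = np.array([[0.0, 0.7], [0.0, 0.0]])
cyc = np.array([[0.0, 0.7], [0.5, 0.0]])
print(truncated_poly_dag_penalty(dag), truncated_poly_dag_penalty(cyc))
```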
2586 | Nonstationary Dual Averaging and Online Fair Allocation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of fairly allocating sequentially arriving items to a set of individuals. |
Luofeng Liao; Yuan Gao; Christian Kroer; |
2587 | Defining and Characterizing Reward Gaming Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We formally define reward gaming as situations where optimizing a proxy can decrease the true reward, and provide examples and theoretical results. |
Joar Skalse; Nikolaus Howe; Dmitrii Krasheninnikov; David Krueger; |
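To make the definition in the highlight above concrete, here is a tiny hedged toy example (not taken from the paper): a policy that scores higher under a proxy reward while scoring lower under the true reward, which is exactly the kind of situation the paper formalizes as reward gaming.

```python
# Hedged toy illustration of the definition in the highlight (illustrative
# numbers, not an example from the paper).
policies = {
    "policy_A": {"proxy_reward": 1.0, "true_reward": 1.0},
    "policy_B": {"proxy_reward": 2.0, "true_reward": 0.2},  # games the proxy
}

best_by_proxy = max(policies, key=lambda p: policies[p]["proxy_reward"])
best_by_true = max(policies, key=lambda p: policies[p]["true_reward"])

print(best_by_proxy)   # policy_B looks better under the proxy reward
print(best_by_true)    # policy_A is better under the true reward
# Moving from A to B raises the proxy reward (1.0 -> 2.0) but lowers the
# true reward (1.0 -> 0.2): optimizing the proxy decreases the true reward.
```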
2588 | Cryptographic Hardness of Learning Halfspaces with Massart Noise Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We establish computational hardness of learning halfspaces with Massart noise, assuming hardness of the LWE problem. |
Ilias Diakonikolas; Daniel Kane; Pasin Manurangsi; Lisheng Ren; |
2589 | Expected Frequency Matrices of Elections: Computation, Geometry, and Preference Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how to compute frequency matrices of elections, which simplifies a map of elections and helps with preference learning. |
Niclas Boehmer; Robert Bredereck; Edith Elkind; Piotr Faliszewski; Stanisław Szufa; |
2590 | (Optimal) Online Bipartite Matching with Predicted Degrees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a model for online graph problems where algorithms are given access to an oracle that predicts (e.g., based on past data) the degrees of nodes in the graph. |
Anders Aamand; Justin Chen; Piotr Indyk; |
2591 | Discovered Policy Optimisation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we explore the Mirror Learning space by meta-learning a “drift” function. |
Chris Lu; Jakub Kuba; Alistair Letcher; Luke Metz; Christian Schroeder de Witt; Jakob Foerster; |
2592 | MCL-GAN: Generative Adversarial Networks with Multiple Specialized Discriminators Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a generative adversarial network with multiple discriminators, which collaborate to represent a real dataset more effectively. |
Jinyoung Choi; Bohyung Han; |
2593 | Contrastive Learning As Goal-Conditioned Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, instead of adding representation learning parts to an existing RL algorithm, we show (contrastive) representation learning methods are already RL algorithms in their own right. |
Benjamin Eysenbach; Tianjun Zhang; Sergey Levine; Russ Salakhutdinov; |
2594 | Robust Rent Division Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the lexislack solution, which selects a rent division that remains envy-free for valuations within as large a radius as possible of the reported valuations. |
Dominik Peters; Ariel Procaccia; David Zhu; |
2595 | Personalized Online Federated Multi-Kernel Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The present paper develops an algorithmic framework to enable clients to communicate with the server to send their updates with affordable communication cost while clients employ a large dictionary of kernels. |
Pouya M. Ghari; Yanning Shen; |
2596 | Bayesian Persuasion for Algorithmic Recourse Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The resulting opacity forces the decision subjects to rely on incomplete information when making strategic feature modifications. We capture such settings as a game of Bayesian persuasion, in which the decision maker offers a form of recourse to the decision subject by providing them with an action recommendation (or signal) to incentivize them to modify their features in desirable ways. |
Keegan Harris; Valerie Chen; Joon Kim; Ameet Talwalkar; Hoda Heidari; Steven Wu; |
2597 | Sample Constrained Treatment Effect Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We focus on designing efficient randomized controlled trials, to accurately estimate the effect of some treatment on a population of $n$ individuals. |
Raghavendra Addanki; David Arbour; Tung Mai; Cameron Musco; Anup Rao; |
2598 | Geodesic Self-Attention for 3D Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The current self-attention module relies heavily on dot products in Euclidean space, which cannot capture the internal non-Euclidean structure of point cloud objects, especially long-range relationships along the curved manifold surface of a point cloud object. To address this problem, in this paper, we introduce a Riemannian-manifold metric to capture the long-range geometric dependencies of point cloud objects, replacing traditional self-attention modules with the proposed Geodesic Self-attention (GSA) module. |
Zhengyu Li; Zihao Xu; Xihao Wang; Xuan Tang; Mingsong Chen; Hui Yu; Xian Wei; |
2599 | Recall Distortion in Neural Network Pruning and The Undecayed Pruning Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the impact of network pruning is not uniform: prior work has shown that the recall for underrepresented classes in a dataset may be more negatively affected. In this work, we study such relative distortions in recall by hypothesizing an intensification effect that is inherent to the model. |
Aidan Good; Jiaqi Lin; Hannah Sieg; Mikey Fergurson; Xin Yu; Shandian Zhe; Jerzy Wieczorek; Thiago Serra; |
2600 | Revisiting Optimal Convergence Rate for Smooth and Non-convex Stochastic Decentralized Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper revisits non-convex stochastic decentralized optimization and establishes an optimal convergence rate with general weight matrices. |
Kun Yuan; Xinmeng Huang; Yiming Chen; Xiaohan Zhang; Yingya Zhang; PAN PAN; |
2601 | Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address the issue, we formally introduce a Nested Adaptive framework, NeAda for short, that carries an inner loop for adaptively maximizing the dual variable with controllable stopping criteria and an outer loop for adaptively minimizing the primal variable. |
Junchi YANG; Xiang Li; Niao He; |
2602 | Domain Generalization Without Excess Empirical Risk Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We argue that a significant failure mode of this recipe is an excess risk due to an erroneous penalty or hardness in joint optimization. We present an approach that eliminates this problem. |
Ozan Sener; Vladlen Koltun; |
2603 | MOVE: Unsupervised Movable Object Segmentation and Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce MOVE, a novel method to segment objects without any form of supervision. |
Adam Bielski; Paolo Favaro; |
2604 | OGC: Unsupervised 3D Object Segmentation from Rigid Dynamics of Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Unlike all existing methods which usually require a large amount of human annotations for full supervision, we propose the first unsupervised method, called OGC, to simultaneously identify multiple 3D objects in a single forward pass, without needing any type of human annotations. |
Ziyang Song; Bo Yang; |
2605 | Convolutional Neural Networks on Graphs with Chebyshev Approximation, Revisited Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we revisit the problem of approximating the spectral graph convolutions with Chebyshev polynomials. |
Mingguo He; Zhewei Wei; Ji-Rong Wen; |
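For background on the entry above, a minimal hedged sketch of the classical Chebyshev-polynomial approximation of a spectral graph convolution using the three-term recurrence; this is the standard construction the paper revisits, not the paper's proposed variant, and all names are illustrative.

```python
import numpy as np

def chebyshev_filter(L_hat, X, thetas):
    """Hedged sketch of a standard Chebyshev spectral filter:
    sum_k theta_k * T_k(L_hat) @ X, where L_hat is assumed to be the graph
    Laplacian rescaled so its spectrum lies in [-1, 1]."""
    Tx_prev = X                     # T_0(L_hat) X = X
    Tx_curr = L_hat @ X             # T_1(L_hat) X = L_hat X
    out = thetas[0] * Tx_prev
    if len(thetas) > 1:
        out = out + thetas[1] * Tx_curr
    for k in range(2, len(thetas)):
        Tx_next = 2 * (L_hat @ Tx_curr) - Tx_prev   # T_k = 2 L_hat T_{k-1} - T_{k-2}
        out = out + thetas[k] * Tx_next
        Tx_prev, Tx_curr = Tx_curr, Tx_next
    return out

# toy usage on a 3-node path graph
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
L = np.diag(A.sum(1)) - A
L_hat = 2 * L / np.linalg.eigvalsh(L).max() - np.eye(3)   # rescale spectrum to [-1, 1]
X = np.random.randn(3, 4)
print(chebyshev_filter(L_hat, X, thetas=[0.5, 0.3, 0.2]).shape)
```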
2606 | DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose an exact formulation of the solution of diffusion ODEs. |
Cheng Lu; Yuhao Zhou; Fan Bao; Jianfei Chen; Chongxuan LI; Jun Zhu; |
2607 | Robust Graph Structure Learning Over Images Via Multiple Statistical Tests Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It is well known that pairwise similarities between images are sensitive to the noise in feature representations, leading to unreliable graph structures. We address this problem from the viewpoint of statistical tests. |
Yaohua Wang; Fangyi Zhang; Ming Lin; Senzhang Wang; Xiuyu Sun; Rong Jin; |
2608 | Reconstructing Training Data From Trained Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide a novel scheme for reconstructing large portions of the actual training samples from a trained neural network. Our scheme is inspired by recent theoretical results of the implicit bias in training neural networks. |
Niv Haim; Gal Vardi; Gilad Yehudai; Michal Irani; Ohad Shamir; |
2609 | Deep Bidirectional Language-Knowledge Graph Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we propose DRAGON (Deep Bidirectional Language-Knowledge Graph Pretraining), a self-supervised approach to pretraining a deeply joint language-knowledge model from raw text and KG at scale. |
Michihiro Yasunaga; Antoine Bosselut; Hongyu Ren; Xikun Zhang; Christopher D Manning; Percy Liang; Jure Leskovec; |
2610 | Learning to Scaffold: Optimizing Model Explanations for Teaching Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Some research argues that explanations should help teach a student (either human or machine) to simulate the model being explained, and that the quality of explanations can be measured by the simulation accuracy of students on unexplained examples. In this work, leveraging meta-learning techniques, we extend this idea to improve the quality of the explanations themselves, specifically by optimizing explanations such that student models more effectively learn to simulate the original model. |
Patrick Fernandes; Marcos Treviso; Danish Pruthi; André Martins; Graham Neubig; |
2611 | Wavelet Score-Based Generative Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that SGMs can be considerably accelerated, by factorizing the data distribution into a product of conditional probabilities of wavelet coefficients across scales. |
Florentin Guth; Simon Coste; Valentin De Bortoli; Stephane Mallat; |
2612 | When Are Local Queries Useful for Robust Learning? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that, under the uniform distribution, LMQs do not increase the robustness threshold of conjunctions and any superclass, e.g., decision lists and halfspaces. Faced with this negative result, we introduce the local equivalence query oracle, which returns whether the hypothesis and target concept agree in the perturbation region around a point in the training sample, as well as a counterexample if it exists. |
Pascale Gourdeau; Varun Kanade; Marta Kwiatkowska; James Worrell; |
2613 | Non-rigid Point Cloud Registration with Neural Deformation Pyramid Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a hierarchical neural deformation representation that achieves advanced non-rigid point cloud registration. |
YANG LI; Tatsuya Harada; |
2614 | Outlier Suppression: Pushing The Limit of Low-bit Transformer Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We discover that $\boldsymbol \gamma$ in LayerNorm (LN) acts as a sinful amplifier for the outliers, and the importance of outliers varies greatly where some outliers provided by a few tokens cover a large area but can be clipped sharply without negative impacts. Motivated by these findings, we propose an outlier suppression framework including two components: Gamma Migration and Token-Wise Clipping. |
Xiuying Wei; Yunchen Zhang; Xiangguo Zhang; Ruihao Gong; Shanghang Zhang; Qi Zhang; Fengwei Yu; Xianglong Liu; |
2615 | SoteriaFL: A Unified Framework for Private Federated Learning with Communication Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified framework that enhances the communication efficiency of private federated learning with communication compression. |
Zhize Li; Haoyu Zhao; Boyue Li; Yuejie Chi; |
2616 | Unsupervised Object Detection Pretraining with Joint Object Priors Generation and Detector Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel object detection pretraining framework that could generate object priors and learn detectors jointly by generating accurate object priors from the model itself. |
Yizhou Wang; Meilin Chen; SHIXIANG TANG; Feng Zhu; Haiyang Yang; LEI BAI; Rui Zhao; Yunfeng Yan; Donglian Qi; Wanli Ouyang; |
2617 | Learning Generalizable Part-based Feature Representation for 3D Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on the observation that local geometric structures are more generalizable than the whole shape, we propose to reduce the geometry shift by a generalizable part-based feature representation and design a novel part-based domain generalization network (PDG) for 3D point cloud classification. |
Xin Wei; Xiang Gu; Jian Sun; |
2618 | Making Sense of Dependence: Efficient Black-box Explanations Using Dependence Measure Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a new efficient black-box attribution method based on Hilbert-Schmidt Independence Criterion (HSIC), a dependence measure based on Reproducing Kernel Hilbert Spaces (RKHS). |
Paul Novello; Thomas FEL; David Vigouroux; |
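As background for the entry above, a minimal hedged sketch of the standard biased HSIC estimator with RBF kernels, the dependence measure the method builds on; the attribution pipeline itself (perturbing inputs and measuring dependence with the model's outputs) is only gestured at by the toy usage, and all names are illustrative.

```python
import numpy as np

def rbf_kernel(Z, sigma=1.0):
    """Pairwise RBF kernel matrix for the rows of Z."""
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * Z @ Z.T
    return np.exp(-d2 / (2 * sigma**2))

def hsic(X, Y, sigma=1.0):
    """Hedged sketch of the standard biased HSIC estimator:
    HSIC = trace(K H L H) / (n - 1)^2, with centering matrix H."""
    n = X.shape[0]
    K, L = rbf_kernel(X, sigma), rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# toy usage: dependence between hypothetical input masks and a model score
rng = np.random.default_rng(0)
masks = rng.integers(0, 2, size=(64, 10)).astype(float)
scores = masks[:, :3].sum(1, keepdims=True) + 0.1 * rng.standard_normal((64, 1))
print(hsic(masks, scores))
```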
2619 | Towards Reasonable Budget Allocation in Untargeted Graph Structure Attacks Via Gradient Debias Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We identify a critical problem in adversarial attacks on graph structure, formulate it as a budget allocation problem, and solve it with our proposed method; experiments show that our method is highly competitive. |
Zihan Liu; Yun Luo; Lirong Wu; Zicheng Liu; Stan Z. Li; |
2620 | Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by the classifier-guided sampling method, we employ an encoder to learn meaningful representations from images and a gradient estimator to directly model the mean shift according to the learned representations to fill the posterior mean gap for image reconstruction. |
Zijian Zhang; Zhou Zhao; Zhijie Lin; |
2621 | PerfectDou: Dominating DouDizhu with Perfect Information Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose PerfectDou, a state-of-the-art Doudizhu AI system that summits the game, in an actor-critic framework with a proposed technique named perfect information distillation. |
Guan Yang; Minghuan Liu; Weijun Hong; Weinan Zhang; Fei Fang; Guangjun Zeng; Yue Lin; |
2622 | [Re] Graph Edit Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The objective of this reproduction is to assess the feasibility of re-implementing the method in the Python programming language and the adherence of the provided code to the methodology described in the source material. |
Vid Stropnik; Maruša Oražem; |
2623 | Misspecified Phase Retrieval with Generative Priors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel two-step approach with provable guarantees for misspecified phase retrieval with generative priors. |
Zhaoqiang Liu; Xinshao Wang; Jiulong Liu; |
2624 | Active Exploration for Inverse Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel IRL algorithm: Active exploration for Inverse Reinforcement Learning (AceIRL), which actively explores an unknown environment and expert policy to quickly learn the expert’s reward function and identify a good policy. |
David Lindner; Andreas Krause; Giorgia Ramponi; |
2625 | Bringing Image Scene Structure to Video Via Frame-Clip Consistency of Object Tokens Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a learning framework StructureViT (SViT for short), which demonstrates how utilizing the structure of a small number of images only available during training can improve a video model. |
Elad Ben Avraham; Roei Herzig; Karttikeya Mangalam; Amir Bar; Anna Rohrbach; Leonid Karlinsky; Trevor Darrell; Amir Globerson; |
2626 | Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we overcome these limitations by: (i) categorizing attack failures based on how they affect the optimization of gradient-based attacks, while also unveiling two novel failures affecting many popular attack implementations and past evaluations; (ii) proposing six novel \emph{indicators of failure}, to automatically detect the presence of such failures in the attack optimization process; and (iii) suggesting a systematic protocol to apply the corresponding fixes. |
Maura Pintor; Luca Demetrio; Angelo Sotgiu; Ambra Demontis; Nicholas Carlini; Battista Biggio; Fabio Roli; |
2627 | AutoMTL: A Programming Framework for Automating Efficient Multi-Task Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper addresses the challenges by developing the first programming framework AutoMTL that automates efficient MTL model development for vision tasks. |
Lijun Zhang; Xiao Liu; Hui Guan; |
2628 | Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While featuring superior stability, these estimators crucially depend on costly large-batch training, and they sacrifice bound tightness for variance reduction. To overcome these limitations, we revisit the mathematics of popular variational MI bounds from the lens of unnormalized statistical modeling and convex optimization. |
Qing Guo; Junya Chen; Dong Wang; Yuewei Yang; Xinwei Deng; Jing Huang; Larry Carin; Chenyang Tao; Fan Li; |
2629 | Multi-Agent Reinforcement Learning Is A Sequence Modeling Problem Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into SM problems wherein the objective is to map agents’ observation sequences to agents’ optimal action sequences. |
Muning Wen; Jakub Kuba; Runji Lin; Weinan Zhang; Ying Wen; Jun Wang; Yaodong Yang; |
2630 | Multivariate Time-Series Forecasting with Temporal Polynomial Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a temporal polynomial graph neural network (TPGNN) for accurate MTS forecasting, which represents the dynamic variable correlation as a temporal matrix polynomial in two steps. |
Yijing Liu; Qinxian Liu; Jian-Wei Zhang; Haozhe Feng; Zhongwei Wang; Zihan Zhou; Wei Chen; |
2631 | Feature-Proxy Transformer for Few-Shot Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper revives the straightforward framework of “feature extractor $+$ linear classification head” and proposes a novel Feature-Proxy Transformer (FPTrans) method, in which the “proxy” is the vector representing a semantic class in the linear classification head. |
Jian-Wei Zhang; Yifan Sun; Yi Yang; Wei Chen; |
2632 | A Mean-Field Game Approach to Cloud Resource Management with Function Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a mean-field game (MFG) approach to cloud resource management that is scalable to a large number of users and applications and incorporates function approximation to deal with the large state-action spaces in real-world serverless platforms. |
Weichao Mao; Haoran Qiu; Chen Wang; Hubertus Franke; Zbigniew Kalbarczyk; Ravishankar Iyer; Tamer Basar; |
2633 | Bayesian Risk Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new formulation, Bayesian risk Markov decision process (BR-MDP), to address parameter uncertainty in MDPs, where a risk functional is applied in nested form to the expected total cost with respect to the Bayesian posterior distributions of the unknown parameters. |
Yifan Lin; Yuxuan Ren; Enlu Zhou; |
2634 | TransBoost: Improving The Best ImageNet Performance Using Deep Transduction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper deals with deep transductive learning, and proposes TransBoost as a procedure for fine-tuning any deep neural model to improve its performance on any (unlabeled) test set provided at training time. |
Omer Belhasin; Guy Bar-Shalom; Ran El-Yaniv; |
2635 | Towards Scalable (All-Pair) Message Passing for Node Classification Beyond Explicit Topology Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a novel all-pair message passing scheme for efficiently propagating layer-wise signals between arbitrary nodes. |
Qitian Wu; Wentao Zhao; Zenan Li; David P Wipf; Junchi Yan; |
2636 | Adaptive Data Debiasing Through Bounded Exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an algorithm for sequentially debiasing such datasets through adaptive and bounded exploration in a classification problem with costly and censored feedback. |
Yifan Yang; Yang Liu; Parinaz Naghizadeh; |
2637 | MSR: Making Self-supervised Learning Robust to Aggressive Augmentations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, aggressive augmentations may distort images’ structures, leading to a severe semantic shift problem in which augmented views of the same image may not share the same semantics, thus degrading transfer performance. To address this problem, we propose a new SSL paradigm, which counteracts the impact of semantic shift by balancing the role of weak and aggressively augmented pairs. |
Yingbin Bai; Erkun Yang; Zhaoqing Wang; Yuxuan Du; Bo Han; Cheng Deng; Dadong Wang; Tongliang Liu; |
2638 | Operative Dimensions in Unconstrained Connectivity of Recurrent Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we study how network dynamics are related to network connectivity in RNN trained without any specific constraints on several tasks previously employed in neuroscience. |
Renate Krause; Matthew Cook; Sepp Kollmorgen; Valerio Mante; Giacomo Indiveri; |
2639 | Exploiting Semantic Relations for Glass Surface Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a glass surface detection method which exploits semantic relations. |
Yuen-Hei Yeung; Jiaying Lin; Rynson Lau; |
2640 | Rethinking Nonlinear Instrumental Variable Models Through Prediction Validity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Current practice assumes (i) and (ii) are met, then postulates a functional form with limited input from the data. In this paper, we describe a framework that leverages machine learning to validate these typically unchecked but consequential assumptions in the IV framework, providing the researcher empirical evidence about the quality of the instrument given the data at hand. |
Chunxiao Li; Cynthia Rudin; Tyler H. McCormick; |
2641 | Fair Wrapping for Black-box Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new family of techniques to post-process (“wrap") a black-box classifier in order to reduce its bias. |
Alexander Soen; Ibrahim Alabdulmohsin; Sanmi Koyejo; Yishay Mansour; Nyalleng Moorosi; Richard Nock; Ke Sun; Lexing Xie; |
2642 | Concept Activation Regions: A Generalized Framework For Concept-Based Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: When this holds true, the concept can be represented by a concept activation vector (CAV) pointing in that direction. In this work, we propose to relax this assumption by allowing concept examples to be scattered across different clusters in the DNN’s latent space. |
Jonathan Crabbé; Mihaela van der Schaar; |
2643 | BiT: Robustly Binarized Multi-distilled Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we identify a series of improvements that enables binary transformers at a much higher accuracy than what was possible previously. |
Zechun Liu; Barlas Oguz; Aasish Pappu; Lin Xiao; Scott Yih; Meng Li; Raghuraman Krishnamoorthi; Yashar Mehdad; |
2644 | Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Monte Carlo Augmented Actor-Critic (MCAC), a parameter-free modification to standard actor-critic algorithms which initializes the replay buffer with demonstrations and computes a modified $Q$-value by taking the maximum of the standard temporal difference (TD) target and a Monte Carlo estimate of the reward-to-go. |
Albert Wilcox; Ashwin Balakrishna; Daniel Brown; Jules Dedieu; Wyame Benslimane; Ken Goldberg; |
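Since the highlight above spells out the target computation, here is a minimal hedged sketch of it for a single trajectory: the modified target is the element-wise maximum of the one-step TD target and the Monte Carlo reward-to-go. Function and argument names are illustrative, not from the authors' code.

```python
import numpy as np

def mcac_target(rewards, next_q, dones, gamma=0.99):
    """Hedged sketch of the modified Q-value target described in the highlight:
    element-wise max of a one-step TD target and a Monte Carlo reward-to-go
    estimate, computed here for one trajectory."""
    rewards = np.asarray(rewards, dtype=float)
    T = len(rewards)

    # Monte Carlo reward-to-go: G_t = sum_{k >= t} gamma^{k-t} r_k
    mc_return = np.zeros(T)
    running = 0.0
    for t in reversed(range(T)):
        running = rewards[t] + gamma * running
        mc_return[t] = running

    # One-step TD target: r_t + gamma * Q(s_{t+1}, a_{t+1}) * (1 - done_t)
    td_target = rewards + gamma * np.asarray(next_q) * (1.0 - np.asarray(dones))

    # MCAC-style target: take the maximum of the two
    return np.maximum(td_target, mc_return)

# toy usage on a 3-step trajectory
print(mcac_target(rewards=[0, 0, 1], next_q=[0.2, 0.1, 0.0], dones=[0, 0, 1]))
```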
2645 | Log-Polar Space Convolution Layers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes log-polar space convolution, which not only encodes local spatial structures, but also greatly enlarges the local receptive field without increasing the number of parameters. |
Bing Su; Ji-Rong Wen; |
2646 | MAtt: A Manifold Attention Network for EEG Decoding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We herein propose a manifold attention network (mAtt), a novel geometric deep learning (GDL)-based model, featuring a manifold attention mechanism that characterizes spatiotemporal representations of EEG data fully on a Riemannian symmetric positive definite (SPD) manifold. |
Yue-Ting Pan; Jing-Lun Chou; Chun-Shu Wei; |
2647 | On The Discrimination Risk of Mean Aggregation Feature Imputation in Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a simple and effective solution to ensure mean aggregation-imputed features provably have a low discrimination risk (while minimally sacrificing utility) and improve the fairness of models. |
Arjun Subramonian; Kai-Wei Chang; Yizhou Sun; |
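For context on the operation analyzed above, a minimal hedged sketch of mean-aggregation feature imputation on a graph (each unobserved node takes the mean of its observed neighbors' features); the paper's fairness analysis and proposed fix are not sketched here, and all variable names are illustrative.

```python
import numpy as np

def mean_aggregation_impute(A, X, observed):
    """Hedged sketch of mean-aggregation feature imputation (the operation the
    highlight analyzes, not the paper's fairness fix): each node with missing
    features receives the mean of its observed neighbors' feature vectors."""
    X_imp = X.copy()
    for v in range(A.shape[0]):
        if observed[v]:
            continue
        nbrs = [u for u in np.nonzero(A[v])[0] if observed[u]]
        if nbrs:                      # leave features unchanged if no observed neighbor
            X_imp[v] = X[nbrs].mean(axis=0)
    return X_imp

# toy usage: node 2's missing features are imputed from observed neighbors 0 and 1
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0], [3.0, 2.0], [0.0, 0.0]])
observed = np.array([True, True, False])
print(mean_aggregation_impute(A, X, observed))   # node 2 -> [2.0, 1.0]
```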
2648 | Few-shot Image Generation Via Adaptation-aware Kernel Modulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We uncover some issues of state-of-the-art algorithms on few-shot image generation and propose a novel adaptation-aware kernel modulation method to improve few-shot image generation in different setups. |
Yunqing Zhao; Milad Abdollahzadeh; Keshigeyan Chandrasegaran; Ngai-Man (Man) Cheung; |
2649 | Discovering and Overcoming Limitations of Noise-engineered Data-free Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We identify the shift in the distribution of hidden-layer activations as the key limiting factor, which occurs when Gaussian noise is fed to the teacher network instead of the accustomed training data. We propose a simple solution to mitigate this shift and show that for vision tasks, such as classification, it is possible to achieve performance close to the teacher by just using samples randomly drawn from a Gaussian distribution. |
Piyush Vinod Raikwar; Deepak Mishra; |
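As context for the entry above, a minimal hedged sketch of a generic distillation step that feeds Gaussian noise to the teacher in place of training data; it illustrates the setting the paper analyzes, not the paper's proposed mitigation of the activation-distribution shift. Model and hyperparameter names are illustrative.

```python
import torch
import torch.nn.functional as F

def noise_distillation_step(teacher, student, optimizer, batch_size=64,
                            input_shape=(3, 32, 32), temperature=4.0):
    """Hedged sketch of one data-free distillation step on Gaussian noise:
    the student matches the teacher's temperature-softened outputs via KL."""
    x = torch.randn(batch_size, *input_shape)       # noise in place of training data
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    loss = F.kl_div(F.log_softmax(s_logits / temperature, dim=1),
                    F.softmax(t_logits / temperature, dim=1),
                    reduction="batchmean") * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# toy usage with tiny linear "networks" standing in for real models
teacher = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(student.parameters(), lr=0.1)
print(noise_distillation_step(teacher, student, opt))
```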
2650 | Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we generalize their smoothness-aware compression strategy to {\em arbitrary unbiased compression} operators, which also includes sparsification. |
Bokun Wang; Mher Safaryan; Peter Richtarik; |
2651 | Explaining Graph Neural Networks with Structure-Aware Cooperative Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a Graph Structure-aware eXplanation (GStarX) method to leverage the critical graph structure information to improve the explanation. |
Shichang Zhang; Neil Shah; Yozen Liu; Yizhou Sun; |
2652 | Towards A Holistic Assessment of Health Data Representations Under Realistic Dataset Shifts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we develop model diagnostic measures to detect potential pitfalls during deployment without assuming access to external data. |
Neeraj Wagh; Jionghao Wei; Samarth Rawal; Brent M Berry; Yogatheesan Varatharajah; |
2653 | Multiagent Q-learning with Sub-Team Coordination Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel value factorization framework, called multiagent Q-learning with sub-team coordination (QSCAN), to flexibly represent sub-team coordination while honoring the IGM condition. |
Wenhan Huang; Kai Li; Kun Shao; Tianze Zhou; Matthew Taylor; Jun Luo; Dongge Wang; Hangyu Mao; Jianye Hao; Jun Wang; Xiaotie Deng; |
2654 | MEMO: Test Time Robustness Via Adaptation and Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we aim to study and devise methods that make no assumptions about the model training process and are broadly applicable at test time. |
Marvin Zhang; Sergey Levine; Chelsea Finn; |
2655 | Periodic Graph Transformers for Crystal Material Property Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a transformer architecture, known as Matformer, for periodic graph representation learning. |
Keqiang Yan; Yi Liu; Yuchao Lin; Shuiwang Ji; |
2656 | ComENet: Towards Complete and Efficient Message Passing for 3D Molecular Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Notably, we propose incorporating the important rotation angles to fulfill global completeness. |
Limei Wang; Yi Liu; Yuchao Lin; Haoran Liu; Shuiwang Ji; |
2657 | RecZilla: Algorithm Selection for Recommender Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We conduct a large-scale study of recommender system algorithms, which motivates the design of RecZilla: an algorithm selection approach based on meta-learning. |
Duncan McElfresh; Sujay Khandagale; Jonathan Valverde; John Dickerson; Colin White; |
2658 | Pruning Has A Disparate Impact on Model Accuracy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper shows that pruning may create or exacerbate disparate impacts. |
Cuong Tran; Ferdinando Fioretto; Jung-Eun Kim; Rakshit Naidu; |
2659 | Structuring Uncertainty for Fine-Grained Sampling in Stochastic Segmentation Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: What is missing are structural insights into the uncertainty, which would be desirable for interpretability and systematic adjustments. In the context of state-of-the-art stochastic segmentation networks (SSNs), we solve this issue by dismantling the overall predicted uncertainty into smaller uncertainty components. |
Jakob Gawlikowski; Frank Nussbaum; Julia Niebling; |
2660 | Chroma-VAE: Mitigating Shortcut Learning with Generative Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, we propose Chroma-VAE, a two-pronged approach where a VAE classifier is initially trained to isolate the shortcut in a small latent subspace, allowing a secondary classifier to be trained on the complementary, shortcut-free latent subspace. |
Wanqian Yang; Polina Kirichenko; Micah Goldblum; Andrew Wilson; |
2661 | New Lower Bounds for Private Estimation and A Generalized Fingerprinting Lemma Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove new lower bounds for statistical estimation tasks under the constraint of $(\varepsilon,\delta)$-differential privacy. |
Gautam Kamath; Argyris Mouzakis; Vikrant Singhal; |
2662 | An Investigation Into Whitening Loss for Self-supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a framework with an informative indicator to analyze whitening loss, which provides a clue to demystify several interesting phenomena as well as a pivoting point connecting to other SSL methods. |
Xi Weng; Lei Huang; Lei Zhao; Rao Anwer; Salman Khan; Fahad Shahbaz Khan; |
2663 | Global Normalization for Streaming Speech Recognition in A Modular Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the Globally Normalized Autoregressive Transducer (GNAT) for addressing the label bias problem in streaming speech recognition. |
Ehsan Variani; Ke Wu; Michael D Riley; David Rybach; Matt Shannon; Cyril Allauzen; |
2664 | Zero-Shot Video Question Answering Via Frozen Bidirectional Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a framework based on frozen bidirectional masked language models to tackle zero-shot video question answering. |
Antoine Yang; Antoine Miech; Josef Sivic; Ivan Laptev; Cordelia Schmid; |
2665 | Singular Value Fine-tuning: Few-shot Segmentation Requires Few-parameters Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a solution to overcome the overfitting problem, leading to better model generalization on learning novel classes. |
Yanpeng Sun; Qiang Chen; Xiangyu He; Zechao Li; Jian Wang; Haocheng Feng; Junyu Han; Errui Ding; Jian Cheng; Jingdong Wang; |
2666 | Debiased Causal Tree: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we consider the estimation of conditional causal effects in the presence of unmeasured confounding using observational data and historical controls. |
Caizhi Tang; Huiyuan Wang; Xinyu Li; Qing Cui; Ya-Lin Zhang; Feng Zhu; Longfei Li; Jun Zhou; Linbo Jiang; |
2667 | Discovering Design Concepts for CAD Sketches Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a learning based approach that discovers the modular concepts by induction over raw sketches. |
Yuezhi Yang; Hao Pan; |
2668 | High-dimensional Additive Gaussian Processes Under Monotonicity Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce an additive Gaussian process (GP) framework accounting for monotonicity constraints and scalable to high dimensions. |
Andrés López-Lopera; Francois Bachoc; Olivier Roustant; |
2669 | Alignment-guided Temporal Attention for Video Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While the former is more efficient in computation, the latter often obtains better performance. In this paper, we attribute this to a dilemma between the sufficiency and the efficiency of interactions among various positions in different frames. |
Yizhou Zhao; Zhenyang Li; Xun Guo; Yan Lu; |
2670 | Bridging The Gap from Asymmetry Tricks to Decorrelation Principles in Non-contrastive Self-supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, it is not fully understood why the former performs equally well. In this paper, focusing on BYOL/SimSiam, which uses the stop-gradient and a predictor as asymmetric tricks, we present a novel interpretation of these tricks; they implicitly impose a constraint that encourages feature decorrelation similar to Barlow-Twins/VICReg. |
Kang-Jun Liu; Masanori Suganuma; Takayuki Okatani; |
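For readers unfamiliar with the asymmetric tricks mentioned above, a minimal hedged sketch (in PyTorch) of the commonly published SimSiam-style symmetrized loss with a stop-gradient and a predictor; it illustrates that standard loss, not anything specific to this paper's analysis. The random tensors stand in for encoder projections z and predictor outputs p.

```python
import torch
import torch.nn.functional as F

def simsiam_loss(p1, p2, z1, z2):
    """Hedged sketch of the standard asymmetric loss: negative cosine similarity
    between each branch's predictor output p and the *stop-gradient* of the
    other branch's projection z, symmetrized over the two augmented views."""
    loss = -(F.cosine_similarity(p1, z2.detach(), dim=-1).mean()
             + F.cosine_similarity(p2, z1.detach(), dim=-1).mean()) / 2
    return loss

# toy usage with random features standing in for encoder/predictor outputs
z1, z2 = torch.randn(8, 32, requires_grad=True), torch.randn(8, 32, requires_grad=True)
p1, p2 = torch.randn(8, 32, requires_grad=True), torch.randn(8, 32, requires_grad=True)
print(simsiam_loss(p1, p2, z1, z2).item())
```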
2671 | Large-Scale Differentiable Causal Discovery of Factor Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Differentiable Causal Discovery of Factor Graphs (DCD-FG), a scalable implementation of $f$-DAG constrained causal discovery for high-dimensional interventional data. |
Romain Lopez; Jan-Christian Huetter; Jonathan Pritchard; Aviv Regev; |
2672 | Between Stochastic and Adversarial Online Convex Optimization: Improved Regret Bounds Via Smoothness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we establish novel regret bounds for online convex optimization in a setting that interpolates between stochastic i.i.d.~and fully adversarial losses. |
Sarah Sachs; Hedi Hadiji; Tim van Erven; Cristóbal Guzmán; |
2673 | Causality-driven Hierarchical Structure Discovery for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a Causality-Driven Hierarchical Reinforcement Learning (CDHRL) framework, which leverages the causality in the environment as the guidance to discover the high-quality subgoal hierarchy. |
Shaohui Peng; Xing Hu; Rui Zhang; Ke Tang; Jiaming Guo; Qi Yi; Ruizhi Chen; Xishan Zhang; Zidong Du; Ling Li; Qi Guo; Yunji Chen; |
2674 | Near-Optimal Collaborative Learning in Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces a general multi-agent bandit model in which each agent is facing a finite set of arms and may communicate with other agents through a central controller in order to identify -in pure exploration- or play -in regret minimization- its optimal arm. |
Clémence Réda; Sattar Vakili; Emilie Kaufmann; |
2675 | Redundancy-Free Message Passing for Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate a redundancy-free message passing paradigm for enhancing expressive power of GNNs. |
Rongqin Chen; Shenghui Zhang; Leong Hou U; Ye Li; |
2676 | Text-driven Photorealistic 3D Stylization For Arbitrary Meshes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Technically, we propose to disentangle the appearance style as the spatially varying bidirectional reflectance distribution function, the local geometric variation, and the lighting condition, which are jointly optimized, via supervision of the CLIP loss, by a spherical Gaussians based differentiable renderer. |
Yongwei Chen; Chen Rui; Jiabao Lei; Yabin Zhang; Kui Jia; |
2677 | Your Transformer May Not Be As Powerful As You Expect Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we mathematically analyze the power of RPE-based Transformers regarding whether the model is capable of approximating any continuous sequence-to-sequence functions. |
Shengjie Luo; Shanda Li; Shuxin Zheng; Tie-Yan Liu; Liwei Wang; Di He; |
2678 | Parameter-free Regret in High Probability with Heavy Tails Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present new algorithms for online convex optimization over unbounded domains that obtain parameter-free regret in high-probability given access only to potentially heavy-tailed subgradient estimates. |
Jiujia Zhang; Ashok Cutkosky; |
2679 | High-Order Pooling for Graph Neural Networks with Tensor Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Tensorized Graph Neural Network (tGNN), a highly expressive GNN architecture modeling high-order non-linear node interactions based on symmetric tensor decomposition. |
Chenqing Hua; Guillaume Rabusseau; Jian Tang; |
2680 | Learning Superpoint Graph Cut for 3D Instance Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a learning-based superpoint graph cut method that explicitly learns the local geometric structures of the point cloud for instance segmentation. |
Le Hui; Linghua Tang; Yaqi Shen; Jin Xie; Jian Yang; |
2681 | Zero-Sum Stochastic Stackelberg Games Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study a form of dynamic zero-sum games, called stochastic games, with dependent strategy sets. |
Denizalp Goktas; Sadie Zhao; Amy Greenwald; |
2682 | Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we generalize the standard PTR and derive the first RDP bound for it. |
Jiachen T. Wang; Saeed Mahloujifar; Shouda Wang; Ruoxi Jia; Prateek Mittal; |
2683 | S-Prompts Learning with Pre-trained Transformers: An Occam’s Razor for Domain Incremental Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose one simple paradigm (named S-Prompting) and two concrete approaches to substantially reduce forgetting in one of the most typical continual learning scenarios, i.e., domain incremental learning (DIL). |
Yabin Wang; Zhiwu Huang; Xiaopeng Hong; |
2684 | Global Convergence of Federated Learning for Mixed Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper studies the problem of model training under Federated Learning when clients exhibit cluster structure. |
Lili Su; Jiaming Xu; Pengkun Yang; |
2685 | Active Learning Through A Covering Lens Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We formulate deep active learning as a probability coverage problem, and propose an active learning algorithm that improves the state-of-the-art in low budgets. |
Ofer Yehuda; Avihu Dekel; Guy Hacohen; Daphna Weinshall; |
2686 | WT-MVSNet: Window-based Transformers for Multi-view Stereo Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Window-based Transformers (WT) for local feature matching and global feature aggregation in multi-view stereo. |
Jinli Liao; Yikang Ding; Yoli Shavit; Dihe Huang; Shihao Ren; Jia Guo; Kai Zhang; Wensen Feng; |
2687 | Do Residual Neural Networks Discretize Neural Ordinary Differential Equations? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate whether the discrete dynamics defined by a ResNet are close to the continuous one of a Neural ODE. |
Michael Sander; Pierre Ablin; Gabriel Peyré; |
2688 | A Framework for Bilevel Optimization That Enables Stochastic and Global Variance Reduction Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, computing the gradient of the value function involves solving a linear system, which makes it difficult to derive unbiased stochastic estimates. To overcome this problem, we introduce a novel framework in which the solution of the inner problem, the solution of the linear system, and the main variable evolve at the same time. |
Mathieu Dagréou; Pierre Ablin; Samuel Vaiter; Thomas Moreau; |
2689 | Benchopt: Reproducible, Efficient and Collaborative Optimization Benchmarks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Benchopt, a collaborative framework to automate, publish and reproduce optimization benchmarks in machine learning across programming languages and hardware architectures. |
Thomas Moreau; Mathurin Massias; Alexandre Gramfort; Pierre Ablin; Pierre-Antoine Bannier; Benjamin Charlier; Mathieu Dagréou; Tom Dupre la Tour; Ghislain DURIF; Cassio F. Dantas; Quentin Klopfenstein; Johan Larsson; En Lai; Tanguy Lefort; Benoît Malézieux; Badr MOUFAD; Binh T. Nguyen; Alain Rakotomamonjy; Zaccharie Ramzi; Joseph Salmon; Samuel Vaiter; |
2690 | On The Learning Mechanisms in Physical Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Most studies focus on designing dynamics prediction networks and treating physical reasoning as a downstream task without investigating the questions above, taking for granted that the designed dynamics prediction would undoubtedly help the reasoning process. In this work, we take a closer look at this assumption, exploring this fundamental hypothesis by comparing two learning mechanisms: Learning from Dynamics (LfD) and Learning from Intuition (LfI). |
Shiqian Li; Kewen Wu; Chi Zhang; Yixin Zhu; |
2691 | Learnable Polyphase Sampling for Shift Invariant and Equivariant Convolutional Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose learnable polyphase sampling (LPS), a pair of learnable down/upsampling layers that enable truly shift-invariant and equivariant convolutional networks. |
Renan A. Rojas-Gomez; TeckYian Lim; Alex Schwing; Minh Do; Raymond A. Yeh; |
2692 | Improved Imaging By Invex Regularizers with Global Optima Guarantees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To mitigate the loss of guarantees for global optima, we propose to apply the concept of invexity and provide the first list of proved invex regularizers for improving image reconstruction. |
Samuel Pinilla; Tingting Mu; Neil Bourne; Jeyan Thiyagalingam; |
2693 | On The Importance of Gradient Norm in PAC-Bayesian Bounds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To avoid these assumptions, in this paper, we follow an alternative approach: we relax uniform bounds assumptions by using on-average bounded loss and on-average bounded gradient norm assumptions. |
Itai Gat; Yossi Adi; Alex Schwing; Tamir Hazan; |
2694 | On Margins and Generalisation for Voting Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our contributions add perspective to the debate on the “margins theory” proposed by Schapire et al. (1998) for the generalisation of ensemble classifiers. |
Felix Biggs; Valentina Zantedeschi; Benjamin Guedj; |
2695 | A Non-asymptotic Analysis of Non-parametric Temporal-Difference Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the convergence of the regularized non-parametric TD(0) algorithm, in both the independent and Markovian observation settings. |
Eloïse Berthier; Ziad Kobeissi; Francis Bach; |
2696 | MissDAG: Causal Discovery in The Presence of Missing Data with Continuous Additive Noise Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations. |
Erdun Gao; Ignavier Ng; Mingming Gong; Li Shen; Wei Huang; Tongliang Liu; Kun Zhang; Howard Bondell; |
2697 | A Causal Analysis of Harm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we formally define a qualitative notion of harm that uses causal models and is based on a well-known definition of actual causality (Halpern, 2016). |
Sander Beckers; Hana Chockler; Joseph Halpern; |
2698 | Characterizing Datapoints Via Second-Split Forgetting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on the second-split forgetting time (SSFT): the epoch (if any) after which an original training example is forgotten as the network is fine-tuned on a randomly held out partition of the data. |
Pratyush Maini; Saurabh Garg; Zachary Lipton; J. Zico Kolter; |
2699 | A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore a relaxed-smoothness assumption of the loss landscape which LSTM was shown to satisfy in previous works, and design a communication-efficient gradient clipping algorithm. |
Mingrui Liu; Zhenxun Zhuang; Yunwen Lei; Chunyang Liao; |
2700 | Spherical Channels for Modeling Atomic Interactions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the Spherical Channel Network (SCN) to model atomic energies and forces. |
Larry Zitnick; Abhishek Das; Adeesh Kolluru; Janice Lan; Muhammed Shuaibi; Anuroop Sriram; Zachary Ulissi; Brandon Wood; |
2701 | GraphQNTK: Quantum Neural Tangent Kernel for Graph Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, a quantum algorithm is proposed to approximately estimate the neural tangent kernel of the underlying graph neural network where a multi-head quantum attention mechanism is introduced to properly incorporate semantic similarity information of nodes into the model. |
Yehui Tang; Junchi Yan; |
2702 | Identifying Good Directions to Escape The NTK Regime and Efficiently Learn Low-degree Plus Sparse Polynomials Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We investigate which directions the parameters of a two-layer neural network can move in to escape the NTK regime, and show that a network trained with a regularized loss can learn low-degree plus sparse polynomials with optimal sample complexity. |
Eshaan Nichani; Yu Bai; Jason Lee; |
2703 | RORL: Robust Offline Reinforcement Learning Via Conservative Smoothing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To trade off robustness and conservatism, we propose Robust Offline Reinforcement Learning (RORL) with a novel conservative smoothing technique. |
Rui Yang; Chenjia Bai; Xiaoteng Ma; Zhaoran Wang; Chongjie Zhang; Lei Han; |
2704 | Optimal Scaling for Locally Balanced Proposals in Discrete Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we establish, for the first time, that the efficiency of M-H in discrete spaces can also be characterized by an asymptotic acceptance rate that is independent of the target distribution. |
Haoran Sun; Hanjun Dai; Dale Schuurmans; |
2705 | An Efficient Bayesian Data Augmentation Approach for Gradient-Bias Mitigation in Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that directly using minibatch stochastic optimization could lead to gradient bias. To remedy this, we propose an efficient Bayesian data augmentation technique to augment the contrastive loss into a decomposable one, where standard stochastic optimization can be directly applied without gradient bias. |
Changyou Chen; Jianyi Zhang; Yi Xu; Liqun Chen; Jiali Duan; Yiran Chen; Son Tran; Belinda Zeng; Trishul Chilimbi; |
2706 | Multimodal Contrastive Learning with LIMoE: The Language-Image Mixture of Experts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a multimodal, sparsely activated Mixture of Experts model, trained contrastively on image and text, propose new regularisation schemes to stabilize it, and show that it significantly outperforms dense baselines. |
Basil Mustafa; Carlos Riquelme; Joan Puigcerver; Rodolphe Jenatton; Neil Houlsby; |
2707 | [Re] Nondeterminism and Instability in Neural Network Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In the paper, the experiments were performed on two types of datasets (image classification and language modelling). |
Waqas Ahmed; Sheeba Samuel; |
2708 | Data-Efficient Augmentation for Training Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the most effective augmentation techniques become computationally prohibitive for even medium-sized datasets. To address this, we propose a rigorous technique to select subsets of data points that when augmented, closely capture the training dynamics of full data augmentation. |
Tian Yu Liu; Baharan Mirzasoleiman; |
2709 | Friendly Noise Against Adversarial Noise: A Powerful Defense Against Data Poisoning Attack Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we propose a simple but highly effective approach that, unlike existing methods, breaks various types of poisoning attacks with only a slight drop in generalization performance. |
Tian Yu Liu; Yu Yang; Baharan Mirzasoleiman; |
2710 | Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a solution which divides computation into two streams. |
Aniket Didolkar; Kshitij Gupta; Anirudh Goyal; Alex Lamb; Nan Rosemary Ke; Yoshua Bengio; |
2711 | Data Augmentation for Efficient Learning from Parametric Experts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a simple, yet powerful data-augmentation technique to enable data-efficient learning from parametric experts for reinforcement and imitation learning. |
Alexandre Galashov; Josh Merel; Nicolas Heess; |
2712 | The Effects of Regularization and Data Augmentation Are Class Dependent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we demonstrate that techniques such as DA or weight decay produce a model with a reduced complexity that is unfair across classes. |
Randall Balestriero; Leon Bottou; Yann LeCun; |
2713 | Meta-Learning Dynamics Forecasting Using Task Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a model-based meta-learning method called DyAd which can generalize across heterogeneous domains by partitioning them into different tasks. |
Rui Wang; Robin Walters; Rose Yu; |
2714 | A Fully Transformer-Based Object Detector with Fine-Coarse Crossing Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce the Fine-grained and Coarse-grained crossing representations for building an efficient Detection Transformer (FCDT). |
Zhishan Li; Ying Nie; Kai Han; Jianyuan Guo; Lei Xie; Yunhe Wang; |
2715 | Jump Self-attention: Capturing High-order Statistics in Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the movement of pieces in English draughts, we introduce the spectral convolutional technique to calculate JAT on the dot-product feature map. |
Haoyi Zhou; Siyang Xiao; Shanghang Zhang; Jieqi Peng; Shuai Zhang; Jianxin Li; |
2716 | Is A Modular Architecture Enough? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems. In doing so, we propose evaluation metrics that highlight the benefits of modularity, the regimes in which these benefits are substantial, as well as the sub-optimality of current end-to-end learned modular systems as opposed to their claimed potential. |
Sarthak Mittal; Yoshua Bengio; Guillaume Lajoie; |
2717 | Minimax Optimal Fixed-Budget Best Arm Identification in Linear Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of best arm identification in linear bandits in the fixed-budget setting. |
Junwen Yang; Vincent Tan; |
2718 | HUMUS-Net: Hybrid Unrolled Multi-scale Network Architecture for Accelerated MRI Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a hybrid architecture combining the efficiency of convolutions with the power of Transformers tailored for MRI reconstruction. |
Zalan Fabian; Berk Tinaz; Mahdi Soltanolkotabi; |
2719 | You Never Stop Dancing: Non-freezing Dance Generation Via Bank-constrained Manifold Projection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present two modules that can be plugged into the existing models to enable them to generate non-freezing and high fidelity dances. |
Jiangxin Sun; Chunyu Wang; Huang Hu; Hanjiang Lai; Zhi Jin; Jian-Fang Hu; |
2720 | Wasserstein $K$-means for Clustering Probability Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a Wasserstein distance-based formulation of the K-means for clustering probability distributions. |
Yubo Zhuang; Xiaohui Chen; Yun Yang; |
2721 | Vision Transformers Learn Patch Association Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This raises a central question: how do ViTs learn this pattern by solely minimizing their training loss using gradient-based methods from random initialization? We propose a structured classification dataset and a simplified ViT model to provide preliminary theoretical justification of this phenomenon. |
Samy Jelassi; Michael Sander; Yuanzhi Li; |
2722 | Towards Trustworthy Automatic Diagnosis Systems By Emulating Doctors’ Reasoning with Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To do so, we propose to model the evidence acquisition and automatic diagnosis tasks in a deep reinforcement learning framework by considering three essential aspects of doctors’ reasoning, namely using differential diagnosis with the exploration-confirmation approach while prioritizing severe pathologies. |
Arsene Fansi Tchango; Zhi Wen; Gaetan Marceau Caron; Joumana Ghosn; Rishab Goel; Julien Martel; |
2723 | Structured Energy Network As A Loss Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Structured Energy As Loss (SEAL) to take advantage of the expressivity of energy networks without incurring the high inference cost. |
Jay Yoon Lee; Dhruvesh Patel; Purujit Goyal; Wenlong Zhao; Zhiyang Xu; Andrew McCallum; |
2724 | Learning Concept Credible Models for Mitigating Shortcuts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose two approaches for mitigating shortcuts that incorporate domain knowledge, while accounting for potentially important yet unknown concepts. |
Jiaxuan Wang; Sarah Jabbour; Maggie Makar; Michael Sjoding; Jenna Wiens; |
2725 | Learn to Explain: Multimodal Reasoning Via Thought Chains for Science Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we present Science Question Answering (SQA), a new benchmark that consists of ~21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations. |
Pan Lu; Swaroop Mishra; Tanglin Xia; Liang Qiu; Kai-Wei Chang; Song-Chun Zhu; Oyvind Tafjord; Peter Clark; Ashwin Kalyan; |
2726 | Tractable Function-Space Variational Inference in Bayesian Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The paper presents a scalable function-space variational inference method that leads to reliable predictive uncertainty estimates. |
Tim G. J. Rudner; Zonghao Chen; Yee Whye Teh; Yarin Gal; |
2727 | TVLT: Textless Vision-Language Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present the Textless Vision-Language Transformer (TVLT), a transformer model that takes raw audio and visual inputs for vision-and-language representation learning with minimal modality-specific design, and does not use extra text-specific modules such as tokenization or automatic speech recognition (ASR). |
Zineng Tang; Jaemin Cho; Yixin Nie; Mohit Bansal; |
2728 | Improved Feature Distillation Via Projector Ensemble Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes an improved feature distillation method via projector ensemble. |
Yudong Chen; Sen Wang; Jiajun Liu; Xuwei Xu; Frank de Hoog; Zi Huang; |
2729 | Learning Neural Acoustic Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While recent advances in learned implicit functions have led to increasingly higher quality representations of the visual world, there have not been commensurate advances in learning spatial auditory representations. To address this gap, we introduce Neural Acoustic Fields (NAFs), an implicit representation that captures how sounds propagate in a physical scene. |
Andrew Luo; Yilun Du; Michael Tarr; Josh Tenenbaum; Antonio Torralba; Chuang Gan; |
2730 | Do We Really Need A Learnable Classifier at The End of Deep Neural Network? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF and fixed during training. |
Yibo Yang; Shixiang Chen; Xiangtai Li; Liang Xie; Zhouchen Lin; Dacheng Tao; |
2731 | Variable Experience Rollout: Training Robust Skill Policies for Rearrangement Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Variable Experience Rollout (VER), a technique for efficiently scaling batched on-policy reinforcement learning in heterogeneous environments (where different environments take vastly different times to generate rollouts) to many GPUs residing on, potentially, many machines. |
Erik Wijmans; Irfan Essa; Dhruv Batra; |
2732 | Debugging and Explaining Metric Learning Approaches: An Influence Function Based Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we design an empirical influence function (EIF), a debugging and explaining technique for the generalization errors of state-of-the-art metric learning models. |
Ruofan Liu; Yun Lin; XIANGLIN YANG; Jin Song Dong; |
2733 | Debiasing Graph Neural Networks Via Learning Disentangled Causal Substructure Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By analyzing this problem in a causal view, we find that disentangling and decorrelating the causal and bias latent variables from the biased graphs are both crucial for debiasing. Inspired by this, we propose a general disentangled GNN framework to learn the causal substructure and bias substructure, respectively. |
Shaohua Fan; Xiao Wang; Yanhu Mo; Chuan Shi; Jian Tang; |
2734 | DHRL: A Graph-Based Approach for Long-Horizon and Sparse Hierarchical Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The extended components, i.e., goal space and length of episodes, impose a burden on one or both of the high-level and low-level policies, since both levels share the total horizon of the episode. In this paper, we present Decoupling Horizons Using a Graph in Hierarchical Reinforcement Learning (DHRL), a method that alleviates this problem by decoupling the horizons of high-level and low-level policies and bridging the gap between the lengths of both horizons using a graph. |
Seungjae Lee; Jigang Kim; Inkyu Jang; H. Jin Kim; |
2735 | Learning from Label Proportions By Learning with Label Noise Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose an approach to LLP based on a reduction to learning with label noise, using the forward correction (FC) loss of Patrini et al. (2017). |
Jianxin Zhang; Yutong Wang; Clay Scott; |
2736 | [Re] AdaBelief Optimizer: Adapting Stepsizes By The Belief in Observed Gradients Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We perform several analyses targeted toward AdaBelief’s claims and find that the convergence speed and training stability of AdaBelief is comparable to that of adaptive gradient optimizers. |
Anirudh Buvanesh; Madhur Panwar; |
2737 | BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a simple first-order BO algorithm that depends only on first-order gradient information, requires no implicit differentiation, and is practical and efficient for large-scale non-convex functions in deep learning. |
Mao Ye; Bo Liu; Stephen Wright; Peter Stone; Qiang Liu; |
2738 | GAPX: Generalized Autoregressive Paraphrase-Identification X Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While much progress has been made in the field, the performance of many state-of-the-art models often suffers from distribution shift during inference time. We verify that a major source of this performance drop comes from biases introduced by negative examples. |
Yifei Zhou; Renyu Li; Hayden Housen; Ser Nam Lim; |
2739 | Toward A Realistic Model of Speech Processing in The Brain with Self-supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Focusing on the issue of speech processing, we here hypothesize that self-supervised algorithms trained on the raw waveform constitute a promising candidate. |
Juliette MILLET; Charlotte Caucheteux; Pierre Orhan; Yves Boubenec; Alexandre Gramfort; Ewan Dunbar; Christophe Pallier; Jean-Remi King; |
2740 | Variational Context Adjustment for Temporal Event Prediction Under Distribution Shifts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We handle temporal distribution shift in sequential event prediction tasks. |
Chenxiao Yang; Qitian Wu; Qingsong Wen; Zhiqiang Zhou; Liang Sun; Junchi Yan; |
2741 | Distributed Inverse Constrained Reinforcement Learning for Multi-agent Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper considers the problem of recovering the policies of multiple interacting experts by estimating their reward functions and constraints where the demonstration data of the experts is distributed to a group of learners. |
Shicheng Liu; Minghui Zhu; |
2742 | Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection Without Clean Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This new backdoor threat hence incurs overlooked risks for many data-hungry or high-stakes applications. This paper reports the first pilot study on this new setting by investigating the contrasting channel-level statistics between backdoor trigger and clean features, and consequently, how the former can be differentiated by progressive channel shuffling. |
Ruisi Cai; Zhenyu Zhang; Tianlong Chen; Xiaohan Chen; Zhangyang Wang; |
2743 | Squeezeformer: An Efficient Transformer for Automatic Speech Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: After reexamining the design choices for both the macro and micro-architecture of Conformer, we propose Squeezeformer which consistently outperforms the state-of-the-art ASR models under the same training schemes. |
Amir Gholami; Kurt Keutzer; Sehoon Kim; Nicholas Lee; Michael Mahoney; Jitendra Malik; Karttikeya Mangalam; Albert Shaw; |
2744 | On The Generalization of Learning Algorithms That Do Not Converge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To reduce this discrepancy between theory and practice, we analyze the generalization of algorithms when the weights only converge in distribution. Our main contribution is to propose a notion of “statistical algorithmic stability” (SAS) that extends classical algorithmic stability to non-convergent algorithms and to study its connection to generalization. |
Nisha Chandramoorthy; Andreas Loukas; Khashayar Gatmiry; Stefanie Jegelka; |
2745 | Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper considers the Pointer Value Retrieval (PVR) benchmark introduced in [ZRKB21], where a ‘reasoning’ function acts on a string of digits to produce the label. |
Emmanuel Abbe; Samy Bengio; Elisabetta Cornacchia; Jon Kleinberg; Aryo Lotfi; Maithra Raghu; Chiyuan Zhang; |
2746 | Hardness of Noise-Free Learning for Two-Hidden-Layer Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give superpolynomial statistical query (SQ) lower bounds for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model. |
Sitan Chen; Aravind Gollakota; Adam Klivans; Raghu Meka; |
2747 | Less-forgetting Multi-lingual Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This forgetting phenomenon degrades the zero-shot performance of MLF, which remains under-explored. To fill this gap, this paper proposes a multi-lingual fine-tuning method, dubbed Less-forgetting Multi-lingual Fine-tuning (LF-MLF). |
Yuren Mao; Yaobo Liang; Nan Duan; Haobo Wang; Kai Wang; Lu Chen; Yunjun Gao; |
2748 | Lossless Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, building upon recent research advances in the neural tangent kernel (NTK) and random matrix theory, we provide a novel compression approach to wide and fully-connected deep neural nets. |
Lingyu Gu; Yongqi Du; Yuan Zhang; Di Xie; Shiliang Pu; Robert Qiu; Zhenyu Liao; |
2749 | Relational Reasoning Via Set Transformers: Provable Efficiency and Applications to MARL Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we verify that the transformer implements complex relational reasoning, and we propose and analyze model-free and model-based offline MARL algorithms with the transformer approximators. |
Fengzhuo Zhang; Boyi Liu; KAIXIN WANG; Vincent Tan; Zhuoran Yang; Zhaoran Wang; |
2750 | Self-Aware Personalized Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by Bayesian hierarchical models, we develop a self-aware personalized FL method where each client can automatically balance the training of its local personal model and the global model that implicitly contributes to other clients’ training. |
Huili Chen; Jie Ding; Eric W Tramel; Shuang Wu; Anit Kumar Sahu; Salman Avestimehr; Tao Zhang; |
2751 | Decentralized, Communication- and Coordination-free Learning in Structured Matching Markets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a class of decentralized, communication- and coordination-free algorithms that agents can use to reach their stable match in structured matching markets. |
Chinmay Maheshwari; Eric Mazumdar; Shankar Sastry; |
2752 | Understanding Non-linearity in Graph Neural Networks from The Bayesian-Inference Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we resort to Bayesian learning to give an in-depth investigation of the functions of non-linearity in GNNs for node classification tasks. |
Rongzhe Wei; Haoteng YIN; Junteng Jia; Austin Benson; Pan Li; |
2753 | On The Tradeoff Between Robustness and Fairness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Theoretically, we analyze the class-wise performance of adversarially trained linear models with mixture Gaussian distribution. |
Xinsong Ma; Zekai Wang; Weiwei Liu; |
2754 | Online Convex Optimization with Hard Constraints: Towards The Best of Two Worlds and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a RECtified Online Optimization algorithm (RECOO) and consider two settings: fixed constraints and adversarial constraints. |
Hengquan Guo; Xin Liu; Honghao Wei; Lei Ying; |
2755 | Staircase Attention for Recurrent Processing of Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we introduce a novel attention procedure called staircase attention that, unlike self-attention, operates across the sequence (in time), recurrently processing the input by adding another step of processing. |
Da JU; Stephen Roller; Sainbayar Sukhbaatar; Jason E Weston; |
2756 | Augmenting Online Algorithms with $\varepsilon$-Accurate Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by ML models that give a confidence parameter for their predictions, we study online algorithms with predictions that are $\epsilon$-accurate: namely, each prediction is correct with probability (at least) $\epsilon$, but can be arbitrarily inaccurate with the remaining probability. |
Anupam Gupta; Debmalya Panigrahi; Bernardo Subercaseaux; Kevin Sun; |
2757 | Seeing The Forest and The Tree: Building Representations of Both Individual and Collective Dynamics with Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we present a novel transformer architecture for learning from time-varying data by building descriptions of both the individual as well as the collective population dynamics. |
Ran Liu; Mehdi Azabou; Max Dabagia; Jingyun Xiao; Eva Dyer; |
2758 | Matrix Multiplicative Weights Updates in Quantum Zero-Sum Games: Conservation Laws & Recurrence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we focus on learning in quantum zero-sum games under Matrix Multiplicative Weights Update (a generalization of the multiplicative weights update method) and its continuous analogue, Quantum Replicator Dynamics. |
Rahul Jain; Georgios Piliouras; Ryann Sim; |
2759 | Verification and Search Algorithms for Causal DAGs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study two problems related to recovering causal graphs from interventional data: (i) verification, where the task is to check if a purported causal graph is correct, and (ii) search, where the task is to recover the correct causal graph. |
Davin Choo; Kirankumar Shiragur; Arnab Bhattacharyya; |
2760 | Understanding The Failure of Batch Normalization for Transformers in NLP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To suppress the explosion of TID, we propose Regularized BN (RBN) that adds a simple regularization term to narrow the gap between batch statistics and population statistics of BN. |
Jiaxi Wang; Ji Wu; Lei Huang; |
2761 | WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop WebShop – a simulated e-commerce website environment with 1.18 million real-world products and 12,087 crowd-sourced text instructions. |
Shunyu Yao; Howard Chen; John Yang; Karthik Narasimhan; |
2762 | Faster and Scalable Algorithms for Densest Subgraph and Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: We study the densest subgraph problem (DSG) and the densest subgraph local decomposition problem (DSG-LD) in undirected graphs. We also consider supermodular generalizations of … |
Elfarouk Harb; Kent Quanrud; Chandra Chekuri; |
2763 | Precise Regret Bounds for Log-loss Via A Truncated Bayesian Algorithm Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We give optimal lower and upper bounds of online regression under logarithmic loss via a novel smooth truncated Bayesian algorithm. |
Changlong Wu; Mohsen Heidari; Ananth Grama; Wojciech Szpankowski; |
2764 | Unsupervised Visual Representation Learning Via Mutual Information Regularized Assignment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes Mutual Information Regularized Assignment (MIRA), a pseudo-labeling algorithm for unsupervised representation learning inspired by information maximization. |
Dong Hoon Lee; Sungik Choi; Hyunwoo Kim; Sae-Young Chung; |
2765 | Coresets for Wasserstein Distributionally Robust Optimization Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a unified framework to construct the coreset for Wasserstein Distributionally Robust Optimization Problems. |
Ruomin Huang; Jiawei Huang; Wenjie Liu; Hu Ding; |
2766 | GenSDF: Two-Stage Learning of Generalizable Signed Distance Functions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a two-stage semi-supervised meta-learning approach that transfers shape priors from labeled to unlabeled data to reconstruct unseen object categories. |
Gene Chou; Ilya Chugunov; Felix Heide; |
2767 | Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we augment Goal-Conditioned RL (GCRL) with Causal Graph (CG), a structure built upon the relation between objects and events. |
Wenhao Ding; Haohong Lin; Bo Li; DING ZHAO; |
2768 | A Statistical Online Inference Approach in Averaged Stochastic Approximation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we propose a general framework to perform statistical online inference in a class of constant step size stochastic approximation (SA) problems, including the well-known stochastic gradient descent (SGD) and Q-learning. |
Chuhan Xie; Zhihua Zhang; |
2769 | On Robust Multiclass Learnability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work analyzes the robust learning problem in the multiclass setting. |
Jingyuan Xu; Weiwei Liu; |
2770 | Constrained Update Projection Approach to Safe Policy Optimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we propose CUP, a novel policy optimization method based on the Constrained Update Projection framework that enjoys a rigorous safety guarantee. |
Long Yang; Jiaming Ji; Juntao Dai; Linrui Zhang; Binbin Zhou; Pengfei Li; Yaodong Yang; Gang Pan; |
2771 | Training and Inference on Any-Order Autoregressive Models The Right Way Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: But, in spite of their success, in this paper we identify significant improvements to be made to previous formulations of AO-ARMs. |
Andy Shih; Dorsa Sadigh; Stefano Ermon; |
2772 | MorphTE: Injecting Morphology in Tensorized Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Combining the powerful compression capability of tensor products, we propose a word embedding compression method with morphological augmentation, Morphologically-enhanced Tensorized Embeddings (MorphTE). |
Guobing Gan; Peng Zhang; Sunzhu Li; Xiuqing Lu; Benyou Wang; |
2773 | Zeroth-Order Negative Curvature Finding: Escaping Saddle Points Without Gradients Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although a variety of works have been proposed, the majority of them require either second or first-order information, and only a few of them have exploited zeroth-order methods, particularly the technique of negative curvature finding with zeroth-order methods which has been proven to be the most efficient method for escaping saddle points. To fill this gap, in this paper, we propose two zeroth-order negative curvature finding frameworks that can replace Hessian-vector product computations without increasing the iteration complexity. |
Hualin Zhang; Huan Xiong; Bin Gu; |
2774 | Task-Free Continual Learning Via Online Discrepancy Distance Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper develops a new theoretical analysis framework that derives generalization bounds based on the discrepancy distance between the visited samples and the entire information made available for training the model. |
Fei Ye; Adrian G. Bors; |
2775 | Meta-Learning with Self-Improving Momentum Target Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, obtaining a target model for each task can be highly expensive, especially when the number of tasks for meta-learning is large. To tackle this issue, we propose a simple yet effective method, coined Self-improving Momentum Target (SiMT). |
Jihoon Tack; Jongjin Park; Hankook Lee; Jaeho Lee; Jinwoo Shin; |
2776 | Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Machine learning models, however, perform notoriously poorly on data outside their training domain, and so we argue that when ensembling models the weightings for individual instances must reflect their respective domains; in other words, models that are more likely to have seen information on that instance should have more attention paid to them. We introduce a method for such an instance-wise ensembling of models, including a novel representation learning step for handling sparse high-dimensional domains. |
Alex Chan; Mihaela van der Schaar; |
2777 | Transfer Learning on Heterogeneous Feature Spaces for Treatment Effects Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This heterogeneous transfer learning problem for CATE estimation is ubiquitous in areas such as healthcare where we may wish to evaluate the effectiveness of a treatment for a new patient population for which different clinical covariates and limited data are available. In this paper, we address this problem by introducing several building blocks that use representation learning to handle the heterogeneous feature spaces and a flexible multi-task architecture with shared and private layers to transfer information between potential outcome functions across domains. |
Ioana Bica; Mihaela van der Schaar; |
2778 | Divide and Contrast: Source-free Domain Adaptation Via Adaptive Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present Divide and Contrast (DaC), a new paradigm for SFUDA that strives to connect the good ends of both worlds while bypassing their limitations. |
Ziyi Zhang; Weikai Chen; Hui Cheng; Zhen Li; Siyuan Li; Liang Lin; Guanbin Li; |
2779 | Graph Coloring Via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: State-of-the-art neural approaches to solve this NP-hard problem do not adequately model relations among the reads that are important for deconvolving the input signal. We address this problem by developing a new method, called NeurHap, that combines graph representation learning with combinatorial optimization. |
Hansheng Xue; Vaibhav Rajan; Yu Lin; |
2780 | Assistive Teaching of Motor Control Tasks to Humans Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an AI-assisted teaching algorithm that leverages skill discovery methods from reinforcement learning (RL) literature to (i) break down any motor control task into teachable skills, (ii) construct novel drill sequences, and (iii) individualize curricula to students with different capabilities. |
Megha Srivastava; Erdem Biyik; Suvir Mirchandani; Noah Goodman; Dorsa Sadigh; |
2781 | Uni[MASK]: Unified Inference in Sequential Decision Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the UniMASK framework, which provides a unified way to specify models which can be trained on many different sequential decision making tasks. |
Micah Carroll; Orr Paradise; Jessy Lin; Raluca Georgescu; Mingfei Sun; David Bignell; Stephanie Milani; Katja Hofmann; Matthew Hausknecht; Anca Dragan; Sam Devlin; |
2782 | The Missing Invariance Principle Found — The Reciprocal Twin of Invariant Risk Minimization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we identify a fundamental flaw of IRM formulation that causes the failure. We then introduce a complementary notion of invariance, MRI, based on conserving the label-conditioned feature expectation $\mathbb{E}_e[f(x)|y]$ across environments, which is free of this flaw. |
Dongsung Huh; Avinash Baidya; |
2783 | Capturing Graphs with Hypo-Elliptic Diffusions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The distribution of these random walks evolves according to a diffusion equation defined using the graph Laplacian. We extend this approach by leveraging classic mathematical results about hypo-elliptic diffusions. |
Csaba Toth; Darrick Lee; Celia Hacker; Harald Oberhauser; |
2784 | Private and Communication-Efficient Algorithms for Entropy Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For a joint distribution on several variables whose conditional independence graph is a tree, we describe algorithms for estimating Shannon entropy that require a number of samples that is linear in the number of variables, compared to the quadratic sample complexity of prior work. |
Gecia Bravo-Hermsdorff; Róbert Busa-Fekete; Mohammad Ghavamzadeh; Andres Munoz Medina; Umar Syed; |
2785 | Sequential Hypothesis Tests of Multinomial Count Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop always-valid sequential statistical procedures for several important applications in the experimentation space dealing with count data, illustrated with real A/B test data from Netflix. |
Michael Lindon; Alan Malek; |
2786 | The Least-control Principle for Learning at Equilibrium Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As special cases, they include models of great current interest in both neuroscience and machine learning, such as equilibrium recurrent neural networks, deep equilibrium models, or meta-learning. Here, we present a new principle for learning such systems with a temporally- and spatially-local rule. |
Alexander Meulemans; Nicolas Zucchet; Seijin Kobayashi; Johannes von Oswald; João Sacramento; |
2787 | Reinforcement Learning in A Birth and Death Process: Breaking The Dependence on The State Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we revisit the regret of undiscounted reinforcement learning in MDPs with a birth and death structure. |
Jonatha Anselmi; Bruno Gaujal; Louis-Sébastien Rebuffi; |
2788 | Ordered Subgraph Aggregation Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Further, current approaches either use all subgraphs of a given size, sample them uniformly at random, or use hand-crafted heuristics to select them, oblivious to the given data distribution. Here, we offer a unified way to study such architectures by introducing a theoretical framework and extending the known expressivity results of subgraph-enhanced graph neural networks. |
Chendi Qian; Gaurav Rattan; Floris Geerts; Mathias Niepert; Christopher Morris; |
2789 | Adapting to Domain Shift By Meta-Distillation from Mixture-of-Experts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel framework for unsupervised test-time adaptation, which is formulated as a knowledge distillation process to address domain shift. |
Tao Zhong; Zhixiang Chi; Li Gu; Yang Wang; Yuanhao Yu; Jin Tang; |
2790 | Invariant and Transportable Representations for Anti-Causal Domain Shifts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study representation learning under a particular notion of domain shift that both respects causal invariance and that naturally handles the “anti-causal” structure. |
Yibo Jiang; Victor Veitch; |
2791 | Model Preserving Compression for Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We compress neural networks while preserving the original network’s decisions and structure. |
Jerry Chee; Megan Renz; Anil Damle; Christopher De Sa; |
2792 | Non-identifiability and The Blessings of Misspecification in Models of Molecular Fitness Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We prove that fitness is not identifiable from observational sequence data alone, placing fundamental limits on our ability to disentangle fitness landscapes from phylogenetic history. |
Eli Weinstein; Alan Amin; Jonathan Frazer; Debora Marks; |
2793 | Meta-learning for Feature Selection with Hilbert-Schmidt Independence Criterion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a meta-learning method for feature selection that can select relevant features given a small number of labeled instances. |
Atsutoshi Kumagai; Tomoharu Iwata; Yasutoshi Ida; Yasuhiro Fujiwara; |
2794 | Generalization Gap in Amortized Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We point out the two generalization gaps that can affect the generalization ability of VAEs and show that the over-fitting phenomenon is usually dominated by the amortized inference network. Based on this observation, we propose a new training objective, inspired by the classic wake-sleep algorithm, to improve the generalization properties of amortized inference. |
Mingtian Zhang; Peter Hayes; David Barber; |
2795 | On The Safety of Interpretable Machine Learning: A Maximum Deviation Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present case studies, including one on mortgage approval, to illustrate our methods and the insights about models that may be obtained from deviation maximization. |
Dennis Wei; Rahul Nair; Amit Dhurandhar; Kush Varshney; Elizabeth Daly; Moninder Singh; |
2796 | Graph Agnostic Estimators with Staggered Rollout Designs Under Network Interference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new estimator under a staggered rollout randomized design for estimating the total treatment effect under network interference without knowledge of the underlying network. |
Mayleen Cortez; Matthew Eichhorn; Christina Yu; |
2797 | Kernel Interpolation with Sparse Grids Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unfortunately, SKI scales poorly in the dimension of the input points, since the dense grid size grows exponentially with the dimension. To mitigate this issue, we propose the use of sparse grids within the SKI framework. |
Mohit Yadav; Daniel Sheldon; Cameron Musco; |
2798 | Value Function Decomposition for Iterative Design of Reinforcement Learning Agents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show how to integrate \textit{value decomposition} into a broad class of actor-critic algorithms and use it to assist in the iterative agent-design process. |
James MacGlashan; Evan W Archer; Alisa Devlic; Takuma Seno; Craig Sherstan; Peter Wurman; Peter Stone; |
2799 | Composite Feature Selection Using Deep Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a novel deep learning architecture that uses an ensemble of feature selection models to find predictive groups, without requiring candidate groups to be provided. |
Alexander Norcliffe; Fergus Imrie; Pietro Lió; Mihaela van der Schaar; |
2800 | Explain My Surprise: Learning Efficient Long-Term Memory By Predicting Uncertain Outcomes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose MemUP, a new training method that allows learning long-term dependencies without backpropagating gradients through the whole sequence at once. |
Artyom Sorokin; Nazar Buzun; Leonid Pugachev; Mikhail Burtsev; |
2801 | Uplifting Bandits Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new multi-armed bandit model where the reward is a sum of multiple random variables, and each action only alters the distributions of some of these variables. |
Yu-Guan Hsieh; Shiva Kasiviswanathan; Branislav Kveton; |
2802 | Off-Policy Evaluation with Policy-Dependent Optimization Response Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop a new framework for off-policy evaluation with policy-dependent linear optimization responses: causal outcomes introduce stochasticity in objective function coefficients. |
Wenshuo Guo; Michael Jordan; Angela Zhou; |
2803 | Safe Opponent-Exploitation Subgame Refinement Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the recent success of real-time search algorithms in developing superhuman AI, we investigate the dilemma of safety and opponent exploitation and present a novel real-time search framework, called Safe Exploitation Search (SES), which continuously interpolates between the two extremes of online strategy refinement. |
Mingyang Liu; Chengjie Wu; Qihan Liu; Yansen Jing; Jun Yang; Pingzhong Tang; Chongjie Zhang; |
2804 | Bezier Gaussian Processes for Tall and Wide Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a kernel that allows the number of summarising variables to grow exponentially with the number of input features, but requires only linear cost in both number of observations and input features. |
Martin Jørgensen; Michael A Osborne; |
2805 | AgraSSt: Approximate Graph Stein Statistics for Interpretable Assessment of Implicit Graph Generators Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose and analyse a novel statistical procedure, coined AgraSSt, to assess the quality of graph generators which may not be available in explicit forms. |
Wenkai Xu; Gesine D Reinert; |
2806 | SAPD+: An Accelerated Stochastic Method for Nonconvex-Concave Minimax Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new stochastic method SAPD+ for solving nonconvex-concave minimax problems of the form $\min_x\max_y\mathcal{L}(x,y)=f(x)+\Phi(x,y)-g(y)$, where $f,g$ are closed convex and $\Phi(x,y)$ is a smooth function that is weakly convex in $x$, (strongly) concave in $y$. |
Xuan Zhang; Necdet Serhat Aybat; Mert Gurbuzbalaban; |
2807 | Biological Learning of Irreducible Representations of Commuting Transformations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we propose bio-inspired mechanisms for learning these irreducible representations. |
Alexander Genkin; David Lipshutz; Siavash Golkar; Tiberiu Tesileanu; Dmitri Chklovskii; |
2808 | Group GAN Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel framework that takes time series’ common origin into account and favors the preservation of inter-channel relationships. |
Ali Seyfi; Jean-Francois Rajotte; Raymond Ng; |
2809 | A Closer Look at Prototype Classifier for Few-shot Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we analyze how a prototype classifier works equally well without training a new linear classifier or meta-learning. |
Mingcheng Hou; Issei Sato; |
2810 | Unsupervised Cross-Task Generalization Via Retrieval Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Humans can perform unseen tasks by recalling relevant skills that are acquired previously and then generalizing them to the target tasks, even if there is no supervision at all. In this paper, we aim to improve such cross-task generalization ability of massive multi-task language models such as T0 (Sanh et al., 2021) in an unsupervised setting. |
Bill Yuchen Lin; Kangmin Tan; Chris Miller; Beiwen Tian; Xiang Ren; |
2811 | Elucidating The Design Space of Diffusion-Based Generative Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices. |
Tero Karras; Miika Aittala; Timo Aila; Samuli Laine; |
2812 | Trading Off Image Quality for Robustness Is Not Necessary with Deterministic Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an adversarially robust deterministic autoencoder with superior performance in terms of both generation and robustness of the learned representations. |
Amrutha Saseendran; Kathrin Skubch; Margret Keuper; |
2813 | Online PAC-Bayes Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This somewhat departs from many contemporary learning problems where data streams are collected and the algorithms must dynamically adjust. We prove new PAC-Bayesian bounds in this online learning framework, leveraging an updated definition of regret, and we revisit classical PAC-Bayesian results with a batch-to-online conversion, extending their remit to the case of dependent data. |
Maxime Haddouche; Benjamin Guedj; |
2814 | Temporal Effective Batch Normalization in Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose an effective normalization method called temporal effective batch normalization (TEBN). |
Chaoteng Duan; Jianhao Ding; Shiyan Chen; Zhaofei Yu; Tiejun Huang; |
2815 | Dynamic Inverse Reinforcement Learning for Characterizing Animal Behavior Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we introduce ‘DIRL’, a novel IRL framework that allows for time-varying intrinsic rewards. |
Zoe Ashwood; Aditi Jha; Jonathan Pillow; |
2816 | Cost-Sensitive Self-Training for Optimizing Non-Decomposable Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce the Cost-Sensitive Self-Training (CSST) framework which generalizes the self-training based methods for optimizing non-decomposable metrics. |
Harsh Rangwani; shrinivas ramasubramanian; Sho Takemori; Kato Takashi; Yuhei Umeda; Venkatesh Babu R; |
2817 | Censored Quantile Regression Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our major contribution is a novel algorithm that simultaneously optimises a grid of quantiles output by a single NN. |
Tim Pearce; Jong-Hyeon Jeong; yichen jia; Jun Zhu; |
2818 | Dataset Inference for Self-Supervised Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new dataset inference defense, which uses the private training set of the victim encoder model to attribute its ownership in the event of stealing. |
Adam Dziedzic; Haonan Duan; Muhammad Ahmad Kaleem; Nikita Dhawan; Jonas Guan; Yannis Cattan; Franziska Boenisch; Nicolas Papernot; |
2819 | Dynamic Graph Neural Networks Under Spatio-Temporal Distribution Shift Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to handle spatio-temporal distribution shifts in dynamic graphs by discovering and utilizing {\it invariant patterns}, i.e., structures and features whose predictive abilities are stable across distribution shifts, which faces two key challenges: 1) How to discover the complex variant and invariant spatio-temporal patterns in dynamic graphs, which involve both time-varying graph structures and node features. |
Zeyang Zhang; Xin Wang; Ziwei Zhang; Haoyang Li; Zhou Qin; Wenwu Zhu; |
2820 | ComGAN: Unsupervised Disentanglement and Segmentation Via Image Composition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose ComGAN, a simple unsupervised generative model, which simultaneously generates realistic images and high semantic masks under an adversarial loss and a binary regularization. |
Rui Ding; Kehua Guo; Xiangyuan Zhu; Zheng Wu; Liwei Wang; |
2821 | Gaussian Copula Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a Gaussian copula embedding model to learn latent vector representations of items in a heterogeneous data setting. |
Chien Lu; Jaakko Peltonen; |
2822 | PatchComplete: Learning Multi-Resolution Patch Priors for 3D Shape Completion on Unseen Categories Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thus, we propose PatchComplete, which learns effective shape priors based on multi-resolution local patches, which are often more general than full shapes (e.g., chairs and tables often both share legs) and thus enable geometric reasoning about unseen class categories. |
Yuchen Rao; Yinyu Nie; Angela Dai; |
2823 | BR-SNIS: Bias Reduced Self-Normalized Importance Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a new method BR-SNIS whose complexity is essentially the same as SNIS and which significantly reduces bias. |
Gabriel Cardoso; Sergey Samsonov; Achille Thin; Eric Moulines; Jimmy Olsson; |
2824 | Shape, Light, and Material Decomposition from Images Using Monte Carlo Rendering and Denoising Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an efficient method to jointly reconstruct geometry (explicit triangle meshes), materials, and lighting, which substantially improves material and light separation compared to previous work. |
Jon Hasselgren; Nikolai Hofmann; Jacob Munkberg; |
2825 | TTOpt: A Maximum Volume Quantized Tensor Train-based Optimization and Its Application to Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a novel procedure for optimization based on the combination of efficient quantized tensor train representation and a generalized maximum matrix volume principle. We demonstrate the applicability of the new Tensor Train Optimizer (TTOpt) method for various tasks, ranging from minimization of multidimensional functions to reinforcement learning. |
Konstantin Sozykin; Andrei Chertkov; Roman Schutski; Anh-Huy Phan; Andrzej S CICHOCKI; Ivan Oseledets; |
2826 | Invariance-Aware Randomized Smoothing Certificates Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a gray-box approach, enhancing the powerful black-box randomized smoothing technique with white-box knowledge about invariances. |
Jan Schuchardt; Stephan Günnemann; |
2827 | Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Contextual Squeeze-and-Excitation (CaSE), an adaptive block for efficient few-shot learning that narrows the gap with leading fine-tuners (e.g. BiT) on VTAB+MD with a significantly lower adaptation cost. |
Massimiliano Patacchiola; John Bronskill; Aliaksandra Shysheya; Katja Hofmann; Sebastian Nowozin; Richard Turner; |
2828 | Rule-Based But Flexible? Evaluating and Improving Language Models As Accounts of Human Moral Judgment Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a novel challenge set consisting of rule-breaking question answering (RBQA) of cases that involve potentially permissible rule-breaking — inspired by recent moral psychology studies. |
Zhijing Jin; Sydney Levine; Fernando Gonzalez Adauto; Ojasv Kamal; Maarten Sap; Mrinmaya Sachan; Rada Mihalcea; Josh Tenenbaum; Bernhard Schölkopf; |
2829 | Curriculum Reinforcement Learning Using Optimal Transport Via Gradual Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose GRADIENT, which formulates CRL as an optimal transport problem with a tailored distance metric between tasks. |
Peide Huang; Mengdi Xu; Jiacheng Zhu; Laixi Shi; Fei Fang; DING ZHAO; |
2830 | Causal Discovery in Heterogeneous Environments Under The Sparse Mechanism Shift Hypothesis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the Mechanism Shift Score (MSS), a score-based approach amenable to various empirical estimators, which provably identifies the entire causal structure with high probability if the sparse mechanism shift hypothesis holds. |
Ronan Perry; Julius von Kügelgen; Bernhard Schölkopf; |
2831 | The Policy-gradient Placement and Generative Routing Neural Networks for Chip Design Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Distinct from traditional heuristic solvers, this paper on one hand proposes an RL-based model for mixed-size macro placement, which differs from existing learning-based placers that often consider the macro via a coarse grid-based mask. |
Ruoyu Cheng; Xianglong Lyu; Yang Li; Junjie Ye; Jianye Hao; Junchi Yan; |
2832 | [Re] Lifting 2D StyleGAN for 3D-Aware Face Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proposes a model, called LiftedGAN, that disentangles the latent space of StyleGAN2 into texture, shape, viewpoint, lighting components and utilizes those components to render novel synthetic images. |
Doğa Yılmaz; Furkan Kınlı; Barış Özcan; Furkan Kıraç; |
2833 | Reinforcement Learning with A Terminator Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the problem of reinforcement learning with exogenous termination. |
Guy Tennenholtz; Nadav Merlis; Lior Shani; Shie Mannor; Uri Shalit; Gal Chechik; Assaf Hallak; Gal Dalal; |
2834 | D^2NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from A Monocular Video Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Decoupled Dynamic Neural Radiance Field (D^2NeRF), a self-supervised approach that takes a monocular video and learns a 3D scene representation which decouples moving objects, including their shadows, from the static background. |
Tianhao Wu; Fangcheng Zhong; Andrea Tagliasacchi; Forrester Cole; Cengiz Oztireli; |
2835 | EGSDE: Unpaired Image-to-Image Translation Via Energy-Guided Stochastic Differential Equations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose energy-guided stochastic differential equations (EGSDE) that employs an energy function pretrained on both the source and target domains to guide the inference process of a pretrained SDE for realistic and faithful unpaired I2I. |
Min Zhao; Fan Bao; Chongxuan LI; Jun Zhu; |
2836 | Alleviating “Posterior Collapse” in Deep Topic Models Via Policy Gradient Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, in this paper, we first develop a novel deep-coupling generative process for existing deep topic models, which incorporates skip connections into the generation of documents, enforcing strong links between the document and its multi-layer latent representations. After that, utilizing data augmentation techniques, we reformulate the deep-coupling generative process as a Markov decision process and develop a corresponding Policy Gradient (PG) based training algorithm, which can further alleviate the information reduction at higher layers. |
Yewen Li; Chaojie Wang; Zhibin Duan; Dongsheng Wang; Bo Chen; Bo An; Mingyuan Zhou; |
2837 | Out-of-Distribution Detection with An Adaptive Likelihood Ratio on Informative Hierarchical VAE Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on a thorough analysis of “posterior collapse”, we propose a novel informative hierarchical VAE to alleviate this issue through enhancing the connections between the data sample and its multi-layer stochastic latent representations during training. |
Yewen Li; Chaojie Wang; Xiaobo Xia; Tongliang Liu; xin miao; Bo An; |
2838 | Estimating The Arc Length of The Optimal ROC Curve and Lower Bounding The Maximal AUC Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show the arc length of the optimal ROC curve is an $f$-divergence. |
Song Liu; |
2839 | IMED-RL: Regret Optimal Learning of Ergodic Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Traditional asymptotic problem-dependent lower bounds on the regret are known under the assumption that the MDP is \emph{ergodic}. Under this assumption, we introduce \texttt{IMED-RL} and prove that its regret upper bound asymptotically matches the regret lower bound. |
Fabien Pesquerel; Odalric-Ambrym Maillard; |
2840 | [Re] Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We performed many refactoring and upgrades on the author’s code to include various procedures mentioned in the paper as well as reimplemented the code base in PyTorch Lightning for easier reproducibility. |
Aryan Mehta; Karan Uppal; Kaushal Jadhav; Monish Natarajan; Mradul Agrawal; Debashish Chakravarty; |
2841 | Asymmetric Temperature Scaling Makes Larger Networks Teach Well Again Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Complex teachers tend to be over-confident and traditional temperature scaling limits the efficacy of {\it class discriminability}, resulting in less discriminative wrong class probabilities. Therefore, we propose {\it Asymmetric Temperature Scaling (ATS)}, which separately applies a higher/lower temperature to the correct/wrong class. |
Xin-Chun Li; Wen-shu Fan; Shaoming Song; Yinchuan Li; bingshuai Li; Shao Yunfeng; De-Chuan Zhan; |
2842 | Towards Versatile Embodied Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We investigate a more challenging embodied navigation problem where a single powerful agent learns to master not one but multiple navigation tasks concurrently. |
Hanqing Wang; Wei Liang; Luc V Gool; Wenguan Wang; |
2843 | Laplacian Autoencoders for Learning Stochastic Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a Bayesian autoencoder for unsupervised representation learning, which is trained using a novel variational lower-bound of the autoencoder evidence. |
Marco Miani; Frederik Warburg; Pablo Moreno-Muñoz; Nicki Skafte; Søren Hauberg; |
2844 | Identify and Remove Backdoor Neurons Through Clean-Poisoned Mixture Distribution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we demonstrate that the backdoor neurons in an infected neural network have a mixture of two distributions with significantly different moments, formed by benign samples and poisoned samples, respectively. |
Runkai Zheng; Rongjun Tang; Jianze Li; Li Liu; |
2845 | Uncertainty-Aware Hierarchical Refinement for Incremental Implicitly-Refined Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address the issue, this paper explores the inheritance relations in the process of multi-level semantic increment, and proposes an Uncertainty-Aware Hierarchical Refinement (UAHR) scheme. |
Jian Yang; Kai Zhu; Kecheng Zheng; Yang Cao; |
2846 | SwinTrack: A Simple and Strong Baseline for Transformer Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we aim to further unleash the power of Transformer by proposing a simple yet efficient fully-attentional tracker, dubbed \textbf{SwinTrack}, within classic Siamese framework. |
Liting Lin; Heng Fan; Zhipeng Zhang; Yong Xu; Haibin Ling; |
2847 | Proppo: A Message Passing Framework for Customizable and Composable Learning Algorithms Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While existing automatic differentiation (AD) frameworks allow flexibly composing model architectures, they do not provide the same flexibility for composing learning algorithms—everything has to be implemented in terms of backpropagation. To address this gap, we invent Automatic Propagation (AP) software, which generalizes AD, and allows custom and composable construction of complex learning algorithms. |
Paavo Parmas; Takuma Seno; |
2848 | C2FAR: Coarse-to-Fine Autoregressive Networks for Precise Probabilistic Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present coarse-to-fine autoregressive networks (C2FAR), a method for modeling the probability distribution of univariate random variables. |
Shane Bergsma; Tim Zeyl; Javad Rahimipour Anaraki; Lei Guo; |
2849 | Pitfalls of Epistemic Uncertainty Quantification Through Loss Minimisation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we analyse a recent proposal based on the idea of a second-order learner, which yields predictions in the form of distributions over probability distributions. |
Viktor Bengs; Eyke Hüllermeier; Willem Waegeman; |
2850 | Public Wisdom Matters! Discourse-Aware Hyperbolic Fourier Co-Attention for Social Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we propose $\textbf{\texttt{Hyphen}}$, a discourse-aware hyperbolic spectral co-attention network. |
Karish Grover; S M Phaneendra Angara; Md Shad Akhtar; Tanmoy Chakraborty; |
2851 | Emergent Communication: Generalization and Overfitting in Lewis Games Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Previous work has shown that agents trained to play this game with reinforcement learning tend to develop languages that display undesirable properties from a linguistic point of view (lack of generalization, lack of compositionality, etc). In this paper, we aim to provide better understanding of this phenomenon by analytically studying the learning problem in Lewis games. |
Mathieu Rita; Corentin Tallec; Paul Michel; Jean-Bastien Grill; Olivier Pietquin; Emmanuel Dupoux; Florian Strub; |
2852 | Latency-aware Spatial-wise Dynamic Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To bridge the gap between the theoretical computation and the practical efficiency, we propose a latency-aware spatial-wise dynamic network (LASNet), which performs \emph{coarse-grained} spatially adaptive inference under the guidance of a novel latency prediction model. |
Yizeng Han; Zhihang Yuan; Yifan Pu; Chenhao Xue; Shiji Song; Guangyu Sun; Gao Huang; |
2853 | Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we study the convex-concave saddle-point problem $\min_x \max_y f(x) + y^\top\mathbf{A}x - g(y)$, where $f(x)$ and $g(y)$ are smooth and convex functions. |
Dmitry Kovalev; Alexander Gasnikov; Peter Richtarik; |
2854 | Communication Acceleration of Local Gradient Methods Via An Accelerated Primal-Dual Algorithm with An Inexact Prox Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by a recent breakthrough of Mishchenko et al. [2022], who for the first time showed that local gradient steps can lead to provable communication acceleration, we propose an alternative algorithm which obtains the same communication acceleration as their method (ProxSkip). |
Abdurakhmon Sadiev; Dmitry Kovalev; Peter Richtarik; |
2855 | Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Overall, this work argues for an alternative approach to RL research, which we believe could significantly improve real-world RL adoption and help democratize it further. |
Rishabh Agarwal; Max Schwarzer; Pablo Samuel Castro; Aaron Courville; Marc Bellemare; |
2856 | Optimal Algorithms for Decentralized Stochastic Variational Inequalities Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, we consider decentralized stochastic (sum-type) variational inequalities over fixed and time-varying networks. We present lower complexity bounds for both communication and local iterations and construct optimal algorithms that match these lower bounds. |
Dmitry Kovalev; Aleksandr Beznosikov; Abdurakhmon Sadiev; Michael Persiianov; Peter Richtarik; Alexander Gasnikov; |
2857 | Logical Credal Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Logical Credal Networks (or LCNs for short) — an expressive probabilistic logic that generalizes prior formalisms that combine logic and probability. |
Radu Marinescu; Haifeng Qian; Alexander Gray; Debarun Bhattacharjya; Francisco Barahona; Tian Gao; Ryan Riegel; Pravinda Sahu; |
2858 | Knowledge-Aware Bayesian Deep Topic Model Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling. |
Dongsheng Wang; Yishi Xu; Miaoge Li; Zhibin Duan; Chaojie Wang; Bo Chen; Mingyuan Zhou; |
2859 | Association Graph Learning for Multi-Task Classification with Category Shifts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose learning an association graph to transfer knowledge among tasks for missing classes. |
Jiayi Shen; Zehao Xiao; Xiantong Zhen; Cees Snoek; Marcel Worring; |
2860 | Revisiting Active Sets for Gaussian Process Decoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we revisit active set approximations. |
Pablo Moreno-Muñoz; Cilie Feldager; Søren Hauberg; |
2861 | Contrastive Language-Image Pre-Training with Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a knowledge-based pre-training framework, dubbed \textit{Knowledge-CLIP}, that injects semantic information into the widely used CLIP model. |
Xuran Pan; Tianzhu Ye; Dongchen Han; Shiji Song; Gao Huang; |
2862 | Isolating and Leveraging Controllable and Noncontrollable Visual Dynamics in World Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we present a reinforcement learning approach named Iso-Dream, which expands the Dream-to-Control framework in two aspects. |
Minting Pan; Xiangming Zhu; Yunbo Wang; Xiaokang Yang; |
2863 | [Re] Understanding Self-Supervised Learning Dynamics Without Contrastive Pairs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigated the eigenspace alignment hypothesis in DirectPred, by plotting the eigenvalues and eigenspace alignments for both SimSiam and BYOL with and without Symmetric regularization. |
Tobias Höppe; Agnieszka Miszkurka; Dennis Bogatov Wilkman; |
2864 | DTG-SSOD: Dense Teacher Guidance for Semi-Supervised Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose the Inverse NMS Clustering (INC) and Rank Matching (RM) to instantiate the dense supervision, without the widely used, conventional sparse pseudo labels. |
Gang Li; Xiang Li; Yujie Wang; Shanshan Zhang; Wu Yichao; Ding Liang; |
2865 | Model-Based Opponent Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose model-based opponent modeling (MBOM), which employs the environment model to adapt to all kinds of opponents. |
XiaoPeng Yu; Jiechuan Jiang; Wanpeng Zhang; Haobin Jiang; Zongqing Lu; |
2866 | Generative Time Series Forecasting with Diffusion, Denoise and Disentanglement Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose to address the time series forecasting problem with generative modeling. |
Yan Li; Xinjiang Lu; Yaqing Wang; Dejing Dou; |
2867 | Graphein – A Python Library for Geometric Deep Learning and Network Analysis on Biomolecular Structures and Interaction Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Currently, efforts in both geometric deep learning and, more broadly, deep learning applied to biomolecular tasks have been hampered by a scarcity of appropriate datasets accessible to domain specialists and machine learning researchers alike. To address this, we introduce Graphein as a turn-key tool for transforming raw data from widely-used bioinformatics databases into machine learning-ready datasets in a high-throughput and flexible manner. |
Arian Jamasb; Ramon Viñas Torné; Eric Ma; Yuanqi Du; Charles Harris; Kexin Huang; Dominic Hall; Pietro Lió; Tom Blundell; |
2868 | ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show the surprisingly good capabilities of plain vision transformers for pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model called ViTPose. |
Yufei Xu; Jing Zhang; Qiming ZHANG; Dacheng Tao; |
2869 | Entropy-Driven Mixed-Precision Quantization for Deep Network Design Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a one-stage solution that optimizes both jointly and automatically. |
Zhenhong Sun; Ce Ge; Junyan Wang; Ming Lin; Hesen Chen; Hao Li; Xiuyu Sun; |
2870 | Towards Learning Universal Hyperparameter Optimizers with Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction when trained on vast tuning data from the wild, such as Google’s Vizier database, one of the world’s largest HPO datasets. |
Yutian Chen; Xingyou Song; Chansoo Lee; Zi Wang; Richard Zhang; David Dohan; Kazuya Kawakami; Greg Kochanski; Arnaud Doucet; Marc’Aurelio Ranzato; Sagi Perel; Nando de Freitas; |
2871 | Is Out-of-distribution Detection Learnable? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To study the generalization of OOD detection, in this paper, we investigate the probably approximately correct (PAC) learning theory of OOD detection, which is proposed by researchers as an open problem. |
Zhen Fang; Yixuan Li; Jie Lu; Jiahua Dong; Bo Han; Feng Liu; |
2872 | Non-stationary Bandits with Knapsacks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the problem of bandits with knapsacks (BwK) in a non-stationary environment. |
Shang Liu; Jiashuo Jiang; Xiaocheng Li; |
2873 | Alleviating Adversarial Attacks on Variational Autoencoders with MCMC Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our method utilizes the Markov Chain Monte Carlo (MCMC) technique in the inference step that we motivate with a theoretical analysis. |
Anna Kuzina; Max Welling; Jakub Tomczak; |
2874 | Are AlphaZero-like Agents Robust to Adversarial Perturbations? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop the first adversarial attack on AZ agents in the game of Go. |
Li-Cheng Lan; Huan Zhang; Ti-Rong Wu; Meng-Yu Tsai; I-Chen Wu; Cho-Jui Hsieh; |
2875 | Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Point-M2AE, a strong Multi-scale MAE pre-training framework for hierarchical self-supervised learning of 3D point clouds. |
Renrui Zhang; Ziyu Guo; Peng Gao; Rongyao Fang; Bin Zhao; Dong Wang; Yu Qiao; Hongsheng Li; |
2876 | Can Adversarial Training Be Manipulated By Non-Robust Features? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We identify a novel threat model named stability attacks, which aims to hinder robust availability by slightly manipulating the training data. |
Lue Tao; Lei Feng; Hongxin Wei; Jinfeng Yi; Sheng-Jun Huang; Songcan Chen; |
2877 | Optimal Transport-based Identity Matching for Identity-invariant Facial Expression Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes to quantify the inter-identity variation by utilizing pairs of similar expressions explored through a specific matching process. |
Daeha Kim; Byung Cheol Song; |
2878 | Benign Underfitting of Stochastic Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study to what extent may stochastic gradient descent (SGD) be understood as a “conventional” learning rule that achieves generalization performance by obtaining a good fit to training data. |
Tomer Koren; Roi Livni; Yishay Mansour; Uri Sherman; |
2879 | Self-supervised Amodal Video Object Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, this paper develops a new framework of Self-supervised amodal Video object segmentation (SaVos). |
Jian Yao; Yuxin Hong; Chiyu Wang; Tianjun Xiao; Tong He; Yanwei Fu; Francesco Locatello; David P Wipf; Zheng Zhang; |
2880 | Nearly-Tight Bounds for Testing Histogram Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate the problem of testing whether a discrete probability distribution over an ordered domain is a histogram on a specified number of bins. |
Clément L Canonne; Ilias Diakonikolas; Daniel Kane; Sihan Liu; |
2881 | Grow and Merge: A Unified Framework for Continuous Categories Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on the application scenarios where unlabeled data are continuously fed into the category discovery system. |
Xinwei Zhang; Jianwen Jiang; Yutong Feng; Zhi-Fan Wu; Xibin Zhao; Hai Wan; Mingqian Tang; Rong Jin; Yue Gao; |
2882 | Sharp Analysis of Stochastic Optimization Under Global Kurdyka-Lojasiewicz Inequality Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the complexity of finding the global solution to stochastic nonconvex optimization when the objective function satisfies global Kurdyka-{\L}ojasiewicz (KL) inequality and the queries from stochastic gradient oracles satisfy mild expected smoothness assumption. |
Jalal Etesami; Ilyas Fatkhullin; Niao He; Negar Kiyavash; |
2883 | Private Graph All-Pairwise-Shortest-Path Distance Release with Improved Error Rate Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a DP algorithm with error rate $O(k)$, which improves on the error for general graphs provided $k=o(n^{1/2})$. |
Chenglin Fan; Ping Li; Xiaoyun Li; |
2884 | Learning Manifold Dimensions with Conditional Variational Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Moreover, it remains unclear how such considerations would change when various types of conditioning variables are introduced, or when the data support is extended to a union of manifolds (e.g., as is likely the case for MNIST digits and related). In this work, we address these points by first proving that VAE global minima are indeed capable of recovering the correct manifold dimension. |
Yijia Zheng; Tong He; Yixuan Qiu; David P Wipf; |
2885 | Privacy of Noisy Stochastic Gradient Descent: More Iterations Without More Privacy Loss Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We revisit the most commonly used algorithm for private convex optimization (Noisy SGD, aka SGLD), and establish a fundamental phenomenon: after a small burn-in period, running SGD longer leaks no additional privacy. |
Jason Altschuler; Kunal Talwar; |
2886 | Simulated User Studies for Explanation Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, running user studies is challenging and costly, and consequently each study typically only evaluates a limited number of different settings, e.g., studies often only evaluate a few arbitrarily selected model explanation methods. To address these challenges and aid user study design, we introduce Simulated Evaluations (SimEvals). |
Valerie Chen; Nari Johnson; Nicholay Topin; Gregory Plumb; Ameet Talwalkar; |
2887 | Earthformer: Exploring Space-Time Transformers for Earth System Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose \emph{Earthformer}, a space-time Transformer for Earth system forecasting. |
Zhihan Gao; Xingjian Shi; Hao Wang; Yi Zhu; Yuyang (Bernie) Wang; Mu Li; Dit-Yan Yeung; |
2888 | Get More at Once: Alternating Sparse Training with Gradient Correction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, for the first time, we propose a novel alternating sparse training (AST) scheme to train multiple sparse sub-nets for dynamic inference without extra training cost compared to the case of training a single sparse model from scratch. |
Li Yang; Jian Meng; Jae-sun Seo; Deliang Fan; |
2889 | Local Bayesian Optimization Via Maximizing Probability of Descent Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We design a local Bayesian optimization policy that maximizes the probability of descending the objective function. |
Quan Nguyen; Kaiwen Wu; Jacob Gardner; Roman Garnett; |
2890 | Can Hybrid Geometric Scattering Networks Help Solve The Maximal Clique Problem? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a geometric scattering-based graph neural network (GNN) for approximating solutions of the NP-hard maximum clique (MC) problem. |
Yimeng Min; Frederik Wenkel; Michael Perlmutter; Guy Wolf; |
2891 | ProjUNN: Efficient Method for Training Deep Networks with Unitary Matrices Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose instead an efficient method based on rank-$k$ updates — or their rank-$k$ approximation — that maintains performance at a nearly optimal training runtime. |
Bobak Kiani; Randall Balestriero; Yann LeCun; Seth Lloyd; |
2892 | Data-IQ: Characterizing Subgroups with Heterogeneous Outcomes in Tabular Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider the tabular setting, which surfaces the unique issue of outcome heterogeneity – this is prevalent in areas such as healthcare, where patients with similar features can have different outcomes, thus making reliable predictions challenging. |
Nabeel Seedat; Jonathan Crabbé; Ioana Bica; Mihaela van der Schaar; |
2893 | An Algorithm for Learning Switched Linear Dynamics from Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present an algorithm for learning switched linear dynamical systems in discrete-time from data that may include noisy observations of the full system state or output observations. |
Guillaume Berger; Monal Narasimhamurthy; Kandai Watanabe; Morteza Lahijanian; Sriram Sankaranarayanan; |
2894 | Sparse Winning Tickets Are Data-Efficient Image Recognizers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we show that “winning tickets” (small sub-networks) obtained via magnitude pruning based on the lottery ticket hypothesis (Frankle & Carbin, 2018), apart from being sparse are also effective recognizers in data limited regimes. |
Mukund Varma T; Xuxi Chen; Zhenyu Zhang; Tianlong Chen; Subhashini Venugopalan; Zhangyang Wang; |
2895 | Sobolev Acceleration and Statistical Optimality for Learning Elliptic Equations Via Gradient Descent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the statistical limits in terms of Sobolev norms of gradient descent for solving inverse problem from randomly sampled noisy observations using a general class of objective functions. |
Yiping Lu; Jose Blanchet; Lexing Ying; |
2896 | LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a training-free architecture evaluation proxy for NAS on autoregressive transformers that enables fast search directly on the target commodity hardware. |
Mojan Javaheripi; Gustavo de Rosa; Subhabrata Mukherjee; Shital Shah; Tomasz Religa; Caio Cesar Teodoro Mendes; Sebastien Bubeck; Farinaz Koushanfar; Debadeepta Dey; |
2897 | Faster Linear Algebra for Distance Matrices Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Due to their wide applicability, distance matrices and related families of matrices have been the focus of many recent algorithmic works. We continue this line of research and take a broad view of algorithm design for distance matrices with the goal of designing fast algorithms, which are specifically tailored for distance matrices, for fundamental linear algebraic primitives. |
Piotr Indyk; Sandeep Silwal; |
2898 | Exponentially Improving The Complexity of Simulating The Weisfeiler-Lehman Test with Graph Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an improved simulation of the WL test on GNNs with {\em exponentially} lower complexity. |
Anders Aamand; Justin Chen; Piotr Indyk; Shyam Narayanan; Ronitt Rubinfeld; Nicholas Schiefer; Sandeep Silwal; Tal Wagner; |
2899 | Efficient Sampling on Riemannian Manifolds Via Langevin MCMC Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the task of efficiently sampling from a Gibbs distribution $d \pi^* = e^{-h} d {\text{vol}}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC; this algorithm involves computing exponential maps in random Gaussian directions and is efficiently implementable in practice. |
Xiang Cheng; Jingzhao Zhang; Suvrit Sra; |
2900 | ATD: Augmenting CP Tensor Decomposition By Self Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In practice, raw input tensor can contain irrelevant information while data augmentation techniques may be used to smooth out class-irrelevant noise in samples. This paper addresses the above challenges by proposing augmented tensor decomposition (ATD), which effectively incorporates data augmentations and self-supervised learning (SSL) to boost downstream classification. |
Chaoqi Yang; Cheng Qian; Navjot Singh; Cao (Danica) Xiao; M Westover; Edgar Solomonik; Jimeng Sun; |
2901 | Residual Multiplicative Filter Networks for Multiscale Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new coordinate network architecture and training scheme that enables coarse-to-fine optimization with fine-grained control over the frequency support of learned reconstructions. |
Shayan Shekarforoush; David Lindell; Marcus Brubaker; David Fleet; |
2902 | Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a form of linear Q-function decomposition induced by factored action spaces. |
Shengpu Tang; Maggie Makar; Michael Sjoding; Finale Doshi-Velez; Jenna Wiens; |
2903 | CryptoGCN: Fast and Scalable Homomorphically Encrypted Graph Convolutional Network Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we propose a novel AMA data formatting method and associated spatial convolution methods, which can exploit the complex graph structure and perform efficient matrix-matrix multiplication in HE computation and thus greatly reduce the HE operations. |
Ran Ran; Wei Wang; Quan Gang; Jieming Yin; Nuo Xu; Wujie Wen; |