Most Influential ICLR Papers (2023-01)
The International Conference on Learning Representations (ICLR) is one of the top machine learning conferences in the world. The Paper Digest Team analyzes all papers published at ICLR in past years and presents the 15 most influential papers for each year. This ranking is constructed automatically from citations in both research papers and granted patents, and is updated frequently to reflect the most recent changes. To find the latest version of this list or the most influential papers from other conferences and journals, please visit the Best Paper Digest page. Note: the most influential papers may or may not include the papers that won best paper awards. (Version: 2023-01)
To search or review ICLR papers related to a specific topic, please use the search by venue (ICLR) and review by venue (ICLR) services. To browse the most productive ICLR authors by year, ranked by the number of accepted papers, see the list of most productive ICLR authors.
Based in New York, Paper Digest is dedicated to producing high-quality text analysis results that people can actually use on a daily basis. Since 2018, we have been serving users across the world with a number of exclusive services to track, search, review and rewrite scientific literature.
You are welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Most Influential ICLR Papers (2023-01)
Year | Rank | Paper | Author(s) |
---|---|---|---|
2022 | 1 | Multitask Prompted Training Enables Zero-Shot Task Generalization IF:6 Highlight: Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. | VICTOR SANH et al. |
2022 | 2 | VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning IF:5 Highlight: Variance regularization prevents collapse in self-supervised representation learning. | Adrien Bardes; Jean Ponce; Yann LeCun |
2022 | 3 | SimVLM: Simple Visual Language Model Pretraining with Weak Supervision IF:5 Highlight: In this work, we relax these constraints and present a minimalist pretraining framework, named Simple Visual Language Model (SimVLM). | ZIRUI WANG et al. |
2022 | 4 | LoRA: Low-Rank Adaptation of Large Language Models IF:4 Highlight: Finetuning updates have a low intrinsic rank, which allows us to train only the rank decomposition matrices of certain weights, yielding better performance and practical benefits. (A minimal illustrative sketch follows the table.) | EDWARD J HU et al. |
2022 | 5 | How Attentive Are Graph Attention Networks? IF:4 Highlight: We identify that Graph Attention Networks (GAT) compute a very weak form of attention. We show its empirical implications and propose a fix. | Shaked Brody; Uri Alon; Eran Yahav |
2022 | 6 | Towards A Unified View of Parameter-Efficient Transfer Learning IF:4 Highlight: We propose a unified framework for several state-of-the-art parameter-efficient tuning methods. | Junxian He; Chunting Zhou; Xuezhe Ma; Taylor Berg-Kirkpatrick; Graham Neubig |
2022 | 7 | MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer IF:4 Highlight: Light-weight and general-purpose vision transformers for mobile devices. | Sachin Mehta; Mohammad Rastegari |
2022 | 8 | How Much Can CLIP Benefit Vision-and-Language Tasks? IF:4 Highlight: To further study the advantage brought by CLIP, we propose to use CLIP as the visual encoder in various V&L models in two typical scenarios: 1) plugging CLIP into task-specific fine-tuning; 2) combining CLIP with V&L pre-training and transferring to downstream tasks. | SHENG SHEN et al. |
2022 | 9 | Open-vocabulary Object Detection Via Vision and Language Knowledge Distillation IF:4 Highlight: We propose using knowledge distillation to train an object detector that can detect objects with arbitrary text inputs, outperforming its supervised counterparts on rare categories. | Xiuye Gu; Tsung-Yi Lin; Weicheng Kuo; Yin Cui |
2022 | 10 | FILIP: Fine-grained Interactive Language-Image Pre-Training IF:4 Highlight: We introduce a large-scale Fine-grained Interactive Language-Image Pre-training (FILIP) to achieve finer-level alignment through a new cross-modal late interaction mechanism, which can boost the performance on more grounded vision and language tasks. Furthermore, we construct a new large-scale image-text pair dataset called FILIP300M for pre-training. | LEWEI YAO et al. |
2022 | 11 | How Do Vision Transformers Work? IF:3 Highlight: We show that (1) multi-head self-attentions (MSAs) for computer vision flatten the loss landscapes, (2) MSAs are low-pass filters as opposed to Convs, and (3) MSAs at the end of a stage significantly improve the accuracy. | Namuk Park; Songkuk Kim |
2022 | 12 | Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm IF:3 Highlight: This work proposes a novel training paradigm, Data efficient CLIP (DeCLIP), to alleviate this limitation. | YANGGUANG LI et al. |
2022 | 13 | Progressive Distillation for Fast Sampling of Diffusion Models IF:3 Highlight: Diffusion models now need just 4 sampling steps to produce high quality samples. | Tim Salimans; Jonathan Ho |
2022 | 14 | Efficient Self-supervised Vision Transformers for Representation Learning IF:3 Highlight: Achieving SoTA on the ImageNet linear probe task with 10 times higher throughput, using the synergy of a multi-stage Transformer architecture and a non-contrastive region-matching pre-training task. | CHUNYUAN LI et al. |
2022 | 15 | Tackling The Generative Learning Trilemma with Denoising Diffusion GANs IF:3 Highlight: To reduce the number of sampling steps in diffusion models, we propose to model the denoising distribution with conditional GANs. We show our model tackles the generative learning trilemma & achieves high sample quality, diversity & fast sampling. | Zhisheng Xiao; Karsten Kreis; Arash Vahdat |
2021 | 1 | An Image Is Worth 16×16 Words: Transformers for Image Recognition at Scale IF:8 Highlight: Transformers applied directly to image patches and pre-trained on large datasets work really well on image classification. | ALEXEY DOSOVITSKIY et al. |
2021 | 2 | Deformable DETR: Deformable Transformers for End-to-End Object Detection IF:8 Highlight: Deformable DETR is an efficient and fast-converging end-to-end object detector. It mitigates the high complexity and slow convergence issues of DETR via a novel sampling-based efficient attention mechanism. | XIZHOU ZHU et al. |
2021 | 3 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention IF:7 Highlight: A new model architecture, DeBERTa, is proposed that improves the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder. | Pengcheng He; Xiaodong Liu; Jianfeng Gao; Weizhu Chen |
2021 | 4 | Rethinking Attention with Performers IF:7 Highlight: We introduce Performers, linear full-rank-attention Transformers via provable random feature approximation methods, without relying on sparsity or low-rankness. | KRZYSZTOF MARCIN CHOROMANSKI et al. |
2021 | 5 | Score-Based Generative Modeling Through Stochastic Differential Equations IF:7 Highlight: A general framework for training and sampling from score-based models that unifies and generalizes previous methods, allows likelihood computation, and enables controllable generation. | YANG SONG et al. |
2021 | 6 | Fourier Neural Operator for Parametric Partial Differential Equations IF:7 Highlight: A novel neural operator based on the Fourier transform for learning partial differential equations. | ZONGYI LI et al. |
2021 | 7 | FastSpeech 2: Fast and High-Quality End-to-End Text to Speech IF:6 Highlight: We propose a non-autoregressive TTS model named FastSpeech 2 to better solve the one-to-many mapping problem in TTS and surpass autoregressive models in voice quality. | YI REN et al. |
2021 | 8 | Adaptive Federated Optimization IF:6 Highlight: We propose adaptive federated optimization techniques and highlight their improved performance over popular methods such as FedAvg. | SASHANK J. REDDI et al. |
2021 | 9 | Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval IF:6 Highlight: This paper improves the learning of dense text retrieval using ANCE, which selects global negatives with bigger gradient norms using an asynchronously updated ANN index. | LEE XIONG et al. |
2021 | 10 | In Search of Lost Domain Generalization IF:6 Highlight: Our ERM baseline achieves state-of-the-art performance across many domain generalization benchmarks. | Ishaan Gulrajani; David Lopez-Paz |
2021 | 11 | Prototypical Contrastive Learning of Unsupervised Representations IF:6 Highlight: We propose an unsupervised representation learning method that bridges contrastive learning with clustering in an EM framework. | Junnan Li; Pan Zhou; Caiming Xiong; Steven Hoi |
2021 | 12 | Sharpness-aware Minimization for Efficiently Improving Generalization IF:6 Highlight: Motivated by the connection between the geometry of the loss landscape and generalization, we introduce a procedure for simultaneously minimizing loss value and loss sharpness. | Pierre Foret; Ariel Kleiner; Hossein Mobahi; Behnam Neyshabur |
2021 | 13 | Denoising Diffusion Implicit Models IF:6 Highlight: We show and justify a GAN-like iterative generative model with relatively fast sampling and high sample quality, without any adversarial training. | Jiaming Song; Chenlin Meng; Stefano Ermon |
2021 | 14 | Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels IF:6 Highlight: The first successful demonstration that image augmentation can be applied to image-based deep RL to achieve SOTA performance. | Denis Yarats; Ilya Kostrikov; Rob Fergus |
2021 | 15 | GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding IF:6 Highlight: In this paper we demonstrate conditional computation as a remedy to the above-mentioned impediments, and demonstrate its efficacy and utility. | DMITRY LEPIKHIN et al. |
2020 | 1 | ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations IF:9 Highlight: A new pretraining method that establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT-large. | ZHENZHONG LAN et al. |
2020 | 2 | ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators IF:9 Highlight: A text encoder trained to distinguish real input tokens from plausible fakes efficiently learns effective language representations. | Kevin Clark; Minh-Thang Luong; Quoc V. Le; Christopher D. Manning |
2020 | 3 | BERTScore: Evaluating Text Generation With BERT IF:8 Highlight: We propose BERTScore, an automatic evaluation metric for text generation, which correlates better with human judgments and provides stronger model selection performance than existing metrics. | Tianyi Zhang*; Varsha Kishore*; Felix Wu*; Kilian Q. Weinberger; Yoav Artzi |
2020 | 4 | The Curious Case Of Neural Text Degeneration IF:8 Highlight: Current language generation systems either aim for high likelihood and devolve into generic repetition, or miscalibrate their stochasticity; we provide evidence of both and propose a solution: Nucleus Sampling. (A minimal sketch of nucleus sampling follows the table.) | Ari Holtzman; Jan Buys; Leo Du; Maxwell Forbes; Yejin Choi |
2020 | 5 | On The Variance Of The Adaptive Learning Rate And Beyond IF:8 Highlight: If warmup is the answer, what is the question? | LIYUAN LIU et al. |
2020 | 6 | Reformer: The Efficient Transformer IF:8 Highlight: An efficient Transformer with locality-sensitive hashing and reversible layers. | Nikita Kitaev; Lukasz Kaiser; Anselm Levskaya |
2020 | 7 | VL-BERT: Pre-training Of Generic Visual-Linguistic Representations IF:8 Highlight: VL-BERT is a simple yet powerful pre-trainable generic representation for visual-linguistic tasks. It is pre-trained on a massive-scale caption dataset and text-only corpus, and can be fine-tuned for various downstream visual-linguistic tasks. | WEIJIE SU et al. |
2020 | 8 | On The Convergence Of FedAvg On Non-IID Data IF:8 Highlight: In this paper, we analyze the convergence of FedAvg on non-IID data and establish a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGD steps. | Xiang Li; Kaixuan Huang; Wenhao Yang; Shusen Wang; Zhihua Zhang |
2020 | 9 | Once For All: Train One Network And Specialize It For Efficient Deployment IF:7 Highlight: We introduce techniques to train a single once-for-all network that fits many hardware platforms. | Han Cai; Chuang Gan; Tianzhe Wang; Zhekai Zhang; Song Han |
2020 | 10 | Fast Is Better Than Free: Revisiting Adversarial Training IF:7 Highlight: FGSM-based adversarial training, with randomization, works just as well as PGD-based adversarial training: we can use this to train a robust classifier in 6 minutes on CIFAR-10, and 12 hours on ImageNet, on a single machine. | Eric Wong; Leslie Rice; J. Zico Kolter |
2020 | 11 | DropEdge: Towards Deep Graph Convolutional Networks On Node Classification IF:7 Highlight: This paper proposes DropEdge, a novel and flexible technique to alleviate the over-smoothing and overfitting issues in deep Graph Convolutional Networks. | Yu Rong; Wenbing Huang; Tingyang Xu; Junzhou Huang |
2020 | 12 | AugMix: A Simple Data Processing Method To Improve Robustness And Uncertainty IF:7 Highlight: We obtain state-of-the-art robustness to data shifts, and we maintain calibration under data shift even when accuracy drops. | DAN HENDRYCKS* et al. |
2020 | 13 | Dream To Control: Learning Behaviors By Latent Imagination IF:7 Highlight: We present Dreamer, an agent that learns long-horizon behaviors purely by latent imagination using analytic value gradients. | Danijar Hafner; Timothy Lillicrap; Jimmy Ba; Mohammad Norouzi |
2020 | 14 | Strategies For Pre-training Graph Neural Networks IF:7 Highlight: We develop a strategy for pre-training Graph Neural Networks (GNNs) and systematically study its effectiveness on multiple datasets, GNN architectures, and diverse downstream tasks. | WEIHUA HU* et al. |
2020 | 15 | Contrastive Representation Distillation IF:7 Highlight: Representation/knowledge distillation by maximizing mutual information between teacher and student. | Yonglong Tian; Dilip Krishnan; Phillip Isola |
2019 | 1 | Decoupled Weight Decay Regularization IF:9 Highlight: Novel variants of optimization methods that combine the benefits of both adaptive and non-adaptive methods. | Ilya Loshchilov; Frank Hutter |
2019 | 2 | GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding IF:9 Highlight: We present a multi-task benchmark and analysis platform for evaluating generalization in natural language understanding systems. | ALEX WANG et al. |
2019 | 3 | How Powerful Are Graph Neural Networks? IF:9 Highlight: We develop theoretical foundations for the expressive power of GNNs and design a provably most powerful GNN. | Keyulu Xu*; Weihua Hu*; Jure Leskovec; Stefanie Jegelka |
2019 | 4 | Large Scale GAN Training for High Fidelity Natural Image Synthesis IF:9 Highlight: GANs benefit from scaling up. | Andrew Brock; Jeff Donahue; Karen Simonyan |
2019 | 5 | DARTS: Differentiable Architecture Search IF:9 Highlight: We propose a differentiable architecture search algorithm for both convolutional and recurrent networks, achieving competitive performance with the state of the art using orders of magnitude fewer computational resources. | Hanxiao Liu; Karen Simonyan; Yiming Yang |
2019 | 6 | The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks IF:8 Highlight: Feedforward neural networks that can have weights pruned after training could have had the same weights pruned before training. | Jonathan Frankle; Michael Carbin |
2019 | 7 | Learning Deep Representations By Mutual Information Estimation and Maximization IF:9 Highlight: We learn deep representations by maximizing mutual information, leveraging structure in the objective, and are able to compete with fully supervised classifiers with comparable architectures. | R DEVON HJELM et al. |
2019 | 8 | Benchmarking Neural Network Robustness to Common Corruptions and Perturbations IF:8 Highlight: We propose ImageNet-C to measure classifier corruption robustness and ImageNet-P to measure perturbation robustness. | Dan Hendrycks; Thomas Dietterich |
2019 | 9 | ImageNet-trained CNNs Are Biased Towards Texture; Increasing Shape Bias Improves Accuracy and Robustness IF:8 Highlight: ImageNet-trained CNNs are biased towards object texture (instead of shape, like humans). Overcoming this major difference between human and machine vision yields improved detection performance and previously unseen robustness to image distortions. | ROBERT GEIRHOS et al. |
2019 | 10 | ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware IF:8 Highlight: Proxy-less neural architecture search for directly learning architectures on a large-scale target task (ImageNet) while reducing the cost to the same level as normal training. | Han Cai; Ligeng Zhu; Song Han |
2019 | 11 | Robustness May Be at Odds with Accuracy IF:8 Highlight: We show that adversarial robustness might come at the cost of standard classification performance, but also yields unexpected benefits. | Dimitris Tsipras; Shibani Santurkar; Logan Engstrom; Alexander Turner; Aleksander Madry |
2019 | 12 | A Closer Look at Few-shot Classification IF:9 Highlight: A detailed empirical study of few-shot classification that reveals challenges in the standard evaluation setting and points to a new direction. | Wei-Yu Chen; Yen-Cheng Liu; Zsolt Kira; Yu-Chiang Frank Wang; Jia-Bin Huang |
2019 | 13 | Deep Graph Infomax IF:8 Highlight: A new method for unsupervised representation learning on graphs, relying on maximizing mutual information between local and global representations in a graph. State-of-the-art results, competitive with supervised learning. | PETAR VELICKOVIC et al. |
2019 | 14 | RotatE: Knowledge Graph Embedding By Relational Rotation in Complex Space IF:7 Highlight: A new state-of-the-art approach for knowledge graph embedding. | Zhiqing Sun; Zhi-Hong Deng; Jian-Yun Nie; Jian Tang |
2019 | 15 | Meta-Learning with Latent Embedding Optimization IF:8 Highlight: Latent Embedding Optimization (LEO) is a novel gradient-based meta-learner with state-of-the-art performance on the challenging 5-way 1-shot and 5-shot miniImageNet and tieredImageNet classification tasks. | ANDREI A. RUSU et al. |
2018 | 1 | Graph Attention Networks IF:9 Highlight: A novel approach to processing graph-structured data by neural networks, leveraging attention over a node’s neighborhood. Achieves state-of-the-art results on transductive citation network tasks and an inductive protein-protein interaction task. | PETAR VELICKOVIC et al. |
2018 | 2 | Towards Deep Learning Models Resistant to Adversarial Attacks IF:9 Highlight: We provide a principled, optimization-based re-look at the notion of adversarial examples, and develop methods that produce models that are adversarially robust against a wide range of adversaries. | Aleksander Madry; Aleksandar Makelov; Ludwig Schmidt; Dimitris Tsipras; Adrian Vladu |
2018 | 3 | Progressive Growing of GANs for Improved Quality, Stability, and Variation IF:9 Highlight: We train generative adversarial networks in a progressive fashion, enabling us to generate high-resolution images with high quality. | Tero Karras; Timo Aila; Samuli Laine; Jaakko Lehtinen |
2018 | 4 | Mixup: Beyond Empirical Risk Minimization IF:8 Highlight: Training on convex combinations of random training examples and their labels improves generalization in deep neural networks. (A minimal sketch follows the table.) | Hongyi Zhang; Moustapha Cisse; Yann N. Dauphin; David Lopez-Paz |
2018 | 5 | Spectral Normalization for Generative Adversarial Networks IF:9 Highlight: We propose a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator of GANs. (A minimal sketch follows the table.) | Takeru Miyato; Toshiki Kataoka; Masanori Koyama; Yuichi Yoshida |
2018 | 6 | Unsupervised Representation Learning By Predicting Image Rotations IF:9 Highlight: We propose to learn image features by training ConvNets to recognize the 2d rotation applied to their input image. | Spyros Gidaris; Praveer Singh; Nikos Komodakis |
2018 | 7 | Ensemble Adversarial Training: Attacks and Defenses IF:9 Highlight: Adversarial training with single-step methods overfits and remains vulnerable to simple black-box and white-box attacks. We show that including adversarial examples from multiple sources helps defend against black-box attacks. | FLORIAN TRAMÈR et al. |
2018 | 8 | On The Convergence of Adam and Beyond IF:9 Highlight: We investigate the convergence of popular optimization algorithms like Adam and RMSProp, and propose new variants of these methods that provably converge to the optimal solution in convex settings. | Sashank J. Reddi; Satyen Kale; Sanjiv Kumar |
2018 | 9 | Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting IF:8 Highlight: A neural sequence model that learns to forecast on a directed graph. | Yaguang Li; Rose Yu; Cyrus Shahabi; Yan Liu |
2018 | 10 | Word Translation Without Parallel Data IF:9 Highlight: Aligning languages without the Rosetta Stone: with no parallel data, we construct bilingual dictionaries using adversarial training, cross-domain local scaling, and an accurate proxy criterion for cross-validation. | Guillaume Lample; Alexis Conneau; Marc’Aurelio Ranzato; Ludovic Denoyer; Hervé Jégou |
2018 | 11 | A Deep Reinforced Model for Abstractive Summarization IF:9 Highlight: A summarization model combining a new intra-attention and reinforcement learning method to increase summary ROUGE scores and quality for long sequences. | Romain Paulus; Caiming Xiong; Richard Socher |
2018 | 12 | Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks IF:8 Highlight: We propose ODIN, a simple and effective method that does not require any change to a pre-trained neural network. | Shiyu Liang; Yixuan Li; R. Srikant |
2018 | 13 | Countering Adversarial Images Using Input Transformations IF:8 Highlight: We apply a model-agnostic defense strategy against adversarial examples and achieve 60% white-box accuracy and 90% black-box accuracy against major attack algorithms. | Chuan Guo; Mayank Rana; Moustapha Cisse; Laurens van der Maaten |
2018 | 14 | A Simple Neural Attentive Meta-Learner IF:8 Highlight: A simple RNN-based meta-learner that achieves SOTA performance on popular benchmarks. | Nikhil Mishra; Mostafa Rohaninejad; Xi Chen; Pieter Abbeel |
2018 | 15 | FastGCN: Fast Learning with Graph Convolutional Networks Via Importance Sampling IF:8 Highlight: Such an interpretation allows for the use of Monte Carlo approaches to consistently estimate the integrals, which in turn leads to a batched training scheme as we propose in this work, FastGCN. | Jie Chen; Tengfei Ma; Cao Xiao |
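The highlight for LoRA (2022, rank 4) describes freezing the pretrained weights and training only low-rank decomposition matrices. Below is a minimal sketch of that idea in PyTorch; the class name, rank, and scaling choices are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of the low-rank adaptation idea from "LoRA: Low-Rank Adaptation
# of Large Language Models" (2022, rank 4). Names and hyperparameters are
# illustrative; this is not the authors' reference implementation.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen dense layer W plus a trainable low-rank update B @ A."""

    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
        # Low-rank factors: A projects down to `rank`, B projects back up.
        # B starts at zero so the adapted layer initially matches the base layer.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x W^T + scaling * x A^T B^T; only A and B receive gradients.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


layer = LoRALinear(768, 768, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable} / {total}")  # a small fraction of the full weight
```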
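The highlight for The Curious Case Of Neural Text Degeneration (2020, rank 4) proposes Nucleus Sampling: sample only from the smallest set of tokens whose cumulative probability reaches a threshold p. A minimal NumPy sketch of that idea follows; the function name and example probabilities are illustrative.

```python
# Minimal sketch of nucleus (top-p) sampling from "The Curious Case Of Neural
# Text Degeneration" (2020, rank 4): sample from the smallest set of tokens
# whose cumulative probability reaches p. Illustrative only.
import numpy as np


def nucleus_sample(probs: np.ndarray, p: float = 0.9, rng=None) -> int:
    """Return a token index sampled from the top-p nucleus of `probs`."""
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]              # tokens sorted by probability, descending
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # smallest prefix with mass >= p
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()  # renormalize inside the nucleus
    return int(rng.choice(nucleus, p=nucleus_probs))


probs = np.array([0.5, 0.25, 0.15, 0.06, 0.04])
print(nucleus_sample(probs, p=0.9))  # samples only among the first three tokens
```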
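The mixup entry (2018, rank 4) trains on convex combinations of random example pairs and their labels. The sketch below shows one way to form such a mixed batch; the Beta(0.2, 0.2) mixing distribution and toy shapes are illustrative assumptions.

```python
# Minimal sketch of the mixup augmentation from "Mixup: Beyond Empirical Risk
# Minimization" (2018, rank 4): train on convex combinations of random example
# pairs and their labels. Shapes and alpha are illustrative choices.
import numpy as np


def mixup_batch(x: np.ndarray, y: np.ndarray, alpha: float = 0.2, rng=None):
    """Mix a batch of inputs `x` and one-hot labels `y` with a Beta-sampled weight."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)       # mixing coefficient lambda ~ Beta(alpha, alpha)
    perm = rng.permutation(len(x))     # random partner for each example
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y + (1.0 - lam) * y[perm]
    return x_mixed, y_mixed


x = np.random.rand(4, 3, 32, 32)                    # toy image batch
y = np.eye(10)[np.random.randint(0, 10, size=4)]    # one-hot labels
x_m, y_m = mixup_batch(x, y)
print(x_m.shape, y_m.shape)
```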
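The spectral normalization entry (2018, rank 5) constrains each discriminator weight matrix by its largest singular value, typically estimated with power iteration. The NumPy sketch below illustrates that estimate; it is a simplified stand-in, not the paper's exact training procedure.

```python
# Minimal sketch of spectral normalization from "Spectral Normalization for
# Generative Adversarial Networks" (2018, rank 5): estimate a weight matrix's
# largest singular value with power iteration and divide it out. Illustrative only.
import numpy as np


def spectral_normalize(W: np.ndarray, u: np.ndarray, eps: float = 1e-12):
    """One power-iteration step; returns W / sigma and the updated left vector u."""
    v = W.T @ u
    v /= np.linalg.norm(v) + eps
    u_new = W @ v
    u_new /= np.linalg.norm(u_new) + eps
    sigma = float(u_new @ W @ v)   # estimate of the largest singular value
    return W / sigma, u_new


W = np.random.randn(64, 128)
u = np.random.randn(64)
for _ in range(5):                 # a few iterations tighten the estimate
    W_sn, u = spectral_normalize(W, u)
print(np.linalg.svd(W_sn, compute_uv=False)[0])  # largest singular value approaches 1
```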