Most Influential ECCV Papers (2024-05)
The European Conference on Computer Vision (ECCV) is one of the top computer vision conferences in the world. The Paper Digest Team analyzes all papers published at ECCV in past years and presents the 15 most influential papers for each year. This ranking is constructed automatically from citations in both research papers and granted patents, and it is updated frequently to reflect the most recent changes. To find the latest version of this list, or the most influential papers from other conferences and journals, please visit the Best Paper Digest page. Note: the most influential papers may or may not include the papers that won best paper awards. (Version: 2024-05)
To search or review ECCV papers on a specific topic, please use the search by venue (ECCV) and review by venue (ECCV) services. To browse the most productive ECCV authors by year, ranked by the number of accepted papers, see the list of the most productive ECCV authors.
Based in New York, Paper Digest is dedicated to producing high-quality text analysis results that people can actually use on a daily basis. Since 2018, we have served users around the world with a number of exclusive services to track, search, review, and rewrite scientific literature.
You are welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Most Influential ECCV Papers (2024-05)
| Year | Rank | Paper | Author(s) |
|---|---|---|---|
| 2022 | 1 | Visual Prompt Tuning (IF:7). Highlight: This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning for large-scale Transformer models in vision. | MENGLIN JIA et al. |
| 2022 | 2 | TensoRF: Tensorial Radiance Fields (IF:7). Highlight: We present TensoRF, a novel approach to model and reconstruct radiance fields. | Anpei Chen; Zexiang Xu; Andreas Geiger; Jingyi Yu; Hao Su |
| 2022 | 3 | ByteTrack: Multi-Object Tracking By Associating Every Detection Box (IF:7). Highlight: Objects with low detection scores, e.g., occluded objects, are simply thrown away, which causes non-negligible missed objects and fragmented trajectories. To solve this problem, we present a simple, effective, and generic association method that tracks by associating almost every detection box instead of only the high-score ones (a minimal sketch of this two-stage association appears after the table). | YIFU ZHANG et al. |
| 2022 | 4 | BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images Via Spatiotemporal Transformers (IF:7). Highlight: In this work, we present a new framework termed BEVFormer, which learns unified BEV representations with spatiotemporal transformers to support multiple autonomous driving perception tasks. | ZHIQI LI et al. |
| 2022 | 5 | Exploring Plain Vision Transformer Backbones for Object Detection (IF:6). Highlight: We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for object detection. | Yanghao Li; Hanzi Mao; Ross Girshick; Kaiming He |
| 2022 | 6 | Simple Baselines for Image Restoration (IF:6). Highlight: In this paper, we propose a simple baseline that exceeds the SOTA methods and is computationally efficient. | Liangyu Chen; Xiaojie Chu; Xiangyu Zhang; Jian Sun |
| 2022 | 7 | Detecting Twenty-Thousand Classes Using Image-Level Supervision (IF:6). Highlight: We propose Detic, which simply trains the classifiers of a detector on image classification data and thus expands the vocabulary of detectors to tens of thousands of concepts. | Xingyi Zhou; Rohit Girdhar; Armand Joulin; Philipp Krähenbühl; Ishan Misra |
| 2022 | 8 | Make-a-Scene: Scene-Based Text-to-Image Generation with Human Priors (IF:6). Highlight: While these methods have incrementally improved the generated image fidelity and text relevancy, several pivotal gaps remain unaddressed, limiting applicability and quality. We propose a novel text-to-image method that addresses these gaps by (i) enabling a simple control mechanism complementary to text in the form of a scene, (ii) introducing elements that substantially improve the tokenization process by employing domain-specific knowledge over key image regions (faces and salient objects), and (iii) adapting classifier-free guidance for the transformer use case. | ORAN GAFNI et al. |
| 2022 | 9 | MaxViT: Multi-axis Vision Transformer (IF:6). Highlight: In this paper, we introduce an efficient and scalable attention model we call multi-axis attention, which consists of two aspects: blocked local and dilated global attention. | ZHENGZHONG TU et al. |
| 2022 | 10 | MOTR: End-to-End Multiple-Object Tracking with TRansformer (IF:5). Highlight: In this paper, we propose MOTR, which extends DETR (Carion et al., 2020) and introduces a "track query" to model the tracked instances across the entire video. | FANGAO ZENG et al. |
| 2022 | 11 | SLIP: Self-Supervision Meets Language-Image Pre-training (IF:5). Highlight: In this work, we explore whether self-supervised learning can aid in the use of language supervision for visual representation learning with Vision Transformers. | Norman Mu; Alexander Kirillov; David Wagner; Saining Xie |
| 2022 | 12 | PETR: Position Embedding Transformation for Multi-View 3D Object Detection (IF:5). Highlight: In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. | Yingfei Liu; Tiancai Wang; Xiangyu Zhang; Jian Sun |
| 2022 | 13 | Compositional Visual Generation with Composable Diffusion Models (IF:5). Highlight: In this paper, we propose an alternative structured approach for compositional generation using diffusion models. | Nan Liu; Shuang Li; Yilun Du; Antonio Torralba; Joshua B. Tenenbaum |
| 2022 | 14 | VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance (IF:5). Highlight: Current methods rely heavily on training to a specific domain (e.g., only faces), manual work or algorithm tuning for latent vector discovery, and manual effort in mask selection to alter only a part of an image. We address all of these usability constraints while producing images of high visual and semantic quality through a unique combination of OpenAI's CLIP (Radford et al., 2021), VQGAN (Esser et al., 2021), and a generation augmentation strategy to produce VQGAN-CLIP. | KATHERINE CROWSON et al. |
| 2022 | 15 | Masked Autoencoders for Point Cloud Self-Supervised Learning (IF:5). Highlight: As a promising scheme of self-supervised learning, masked autoencoding has significantly advanced natural language processing and computer vision. Inspired by this, we propose a neat scheme of masked autoencoders for point cloud self-supervised learning, addressing the challenges posed by point cloud's properties, including leakage of location information and uneven information density. | YATIAN PANG et al. |
| 2020 | 1 | End-to-End Object Detection With Transformers (IF:9). Highlight: We present a new method that views object detection as a direct set prediction problem. | NICOLAS CARION et al. |
| 2020 | 2 | NeRF: Representing Scenes As Neural Radiance Fields For View Synthesis (IF:8). Highlight: We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. | BEN MILDENHALL et al. |
| 2020 | 3 | Contrastive Multiview Coding (IF:8). Highlight: We study this hypothesis under the framework of multiview contrastive learning, where we learn a representation that aims to maximize mutual information between different views of the same scene but is otherwise compact. | Yonglong Tian; Dilip Krishnan; Phillip Isola |
| 2020 | 4 | UNITER: UNiversal Image-TExt Representation Learning (IF:8). Highlight: In this paper, we introduce UNITER, a UNiversal Image-TExt Representation, learned through large-scale pre-training over four image-text datasets (COCO, Visual Genome, Conceptual Captions, and SBU Captions), which can power heterogeneous downstream V+L tasks with joint multimodal embeddings. | YEN-CHUN CHEN et al. |
| 2020 | 5 | RAFT: Recurrent All-Pairs Field Transforms For Optical Flow (IF:8). Highlight: We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network architecture for estimating optical flow. | Zachary Teed; Jia Deng |
| 2020 | 6 | Oscar: Object-Semantics Aligned Pre-training For Vision-Language Tasks (IF:8). Highlight: Existing methods simply concatenate image region features and text features as input and use self-attention to learn image-text semantic alignments in a brute-force manner. In this paper, we propose a new learning method, Oscar, which uses object tags detected in images as anchor points to significantly ease the learning of alignments. | XIUJUN LI et al. |
| 2020 | 7 | Object-Contextual Representations For Semantic Segmentation (IF:8). Highlight: In this paper, we address the semantic segmentation problem with a focus on the context aggregation strategy. | Yuhui Yuan; Xilin Chen; Jingdong Wang |
| 2020 | 8 | Big Transfer (BiT): General Visual Representation Learning (IF:8). Highlight: We scale up pre-training and propose a simple recipe that we call Big Transfer (BiT). | ALEXANDER KOLESNIKOV et al. |
| 2020 | 9 | Contrastive Learning For Unpaired Image-to-Image Translation (IF:7). Highlight: We propose a straightforward method for doing so: maximizing mutual information between the two, using a framework based on contrastive learning. | Taesung Park; Alexei A. Efros; Richard Zhang; Jun-Yan Zhu |
| 2020 | 10 | Single Path One-Shot Neural Architecture Search With Uniform Sampling (IF:8). Highlight: This work proposes a Single Path One-Shot model to address the challenges in training. | ZICHAO GUO et al. |
| 2020 | 11 | Tracking Objects As Points (IF:7). Highlight: In this paper, we present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art. | Xingyi Zhou; Vladlen Koltun; Philipp Krähenbühl |
| 2020 | 12 | Convolutional Occupancy Networks (IF:7). Highlight: In this paper, we propose Convolutional Occupancy Networks, a more flexible implicit representation for detailed reconstruction of objects and 3D scenes. | Songyou Peng; Michael Niemeyer; Lars Mescheder; Marc Pollefeys; Andreas Geiger |
| 2020 | 13 | Rethinking Few-shot Image Classification: A Good Embedding Is All You Need? (IF:7). Highlight: In this work, we show that a simple baseline, learning a supervised or self-supervised representation on the meta-training set and then training a linear classifier on top of it, outperforms state-of-the-art few-shot learning methods. | Yonglong Tian; Yue Wang; Dilip Krishnan; Joshua B. Tenenbaum; Phillip Isola |
| 2020 | 14 | Square Attack: A Query-efficient Black-box Adversarial Attack Via Random Search (IF:7). Highlight: We propose the Square Attack, a score-based black-box $l_2$ and $l_\infty$ adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. | Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion; Matthias Hein |
| 2020 | 15 | Towards Real-Time Multi-Object Tracking (IF:7). Highlight: In this paper, we propose an MOT system that allows target detection and appearance embedding to be learned in a shared model. | Zhongdao Wang; Liang Zheng; Yixuan Liu; Yali Li; Shengjin Wang |
| 2018 | 1 | CBAM: Convolutional Block Attention Module (IF:8). Highlight: We propose the Convolutional Block Attention Module (CBAM), a simple and effective attention module that can be integrated with any feed-forward convolutional neural network (a minimal sketch appears after the table). | Sanghyun Woo; Jongchan Park; Joon-Young Lee; In So Kweon |
| 2018 | 2 | Encoder-Decoder With Atrous Separable Convolution For Semantic Image Segmentation (IF:9). Highlight: In this work, we propose to combine the advantages from both methods. | Liang-Chieh Chen; Yukun Zhu; George Papandreou; Florian Schroff; Hartwig Adam |
| 2018 | 3 | ShuffleNet V2: Practical Guidelines For Efficient CNN Architecture Design (IF:9). Highlight: Taking these factors into account, this work proposes practical guidelines for efficient network design. | Ningning Ma; Xiangyu Zhang; Hai-Tao Zheng; Jian Sun |
| 2018 | 4 | Image Super-Resolution Using Very Deep Residual Channel Attention Networks (IF:9). Highlight: To solve these problems, we propose the very deep residual channel attention networks (RCAN). | YULUN ZHANG et al. |
| 2018 | 5 | CornerNet: Detecting Objects As Paired Keypoints (IF:9). Highlight: We propose CornerNet, a new approach to object detection where we detect an object bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolutional neural network. | Hei Law; Jia Deng |
| 2018 | 6 | Group Normalization (IF:9). Highlight: In this paper, we present Group Normalization (GN) as a simple alternative to batch normalization (BN) (see the sketch after the table). | Yuxin Wu; Kaiming He |
| 2018 | 7 | Multimodal Unsupervised Image-to-image Translation (IF:9). Highlight: To address this limitation, we propose a Multimodal Unsupervised Image-to-image Translation (MUNIT) framework. | Xun Huang; Ming-Yu Liu; Serge Belongie; Jan Kautz |
| 2018 | 8 | Progressive Neural Architecture Search (IF:9). Highlight: We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. | CHENXI LIU et al. |
| 2018 | 9 | BiSeNet: Bilateral Segmentation Network For Real-time Semantic Segmentation (IF:9). Highlight: In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). | CHANGQIAN YU et al. |
| 2018 | 10 | Image Inpainting For Irregular Holes Using Partial Convolutions (IF:9). Highlight: We propose to use partial convolutions, where the convolution is masked and renormalized to be conditioned on only valid pixels. | GUILIN LIU et al. |
| 2018 | 11 | Simple Baselines For Human Pose Estimation And Tracking (IF:9). Highlight: This work provides simple and effective baseline methods. | Bin Xiao; Haiping Wu; Yichen Wei |
| 2018 | 12 | Deep Clustering For Unsupervised Learning Of Visual Features (IF:9). Highlight: In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. | Mathilde Caron; Piotr Bojanowski; Armand Joulin; Matthijs Douze |
| 2018 | 13 | Unified Perceptual Parsing For Scene Understanding (IF:8). Highlight: In this paper, we study a new task called Unified Perceptual Parsing, which requires machine vision systems to recognize as many visual concepts as possible from a given image. | Tete Xiao; Yingcheng Liu; Bolei Zhou; Yuning Jiang; Jian Sun |
| 2018 | 14 | Exploring The Limits Of Weakly Supervised Pretraining (IF:9). Highlight: In this paper, we present a unique study of transfer learning with large convolutional networks trained to predict hashtags on billions of social media images. | DHRUV MAHAJAN et al. |
| 2018 | 15 | Memory Aware Synapses: Learning What (not) To Forget (IF:8). Highlight: In this paper, we argue that, given the limited model capacity and the unlimited new information to be learned, knowledge has to be preserved or erased selectively. | Rahaf Aljundi; Francesca Babiloni; Mohamed Elhoseiny; Marcus Rohrbach; Tinne Tuytelaars |
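For readers who want a concrete feel for some of the techniques above, the sketches below illustrate three of them. ByteTrack's key idea (2022, rank 3) is two-stage association: match existing tracks to high-score detections first, then try to recover the still-unmatched tracks using the low-score, often occluded, boxes that other trackers discard. The following is a minimal, hypothetical sketch of that scheme using IoU cost and Hungarian matching; the thresholds, helper names, and the omission of motion prediction are our simplifications, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou_matrix(tracks: np.ndarray, dets: np.ndarray) -> np.ndarray:
    """Pairwise IoU between (T, 4) track boxes and (D, 4) detections, boxes as (x1, y1, x2, y2)."""
    x1 = np.maximum(tracks[:, None, 0], dets[None, :, 0])
    y1 = np.maximum(tracks[:, None, 1], dets[None, :, 1])
    x2 = np.minimum(tracks[:, None, 2], dets[None, :, 2])
    y2 = np.minimum(tracks[:, None, 3], dets[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_t = (tracks[:, 2] - tracks[:, 0]) * (tracks[:, 3] - tracks[:, 1])
    area_d = (dets[:, 2] - dets[:, 0]) * (dets[:, 3] - dets[:, 1])
    return inter / (area_t[:, None] + area_d[None, :] - inter + 1e-9)


def match(tracks, dets, iou_thresh):
    """Hungarian matching on IoU cost; returns matched pairs plus unmatched indices."""
    if len(tracks) == 0 or len(dets) == 0:
        return [], list(range(len(tracks))), list(range(len(dets)))
    cost = 1.0 - iou_matrix(tracks, dets)
    rows, cols = linear_sum_assignment(cost)
    pairs = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= 1.0 - iou_thresh]
    matched_t = {r for r, _ in pairs}
    matched_d = {c for _, c in pairs}
    return (pairs,
            [t for t in range(len(tracks)) if t not in matched_t],
            [d for d in range(len(dets)) if d not in matched_d])


def associate(track_boxes, det_boxes, det_scores, high=0.6, low=0.1):
    """Two-stage, ByteTrack-style association over one frame's detections (all NumPy arrays)."""
    high_idx = np.where(det_scores >= high)[0]
    low_idx = np.where((det_scores >= low) & (det_scores < high))[0]
    # Stage 1: associate all tracks with high-score detections.
    pairs1, unmatched, _ = match(track_boxes, det_boxes[high_idx], iou_thresh=0.3)
    # Stage 2: try to recover the leftover tracks with low-score (e.g. occluded) boxes.
    pairs2, _, _ = match(track_boxes[unmatched], det_boxes[low_idx], iou_thresh=0.5)
    matches = [(t, high_idx[d]) for t, d in pairs1]
    matches += [(unmatched[t], low_idx[d]) for t, d in pairs2]
    return matches  # list of (track index, detection index) pairs
```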
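CBAM (2018, rank 1) refines a CNN feature map with channel attention followed by spatial attention, each built from average- and max-pooled descriptors. Below is a minimal PyTorch sketch of a CBAM-style block under that description; the module names and hyperparameters, such as the reduction ratio, are illustrative assumptions rather than the reference code.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared MLP applied to both the average- and max-pooled channel descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # (B, C) from global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # (B, C) from global max pooling
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)


class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # Channel-wise average and max maps, stacked into a 2-channel input.
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class CBAM(nn.Module):
    """Channel attention first, then spatial attention, per the paper's description."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))


x = torch.randn(2, 64, 32, 32)
print(CBAM(64)(x).shape)  # torch.Size([2, 64, 32, 32]): the block is shape-preserving
```

Because the block preserves the input shape, it can be dropped after any convolutional stage of an existing backbone.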
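Group Normalization (2018, rank 6) divides the channels into groups and computes the normalization statistics within each group, so the result does not depend on the batch size. A from-scratch sketch of that computation, checked against torch.nn.functional.group_norm for reference:

```python
import torch


def group_norm(x, num_groups, weight, bias, eps=1e-5):
    # x: (N, C, H, W); normalize over each group of C // num_groups channels.
    n, c, h, w = x.shape
    g = x.view(n, num_groups, c // num_groups, h, w)
    mean = g.mean(dim=(2, 3, 4), keepdim=True)
    var = g.var(dim=(2, 3, 4), keepdim=True, unbiased=False)
    g = (g - mean) / torch.sqrt(var + eps)
    # Per-channel affine transform, as in the paper.
    return g.view(n, c, h, w) * weight.view(1, c, 1, 1) + bias.view(1, c, 1, 1)


x = torch.randn(4, 32, 8, 8)
out = group_norm(x, num_groups=8, weight=torch.ones(32), bias=torch.zeros(32))
ref = torch.nn.functional.group_norm(x, 8, torch.ones(32), torch.zeros(32))
print(torch.allclose(out, ref, atol=1e-5))  # True
```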