Paper Digest: NAACL 2022 Highlights
The North American Chapter of the Association for Computational Linguistics (NAACL) is one of the top natural language processing conferences in the world. In 2022, it is to be held in Seattle, United States.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. These models power this website and are behind our services, including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: NAACL 2022 Highlights
# | Paper | Author(s) |
---|---|---|
1 | Social Norms Guide Reference Resolution. Highlight: While some situated reference resolution is trivial, ambiguous cases arise when the language is underspecified or there are multiple candidate referents. This study investigates how pragmatic modulators external to the linguistic content are critical for the correct interpretation of referents in these scenarios. | Mitchell Abrams; Matthias Scheutz; |
2 | Learning Natural Language Generation with Truncated Reinforcement Learning. Highlight: This paper introduces TRUncated ReinForcement Learning for Language (TrufLL), an original approach to train conditional language models without a supervised learning phase, by only using reinforcement learning (RL). | Alice Martin; Guillaume Quispe; Charles Ollion; Sylvain Le Corff; Florian Strub; Olivier Pietquin; |
3 | Language Model Augmented Monotonic Attention for Simultaneous Translation. Highlight: In this work, we propose a framework to aid monotonic attention with an external language model to improve its decisions. | Sathish Reddy Indurthi; Mohd Abbas Zaidi; Beomseok Lee; Nikhil Kumar Lakumarapu; Sangha Kim; |
4 | What Makes A Good and Useful Summary? Incorporating Users in Automatic Summarization Research. Highlight: In this work we focus on university students, who make extensive use of summaries during their studies. | Maartje Ter Hoeve; Julia Kiseleva; Maarten de Rijke; |
5 | ErAConD: Error Annotated Conversational Dialog Dataset for Grammatical Error Correction. Highlight: In this paper, we present a novel GEC dataset consisting of parallel original and corrected utterances drawn from open-domain chatbot conversations; this dataset is, to our knowledge, the first GEC dataset targeted to a human-machine conversational setting. | Xun Yuan; Derek Pham; Sam Davidson; Zhou Yu; |
6 | Semantic Diversity in Dialogue with Natural Language Inference. Highlight: This paper makes two substantial contributions to improving diversity in dialogue generation. | Katherine Stasaski; Marti Hearst; |
7 | LEA: Meta Knowledge-Driven Self-Attentive Document Embedding for Few-Shot Text Classification. Highlight: In the study, we propose a novel learning method for learning how to attend, called LEA, through which meta-level attention aspects are derived based on our meta-learning strategy. | S. K. Hong; Tae Young Jang; |
8 | Enhancing Self-Attention with Knowledge-Assisted Attention Maps. Highlight: In this paper, we propose a novel and generic solution, KAM-BERT, which directly incorporates knowledge-generated attention maps into the self-attention mechanism. | Jiangang Bai; Yujing Wang; Hong Sun; Ruonan Wu; Tianmeng Yang; Pengfei Tang; Defu Cao; Mingliang Zhang; Yunhai Tong; Yaming Yang; Jing Bai; Ruofei Zhang; Hao Sun; Wei Shen; |
9 | Batch-Softmax Contrastive Loss for Pairwise Sentence Scoring Tasks. Highlight: We introduce and study a number of variations in the calculation of the loss as well as in the overall training procedure; in particular, we find that a special data shuffling can be quite important. | Anton Chernyavskiy; Dmitry Ilvovsky; Pavel Kalinin; Preslav Nakov; |
10 | NewsEdits: A News Article Revision Dataset and A Novel Document-Level Reasoning Challenge. Highlight: News article revision histories provide clues to narrative and factual evolution in news articles. To facilitate analysis of this evolution, we present the first publicly available dataset of news revision histories, NewsEdits. | Alexander Spangher; Xiang Ren; Jonathan May; Nanyun Peng; |
11 | Putting The Con in Context: Identifying Deceptive Actors in The Game of Mafia. Highlight: In this work, we analyze the effect of speaker role on language use through the game of Mafia, in which participants are assigned either an honest or a deceptive role. | Samee Ibraheem; Gaoyue Zhou; John DeNero; |
12 | SUBS: Subtree Substitution for Compositional Semantic Parsing. Highlight: We propose to use subtree substitution for compositional data augmentation, where we consider subtrees with similar semantic functions as exchangeable. | Jingfeng Yang; Le Zhang; Diyi Yang; |
13 | Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks. Highlight: This has led to partly-subjective datasets that fail to serve a clear downstream use. To address this issue, we propose two contrasting paradigms for data annotation. | Paul Rottger; Bertie Vidgen; Dirk Hovy; Janet Pierrehumbert; |
14 | Do Deep Neural Nets Display Human-like Attention in Short Answer Scoring? Highlight: This study aimed to investigate whether (and to what extent) DL-based graders align with human graders regarding the important words they identify when marking short answer questions. | Zijie Zeng; Xinyu Li; Dragan Gasevic; Guanliang Chen; |
15 | Knowledge-Grounded Dialogue Generation with A Unified Knowledge Representation. Highlight: In addition, it is challenging to generalize to the domains that require different types of knowledge sources. To address the above challenges, we present PLUG, a language model that homogenizes different knowledge sources to a unified knowledge representation for knowledge-grounded dialogue generation tasks. | Yu Li; Baolin Peng; Yelong Shen; Yi Mao; Lars Liden; Zhou Yu; Jianfeng Gao; |
16 | CERES: Pretraining of Graph-Conditioned Transformer for Semi-Structured Session Data. Highlight: Despite recent advances in self-supervised learning for text or graphs, there is a lack of self-supervised learning models that can effectively capture both intra-item semantics and inter-item interactions for semi-structured sessions. To fill this gap, we propose CERES, a graph-based transformer model for semi-structured session data. | Rui Feng; Chen Luo; Qingyu Yin; Bing Yin; Tuo Zhao; Chao Zhang; |
17 | Political Ideology and Polarization: A Multi-dimensional Approach. Highlight: Recent research has made great strides towards understanding the ideological bias (i.e., stance) of news media along the left-right spectrum. In this work, we instead take a novel and more nuanced approach for the study of ideology based on its left or right positions on the issue being discussed. | Barea Sinno; Bernardo Oviedo; Katherine Atwell; Malihe Alikhani; Junyi Jessy Li; |
18 | Cooperative Self-training of Machine Reading Comprehension. Highlight: In this work, we propose a cooperative self-training framework, RGX, for automatically generating more non-trivial question-answer pairs to improve model performance. | Hongyin Luo; Shang-Wen Li; Mingye Gao; Seunghak Yu; James Glass; |
19 | GlobEnc: Quantifying Global Token Attribution By Incorporating The Whole Encoder Layer in Transformers. Highlight: This paper introduces a novel token attribution analysis method that incorporates all the components in the encoder block and aggregates them throughout the layers. | Ali Modarressi; Mohsen Fayyaz; Yadollah Yaghoobzadeh; Mohammad Taher Pilehvar; |
20 | A Robustly Optimized BMRC for Aspect Sentiment Triplet Extraction. Highlight: Bidirectional machine reading comprehension (BMRC) can effectively deal with the ASTE task, but several problems remain, such as query conflict and unilateral probability decrease. Therefore, this paper presents a robustly optimized BMRC method by incorporating four improvements. | Shu Liu; Kaiwen Li; Zuhe Li; |
21 | Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds. Highlight: In this paper, we generalize the task of seed-guided topic discovery to allow out-of-vocabulary seeds. | Yu Zhang; Yu Meng; Xuan Wang; Sheng Wang; Jiawei Han; |
22 | Towards Process-Oriented, Modular, and Versatile Question Generation That Meets Educational Needs. Highlight: In this work, we aim to pinpoint key impediments and investigate how to improve the usability of automatic QG techniques for educational purposes by understanding how instructors construct questions and identifying touch points to enhance the underlying NLP models. | Xu Wang; Simin Fan; Jessica Houghton; Lu Wang; |
23 | SwahBERT: Language Model of Swahili. Highlight: Thanks to the recent growth of online forums and news platforms in Swahili, we introduce two Swahili datasets in this paper: a pre-training dataset of approximately 105MB with 16M words and an annotated dataset of 13K instances for the emotion classification task. | Gati Martin; Medard Edmund Mswahili; Young-Seob Jeong; |
24 | Deconstructing NLG Evaluation: Evaluation Practices, Assumptions, and Their Implications. Highlight: Combining a formative semi-structured interview study of NLG practitioners (N=18) with a survey study of a broader sample of practitioners (N=61), we surface goals, community practices, assumptions, and constraints that shape NLG evaluations, examining their implications and how they embody ethical considerations. | Kaitlyn Zhou; Su Lin Blodgett; Adam Trischler; Hal Daumé III; Kaheer Suleman; Alexandra Olteanu; |
25 | TSTR: Too Short to Represent, Summarize with Details! Intro-Guided Extended Summary Generation. Highlight: In this paper, we propose TSTR, an extractive summarizer that utilizes the introductory information of documents as pointers to their salient information. | Sajad Sotudeh; Nazli Goharian; |
26 | Empathic Machines: Using Intermediate Features As Levers to Emulate Emotions in Text-To-Speech Systems. Highlight: We present a method to control the emotional prosody of Text to Speech (TTS) systems by using phoneme-level intermediate features (pitch, energy, and duration) as levers. | Saiteja Kosgi; Sarath Sivaprasad; Niranjan Pedanekar; Anil Nelakanti; Vineet Gandhi; |
27 | The Why and The How: A Survey on Natural Language Interaction in Visualization. Highlight: In this survey, we provide an overview of natural language-based interaction in the research area of visualization. | Henrik Voigt; Özge Alaçam; Monique Meuschke; Kai Lawonn; Sina Zarrieß; |
28 | Understand Before Answer: Improve Temporal Reading Comprehension Via Precise Question Understanding. Highlight: This work studies temporal reading comprehension (TRC), which reads a free-text passage and answers temporal ordering questions. | Hao Huang; Xiubo Geng; Guodong Long; Daxin Jiang; |
29 | User-Driven Research of Medical Note Generation Software. Highlight: In this paper, we present three rounds of user studies, carried out in the context of developing a medical note generation system. | Tom Knoll; Francesco Moramarco; Alex Papadopoulos Korfiatis; Rachel Young; Claudia Ruffini; Mark Perera; Christian Perstl; Ehud Reiter; Anya Belz; Aleksandar Savkov; |
30 | Ask Me Anything in Your Native Language. Highlight: We present a novel approach based on a single encoder for query and passage, for retrieval from a multilingual collection, together with a cross-lingual generative reader. | Nikita Sorokin; Dmitry Abulkhanov; Irina Piontkovskaya; Valentin Malykh; |
31 | Diversifying Neural Dialogue Generation Via Negative Distillation. Highlight: However, its performance is hindered by two issues, ignoring low-frequency but generic responses and bringing low-frequency but meaningless responses. In this paper, we propose a novel negative training paradigm, called negative distillation, to keep the model away from the undesirable generic responses while avoiding the above problems. | Yiwei Li; Shaoxiong Feng; Bin Sun; Kan Li; |
32 | On Synthetic Data for Back Translation. Highlight: Through both theoretical and empirical studies, we identify two key factors on synthetic data controlling the back-translation NMT performance, which are quality and importance. | Jiahao Xu; Yubin Ruan; Wei Bi; Guoping Huang; Shuming Shi; Lihui Chen; Lemao Liu; |
33 | Mapping The Design Space of Human-AI Interaction in Text Summarization. Highlight: We first conducted a systematic literature review of 70 papers, developing a taxonomy of five interactions in AI-assisted text generation and relevant design dimensions. We designed text summarization prototypes for each interaction. | Ruijia Cheng; Alison Smith-Renner; Ke Zhang; Joel Tetreault; Alejandro Jaimes-Larrarte; |
34 | Towards Robust and Semantically Organised Latent Representations for Unsupervised Text Style Transfer. Highlight: We introduce EPAAEs (Embedding Perturbed Adversarial AutoEncoders), which complete this perturbation model by adding a finely adjustable noise component on the continuous embedding space. | Sharan Narasimhan; Suvodip Dey; Maunendra Desarkar; |
35 | An Exploration of Post-Editing Effectiveness in Text Summarization. Highlight: Therefore, we explored whether post-editing offers advantages in text summarization. | Vivian Lai; Alison Smith-Renner; Ke Zhang; Ruijia Cheng; Wenjuan Zhang; Joel Tetreault; Alejandro Jaimes-Larrarte; |
36 | Automatic Correction of Human Translations. Highlight: We introduce translation error correction (TEC), the task of automatically correcting human-generated translations. Imperfections in machine translations (MT) have long motivated systems for improving translations post-hoc with automatic post-editing. | Jessy Lin; Geza Kovacs; Aditya Shastry; Joern Wuebker; John DeNero; |
37 | On The Robustness of Reading Comprehension Models to Entity Renaming. Highlight: Such failures imply that models overly rely on entity information to answer questions, and thus may generalize poorly when facts about the world change or questions are asked about novel entities. To systematically audit this issue, we present a pipeline to automatically generate test examples at scale, by replacing entity names in the original test sample with names from a variety of sources, ranging from names in the same test set, to common names in life, to arbitrary strings. | Jun Yan; Yang Xiao; Sagnik Mukherjee; Bill Yuchen Lin; Robin Jia; Xiang Ren; |
38 | Explaining Why: How Instructions and User Interfaces Impact Annotator Rationales When Labeling Text Data. Highlight: We conducted a 332-participant online user study to understand how humans select rationales, especially how different instructions and user interface affordances impact the rationales chosen. | Cynthia Sullivan; William Brackenbury; Andrew McNut; Kevin Bryson; Kbyllofficial@gmail.com; Yuxin Chen; Michael Littman; Chenhao Tan; Blase Ur; |
39 | Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization. Highlight: Inspired by recent research in isotropization, we propose to improve supervised pre-training by regularizing the feature space towards isotropy. | Haode Zhang; Haowen Liang; Yuwei Zhang; Li-Ming Zhan; Xiao-Ming Wu; Xiaolei Lu; Albert Lam; |
40 | Cross-document Misinformation Detection Based on Event Graph Reasoning. Highlight: Multiple news articles may contain complementary or contradictory information that readers can leverage to help detect fake news. Inspired by this process, we propose a novel task of cross-document misinformation detection. | Xueqing Wu; Kung-Hsiang Huang; Yi Fung; Heng Ji; |
41 | Disentangled Action Recognition with Knowledge Bases. Highlight: In this paper, we aim to improve the generalization ability of the compositional action recognition model to novel verbs or novel nouns that are unseen during training time, by leveraging the power of knowledge graphs. | Zhekun Luo; Shalini Ghosh; Devin Guillory; Keizo Kato; Trevor Darrell; Huijuan Xu; |
42 | Machine-in-the-Loop Rewriting for Creative Image Captioning. Highlight: To allow the user to retain control over the content, we train a rewriting model that, when prompted, modifies specified spans of text within the user’s original draft to introduce descriptive and figurative elements in the text. | Vishakh Padmakumar; He He; |
43 | A Word Is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction. Highlight: In this paper, we experiment with a variety of adversarial attack configurations to fool three stock prediction victim models. | Yong Xie; Dakuo Wang; Pin-Yu Chen; Jinjun Xiong; Sijia Liu; Oluwasanmi Koyejo; |
44 | Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations. Highlight: The model suffers from poor performance in one-to-many and many-to-many translation with a zero-shot setup. To address this issue, this paper discusses how to practically build MNMT systems that serve arbitrary X-Y translation directions while leveraging multilinguality with a two-stage training strategy of pretraining and finetuning. | Akiko Eriguchi; Shufang Xie; Tao Qin; Hany Hassan; |
45 | Non-Autoregressive Neural Machine Translation with Consistency Regularization Optimized Variational Framework. Highlight: One of the prominent VAE-based NAT frameworks, LaNMT, achieves great improvements over vanilla models, but still suffers from two main issues which lower the translation quality: (1) mismatch between training and inference circumstances and (2) inadequacy of latent representations. In this work, we address these issues by proposing posterior consistency regularization. | Minghao Zhu; Junli Wang; Chungang Yan; |
46 | User-Centric Gender Rewriting. Highlight: In this paper, we define the task of gender rewriting in contexts involving two users (I and/or You) – first and second grammatical persons with independent grammatical gender preferences. | Bashar Alhafni; Nizar Habash; Houda Bouamor; |
47 | Reframing Human-AI Collaboration for Generating Free-Text Explanations. Highlight: We consider the task of generating free-text explanations using human-written examples in a few-shot manner. | Sarah Wiegreffe; Jack Hessel; Swabha Swayamdipta; Mark Riedl; Yejin Choi; |
48 | EmRel: Joint Representation of Entities and Embedded Relations for Multi-triple Extraction. Highlight: While existing works only explore entity representations, we propose to explicitly introduce relation representation, jointly represent it with entities, and align them in a novel way to identify valid triples. | Benfeng Xu; Quan Wang; Yajuan Lyu; Yabing Shi; Yong Zhu; Jie Gao; Zhendong Mao; |
49 | Meta Learning for Natural Language Processing: A Survey. Highlight: Our goal with this survey paper is to offer researchers pointers to relevant meta-learning works in NLP and attract more attention from the NLP community to drive future innovation. | Hung-yi Lee; Shang-Wen Li; Thang Vu; |
50 | Analyzing Modality Robustness in Multimodal Sentiment Analysis. Highlight: In this work, we hope to address that by (i) Proposing simple diagnostic checks for modality robustness in a trained multimodal model. | Devamanyu Hazarika; Yingting Li; Bo Cheng; Shuai Zhao; Roger Zimmermann; Soujanya Poria; |
51 | Fuse It More Deeply! A Variational Transformer with Layer-Wise Latent Variable Inference for Text Generation. Highlight: However, due to the sequential nature of the text, auto-regressive decoders tend to ignore latent variables and then reduce to simple language models, known as the KL vanishing problem, which would further deteriorate when VAE is combined with Transformer-based structures. To ameliorate this problem, we propose Della, a novel variational Transformer framework. | Jinyi Hu; Xiaoyuan Yi; Wenhao Li; Maosong Sun; Xing Xie; |
52 | Easy Adaptation to Mitigate Gender Bias in Multilingual Text Classification. Highlight: In this work, we treat genders as domains (e.g., male vs. female) and present a standard domain adaptation model to reduce gender bias and improve the performance of text classifiers under multilingual settings. | Xiaolei Huang; |
53 | On The Use of External Data for Spoken Named Entity Recognition. Highlight: In this work, we focus on low-resource spoken named entity recognition (NER) and address the question: Beyond self-supervised pre-training, how can we use external speech and/or text data that are not annotated for the task? | Ankita Pasad; Felix Wu; Suwon Shon; Karen Livescu; Kyu Han; |
54 | Long-term Control for Dialogue Generation: Methods and Evaluation. Highlight: In this work, we focus on constrained long-term dialogue generation, which involves more fine-grained control and requires a given set of control words to appear in generated responses. | Ramya Ramakrishnan; Hashan Narangodage; Mauro Schilman; Kilian Weinberger; Ryan McDonald; |
55 | Learning Dialogue Representations from Consecutive Utterances. Highlight: In this paper, we introduce Dialogue Sentence Embedding (DSE), a self-supervised contrastive learning method that learns effective dialogue representations suitable for a wide range of dialogue tasks. | Zhihan Zhou; Dejiao Zhang; Wei Xiao; Nicholas Dingwall; Xiaofei Ma; Andrew Arnold; Bing Xiang; |
56 | On The Machine Learning of Ethical Judgments from Natural Language. Highlight: One recent approach in this vein is the construction of NLP morality models that can take in arbitrary text and output a moral judgment about the situation described. In this work, we offer a critique of such NLP methods for automating ethical decision-making. | Zeerak Talat; Hagen Blix; Josef Valvoda; Maya Indira Ganesh; Ryan Cotterell; Adina Williams; |
57 | NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics. Highlight: Drawing inspiration from the A* search algorithm, we propose NeuroLogic A*esque, a decoding algorithm that incorporates heuristic estimates of future cost. | Ximing Lu; Sean Welleck; Peter West; Liwei Jiang; Jungo Kasai; Daniel Khashabi; Ronan Le Bras; Lianhui Qin; Youngjae Yu; Rowan Zellers; Noah Smith; Yejin Choi; |
58 | PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining. Highlight: In this paper, we present PARADISE (PARAllel & Denoising Integration in SEquence-to-sequence models), which extends the conventional denoising objective used to train these models by (i) replacing words in the noised sequence according to a multilingual dictionary, and (ii) predicting the reference translation according to a parallel corpus instead of recovering the original sequence. | Machel Reid; Mikel Artetxe; |
59 | Explaining Toxic Text Via Knowledge Enhanced Text Generation. Highlight: Previous literature has mostly focused on classifying and detecting toxic speech, and existing efforts on explaining stereotypes in toxic speech mainly use standard text generation approaches, resulting in generic and repetitive explanations. Building on these prior works, we introduce a novel knowledge-informed encoder-decoder framework to utilize multiple knowledge sources to generate implications of biased text. | Rohit Sridhar; Diyi Yang; |
60 | Teaching BERT to Wait: Balancing Accuracy and Latency for Streaming Disfluency Detection. Highlight: In this work, we propose a streaming BERT-based sequence tagging model that, combined with a novel training objective, is capable of detecting disfluencies in real-time while balancing accuracy and latency. | Angelica Chen; Vicky Zayats; Daniel Walker; Dirk Padfield; |
61 | GRAM: Fast Fine-tuning of Pre-trained Language Models for Content-based Collaborative Filtering. Highlight: However, it is resource-intensive to train a PLM-based CCF model in an end-to-end (E2E) manner, since optimization involves back-propagating through every content encoding within a given user interaction sequence. To tackle this issue, we propose GRAM (GRadient Accumulation for Multi-modality in CCF), which exploits the fact that a given item often appears multiple times within a batch of interaction histories. | Yoonseok Yang; Kyu Seok Kim; Minsam Kim; Juneyoung Park; |
62 | Generating Repetitions with Appropriate Repeated Words. Highlight: In this work, we focus on repetition generation. | Toshiki Kawamoto; Hidetaka Kamigaito; Kotaro Funakoshi; Manabu Okumura; |
63 | Textless Speech-to-Speech Translation on Real Data. Highlight: We present a textless speech-to-speech translation (S2ST) system that can translate speech from one language into another language and can be built without the need for any text data. | Ann Lee; Hongyu Gong; Paul-Ambroise Duquenne; Holger Schwenk; Peng-Jen Chen; Changhan Wang; Sravya Popuri; Yossi Adi; Juan Pino; Jiatao Gu; Wei-Ning Hsu; |
64 | WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding. Highlight: It is thus hard to compare different approaches and evaluate the benefit of weak supervision without access to a unified and systematic benchmark with diverse tasks and real-world weak labeling rules. In this paper, we propose such a benchmark, named WALNUT, to advocate and facilitate research on weak supervision for NLU. | Guoqing Zheng; Giannis Karamanolakis; Kai Shu; Ahmed Awadallah; |
65 | CompactIE: Compact Facts in Open Information Extraction. Highlight: We propose CompactIE, an OpenIE system that uses a novel pipelined approach to produce compact extractions with overlapping constituents. | Farima Fatahi Bayat; Nikita Bhutani; H. Jagadish; |
66 | CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination. Highlight: We present a baseline model based on a vision-language Transformer (i.e., LXMERT) and ablation studies. | Hyounghun Kim; Abhay Zala; Mohit Bansal; |
67 | Abstraction Not Memory: BERT and The English Article System Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we compare the performance of native English speakers and pre-trained models on the task of article prediction set up as a three-way choice (a/an, the, zero). |
Harish Tayyar Madabushi; Dagmar Divjak; Petar Milin; |
68 | OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Motivated by the fact that table-based QA requires both alignment between questions and tables and the ability to perform complicated reasoning over multiple table elements, we propose an omnivorous pretraining approach that consumes both natural and synthetic data to endow models with these respective abilities. |
Zhengbao Jiang; Yi Mao; Pengcheng He; Graham Neubig; Weizhu Chen; |
69 | Provably Confidential Language Modelling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Confidentially Redacted Training (CRT), a method to train language generation models while protecting the confidential segments. |
Xuandong Zhao; Lei Li; Yu-Xiang Wang; |
70 | KAT: A Knowledge Augmented Transformer for Vision-and-Language Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we ask a complementary question: Can multimodal transformers leverage explicit knowledge in their reasoning? |
Liangke Gui; Borui Wang; Qiuyuan Huang; Alexander Hauptmann; Yonatan Bisk; Jianfeng Gao; |
71 | When A Sentence Does Not Introduce A Discourse Entity, Transformer-based Models Still Sometimes Refer to It Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we adapt the psycholinguistic assessment of language models paradigm to higher-level linguistic phenomena and introduce an English evaluation suite that targets the knowledge of the interactions between sentential operators and indefinite NPs. |
Sebastian Schuster; Tal Linzen; |
72 | On Curriculum Learning for Commonsense Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We use paced curriculum learning to rank data and sample training mini-batches with increasing levels of difficulty from the ranked dataset during finetuning. Further, we investigate the effect of an adaptive curriculum, i.e., the data ranking is dynamically updated during training based on the current state of the learner model. |
Adyasha Maharana; Mohit Bansal; |
73 | DocTime: A Document-level Temporal Dependency Graph Parser Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce DocTime – a novel temporal dependency graph (TDG) parser that takes as input a text document and produces a temporal dependency graph. |
Puneet Mathur; Vlad Morariu; Verena Kaynig-Fittkau; Jiuxiang Gu; Franck Dernoncourt; Quan Tran; Ani Nenkova; Dinesh Manocha; Rajiv Jain; |
74 | FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning: (1) We augment the sentence selection strategy of PEGASUS’s (Zhang et al., 2019) pre-training objective to create pseudo-summaries that are both important and factual; (2) We introduce three complementary components for fine-tuning. |
David Wan; Mohit Bansal; |
75 | ScAN: Suicide Attempt and Ideation Events Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we first built Suicide Attempt and Ideation Events (ScAN) dataset, a subset of the publicly available MIMIC III dataset spanning over 12k+ EHR notes with 19k+ annotated SA and SI events information. |
Bhanu Pratap Singh Rawat; Samuel Kovaly; Hong Yu; Wilfred Pigeon; |
76 | Socially Aware Bias Measurements for Hindi Language Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the biases present in Hindi language representations such as caste and religion associated biases. |
Vijit Malik; Sunipa Dev; Akihiro Nishi; Nanyun Peng; Kai-Wei Chang; |
77 | AmbiPun: Generating Humorous Puns with Ambiguous Context Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a simple yet effective way to generate pun sentences that does not require any training on existing puns. |
Anirudh Mittal; Yufei Tian; Nanyun Peng; |
78 | EmpHi: Generating Empathetic Responses with Human-like Intents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the bias of the empathetic intents distribution between empathetic dialogue models and humans, we propose a novel model to generate empathetic responses with human-consistent empathetic intents, EmpHi for short. |
Mao Yan Chen; Siheng Li; Yujiu Yang; |
79 | Yes, No or IDK: The Challenge of Unanswerable Yes/No Questions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we extend the Yes/No QA task, adding questions with an IDK answer, and show its considerable difficulty compared to the original 2-label task. |
Elior Sulem; Jamaal Hay; Dan Roth; |
80 | Inducing and Using Alignments for Transition-based AMR Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Parsers also train on a point-estimate of the alignment pipeline, neglecting the uncertainty due to the inherent ambiguity of alignment. In this work we explore two avenues for overcoming these limitations. |
Andrew Drozdov; Jiawei Zhou; Radu Florian; Andrew McCallum; Tahira Naseem; Yoon Kim; Ramón Astudillo; |
81 | Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To facilitate flexible dependency modeling, we propose a Masked Part-of-Speech Model (MPoSM), inspired by the recent success of Masked Language Models (MLM). |
Xiang Zhou; Shiyue Zhang; Mohit Bansal; |
82 | DREAM: Improving Situational QA By First Elaborating The Situation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While we do not know how language models (LMs) answer such questions, we conjecture that they may answer more accurately if they are also provided with additional details about the question situation, elaborating the scene. To test this conjecture, we train a new model, DREAM, to answer questions that elaborate the scenes that situated questions are about, and then provide those elaborations as additional context to a question-answering (QA) model. |
Yuling Gu; Bhavana Dalvi; Peter Clark; |
83 | CoSe-Co: Text Conditioned Generative CommonSense Contextualizer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, training on symbolic KG entities limits their applicability in tasks involving natural language text where they ignore overall context. To mitigate this, we propose a CommonSense Contextualizer (CoSe-Co) conditioned on sentences as input to make it generically usable in tasks for generating knowledge relevant to the overall context of input text. |
Rachit Bansal; Milan Aggarwal; Sumit Bhatia; Jivat Kaur; Balaji Krishnamurthy; |
84 | Probing Via Prompting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the mechanism of selecting the probe model has recently been subject to intense debate, as it is not clear if the probes are merely extracting information or modelling the linguistic property themselves. To address this challenge, this paper introduces a novel model-free approach to probing via prompting, which formulates probing as a prompting task. |
Jiaoda Li; Ryan Cotterell; Mrinmaya Sachan; |
85 | Database Search Results Disambiguation for Task-Oriented Dialog Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Database Search Result (DSR) Disambiguation, a novel task that focuses on disambiguating database search results, which enhances user experience by allowing them to choose from multiple options instead of just one. |
Kun Qian; Satwik Kottur; Ahmad Beirami; Shahin Shayandeh; Paul Crook; Alborz Geramifard; Zhou Yu; Chinnadhurai Sankar; |
86 | Unsupervised Slot Schema Induction for Task-oriented Dialog Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In practical applications, manually designing schemas can be error-prone, laborious, iterative, and slow, especially when the schema is complicated. To alleviate this expensive and time-consuming process, we propose an unsupervised approach for slot schema induction from unlabeled dialog corpora. |
Dian Yu; Mingqiu Wang; Yuan Cao; Izhak Shafran; Laurent Shafey; Hagen Soltau; |
87 | Towards A Progression-Aware Autonomous Dialogue Agent Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Thus, we propose a framework in which dialogue agents can evaluate the progression of a conversation toward or away from desired outcomes, and use this signal to inform planning for subsequent responses. |
Abraham Sanders; Tomek Strzalkowski; Mei Si; Albert Chang; Deepanshu Dey; Jonas Braasch; Dakuo Wang; |
88 | Cross-Domain Detection of GPT-2-Generated Technical Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we examine the problem of detecting GPT-2-generated technical research text. |
Juan Rodriguez; Todd Hay; David Gros; Zain Shamsi; Ravi Srinivasan; |
89 | DISAPERE: A Dataset for Discourse Structure in Peer Review Discussions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present DISAPERE, a labeled dataset of 20k sentences contained in 506 review-rebuttal pairs in English, annotated by experts. |
Neha Kennard; Tim O'Gorman; Rajarshi Das; Akshay Sharma; Chhandak Bagchi; Matthew Clinton; Pranay Kumar Yelugam; Hamed Zamani; Andrew McCallum; |
90 | MultiSpanQA: A Dataset for Multi-Span Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present MultiSpanQA, a new dataset that focuses on multi-span questions. |
Haonan Li; Martin Tomko; Maria Vasardani; Timothy Baldwin; |
91 | Context-Aware Abbreviation Expansion Using Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters. |
Shanqing Cai; Subhashini Venugopalan; Katrin Tomanek; Ajit Narayanan; Meredith Morris; Michael Brenner; |
92 | Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce the sensitivity test (SeT) for measuring stereotypical associations from language models. |
Yang Cao; Anna Sotnikova; Hal Daumé III; Rachel Rudinger; Linda Zou; |
93 | Sort By Structure: Language Model Ranking As Dependency Probing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose probing to rank LMs, specifically for parsing dependencies in a given language, by measuring the degree to which labeled trees are recoverable from an LM’s contextualized embeddings. |
Max Müller-Eberstein; Rob van der Goot; Barbara Plank; |
94 | Quantifying Synthesis and Fusion and Their Impact on Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, literature in Natural Language Processing (NLP) typically labels a whole language with a strict type of morphology, e.g. fusional or agglutinative. In this work, we propose to reduce the rigidity of such claims, by quantifying morphological typology at the word and segment level. |
Arturo Oncevay; Duygu Ataman; Niels Van Berkel; Barry Haddow; Alexandra Birch; Johannes Bjerva; |
95 | Commonsense and Named Entity Aware Knowledge Grounded Dialogue Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel open-domain dialogue generation model which effectively utilizes the large-scale commonsense and named entity based knowledge in addition to the unstructured topic-specific knowledge associated with each utterance. |
Deeksha Varshney; Akshara Prabhakar; Asif Ekbal; |
96 | Efficient Hierarchical Domain Adaptation for Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a method to permit domain adaptation to many diverse domains using a computationally efficient adapter approach. |
Alexandra Chronopoulou; Matthew Peters; Jesse Dodge; |
97 | Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present HatemojiCheck, a test suite of 3,930 short-form statements that allows us to evaluate performance on hateful language expressed with emoji. |
Hannah Kirk; Bertie Vidgen; Paul Rottger; Tristan Thrush; Scott Hale; |
98 | On The Economics of Multilingual Few-shot Learning: Modeling The Cost-Performance Trade-offs of Machine Translated and Manual Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Borrowing ideas from Production functions in micro-economics, in this paper we introduce a framework to systematically evaluate the performance and cost trade-offs between machine-translated and manually-created labelled data for task-specific fine-tuning of massively multilingual language models. |
Kabir Ahuja; Monojit Choudhury; Sandipan Dandapat; |
99 | Learning to Selectively Learn for Weakly Supervised Paraphrase Generation with Model-based Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While data selection is privileged for the target task which has noisy data, developing a reinforced selective learning regime faces several unresolved challenges. In this paper, we carry on important discussions about the above problem and present a new model that could partially overcome the discussed issues with a model-based planning feature and a reward normalization feature. |
Haiyan Yin; Dingcheng Li; Ping Li; |
100 | Quality-Aware Decoding for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Despite the progress in machine translation quality estimation and evaluation in the last years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers around finding the most probable translation according to the model (MAP decoding), approximated with beam search. In this paper, we bring together these two lines of research and propose quality-aware decoding for NMT, by leveraging recent breakthroughs in reference-free and reference-based MT evaluation through various inference methods like N-best reranking and minimum Bayes risk decoding. |
Patrick Fernandes; António Farinhas; Ricardo Rei; José De Souza; Perez Ogayo; Graham Neubig; Andre Martins; |
101 | Pretrained Models for Multilingual Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore three multilingual language tasks, language modeling, machine translation, and text classification using differing federated and non-federated learning algorithms. |
Orion Weller; Marc Marone; Vladimir Braverman; Dawn Lawrie; Benjamin Van Durme; |
102 | AcTune: Uncertainty-Based Active Self-Training for Active Fine-Tuning of Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop AcTune, a new framework that improves the label efficiency of active PLM fine-tuning by unleashing the power of unlabeled data via self-training. |
Yue Yu; Lingkai Kong; Jieyu Zhang; Rongzhi Zhang; Chao Zhang; |
103 | Label Anchored Contrastive Learning for Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Intuitively, the class label itself has the intrinsic ability to perform hard positive/negative mining, which is crucial for CL. Motivated by this, we propose a novel label anchored contrastive learning approach (denoted as LaCon) for language understanding. |
Zhenyu Zhang; Yuming Zhao; Meng Chen; Xiaodong He; |
104 | Go Back in Time: Generating Flashbacks in Stories with Event Temporal Prompts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Two major issues in existing systems exacerbate the challenges: 1) temporal bias in pretraining and story datasets that leads to monotonic event temporal orders; 2) lack of explicit guidance that helps machines decide where to insert *flashbacks*. We propose to address these issues using structured storylines to encode events and their pair-wise temporal relations (before, after and vague) as **temporal prompts** that guide how stories should unfold temporally. |
Rujun Han; Hong Chen; Yufei Tian; Nanyun Peng; |
105 | Forecasting COVID-19 Caseloads Using Unsupervised Embedding Clusters of Social Media Posts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel approach incorporating transformer-based language models into infectious disease modelling. |
Felix Drinkall; Stefan Zohren; Janet Pierrehumbert; |
106 | Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we describe a way to score essays using a multi-task learning (MTL) approach, where scoring the essay holistically is the primary task, and scoring the essay traits is the auxiliary task. |
Rahul Kumar; Sandeep Mathias; Sriparna Saha; Pushpak Bhattacharyya; |
107 | Natural Language Inference with Self-Attention for Veracity Assessment of Pandemic Claims Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a comprehensive work on automated veracity assessment from dataset creation to developing novel methods based on Natural Language Inference (NLI), focusing on misinformation related to the COVID-19 pandemic. |
Miguel Arana-Catania; Elena Kochkina; Arkaitz Zubiaga; Maria Liakata; Robert Procter; Yulan He; |
108 | Beyond Emotion: A Multi-Modal Dataset for Human Desire Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a strikingly understudied task, it is difficult for machines to model and understand desire due to the unavailability of benchmarking datasets with desire and emotion labels. To bridge this gap, we present MSED, the first multi-modal and multi-task sentiment, emotion and desire dataset, which contains 9,190 text-image pairs, with English text. |
Ao Jia; Yu He; Yazhou Zhang; Sagar Uprety; Dawei Song; Christina Lioma; |
109 | Relation-Specific Attentions Over Entity Mentions for Enhanced Document-Level Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a result, the distinct semantics between the different mentions of an entity are overlooked. To address this problem, we propose RSMAN in this paper which performs selective attentions over different entity mentions with respect to candidate relations. |
Jiaxin Yu; Deqing Yang; Shuyu Tian; |
110 | Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Detecting out-of-context media, such as miscaptioned images on Twitter, is a relevant problem, especially in domains of high public significance. In this work we aim to develop defenses against such misinformation for the topics of Climate Change, COVID-19, and Military Vehicles. |
Giscard Biamby; Grace Luo; Trevor Darrell; Anna Rohrbach; |
111 | BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces a novel automatic metric BlonDe to widen the scope of automatic MT evaluation from sentence to document level. |
Yuchen Jiang; Tianyu Liu; Shuming Ma; Dongdong Zhang; Jian Yang; Haoyang Huang; Rico Sennrich; Ryan Cotterell; Mrinmaya Sachan; Ming Zhou; |
112 | Disentangled Learning of Stance and Aspect Topics for Vaccine Attitude Detection in Social Media Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, with the aim of leveraging the large amount of unannotated data now available on vaccination, we propose a novel semi-supervised approach for vaccine attitude detection, called VADet. |
Lixing Zhu; Zheng Fang; Gabriele Pergola; Robert Procter; Yulan He; |
113 | SKILL: Structured Knowledge Infusion for Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a method to infuse structured knowledge into LLMs, by directly training T5 models on factual triples of knowledge graphs (KGs). |
Fedor Moiseev; Zhe Dong; Enrique Alfonseca; Martin Jaggi; |
114 | Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar. |
Karolina Stanczak; Edoardo Ponti; Lucas Torroba Hennigen; Ryan Cotterell; Isabelle Augenstein; |
115 | Aspect Is Not You Need: No-aspect Differential Sentiment Framework for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we analyze the ABSA task from a novel cognition perspective: humans can often judge the sentiment of an aspect even if they do not know what the aspect is. |
Jiahao Cao; Rui Liu; Huailiang Peng; Lei Jiang; Xu Bai; |
116 | MoEBERT: from BERT to Mixture-of-Experts Via Importance-Guided Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose MoEBERT, which uses a Mixture-of-Experts structure to increase model capacity and inference speed. |
Simiao Zuo; Qingru Zhang; Chen Liang; Pengcheng He; Tuo Zhao; Weizhu Chen; |
117 | Implicit N-grams Induced By Recurrence Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a study that shows there actually exist some explainable components that reside within the hidden states, which are reminiscent of the classical n-grams features. |
Xiaobing Sun; Wei Lu; |
118 | Guiding Visual Question Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Guiding Visual Question Generation – a variant of VQG which conditions the question generator on categorical information based on expectations on the type of question and the objects it should explore. |
Nihir Vedd; Zixu Wang; Marek Rei; Yishu Miao; Lucia Specia; |
119 | OPERA: Operation-Pivoted Discrete Reasoning Over Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they ignore the utilization of symbolic operations and encounter a lack of reasoning ability and interpretability. To inherit the advantages of these two types of methods, we propose OPERA, an operation-pivoted discrete reasoning framework, where lightweight symbolic operations (compared with logical forms) as neural modules are utilized to facilitate the reasoning ability and interpretability. |
Yongwei Zhou; Junwei Bao; Chaoqun Duan; Haipeng Sun; Jiahui Liang; Yifan Wang; Jing Zhao; Youzheng Wu; Xiaodong He; Tiejun Zhao; |
120 | Improving Multi-Document Summarization Through Referenced Flexible Extraction with Credit-Awareness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an extract-then-abstract Transformer framework to overcome the problem. |
Yun-Zhu Song; Yi-Syuan Chen; Hong-Han Shuai; |
121 | Improving Constituent Representation with Hypertree Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we aim to improve representations of constituent spans using a novel hypertree neural networks (HTNN) that is structured with constituency parse trees. |
Hao Zhou; Gongshen Liu; Kewei Tu; |
122 | Measuring Fairness with Biased Rulers: A Comparative Study on Bias Metrics for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We survey the literature on fairness metrics for pre-trained language models and experimentally evaluate compatibility, including both biases in language models and in their downstream tasks. |
Pieter Delobelle; Ewoenam Tokpo; Toon Calders; Bettina Berendt; |
123 | MuCPAD: A Multi-Domain Chinese Predicate-Argument Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In order to facilitate research on cross-domain SRL, this paper presents MuCPAD, a multi-domain Chinese predicate-argument dataset, which consists of 30,897 sentences and 92,051 predicates from six different domains. |
Yahui Liu; Haoping Yang; Chen Gong; Qingrong Xia; Zhenghua Li; Min Zhang; |
124 | Representation Learning for Conversational Data Using Discourse Mutual Information Maximization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, we propose a structure-aware Mutual Information based loss-function DMI (Discourse Mutual Information) for training dialog-representation models, that additionally captures the inherent uncertainty in response prediction. |
Bishal Santra; Sumegh Roychowdhury; Aishik Mandal; Vasu Gurram; Atharva Naik; Manish Gupta; Pawan Goyal; |
125 | ValCAT: Variable-Length Contextualized Adversarial Transformations Using Encoder-Decoder Language Model Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose ValCAT, a black-box attack framework that misleads the language model by applying variable-length contextualized transformations to the original text. |
Chuyun Deng; Mingxuan Liu; Yue Qin; Jia Zhang; Hai-Xin Duan; Donghong Sun; |
126 | A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we conduct a systematic study on the syntactic multi-modality problem. |
Kexun Zhang; Rui Wang; Xu Tan; Junliang Guo; Yi Ren; Tao Qin; Tie-Yan Liu; |
127 | CIAug: Equipping Interpolative Augmentation with Curriculum Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose CIAug, a novel curriculum-based learning method that builds upon mixup. |
Ramit Sawhney; Ritesh Soun; Shrey Pandit; Megh Thakkar; Sarvagya Malaviya; Yuval Pinter; |
128 | Proposition-Level Clustering for Multi-Document Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we revisit the clustering approach, grouping together sub-sentential propositions, aiming at more precise information alignment. |
Ori Ernst; Avi Caciularu; Ori Shapira; Ramakanth Pasunuru; Mohit Bansal; Jacob Goldberger; Ido Dagan; |
129 | Non-Autoregressive Machine Translation: It’s Not As Fast As It Seems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we point out flaws in the evaluation methodology present in the literature on NAR models and we provide a fair comparison between a state-of-the-art NAR model and the autoregressive submissions to the shared task. |
Jindrich Helcl; Barry Haddow; Alexandra Birch; |
130 | BAD-X: Bilingual Adapters Improve Zero-Shot Cross-Lingual Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we show that it is more effective to learn bilingual language pair adapters (BAs) when the goal is to optimize performance for a particular source-target transfer direction. |
Marinela Parovic; Goran Glavaš; Ivan Vulic; Anna Korhonen; |
131 | Combining Humor and Sarcasm for Improving Political Parody Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper explores jointly modelling these figurative tropes with the goal of improving performance of political parody detection in tweets. To this end, we present a multi-encoder model that combines three parallel encoders to enrich parody-specific representations with humor and sarcasm information. |
Xiao Ao; Danae Sanchez Villegas; Daniel Preotiuc-Pietro; Nikolaos Aletras; |
132 | TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a Topological Information Enhanced model (TIE), which transforms the token-level task into a tag-level task by introducing a two-stage process (i.e. node locating and answer refining). |
Zihan Zhao; Lu Chen; Ruisheng Cao; Hongshen Xu; Xingyu Chen; Kai Yu; |
133 | RSTGen: Imbuing Fine-Grained Interpretable Control Into Long-Form Text Generators Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the task of improving the cohesion and coherence of long-form text generated by language models. |
Rilwan Adewoyin; Ritabrata Dutta; Yulan He; |
134 | Intent Detection and Discovery from User Logs Via Deep Semi-Supervised Contrastive Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unlike existing approaches that rely on epoch wise cluster alignment, we propose an end-to-end deep contrastive clustering algorithm that jointly updates model parameters and cluster centers via supervised and self-supervised learning and optimally utilizes both labeled and unlabeled data. |
Rajat Kumar; Mayur Patidar; Vaibhav Varshney; Lovekesh Vig; Gautam Shroff; |
135 | Extending Multi-Text Sentence Fusion Resources Via Pyramid Annotations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we revisit and substantially extend previous dataset creation efforts. |
Daniela Brook Weiss; Paul Roit; Ori Ernst; Ido Dagan; |
136 | The Devil Is in The Details: On The Pitfalls of Vocabulary Selection in Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a model of vocabulary selection, integrated into the neural translation model, that predicts the set of allowed output words from contextualized encoder representations. |
Tobias Domhan; Eva Hasler; Ke Tran; Sony Trenous; Bill Byrne; Felix Hieber; |
137 | MultiCite: Modeling Realistic Citations Requires Moving Beyond The Single-sentence Single-label Setting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, recent work in CCA is often approached as a single-sentence, single-label classification task, and thus many datasets used to develop modern computational approaches fail to capture this interesting discourse. To address this research gap, we highlight three understudied phenomena for CCA and release MULTICITE, a new dataset of 12.6K citation contexts from 1.2K computational linguistics papers that fully models these phenomena. |
Anne Lauscher; Brandon Ko; Bailey Kuehl; Sophie Johnson; Arman Cohan; David Jurgens; Kyle Lo; |
138 | DEGREE: A Data-Efficient Generation-Based Event Extraction Model Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on low-resource end-to-end event extraction and propose DEGREE, a data-efficient model that formulates event extraction as a conditional generation problem. |
I-Hung Hsu; Kuan-Hao Huang; Elizabeth Boschee; Scott Miller; Prem Natarajan; Kai-Wei Chang; Nanyun Peng; |
139 | Bridging The Gap Between Language Models and Cross-Lingual Sequence Labeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first design a pre-training task tailored for xSL named Cross-lingual Language Informative Span Masking (CLISM) to eliminate the objective gap in a self-supervised manner. Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage the consistency between representations of input parallel sequences via unsupervised cross-lingual instance-wise training signals during pre-training. |
Nuo Chen; Linjun Shou; Ming Gong; Jian Pei; Daxin Jiang; |
140 | Hero-Gang Neural Model For Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Unfortunately, although these models can capture effective global context information, they are still limited in the local feature and position information extraction, which is critical in NER. In this paper, to address this limitation, we propose a novel Hero-Gang Neural structure (HGN), including the Hero and Gang module, to leverage both global and local information to promote NER. |
Jinpeng Hu; Yaling Shen; Yang Liu; Xiang Wan; Tsung-Hui Chang; |
141 | MGIMN: Multi-Grained Interactive Matching Network for Few-shot Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To deal with these issues, we propose a meta-learning based method MGIMN which performs instance-wise comparison followed by aggregation to generate class-wise matching vectors instead of prototype learning. |
Jianhai Zhang; Mieradilijiang Maimaiti; Gao Xing; Yuanhang Zheng; Ji Zhang; |
142 | All You May Need for VQA Are Image Captions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a method that automatically derives VQA examples at volume, by leveraging the abundance of existing image-caption annotations combined with neural models for textual question generation. |
Soravit Changpinyo; Doron Kukliansy; Idan Szpektor; Xi Chen; Nan Ding; Radu Soricut; |
143 | Frustratingly Easy System Combination for Grammatical Error Correction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we formulate system combination for grammatical error correction (GEC) as a simple machine learning task: binary classification. |
Muhammad Qorib; Seung-Hoon Na; Hwee Tou Ng; |
144 | Simple Local Attentions Remain Competitive for Long-Context Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Despite the abundance of research along this direction, it is still difficult to gauge the relative effectiveness of these models in practical use cases, e.g., if we apply these models following the pretrain-and-finetune paradigm. In this work, we aim to conduct a thorough analysis of these emerging models with large-scale and controlled experiments. |
Wenhan Xiong; Barlas Oguz; Anchit Gupta; Xilun Chen; Diana Liskovich; Omer Levy; Scott Yih; Yashar Mehdad; |
145 | Even The Simplest Baseline Needs Careful Re-investigation: A Case Study on XML-CNN Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, through an astonishing example, we argue that more effort should be devoted to ensuring genuine progress when developing a new deep learning method. |
Si-An Chen; Jie-jyun Liu; Tsung-Han Yang; Hsuan-Tien Lin; Chih-Jen Lin; |
146 | Multi-Relational Graph Transformer for Automatic Short Answer Grading Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Most existing methods utilize sequential context to compare two sentences and ignore the structural context of the sentence; therefore, these methods may not result in the desired performance. In this paper, we overcome this problem by proposing a Multi-Relational Graph Transformer, MitiGaTe, to prepare token representations considering the structural context. |
Rajat Agarwal; Varun Khurana; Karish Grover; Mukesh Mohania; Vikram Goyal; |
147 | Event Schema Induction with Double Graph Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new event schema induction framework using double graph autoencoders, which captures the global dependencies among nodes in event graphs. |
Xiaomeng Jin; Manling Li; Heng Ji; |
148 | CS1QA: A Dataset for Assisting Code-based Question Answering in An Introductory Programming Course Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce CS1QA, a dataset for code-based question answering in the programming education domain. |
Changyoon Lee; Yeon Seonwoo; Alice Oh; |
149 | Unsupervised Cross-Lingual Transfer of Structured Predictors Without Source Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To that end, we generalise methods for unsupervised transfer from multiple input models for structured prediction. |
Kemal Kurniawan; Lea Frermann; Philip Schulz; Trevor Cohn; |
150 | Don’t Take It Literally: An Edit-Invariant Sequence Loss for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the challenge, we propose a novel Edit-Invariant Sequence Loss (EISL), which computes the matching loss of a target n-gram with all n-grams in the generated sequence. |
Guangyi Liu; Zichao Yang; Tianhua Tao; Xiaodan Liang; Junwei Bao; Zhen Li; Xiaodong He; Shuguang Cui; Zhiting Hu; |
151 | Modeling Exemplification in Long-form Question Answering Via Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we provide the first computational study of exemplification in QA, performing a fine-grained annotation of different types of examples (e.g., hypotheticals, anecdotes) in three corpora. |
Shufan Wang; Fangyuan Xu; Laure Thompson; Eunsol Choi; Mohit Iyyer; |
152 | D2U: Distance-to-Uniform Learning for Out-of-Scope Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we propose a zero-shot post-processing step, called Distance-to-Uniform (D2U), exploiting not only the classification confidence score, but the shape of the entire output distribution. |
Eyup Yilmaz; Cagri Toraman; |
153 | Reference-free Summarization Evaluation Via Semantic Correlation and Compression Ratio Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new automatic reference-free evaluation metric that compares semantic distribution between source document and summary by pretrained language models and considers summary compression ratio. |
Yizhu Liu; Qi Jia; Kenny Zhu; |
154 | KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present our KroneckerBERT, a compressed version of the BERT_BASE model obtained by compressing the embedding layer and the linear mappings in the multi-head attention, and the feed-forward network modules in the Transformer layers. |
Marzieh Tahaei; Ella Charlaix; Vahid Nia; Ali Ghodsi; Mehdi Rezagholizadeh; |
155 | Building A Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study the challenge of imposing roles on open-domain dialogue systems, with the goal of making the systems maintain consistent roles while conversing naturally with humans. |
Sanghwan Bae; Donghyun Kwak; Sungdong Kim; Donghoon Ham; Soyoung Kang; Sang-Woo Lee; Woomyoung Park; |
156 | Sentence-Level Resampling for Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To alleviate data imbalance, we propose a set of sentence-level resampling methods where the importance of each training sentence is computed based on its tokens and entities. |
Xiaochen Wang; Yue Wang; |
157 | Word Tour: One-dimensional Word Embeddings Via The Traveling Salesman Problem Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we propose WordTour, unsupervised one-dimensional word embeddings. |
Ryoma Sato; |
158 | On The Diversity and Limits of Human Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our goal is to provide an overview of the diversity of explanations, discuss human limitations in providing explanations, and ultimately provide implications for collecting and using human explanations in NLP. |
Chenhao Tan; |
159 | Locally Aggregated Feature Attribution on Natural Language Model Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Locally Aggregated Feature Attribution (LAFA), a novel gradient-based feature attribution method for NLP models. |
Sheng Zhang; Jin Wang; Haitao Jiang; Rui Song; |
160 | Generic and Trend-aware Curriculum Learning for Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a generic and trend-aware curriculum learning approach that effectively integrates textual and structural information in text graphs for relation extraction between entities, which we consider as node pairs in graphs. |
Nidhi Vakil; Hadi Amiri; |
161 | On Systematic Style Differences Between Unsupervised and Supervised MT and An Application for High-Resource Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We compare translations from supervised and unsupervised MT systems of similar quality, finding that unsupervised output is more fluent and more structurally different in comparison to human translation than is supervised MT. We then demonstrate a way to combine the benefits of both methods into a single system which results in improved adequacy and fluency as rated by human evaluators. |
Kelly Marchisio; Markus Freitag; David Grangier; |
162 | Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work introduces a method to incorporate evidentiality of passages (whether a passage contains correct evidence to support the output) into training the generator. |
Akari Asai; Matt Gardner; Hannaneh Hajishirzi; |
163 | Modularized Transfer Learning with Multiple Knowledge Graphs for Zero-shot Commonsense Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Considering the increasing type of different commonsense KGs, this paper aims to extend the zero-shot transfer learning scenario into multiple-source settings, where different KGs can be utilized synergetically. |
Yu Jin Kim; Beong-woo Kwak; Youngwook Kim; Reinald Kim Amplayo; Seung-won Hwang; Jinyoung Yeo; |
164 | Learning to Express in Knowledge-Grounded Conversation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we mainly consider two aspects of knowledge expression, namely the structure of the response and style of the content in each part. |
Xueliang Zhao; Tingchen Fu; Chongyang Tao; Wei Wu; Dongyan Zhao; Rui Yan; |
165 | End-to-End Chinese Speaker Identification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To make large end-to-end models possible, we design a new annotation guideline that regards SI as span extraction from the local context, and we annotate by far the largest SI dataset for Chinese named CSI based on eighteen novels. |
Dian Yu; Ben Zhou; Dong Yu; |
166 | MINION: A Large-Scale and Diverse Dataset for Multilingual Event Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In addition, the current datasets are often small and not accessible to the public. To overcome those shortcomings, we introduce a new large-scale multilingual dataset for ED (called MINION) that consistently annotates events for 8 different languages; 5 of them have not been supported by existing multilingual datasets. |
Amir Pouran Ben Veyseh; Minh Van Nguyen; Franck Dernoncourt; Thien Nguyen; |
167 | Do Prompt-Based Models Really Understand The Meaning of Their Prompts? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we experiment with over 30 prompts manually written for natural language inference (NLI). |
Albert Webson; Ellie Pavlick; |
168 | GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the novel unsupervised domain adaptation method Generative Pseudo Labeling (GPL), which combines a query generator with pseudo labeling from a cross-encoder. |
Kexin Wang; Nandan Thakur; Nils Reimers; Iryna Gurevych; |
169 | Sparse Distillation: Speeding Up Text Classification By Using Bigger Student Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Therefore, the improved inference speed may still be unsatisfactory for real-time or high-volume use cases. In this paper, we aim to further push the limit of inference speed by distilling teacher models into bigger, sparser student models – bigger in that they scale up to billions of parameters; sparser in that most of the model parameters are n-gram embeddings. |
Qinyuan Ye; Madian Khabsa; Mike Lewis; Sinong Wang; Xiang Ren; Aaron Jaech; |
170 | Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we extend the line of BERTology work by focusing on the important, yet less explored, alignment of pre-trained and fine-tuned PLMs with large-scale discourse structures. |
Patrick Huber; Giuseppe Carenini; |
171 | SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In contrast, we propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for RE. |
Yuxin Xiao; Zecheng Zhang; Yuning Mao; Carl Yang; Jiawei Han; |
172 | LITE: Intent-based Task Representation Learning Using Weak Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, understanding and representing their meaning is the first step towards providing intelligent assistance for to-do management. We address this problem by proposing a neural multi-task learning framework, LITE, which extracts representations of English to-do tasks with a multi-head attention mechanism on top of a pre-trained text encoder. |
Naoki Otani; Michael Gamon; Sujay Kumar Jauhar; Mei Yang; Sri Raghu Malireddi; Oriana Riva; |
173 | Does Summary Evaluation Survive Translation to Other Languages? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To investigate how much we can trust machine translation of summarization datasets, we translate the English SummEval dataset to seven languages and compare performances across automatic evaluation measures. |
Spencer Braun; Oleg Vasilyev; Neslihan Iskender; John Bohannon; |
174 | A Shoulder to Cry On: Towards A Motivational Virtual Assistant for Assuaging Mental Agony Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a VA that can act as the first point of contact and comfort for mental health patients. |
Tulika Saha; Saichethan Reddy; Anindya Das; Sriparna Saha; Pushpak Bhattacharyya; |
175 | SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization Via Negative Sampling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a proof-of-concept study to a weakly supervised summary evaluation approach without the presence of reference summaries. |
Forrest Bao; Ge Luo; Hebi Li; Minghui Qiu; Yinfei Yang; Youbiao He; Cen Chen; |
176 | Combating The Curse of Multilinguality in Cross-Lingual WSD By Aligning Sparse Contextualized Word Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we advocate for using large pre-trained monolingual language models in cross-lingual zero-shot word sense disambiguation (WSD) coupled with a contextualized mapping mechanism. |
Gábor Berend; |
177 | Cheat Codes to Quantify Missing Source Information in Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper describes a method to quantify the amount of information H(t|s) added by the target sentence t that is not present in the source s in a neural machine translation system. |
Proyag Pal; Kenneth Heafield; |
178 | WiC = TSV = WSD: On The Equivalence of Three Semantic Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we establish the exact relationship between WiC and WSD, as well as the related task of target sense verification (TSV). |
Bradley Hauer; Grzegorz Kondrak; |
179 | What Do Tokens Know About Their Characters and How Do They Know It? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, studying a range of models (e.g., GPT-J, BERT, RoBERTa, GloVe), we probe what word pieces encode about character-level information by training classifiers to predict the presence or absence of a particular alphabetical character in a token, based on its embedding (e.g., probing whether the model embedding for cat encodes that it contains the character a). |
Ayush Kaushal; Kyle Mahowald; |
180 | AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work introduces a novel dataset of 4,631 CQA threads for answer summarization curated by professional linguists. |
Alexander Fabbri; Xiaojian Wu; Srini Iyer; Haoran Li; Mona Diab; |
181 | Paragraph-based Transformer Pre-training for Multi-Sentence Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we first show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks. We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences. |
Luca Di Liello; Siddhant Garg; Luca Soldaini; Alessandro Moschitti; |
182 | Text Style Transfer Via Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel method based on Optimal Transport for TST to simultaneously incorporate syntactic and semantic information into similarity computation between the source and the converted text. |
Nasim Nouri; |
183 | Exploring The Role of Task Transferability in Large-Scale Multi-Task Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we aim to disentangle the effect of scale and relatedness of tasks in multi-task representation learning. |
Vishakh Padmakumar; Leonard Lausen; Miguel Ballesteros; Sheng Zha; He He; George Karypis; |
184 | Interactive Query-Assisted Summarization Via Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To that end, we propose two novel deep reinforcement learning models for the task that address, respectively, the subtask of summarizing salient information that adheres to user queries, and the subtask of listing suggested queries to assist users throughout their exploration. |
Ori Shapira; Ramakanth Pasunuru; Mohit Bansal; Ido Dagan; Yael Amsterdamer; |
185 | Data Augmentation with Dual Training for Offensive Span Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: One of the challenges to train a model for this novel setting is the lack of enough training data. To address this limitation, in this work we propose a novel method in which the large-scale pre-trained language model GPT-2 is employed to generate synthetic training data for OSD. |
Nasim Nouri; |
186 | Training Mixed-Domain Translation Models Via Federated Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we leverage federated learning (FL) in order to tackle the problem. |
Peyman Passban; Tanya Roosta; Rahul Gupta; Ankit Chadha; Clement Chung; |
187 | QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we conduct an extensive comparison of entailment and QA-based metrics, demonstrating that carefully choosing the components of a QA-based metric, especially question generation and answerability classification, is critical to performance. |
Alexander Fabbri; Chien-Sheng Wu; Wenhao Liu; Caiming Xiong; |
188 | How Gender Debiasing Affects Internal Model Representations, and Why It Matters Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we illuminate this relationship by measuring both quantities together: we debias a model during downstream fine-tuning, which reduces extrinsic bias, and measure the effect on intrinsic bias, which is operationalized as bias extractability with information-theoretic probing. |
Hadas Orgad; Seraphina Goldfarb-Tarrant; Yonatan Belinkov; |
189 | A Structured Span Selector Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel grammar-based structured span selection model which learns to make use of the partial span-level annotation provided for such problems. |
Tianyu Liu; Yuchen Jiang; Ryan Cotterell; Mrinmaya Sachan; |
190 | Unified Semantic Typing with Meaningful Label Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present UniST, a unified framework for semantic typing that captures label semantics by projecting both inputs and labels into a joint semantic embedding space. |
James Y. Huang; Bangzheng Li; Jiashu Xu; Muhao Chen; |
191 | Learning To Retrieve Prompts for In-Context Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose an efficient method for retrieving prompts for in-context learning using annotated data and an LM. |
Ohad Rubin; Jonathan Herzig; Jonathan Berant; |
192 | Necessity and Sufficiency for Explaining Text Classifiers: A Case Study in Hate Speech Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a novel feature attribution method for explaining text classifiers, and analyze it in the context of hate speech detection. |
Esma Balkir; Isar Nejadgholi; Kathleen Fraser; Svetlana Kiritchenko; |
193 | Learning to Retrieve Passages Without Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive performance by training on large datasets of question-passage pairs. In this work we ask whether this dependence on labeled data can be reduced via unsupervised pretraining that is geared towards ODQA. |
Ori Ram; Gal Shachaf; Omer Levy; Jonathan Berant; Amir Globerson; |
194 | Re2G: Retrieve, Rerank, Generate Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We build on this line of research, proposing Re2G, which combines both neural initial retrieval and reranking into a BART-based sequence-to-sequence generation. |
Michael Glass; Gaetano Rossiello; Md Faisal Mahbub Chowdhury; Ankita Naik; Pengshan Cai; Alfio Gliozzo; |
195 | Don’t Sweat The Small Stuff, Classify The Rest: Sample Shielding to Protect Text Classifiers Against Adversarial Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We shield three popular DL text classifiers with Sample Shielding and test their resilience against four SOTA attackers across three datasets in a realistic threat setting. |
Jonathan Rusert; Padmini Srinivasan; |
196 | Federated Learning with Noisy User Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a strategy for training FL models using positive and negative user feedback. |
Rahul Sharma; Anil Ramakrishna; Ansel MacLaughlin; Anna Rumshisky; Jimit Majmudar; Clement Chung; Salman Avestimehr; Rahul Gupta; |
197 | Gender Bias in Masked Language Models for Multiple Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Multilingual Bias Evaluation (MBE) score, to evaluate bias in various languages using only English attribute word lists and parallel corpora between the target language and English without requiring manually annotated data. |
Masahiro Kaneko; Aizhan Imankulova; Danushka Bollegala; Naoaki Okazaki; |
198 | Multi-Domain Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address this scenario, we present a multi-domain TSA system based on augmenting a given training set with diverse weak labels from assorted domains. |
Orith Toledo-Ronen; Matan Orbach; Yoav Katz; Noam Slonim; |
199 | Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, state-of-the-art NLI models perform poorly in this context due to their inability to generalize to the target task. In this work, we show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples. |
Prasetya Utama; Joshua Bambrick; Nafise Moosavi; Iryna Gurevych; |
200 | Dynamic Gazetteer Integration in Multilingual Models for Cross-Lingual and Cross-Domain Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an approach that with limited effort and data, addresses the NER knowledge gap across languages and domains. |
Besnik Fetahu; Anjie Fang; Oleg Rokhlenko; Shervin Malmasi; |
201 | MetaICL: Learning to Learn In Context Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks. |
Sewon Min; Mike Lewis; Luke Zettlemoyer; Hannaneh Hajishirzi; |
202 | Enhancing Knowledge Selection for Grounded Dialogues Via Document Semantic Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing models treat knowledge selection as a sentence ranking or classification problem where each sentence is handled individually, ignoring the internal semantic connection between sentences. In this work, we propose to automatically convert the background knowledge documents into document semantic graphs and then perform knowledge selection over such graphs. |
Sha Li; Mahdi Namazifar; Di Jin; Mohit Bansal; Heng Ji; Yang Liu; Dilek Hakkani-Tur; |
203 | Using Natural Sentence Prompts for Understanding Biases in Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This dependence traces back to the need of prompt-style dataset to trigger specific behaviors of language models. In this paper, we address this gap by creating a prompt dataset with respect to occupations collected from real-world natural sentences present in Wikipedia. |
Sarah Alnegheimish; Alicia Guo; Yi Sun; |
204 | Robust Conversational Agents Against Imperceptible Toxicity Triggers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose attacks against conversational agents that are imperceptible, i.e., they fit the conversation in terms of coherency, relevancy, and fluency, while they are effective and scalable, i.e., they can automatically trigger the system into generating toxic language. |
Ninareh Mehrabi; Ahmad Beirami; Fred Morstatter; Aram Galstyan; |
205 | Selective Differential Privacy for Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Given that the private information in natural language is sparse (for example, the bulk of an email might not carry personally identifiable information), we propose a new privacy notion, selective differential privacy, to provide rigorous privacy guarantees on the sensitive portion of the data to improve model utility. |
Weiyan Shi; Aiqi Cui; Evan Li; Ruoxi Jia; Zhou Yu; |
206 | Do Trajectories Encode Verb Meaning? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the extent to which trajectories (i.e. the position and rotation of objects over time) naturally encode verb semantics. |
Dylan Ebert; Chen Sun; Ellie Pavlick; |
207 | Long Context Question Answering Via Supervised Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel method for equipping long-context QA models with an additional sequence-level objective for better identification of the supporting evidence. |
Avi Caciularu; Ido Dagan; Jacob Goldberger; Arman Cohan; |
208 | The USMLE Step 2 Clinical Skills Patient Note Corpus Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a corpus of 43,985 clinical patient notes (PNs) written by 35,156 examinees during the high-stakes USMLE Step 2 Clinical Skills examination. |
Victoria Yaneva; Janet Mee; Le Ha; Polina Harik; Michael Jodoin; Alex Mechaber; |
209 | Learning to Borrow: Relation Representation for Without-Mention Entity-Pairs for Knowledge Graph Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a supervised borrowing method, SuperBorrow, that learns to score the suitability of an LDP to represent a without-mention entity pair using pre-trained entity embeddings and contextualised LDP representations. |
Huda Hakami; Mona Hakami; Angrosh Mandya; Danushka Bollegala; |
210 | Improving Entity Disambiguation By Reasoning Over A Knowledge Base Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To allow the use of all KB facts, as well as descriptions and types, we introduce an ED model which links entities by reasoning over a symbolic knowledge base in a fully differentiable fashion. |
Tom Ayoola; Joseph Fisher; Andrea Pierleoni; |
211 | Modal Dependency Parsing Via Language Model Priming Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We design a modal dependency parser that is based on priming pre-trained language models, and evaluate the parser on two data sets. |
Jiarui Yao; Nianwen Xue; Bonan Min; |
212 | Document-Level Relation Extraction with Sentences Importance Estimation and Focusing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a Sentence Importance Estimation and Focusing (SIEF) framework for DocRE, where we design a sentence importance score and a sentence focusing loss, encouraging DocRE models to focus on evidence sentences. |
Wang Xu; Kehai Chen; Lili Mou; Tiejun Zhao; |
213 | Are All The Datasets in Benchmark Necessary? A Pilot Study of Dataset Evaluation for Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we ask the research question of whether all the datasets in the benchmark are necessary. |
Yang Xiao; Jinlan Fu; See-Kiong Ng; Pengfei Liu; |
214 | Triggerless Backdoor Attack for NLP Tasks with Clean Labels Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To generate poisoned clean-labeled examples, we propose a sentence generation model based on the genetic algorithm to cater to the non-differentiable characteristic of text data. |
Leilei Gan; Jiwei Li; Tianwei Zhang; Xiaoya Li; Yuxian Meng; Fei Wu; Yi Yang; Shangwei Guo; Chun Fan; |
215 | PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Large language models (LMs) based on Transformers can generate plausible long texts. In this paper, we explore how this generation can be further controlled at decoding time to satisfy certain constraints (e.g. being non-toxic, conveying certain emotions, using a specific writing style, etc.) without fine-tuning the LM. |
Antoine Chaffin; Vincent Claveau; Ewa Kijak; |
216 | Interpretable Proof Generation Via Iterative Backward Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present IBR, an Iterative Backward Reasoning model to solve the proof generation tasks on rule-based Question Answering (QA), where models are required to reason over a series of textual rules and facts to find out the related proof path and derive the final answer. |
Hanhao Qu; Yu Cao; Jun Gao; Liang Ding; Ruifeng Xu; |
217 | Domain Confused Contrastive Learning for Unsupervised Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study Unsupervised Domain Adaptation (UDA) in a challenging self-supervised approach. |
Quanyu Long; Tianze Luo; Wenya Wang; Sinno Pan; |
218 | Incorporating Centering Theory Into Neural Coreference Resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to incorporate centering transitions derived from centering theory in the form of a graph into a neural coreference model. |
Haixia Chai; Michael Strube; |
219 | Progressive Class Semantic Matching for Semi-supervised Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose a joint semi-supervised learning process that can progressively build a standard K-way classifier and a matching network for the input text and the Class Semantic Representation (CSR). |
Haiming Xu; Lingqiao Liu; Ehsan Abbasnejad; |
220 | Low Resource Style Transfer Via Domain Adaptive Meta Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose DAML-ATM (Domain Adaptive Meta-Learning with Adversarial Transfer Model), which consists of two parts: DAML and ATM. |
Xiangyang Li; Xiang Long; Yu Xia; Sujian Li; |
221 | Features or Spurious Artifacts? Data-centric Baselines for Fair and Robust Hate Speech Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we critically analyze lexical biases in hate speech detection via a cross-platform study, disentangling various types of spurious and authentic artifacts and analyzing their impact on out-of-distribution fairness and robustness. |
Alan Ramponi; Sara Tonelli; |
222 | Document-Level Event Argument Extraction By Leveraging Redundant Information and Closed Boundary Loss Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, to make use of redundant event information underlying a document, we build an entity coreference graph with the graph2token module to produce a comprehensive and coreference-aware representation for every entity and then build an entity summary graph to merge the multiple extraction results. |
Hanzhang Zhou; Kezhi Mao; |
223 | A Few Thousand Translations Go A Long Way! Leveraging Pre-trained Models for African News Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work investigates how to optimally leverage existing pre-trained models to create low-resource translation systems for 16 African languages. |
David Adelani; Jesujoba Alabi; Angela Fan; Julia Kreutzer; Xiaoyu Shen; Machel Reid; Dana Ruiter; Dietrich Klakow; Peter Nabende; Ernie Chang; Tajuddeen Gwadabe; Freshia Sackey; Bonaventure F. P. Dossou; Chris Emezue; Colin Leong; Michael Beukman; Shamsuddeen Muhammad; Guyo Jarso; Oreen Yousuf; Andre Niyongabo Rubungo; Gilles Hacheme; Eric Peter Wairagala; Muhammad Umair Nasir; Benjamin Ajibade; Tunde Ajayi; Yvonne Gitau; Jade Abbott; Mohamed Ahmed; Millicent Ochieng; Anuoluwapo Aremu; Perez Ogayo; Jonathan Mukiibi; Fatoumata Ouoba Kabore; Godson Kalipe; Derguene Mbaye; Allahsera Auguste Tapo; Victoire Memdjokam Koagne; Edwin Munkoh-Buabeng; Valencia Wagner; Idris Abdulmumin; Ayodele Awokoya; Happy Buzaaba; Blessing Sibanda; Andiswa Bukula; Sam Manthalu; |
224 | Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the CoRE (Counterfactual Analysis based Relation Extraction) debiasing method that guides the RE models to focus on the main effects of textual context without losing the entity information. |
Yiwei Wang; Muhao Chen; Wenxuan Zhou; Yujun Cai; Yuxuan Liang; Dayiheng Liu; Baosong Yang; Juncheng Liu; Bryan Hooi; |
225 | Analyzing Encoded Concepts in Transformer Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel framework, ConceptX, to analyze how latent concepts are encoded in representations learned within pre-trained language models. |
Hassan Sajjad; Nadir Durrani; Fahim Dalvi; Firoj Alam; Abdul Khan; Jia Xu; |
226 | Boosted Dense Retriever Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose DrBoost, a dense retrieval ensemble inspired by boosting. |
Patrick Lewis; Barlas Oguz; Wenhan Xiong; Fabio Petroni; Scott Yih; Sebastian Riedel; |
227 | MuCGEC: A Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents MuCGEC, a multi-reference multi-source evaluation dataset for Chinese Grammatical Error Correction (CGEC), consisting of 7,063 sentences collected from three Chinese-as-a-Second-Language (CSL) learner sources. |
Yue Zhang; Zhenghua Li; Zuyi Bao; Jiacheng Li; Bo Zhang; Chen Li; Fei Huang; Min Zhang; |
228 | NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new task, neutral summary generation from multiple news articles of varying political leanings, to facilitate balanced and unbiased news reading. |
Nayeon Lee; Yejin Bang; Tiezheng Yu; Andrea Madotto; Pascale Fung; |
229 | Enhance Incomplete Utterance Restoration By Joint Learning Token Extraction and Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces a model for incomplete utterance restoration (IUR) called JET (Joint learning token Extraction and Text generation). |
Shumpei Inoue; Tsungwei Liu; Son Nguyen; Minh-Tien Nguyen; |
230 | Efficient Constituency Tree Based Encoding for Natural Language to Bash Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent works on natural language to Bash translation have made significant advances, but none of the previous methods utilize the problem’s inherent structure. We identify this structure and propose a Segmented Invocation Transformer (SIT) that utilizes the information from the constituency parse tree of the natural language text. |
Shikhar Bharadwaj; Shirish Shevade; |
231 | Privacy-Preserving Text Classification on BERT Embeddings with Homomorphic Encryption Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, recent research has shown that embeddings can potentially leak private information about sensitive attributes of the text, and in some cases, can be inverted to recover the original input text. To address these growing privacy challenges, we propose a privatization mechanism for embeddings based on homomorphic encryption, to prevent potential leakage of any piece of information in the process of text classification. |
Garam Lee; Minsoo Kim; Jai Hyun Park; Seung-won Hwang; Jung Hee Cheon; |
232 | ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As text representations take the most important role in MNER, in this paper, we propose Image-text Alignments (ITA) to align image features into the textual space, so that the attention mechanism in transformer-based pretrained textual embeddings can be better utilized. |
Xinyu Wang; Min Gui; Yong Jiang; Zixia Jia; Nguyen Bach; Tao Wang; Zhongqiang Huang; Kewei Tu; |
233 | A Dataset for N-ary Relation Extraction of Drug Combinations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To assist medical professionals in identifying beneficial drug-combinations, we construct an expert-annotated dataset for extracting information about the efficacy of drug combinations from the scientific literature. |
Aryeh Tiktinsky; Vijay Viswanathan; Danna Niezni; Dana Meron Azagury; Yosi Shamay; Hillel Taub-Tabib; Tom Hope; Yoav Goldberg; |
234 | Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce Curriculum as a new format of NLI benchmark for evaluation of broad-coverage linguistic phenomena. |
Zeming Chen; Qiyue Gao; |
235 | Neural Language Taskonomy: Which NLP Tasks Are The Most Predictive of FMRI Brain Activity? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we explore transfer learning from representations learned for ten popular natural language processing tasks (two syntactic and eight semantic) for predicting brain responses from two diverse datasets: Pereira (subjects reading sentences from paragraphs) and Narratives (subjects listening to the spoken stories). |
Subba Reddy Oota; Jashn Arora; Veeral Agarwal; Mounika Marreddy; Manish Gupta; Bapi Surampudi; |
236 | FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent works have shown promising improvements in factuality error identification using text or dependency arc entailments; however, they do not consider the entire semantic graph simultaneously. To this end, we propose FactGraph, a method that decomposes the document and the summary into structured meaning representations (MR), which are more suitable for factuality evaluation. |
Leonardo Ribeiro; Mengwen Liu; Iryna Gurevych; Markus Dreyer; Mohit Bansal; |
237 | Unsupervised Paraphrasability Prediction for Compound Nominalizations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper investigates unsupervised prediction of paraphrasability, which determines whether the prenominal modifier of a nominalization can be re-written as a noun or adverb in a clausal paraphrase. |
John Sie Yuen Lee; Ho Hung Lim; Carol Webster; |
238 | Global Entity Disambiguation with BERT Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a global entity disambiguation (ED) model based on BERT. |
Ikuya Yamada; Koki Washio; Hiroyuki Shindo; Yuji Matsumoto; |
239 | Clues Before Answers: Generation-Enhanced Multiple-Choice QA Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To exploit the generation capability and underlying knowledge of a pre-trained encoder-decoder model, in this paper, we propose a generation-enhanced MCQA model named GenMC. |
Zixian Huang; Ao Wu; Jiaying Zhou; Yu Gu; Yue Zhao; Gong Cheng; |
240 | Towards Efficient NLP: A Standard Evaluation and A Strong Baseline Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To that end, this work presents ELUE (Efficient Language Understanding Evaluation), a standard evaluation, and a public leaderboard for efficient NLP models. |
Xiangyang Liu; Tianxiang Sun; Junliang He; Jiawen Wu; Lingling Wu; Xinyu Zhang; Hao Jiang; Zhao Cao; Xuanjing Huang; Xipeng Qiu; |
241 | Stylized Knowledge-Grounded Dialogue Generation Via Disentangled Template Rewriting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel disentangled template rewriting (DTR) method which generates responses by combining disentangled style templates (from a monolingual stylized corpus) and content templates (from the KDG corpus). |
Qingfeng Sun; Can Xu; Huang Hu; Yujing Wang; Jian Miao; Xiubo Geng; Yining Chen; Fei Xu; Daxin Jiang; |
242 | LUNA: Learning Slot-Turn Alignment for Dialogue State Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This could lead to suboptimal results due to the information introduced from irrelevant utterances in the dialogue history, which may be useless and can even cause confusion. To address this problem, we propose LUNA, a SLot-TUrN Alignment enhanced approach. |
Yifan Wang; Jing Zhao; Junwei Bao; Chaoqun Duan; Youzheng Wu; Xiaodong He; |
243 | Crossroads, Buildings and Neighborhoods: A Dataset for Fine-grained Location Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a new dataset HarveyNER with fine-grained locations annotated in tweets. |
Pei Chen; Haotian Xu; Cheng Zhang; Ruihong Huang; |
244 | Tricks for Training Sparse Translation Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We find that sparse architectures for multilingual machine translation can perform poorly out of the box and propose two straightforward techniques to mitigate this: a temperature heating mechanism and dense pre-training. |
Dheeru Dua; Shruti Bhosale; Vedanuj Goswami; James Cross; Mike Lewis; Angela Fan; |
245 | Persona-Guided Planning for Controlling The Protagonist’s Persona in Story Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we aim to control the protagonist’s persona in story generation, i.e., generating a story from a leading context and a persona description, where the protagonist should exhibit the specified personality through a coherent event sequence. |
Zhexin Zhang; Jiaxin Wen; Jian Guan; Minlie Huang; |
246 | CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Datasets and tools available in other languages, such as Chinese, are limited. In order to bridge this gap, we construct CHEF, the first CHinese Evidence-based Fact-checking dataset of 10K real-world claims. |
Xuming Hu; Zhijiang Guo; GuanYu Wu; Aiwei Liu; Lijie Wen; Philip Yu; |
247 | VGNMN: Video-grounded Neural Module Networks for Video-Grounded Dialogue Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by recent NMN approaches on image-grounded tasks, we introduce Video-grounded Neural Module Network (VGNMN) to model the information retrieval process in video-grounded language tasks as a pipeline of neural modules. |
Hung Le; Nancy Chen; Steven Hoi; |
248 | Multimodal Dialogue State Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to extend the definition of dialogue state tracking to multimodality. |
Hung Le; Nancy Chen; Steven Hoi; |
249 | On The Use of BERT for Automated Essay Scoring: Joint Learning of Multi-Scale Essay Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a novel multi-scale essay representation for BERT that can be jointly learned. |
Yongjie Wang; Chuang Wang; Ruobing Li; Hui Lin; |
250 | Recognition of They/Them As Singular Personal Pronouns in Coreference Resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new benchmark for coreference resolution systems which evaluates singular personal they recognition. |
Connor Baumler; Rachel Rudinger; |
251 | TWEETSPIN: Fine-grained Propaganda Detection in Social Media Using Multi-View Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present TWEETSPIN, a dataset containing tweets that are weakly annotated with different fine-grained propaganda techniques, and propose a neural approach to detect and categorize propaganda tweets across those fine-grained categories. |
Prashanth Vijayaraghavan; Soroush Vosoughi; |
252 | UserIdentifier: Implicit User Representations for Simple and Effective Personalized Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Contrary to widely-used personalization techniques based on few-shot and meta-learning, we propose UserIdentifier, a novel scheme for training a single shared model for all users. |
Fatemehsadat Mireshghallah; Vaishnavi Shrivastava; Milad Shokouhi; Taylor Berg-Kirkpatrick; Robert Sim; Dimitrios Dimitriadis; |
253 | Improving Neural Models for Radiology Report Retrieval with Lexicon-based Automated Annotation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we combine clinical finding detection with supervised query match learning. |
Luyao Shi; Tanveer Syeda-mahmood; Tyler Baldwin; |
254 | Transparent Human Evaluation for Image Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We establish THumB, a rubric-based human evaluation protocol for image captioning models. |
Jungo Kasai; Keisuke Sakaguchi; Lavinia Dunagan; Jacob Morrison; Ronan Le Bras; Yejin Choi; Noah Smith; |
255 | Lifting The Curse of Multilinguality By Pre-training Modular Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages. We address this issue by introducing language-specific modules, which allows us to grow the total capacity of the model, while keeping the total number of trainable parameters per language constant. |
Jonas Pfeiffer; Naman Goyal; Xi Lin; Xian Li; James Cross; Sebastian Riedel; Mikel Artetxe; |
256 | DocAMR: Multi-Sentence AMR Representation and Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Taking advantage of a super-sentential level of coreference annotation from previous work, we introduce a simple algorithm for deriving a unified graph representation, avoiding the pitfalls of information loss from over-merging and lack of coherence from under-merging. |
Tahira Naseem; Austin Blodgett; Sadhana Kumaravel; Tim O'Gorman; Young-Suk Lee; Jeffrey Flanigan; Ramón Astudillo; Radu Florian; Salim Roukos; Nathan Schneider; |
257 | Learning to Transfer Prompts for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we improve this technique and propose a novel prompt-based method (PTG) for text generation in a transferable setting. |
Junyi Li; Tianyi Tang; Jian-Yun Nie; Ji-Rong Wen; Xin Zhao; |
258 | ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a large-scale empirical study on general language ability evaluation of PLMs (ElitePLM). |
Junyi Li; Tianyi Tang; Zheng Gong; Lixin Yang; Zhuohao Yu; Zhipeng Chen; Jingyuan Wang; Xin Zhao; Ji-Rong Wen; |
259 | Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We argue that new advances on models and metrics should each more directly benefit and inform the other. We therefore propose a generalization of leaderboards, bidimensional leaderboards (Billboards), that simultaneously tracks progress in language generation models and metrics for their evaluation. |
Jungo Kasai; Keisuke Sakaguchi; Ronan Le Bras; Lavinia Dunagan; Jacob Morrison; Alexander Fabbri; Yejin Choi; Noah Smith; |
260 | Improving In-Context Few-Shot Learning Via Self-Supervised Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to use self-supervision in an intermediate training stage between pretraining and downstream few-shot usage, with the goal of teaching the model to perform in-context few-shot learning. |
Mingda Chen; Jingfei Du; Ramakanth Pasunuru; Todor Mihaylov; Srini Iyer; Veselin Stoyanov; Zornitsa Kozareva; |
261 | Exposing The Limits of Video-Text Models Through Contrast Sets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Can they discriminate between similar entities and actions? To answer this, we propose an evaluation framework that probes video-text models with hard negatives. |
Jae Sung Park; Sheng Shen; Ali Farhadi; Trevor Darrell; Yejin Choi; Anna Rohrbach; |
262 | Zero-shot Sonnet Generation with Discourse-level Planning and Aesthetics Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel framework to generate sonnets that does not require training on poems. |
Yufei Tian; Nanyun Peng; |
263 | Benchmarking Intersectional Biases in NLP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we benchmark multiple NLP models with regards to their fairness and predictive performance across a variety of NLP tasks. |
John Lalor; Yi Yang; Kendall Smith; Nicole Forsgren; Ahmed Abbasi; |
264 | When Is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we perform a large-scale empirical study to isolate the effects of various linguistic properties by measuring zero-shot transfer between four diverse natural languages and their counterparts constructed by modifying aspects such as the script, word order, and syntax. |
Ameet Deshpande; Partha Talukdar; Karthik Narasimhan; |
265 | How Conservative Are Language Models? Adapting to The Introduction of Gender-Neutral Pronouns Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that gender-neutral pronouns in Danish, English, and Swedish are associated with higher perplexity, more dispersed attention patterns, and worse downstream performance. |
Stephanie Brandl; Ruixiang Cui; Anders Søgaard; |
266 | Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Motivated by these promising results, we investigate the feasibility of extracting a discrete (textual) interpretation of continuous prompts that is faithful to the problem they solve. |
Daniel Khashabi; Xinxi Lyu; Sewon Min; Lianhui Qin; Kyle Richardson; Sean Welleck; Hannaneh Hajishirzi; Tushar Khot; Ashish Sabharwal; Sameer Singh; Yejin Choi; |
267 | Contrastive Representation Learning for Cross-Document Coreference Resolution of Events and Entities Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an approach to entity and event coreference resolution utilizing contrastive representation learning. |
Benjamin Hsu; Graham Horwood; |
268 | Learning The Ordering of Coordinate Compounds and Elaborate Expressions in Hmong, Lahu, and Chinese Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate whether the ordering of CCs and EEs can be learned empirically and whether computational models (classifiers and sequence-labeling models) learn unnatural hierarchies similar to those posited by Mortensen (2006). |
Chenxuan Cui; Katherine Zhang; David Mortensen; |
269 | FRUIT: Faithfully Reflecting Updated Information in Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce the novel generation task of *faithfully reflecting updated information in text* (FRUIT) where the goal is to update an existing article given new evidence. |
Robert Iv; Alexandre Passos; Sameer Singh; Ming-Wei Chang; |
270 | Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce Multi2WOZ, a new multilingual multi-domain TOD dataset, derived from the well-established English dataset MultiWOZ, that spans four typologically diverse languages: Chinese, German, Arabic, and Russian. |
Chia-Chien Hung; Anne Lauscher; Ivan Vulić; Simone Ponzetto; Goran Glavaš; |
271 | ChapterBreak: A Challenge Dataset for Long-Range Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce ChapterBreak, a challenge dataset that provides an LRLM with a long segment from a narrative that ends at a chapter boundary and asks it to distinguish the beginning of the ground-truth next chapter from a set of negative segments sampled from the same narrative. |
Simeng Sun; Katherine Thai; Mohit Iyyer; |
272 | ColBERTv2: Effective and Efficient Retrieval Via Lightweight Late Interaction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce ColBERTv2, a retriever that couples an aggressive residual compression mechanism with a denoised supervision strategy to simultaneously improve the quality and space footprint of late interaction. |
Keshav Santhanam; Omar Khattab; Jon Saad-Falcon; Christopher Potts; Matei Zaharia; |
273 | Quantifying Language Variation Acoustically with Few Resources Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, deep acoustic models might have learned linguistic information that transfers to low-resource languages. In this study, we evaluate whether this is the case through the task of distinguishing low-resource (Dutch) regional varieties. |
Martijn Bartelds; Martijn Wieling; |
274 | Adaptable Adapters Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce adaptable adapters that (1) learn different activation functions for different layers and different input data, and (2) contain a learnable switch to select and use only the beneficial adapter layers. |
Nafise Moosavi; Quentin Delfosse; Kristian Kersting; Iryna Gurevych; |
275 | Models in The Loop: Aiding Crowdworkers with Generative Annotation Assistants Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we examine whether we can maintain the advantages of DADC, without incurring the additional cost. |
Max Bartolo; Tristan Thrush; Sebastian Riedel; Pontus Stenetorp; Robin Jia; Douwe Kiela; |
276 | GMN: Generative Multi-modal Network for Practical Document Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although recent literature has already achieved competitive results, these approaches usually fail when dealing with complex documents with noisy OCR results or mutative layouts. This paper proposes Generative Multi-modal Network (GMN) for real-world scenarios to address these problems, which is a robust multi-modal generation method without predefined label categories. |
Haoyu Cao; Jiefeng Ma; Antai Guo; Yiqing Hu; Hao Liu; Deqiang Jiang; Yinsong Liu; Bo Ren; |
277 | One Reference Is Not Enough: Diverse Distillation with Reference Selection for Non-Autoregressive Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we argue that one reference is not enough and propose diverse distillation with reference selection (DDRS) for NAT. |
Chenze Shao; Xuanfu Wu; Yang Feng; |
278 | Can Rationalization Improve Robustness? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A growing line of work has investigated the development of neural NLP models that can produce rationales-subsets of input that can explain their model predictions. In this paper, we ask whether such rationale models can provide robustness to adversarial attacks in addition to their interpretable nature. |
Howard Chen; Jacqueline He; Karthik Narasimhan; Danqi Chen; |
279 | On The Effectiveness of Sentence Encoding for Intent Detection Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we conduct empirical studies on a number of general-purpose sentence embedding schemes, showing that good sentence embeddings without any fine-tuning on intent detection data could produce a non-trivially strong performance. |
Tingting Ma; Qianhui Wu; Zhiwei Yu; Tiejun Zhao; Chin-Yew Lin; |
280 | A Computational Acquisition Model for Multimodal Word Categorization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This is (a) not faithful to the information children receive and (b) prohibits the evaluation of such models with respect to category learning tasks, due to the pre-imposed category structure. We address this gap, and present a cognitively-inspired, multimodal acquisition model, trained from image-caption pairs on naturalistic data using cross-modal self-supervision. |
Uri Berger; Gabriel Stanovsky; Omri Abend; Lea Frermann; |
281 | Residue-Based Natural Language Adversarial Attack Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As an equivalent model-focused NLP detection approach, this work proposes a simple sentence-embedding residue based detector to identify adversarial examples. |
Vyas Raina; Mark Gales; |
282 | Does It Really Generalize Well on Unseen Data? Systematic Evaluation of Relational Triple Extraction Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To keep a knowledge graph up-to-date, an extractor needs not only the ability to recall the triples it encountered during training, but also the ability to extract the new triples from the context that it has never seen before. In this paper, we show that although existing extraction models are able to easily memorize and recall already seen triples, they cannot generalize effectively for unseen triples. |
Juhyuk Lee; Min-Joong Lee; June Yong Yang; Eunho Yang; |
283 | From Spoken Dialogue to Formal Summary: An Utterance Rewriting for Dialogue Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the current state-of-the-art models pay more attention to the topic or structure of the summary than to the consistency of the dialogue summary with its input dialogue context, which may suffer from the personal and logical inconsistency problem. In this paper, we propose a new model, named ReWriteSum, to tackle this problem. |
Yue Fang; Hainan Zhang; Hongshen Chen; Zhuoye Ding; Bo Long; Yanyan Lan; Yanquan Zhou; |
284 | EASE: Entity-Aware Contrastive Learning of Sentence Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present EASE, a novel method for learning sentence embeddings via contrastive learning between sentences and their related entities. |
Sosuke Nishikawa; Ryokan Ri; Ikuya Yamada; Yoshimasa Tsuruoka; Isao Echizen; |
285 | Is Neural Topic Modelling Better Than Clustering? An Empirical Study on Clustering with Contextual Embeddings for Topics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we conduct thorough experiments showing that directly clustering high-quality sentence embeddings with an appropriate word selecting method can generate more coherent and diverse topics than NTMs, while also achieving higher efficiency and simplicity. |
Zihan Zhang; Meng Fang; Ling Chen; Mohammad Reza Namazi Rad; |
286 | Dynamic Multistep Reasoning Based on Video Scene Graph for Video Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose for video QA a novel model which performs dynamic multistep reasoning between questions and videos. |
Jianguo Mao; Wenbin Jiang; Xiangdong Wang; Zhifan Feng; Yajuan Lyu; Hong Liu; Yong Zhu; |
287 | TRUE: Re-evaluating Factual Consistency Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce TRUE: a comprehensive survey and assessment of factual consistency metrics on a standardized collection of existing texts from diverse tasks, manually annotated for factual consistency. |
Or Honovich; Roee Aharoni; Jonathan Herzig; Hagai Taitelbaum; Doron Kukliansy; Vered Cohen; Thomas Scialom; Idan Szpektor; Avinatan Hassidim; Yossi Matias; |
288 | Knowledge Inheritance for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we introduce a pre-training framework named knowledge inheritance (KI) and explore how knowledge distillation could serve as auxiliary supervision during pre-training to efficiently learn larger PLMs. |
Yujia Qin; Yankai Lin; Jing Yi; Jiajie Zhang; Xu Han; Zhengyan Zhang; Yusheng Su; Zhiyuan Liu; Peng Li; Maosong Sun; Jie Zhou; |
289 | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Bi-SimCut: a simple but effective training strategy to boost neural machine translation (NMT) performance. |
Pengzhi Gao; Zhongjun He; Hua Wu; Haifeng Wang; |
290 | On Transferability of Prompt Tuning for Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To explore whether we can improve PT via prompt transfer, we empirically investigate the transferability of soft prompts across different downstream tasks and PLMs in this work. |
Yusheng Su; Xiaozhi Wang; Yujia Qin; Chi-Min Chan; Yankai Lin; Huadong Wang; Kaiyue Wen; Zhiyuan Liu; Peng Li; Juanzi Li; Lei Hou; Maosong Sun; Jie Zhou; |
291 | DocEE: A Large-Scale and Fine-grained Benchmark for Document-level Event Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present DocEE, a new document-level event extraction dataset including 27,000+ events and 180,000+ arguments. |
MeiHan Tong; Bin Xu; Shuai Wang; Meihuan Han; Yixin Cao; Jiangqi Zhu; Siyu Chen; Lei Hou; Juanzi Li; |
292 | Towards Debiasing Translation Artifacts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel approach to reducing translationese by extending an established bias-removal technique. |
Koel Dutta Chowdhury; Rricha Jalota; Cristina España-Bonet; Josef van Genabith; |
293 | WECHSEL: Effective Initialization of Subword Embeddings for Cross-lingual Transfer of Monolingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: It is exceedingly expensive to train these models in other languages. To alleviate this problem, we introduce a novel method – called WECHSEL – to efficiently and effectively transfer pretrained LMs to new languages. |
Benjamin Minixhofer; Fabian Paischer; Navid Rekabsaz; |
294 | A New Concept of Knowledge Based Question Answering (KBQA) System for Multi-hop Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a new concept of KBQA system which can leverage multiple reasoning paths’ information and only requires labeled answer as supervision. |
Yu Wang; Vijay Srinivasan; Hongxia Jin; |
295 | Bilingual Tabular Inference: A Case Study on Indic Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a result, we present the challenging task of bilingual Tabular Natural Language Inference (bTNLI), in which the tabular premise and a hypothesis over it are in two separate languages. |
Chaitanya Agarwal; Vivek Gupta; Anoop Kunchukuttan; Manish Shrivastava; |
296 | Generative Biomedical Entity Linking Via Knowledge Base-Guided Pre-training and Synonyms-Aware Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we use a generative approach to model biomedical EL and propose to inject synonym knowledge into it. |
Hongyi Yuan; Zheng Yuan; Sheng Yu; |
297 | Robust Self-Augmentation for Named Entity Recognition with Meta Reweighting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Prior research has mainly resorted to heuristic rule-based constraints to reduce the noise for specific self-augmentation methods individually. In this paper, we revisit these two typical self-augmentation methods for NER, and propose a unified meta-reweighting strategy for them to achieve a natural integration. |
Linzhi Wu; Pengjun Xie; Jie Zhou; Meishan Zhang; Ma Chunping; Guangwei Xu; Min Zhang; |
298 | Unsupervised Stem-based Cross-lingual Part-of-Speech Tagging for Morphologically Rich Low-Resource Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our contributions are: 1) we propose an unsupervised stem-based cross-lingual approach for POS tagging for low-resource languages of rich morphology; 2) we further investigate morpheme-level alignment and projection; and 3) we examine whether the use of linguistic priors for morphological segmentation improves POS tagging. |
Ramy Eskander; Cass Lowry; Sujay Khandagale; Judith Klavans; Maria Polinsky; Smaranda Muresan; |
299 | Optimising Equal Opportunity Fairness in Model Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose two novel training objectives which directly optimise for the widely-used criterion of equal opportunity, and show that they are effective in reducing bias while maintaining high performance over two classification tasks. |
Aili Shen; Xudong Han; Trevor Cohn; Timothy Baldwin; Lea Frermann; |
300 | Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an effective two-stage framework to compress large pre-trained dual-encoder for lightweight text-image retrieval. |
Siyu Ren; Kenny Zhu; |
301 | Joint Learning-based Heterogeneous Graph Attention Network for Timeline Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a joint learning-based heterogeneous graph attention network for TLS (HeterTls), in which date selection and event detection are combined into a unified framework to improve the extraction accuracy and remove redundant sentences simultaneously. |
Jingyi You; Dongyuan Li; Hidetaka Kamigaito; Kotaro Funakoshi; Manabu Okumura; |
302 | Early Rumor Detection Using Neural Hawkes Process with A New Benchmark Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Little attention has been paid to EArly Rumor Detection (EARD), and EARD performance has been evaluated inappropriately on a few datasets where the actual early-stage information is largely missing. To reverse this situation, we construct BEARD, a new Benchmark dataset for EARD, based on claims from fact-checking websites, trying to gather as many early relevant posts as possible. |
Fengzhu Zeng; Wei Gao; |
303 | Emp-RFT: Empathetic Response Generation Via Recognizing Feature Transitions Between Utterances Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing approaches fail to perceive the transitions because they extract features for the context at the coarse-grained level. To solve this issue, we propose a novel approach of recognizing feature transitions between utterances, which helps understand the dialogue flow and better grasp the features of utterances that need attention. |
Wongyu Kim; Youbin Ahn; Donghyun Kim; Kyong-Ho Lee; |
304 | KCD: Knowledge Walks and Textual Cues Enhanced Political Perspective Detection in News Media Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Previous approaches generally focus on leveraging textual content to identify stances, while they fail to reason with background knowledge or leverage the rich semantic and syntactic textual labels in news articles. In light of these limitations, we propose KCD, a political perspective detection approach to enable multi-hop knowledge reasoning and incorporate textual cues as paragraph-level labels. |
Wenqian Zhang; Shangbin Feng; Zilong Chen; Zhenyu Lei; Jundong Li; Minnan Luo; |
305 | Collective Relevance Labeling for Passage Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In contrast, we propose knowledge distillation for informed labeling, without incurring high computation overheads at evaluation time. |
Jihyuk Kim; Minsoo Kim; Seung-won Hwang; |
306 | COGMEN: COntextualized GNN Based Multimodal Emotion RecognitioN Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose COntextualized Graph Neural Network based Multi-modal Emotion recognitioN (COGMEN) system that leverages local information (i.e., inter/intra dependency between speakers) and global information (context). |
Abhinav Joshi; Ashwani Bhat; Ayush Jain; Atin Singh; Ashutosh Modi; |
307 | Revisit Overconfidence for OOD Detection: Reassigned Contrastive Learning with Adaptive Class-dependent Threshold Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we comprehensively analyze overconfidence and classify it into two perspectives: over-confident OOD and in-domain (IND). |
Yanan Wu; Keqing He; Yuanmeng Yan; QiXiang Gao; Zhiyuan Zeng; Fujia Zheng; Lulu Zhao; Huixing Jiang; Wei Wu; Weiran Xu; |
308 | AISFG: Abundant Information Slot Filling Generator Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previous research on zero/few-shot cross-domain slot filling focuses on slot descriptions and examples while ignoring the slot type ambiguity and example ambiguity issues. To address these problems, we propose Abundant Information Slot Filling Generator (AISFG), a generative model with a novel query template that incorporates domain descriptions, slot descriptions, and examples with context. |
Yang Yan; Junda Ye; Zhongbao Zhang; Liwen Wang; |
309 | Improving Negation Detection with Negation-focused Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new negation-focused pre-training strategy, involving targeted data augmentation and negation masking, to better incorporate negation information into language models. |
Thinh Truong; Timothy Baldwin; Trevor Cohn; Karin Verspoor; |
310 | Practice Makes A Solver Perfect: Data Augmentation for Math Word Problem Solvers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we first conduct experiments to showcase that this behaviour is mainly associated with the limited size and diversity present in existing MWP datasets. Next, we propose several data augmentation techniques broadly categorized into Substitution and Paraphrasing based methods. |
Vivek Kumar; Rishabh Maheshwary; Vikram Pudi; |
311 | DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose DiffCSE, an unsupervised contrastive learning framework for learning sentence embeddings. |
Yung-Sung Chuang; Rumen Dangovski; Hongyin Luo; Yang Zhang; Shiyu Chang; Marin Soljacic; Shang-Wen Li; Scott Yih; Yoon Kim; James Glass; |
312 | Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new Generative Cross-Domain Data Augmentation framework for unsupervised domain adaptation. |
Junjie Li; Jianfei Yu; Rui Xia; |
313 | ProQA: Structural Prompt-based Pre-training for Unified Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The specialty in QA research hinders systems from modeling commonalities between tasks and generalization for wider applications. To address this issue, we present ProQA, a unified QA paradigm that solves various tasks through a single model. |
Wanjun Zhong; Yifan Gao; Ning Ding; Yujia Qin; Zhiyuan Liu; Ming Zhou; Jiahai Wang; Jian Yin; Nan Duan; |
314 | A Data Cartography Based MixUp for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose TDMixUp, a novel MixUp strategy that leverages Training Dynamics and allows more informative samples to be combined for generating new data samples. |
Seo Yeon Park; Cornelia Caragea; |
315 | Grapheme-to-Phoneme Conversion for Thai Using Neural Regression Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel Thai grapheme-to-phoneme conversion method based on a neural regression model that is trained using neural networks to predict the similarity between a candidate and the correct pronunciation. |
Tomohiro Yamasaki; |
316 | Generating Authentic Adversarial Examples Beyond Meaning-preserving with Doubly Round-trip Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, a potential pitfall for this approach is that we cannot decide whether the generated examples are adversarial to the target NMT model or the auxiliary backward one, as the reconstruction error through the RTT can be related to either. To remedy this problem, we propose a new definition for NMT adversarial examples based on the Doubly Round-Trip Translation (DRTT). |
Siyu Lai; Zhen Yang; Fandong Meng; Xue Zhang; Yufeng Chen; Jinan Xu; Jie Zhou; |
317 | TVShowGuess: Character Comprehension in Stories As Speaker Guessing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new task for assessing machines’ skills of understanding fictional characters in narrative stories. |
Yisi Sang; Xiangyang Mou; Mo Yu; Shunyu Yao; Jing Li; Jeffrey Stanton; |
318 | Causal Distillation for Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that it is beneficial to augment distillation with a third objective that encourages the student to imitate the causal dynamics of the teacher through a distillation interchange intervention training objective (DIITO). |
Zhengxuan Wu; Atticus Geiger; Joshua Rozner; Elisa Kreiss; Hanson Lu; Thomas Icard; Christopher Potts; Noah Goodman; |
319 | FNet: Mixing Tokens with Fourier Transforms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that Transformer encoder architectures can be sped up, with limited accuracy costs, by replacing the self-attention sublayers with simple linear transformations that mix input tokens. |
James Lee-Thorp; Joshua Ainslie; Ilya Eckstein; Santiago Ontanon; |
320 | Answer Consolidation: Formulation and Benchmarking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we formulate the problem of answer consolidation, where answers are partitioned into multiple groups, each representing different aspects of the answer set. |
Wenxuan Zhou; Qiang Ning; Heba Elfardy; Kevin Small; Muhao Chen; |
321 | Informativeness and Invariance: Two Perspectives on Spurious Correlations in Natural Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Gardner et al. (2021) argue that the compositional nature of language implies that all correlations between labels and individual “input features” are spurious. This paper analyzes this proposal in the context of a toy example, demonstrating three distinct conditions that can give rise to feature-label correlations in a simple PCFG. |
Jacob Eisenstein; |
322 | FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present FOAM, a FOllower-Aware speaker Model that is constantly updated given the follower feedback, so that the generated instructions can be more suitable to the current learning state of the follower. |
Zi-Yi Dou; Nanyun Peng; |
323 | Improving Compositional Generalization with Latent Structure and Data Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a more powerful data recombination method using a model called Compositional Structure Learner (CSL). |
Linlu Qiu; Peter Shaw; Panupong Pasupat; Pawel Nowak; Tal Linzen; Fei Sha; Kristina Toutanova; |
324 | Joint Extraction of Entities, Relations, and Events Via Modeling Inter-Instance and Inter-Label Dependencies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, previous JointIE models often assume heuristic manually-designed dependencies between task instances and mean-field factorization for the joint distribution of instance labels, and are thus unable to capture optimal dependencies among instances and labels to improve representation learning and IE performance. To overcome these limitations, we propose to induce a dependency graph among task instances from data to boost representation learning. |
Minh Van Nguyen; Bonan Min; Franck Dernoncourt; Thien Nguyen; |
325 | Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We examine the extent to which, in principle, different syntactic and semantic graph representations can complement and improve neural language modeling. |
Jakob Prange; Nathan Schneider; Lingpeng Kong; |
326 | Imagination-Augmented Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Therefore, we introduce an Imagination-Augmented Cross-modal Encoder (iACE) to solve natural language understanding tasks from a novel learning perspective-imagination-augmented cross-modal understanding. |
Yujie Lu; Wanrong Zhu; Xin Wang; Miguel Eckstein; William Yang Wang; |
327 | What Company Do Words Keep? Revisiting The Distributional Semantics of J.R. Firth & Zellig Harris Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Contrasting these theories from the perspective of current debates in NLP, we discover in Firth a figure who could guide the field towards a more culturally grounded notion of semantics. |
Mikael Brunila; Jack LaViolette; |
328 | Compositional Task-Oriented Parsing As Abstractive Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we continue to explore naturalized semantic parsing by presenting a general reduction of TOP to abstractive question answering that overcomes some limitations of canonical paraphrasing. |
Wenting Zhao; Konstantine Arkoudas; Weiqi Sun; Claire Cardie; |
329 | Learning Cross-Lingual IR from An English Retriever Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present DR.DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD). |
Yulong Li; Martin Franz; Md Arafat Sultan; Bhavani Iyer; Young-Suk Lee; Avirup Sil; |
330 | Testing The Ability of Language Models to Interpret Figurative Language Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, figurative language has been a relatively under-studied area in NLP, and it remains an open question to what extent modern language models can interpret nonliteral phrases. To address this question, we introduce Fig-QA, a Winograd-style nonliteral language understanding task consisting of correctly interpreting paired figurative phrases with divergent meanings. |
Emmy Liu; Chenxuan Cui; Kenneth Zheng; Graham Neubig; |
331 | Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a new scientific document similarity model based on matching fine-grained aspects of texts. |
Sheshera Mysore; Arman Cohan; Tom Hope; |
332 | CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study how offline reinforcement learning can instead be used to train dialogue agents entirely using static datasets collected from human speakers. |
Siddharth Verma; Justin Fu; Sherry Yang; Sergey Levine; |
333 | Connecting The Dots Between Audio and Text Without Parallel Data Through Visual Knowledge Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose VIP-ANT that induces Audio-Text alignment without using any parallel audio-text data. |
Yanpeng Zhao; Jack Hessel; Youngjae Yu; Ximing Lu; Rowan Zellers; Yejin Choi; |
334 | SURF: Semantic-level Unsupervised Reward Function for Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This is due to the intrinsic difficulty of the task in the high-dimensional discrete action space as well as the sparseness of the standard reward functions, which are defined for a limited set of ground-truth sequences biased towards singular lexical choices. To address this issue, we formulate SURF, a maximally dense semantic-level unsupervised reward function which mimics human evaluation by considering both sentence fluency and semantic similarity. |
Atijit Anuchitanukul; Julia Ive; |
335 | Disentangling Categorization in Multi-agent Emergent Communication Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose the use of disentangled representations from representation learning to quantify the categorization power of agents, enabling a differential analysis between combinations of heterogeneous systems, e.g., pairs of agents which learn to communicate despite mismatched concept realization. |
Washington Garcia; Hamilton Clouse; Kevin Butler; |
336 | Show, Don’t Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Show, Don’t Tell, which prompts seq2seq models with a labeled example dialogue to show the semantics of schema elements rather than tell the model through descriptions. |
Raghav Gupta; Harrison Lee; Jeffrey Zhao; Yuan Cao; Abhinav Rastogi; Yonghui Wu; |
337 | Does Pre-training Induce Systematic Inference? How Masked Language Models Acquire Commonsense Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Transformer models pre-trained with a masked-language-modeling objective (e.g., BERT) encode commonsense knowledge as evidenced by behavioral probes; however, the extent to which this knowledge is acquired by systematic inference over the semantics of the pre-training corpora is an open question. To answer this question, we selectively inject verbalized knowledge into the pre-training minibatches of BERT and evaluate how well the model generalizes to supported inferences after pre-training on the injected knowledge. |
Ian Porada; Alessandro Sordoni; Jackie Cheung; |
338 | Using Paraphrases to Study Properties of Contextual Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We use paraphrases as a unique source of data to analyze contextualized embeddings, with a particular focus on BERT. |
Laura Burdick; Jonathan Kummerfeld; Rada Mihalcea; |
339 | Measure and Improve Robustness in NLP Models: A Survey Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we aim to provide a unifying survey of how to define, measure and improve robustness in NLP. |
Xuezhi Wang; Haohan Wang; Diyi Yang; |
340 | Learning to Generate Examples for Semantic Processing Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a neural approach to automatically learn to generate new examples using a pre-trained sequence-to-sequence model. |
Danilo Croce; Simone Filice; Giuseppe Castellucci; Roberto Basili; |
341 | Symbolic Knowledge Distillation: from General Language Models to Commonsense Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate an alternative, from-machine-to-corpus-to-machine: general language models author these commonsense knowledge graphs to train commonsense models. |
Peter West; Chandra Bhagavatula; Jack Hessel; Jena Hwang; Liwei Jiang; Ronan Le Bras; Ximing Lu; Sean Welleck; Yejin Choi; |
342 | GenIE: Generative Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce GenIE (generative information extraction), the first end-to-end autoregressive formulation of closed information extraction. |
Martin Josifoski; Nicola De Cao; Maxime Peyrard; Fabio Petroni; Robert West; |
343 | Entity Linking Via Explicit Mention-Mention Coreference Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present and empirically analyze a novel training approach for learning mention and entity representations that is based on building minimum spanning arborescences (i.e., directed spanning trees) over mentions and entities across documents to explicitly model mention coreference relationships. |
Dhruv Agarwal; Rico Angell; Nicholas Monath; Andrew McCallum; |
344 | Massive-scale Decoding for Text Generation Using Lattices Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a search algorithm to construct lattices encoding a massive number of generation options. |
Jiacheng Xu; Siddhartha Jonnalagadda; Greg Durrett; |
345 | Disentangling Indirect Answers to Yes-No Questions in Real Conversations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore the task of determining indirect answers to yes-no questions in real conversations. |
Krishna Sanagavarapu; Jathin Singaraju; Anusha Kakileti; Anirudh Kaza; Aaron Mathews; Helen Li; Nathan Brito; Eduardo Blanco; |
346 | Quantifying Adaptability in Pre-trained Language Models with 500 Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a large-scale empirical study of the features and limits of LM adaptability using a new benchmark, TaskBench500, built from 500 procedurally generated sequence modeling tasks. |
Belinda Li; Jane Yu; Madian Khabsa; Luke Zettlemoyer; Alon Halevy; Jacob Andreas; |
347 | Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we test models for sexism and hate speech detection on challenging data: non-hate and non-sexist usage of identity and gendered terms. |
Indira Sen; Mattia Samory; Claudia Wagner; Isabelle Augenstein; |
348 | A Study of The Attention Abnormality in Trojaned BERTs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the underlying mechanism of Trojaned BERT models. |
Weimin Lyu; Songzhu Zheng; Tengfei Ma; Chao Chen; |
349 | EPiDA: An Easy Plug-in Data Augmentation Framework for High Performance Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an easy and plug-in data augmentation framework EPiDA to support effective text classification. |
Minyi Zhao; Lu Zhang; Yi Xu; Jiandong Ding; Jihong Guan; Shuigeng Zhou; |
350 | Partial-input Baselines Show That NLI Models Can Ignore Context, But They Don’t Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce an evaluation set of 600 examples consisting of perturbed premises to examine a RoBERTa model’s sensitivity to edited contexts. |
Neha Srikanth; Rachel Rudinger; |
351 | Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study a lifelong language model pretraining challenge where a PTLM is continually updated so as to adapt to emerging data. |
Xisen Jin; Dejiao Zhang; Henghui Zhu; Wei Xiao; Shang-Wen Li; Xiaokai Wei; Andrew Arnold; Xiang Ren; |
352 | Learning As Conversation: Dialogue Systems Reinforced for Information Acquisition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose novel AI-empowered chat bots for learning as conversation where a user does not read a passage but gains information and knowledge through conversation with a teacher bot. |
Pengshan Cai; Hui Wan; Fei Liu; Mo Yu; Hong Yu; Sachindra Joshi; |
353 | Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, inference with large state spaces is computationally demanding, especially for PCFGs. To tackle this challenge, we leverage tensor rank decomposition (a.k.a. CPD) to decrease the computational complexity of inference for a subset of FGGs subsuming HMMs and PCFGs. |
Songlin Yang; Wei Liu; Kewei Tu; |
354 | What Factors Should Paper-Reviewer Assignments Rely On? Community Perspectives on Issues and Ideals in Conference Peer-Review Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the results of the first survey of the NLP community, identifying common issues and perspectives on what factors should be considered by paper-reviewer matching systems. |
Terne Thorn Jakobsen; Anna Rogers; |
355 | Reducing Disambiguation Biases in NMT By Leveraging Explicit Word Sense Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first provide a novel approach for automatically creating high-precision sense-annotated parallel corpora, and then put forward a specifically tailored fine-tuning strategy for exploiting these sense annotations during training without introducing any additional requirement at inference time. |
Niccolò Campolungo; Tommaso Pasini; Denis Emelin; Roberto Navigli; |
356 | Mining Clues from Incomplete Utterance: A Query-enhanced Network for Incomplete Utterance Rewriting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, previous works either do not consider the semantic structural information between the incomplete utterance and the rewritten utterance, or model the semantic structure implicitly and insufficiently. To address this problem, we propose a QUEry-Enhanced Network (QUEEN). |
Shuzheng Si; Shuang Zeng; Baobao Chang; |
357 | Domain-Oriented Prefix-Tuning: Towards Efficient and Generalizable Fine-tuning for Zero-Shot Dialogue Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To explore the lightweight fine-tuning methods for domain adaptation of dialogue summarization, in this paper, we propose an efficient and generalizable Domain-Oriented Prefix-tuning model, which utilizes a domain word initialized prefix module to alleviate domain entanglement and adopts discrete prompts to guide the model to focus on key contents of dialogues and enhance model generalization. |
Lulu Zhao; Fujia Zheng; Weihao Zeng; Keqing He; Weiran Xu; Huixing Jiang; Wei Wu; Yanan Wu; |
358 | Interactive Symbol Grounding with Complex Referential Expressions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a procedure for learning to ground symbols from a sequence of stimuli consisting of an arbitrarily complex noun phrase (e.g., “all but one green square above both red circles”). |
Rimvydas Rubavicius; Alex Lascarides; |
359 | Generalized Quantifiers As A Source of Error in Multilingual NLU Benchmarks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To facilitate directly-targeted probing, we present an adversarial generalized quantifier NLI task (GQNLI) and show that pre-trained language models have a clear lack of robustness in generalized quantifier reasoning. |
Ruixiang Cui; Daniel Hershcovich; Anders Søgaard; |
360 | Exact Paired-Permutation Testing for Structured Test Statistics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we provide an efficient exact algorithm for the paired-permutation test for a family of structured test statistics. |
Ran Zmigrod; Tim Vieira; Ryan Cotterell; |
361 | A Balanced Data Approach for Evaluating Cross-Lingual Transfer: Mapping The Linguistic Blood Bank Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We inspect zero-shot performance in balanced data conditions to mitigate data size confounds, classifying pretraining languages that improve downstream performance as donors, and languages that are improved in zero-shot performance as recipients. |
Dan Malkin; Tomasz Limisiewicz; Gabriel Stanovsky; |
362 | SSEGCN: Syntactic and Semantic Enhanced Graph Convolutional Network for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel Syntactic and Semantic Enhanced Graph Convolutional Network (SSEGCN) model for ABSA task. |
Zheng Zhang; Zili Zhou; Yanna Wang; |
363 | Mitigating Toxic Degeneration with Empathetic Data: Exploring The Relationship Between Toxicity and Empathy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Using empathetic data, we improve over recent work on controllable text generation that aims to reduce the toxicity of generated text. |
Allison Lahnala; Charles Welch; Béla Neuendorf; Lucie Flek; |
364 | DUCK: Rumour Detection on Social Media By Modelling User and Comment Propagation Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Motivated by the observation that the user network – which captures who engages with a story – and the comment network – which captures how they react to it – provide complementary signals for rumour detection, in this paper, we propose DUCK (rumour detection with user and comment networks) for rumour detection on social media. |
Lin Tian; Xiuzhen Zhang; Jey Han Lau; |
365 | Jam or Cream First? Modeling Ambiguity in Neural Machine Translation with SCONES Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Machine translation, however, is intrinsically uncertain: the same source sentence can have multiple semantically equivalent translations. Therefore, we propose to replace the softmax activation with a multi-label classification layer that can model ambiguity more effectively. |
Felix Stahlberg; Shankar Kumar; |
366 | SkillSpan: Hard and Soft Skill Extraction from English Job Postings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this gap, we introduce SKILLSPAN, a novel SE dataset consisting of 14.5K sentences and over 12.5K annotated spans. |
Mike Zhang; Kristian Jensen; Sif Sonniks; Barbara Plank; |
367 | RAAT: Relation-Augmented Attention Transformer for Relation Modeling in Document-Level Event Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we argue that the relation information of event arguments is of great significance for addressing the above two issues, and propose a new DEE framework which can model the relation dependencies, called Relation-augmented Document-level Event Extraction (ReDEE). |
Yuan Liang; Zhuoxuan Jiang; Di Yin; Bo Ren; |
368 | A Double-Graph Based Framework for Frame Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a Knowledge-guided Incremental semantic parser with Double-graph (KID). |
Ce Zheng; Xudong Chen; Runxin Xu; Baobao Chang; |
369 | An Enhanced Span-based Decomposition Method for Few-Shot Sequence Labeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, most prior works assign a label to each token based on the token-level similarities, which ignores the integrality of named entities or slots. To this end, in this paper, we propose ESD, an Enhanced Span-based Decomposition method for FSSL. |
Peiyi Wang; Runxin Xu; Tianyu Liu; Qingyu Zhou; Yunbo Cao; Baobao Chang; Zhifang Sui; |
370 | A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on extracting event arguments from an entire document, which mainly faces two critical problems: a) the long-distance dependency between trigger and arguments over sentences; b) the distracting context towards an event in the document. |
Runxin Xu; Peiyi Wang; Tianyu Liu; Shuang Zeng; Baobao Chang; Zhifang Sui; |
371 | Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Accordingly, we propose an equivariance learning framework, which encodes tables with a structure-aware self-attention mechanism. |
Fei Wang; Zhewei Xu; Pedro Szekely; Muhao Chen; |
372 | JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they ignore (i) effectively fusing and reasoning over the question context representations and the KG representations, and (ii) automatically selecting relevant nodes from the noisy KGs during reasoning. In this paper, we propose a novel model, JointLK, which addresses the above limitations through joint reasoning of the LM and GNN and a dynamic KG pruning mechanism. |
Yueqing Sun; Qi Shi; Le Qi; Yu Zhang; |
373 | Models In A Spelling Bee: Language Models Implicitly Learn The Character Composition of Tokens Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We probe the embedding layer of pretrained language models and show that models learn the internal character composition of whole word and subword tokens to a surprising extent, without ever seeing the characters coupled with the tokens. |
Itay Itzhak; Omer Levy; |
374 | A Corpus for Understanding and Generating Moral Stories Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Its challenges mainly lie in: (1) grasping knowledge about abstract concepts in morals, (2) capturing inter-event discourse relations in stories, and (3) aligning value preferences of stories and morals concerning good or bad behavior. In this paper, we propose two understanding tasks and two generation tasks to assess these abilities of machines. |
Jian Guan; Ziqi Liu; Minlie Huang; |
375 | Modeling Multi-Granularity Hierarchical Features for Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel method to extract multi-granularity features based solely on the original input sentences. |
Xinnian Liang; Shuangzhi Wu; Mu Li; Zhoujun Li; |
376 | Cross-modal Contrastive Learning for Speech Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Learning similar representations for semantically similar speech and text is important for speech translation. To this end, we propose ConST, a cross-modal contrastive learning method for end-to-end speech-to-text translation. |
Rong Ye; Mingxuan Wang; Lei Li; |
377 | Meet Your Favorite Character: Open-domain Chatbot Mimicking Fictional Characters with Only A Few Utterances Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider mimicking fictional characters as a promising direction for building engaging conversation models. |
Seungju Han; Beomsu Kim; Jin Yong Yoo; Seokjun Seo; Sangbum Kim; Enkhbayar Erdenee; Buru Chang; |
378 | DynamicTOC: Persona-based Table of Contents for Consumption of Long Documents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we describe DynamicToC, a dynamic table of content-based navigator, to aid in the task of non-linear, persona-based document consumption. |
Himanshu Maheshwari; Nethraa Sivakumar; Shelly Jain; Tanvi Karandikar; Vinay Aggarwal; Navita Goyal; Sumit Shekhar; |
379 | KALA: Knowledge-Augmented Language Model Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Moreover, adaptive pre-training can harm the PLM’s performance on the downstream task by causing catastrophic forgetting of its general knowledge. To overcome such limitations of adaptive pre-training for PLM adaptation, we propose a novel domain adaptation framework for PLMs coined Knowledge-Augmented Language model Adaptation (KALA), which modulates the intermediate hidden representations of PLMs with domain knowledge, consisting of entities and their relational facts. |
Minki Kang; Jinheon Baek; Sung Ju Hwang; |
380 | On The Effect of Pretraining Corpora on In-context Learning By A Large-scale Language Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we investigate the effects of the source and size of the pretraining corpus on in-context learning in HyperCLOVA, a Korean-centric GPT-3 model. |
Seongjin Shin; Sang-Woo Lee; Hwijeen Ahn; Sungdong Kim; HyoungSeok Kim; Boseop Kim; Kyunghyun Cho; Gichang Lee; Woomyoung Park; Jung-Woo Ha; Nako Sung; |
381 | Sketching As A Tool for Understanding and Accelerating Self-attention for Long Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this limitation, Linformer and Informer reduce the quadratic complexity to linear (modulo logarithmic factors) via low-dimensional projection and row selection, respectively. These two models are intrinsically connected, and to understand their connection we introduce a theoretical framework of matrix sketching. |
Yifan Chen; Qi Zeng; Dilek Hakkani-Tur; Di Jin; Heng Ji; Yun Yang; |
382 | Partner Personas Generation for Dialogue Response Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, in practical applications, the availability of the gold partner personas is often not the case. This paper attempts to tackle these issues by offering a novel framework that leverages automatic partner personas generation to enhance the succeeding dialogue response generation. |
Hongyuan Lu; Wai Lam; Hong Cheng; Helen Meng; |
383 | Semantically Informed Slang Interpretation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a semantically informed slang interpretation (SSI) framework that considers jointly the contextual and semantic appropriateness of a candidate interpretation for a query slang. |
Zhewei Sun; Richard Zemel; Yang Xu; |
384 | Dual-Channel Evidence Fusion for Fact Verification Over Texts and Tables Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Dual Channel Unified Format fact verification model (DCUF), which unifies various evidence into parallel streams, i.e., natural language sentences and a global evidence table, simultaneously. |
Nan Hu; Zirui Wu; Yuxuan Lai; Xiao Liu; Yansong Feng; |
385 | TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Though effective, they missed one important characteristic of language: compositionality, i.e., the meaning of a complex expression is built from its sub-parts. Motivated by this, we propose a compositional data augmentation approach for natural language understanding called TreeMix. |
Le Zhang; Zichao Yang; Diyi Yang; |
386 | Syn2Vec: Synset Colexification Graphs for Lexical Semantic Similarity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we focus on patterns of colexification (co-expressions of form-meaning mapping in the lexicon) as an aspect of lexical-semantic organization, and use them to build large scale synset graphs across BabelNet’s typologically diverse set of 499 world languages. |
John Harvill; Roxana Girju; Mark Hasegawa-Johnson; |
387 | On The Origin of Hallucinations in Conversational Models: Is It The Datasets or The Models? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Knowledge-grounded conversational models are known to suffer from producing factually invalid statements, a phenomenon commonly called hallucination. In this work, we investigate the underlying causes of this phenomenon: is hallucination due to the training data, or to the models? |
Nouha Dziri; Sivan Milton; Mo Yu; Osmar Zaiane; Siva Reddy; |
388 | Is My Favorite New Movie My Favorite Movie? Probing The Understanding of Recursive Noun Phrases Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the Recursive Noun Phrase Challenge (RNPC), a dataset of three textual inference tasks involving textual entailment and event plausibility comparison, precisely targeting the understanding of recursive NPs. |
Qing Lyu; Zheng Hua; Daoxin Li; Li Zhang; Marianna Apidianaki; Chris Callison-Burch; |
389 | Original or Translated? A Causal Analysis of The Impact of Translationese on Machine Translation Performance Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we collect CausalMT, a dataset where the MT training data are also labeled with the human translation directions. |
Jingwei Ni; Zhijing Jin; Markus Freitag; Mrinmaya Sachan; Bernhard Schölkopf; |
390 | Visual Commonsense in Pretrained Unimodal and Multimodal Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate to what degree unimodal (language-only) and multimodal (image and language) models capture a broad range of visually salient attributes. |
Chenyu Zhang; Benjamin Van Durme; Zhuowan Li; Elias Stengel-Eskin; |
391 | QuALITY: Question Answering with Long Input Texts, Yes! Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To enable building and testing models on long-document comprehension, we introduce QuALITY, a multiple-choice QA dataset with context passages in English that have an average length of about 5,000 tokens, much longer than typical current models can process. |
Richard Yuanzhe Pang; Alicia Parrish; Nitish Joshi; Nikita Nangia; Jason Phang; Angelica Chen; Vishakh Padmakumar; Johnny Ma; Jana Thompson; He He; Samuel Bowman; |
392 | ExSum: From Local Explanations to Model Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce explanation summary (ExSum), a mathematical framework for quantifying model understanding, and propose metrics for its quality assessment. |
Yilun Zhou; Marco Tulio Ribeiro; Julie Shah; |
393 | Maximum Bayes Smatch Ensemble Distillation for AMR Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, for most recent high performant parsers, the effect of self-learning and silver data augmentation seems to be fading. In this paper we propose to overcome this diminishing returns of silver data by combining Smatch-based ensembling techniques with ensemble distillation. |
Young-Suk Lee; Ramón Astudillo; Hoang Thanh Lam; Tahira Naseem; Radu Florian; Salim Roukos; |
394 | When Does Syntax Mediate Neural Language Model Performance? Evidence from Dropout Probes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We demonstrate that models do encode syntactic information redundantly and introduce a new probe design that guides probes to consider all syntactic information present in embeddings. |
Mycal Tucker; Tiwalayo Eisape; Peng Qian; Roger Levy; Julie Shah; |
395 | Modeling Task Interactions in Document-Level Joint Entity and Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Especially, we address the two-way interaction between COREF and RE that has not been the focus by previous work, and propose to introduce explicit interaction namely Graph Compatibility (GC) that is specifically designed to leverage task characteristics, bridging decisions of two tasks for direct task interference. |
Liyan Xu; Jinho Choi; |
396 | Few-Shot Semantic Parsing with Language Models Trained on Code Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For semantic parsing tasks, where we map natural language into code, such models may prove more adept. In this paper, we test this hypothesis and find that Codex performs better on such tasks than equivalent GPT-3 models. |
Richard Shin; Benjamin Van Durme; |
397 | CORWA: A Citation-Oriented Related Work Annotation Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a first step toward a linguistically-motivated related work generation framework, we present a Citation Oriented Related Work Annotation (CORWA) dataset that labels different types of citation text fragments from different information sources. |
Xiangci Li; Biswadip Mandal; Jessica Ouyang; |
398 | Overcoming Catastrophic Forgetting During Domain Adaptation of Seq2seq Language Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an innovative framework, RMR_DSE, that leverages a recall optimization mechanism to selectively memorize important parameters of previous tasks via regularization, and uses a domain drift estimation algorithm to compensate for the drift between different domains in the embedding space. |
Dingcheng Li; Zheng Chen; Eunah Cho; Jie Hao; Xiaohu Liu; Fan Xing; Chenlei Guo; Yang Liu; |
399 | Extreme Zero-Shot Learning for Extreme Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider a more practical scenario called Extreme Zero-Shot XMC (EZ-XMC), in which no supervision is needed and merely raw text of instances and labels are accessible. |
Yuanhao Xiong; Wei-Cheng Chang; Cho-Jui Hsieh; Hsiang-Fu Yu; Inderjit Dhillon; |
400 | ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To help advance research in political science, we introduce ConfliBERT, a domain-specific pre-trained language model for conflict and political violence. |
Yibo Hu; MohammadSaleh Hosseini; Erick Skorupa Parolin; Javier Osorio; Latifur Khan; Patrick Brandt; Vito D’Orazio; |
401 | Automatic Multi-Label Prompting: Simple and Interpretable Few-Shot Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Automatic Multi-Label Prompting (AMuLaP), a simple yet effective method to automatically select label mappings for few-shot text classification with prompting. |
Han Wang; Canwen Xu; Julian McAuley; |
402 | Few-shot Subgoal Planning with Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Pre-trained language models have shown successful progress in many text understanding benchmarks. This work explores the capability of these models to predict actionable plans in real-world environments. |
Lajanugen Logeswaran; Yao Fu; Moontae Lee; Honglak Lee; |
403 | IDPG: An Instance-Dependent Prompt Generation Method Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a conditional prompt generation method to generate prompts for each input instance, referred to as the Instance-Dependent Prompt Generation (IDPG). |
Zhuofeng Wu; Sinong Wang; Jiatao Gu; Rui Hou; Yuxiao Dong; V.G.Vinod Vydiswaran; Hao Ma; |
404 | Embedding Hallucination for Few-shot Language Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an Embedding Hallucination (EmbedHalluc) method, which generates auxiliary embedding-label pairs to expand the fine-tuning dataset. |
Yiren Jian; Chongyang Gao; Soroush Vosoughi; |
405 | Cryptocurrency Bubble Detection: A New Stock Market Dataset, Financial Task & Hyperbolic Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Taking the first step towards NLP for cryptocoins, we present and publicly release CryptoBubbles, a novel multi-span identification task for bubble detection, and a dataset of more than 400 cryptocoins from 9 exchanges over five years spanning over two million tweets. |
Ramit Sawhney; Shivam Agarwal; Vivek Mittal; Paolo Rosso; Vikram Nanda; Sudheer Chava; |
406 | Nearest Neighbor Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to move the time-consuming kNN search forward to the preprocessing phase, and then introduce k Nearest Neighbor Knowledge Distillation (kNN-KD) that trains the base NMT model to directly learn the knowledge of kNN. |
Zhixian Yang; Renliang Sun; Xiaojun Wan; |
407 | DEMix Layers: Disentangling Domains for Modular Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new domain expert mixture (DEMix) layer that enables conditioning a language model (LM) on the domain of the input text. |
Suchin Gururangan; Mike Lewis; Ari Holtzman; Noah Smith; Luke Zettlemoyer; |
408 | Contrastive Learning for Prompt-based Few-shot Language Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The impressive performance of GPT-3 using natural language prompts and in-context learning has inspired work on better fine-tuning of moderately-sized models under this paradigm. Following this line of work, we present a contrastive learning framework that clusters inputs from the same class for better generality of models trained with only limited examples. |
Yiren Jian; Chongyang Gao; Soroush Vosoughi; |
409 | Cross-Lingual Event Detection Via Optimized Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on Cross-Lingual Event Detection, where a model is trained on data from a source language but its performance is evaluated on data from a second, target, language. |
Luis Guzman-Nateras; Minh Van Nguyen; Thien Nguyen; |
410 | Identifying Implicitly Abusive Remarks About Identity Groups Using A Linguistically Informed Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Following the recently-proposed strategy to solve implicit abuse by separately addressing its different subtypes, we present a new focused and less biased dataset that consists of the subtype of atomic negative sentences about identity groups. |
Michael Wiegand; Elisabeth Eder; Josef Ruppenhofer; |
411 | Label Definitions Improve Semantic Role Labeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Learning symbolic labels usually requires ample training data, which is frequently unavailable due to the cost of annotation. We instead propose to retrieve and leverage the definitions of these labels from the annotation guidelines. |
Li Zhang; Ishan Jindal; Yunyao Li; |
412 | Shedding New Light on The Language of The Dark Web Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces CoDA, a publicly available Dark Web dataset consisting of 10000 web documents tailored towards text-based Dark Web analysis. |
Youngjin Jin; Eugene Jang; Yongjae Lee; Seungwon Shin; Jin-Woo Chung; |
413 | Conceptualizing Treatment Leakage in Text-based Causal Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this article, we define the treatment-leakage problem, and discuss the identification as well as the estimation challenges it raises. |
Adel Daoud; Connor Jerzak; Richard Johansson; |
414 | Consistency Training with Virtual Adversarial Discrete Perturbation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thus, the perturbed samples may not aid in regularization due to their ease of classification from the model. In this context, we propose an augmentation method of adding a discrete noise that would incur the highest divergence between predictions. |
Jungsoo Park; Gyuwan Kim; Jaewoo Kang; |
415 | CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle top factual errors from our annotation, we introduce additional contrastive loss with carefully designed hard negative samples and self-supervised dialogue-specific loss to capture the key information between speakers. |
Xiangru Tang; Arjun Nair; Borui Wang; Bingyao Wang; Jai Desai; Aaron Wade; Haoran Li; Asli Celikyilmaz; Yashar Mehdad; Dragomir Radev; |
416 | CoMPM: Context Modeling with Speaker’s Pre-trained Memory Tracking for Emotion Recognition in Conversation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce CoMPM, which combines the speaker’s pre-trained memory with the context model, and find that the pre-trained memory significantly improves the performance of the context model. |
Joosung Lee; Wooin Lee; |
417 | Investigating Crowdsourcing Protocols for Evaluating The Factual Consistency of Summaries Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To determine the factors that affect the reliability of the human evaluation, we crowdsource evaluations for factual consistency across state-of-the-art models on two news summarization datasets using the rating-based Likert Scale and ranking-based Best-Worst Scaling. |
Xiangru Tang; Alexander Fabbri; Haoran Li; Ziming Mao; Griffin Adams; Borui Wang; Asli Celikyilmaz; Yashar Mehdad; Dragomir Radev; |
418 | DialSummEval: Revisiting Summarization Evaluation for Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In our paper, we re-evaluate 18 categories of metrics in terms of four dimensions: coherence, consistency, fluency and relevance, as well as a unified human evaluation of various models for the first time. |
Mingqi Gao; Xiaojun Wan; |
419 | Hyperbolic Relevance Matching for Neural Keyphrase Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Identifying important keyphrases is the central component of keyphrase extraction, and its main challenge is learning to represent information comprehensively and discriminate importance accurately. In this paper, to address the above issues, we design a new hyperbolic matching model (HyperMatch) to explore keyphrase extraction in hyperbolic space. |
Mingyang Song; Yi Feng; Liping Jing; |
420 | Template-free Prompt Tuning for Few-shot NER Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a more elegant method to reformulate NER tasks as LM problems without any templates. |
Ruotian Ma; Xin Zhou; Tao Gui; Yiding Tan; Linyang Li; Qi Zhang; Xuanjing Huang; |
421 | Few-Shot Document-Level Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present FREDo, a few-shot document-level relation extraction (FSDLRE) benchmark. |
Nicholas Popovic; Michael Färber; |
422 | LaMemo: Language Modeling with Look-Ahead Memory Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a result, this prohibits the memory to dynamically interact with the current context that provides up-to-date information for token prediction. To remedy this issue, we propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens and interpolating with the old memory states to maintain long-term information in the history. |
Haozhe Ji; Rongsheng Zhang; Zhenyu Yang; Zhipeng Hu; Minlie Huang; |
423 | Exploiting Inductive Bias in Transformers for Unsupervised Disentanglement of Syntax and Semantics with VAEs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a generative model for text generation, which exhibits disentangled latent representations of syntax and semantics. |
Ghazi Felhi; Joseph Roux; Djamé Seddah; |
424 | Neighbors Are Not Strangers: Improving Non-Autoregressive Translation Under Low-Frequency Lexical Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on non-autoregressive translation (NAT) for this problem for its efficiency advantage. |
Chun Zeng; Jiangjie Chen; Tianyi Zhuang; Rui Xu; Hao Yang; Qin Ying; Shimin Tao; Yanghua Xiao; |
425 | What Do Toothbrushes Do in The Kitchen? How Transformers Think Our World Is Structured Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we utilize this research on biases to investigate to what extent transformer-based language models allow for extracting knowledge about object relations (X occurs in Y; X consists of Z; action A involves using X). |
Alexander Henlein; Alexander Mehler; |
426 | Less Is More: Learning to Refine Dialogue History for Personalized Dialogue Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose to refine the user dialogue history on a large scale, based on which we can handle more dialogue history and obtain more abundant and accurate persona information. |
Hanxun Zhong; Zhicheng Dou; Yutao Zhu; Hongjin Qian; Ji-Rong Wen; |
427 | A Holistic Framework for Analyzing The COVID-19 Vaccine Debate Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we propose a holistic analysis framework connecting stance and reason analysis, and fine-grained entity level moral sentiment analysis. |
Maria Pacheco; Tunazzina Islam; Monal Mahajan; Andrey Shor; Ming Yin; Lyle Ungar; Dan Goldwasser; |
428 | Learning to Win Lottery Tickets in BERT Transfer Via Task-agnostic Mask Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These subnetworks are found using magnitude-based pruning. In this paper, we find that the BERT subnetworks have even more potential than these studies have shown. |
Yuanxin Liu; Fandong Meng; Zheng Lin; Peng Fu; Yanan Cao; Weiping Wang; Jie Zhou; |
429 | You Don’t Know My Favorite Color: Preventing Dialogue Representations from Revealing Speakers’ Private Personas Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose effective defense objectives to protect persona leakage from hidden states. |
Haoran Li; Yangqiu Song; Lixin Fan; |
430 | Explaining Dialogue Evaluation Metrics Using Adversarial Behavioral Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an adversarial test-suite which generates problematic variations of various dialogue aspects, e.g. logical entailment, using automatic heuristics. |
Baber Khalid; Sungjin Lee; |
431 | Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We seek to understand the *who*, *why*, and *what* behind biases in toxicity annotations. |
Maarten Sap; Swabha Swayamdipta; Laura Vianna; Xuhui Zhou; Yejin Choi; Noah Smith; |
432 | Non-Autoregressive Chinese ASR Error Correction with Phonological Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As the errors introduced by ASR systems will impair the performance of downstream tasks, we introduce a post-processing error correction method, PhVEC, to correct errors in text space. |
Zheng Fang; Ruiqing Zhang; Zhongjun He; Hua Wu; Yanan Cao; |
433 | Hate Speech and Counter Speech Detection: Conversational Context Does Matter Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper investigates the role of context in the annotation and detection of online hate and counter speech, where context is defined as the preceding comment in a conversation thread. |
Xinchen Yu; Eduardo Blanco; Lingzi Hong; |
434 | DACSA: A Large-scale Dataset for Automatic Summarization of Catalan and Spanish Newspaper Articles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we describe the construction of a corpus of Catalan and Spanish newspapers, the Dataset for Automatic summarization of Catalan and Spanish newspaper Articles (DACSA) corpus. |
Encarnación Segarra Soriano; Vicent Ahuir; Lluís-F. Hurtado; José González; |
435 | Time Waits for No One! Analysis and Challenges of Temporal Misalignment Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we establish a suite of eight diverse tasks across different domains (social media, science papers, news, and reviews) and periods of time (spanning five years or more) to quantify the effects of temporal misalignment. |
Kelvin Luu; Daniel Khashabi; Suchin Gururangan; Karishma Mandyam; Noah Smith; |
436 | MCSE: Multimodal Contrastive Learning of Sentence Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a sentence embedding learning approach that exploits both visual and textual information via a multimodal contrastive objective. |
Miaoran Zhang; Marius Mosbach; David Adelani; Michael Hedderich; Dietrich Klakow; |
437 | HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing works either utilize self-supervised schemes to refine relational feature signals by iteratively leveraging adaptive clustering and classification that provoke gradual drift problems, or adopt instance-wise contrastive learning which unreasonably pushes apart those sentence pairs that are semantically similar. To overcome these defects, we propose a novel contrastive learning framework named HiURE, which has the capability to derive hierarchical signals from relational feature space using cross hierarchy attention and effectively optimize relation representation of sentences under exemplar-wise contrastive learning. |
Shuliang Liu; Xuming Hu; Chenwei Zhang; Shu'ang Li; Lijie Wen; Philip Yu; |
438 | Diagnosing Vision-and-Language Navigation: What Really Matters Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we conduct a series of diagnostic experiments to unveil agents’ focus during navigation. |
Wanrong Zhu; Yuankai Qi; Pradyumna Narayana; Kazoo Sone; Sugato Basu; Xin Wang; Qi Wu; Miguel Eckstein; William Yang Wang; |
439 | Aligning to Social Norms and Values in Interactive Narratives Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We focus on creating agents that act in alignment with socially beneficial norms and values in interactive narratives or text-based games-environments wherein an agent perceives and interacts with a world through natural language. |
Prithviraj Ammanabrolu; Liwei Jiang; Maarten Sap; Hannaneh Hajishirzi; Yejin Choi; |
440 | MOVER: Mask, Over-generate and Rank for Hyperbole Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Despite being a common figure of speech, hyperbole is under-researched in Figurative Language Processing. In this paper, we tackle the challenging task of hyperbole generation to transfer a literal sentence into its hyperbolic paraphrase. |
Yunxiang Zhang; Xiaojun Wan; |
441 | Embarrassingly Simple Performance Prediction for Abductive Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This is a time-consuming and resource-intense endeavour. To solve this practical problem, we propose a simple method for predicting the performance without actually fine-tuning the model. |
Emils Kadikis; Vaibhav Srivastav; Roman Klinger; |
442 | Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We identify two ways in which the definition of the system-level correlation is inconsistent with how metrics are used to evaluate systems in practice and propose changes to rectify this disconnect. |
Daniel Deutsch; Rotem Dror; Dan Roth; |