Paper Digest: ACL 2022 Highlights
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2022 it will be held in Dublin, Ireland.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each. Readers are encouraged to use these machine-generated highlights/summaries to quickly grasp the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website and are behind our services, including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated with new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: ACL 2022 Highlights
No. | Paper | Author(s)
---|---|---
1 | AdapLeR: Speeding Up Inference By Adaptive Length Reduction. Highlight: In this work, we propose a novel approach for reducing the computational cost of BERT with minimal loss in downstream performance. | Ali Modarressi; Hosein Mohebbi; Mohammad Taher Pilehvar
2 | Quantified Reproducibility Assessment of NLP Results. Highlight: This paper describes and tests a method for carrying out quantified reproducibility assessment (QRA) that is based on concepts and definitions from metrology. | Anya Belz; Maja Popovic; Simon Mille
3 | Rare Tokens Degenerate All Tokens: Improving Neural Text Generation Via Adaptive Gradient Gating for Rare Token Embeddings. Highlight: In this study, we analyze the training dynamics of token embeddings, focusing on rare token embeddings. | Sangwon Yu; Jongyoon Song; Heeseung Kim; Seongmin Lee; Woo-Jong Ryu; Sungroh Yoon
4 | AlephBERT: Language Model Pre-training and Evaluation from Sub-Word to Sentence Level. Highlight: We present AlephBERT, a large PLM for Modern Hebrew, trained on a larger vocabulary and a larger dataset than any Hebrew PLM before. | Amit Seker; Elron Bandel; Dan Bareket; Idan Brusilovsky; Refael Greenfeld; Reut Tsarfaty
5 | Learning to Imagine: Integrating Counterfactual Thinking in Neural Discrete Reasoning. Highlight: In this work, we devise a Learning to Imagine (L2I) module, which can be seamlessly incorporated into NDR models to perform the imagination of unseen counterfactuals. | Moxin Li; Fuli Feng; Hanwang Zhang; Xiangnan He; Fengbin Zhu; Tat-Seng Chua
6 | Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings for Complex Word Identification. Highlight: In this paper, we propose a novel training technique for the CWI task based on domain adaptation to improve the target character and context representations. | George-Eduard Zaharia; Razvan-Alexandru Smadu; Dumitru Cercel; Mihai Dascalu
7 | JointCL: A Joint Contrastive Learning Framework for Zero-Shot Stance Detection. Highlight: In this paper, we propose a joint contrastive learning (JointCL) framework, which consists of stance contrastive learning and target-aware prototypical graph contrastive learning. | Bin Liang; Qinglin Zhu; Xiang Li; Min Yang; Lin Gui; Yulan He; Ruifeng Xu
8 | [CASPI] Causal-aware Safe Policy Improvement for Task-oriented Dialogue. Highlight: Unfortunately, RL policies trained on off-policy data are prone to issues of bias and generalization, which are further exacerbated by stochasticity in human responses and the non-Markovian nature of the annotated belief state of a dialogue management system. To this end, we propose a batch-RL framework for ToD policy learning: Causal-aware Safe Policy Improvement (CASPI). | Govardana Sachithanandam Ramachandran; Kazuma Hashimoto; Caiming Xiong
9 | UniTranSeR: A Unified Transformer Semantic Representation Framework for Multimodal Task-Oriented Dialog System. Highlight: Nevertheless, almost all existing studies follow the pipeline to first learn intra-modal features separately and then conduct simple feature concatenation or attention-based feature fusion to generate responses, which hampers them from learning inter-modal interactions and conducting cross-modal feature alignment for generating more intention-aware responses. To address these issues, we propose UniTranSeR, a Unified Transformer Semantic Representation framework with feature alignment and intention reasoning for multimodal dialog systems. | Zhiyuan Ma; Jianjun Li; Guohui Li; Yongjing Cheng
10 | Dynamic Schema Graph Fusion Network for Multi-Domain Dialogue State Tracking. Highlight: Existing approaches that have considered such relations generally fall short in: (1) fusing prior slot-domain membership relations and dialogue-aware dynamic slot relations explicitly, and (2) generalizing to unseen domains. To address these issues, we propose a novel Dynamic Schema Graph Fusion Network (DSGFNet), which generates a dynamic schema graph to explicitly fuse the prior slot-domain membership relations and dialogue-aware dynamic slot relations. | Yue Feng; Aldo Lipani; Fanghua Ye; Qiang Zhang; Emine Yilmaz
11 | Attention Temperature Matters in Abstractive Summarization Distillation. Highlight: In this paper, we find that simply manipulating attention temperatures in Transformers can make pseudo labels easier for student models to learn. (A toy illustration of attention temperature follows the table.) | Shengqiang Zhang; Xingxing Zhang; Hangbo Bao; Furu Wei
12 | Towards Making The Most of Cross-Lingual Transfer for Zero-Shot Neural Machine Translation. Highlight: This paper demonstrates that multilingual pretraining and multilingual fine-tuning are both critical for facilitating cross-lingual transfer in zero-shot translation, where the neural machine translation (NMT) model is tested on source languages unseen during supervised training. Following this idea, we present SixT+, a strong many-to-English NMT model that supports 100 source languages but is trained with a parallel dataset in only six source languages. | Guanhua Chen; Shuming Ma; Yun Chen; Dongdong Zhang; Jia Pan; Wenping Wang; Furu Wei
13 | TopWORDS-Seg: Simultaneous Text Segmentation and Word Discovery for Open-Domain Chinese Texts Via Bayesian Inference. Highlight: No existing method can yet achieve effective text segmentation and word discovery simultaneously in the open domain. This study fills this gap by proposing a novel method called TopWORDS-Seg based on Bayesian inference, which enjoys robust performance and transparent interpretation when no training corpus or domain vocabulary is available. | Changzai Pan; Maosong Sun; Ke Deng
14 | An Unsupervised Multiple-Task and Multiple-Teacher Model for Cross-lingual Named Entity Recognition. Highlight: In this study, based on the knowledge distillation framework and multi-task learning, we introduce the similarity metric model as an auxiliary task to improve the cross-lingual NER performance on the target domain. | Zhuoran Li; Chunming Hu; Xiaohui Guo; Junfan Chen; Wenyi Qin; Richong Zhang
15 | Discriminative Marginalized Probabilistic Neural Method for Multi-Document Summarization of Medical Literature. Highlight: Despite the importance and social impact of medicine, there are no ad-hoc solutions for multi-document summarization. For this reason, we propose a novel discriminative marginalized probabilistic method (DAMEN) trained to discriminate critical information from a cluster of topic-related medical documents and generate a multi-document summary via token probability marginalization. | Gianluca Moro; Luca Ragazzi; Lorenzo Valgimigli; Davide Freddi
16 | Sparse Progressive Distillation: Resolving Overfitting Under Pretrain-and-Finetune Paradigm. Highlight: In this paper, we aim to address the overfitting problem and improve pruning performance via progressive knowledge distillation with error-bound properties. | Shaoyi Huang; Dongkuan Xu; Ian Yen; Yijue Wang; Sung-En Chang; Bingbing Li; Shiyang Chen; Mimi Xie; Sanguthevar Rajasekaran; Hang Liu; Caiwen Ding
17 | CipherDAug: Ciphertext Based Data Augmentation for Neural Machine Translation. Highlight: We propose a novel data-augmentation technique for neural machine translation based on ROT-k ciphertexts. (A minimal ROT-k sketch follows the table.) | Nishant Kambhatla; Logan Born; Anoop Sarkar
18 | Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages. Highlight: In this paper, we argue that relatedness among languages in a language family along the dimension of lexical overlap may be leveraged to overcome some of the corpora limitations of LRLs. | Vaidehi Patil; Partha Talukdar; Sunita Sarawagi
19 | Long-range Sequence Modeling with Predictable Sparse Attention. Highlight: Due to the sparsity of the attention matrix, much computation is redundant. Therefore, in this paper, we design an efficient Transformer architecture, named Fourier Sparse Attention for Transformer (FSAT), for fast long-range sequence modeling. | Yimeng Zhuang; Jing Zhang; Mei Tu
20 | Improving Personalized Explanation Generation Through Visualization. Highlight: This begs an interesting question: can we immerse the models in a multimodal environment to gain proper awareness of real-world concepts and alleviate the above shortcomings? To this end, we propose a visually-enhanced approach named METER with the help of visualization generation and text-image matching discrimination: the explainable recommendation model is encouraged to visualize what it refers to while incurring a penalty if the visualization is incongruent with the textual explanation. | Shijie Geng; Zuohui Fu; Yingqiang Ge; Lei Li; Gerard de Melo; Yongfeng Zhang
21 | New Intent Discovery with Pre-training and Contrastive Learning. Highlight: In this paper, we provide new solutions to two important research questions for new intent discovery: (1) how to learn semantic utterance representations and (2) how to better cluster utterances. | Yuwei Zhang; Haode Zhang; Li-Ming Zhan; Xiao-Ming Wu; Albert Lam
22 | Modeling U.S. State-Level Policies By Extracting Winners and Losers from Legislative Texts. Highlight: We take a data-driven approach by decoding the impact of legislation on relevant stakeholders (e.g., teachers in education bills) to understand legislators’ decision-making process and votes. | Maryam Davoodi; Eric Waltenburg; Dan Goldwasser
23 | Structural Characterization for Dialogue Disentanglement. Highlight: We specifically take structural factors into account and design a novel model for dialogue disentanglement. | Xinbei Ma; Zhuosheng Zhang; Hai Zhao
24 | Multi-Party Empathetic Dialogue Generation: A New Task for Dialog Systems. Highlight: Furthermore, emotion and sensibility are typically confused; a refined empathy analysis is needed for comprehending fragile and nuanced human feelings. We address these issues by proposing a novel task called Multi-Party Empathetic Dialogue Generation in this study. | Ling.Yu Zhu; Zhengkun Zhang; Jun Wang; Hongbin Wang; Haiying Wu; Zhenglu Yang
25 | MISC: A Mixed Strategy-Aware Model Integrating COMET for Emotional Support Conversation. Highlight: To address these problems, we propose a novel model, MISC, which first infers the user’s fine-grained emotional status and then responds skillfully using a mixture of strategies. | Quan Tu; Yanran Li; Jianwei Cui; Bin Wang; Ji-Rong Wen; Rui Yan
26 | GLM: General Language Model Pretraining with Autoregressive Blank Infilling. Highlight: However, none of the pretraining frameworks performs the best for all tasks of three main categories including natural language understanding (NLU), unconditional generation, and conditional generation. We propose a General Language Model (GLM) based on autoregressive blank infilling to address this challenge. | Zhengxiao Du; Yujie Qian; Xiao Liu; Ming Ding; Jiezhong Qiu; Zhilin Yang; Jie Tang
27 | QuoteR: A Benchmark of Quote Recommendation for Writing. Highlight: There have been various quote recommendation approaches, but they are evaluated on different unpublished datasets. To facilitate the research on this task, we build a large and fully open quote recommendation dataset called QuoteR, which comprises three parts including English, standard Chinese and classical Chinese. | Fanchao Qi; Yanhui Yang; Jing Yi; Zhili Cheng; Zhiyuan Liu; Maosong Sun
28 | Towards Comprehensive Patent Approval Predictions: Beyond Traditional Document Classification. Highlight: Such novelty evaluations distinguish patent approval prediction from conventional document classification: successful patent applications may share similar writing patterns; however, too-similar newer applications would receive the opposite label, thus confusing standard document classifiers (e.g., BERT). To address this issue, we propose a novel framework that unifies the document classifier with handcrafted features, particularly time-dependent novelty scores. | Xiaochen Gao; Zhaoyi Hou; Yifei Ning; Kewen Zhao; Beilei He; Jingbo Shang; Vish Krishnan
29 | Hypergraph Transformer: Weakly-Supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering. Highlight: Answering complex questions that require multi-hop reasoning under weak supervision is considered a challenging problem since i) no supervision is given to the reasoning process and ii) high-order semantics of multi-hop knowledge facts need to be captured. In this paper, we introduce a concept of hypergraph to encode high-level semantics of a question and a knowledge base, and to learn high-order associations between them. | Yu-Jung Heo; Eun-Sol Kim; Woo Suk Choi; Byoung-Tak Zhang
30 | Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech. Highlight: In this paper, a cross-utterance conditional VAE (CUC-VAE) is proposed to estimate a posterior probability distribution of the latent prosody features for each phoneme by conditioning on acoustic features, speaker information, and text features obtained from both past and future sentences. | Yang Li; Cheng Yu; Guangzhi Sun; Hua Jiang; Fanglei Sun; Weiqin Zu; Ying Wen; Yang Yang; Jun Wang
31 | Mix and Match: Learning-free Controllable Text Generation Using Energy Language Models. Highlight: In this work, we propose Mix and Match LM, a global score-based alternative for controllable text generation that combines arbitrary pre-trained black-box models for achieving the desired attributes in the generated text without involving any fine-tuning or structural assumptions about the black-box models. | Fatemehsadat Mireshghallah; Kartik Goyal; Taylor Berg-Kirkpatrick
32 | So Different Yet So Alike! Constrained Unsupervised Text Style Transfer. Highlight: Maintaining constraints in transfer has several downstream applications, including data augmentation and debiasing. We introduce a method for such constrained unsupervised text style transfer by introducing two complementary losses to the generative adversarial network (GAN) family of models. | Abhinav Ramesh Kashyap; Devamanyu Hazarika; Min-Yen Kan; Roger Zimmermann; Soujanya Poria
33 | E-CARE: A New Dataset for Exploring Explainable Causal Reasoning. Highlight: However, such explanation information still remains absent in existing causal reasoning resources. In this paper, we fill this gap by presenting a human-annotated explainable CAusal REasoning dataset (e-CARE), which contains over 20K causal reasoning questions, together with natural language formed explanations of the causal questions. | Li Du; Xiao Ding; Kai Xiong; Ting Liu; Bing Qin
34 | Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset for Narrative Comprehension. Highlight: Drawing on reading education research, we introduce FairytaleQA, a dataset focusing on narrative comprehension of kindergarten to eighth-grade students. | Ying Xu; Dakuo Wang; Mo Yu; Daniel Ritchie; Bingsheng Yao; Tongshuang Wu; Zheng Zhang; Toby Li; Nora Bradford; Branda Sun; Tran Hoang; Yisi Sang; Yufang Hou; Xiaojuan Ma; Diyi Yang; Nanyun Peng; Zhou Yu; Mark Warschauer
35 | KaFSP: Knowledge-Aware Fuzzy Semantic Parsing for Conversational Question Answering Over A Large-Scale Knowledge Base. Highlight: In this paper, we study two issues of semantic parsing approaches to conversational question answering over a large-scale knowledge base: (1) The actions defined in grammar are not sufficient to handle uncertain reasoning common in real-world scenarios. | Junzhuo Li; Deyi Xiong
36 | Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment. Highlight: In this paper, we explore multilingual KG completion, which leverages limited seed alignment as a bridge, to embrace the collective knowledge from multiple languages. | Zijie Huang; Zheng Li; Haoming Jiang; Tianyu Cao; Hanqing Lu; Bing Yin; Karthik Subbian; Yizhou Sun; Wei Wang
37 | Modeling Hierarchical Syntax Structure with Triplet Position for Source Code Summarization. Highlight: In this paper, we propose CODESCRIBE to model the hierarchical syntax structure of code by introducing a novel triplet position for code summarization. | Juncai Guo; Jin Liu; Yao Wan; Li Li; Pingyi Zhou
38 | FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding. Highlight: However, prior methods have been evaluated under a disparate set of protocols, which hinders fair comparison and measuring the progress of the field. To address this issue, we introduce an evaluation framework that improves previous evaluation procedures in three key aspects, i.e., test performance, dev-test correlation, and stability. | Yanan Zheng; Jing Zhou; Yujie Qian; Ming Ding; Chonghua Liao; Li Jian; Ruslan Salakhutdinov; Jie Tang; Sebastian Ruder; Zhilin Yang
39 | Learn to Adapt for Generalized Zero-Shot Text Classification. Highlight: Most existing methods generalize poorly since the learned parameters are only optimal for seen classes rather than for both seen and unseen classes, and the parameters remain stationary during prediction. To address these challenges, we propose a novel Learn to Adapt (LTA) network using a variant meta-learning framework. | Yiwen Zhang; Caixia Yuan; Xiaojie Wang; Ziwei Bai; Yongbin Liu
40 | TableFormer: Robust Transformer Modeling for Table-Text Encoding. Highlight: In this work, we propose a robust and structurally aware table-text encoding architecture, TableFormer, where tabular structural biases are incorporated completely through learnable attention biases. | Jingfeng Yang; Aditya Gupta; Shyam Upadhyay; Luheng He; Rahul Goel; Shachi Paul
41 | Perceiving The World: Question-guided Reinforcement Learning for Text-based Games. Highlight: In this paper, we address the challenges by introducing world-perceiving modules, which automatically decompose tasks and prune actions by answering questions about the environment. | Yunqiu Xu; Meng Fang; Ling Chen; Yali Du; Joey Zhou; Chengqi Zhang
42 | Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization. Highlight: In this way, it is possible to translate the English dataset to other languages and obtain different sets of labels again using heuristics. To fully leverage the information of these different sets of labels, we propose NLSSum (Neural Label Search for Summarization), which jointly learns hierarchical weights for these different sets of labels together with our summarization model. | Ruipeng Jia; Xingxing Zhang; Yanan Cao; Zheng Lin; Shi Wang; Furu Wei
43 | Few-Shot Class-Incremental Learning for Named Entity Recognition. Highlight: In this work, we study a more challenging but practical problem, i.e., few-shot class-incremental learning for NER, where an NER model is trained with only a few labeled samples of the new classes, without forgetting knowledge of the old ones. | Rui Wang; Tong Yu; Handong Zhao; Sungchul Kim; Subrata Mitra; Ruiyi Zhang; Ricardo Henao
44 | Improving Meta-learning for Low-resource Text Classification and Generation Via Memory Imitation. Highlight: Nonetheless, these approaches suffer from the memorization overfitting issue, where the model tends to memorize the meta-training tasks while ignoring support sets when adapting to new tasks. To address this issue, we propose a memory imitation meta-learning (MemIML) method that enhances the model’s reliance on support sets for task adaptation. | Yingxiu Zhao; Zhiliang Tian; Huaxiu Yao; Yinhe Zheng; Dongkyu Lee; Yiping Song; Jian Sun; Nevin Zhang
45 | Quality Controlled Paraphrase Generation. Highlight: Here we propose QCPG, a quality-guided controlled paraphrase generation model that allows direct control of the quality dimensions. | Elron Bandel; Ranit Aharonov; Michal Shmueli-Scheuer; Ilya Shnayderman; Noam Slonim; Liat Ein-Dor
46 | Controllable Dictionary Example Generation: Generating Example Sentences for Specific Targeted Audiences. Highlight: In this paper, we introduce the problem of dictionary example sentence generation, aiming to automatically generate dictionary example sentences for targeted words according to the corresponding definitions. | Xingwei He; Siu Ming Yiu
47 | AraT5: Text-to-Text Transformers for Arabic Language Generation. Highlight: Although a multilingual version of the T5 model (mT5) was also introduced, it is not clear how well it can fare on non-English tasks involving diverse data. To investigate this question, we apply mT5 to a language with a wide variety of dialects: Arabic. | El Moatez Billah Nagoudi; AbdelRahim Elmadany; Muhammad Abdul-Mageed
48 | Legal Judgment Prediction Via Event Extraction with Constraints. Highlight: While significant progress has been made on the task of Legal Judgment Prediction (LJP) in recent years, the incorrect predictions made by SOTA LJP models can be attributed in part to their failure to (1) locate the key event information that determines the judgment, and (2) exploit the cross-task consistency constraints that exist among the subtasks of LJP. To address these weaknesses, we propose EPM, an Event-based Prediction Model with constraints, which surpasses existing SOTA models in performance on a standard LJP dataset. | Yi Feng; Chuanyi Li; Vincent Ng
49 | Answer-level Calibration for Free-form Multiple Choice Question Answering. Highlight: In this work, we consider the question answering format, where we need to choose from a set of (free-form) textual choices of unspecified lengths given a context. | Sawan Kumar
50 | Learning When to Translate for Streaming Speech. Highlight: In this paper, we propose MoSST, a simple yet effective method for translating streaming speech content. | Qian Dong; Yaoming Zhu; Mingxuan Wang; Lei Li
51 | Compact Token Representations with Contextual Quantization for Efficient Document Re-ranking. Highlight: This paper proposes contextual quantization of token embeddings by decoupling document-specific and document-independent ranking contributions during codebook-based compression. | Yingrui Yang; Yifan Qiao; Tao Yang
52 | Early Stopping Based on Unlabeled Samples in Text Classification. Highlight: In this study, we propose an early stopping method that uses unlabeled samples. | HongSeok Choi; Dongha Choi; Hyunju Lee
53 | Meta-learning Via Language Model In-context Tuning. Highlight: Inspired by recent progress in large language models, we propose in-context tuning (ICT), which recasts task adaptation and prediction as a simple sequence prediction problem: to form the input sequence, we concatenate the task instruction, labeled in-context examples, and the target input to predict; to meta-train the model to learn from in-context examples, we fine-tune a pre-trained language model (LM) to predict the target label given the input sequence on a collection of tasks. We benchmark our method on two collections of text classification tasks: LAMA and BinaryClfs. | Yanda Chen; Ruiqi Zhong; Sheng Zha; George Karypis; He He
54 | It Is AI’s Turn to Ask Humans A Question: Question-Answer Pair Generation for Children’s Story Books. Highlight: We design an automated question-answer generation (QAG) system for this education scenario: given a story book at the kindergarten to eighth-grade level as input, our system can automatically generate QA pairs that are capable of testing a variety of dimensions of a student’s comprehension skills. | Bingsheng Yao; Dakuo Wang; Tongshuang Wu; Zheng Zhang; Toby Li; Mo Yu; Ying Xu
55 | Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning. Highlight: We study interactive weakly-supervised learning: the problem of iteratively and automatically discovering novel labeling rules from data to improve the WSL model. | Rongzhi Zhang; Yue Yu; Pranav Shetty; Le Song; Chao Zhang
56 | Constrained Multi-Task Learning for Bridging Resolution. Highlight: We examine the extent to which supervised bridging resolvers can be improved without employing additional labeled bridging data by proposing a novel constrained multi-task learning framework for bridging resolution, within which we (1) design cross-task consistency constraints to guide the learning process; (2) pre-train the entity coreference model in the multi-task framework on the large amount of publicly available coreference data; and (3) integrate prior knowledge encoded in rule-based resolvers. | Hideo Kobayashi; Yufang Hou; Vincent Ng
57 | DEAM: Dialogue Coherence Evaluation Using AMR-based Semantic Manipulations. Highlight: Such approaches are insufficient to appropriately reflect the incoherence that occurs in interactions between advanced dialogue models and humans. To tackle this problem, we propose DEAM, a Dialogue coherence Evaluation metric that relies on Abstract Meaning Representation (AMR) to apply semantic-level Manipulations for incoherent (negative) data generation. | Sarik Ghazarian; Nuan Wen; Aram Galstyan; Nanyun Peng
58 | HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long Document Summarization. Highlight: In this work, we present HIBRIDS, which injects Hierarchical Biases foR Incorporating Document Structure into attention score calculation. | Shuyang Cao; Lu Wang
59 | De-Bias for Generative Extraction in Unified NER Task. Highlight: However, when the generative model is applied to NER, its optimization objective is not consistent with the task, which makes the model vulnerable to the incorrect biases. In this paper, we analyze the incorrect biases in the generation process from a causality perspective and attribute them to two confounders: pre-context confounder and entity-order confounder. | Shuai Zhang; Yongliang Shen; Zeqi Tan; Yiquan Wu; Weiming Lu
60 | An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels. Highlight: We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. | Taylor Sorensen; Joshua Robinson; Christopher Rytting; Alexander Shaw; Kyle Rogers; Alexia Delorey; Mahmoud Khalil; Nancy Fulda; David Wingate
61 | Expanding Pretrained Models to Thousands More Languages Via Lexicon-based Adaptation. Highlight: To expand possibilities of using NLP technology in these under-represented languages, we systematically study strategies that relax the reliance on conventional language resources through the use of bilingual lexicons, an alternative resource with much better language coverage. | Xinyi Wang; Sebastian Ruder; Graham Neubig
62 | Language-agnostic BERT Sentence Embedding. Highlight: We systematically investigate methods for learning multilingual sentence embeddings by combining the best methods for learning monolingual and cross-lingual representations including: masked language modeling (MLM), translation language modeling (TLM), dual encoder translation ranking, and additive margin softmax. | Fangxiaoyu Feng; Yinfei Yang; Daniel Cer; Naveen Arivazhagan; Wei Wang
63 | Nested Named Entity Recognition with Span-level Graphs. Highlight: In this work, we try to improve the span representation by utilizing retrieval-based span-level graphs, connecting spans and entities in the training data based on n-gram features. | Juncheng Wan; Dongyu Ru; Weinan Zhang; Yong Yu
64 | CogTaskonomy: Cognitively Inspired Task Taxonomy Is Beneficial to Transfer Learning in NLP. Highlight: In this paper, we propose a cognitively inspired framework, CogTaskonomy, to learn a taxonomy for NLP tasks. | Yifei Luo; Minghui Xu; Deyi Xiong
65 | RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining. Highlight: In this work, we propose RoCBert: a pretrained Chinese Bert that is robust to various forms of adversarial attacks like word perturbation, synonyms, typos, etc. | Hui Su; Weiwei Shi; Xiaoyu Shen; Zhou Xiao; Tuo Ji; Jiarui Fang; Jie Zhou
66 | Premise-based Multimodal Reasoning: Conditional Inference on Joint Textual and Visual Clues. Highlight: It is a common practice for recent works in vision language cross-modal reasoning to adopt a binary or multi-choice classification formulation taking as input a set of source image(s) and textual query. In this work, we take a sober look at such an unconditional formulation in the sense that no prior knowledge is specified with respect to the source image(s). | Qingxiu Dong; Ziwei Qin; Heming Xia; Tian Feng; Shoujie Tong; Haoran Meng; Lin Xu; Zhongyu Wei; Weidong Zhan; Baobao Chang; Sujian Li; Tianyu Liu; Zhifang Sui
67 | Parallel Instance Query Network for Named Entity Recognition. Highlight: To deal with them, we propose Parallel Instance Query Network (PIQN), which sets up global and learnable instance queries to extract entities from a sentence in a parallel manner. | Yongliang Shen; Xiaobin Wang; Zeqi Tan; Guangwei Xu; Pengjun Xie; Fei Huang; Weiming Lu; Yueting Zhuang
68 | ProphetChat: Enhancing Dialogue Generation with Simulation of Future Conversation. Highlight: Accordingly, we propose a novel dialogue generation framework named ProphetChat that utilizes the simulated dialogue futures in the inference phase to enhance response generation. | Chang Liu; Xu Tan; Chongyang Tao; Zhenxin Fu; Dongyan Zhao; Tie-Yan Liu; Rui Yan
69 | Modeling Multi-hop Question Answering As Single Sequence Prediction. Highlight: In this work, we propose a simple generative approach (PathFid) that extends the task beyond just answer generation by explicitly modeling the reasoning process to resolve the answer for multi-hop questions. | Semih Yavuz; Kazuma Hashimoto; Yingbo Zhou; Nitish Shirish Keskar; Caiming Xiong
70 | Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension. Highlight: In this paper, we propose a novel multilingual MRC framework equipped with a Siamese Semantic Disentanglement Model (S2DM) to disassociate semantics from syntax in representations learned by multilingual pre-trained models. | Linjuan Wu; Shaojuan Wu; Xiaowang Zhang; Deyi Xiong; Shizhan Chen; Zhiqiang Zhuang; Zhiyong Feng
71 | Multi-Granularity Structural Knowledge Distillation for Language Model Compression. Highlight: To overcome the problems, we present a novel knowledge distillation framework that gathers intermediate representations from multiple semantic granularities (e.g., tokens, spans and samples) and forms the knowledge as more sophisticated structural relations specified as the pair-wise interactions and the triplet-wise geometric angles based on multi-granularity representations. | Chang Liu; Chongyang Tao; Jiazhan Feng; Dongyan Zhao
72 | Auto-Debias: Debiasing Masked Language Models with Automated Biased Prompts. Highlight: In this paper, we propose an automatic method to mitigate the biases in pretrained language models. | Yue Guo; Yi Yang; Ahmed Abbasi
73 | Where to Go for The Holidays: Towards Mixed-Type Dialogs for Clarification of User Goals. Highlight: However, in many scenarios, limited by experience and knowledge, users may know what they need, but still struggle to figure out clear and specific goals by determining all the necessary slots. In this paper, we identify this challenge, and make a step forward by collecting a new human-to-human mixed-type dialog corpus. | Zeming Liu; Jun Xu; Zeyang Lei; Haifeng Wang; Zheng-Yu Niu; Hua Wu
74 | Semi-supervised Domain Adaptation for Dependency Parsing with Dynamic Matching Network. Highlight: The shared-private model has shown promising advantages for alleviating this problem via feature separation, whereas prior works pay more attention to enhancing shared features but neglect the in-depth relevance of specific ones. To address this issue, we apply, for the first time, a dynamic matching network to the shared-private model for semi-supervised cross-domain dependency parsing. | Ying Li; Shuaike Li; Min Zhang
75 | A Closer Look at How Fine-tuning Changes BERT. Highlight: In this work, we study the English BERT family and use two probing techniques to analyze how fine-tuning changes the space. | Yichu Zhou; Vivek Srikumar
76 | Sentence-aware Contrastive Learning for Open-Domain Passage Retrieval. Highlight: This work thus presents a refined model based on a smaller granularity, contextual sentences, to alleviate these conflicts. | Wu Hong; Zhuosheng Zhang; Jinyuan Wang; Hai Zhao
77 | FaiRR: Faithful and Robust Deductive Reasoning Over Natural Language. Highlight: In this work, we frame the deductive logical reasoning task by defining three modular components: rule selection, fact selection, and knowledge composition. | Soumya Sanyal; Harman Singh; Xiang Ren
78 | HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation. Highlight: We present a new dataset, HiTab, to study question answering (QA) and natural language generation (NLG) over hierarchical tables. | Zhoujun Cheng; Haoyu Dong; Zhiruo Wang; Ran Jia; Jiaqi Guo; Yan Gao; Shi Han; Jian-Guang Lou; Dongmei Zhang
79 | Doctor Recommendation in Online Health Forums Via Expertise Learning. Highlight: To better help patients, this paper studies a novel task of doctor recommendation to enable automatic pairing of a patient to a doctor with relevant expertise. | Xiaoxin Lu; Yubo Zhang; Jing Li; Shi Zong
80 | Continual Prompt Tuning for Dialog State Tracking. Highlight: To achieve bi-directional knowledge transfer among tasks, we propose several techniques (continual prompt initialization, query fusion, and memory replay) to transfer knowledge from preceding tasks and a memory-guided technique to transfer knowledge from subsequent tasks. | Qi Zhu; Bing Li; Fei Mi; Xiaoyan Zhu; Minlie Huang
81 | There’s A Time and Place for Reasoning Beyond The Image. Highlight: In this work, we formulate this problem and introduce TARA: a dataset with 16k images with their associated news, time, and location, automatically extracted from the New York Times, and an additional 61k examples as distant supervision from WIT. | Xingyu Fu; Ben Zhou; Ishaan Chandratreya; Carl Vondrick; Dan Roth
82 | FORTAP: Using Formulas for Numerical-Reasoning-Aware Table Pretraining. Highlight: In this paper, we find that the spreadsheet formula, a language commonly used to perform computations on numerical values in spreadsheets, provides valuable supervision for numerical reasoning in tables. | Zhoujun Cheng; Haoyu Dong; Ran Jia; Pengfei Wu; Shi Han; Fan Cheng; Dongmei Zhang
83 | Multimodal Fusion Via Cortical Network Inspired Losses. Highlight: Inspired by neuroscientific ideas about multisensory integration and processing, we investigate the effect of introducing neural dependencies in the loss functions. | Shiv Shankar
84 | Modeling Temporal-Modal Entity Graph for Procedural Multimodal Machine Comprehension. Highlight: In this study, we approach Procedural M3C at a fine-grained level (compared with existing explorations at the document or sentence level), namely the entity level. | Huibin Zhang; Zhengkun Zhang; Yao Zhang; Jun Wang; Yufan Li; Ning Jiang; Xin Wei; Zhenglu Yang
85 | Explanation Graph Generation Via Pre-trained Language Models: An Empirical Study with Contrastive Learning. Highlight: In this work, we study pre-trained language models that generate explanation graphs in an end-to-end manner and analyze their ability to learn the structural constraints and semantics of such graphs. | Swarnadeep Saha; Prateek Yadav; Mohit Bansal
86 | Unsupervised Extractive Opinion Summarization Using Sparse Coding. Highlight: We present Semantic Autoencoder (SemAE) to perform extractive opinion summarization in an unsupervised manner. | Somnath Basu Roy Chowdhury; Chao Zhao; Snigdha Chaturvedi
87 | LexSubCon: Integrating Knowledge from Lexical Resources Into Contextual Embeddings for Lexical Substitution. Highlight: We introduce LexSubCon, an end-to-end lexical substitution framework based on contextual embedding models that can identify highly-accurate substitute candidates. | George Michalopoulos; Ian McKillop; Alexander Wong; Helen Chen
88 | Think Before You Speak: Explicitly Generating Implicit Commonsense Knowledge for Response Generation. Highlight: Current neural response generation (RG) models are trained to generate responses directly, omitting unstated implicit knowledge. In this paper, we present Think-Before-Speaking (TBS), a generative approach to first externalize implicit commonsense knowledge (think) and use this knowledge to generate responses (speak). | Pei Zhou; Karthik Gopalakrishnan; Behnam Hedayatnia; Seokhwan Kim; Jay Pujara; Xiang Ren; Yang Liu; Dilek Hakkani-Tur
89 | Flow-Adapter Architecture for Unsupervised Machine Translation. Highlight: In this work, we propose a flow-adapter architecture for unsupervised NMT. | Yihong Liu; Haris Jabbar; Hinrich Schuetze
90 | Efficient Unsupervised Sentence Compression By Fine-tuning Transformers with Reinforcement Learning. Highlight: In this work, we explore the use of reinforcement learning to train effective sentence compression models that are also fast when generating predictions. | Demian Ghalandari; Chris Hokamp; Georgiana Ifrim
91 | Tracing Origins: Coreference-aware Machine Reading Comprehension. Highlight: In this paper, we imitate the human reading process of connecting anaphoric expressions and explicitly leverage the coreference information of entities to enhance the word embeddings from the pre-trained language model. The goal is to highlight the coreference mentions of entities that must be identified for coreference-intensive question answering in QUOREF, a relatively new dataset specifically designed to evaluate the coreference-related performance of a model. | Zhuosheng Zhang; Hai Zhao
92 | WatClaimCheck: A New Dataset for Claim Entailment and Inference. Highlight: We contribute a new dataset for the task of automated fact checking and an evaluation of state-of-the-art algorithms. | Kashif Khan; Ruizhe Wang; Pascal Poupart
93 | FrugalScore: Learning Cheaper, Lighter and Faster Evaluation Metrics for Automatic Text Generation. Highlight: Conversely, new metrics based on large pretrained language models are much more reliable, but require significant computational resources. In this paper, we propose FrugalScore, an approach to learn a fixed, low cost version of any expensive NLG metric, while retaining most of its original performance. | Moussa Kamal Eddine; Guokan Shang; Antoine Tixier; Michalis Vazirgiannis
94 | A Well-Composed Text Is Half Done! Composition Sampling for Diverse Conditional Generation. Highlight: We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality compared to previous stochastic decoding strategies. | Shashi Narayan; Gonçalo Simões; Yao Zhao; Joshua Maynez; Dipanjan Das; Michael Collins; Mirella Lapata
95 | Synthetic Question Value Estimation for Domain Adaptation of Question Answering. Highlight: In this paper, we introduce a novel idea of training a question value estimator (QVE) that directly estimates the usefulness of synthetic questions for improving the target-domain QA performance. | Xiang Yue; Ziyu Yao; Huan Sun
96 | Better Language Model with Hypernym Class Prediction. Highlight: Class-based language models (LMs) have long been devised to address context sparsity in n-gram LMs. In this study, we revisit this approach in the context of neural LMs. (The standard class-based factorization is sketched after the table.) | He Bai; Tong Wang; Alessandro Sordoni; Peng Shi
97 | Tackling Fake News Detection By Continually Improving Social Context Representations Using Graph Neural Networks. Highlight: We view fake news detection as reasoning over the relations between sources, articles they publish, and engaging users on social media in a graph framework. After embedding this information, we formulate inference operators which augment the graph edges by revealing unobserved interactions between its elements, such as similarity between documents’ contents and users’ engagement patterns. | Nikhil Mehta; Maria Pacheco; Dan Goldwasser
98 | Understanding Gender Bias in Knowledge Base Embeddings. Highlight: Knowledge base (KB) embeddings have been shown to contain gender biases. In this paper, we study two questions regarding these biases: how to quantify them, and how to trace their origins in the KB? | Yupei Du; Qi Zheng; Yuanbin Wu; Man Lan; Yan Yang; Meirong Ma
99 | Computational Historical Linguistics and Language Diversity in South Asia. Highlight: We claim that data scatteredness (rather than scarcity) is the primary obstacle in the development of South Asian language technology, and suggest that the study of language history is uniquely aligned with surmounting this obstacle. | Aryaman Arora; Adam Farris; Samopriya Basu; Suresh Kolichala
100 | Faithful or Extractive? On Mitigating The Faithfulness-Abstractiveness Trade-off in Abstractive Summarization. Highlight: In this work, we present a framework for evaluating the effective faithfulness of summarization systems, by generating a faithfulness-abstractiveness trade-off curve that serves as a control at different operating points on the abstractiveness spectrum. | Faisal Ladhak; Esin Durmus; He He; Claire Cardie; Kathleen McKeown
101 | Slangvolution: A Causal Analysis of Semantic Change and Frequency Dynamics in Slang. Highlight: Languages are continuously undergoing changes, and the mechanisms that underlie these changes are still a matter of debate. In this work, we approach language evolution through the lens of causality in order to model not only how various distributional factors associate with language change, but how they causally affect it. | Daphna Keidar; Andreas Opedal; Zhijing Jin; Mrinmaya Sachan
102 | Spurious Correlations in Reference-Free Evaluation of Text Generation. Highlight: Despite promising recent results, we find evidence that reference-free evaluation metrics of summarization and dialog generation may be relying on spurious correlations with measures such as word overlap, perplexity, and length. | Esin Durmus; Faisal Ladhak; Tatsunori Hashimoto
103 | On The Ingredients of An Effective Zero-shot Semantic Parser. Highlight: In this paper we analyze zero-shot parsers through the lenses of the language and logical gaps (Herzig and Berant, 2019), which quantify the discrepancy of language and programmatic patterns between the canonical examples and real-world user-issued ones. | Pengcheng Yin; John Wieting; Avirup Sil; Graham Neubig
104 | Bias Mitigation in Machine Translation Quality Estimation. Highlight: We analyse the partial input bias in further detail and evaluate four approaches to use auxiliary tasks for bias mitigation. | Hanna Behnke; Marina Fomicheva; Lucia Specia
105 | Unified Speech-Text Pre-training for Speech Translation and Recognition. Highlight: In this work, we describe a method to jointly pre-train speech and text in an encoder-decoder modeling framework for speech translation and recognition. | Yun Tang; Hongyu Gong; Ning Dong; Changhan Wang; Wei-Ning Hsu; Jiatao Gu; Alexei Baevski; Xian Li; Abdelrahman Mohamed; Michael Auli; Juan Pino
106 | Match The Script, Adapt If Multilingual: Analyzing The Effect of Multilingual Pretraining on Cross-lingual Transferability. Highlight: However, it is unclear how the number of pretraining languages influences a model’s zero-shot learning for languages unseen during pretraining. To fill this gap, we ask the following research questions: (1) How does the number of pretraining languages influence zero-shot performance on unseen target languages? (2) Does the answer to that question change with model adaptation? (3) Do the findings for our first question change if the languages used for pretraining are all related? | Yoshinari Fujinuma; Jordan Boyd-Graber; Katharina Kann
107 | Structured Pruning Learns Compact and Accurate Models. Highlight: In this work, we propose a task-specific structured pruning method CoFi (Coarse- and Fine-grained Pruning), which delivers highly parallelizable subnetworks and matches the distillation methods in both accuracy and latency, without resorting to any unlabeled data. | Mengzhou Xia; Zexuan Zhong; Danqi Chen
108 | How Can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for The Cherokee Language. Highlight: In this work, we focus on discussing how NLP can help revitalize endangered languages. | Shiyue Zhang; Ben Frey; Mohit Bansal
109 | Differentiable Multi-Agent Actor-Critic for Multi-Step Radiology Report Summarization. Highlight: To fully explore the cascade structure and explainability of radiology report summarization, we introduce two innovations. First, we design a two-step approach: extractive summarization followed by abstractive summarization. Second, we additionally break down the extractive part into two independent tasks: extraction of salient (1) sentences and (2) keywords. | Sanjeev Kumar Karn; Ning Liu; Hinrich Schuetze; Oladimeji Farri
110 | Online Semantic Parsing for Latency Reduction in Task-Oriented Dialogue. Highlight: We propose a general framework with first a learned prefix-to-program prediction module, and then a simple yet effective thresholding heuristic for subprogram selection for early execution. | Jiawei Zhou; Jason Eisner; Michael Newman; Emmanouil Antonios Platanios; Sam Thomson
111 | Few-Shot Tabular Data Enrichment Using Fine-Tuned Transformer Architectures. Highlight: Our approach is based on an adaptation of BERT, for which we present a novel fine-tuning approach that reformulates the tuples of the datasets as sentences. | Asaf Harari; Gilad Katz
112 | SummN: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose SummN, a simple, flexible, and effective multi-stage framework for input texts that are longer than the maximum context length of typical pretrained LMs. |
Yusen Zhang; Ansong Ni; Ziming Mao; Chen Henry Wu; Chenguang Zhu; Budhaditya Deb; Ahmed Awadallah; Dragomir Radev; Rui Zhang; |
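A rough sketch of the multi-stage idea: split the input into chunks that fit the model's context window, produce coarse summaries, and repeat until a final pass fits. The `summarize` stub stands in for any pretrained summarizer and merely truncates, where a real model would compress.

```python
# Minimal sketch of multi-stage summarization for over-length inputs.
def summarize(text: str) -> str:
    # Stand-in for a pretrained summarizer with a limited context window;
    # a real model would compress rather than truncate.
    return text[:100]

def multi_stage_summarize(document: str, window: int = 500) -> str:
    while len(document) > window:
        chunks = [document[i:i + window] for i in range(0, len(document), window)]
        document = " ".join(summarize(c) for c in chunks)  # coarse stage
    return summarize(document)                             # final fine stage

print(multi_stage_summarize("turn: hello, how are you doing today? " * 200))
```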
113 | Open Domain Question Answering with A Unified Knowledge Interface Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While data-to-text generation has the potential to serve as a universal interface for data and text, its feasibility for downstream tasks remains largely unknown. In this work, we bridge this gap and use the data-to-text method as a means for encoding structured knowledge for open-domain question answering. |
Kaixin Ma; Hao Cheng; Xiaodong Liu; Eric Nyberg; Jianfeng Gao; |
114 | Principled Paraphrase Generation with Parallel Corpora Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we formalize the implicit similarity function induced by this approach, and show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation. |
Aitor Ormazabal; Mikel Artetxe; Aitor Soroa; Gorka Labaka; Eneko Agirre; |
115 | GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, existing multilingual ToD datasets either have a limited coverage of languages due to the high cost of data curation, or ignore the fact that dialogue entities barely exist in countries speaking these languages. To tackle these limitations, we introduce a novel data curation method that generates GlobalWoZ – a large-scale multilingual ToD dataset globalized from an English ToD dataset for three unexplored use cases of multilingual ToD systems. |
Bosheng Ding; Junjie Hu; Lidong Bing; Mahani Aljunied; Shafiq Joty; Luo Si; Chunyan Miao; |
116 | Domain Knowledge Transferring for Pre-trained Language Model Via Calibrated Activation Boundary Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we propose a domain knowledge transferring (DoKTra) framework for PLMs without additional in-domain pretraining. |
Dongha Choi; HongSeok Choi; Hyunju Lee; |
117 | Retrieval-guided Counterfactual Generation for QA Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We focus on the task of creating counterfactuals for question answering, which presents unique challenges related to world knowledge, semantic diversity, and answerability. To address these challenges, we develop a Retrieve-Generate-Filter (RGF) technique to create counterfactual evaluation and training data with minimal human supervision. |
Bhargavi Paranjape; Matthew Lamm; Ian Tenney; |
118 | DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present DYLE, a novel dynamic latent extraction approach for abstractive long-input summarization. |
Ziming Mao; Chen Henry Wu; Ansong Ni; Yusen Zhang; Rui Zhang; Tao Yu; Budhaditya Deb; Chenguang Zhu; Ahmed Awadallah; Dragomir Radev; |
119 | Searching for Fingerspelled Content in American Sign Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we address the problem of searching for fingerspelled keywords or key phrases in raw sign language videos. |
Bowen Shi; Diane Brentari; Greg Shakhnarovich; Karen Livescu; |
120 | Skill Induction and Planning with Latent Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a framework for learning hierarchical policies from demonstrations, using sparse natural language annotations to guide the discovery of reusable skills for autonomous decision-making. |
Pratyusha Sharma; Antonio Torralba; Jacob Andreas; |
121 | Fully-Semantic Parsing and Generation: The BabelNet Meaning Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present the BabelNet Meaning Representation (BMR), an interlingual formalism that abstracts away from language-specific constraints by taking advantage of the multilingual semantic resources of BabelNet and VerbAtlas. |
Abelardo Carlos Martínez Lorenzo; Marco Maru; Roberto Navigli; |
122 | Leveraging Similar Users for Personalized Language Modeling with Limited Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, when a new user joins a platform and not enough text is available, it is harder to build effective personalized language models. We propose a solution for this problem, using a model trained on users that are similar to a new user. |
Charles Welch; Chenxi Gu; Jonathan Kummerfeld; Veronica Perez-Rosas; Rada Mihalcea; |
123 | DEEP: DEnoising Entity Pre-training for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Earlier named entity translation methods mainly focus on phonetic transliteration, which ignores the sentence context for translation and is limited in domain and language coverage. To address this limitation, we propose DEEP, a DEnoising Entity Pre-training method that leverages large amounts of monolingual data and a knowledge base to improve named entity translation accuracy within sentences. |
Junjie Hu; Hiroaki Hayashi; Kyunghyun Cho; Graham Neubig; |
124 | Multi-Modal Sarcasm Detection Via Cross-Modal Graph Convolutional Network Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate multi-modal sarcasm detection from a novel perspective by constructing a cross-modal graph for each instance to explicitly draw the ironic relations between textual and visual modalities. |
Bin Liang; Chenwei Lou; Xiang Li; Min Yang; Lin Gui; Yulan He; Wenjie Pei; Ruifeng Xu; |
125 | Composable Sparse Fine-Tuning for Cross-Lingual Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Sparse fine-tuning is expressive, as it controls the behavior of all model components; composability across tasks and languages is equally desirable. In this work, we introduce a new fine-tuning method with both these desirable properties. |
Alan Ansell; Edoardo Ponti; Anna Korhonen; Ivan Vulic; |
126 | Toward Annotator Group Bias in Crowdsourcing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we reveal that annotators within the same demographic group tend to show consistent group bias in annotation tasks and thus we conduct an initial study on annotator group bias. |
Haochen Liu; Joseph Thekinen; Sinem Mollaoglu; Da Tang; Ji Yang; Youlong Cheng; Hui Liu; Jiliang Tang; |
127 | Under The Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Focusing on speech translation, we conduct a multifaceted evaluation on three language directions (English-French/Italian/Spanish), with models trained on varying amounts of data and different word segmentation techniques. |
Beatrice Savoldi; Marco Gaido; Luisa Bentivogli; Matteo Negri; Marco Turchi; |
128 | Answering Open-Domain Multi-Answer Questions Via A Recall-then-Verify Framework Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address these issues, we propose to answer open-domain multi-answer questions with a recall-then-verify framework, which separates the reasoning process of each answer so that we can make better use of retrieved evidence while also leveraging large models under the same memory constraint. |
Zhihong Shao; Minlie Huang; |
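The recall-then-verify separation can be sketched generically; both stubs below are hypothetical stand-ins for the paper's learned recaller and verifier.

```python
# Generic recall-then-verify pipeline; recall() and verify() are
# hypothetical stubs for the paper's learned recaller and verifier.
def recall(question: str) -> list[str]:
    return ["answer A", "answer B", "answer C"]  # over-generate candidates

def verify(question: str, candidate: str) -> float:
    # Score each candidate independently against its own retrieved evidence.
    return {"answer A": 0.9, "answer B": 0.3, "answer C": 0.7}[candidate]

question = "Which answers survive verification?"
answers = [c for c in recall(question) if verify(question, c) >= 0.5]
print(answers)  # ['answer A', 'answer C']
```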
129 | Probing As Quantifying Inductive Bias Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In the theoretical portion of this paper, we take the position that the goal of probing ought to be measuring the amount of inductive bias that the representations encode on a specific task. |
Alexander Immer; Lucas Torroba Hennigen; Vincent Fortuin; Ryan Cotterell; |
130 | Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To facilitate the comparison on all sparsity levels, we present Dynamic Sparsification, a simple approach that allows training the model once and adapting to different model sizes at inference. |
Yanyang Li; Fuli Luo; Runxin Xu; Songfang Huang; Fei Huang; Liwei Wang; |
131 | GPT-D: Inducing Dementia-related Linguistic Anomalies By Deliberate Degradation of Artificial Neural Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, questions remain about their ability to generalize beyond the small reference sets that are publicly available for research. As an alternative to fitting model parameters directly, we propose a novel method by which a Transformer DL model (GPT-2) pre-trained on general English text is paired with an artificially degraded version of itself (GPT-D), to compute the ratio between these two models’ perplexities on language from cognitively healthy and impaired individuals. |
Changye Li; David Knopman; Weizhe Xu; Trevor Cohen; Serguei Pakhomov; |
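The perplexity-ratio mechanism is straightforward to sketch with the Hugging Face transformers library. In this illustration the "degraded" model is simply a second copy of GPT-2, since the paper's specific degradation procedure is not reproduced here.

```python
# Sketch of scoring text with a healthy and a degraded model and
# taking the perplexity ratio. The "degraded" model here is just a
# second GPT-2 copy; the paper applies its own degradation scheme.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
healthy = GPT2LMHeadModel.from_pretrained("gpt2").eval()
degraded = GPT2LMHeadModel.from_pretrained("gpt2").eval()  # stand-in for GPT-D

def perplexity(model, text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token negative log-likelihood
    return math.exp(loss.item())

text = "Sample transcript from a picture description task."
ratio = perplexity(degraded, text) / perplexity(healthy, text)
print(f"perplexity ratio (degraded / healthy): {ratio:.3f}")
```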
132 | An Empirical Survey of The Effectiveness of Debiasing Techniques for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we perform an empirical survey of five recently proposed bias mitigation techniques: Counterfactual Data Augmentation (CDA), Dropout, Iterative Nullspace Projection, Self-Debias, and SentenceDebias. |
Nicholas Meade; Elinor Poole-Dayan; Siva Reddy; |
133 | Exploring and Adapting Chinese GPT to Pinyin Input Method Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While GPT has become the de-facto method for text generation tasks, its application to pinyin input method remains unexplored. In this work, we make the first exploration to leverage Chinese GPT for pinyin input method. |
Minghuan Tan; Yong Dai; Duyu Tang; Zhangyin Feng; Guoping Huang; Jing Jiang; Jiwei Li; Shuming Shi; |
134 | Enhancing Cross-lingual Natural Language Inference By Prompt-learning from Cross-lingual Templates Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by recent promising results achieved by prompt-learning, this paper proposes a novel prompt-learning based framework for enhancing XNLI. |
Kunxun Qi; Hai Wan; Jianfeng Du; Haolan Chen; |
135 | Sense Embeddings Are Also Biased – Evaluating Social Biases in Static and Contextualised Sense Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We create a benchmark dataset for evaluating the social biases in sense embeddings and propose novel sense-specific bias evaluation measures. |
Yi Zhou; Masahiro Kaneko; Danushka Bollegala; |
136 | Hybrid Semantics for Goal-Directed Natural Language Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider the problem of generating natural language given a communicative goal and a world description. |
Connor Baumler; Soumya Ray; |
137 | Predicting Intervention Approval in Clinical Trials Through Multi-Document Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we propose a new method to predict the effectiveness of an intervention in a clinical trial. |
Georgios Katsimpras; Georgios Paliouras; |
138 | BiTIIMT: A Bilingual Text-infilling Method for Interactive Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel BiTIIMT system, Bilingual Text-Infilling for Interactive Neural Machine Translation. |
Yanling Xiao; Lemao Liu; Guoping Huang; Qu Cui; Shujian Huang; Shuming Shi; Jiajun Chen; |
139 | Distributionally Robust Finetuning BERT for Covariate Drift in Spoken Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we investigate robustness against covariate drift in spoken language understanding (SLU). |
Samuel Broscheit; Quynh Do; Judith Gaspers; |
140 | Enhancing Chinese Pre-trained Language Model Via Heterogeneous Linguistics Graph Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Hence, we propose a task-free enhancement module termed as Heterogeneous Linguistics Graph (HLG) to enhance Chinese pre-trained language models by integrating linguistics knowledge. |
Yanzeng Li; Jiangxia Cao; Xin Cong; Zhenyu Zhang; Bowen Yu; Hongsong Zhu; Tingwen Liu; |
141 | Divide and Denoise: Learning from Noisy Labels in Fine-Grained Entity Typing with Cluster-Wise Loss Correction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing FET noise learning methods rely on prediction distributions in an instance-independent manner, which causes the problem of confirmation bias. In this work, we propose a clustering-based loss correction framework named Feature Cluster Loss Correction (FCLC) to address these problems. |
Kunyuan Pang; Haoyu Zhang; Jie Zhou; Ting Wang; |
142 | Towards Robustness of Text-to-SQL Models Against Natural and Realistic Adversarial Table Perturbation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Previous studies along this line primarily focused on perturbations in the natural language question side, neglecting the variability of tables. Motivated by this, we propose the Adversarial Table Perturbation (ATP) as a new attacking paradigm to measure robustness of Text-to-SQL models. |
Xinyu Pi; Bing Wang; Yan Gao; Jiaqi Guo; Zhoujun Li; Jian-Guang Lou; |
143 | Overcoming Catastrophic Forgetting Beyond Continual Learning: Balanced Training for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The underlying cause is that training samples do not get balanced training in each model update, so we name this problem imbalanced training. To alleviate this problem, we propose Complementary Online Knowledge Distillation (COKD), which uses dynamically updated teacher models trained on specific data orders to iteratively provide complementary knowledge to the student model. |
Chenze Shao; Yang Feng; |
144 | Metaphors in Pre-Trained Language Models: Probing and Generalization Across Datasets and Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Large pre-trained language models (PLMs) are therefore assumed to encode metaphorical knowledge useful for NLP systems. In this paper, we investigate this hypothesis for PLMs, by probing metaphoricity information in their encodings, and by measuring the cross-lingual and cross-dataset generalization of this information. |
Ehsan Aghazadeh; Mohsen Fayyaz; Yadollah Yaghoobzadeh; |
145 | Discrete Opinion Tree Induction for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an aspect-specific and language-agnostic discrete latent opinion tree model as an alternative structure to explicit dependency trees. |
Chenhua Chen; Zhiyang Teng; Zhongqing Wang; Yue Zhang; |
146 | Investigating Non-local Features for Neural Constituency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate injecting non-local features into the training process of a local span-based parser, by predicting constituent n-gram non-local patterns and ensuring consistency between non-local patterns and local constituents. |
Leyang Cui; Sen Yang; Yue Zhang; |
147 | Learning from Sibling Mentions with Scalable Graph Inference in Fine-Grained Entity Typing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first empirically find that existing models struggle to handle hard mentions due to their insufficient contexts, which consequently limits their overall typing performance. To this end, we propose to exploit sibling mentions for enhancing the mention representations. |
Yi Chen; Jiayang Cheng; Haiyun Jiang; Lemao Liu; Haisong Zhang; Shuming Shi; Ruifeng Xu; |
148 | A Variational Hierarchical Model for Neural Cross-Lingual Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, it is very challenging for the model to directly conduct CLS as it requires both the abilities to translate and summarize. To address this issue, we propose a hierarchical model for the CLS task, based on the conditional variational auto-encoder. |
Yunlong Liang; Fandong Meng; Chulun Zhou; Jinan Xu; Yufeng Chen; Jinsong Su; Jie Zhou; |
149 | On The Robustness of Question Rewriting Systems to Questions of Varying Hardness Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we are interested in the robustness of a QR system to questions varying in rewriting hardness or difficulty. |
Hai Ye; Hwee Tou Ng; Wenjuan Han; |
150 | OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models Across Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce OpenHands, a library where we take four key ideas from the NLP community for low-resource languages and apply them to sign languages for word-level recognition. |
Prem Selvaraj; Gokul NC; Pratyush Kumar; Mitesh Khapra; |
151 | Bert2BERT: Towards Reusable Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose bert2BERT, which can effectively transfer the knowledge of an existing smaller pre-trained model to a large model through parameter initialization and significantly improve the pre-training efficiency of the large model. |
Cheng Chen; Yichun Yin; Lifeng Shang; Xin Jiang; Yujia Qin; Fengyu Wang; Zhi Wang; Xiao Chen; Zhiyuan Liu; Qun Liu; |
152 | Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, previous approaches either (i) use separately pre-trained visual and textual models, which ignore the cross-modal alignment or (ii) use vision-language models pre-trained with general pre-training tasks, which are inadequate to identify fine-grained aspects, opinions, and their alignments across modalities. To tackle these limitations, we propose a task-specific Vision-Language Pre-training framework for MABSA (VLP-MABSA), which is a unified multimodal encoder-decoder architecture for all the pre-training and downstream tasks. |
Yan Ling; Jianfei Yu; Rui Xia; |
153 | You Might Think About Slightly Revising The Title: Identifying Hedges in Peer-tutoring Interactions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We compared approaches relying on pre-trained resources with others that integrate insights from the social science literature. |
Yann Raphalen; Chloé Clavel; Justine Cassell; |
154 | Efficient Cluster-Based K-Nearest-Neighbor Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To make it practical, in this paper, we explore a more efficient kNN-MT and propose to use clustering to improve the retrieval efficiency. |
Dexin Wang; Kai Fan; Boxing Chen; Deyi Xiong; |
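The core trick of restricting nearest-neighbor search to one cluster can be sketched with scikit-learn; the toy datastore, dimensions, and cluster count are all illustrative.

```python
# Illustrative cluster-restricted kNN lookup over a toy datastore.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
keys = rng.normal(size=(10_000, 64)).astype("float32")  # decoder hidden states
values = rng.integers(0, 32_000, size=10_000)           # target token ids

km = KMeans(n_clusters=64, n_init=10, random_state=0).fit(keys)
buckets = {c: np.where(km.labels_ == c)[0] for c in range(64)}

def knn_lookup(query: np.ndarray, k: int = 8) -> np.ndarray:
    c = int(km.predict(query[None])[0])            # nearest centroid only
    idx = buckets[c]                               # candidates in that cluster
    dist = np.linalg.norm(keys[idx] - query, axis=1)
    return values[idx[np.argsort(dist)[:k]]]       # exact kNN within cluster

print(knn_lookup(rng.normal(size=64).astype("float32")))
```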
155 | Headed-Span-Based Projective Dependency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new method for projective dependency parsing based on headed spans. |
Songlin Yang; Kewei Tu; |
156 | Decoding Part-of-Speech from Human EEG Signals Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work explores techniques to predict Part-of-Speech (PoS) tags from neural signals measured at millisecond resolution with electroencephalography (EEG) during text reading. |
Alex Murphy; Bernd Bohnet; Ryan McDonald; Uta Noppeney; |
157 | Robust Lottery Tickets for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these tickets prove to be not robust to adversarial examples, and even worse than their PLM counterparts. To address this problem, we propose a novel method based on learning binary weight masks to identify robust tickets hidden in the original PLMs. |
Rui Zheng; Bao Rong; Yuhao Zhou; Di Liang; Sirui Wang; Wei Wu; Tao Gui; Qi Zhang; Xuanjing Huang; |
158 | Knowledgeable Prompt-tuning: Incorporating Knowledge Into Prompt Verbalizer for Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we focus on incorporating external knowledge into the verbalizer, forming a knowledgeable prompt-tuning (KPT), to improve and stabilize prompt-tuning. |
Shengding Hu; Ning Ding; Huadong Wang; Zhiyuan Liu; Jingang Wang; Juanzi Li; Wei Wu; Maosong Sun; |
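A toy sketch of the knowledgeable verbalizer: each class label maps to a set of related words (invented below), and the class score aggregates the masked-LM probability over the whole set rather than over a single hand-picked word.

```python
# Toy knowledgeable verbalizer: aggregate masked-LM probabilities over
# a set of label-related words. All words and probabilities are invented.
label_words = {
    "sports": ["football", "basketball", "athlete"],
    "science": ["physics", "chemistry", "research"],
}

def class_score(token_probs: dict, label: str) -> float:
    words = label_words[label]
    return sum(token_probs.get(w, 0.0) for w in words) / len(words)

mask_probs = {"football": 0.30, "athlete": 0.10, "physics": 0.02}
print({label: class_score(mask_probs, label) for label in label_words})
```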
159 | Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a cross-lingual contrastive learning framework to learn FGET models for low-resource languages. |
Xu Han; Yuqi Luo; Weize Chen; Zhiyuan Liu; Maosong Sun; Zhou Botong; Hao Fei; Suncong Zheng; |
160 | MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose Masked Entity Language Modeling (MELM) as a novel data augmentation framework for low-resource NER. |
Ran Zhou; Xin Li; Ruidan He; Lidong Bing; Erik Cambria; Luo Si; Chunyan Miao; |
161 | Word2Box: Capturing Set-Theoretic Semantics of Words Using Box Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we provide a fuzzy-set interpretation of box embeddings, and learn box representations of words using a set-theoretic training objective. |
Shib Dasgupta; Michael Boratko; Siddhartha Mishra; Shriya Atmakuri; Dhruvesh Patel; Xiang Li; Andrew McCallum; |
162 | IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce a comprehensive and large dataset named IAM, which can be applied to a series of argument mining tasks, including claim extraction, stance classification, evidence extraction, etc. |
Liying Cheng; Lidong Bing; Ruidan He; Qian Yu; Yan Zhang; Luo Si; |
163 | PLANET: Dynamic Content Planning in Autoregressive Transformers for Long-form Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose PLANET, a novel generation framework leveraging autoregressive self-attention mechanism to conduct content planning and surface realization dynamically. |
Zhe Hu; Hou Pong Chan; Jiachen Liu; Xinyan Xiao; Hua Wu; Lifu Huang; |
164 | CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an unsupervised reference-free metric called CTRLEval, which evaluates controlled text generation from different aspects by formulating each aspect into multiple text infilling tasks. |
Pei Ke; Hao Zhou; Yankai Lin; Peng Li; Jie Zhou; Xiaoyan Zhu; Minlie Huang; |
165 | Beyond The Granularity: Multi-Perspective Dialogue Collaborative Selection for Dialogue State Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Therefore, using consistent dialogue contents may lead to insufficient or redundant information for different slots, which affects the overall performance. To address this problem, we devise DiCoS-DST to dynamically select the relevant dialogue contents corresponding to each slot for state updating. |
Jinyu Guo; Kai Shuang; Jijie Li; Zihan Wang; Yixuan Liu; |
166 | Are Prompt-based Models Clueless? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents an empirical examination of whether few-shot prompt-based models also exploit superficial cues. |
Pride Kavumba; Ryo Takahashi; Yusuke Oda; |
167 | Learning Confidence for Transformer-based Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, this task remains a severe challenge for neural machine translation (NMT), where probabilities from softmax distribution fail to describe when the model is probably mistaken. To address this problem, we propose an unsupervised confidence estimate learning jointly with the training of the NMT model. |
Yu Lu; Jiali Zeng; Jiajun Zhang; Shuangzhi Wu; Mu Li; |
168 | Things Not Written in Text: Exploring Spatial Commonsense from Visual Signals Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Starting from the observation that images are more likely to exhibit spatial commonsense than texts, we explore whether models with visual signals learn more spatial commonsense than text-based PLMs. |
Xiao Liu; Da Yin; Yansong Feng; Dongyan Zhao; |
169 | Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While one possible solution is to directly take target contexts into these statistical metrics, the target-context-aware statistical computing is extremely expensive, and the corresponding storage overhead is unrealistic. To solve the above issues, we propose a target-context-aware metric, named conditional bilingual mutual information (CBMI), which makes it feasible to supplement target context information for statistical metrics. |
Songming Zhang; Yijin Liu; Fandong Meng; Yufeng Chen; Jinan Xu; Jian Liu; Jie Zhou; |
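For intuition, the CBMI quantity can be written, up to the paper's exact normalization (not reproduced here), as the log-ratio of a translation model's and a target-side language model's predictions for the same token:

```latex
% Rough sketch of the CBMI quantity; subscripts NMT and LM denote the
% translation model and a target-side language model respectively.
\mathrm{CBMI}(x;\, y_t) \approx
  \log \frac{p_{\mathrm{NMT}}(y_t \mid y_{<t},\, x)}
            {p_{\mathrm{LM}}(y_t \mid y_{<t})}
```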
170 | ClusterFormer: Neural Clustering Attention for Efficient and Effective Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Other sparse methods use clustering patterns to select words, but the clustering process is separate from the training process of the target task, which causes a decrease in effectiveness. To address these limitations, we design a neural clustering method, which can be seamlessly integrated into the Self-Attention Mechanism in Transformer. |
Ningning Wang; Guobing Gan; Peng Zhang; Shuai Zhang; Junqiu Wei; Qun Liu; Xin Jiang; |
171 | Bottom-Up Constituency Parsing and Nested Named Entity Recognition with Pointer Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we cast nested NER to constituency parsing and propose a novel pointing mechanism for bottom-up parsing to tackle both tasks. |
Songlin Yang; Kewei Tu; |
172 | Redistributing Low-Frequency Words: Making The Most of Monolingual Data in Non-Autoregressive Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we provide an appealing alternative for NAT – monolingual KD, which trains the NAT student on external monolingual data with an AT teacher trained on the original bilingual data. |
Liang Ding; Longyue Wang; Shuming Shi; Dacheng Tao; Zhaopeng Tu; |
173 | Dependency Parsing As MRC-based Span-Span Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Higher-order methods for dependency parsing can partially but not fully address the issue that edges in dependency trees should be constructed at the text span/subtree level rather than word level. In this paper, we propose a new method for dependency parsing to address this issue. |
Leilei Gan; Yuxian Meng; Kun Kuang; Xiaofei Sun; Chun Fan; Fei Wu; Jiwei Li; |
174 | Adversarial Soft Prompt Tuning for Cross-Domain Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel Adversarial Soft Prompt Tuning method (AdSPT) to better model cross-domain sentiment analysis. |
Hui Wu; Xiaodong Shi; |
175 | Generating Scientific Claims for Zero-Shot Scientific Fact Checking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose CLAIMGEN-BART, a new supervised method for generating claims supported by the literature, as well as KBIN, a novel method for generating claim negations. |
Dustin Wright; David Wadden; Kyle Lo; Bailey Kuehl; Arman Cohan; Isabelle Augenstein; Lucy Wang; |
176 | Modeling Dual Read/Write Paths for Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a method of dual-path SiMT which introduces duality constraints to direct the read/write path. |
Shaolei Zhang; Yang Feng; |
177 | ExtEnD: Extractive Entity Disambiguation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, despite their significant performance achievements, most of these approaches frame ED through classification formulations that have intrinsic limitations, both computationally and from a modeling perspective. In contrast with this trend, here we propose ExtEnD, a novel local formulation for ED where we frame this task as a text extraction problem, and present two Transformer-based architectures that implement it. |
Edoardo Barba; Luigi Procopio; Roberto Navigli; |
178 | Hierarchical Sketch Induction for Paraphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a generative model of paraphrase generation, that encourages syntactic diversity by conditioning on an explicit syntactic sketch. |
Tom Hosking; Hao Tang; Mirella Lapata; |
179 | Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore techniques to automatically convert English text for training OpenIE systems in other languages. |
Keshav Kolluru; Muqeeth Mohammed; Shubham Mittal; Soumen Chakrabarti; Mausam; |
180 | Text-to-Table: A New Way of Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we formalize text-to-table as a sequence-to-sequence (seq2seq) problem. |
Xueqing Wu; Jiacheng Zhang; Hang Li; |
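One plausible way to realize this seq2seq formulation is to linearize the target table with separator tokens; the markers and the example below are our own choices, not necessarily the paper's scheme.

```python
# Linearize a table into a target string for seq2seq training.
# The <row> and | markers are illustrative choices.
def linearize(table: list[list[str]]) -> str:
    return " <row> ".join(" | ".join(row) for row in table)

source = "Alice scored 31 points and Bob scored 22 points."
target = linearize([["player", "points"], ["Alice", "31"], ["Bob", "22"]])
print(target)  # player | points <row> Alice | 31 <row> Bob | 22
# A pretrained seq2seq model is then fine-tuned on (source, target) pairs.
```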
181 | Accelerating Code Search with Deep Hashing and Code Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel method CoSHC to accelerate code search with deep hashing and code classification, aiming to perform efficient code search without sacrificing too much accuracy. |
Wenchao Gu; Yanlin Wang; Lun Du; Hongyu Zhang; Shi Han; Dongmei Zhang; Michael Lyu; |
182 | Other Roles Matter! Enhancing Role-Oriented Dialogue Summarization Via Role Interactions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, we believe that other roles’ content could benefit the quality of summaries, such as the omitted information mentioned by other roles. Therefore, we propose a novel role interaction enhanced method for role-oriented dialogue summarization. |
Haitao Lin; Junnan Zhu; Lu Xiang; Yu Zhou; Jiajun Zhang; Chengqing Zong; |
183 | ClarET: Pre-training A Correlation-Aware Context-To-Event Transformer for Event-Centric Generation and Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to pre-train a general Correlation-aware context-to-Event Transformer (ClarET) for event-centric reasoning. |
Yucheng Zhou; Tao Shen; Xiubo Geng; Guodong Long; Daxin Jiang; |
184 | Measuring and Mitigating Name Biases in Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we describe a new source of bias prevalent in NMT systems, relating to translations of sentences containing person names. |
Jun Wang; Benjamin Rubinstein; Trevor Cohn; |
185 | Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a substantial step in better understanding the SOTA sequence-to-sequence (Seq2Seq) pretraining for neural machine translation (NMT). |
Wenxuan Wang; Wenxiang Jiao; Yongchang Hao; Xing Wang; Shuming Shi; Zhaopeng Tu; Michael Lyu; |
186 | MSCTD: A Multimodal Sentiment Chat Translation Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce a new task named Multimodal Chat Translation (MCT), aiming to generate more accurate translations with the help of the associated dialogue history and visual context. |
Yunlong Liang; Fandong Meng; Jinan Xu; Yufeng Chen; Jie Zhou; |
187 | Learning Disentangled Textual Representations Via Statistical Measures of Similarity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a family of regularizers for learning disentangled representations that do not require training. |
Pierre Colombo; Guillaume Staerman; Nathan Noiry; Pablo Piantanida; |
188 | On The Sensitivity and Stability of Model Interpretations in NLP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Motivated by the desiderata of sensitivity and stability, we introduce a new class of interpretation methods that adopt techniques from adversarial robustness. |
Fan Yin; Zhouxing Shi; Cho-Jui Hsieh; Kai-Wei Chang; |
189 | Down and Across: Introducing Crossword-Solving As A New NLP Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce solving crossword puzzles as a new natural language understanding task. |
Saurabh Kulshreshtha; Olga Kovaleva; Namrata Shivagunde; Anna Rumshisky; |
190 | Generating Data to Mitigate Spurious Correlations in Natural Language Inference Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Natural language processing models often exploit spurious correlations between task-independent features and labels in datasets to perform well only within the distributions they are trained on, while not generalising to different task distributions. We propose to tackle this problem by generating a debiased version of a dataset, which can then be used to train a debiased, off-the-shelf model, by simply replacing its training data. |
Yuxiang Wu; Matt Gardner; Pontus Stenetorp; Pradeep Dasigi; |
191 | GL-CLeF: A Global-Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, existing models solely rely on shared parameters, which can only perform implicit alignment across languages. We present Global-Local Contrastive Learning Framework (GL-CLeF) to address this shortcoming. |
Libo Qin; Qiguang Chen; Tianbao Xie; Qixin Li; Jian-Guang Lou; Wanxiang Che; Min-Yen Kan; |
192 | Good Examples Make A Faster Learner: Simple Demonstration-based Learning for Low-resource NER Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we present a simple demonstration-based learning method for NER, which lets the input be prefaced by task demonstrations for in-context learning. |
Dong-Ho Lee; Akshen Kadakia; Kangmin Tan; Mahak Agarwal; Xinyu Feng; Takashi Shibuya; Ryosuke Mitani; Toshiyuki Sekiya; Jay Pujara; Xiang Ren; |
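The input construction, prefacing each sentence with labeled demonstrations, can be sketched as below; the paper studies several demonstration formats, and this template is only one hypothetical instance.

```python
# Preface an input sentence with labeled demonstrations so a tagger can
# condition on them in-context. Template and separators are illustrative.
demos = [
    ("Barack Obama visited Paris .", "Barack Obama is PER . Paris is LOC ."),
]

def build_input(sentence: str) -> str:
    demo_str = " [SEP] ".join(f"{s} {labels}" for s, labels in demos)
    return f"{demo_str} [SEP] {sentence}"

print(build_input("Angela Merkel met reporters in Berlin ."))
```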
193 | Contextual Representation Learning Beyond Masked Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we analyze the learning dynamics of MLMs and find that they adopt sampled embeddings as anchors to estimate and inject contextual semantics into representations, which limits the efficiency and effectiveness of MLMs. |
Zhiyi Fu; Wangchunshu Zhou; Jingjing Xu; Hao Zhou; Lei Li; |
194 | Efficient Hyper-parameter Search for Knowledge Graph Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on the analysis, we propose an efficient two-stage search algorithm, KGTuner, which explores HP configurations on a small subgraph in the first stage and transfers the top-performing configurations for fine-tuning on the large full graph in the second stage. |
Yongqi Zhang; Zhanke Zhou; Quanming Yao; Yong Li; |
195 | A Meta-framework for Spatiotemporal Quantity Extraction from Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper thus formulates the NLP problem of spatiotemporal quantity extraction, and proposes the first meta-framework for solving it. |
Qiang Ning; Ben Zhou; Hao Wu; Haoruo Peng; Chuchu Fan; Matt Gardner; |
196 | Leveraging Visual Knowledge in Language Tasks: An Empirical Study on Intermediate Pre-training for Cross-Modal Knowledge Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Pre-trained language models are still far from human performance in tasks that need understanding of properties (e.g. appearance, measurable quantity) and affordances of everyday objects in the real world since the text lacks such information due to reporting bias. In this work, we study whether integrating visual knowledge into a language model can fill the gap. |
Woojeong Jin; Dong-Ho Lee; Chenguang Zhu; Jay Pujara; Xiang Ren; |
197 | A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these VL models are hard to deploy for real-world applications due to their impractically huge sizes and slow inference speed. To solve this limitation, we study prompt-based low-resource learning of VL tasks with our proposed method, FewVLM, which is relatively smaller than recent few-shot learners. |
Woojeong Jin; Yu Cheng; Yelong Shen; Weizhu Chen; Xiang Ren; |
198 | Continual Few-shot Relation Learning Via Embedding Space Regularization and Data Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: It is therefore necessary for the model to learn novel relational patterns with very few labeled data while avoiding catastrophic forgetting of previous task knowledge. In this paper, we formulate this challenging yet practical problem as continual few-shot relation learning (CFRL). |
Chengwei Qin; Shafiq Joty; |
199 | Variational Graph Autoencoding As Cheap Supervision for AMR Coreference Resolution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a general pretraining method using variational graph autoencoder (VGAE) for AMR coreference resolution, which can leverage any general AMR corpus and even automatically parsed AMR data. |
Irene Li; Linfeng Song; Kun Xu; Dong Yu; |
200 | Identifying Chinese Opinion Expressions with Extremely-Noisy Crowdsourcing Annotations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate Chinese OEI with extremely-noisy crowdsourcing annotations, constructing a dataset at a very low cost. |
Xin Zhang; Guangwei Xu; Yueheng Sun; Meishan Zhang; Xiaobin Wang; Min Zhang; |
201 | Sequence-to-Sequence Knowledge Graph Completion and Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that an off-the-shelf encoder-decoder Transformer model can serve as a scalable and versatile KGE model obtaining state-of-the-art results for KG link prediction and incomplete KG question answering. |
Apoorv Saxena; Adrian Kochsiek; Rainer Gemulla; |
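A sketch of the sequence-to-sequence view: verbalize the (subject, relation, ?) query as text and let an encoder-decoder generate the missing entity. The prompt format is hypothetical, and an off-the-shelf checkpoint would need fine-tuning on KG triples before its generations are meaningful.

```python
# Verbalize a KG link-prediction query for an encoder-decoder model.
# The prompt format is hypothetical; a real system fine-tunes on triples.
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tok = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

query = "predict tail: Dublin | capital of"
ids = tok(query, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=8)
print(tok.decode(out[0], skip_special_tokens=True))  # meaningful only after fine-tuning
```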
202 | Learning to Mediate Disparities Towards Pragmatic Communication Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Towards building AI agents with similar abilities in language communication, we propose a novel rational reasoning framework, Pragmatic Rational Speaker (PRS), where the speaker attempts to learn the speaker-listener disparity and adjust the speech accordingly, by adding a light-weighted disparity adjustment layer into working memory on top of speaker’s long-term memory system. |
Yuwei Bao; Sayan Ghosh; Joyce Chai; |
203 | Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we identify and address two underlying problems of dense retrievers: i) fragility to training data noise and ii) requiring large batches to robustly learn the embedding space. |
Luyu Gao; Jamie Callan; |
204 | Multimodal Dialogue Response Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Yet existing works focus only on multimodal dialogue models that depend on retrieval-based methods, neglecting generation methods. To fill this gap, we first present a new task: multimodal dialogue response generation (MDRG) – given the dialogue history, one model needs to generate a text sequence or an image as response. |
Qingfeng Sun; Yujing Wang; Can Xu; Kai Zheng; Yaming Yang; Huang Hu; Fei Xu; Jessica Zhang; Xiubo Geng; Daxin Jiang; |
205 | CAKE: A Scalable Commonsense-Aware Framework For Multi-View Knowledge Graph Completion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The previous knowledge graph embedding (KGE) techniques suffer from invalid negative sampling and the uncertainty of fact-view link prediction, limiting KGC’s performance. To address the above challenges, we propose a novel and scalable Commonsense-Aware Knowledge Embedding (CAKE) framework to automatically extract commonsense from factual triples with entity concepts. |
Guanglin Niu; Bo Li; Yongfei Zhang; Shiliang Pu; |
206 | Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Confidence Based Bidirectional Global Context Aware (CBBGCA) training framework for NMT, where the NMT model is jointly trained with an auxiliary conditional masked language model (CMLM). |
Chulun Zhou; Fandong Meng; Jie Zhou; Min Zhang; Hongji Wang; Jinsong Su; |
207 | BRIO: Bringing Order to Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This assumption may lead to performance degradation during inference, where the model needs to compare several system-generated (candidate) summaries that have deviated from the reference summary. To address this problem, we propose a novel training paradigm which assumes a non-deterministic distribution so that different candidate summaries are assigned probability mass according to their quality. |
Yixin Liu; Pengfei Liu; Dragomir Radev; Graham Neubig; |
208 | Leveraging Relaxed Equilibrium By Lazy Transition for Sequence Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the equilibrium phenomenon, we present a lazy transition, a mechanism to adjust the significance of iterative refinements for each token representation. |
Xi Ai; Bin Fang; |
209 | FIBER: Fill-in-the-Blanks As A Challenging Video Understanding Evaluation Framework Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose fill-in-the-blanks as a video understanding evaluation framework and introduce FIBER – a novel dataset consisting of 28,000 videos and descriptions in support of this evaluation framework. |
Santiago Castro; Ruoyao Wang; Pingxuan Huang; Ian Stewart; Oana Ignat; Nan Liu; Jonathan Stroud; Rada Mihalcea; |
210 | KenMeSH: Knowledge-enhanced End-to-end Biomedical Text Labelling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: MeSH indexing is a challenging task for machine learning, as it needs to assign multiple labels to each article from an extremely large hierarchically organized collection. To address this challenge, we propose KenMeSH, an end-to-end model that combines new text features and a dynamic knowledge-enhanced mask attention that integrates document features with MeSH label hierarchy and journal correlation features to index MeSH terms. |
Xindi Wang; Robert Mercer; Frank Rudzicz; |
211 | A Taxonomy of Empathetic Questions in Social Dialogs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, current dialog generation approaches do not model this subtle emotion regulation technique due to the lack of a taxonomy of questions and their purpose in social chitchat. To address this gap, we have developed an empathetic question taxonomy (EQT), with special attention paid to questions' ability to capture communicative acts and their emotion-regulation intents. |
Ekaterina Svikhnushina; Iuliana Voinea; Anuradha Welivita; Pearl Pu; |
212 | Enhanced Multi-Channel Graph Convolutional Network for Aspect Sentiment Triplet Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an Enhanced Multi-Channel Graph Convolutional Network model (EMC-GCN) to fully utilize the relations between words. |
Hao Chen; Zepeng Zhai; Fangxiang Feng; Ruifan Li; Xiaojie Wang; |
213 | ProtoTEx: Explaining Model Decisions with Prototype Tensors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present ProtoTEx, a novel white-box NLP classification architecture based on prototype networks (Li et al., 2018). |
Anubrata Das; Chitrank Gupta; Venelin Kovatchev; Matthew Lease; Junyi Jessy Li; |
214 | Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we attempt to construct an open-domain hierarchical knowledge-base (KB) of procedures based on wikiHow, a website containing more than 110k instructional articles, each documenting the steps to carry out a complex procedure. |
Shuyan Zhou; Li Zhang; Yue Yang; Qing Lyu; Pengcheng Yin; Chris Callison-Burch; Graham Neubig; |
215 | Cross-Modal Discrete Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast to recent advances focusing on high-level representation learning across modalities, in this work we present a self-supervised learning framework that is able to learn a representation that captures finer levels of granularity across different modalities such as concepts or events represented by visual objects or spoken words. |
Alexander Liu; SouYoung Jin; Cheng-I Lai; Andrew Rouditchenko; Aude Oliva; James Glass; |
216 | Improving Event Representation Via Simultaneous Weakly Supervised Contrastive Learning and Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present SWCC: a Simultaneous Weakly supervised Contrastive learning and Clustering framework for event representation learning. |
Jun Gao; Wei Wang; Changlong Yu; Huan Zhao; Wilfred Ng; Ruifeng Xu; |
217 | Contrastive Visual Semantic Pretraining Magnifies The Semantics of Natural Language Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We examine the effects of contrastive visual semantic pretraining by comparing the geometry and semantic properties of contextualized English language representations formed by GPT-2 and CLIP, a zero-shot multimodal image classifier which adapts the GPT-2 architecture to encode image captions. |
Robert Wolfe; Aylin Caliskan; |
218 | ConTinTin: Continual Learning from Task Instructions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work defines a new learning paradigm ConTinTin (Continual Learning from Task Instructions), in which a system should learn a sequence of new tasks one by one, with each task explained by a piece of textual instruction. |
Wenpeng Yin; Jia Li; Caiming Xiong; |
219 | Automated Crossword Solving Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the Berkeley Crossword Solver, a state-of-the-art approach for automatically solving crossword puzzles. |
Eric Wallace; Nicholas Tomlin; Albert Xu; Kevin Yang; Eshaan Pathak; Matthew Ginsberg; Dan Klein; |
220 | Learned Incremental Representations for Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an incremental syntactic representation that consists of assigning a single discrete label to each word in a sentence, where the label is predicted using strictly incremental processing of a prefix of the sentence, and the sequence of labels for a sentence fully determines a parse tree. |
Nikita Kitaev; Thomas Lu; Dan Klein; |
221 | Knowledge Enhanced Reflection Generation for Counseling Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the effect of commonsense and domain knowledge while generating responses in counseling conversations using retrieval and generative methods for knowledge integration. |
Siqi Shen; Veronica Perez-Rosas; Charles Welch; Soujanya Poria; Rada Mihalcea; |
222 | Misinfo Reaction Frames: Reasoning About Readers’ Reactions to News Headlines Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Misinfo Reaction Frames (MRF), a pragmatic formalism for modeling how readers might react to a news headline. |
Saadia Gabriel; Skyler Hallinan; Maarten Sap; Pemi Nguyen; Franziska Roesner; Eunsol Choi; Yejin Choi; |
223 | On Continual Model Refinement in Out-of-Distribution Data Streams Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For benchmarking and analysis, we propose a general sampling algorithm to obtain dynamic OOD data streams with controllable non-stationarity, as well as a suite of metrics measuring various aspects of online performance. |
Bill Yuchen Lin; Sida Wang; Xi Lin; Robin Jia; Lin Xiao; Xiang Ren; Scott Yih; |
224 | Achieving Conversational Goals with Unsupervised Post-hoc Knowledge Injection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a post-hoc knowledge-injection technique where we first retrieve a diverse set of relevant knowledge snippets conditioned on both the dialog history and an initial response from an existing dialog model. |
Bodhisattwa Prasad Majumder; Harsh Jhamtani; Taylor Berg-Kirkpatrick; Julian McAuley; |
225 | Generated Knowledge Prompting for Commonsense Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: It remains an open question whether incorporating external knowledge benefits commonsense reasoning while maintaining the flexibility of pretrained sequence models. To investigate this question, we develop generated knowledge prompting, which consists of generating knowledge from a language model, then providing the knowledge as additional input when answering a question. |
Jiacheng Liu; Alisa Liu; Ximing Lu; Sean Welleck; Peter West; Ronan Le Bras; Yejin Choi; Hannaneh Hajishirzi; |
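The two-stage scheme is easy to express generically; `generate` below is a stand-in stub for any language-model call, and the prompt wording is ours.

```python
# Two-stage generated-knowledge prompting with a stand-in LM call.
def generate(prompt: str) -> str:
    # Replace with a real LM call (e.g. a text-generation pipeline).
    return f"<model output for: {prompt[:40]}...>"

def answer_with_generated_knowledge(question: str) -> str:
    # Stage 1: elicit a relevant statement from the model itself.
    knowledge = generate(f"Generate a fact that helps answer: {question}")
    # Stage 2: condition the final answer on that generated knowledge.
    return generate(f"{knowledge}\nQuestion: {question}\nAnswer:")

print(answer_with_generated_knowledge("Where do penguins live?"))
```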
226 | Training Data Is More Valuable Than You Think: A Simple and Effective Method By Retrieving from Training Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Surprisingly, we found that REtrieving from the traINing datA (REINA) alone can lead to significant gains on multiple NLG and NLU tasks. |
Shuohang Wang; Yichong Xu; Yuwei Fang; Yang Liu; Siqi Sun; Ruochen Xu; Chenguang Zhu; Michael Zeng; |
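Under simple assumptions (BM25 indexing via the rank_bm25 package, a toy labeled corpus, and a [SEP]-joined input format of our choosing), the retrieve-from-training-data recipe looks roughly like this:

```python
# Index the training set with BM25 and concatenate retrieved
# (input, label) neighbors to each query before feeding the model.
from rank_bm25 import BM25Okapi

train_inputs = ["the cat sat on the mat", "dogs chase cats", "the sun is bright"]
train_labels = ["animals", "animals", "weather"]

bm25 = BM25Okapi([t.split() for t in train_inputs])

def augment(query: str, k: int = 2) -> str:
    top = bm25.get_top_n(query.split(), list(range(len(train_inputs))), n=k)
    neighbors = " [SEP] ".join(f"{train_inputs[i]} => {train_labels[i]}" for i in top)
    return f"{query} [SEP] {neighbors}"

print(augment("a cat on a mat"))
```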
227 | Life After BERT: What Do Other Muppets Understand About Language? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In our work, we utilize the oLMpics benchmark and psycholinguistic probing datasets for a diverse set of 29 models including T5, BART, and ALBERT. |
Vladislav Lialin; Kevin Zhao; Namrata Shivagunde; Anna Rumshisky; |
228 | Tailor: Generating and Perturbing Text with Semantic Controls Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Tailor, a semantically-controlled text generation system. |
Alexis Ross; Tongshuang Wu; Hao Peng; Matthew Peters; Matt Gardner; |
229 | TruthfulQA: Measuring How Models Mimic Human Falsehoods Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a benchmark to measure whether a language model is truthful in generating answers to questions. |
Stephanie Lin; Jacob Hilton; Owain Evans; |
230 | Adaptive Testing and Debugging of NLP Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present AdaTest, a process which uses large scale language models (LMs) in partnership with human feedback to automatically write unit tests highlighting bugs in a target model. |
Marco Tulio Ribeiro; Scott Lundberg; |
231 | Right for The Right Reason: Evidence Extraction for Trustworthy Tabular Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a case study, we propose a two-stage sequential prediction approach, which includes an evidence extraction and an inference stage. |
Vivek Gupta; Shuo Zhang; Alakananda Vempala; Yujie He; Temma Choji; Vivek Srikumar; |
232 | Interactive Word Completion for Plains Cree Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop an approach to morph-based auto-completion based on a finite state morphological analyzer of Plains Cree (nehiyawewin), showing the portability of the concept to a much larger, more complete morphological transducer. |
William Lane; Atticus Harrigan; Antti Arppe; |
233 | LAGr: Label Aligned Graphs for Better Systematic Generalization in Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we show that better systematic generalization can be achieved by producing the meaning representation directly as a graph and not as a sequence. To this end we propose LAGr (Label Aligned Graphs), a general framework to produce semantic parses by independently predicting node and edge labels for a complete multi-layer input-aligned graph. |
Dora Jambor; Dzmitry Bahdanau; |
234 | ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop a demonstration-based prompting framework and an adversarial classifier-in-the-loop decoding method to generate subtly toxic and benign text with a massive pretrained language model. |
Thomas Hartvigsen; Saadia Gabriel; Hamid Palangi; Maarten Sap; Dipankar Ray; Ece Kamar; |
235 | Direct Speech-to-Speech Translation With Discrete Units Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a direct speech-to-speech translation (S2ST) model that translates speech from one language to speech in another language without relying on intermediate text generation. |
Ann Lee; Peng-Jen Chen; Changhan Wang; Jiatao Gu; Sravya Popuri; Xutai Ma; Adam Polyak; Yossi Adi; Qing He; Yun Tang; Juan Pino; Wei-Ning Hsu; |
236 | Hallucinated But Factual! Inspecting The Factuality of Hallucinations in Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel detection approach that separates factual from non-factual hallucinations of entities. |
Meng Cao; Yue Dong; Jackie Cheung; |
237 | EntSUM: A Data Set for Entity-Centric Extractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose extensions to state-of-the-art summarization approaches that achieve substantially better results on our data set. |
Mounica Maddela; Mayank Kulkarni; Daniel Preotiuc-Pietro; |
238 | Sentence-level Privacy for Document Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we propose SentDP, pure local differential privacy at the sentence level for a single user document. |
Casey Meehan; Khalil Mrini; Kamalika Chaudhuri; |
239 | Dataset Geography: Mapping Language Data to Language Users Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study the geographical representativeness of NLP datasets, aiming to quantify if and by how much NLP datasets match the expected needs of the language speakers. |
Fahim Faisal; Yinkai Wang; Antonios Anastasopoulos; |
240 | ILDAE: Instance-Level Difficulty Analysis of Evaluation Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Can we extract such benefits of instance difficulty in Natural Language Processing? To this end, we conduct Instance-Level Difficulty Analysis of Evaluation data (ILDAE) in a large-scale setup of 23 datasets and demonstrate its five novel applications: 1) conducting efficient-yet-accurate evaluations with fewer instances, saving computational cost and time, 2) improving quality of existing evaluation datasets by repairing erroneous and trivial instances, 3) selecting the best model based on application requirements, 4) analyzing dataset characteristics for guiding future data creation, and 5) estimating Out-of-Domain performance reliably. |
Neeraj Varshney; Swaroop Mishra; Chitta Baral; |
241 | Image Retrieval from Contextual Descriptions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The ability to integrate context, including perceptual and temporal cues, plays a pivotal role in grounding the meaning of a linguistic utterance. In order to measure to what extent current vision-and-language models master this ability, we devise a new multimodal challenge, Image Retrieval from Contextual Descriptions (ImageCoDe). |
Benno Krojer; Vaibhav Adlakha; Vibhav Vineet; Yash Goyal; Edoardo Ponti; Siva Reddy; |
242 | Multilingual Molecular Representation Learning Via Contrastive Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the fact that a given molecule can be described using different languages such as Simplified Molecular Line Entry System (SMILES), The International Union of Pure and Applied Chemistry (IUPAC), and The IUPAC International Chemical Identifier (InChI), we propose a multilingual molecular embedding generation approach called MM-Deacon (multilingual molecular domain embedding analysis via contrastive learning). |
Zhihui Guo; Pramod Sharma; Andy Martinez; Liang Du; Robin Abraham; |
243 | Investigating Failures of Automatic Translation in The Case of Unambiguous Gender Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Transformer-based models are the modern workhorses for neural machine translation (NMT), reaching state of the art across several benchmarks. Despite their impressive accuracy, we observe a systemic and rudimentary class of errors made by current state-of-the-art NMT models with regard to translating from a language that doesn’t mark gender on nouns into others that do. |
Adi Renduchintala; Adina Williams; |
244 | Cross-Task Generalization Via Natural Language Crowdsourcing Instructions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A long-standing challenge in AI is to build a model that learns a new task by understanding the human-readable instructions that define it. To study this, we introduce NATURAL INSTRUCTIONS, a dataset of 61 distinct tasks, their human-authored instructions, and 193k task instances (input-output pairs). |
Swaroop Mishra; Daniel Khashabi; Chitta Baral; Hannaneh Hajishirzi; |
245 | Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a simple contrastive learning framework, LOVE, which extends the word representation of an existing pre-trained language model (such as BERT) and makes it robust to OOV with few additional parameters. |
Lihu Chen; Gael Varoquaux; Fabian Suchanek; |
246 | NumGLUE: A Suite of Fundamental Yet Challenging Mathematical Reasoning Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Drawing inspiration from GLUE, which was proposed in the context of natural language understanding, we propose NumGLUE, a multi-task benchmark that evaluates the performance of AI systems on eight different tasks that, at their core, require simple arithmetic understanding. |
Swaroop Mishra; Arindam Mitra; Neeraj Varshney; Bhavdeep Sachdeva; Peter Clark; Chitta Baral; Ashwin Kalyan; |
247 | Upstream Mitigation Is Not All You Need: Testing The Bias Transfer Hypothesis in Pre-Trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate the bias transfer hypothesis: the theory that social biases (such as stereotypes) internalized by large language models during pre-training transfer into harmful task-specific behavior after fine-tuning. |
Ryan Steed; Swetasudha Panda; Ari Kobren; Michael Wick; |
248 | Improving Multi-label Malevolence Detection in Dialogues Through Multi-faceted Label Correlation Enhancement Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current methods for detecting dialogue malevolence neglect label correlation. Therefore, we propose the task of multi-label dialogue malevolence detection and crowdsource a multi-label dataset, multi-label dialogue malevolence detection (MDMD), for evaluation. |
Yangjun Zhang; Pengjie Ren; Wentao Deng; Zhumin Chen; Maarten Rijke; |
249 | How Do We Answer Complex Questions: Discourse Structure of Long-form Answers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our main goal is to understand how humans organize information to craft complex answers. |
Fangyuan Xu; Junyi Jessy Li; Eunsol Choi; |
250 | Understanding Iterative Revision from Human-Written Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work describes IteraTeR: the first large-scale, multi-domain, edit-intention annotated corpus of iteratively revised text. |
Wanyu Du; Vipul Raheja; Dhruv Kumar; Zae Myung Kim; Melissa Lopez; Dongyeop Kang; |
251 | Making Transformers Solve Compositional Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we explore the design space of Transformer models showing that the inductive biases given to the model by several design decisions significantly impact compositional generalization. |
Santiago Ontanon; Joshua Ainslie; Zachary Fisher; Vaclav Cvicek; |
252 | Can Transformer Be Too Compositional? Analysing Idiom Processing in Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer, by analysing the hidden states and attention patterns for models with English as source language and one of seven European languages as target language. |
Verna Dankers; Christopher Lucas; Ivan Titov; |
253 | ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We describe a Question Answering (QA) dataset that contains complex questions with conditional answers, i.e. the answers are only applicable when certain conditions apply. |
Haitian Sun; William Cohen; Ruslan Salakhutdinov; |
254 | Prompt-free and Efficient Few-shot Learning with Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze format that the PLM can score. In this work, we propose Perfect, a simple and efficient method for few-shot fine-tuning of PLMs without relying on any such handcrafting, which is highly effective given as few as 32 data points. |
Rabeeh Karimi Mahabadi; Luke Zettlemoyer; James Henderson; Lambert Mathias; Marzieh Saeidi; Veselin Stoyanov; Majid Yazdani; |
255 | Continual Sequence Generation with Adaptive Compositional Modules Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To get the best of both worlds, in this work, we propose continual sequence generation with adaptive compositional modules to adaptively add modules in transformer architectures and compose both old and new modules for new tasks. |
Yanzhe Zhang; Xuezhi Wang; Diyi Yang; |
256 | An Investigation of The (In)effectiveness of Counterfactually Augmented Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, empirical results using CAD during training for OOD generalization have been mixed. To explain this discrepancy, through a toy theoretical example and empirical analysis on two crowdsourced CAD datasets, we show that: (a) while features perturbed in CAD are indeed robust features, training on CAD may prevent the model from learning unperturbed robust features; and (b) CAD may exacerbate existing spurious correlations in the data. |
Nitish Joshi; He He; |
257 | Inducing Positive Perspectives with Text Reframing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a different but related task called positive reframing in which we neutralize a negative point of view and generate a more positive perspective for the author without contradicting the original meaning. |
Caleb Ziems; Minzhi Li; Anthony Zhang; Diyi Yang; |
258 | VALUE: Understanding Dialect Disparity in NLU Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To understand disparities in current models and to facilitate more dialect-competent NLU systems, we introduce the VernAcular Language Understanding Evaluation (VALUE) benchmark, a challenging variant of GLUE that we created with a set of lexical and morphosyntactic transformation rules. |
Caleb Ziems; Jiaao Chen; Camille Harris; Jessica Anderson; Diyi Yang; |
259 | From The Detection of Toxic Spans in Online Discussions to The Analysis of Toxic-to-Civil Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study the task of toxic spans detection, which concerns the detection of the spans that make a text toxic, when detecting such spans is possible. We introduce a dataset for this task, ToxicSpans, which we release publicly. |
John Pavlopoulos; Leo Laugier; Alexandros Xenos; Jeffrey Sorensen; Ion Androutsopoulos; |
260 | FormNet: Structural Encoding Beyond Sequential Modeling in Form Document Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose FormNet, a structure-aware sequence model to mitigate the suboptimal serialization of forms. |
Chen-Yu Lee; Chun-Liang Li; Timothy Dozat; Vincent Perot; Guolong Su; Nan Hua; Joshua Ainslie; Renshen Wang; Yasuhisa Fujii; Tomas Pfister; |
261 | The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce a new resource, not to authoritatively resolve moral ambiguities, but instead to facilitate systematic understanding of the intuitions, values and moral judgments reflected in the utterances of dialogue systems. |
Caleb Ziems; Jane Yu; Yi-Chia Wang; Alon Halevy; Diyi Yang; |
262 | Token Dropping for Efficient BERT Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Transformer-based models generally allocate the same amount of computation for each token in a given sequence. We develop a simple but effective token dropping method to accelerate the pretraining of transformer models, such as BERT, without degrading their performance on downstream tasks. |
Le Hou; Richard Yuanzhe Pang; Tianyi Zhou; Yuexin Wu; Xinying Song; Xiaodan Song; Denny Zhou; |
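As a rough illustration of entry 262's token dropping, the sketch below keeps only the highest-scoring token positions for the (elided) middle layers and restores the full sequence afterwards. The norm-based importance score is a toy assumption; the paper derives importance from training signals such as the cumulative token loss.

```python
# Toy PyTorch sketch of token dropping (entry 262). Middle transformer
# layers would process only the kept positions; that step is elided here.
import torch

def drop_and_restore(hidden: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    batch, seq, dim = hidden.shape
    importance = hidden.norm(dim=-1)                   # toy per-token importance
    k = max(1, int(seq * keep_ratio))
    keep = importance.topk(k, dim=1).indices.sort(dim=1).values
    idx = keep.unsqueeze(-1).expand(-1, -1, dim)
    kept = hidden.gather(1, idx)                       # tokens the middle layers see
    # ... run the middle transformer layers on `kept` only ...
    restored = hidden.clone()
    restored.scatter_(1, idx, kept)                    # put processed tokens back
    return restored

out = drop_and_restore(torch.randn(2, 8, 16))          # shape preserved: (2, 8, 16)
```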
263 | DialFact: A Benchmark for Fact-Checking in Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We found that existing fact-checking models trained on non-dialogue data like FEVER fail to perform well on our task, and thus, we propose a simple yet data-efficient solution to effectively improve fact-checking performance in dialogue. |
Prakhar Gupta; Chien-Sheng Wu; Wenhao Liu; Caiming Xiong; |
264 | The Trade-offs of Domain Adaptation for Neural Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work connects language model adaptation with concepts of machine learning theory. |
David Grangier; Dan Iter; |
265 | Towards Afrocentric NLP for African Languages: Where We Are and Where We Can Go Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main objective is to motivate and advocate for an Afrocentric approach to technology development. |
Ife Adebara; Muhammad Abdul-Mageed; |
266 | Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate improvements to the GEC sequence tagging architecture with a focus on ensembling of recent cutting-edge Transformer-based encoders in Large configurations. |
Maksym Tarnavskyi; Artem Chernodub; Kostiantyn Omelianchuk; |
267 | Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To our knowledge, we are the first to incorporate speaker characteristics in a neural model for code-switching, and more generally, take a step towards developing transparent, personalized models that use speaker information in a controlled way. |
Alissa Ostapenko; Shuly Wintner; Melinda Fricke; Yulia Tsvetkov; |
268 | Detecting Unassimilated Borrowings in Spanish: An Annotated Corpus and Approaches to Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work presents a new resource for borrowing identification and analyzes the performance and errors of several models on this task. |
Elena Álvarez-Mellado; Constantine Lignos; |
269 | Is Attention Explanation? An Introduction to The Debate Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although the debate has created a vast literature thanks to contributions from various areas, the lack of communication is becoming more and more tangible. In this paper, we provide a clear overview of the insights on the debate by critically confronting works from these different areas. |
Adrien Bibal; Rémi Cardon; David Alfter; Rodrigo Wilkens; Xiaoou Wang; Thomas François; Patrick Watrin; |
270 | There Are A Thousand Hamlets in A Thousand People’s Eyes: Enhancing Knowledge-grounded Dialogue with Personal Memory Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce personal memory into knowledge selection in KGC to address the personalization issue. |
Tingchen Fu; Xueliang Zhao; Chongyang Tao; Ji-Rong Wen; Rui Yan; |
271 | Neural Pipeline for Zero-Shot Data-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by pipeline approaches, we propose to generate text by transforming single-item descriptions with a sequence of modules trained on general-domain text-based operations: ordering, aggregation, and paragraph compression. |
Zdenek Kasner; Ondrej Dusek; |
272 | Not Always About You: Prioritizing Community Needs When Developing Endangered Language Technology Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this position paper, we discuss the unique technological, cultural, practical, and ethical challenges that researchers and indigenous speech community members face when working together to develop language technology to support endangered language documentation and revitalization. |
Zoey Liu; Crystal Richardson; Richard Hatcher; Emily Prud’hommeaux; |
273 | Automatic Identification and Classification of Bragging in Social Media Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present the first large scale study of bragging in computational linguistics, building on previous research in linguistics and pragmatics. |
Mali Jin; Daniel Preotiuc-Pietro; A. Doğruöz; Nikolaos Aletras; |
274 | Automatic Error Analysis for Document-level Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We build on the work of Kummerfeld and Klein (2013) to propose a transformation-based framework for automating error analysis in document-level event and (N-ary) relation extraction. |
Aliva Das; Xinya Du; Barry Wang; Kejian Shi; Jiayuan Gu; Thomas Porter; Claire Cardie; |
275 | Learning Functional Distributional Semantics with Visual Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a method to train a Functional Distributional Semantics model with grounded visual data. |
Yinhong Liu; Guy Emerson; |
276 | EPiC: Employing Proverbs in Context As A Benchmark for Abstract Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we introduce a high-quality crowdsourced dataset of narratives for employing proverbs in context as a benchmark for abstract language understanding. |
Sayan Ghosh; Shashank Srivastava; |
277 | Chart-to-Text: A Large-Scale Benchmark for Chart Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Chart-to-text, a large-scale benchmark with two datasets and a total of 44,096 charts covering a wide range of topics and chart types. |
Shankar Kantharaj; Rixie Tiffany Leong; Xiang Lin; Ahmed Masry; Megh Thakkar; Enamul Hoque; Shafiq Joty; |
278 | Characterizing Idioms: Conventionality and Contingency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We define two measures corresponding to conventionality and contingency, and we show that idioms fall at the expected intersection of the two dimensions, but that the dimensions themselves are not correlated. |
Michaela Socolof; Jackie Cheung; Michael Wagner; Timothy O’Donnell; |
279 | Bag-of-Words Vs. Graph Vs. Sequence in Text Classification: Questioning The Necessity of Text-Graphs and The Surprising Strength of A Wide MLP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that a wide multi-layer perceptron (MLP) using a Bag-of-Words (BoW) outperforms the recent graph-based models TextGCN and HeteGCN in an inductive text classification setting and is comparable with HyperGAT. |
Lukas Galke; Ansgar Scherp; |
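Entry 279's headline finding, that a wide MLP over Bag-of-Words features is a strong inductive text classifier, corresponds to a very small amount of code. A scikit-learn sketch follows; the layer width and other hyperparameters are illustrative assumptions, not the paper's reported configuration.

```python
# Bag-of-Words + wide MLP text classifier in the spirit of entry 279.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

texts = ["cheap flights now", "meeting moved to friday"]
labels = ["spam", "ham"]

clf = make_pipeline(
    CountVectorizer(),                          # Bag-of-Words features
    MLPClassifier(hidden_layer_sizes=(1024,),   # one wide hidden layer
                  max_iter=50),
)
clf.fit(texts, labels)
print(clf.predict(["free flights friday"]))
```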
280 | Generative Pretraining for Paraphrase Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce ParaBLEU, a paraphrase representation learning model and evaluation metric for text generation. |
Jack Weston; Raphael Lenain; Udeepa Meepegama; Emil Fristed; |
281 | Incorporating Stock Market Signals for Twitter Stance Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the integration of textual and financial signals for stance detection in the financial domain. |
Costanza Conforti; Jakob Berndt; Mohammad Taher Pilehvar; Chryssi Giannitsarou; Flavio Toxvaerd; Nigel Collier; |
282 | Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce multilingual crossover encoder-decoder (mXEncDec) to fuse language pairs at an instance level. |
Yong Cheng; Ankur Bapna; Orhan Firat; Yuan Cao; Pidong Wang; Wolfgang Macherey; |
283 | Word Segmentation As Unsupervised Constituency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work takes one step forward by exploring a radically different approach of word identification, in which segmentation of a continuous input is viewed as a process isomorphic to unsupervised constituency parsing. |
Raquel G. Alhama; |
284 | SafetyKit: First Aid for Measuring Safety in Open-domain Conversational Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The social impact of natural language processing and its applications has received increasing attention. In this position paper, we focus on the problem of safety for end-to-end conversational AI. |
Emily Dinan; Gavin Abercrombie; A. Bergman; Shannon Spruit; Dirk Hovy; Y-Lan Boureau; Verena Rieser; |
285 | Zero-Shot Cross-lingual Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-logical form paired data and in-domain natural language corpora in each new language. |
Tom Sherborne; Mirella Lapata; |
286 | The Paradox of The Compositionality of Natural Language: A Neural Machine Translation Case Study Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we re-instantiate three compositionality tests from the literature and reformulate them for neural machine translation (NMT). |
Verna Dankers; Elia Bruni; Dieuwke Hupkes; |
287 | Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study whether and how contextual modeling in DocNMT is transferable via multilingual modeling. |
Biao Zhang; Ankur Bapna; Melvin Johnson; Ali Dabirmoghaddam; Naveen Arivazhagan; Orhan Firat; |
288 | Cross-Lingual Phrase Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a cross-lingual phrase retriever that extracts phrase representations from unlabeled example sentences. |
Heqi Zheng; Xiao Zhang; Zewen Chi; Heyan Huang; Yan Tan; Tian Lan; Wei Wei; Xian-Ling Mao; |
289 | Improving Compositional Generalization with Self-Training for Data-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that T5 models fail to generalize to unseen MRs, and we propose a template-based input representation that considerably improves the model’s generalization capability. |
Sanket Vaibhav Mehta; Jinfeng Rao; Yi Tay; Mihir Kale; Ankur Parikh; Emma Strubell; |
290 | MMCoQA: Conversational Question Answering Over Text, Tables, and Images Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we hence define a novel research task, i.e., multimodal conversational question answering (MMCoQA), aiming to answer users’ questions with multimodal knowledge sources via multi-turn conversations. |
Yongqi Li; Wenjie Li; Liqiang Nie; |
291 | Effective Token Graph Modeling Using A Novel Labeling Strategy for Structured Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: (1) The label proportions for span prediction and span relation prediction are imbalanced. (2) The span lengths of sentiment tuple components may be very large in this task, which further exacerbates the imbalance problem. (3) Two nodes in a dependency graph cannot have multiple arcs, therefore some overlapped sentiment tuples cannot be recognized. In this work, we propose niche-targeting solutions for these issues. |
Wenxuan Shi; Fei Li; Jingye Li; Hao Fei; Donghong Ji; |
292 | PromDA: Prompt-based Data Augmentation for Low-Resource NLU Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose Prompt-based Data Augmentation model (PromDA) which only trains small-scale Soft Prompt (i.e., a set of trainable vectors) in the frozen Pre-trained Language Models (PLMs). |
Yufei Wang; Can Xu; Qingfeng Sun; Huang Hu; Chongyang Tao; Xiubo Geng; Daxin Jiang; |
293 | Disentangled Sequence to Sequence Learning for Compositional Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an extension to sequence-to-sequence models that encourages disentanglement by adaptively re-encoding (at each time step) the source input. |
Hao Zheng; Mirella Lapata; |
294 | RST Discourse Parsing with Second-Stage EDU-Level Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current PLMs are obtained by sentence-level pre-training, which is different from the basic processing unit, i.e. element discourse unit (EDU). To this end, we propose a second-stage EDU-level pre-training approach in this work, which presents two novel tasks to learn effective EDU representations continually based on well pre-trained language models. Concretely, the two tasks are (1) next EDU prediction (NEP) and (2) discourse marker prediction (DMP). |
Nan Yu; Meishan Zhang; Guohong Fu; Min Zhang; |
295 | SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the performance of text-based methods still largely lags behind graph embedding-based methods like TransE (Bordes et al., 2013) and RotatE (Sun et al., 2019b). In this paper, we identify that the key issue is efficient contrastive learning. |
Liang Wang; Wei Zhao; Zhuoyu Wei; Jingming Liu; |
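The "efficient contrastive learning" that entry 295 identifies as the key issue can be illustrated with a standard InfoNCE loss over in-batch negatives. This PyTorch sketch assumes pre-computed (head, relation) and tail text embeddings, and omits SimKGC's additional negative types and re-ranking.

```python
# In-batch contrastive loss sketch for text-based KG completion (entry 295).
import torch
import torch.nn.functional as F

def info_nce(head_rel_emb: torch.Tensor, tail_emb: torch.Tensor,
             temperature: float = 0.05) -> torch.Tensor:
    q = F.normalize(head_rel_emb, dim=-1)        # cosine similarity via dot product
    t = F.normalize(tail_emb, dim=-1)
    logits = q @ t.T / temperature               # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=logits.device)  # diagonal = positives
    return F.cross_entropy(logits, labels)       # other tails act as negatives

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```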
296 | Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. |
Oliver Eberle; Stephanie Brandl; Jonas Pilot; Anders Søgaard; |
297 | LexGLUE: A Benchmark Dataset for Legal Language Understanding in English Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Their usefulness, however, largely depends on whether current state-of-the-art models can generalize across various tasks in the legal domain. To answer this currently open question, we introduce the Legal General Language Understanding Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks in a standardized way. |
Ilias Chalkidis; Abhik Jana; Dirk Hartung; Michael Bommarito; Ion Androutsopoulos; Daniel Katz; Nikolaos Aletras; |
298 | DiBiMT: A Novel Benchmark for Measuring Word Sense Disambiguation Biases in Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present DiBiMT, the first entirely manually-curated evaluation benchmark which enables an extensive study of semantic biases in Machine Translation of nominal and verbal words in five different language combinations, namely, English and one or other of the following languages: Chinese, German, Italian, Russian and Spanish. |
Niccolò Campolungo; Federico Martelli; Francesco Saina; Roberto Navigli; |
299 | Improving Word Translation Via Two-Stage Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a robust and effective two-stage contrastive learning framework for the BLI task. |
Yaoyiran Li; Fangyu Liu; Nigel Collier; Anna Korhonen; Ivan Vulic; |
300 | Scheduled Multi-task Learning for Neural Chat Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Although the NCT models have achieved impressive success, it is still far from satisfactory due to insufficient chat translation data and simple joint training manners. To address the above issues, we propose a scheduled multi-task learning framework for NCT. |
Yunlong Liang; Fandong Meng; Jinan Xu; Yufeng Chen; Jie Zhou; |
301 | FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a benchmark suite of four datasets for evaluating the fairness of pre-trained language models and the techniques used to fine-tune them for downstream tasks. |
Ilias Chalkidis; Tommaso Pasini; Sheng Zhang; Letizia Tomada; Sebastian Schwemer; Anders Søgaard; |
302 | Towards Abstractive Grounded Summarization of Podcast Transcripts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The problem is exacerbated by speech disfluencies and recognition errors in transcripts of spoken language. In this paper, we explore a novel abstractive summarization method to alleviate these issues. |
Kaiqiang Song; Chen Li; Xiaoyang Wang; Dong Yu; Fei Liu; |
303 | FiNER: Financial Numeric Entity Recognition for XBRL Tagging Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To improve BERT’s performance, we propose two simple and effective solutions that replace numeric expressions with pseudo-tokens reflecting original token shapes and numeric magnitudes. |
Lefteris Loukas; Manos Fergadiotis; Ilias Chalkidis; Eirini Spyropoulou; Prodromos Malakasiotis; Ion Androutsopoulos; Georgios Paliouras; |
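The pseudo-token idea in entry 303 is straightforward to emulate: numeric expressions are replaced either by their token shape or by a token encoding their magnitude. The exact token formats below are assumptions for illustration, not the paper's vocabulary.

```python
# Sketch of the two numeric pseudo-token schemes described in entry 303.
import re

NUM = re.compile(r"\d+(?:\.\d+)?")

def shape_token(m: re.Match) -> str:
    return "[" + re.sub(r"\d", "X", m.group()) + "]"   # keeps the token shape

def magnitude_token(m: re.Match) -> str:
    digits = len(m.group().split(".")[0])
    return f"[NUM{digits}]"                            # keeps only the magnitude

text = "Revenue grew 12.5 percent to 4200 million."
print(NUM.sub(shape_token, text))      # Revenue grew [XX.X] percent to [XXXX] million.
print(NUM.sub(magnitude_token, text))  # Revenue grew [NUM2] percent to [NUM4] million.
```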
304 | Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, in this work, we propose a hierarchical contrastive learning mechanism, which can unify semantic meanings across hybrid granularities in the input text. |
Mingzhe Li; XieXiong Lin; Xiuying Chen; Jinxiong Chang; Qishen Zhang; Feng Wang; Taifeng Wang; Zhongyi Liu; Wei Chu; Dongyan Zhao; Rui Yan; |
305 | EPT-X: An Expression-Pointer Transformer Model That Generates EXplanations for Numbers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a neural model EPT-X (Expression-Pointer Transformer with Explanations), which utilizes natural language explanations to solve an algebraic word problem. |
Bugeun Kim; Kyung Seo Ki; Sangkyu Rhim; Gahgene Gweon; |
306 | Identifying The Human Values Behind Arguments Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper studies the (often implicit) human values behind natural language arguments, such as to have freedom of thought or to be broadminded. |
Johannes Kiesel; Milad Alshomary; Nicolas Handke; Xiaoni Cai; Henning Wachsmuth; Benno Stein; |
307 | BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce BenchIE: a benchmark and evaluation framework for comprehensive evaluation of OIE systems for English, Chinese, and German. |
Kiril Gashteovski; Mingying Yu; Bhushan Kotnis; Carolin Lawrence; Mathias Niepert; Goran Glavaš; |
308 | Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we successfully leverage unimodal self-supervised learning to promote multimodal AVSR. |
Xichen Pan; Peiyu Chen; Yichen Gong; Helong Zhou; Xinbing Wang; Zhouhan Lin; |
309 | SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that it is possible to directly train a second-stage model performing re-ranking on a set of summary candidates. |
Mathieu Ravaut; Shafiq Joty; Nancy Chen; |
310 | Understanding Multimodal Procedural Knowledge By Sequencing Multimodal Instructional Manuals Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we benchmark models’ capability of reasoning over and sequencing unordered multimodal instructions by curating datasets from online instructional manuals and collecting comprehensive human annotations. |
Te-Lin Wu; Alex Spangher; Pegah Alipoormolabashi; Marjorie Freedman; Ralph Weischedel; Nanyun Peng; |
311 | Zoom Out and Observe: News Environment Perception for Fake News Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To capture the environmental signals of news posts, we zoom out to observe the news environment and propose the News Environment Perception Framework (NEP). |
Qiang Sheng; Juan Cao; Xueyao Zhang; Rundong Li; Danding Wang; Yongchun Zhu; |
312 | Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The context encoding is undertaken by contextual parameters, trained on document-level data. In this work, we discuss the difficulty of training these parameters effectively, due to the sparsity of the words in need of context (i.e., the training signal), and their relevant context. |
Lorenzo Lupo; Marco Dinarelli; Laurent Besacier; |
313 | Saliency As Evidence: Event Detection with Trigger Saliency Attribution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This research examines the issue in depth and presents a new concept termed trigger saliency attribution, which can explicitly quantify the underlying patterns of events. |
Jian Liu; Yufeng Chen; Jinan Xu; |
314 | SRL4E – Semantic Role Labeling for Emotions: A Unified Evaluation Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, currently available gold datasets are heterogeneous in size, domain, format, splits, emotion categories and role labels, making comparisons across different works difficult and hampering progress in the area. In this paper, we tackle this issue and present a unified evaluation framework focused on Semantic Role Labeling for Emotions (SRL4E), in which we unify several datasets tagged with emotions and semantic roles by using a common labeling scheme. |
Cesare Campagnano; Simone Conia; Roberto Navigli; |
315 | Context Matters: A Pragmatic Study of PLMs’ Negation Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this article, we adopt the pragmatic paradigm to conduct a study of negation understanding focusing on transformer-based PLMs. |
Reto Gubelmann; Siegfried Handschuh; |
316 | Probing for Predicate Argument Structures in Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These results have prompted researchers to investigate the inner workings of modern PLMs with the aim of understanding how, where, and to what extent they encode information about SRL. In this paper, we follow this line of research and probe for predicate argument structures in PLMs. |
Simone Conia; Roberto Navigli; |
317 | Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a study on leveraging multilingual pre-trained generative language models for zero-shot cross-lingual event argument extraction (EAE). |
Kuan-Hao Huang; I-Hung Hsu; Prem Natarajan; Kai-Wei Chang; Nanyun Peng; |
318 | Identifying Moments of Change from Longitudinal User Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we define a new task, that of identifying moments of change in individuals on the basis of their shared content online. |
Adam Tsakalidis; Federico Nanni; Anthony Hills; Jenny Chim; Jiayu Song; Maria Liakata; |
319 | Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we present PPTOD, a unified plug-and-play model for task-oriented dialogue. |
Yixuan Su; Lei Shu; Elman Mansimov; Arshit Gupta; Deng Cai; Yi-An Lai; Yi Zhang; |
320 | Graph Enhanced Contrastive Learning for Radiology Findings Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the limitation, we propose a unified framework for exploiting both extra knowledge and the original findings in an integrated way so that the critical information (i.e., key words and their relations) can be extracted in an appropriate way to facilitate impression generation. |
Jinpeng Hu; Zhuo Li; Zhihong Chen; Zhen Li; Xiang Wan; Tsung-Hui Chang; |
321 | Semi-Supervised Formality Style Transfer with Consistency Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a simple yet effective semi-supervised framework to better utilize source-side unlabeled sentences based on consistency training. |
Ao Liu; An Wang; Naoaki Okazaki; |
322 | Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In our work, we argue that cross-language ability comes from the commonality between languages. |
Yuan Chai; Yaobo Liang; Nan Duan; |
323 | Rare and Zero-shot Word Sense Disambiguation Using Z-Reweighting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on the relation, we propose a Z-reweighting method on the word level to adjust the training on the imbalanced dataset. |
Ying Su; Hongming Zhang; Yangqiu Song; Tong Zhang; |
324 | Nibbling at The Hard Core of Word Sense Disambiguation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we provide evidence showing why the F1 score metric should not simply be taken at face value and present an exhaustive analysis of the errors that seven of the most representative state-of-the-art systems for English all-words WSD make on traditional evaluation benchmarks. |
Marco Maru; Simone Conia; Michele Bevilacqua; Roberto Navigli; |
325 | Large Scale Substitution-based Word Sense Induction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a word-sense induction method based on pre-trained masked language models (MLMs), which can cheaply scale to large vocabularies and large corpora. |
Matan Eyal; Shoval Sadde; Hillel Taub-Tabib; Yoav Goldberg; |
326 | Can Synthetic Translations Improve Bitext Quality? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work explores, instead, how synthetic translations can be used to revise potentially imperfect reference translations in mined bitext. |
Eleftheria Briakou; Marine Carpuat; |
327 | Unsupervised Dependency Graph Network Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new model, the Unsupervised Dependency Graph Network (UDGN), that can induce dependency structures from raw corpora and the masked language modeling task. |
Yikang Shen; Shawn Tan; Alessandro Sordoni; Peng Li; Jie Zhou; Aaron Courville; |
328 | WikiDiverse: A Multimodal Entity Linking Dataset with Diversified Contextual Topics and Entity Types Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present WikiDiverse, a high-quality human-annotated MEL dataset with diversified contextual topics and entity types from Wikinews, which uses Wikipedia as the corresponding knowledge base. |
Xuwu Wang; Junfeng Tian; Min Gui; Zhixu Li; Rui Wang; Ming Yan; Lihan Chen; Yanghua Xiao; |
329 | Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While highlighting various sources of domain-specific challenges that contribute to this underwhelming performance, we illustrate that the underlying PLMs have a higher potential for probing tasks. To achieve this, we propose Contrastive-Probe, a novel self-supervised contrastive probing approach that adjusts the underlying PLMs without using any probing data. |
Zaiqiao Meng; Fangyu Liu; Ehsan Shareghi; Yixuan Su; Charlotte Collins; Nigel Collier; |
330 | Fine- and Coarse-Granularity Hybrid Self-Attention for Efficient BERT Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, deploying these models can be prohibitively costly, as the standard self-attention mechanism of the Transformer suffers from quadratic computational cost in the input sequence length. To confront this, we propose FCA, a fine- and coarse-granularity hybrid self-attention that reduces the computation cost through progressively shortening the computational sequence length in self-attention. |
Jing Zhao; Yifan Wang; Junwei Bao; Youzheng Wu; Xiaodong He; |
331 | Compression of Generative Pre-trained Language Models Via Quantization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we compress generative PLMs by quantization. |
Chaofan Tao; Lu Hou; Wei Zhang; Lifeng Shang; Xin Jiang; Qun Liu; Ping Luo; Ngai Wong; |
332 | Visual-Language Navigation Pretraining Via Prompt-based Environmental Self-exploration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To improve the ability of fast cross-domain adaptation, we propose Prompt-based Environmental Self-exploration (ProbES), which can self-explore the environments by sampling trajectories and automatically generate structured instructions via a large-scale cross-modal pretrained model (CLIP). |
Xiwen Liang; Fengda Zhu; Li Lingling; Hang Xu; Xiaodan Liang; |
333 | DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new dialog pre-training framework called DialogVED, which introduces continuous latent variables into the enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses. |
Wei Chen; Yeyun Gong; Song Wang; Bolun Yao; Weizhen Qi; Zhongyu Wei; Xiaowu Hu; Bartuer Zhou; Yi Mao; Weizhu Chen; Biao Cheng; Nan Duan; |
334 | Contextual Fine-to-Coarse Distillation for Coarse-grained Response Selection in Open-Domain Conversations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a Contextual Fine-to-Coarse (CFC) distilled model for coarse-grained response selection in open-domain conversations. |
Wei Chen; Yeyun Gong; Can Xu; Huang Hu; Bolun Yao; Zhongyu Wei; Zhihao Fan; Xiaowu Hu; Bartuer Zhou; Biao Cheng; Daxin Jiang; Nan Duan; |
335 | Textomics: A Dataset for Genomics Data Summary Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on this dataset, we study two novel tasks: generating textual summary from a genomics data matrix and vice versa. Inspired by the successful applications of k nearest neighbors in modeling genomics data, we propose a kNN-Vec2Text model to address these tasks and observe substantial improvement on our dataset. |
Mu-Chun Wang; Zixuan Liu; Sheng Wang; |
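Entry 335's kNN-Vec2Text builds on a simple primitive: find the training expression vectors nearest to a query and condition generation on their summaries. A NumPy sketch of that retrieval step follows, with random stand-in data and Euclidean distance as simplifying assumptions.

```python
# kNN retrieval step behind a kNN-Vec2Text-style model (entry 335).
import numpy as np

train_vecs = np.random.rand(100, 64)          # stand-in genomics data matrix rows
train_summaries = [f"summary {i}" for i in range(100)]

def knn_summaries(query_vec: np.ndarray, k: int = 3) -> list:
    dists = np.linalg.norm(train_vecs - query_vec, axis=1)
    nearest = np.argsort(dists)[:k]
    # A generation model would condition on these neighbor summaries;
    # here we simply return them.
    return [train_summaries[i] for i in nearest]

print(knn_summaries(np.random.rand(64)))
```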
336 | A Contrastive Framework for Learning Sentence Representations from Pairwise and Triple-wise Perspective in Angular Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new method, ArcCSE, with training objectives designed to enhance the pairwise discriminative power and model the entailment relation of triplet sentences. |
Yuhao Zhang; Hongji Zhu; Yongliang Wang; Nan Xu; Xiaobo Li; Binqiang Zhao; |
337 | Packed Levitated Marker for Entity and Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel span representation approach, named Packed Levitated Markers (PL-Marker), to consider the interrelation between the spans (pairs) by strategically packing the markers in the encoder. |
Deming Ye; Yankai Lin; Peng Li; Maosong Sun; |
338 | An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Since deriving reasoning chains requires multi-hop reasoning for task-oriented dialogues, existing neuro-symbolic approaches would induce error propagation due to the one-phase design. To overcome this, we propose a two-phase approach that consists of a hypothesis generator and a reasoner. |
Shiquan Yang; Rui Zhang; Sarah Erfani; Jey Han Lau; |
339 | Impact of Evaluation Methodologies on Code Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce the time-segmented evaluation methodology, which is novel to the code summarization research community, and compare it with the mixed-project and cross-project methodologies that have been commonly used. |
Pengyu Nie; Jiyang Zhang; Junyi Jessy Li; Ray Mooney; Milos Gligoric; |
340 | KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The recently proposed Fusion-in-Decoder (FiD) framework is a representative example, which is built on top of a dense passage retriever and a generative reader, achieving state-of-the-art performance. In this paper, we further improve the FiD approach by introducing a knowledge-enhanced version, namely KG-FiD. |
Donghan Yu; Chenguang Zhu; Yuwei Fang; Wenhao Yu; Shuohang Wang; Yichong Xu; Xiang Ren; Yiming Yang; Michael Zeng; |
341 | Which Side Are You On? Insider-Outsider Classification in Conspiracy-theoretic Social Media Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address these challenges, we define a novel Insider-Outsider classification task. Because we are not aware of any appropriate existing datasets or attendant models, we introduce a labeled dataset (CT5K) and design a model (NP2IO) to address this task. |
Pavan Holur; Tianyi Wang; Shadi Shahsavari; Timothy Tangherlini; Vwani Roychowdhury; |
342 | Learning From Failure: Data Capture in An Australian Aboriginal Community Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a close-up study of the process of deploying data capture technology on the ground in an Australian Aboriginal community. |
Eric Le Ferrand; Steven Bird; Laurent Besacier; |
343 | Deep Inductive Logic Reasoning for Multi-Hop Reading Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a deep-learning based inductive logic reasoning method that first extracts query-related (candidate-related) information, and then conducts logic reasoning among the filtered information by inducing feasible rules that entail the target relation. |
Wenya Wang; Sinno Pan; |
344 | CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper addresses the problem of dialogue reasoning with contextualized commonsense inference. |
Deepanway Ghosal; Siqi Shen; Navonil Majumder; Rada Mihalcea; Soujanya Poria; |
345 | A Comparative Study of Faithfulness Metrics for Model Interpretability Methods Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, we find that different faithfulness metrics show conflicting preferences when comparing different interpretations. Motivated by this observation, we aim to conduct a comprehensive and comparative study of the widely adopted faithfulness metrics. |
Chun Sik Chan; Huanqi Kong; Liang Guanqing; |
346 | SPoT: Better Frozen Model Adaptation Through Soft Prompt Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Building on the Prompt Tuning approach of Lester et al. (2021), which learns task-specific soft prompts to condition a frozen pre-trained model to perform different tasks, we propose a novel prompt-based transfer learning approach called SPoT: Soft Prompt Transfer. |
Tu Vu; Brian Lester; Noah Constant; Rami Al-Rfou; Daniel Cer; |
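Entry 346's soft prompt transfer amounts to learning a small matrix of prompt embeddings on a source task and using it to initialize the target task's prompt while the backbone stays frozen. The PyTorch sketch below is written under those assumptions (shapes, initialization scale); it is not the authors' implementation.

```python
# Soft prompt transfer sketch in the spirit of SPoT (entry 346).
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable prompt vectors prepended to frozen-model input embeddings."""
    def __init__(self, prompt_len: int = 20, dim: int = 768):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        prefix = self.prompt.unsqueeze(0).expand(token_embeds.size(0), -1, -1)
        return torch.cat([prefix, token_embeds], dim=1)    # prepend the prompt

source_prompt = SoftPrompt()   # ...train on the source task (backbone frozen)...
target_prompt = SoftPrompt()
target_prompt.prompt.data.copy_(source_prompt.prompt.data)  # transfer as init
out = target_prompt(torch.randn(2, 5, 768))                 # -> (2, 25, 768)
```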
347 | Pass Off Fish Eyes for Pearls: Attacking Model Selection of Pre-trained Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we argue that current FMS methods are vulnerable, as the assessment mainly relies on the static features extracted from PTMs. |
Biru Zhu; Yujia Qin; Fanchao Qi; Yangdong Deng; Zhiyuan Liu; Maosong Sun; Ming Gu; |
348 | Educational Question Generation of Children Storybooks Via Question Type Distribution Learning and Event-centric Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel question generation method that first learns the question type distribution of an input story paragraph, and then summarizes salient events which can be used to generate high-cognitive-demand questions. |
Zhenjie Zhao; Yufang Hou; Dakuo Wang; Mo Yu; Chengzhong Liu; Xiaojuan Ma; |
349 | HeterMPC: A Heterogeneous Graph Neural Network for Response Generation in Multi-Party Conversations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Compared with a two-party conversation where a dialogue context is a sequence of utterances, building a response generation model for MPCs is more challenging, since there exist complicated context structures and the generated responses heavily rely on both interlocutors (i.e., speaker and addressee) and history utterances. To address these challenges, we present HeterMPC, a heterogeneous graph-based neural network for response generation in MPCs which models the semantics of utterances and interlocutors simultaneously with two types of nodes in a graph. |
Jia-Chen Gu; Chao-Hong Tan; Chongyang Tao; Zhen-Hua Ling; Huang Hu; Xiubo Geng; Daxin Jiang; |
350 | The Patient Is More Dead Than Alive: Exploring The Current State of The Multi-document Summarisation of The Biomedical Literature Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on this analysis, we propose a new approach to human evaluation and identify several challenges that must be overcome to develop effective biomedical MDS systems. |
Yulia Otmakhova; Karin Verspoor; Timothy Baldwin; Jey Han Lau; |
351 | A Multi-Document Coverage Reward for RELAXed Multi-Document Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As for many other generative tasks, reinforcement learning (RL) offers the potential to improve the training of MDS models; yet, it requires a carefully-designed reward that can ensure appropriate leverage of both the reference summaries and the input documents. For this reason, in this paper we propose fine-tuning an MDS baseline with a reward that balances a reference-based metric such as ROUGE with coverage of the input documents. |
Jacob Parnell; Inigo Jauregi Unanue; Massimo Piccardi; |
352 | KNN-Contrastive Learning for Out-of-Domain Intent Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we start from the nature of OOD intent classification and explore its optimization objective. |
Yunhua Zhou; Peiju Liu; Xipeng Qiu; |
353 | A Neural Network Architecture for Program Understanding Inspired By Human Behaviors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider human behaviors and propose the PGNN-EK model that consists of two main components. |
Renyu Zhu; Lei Yuan; Xiang Li; Ming Gao; Wenyuan Cai; |
354 | FaVIQ: FAct Verification from Information-seeking Questions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we construct a large-scale challenging fact verification dataset called FAVIQ, consisting of 188k claims derived from an existing corpus of ambiguous information-seeking questions. |
Jungsoo Park; Sewon Min; Jaewoo Kang; Luke Zettlemoyer; Hannaneh Hajishirzi; |
355 | Simulating Bandit Learning from User Feedback for Extractive Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study learning from user feedback for extractive question answering by simulating feedback using supervised data. |
Ge Gao; Eunsol Choi; Yoav Artzi; |
356 | Beyond Goldfish Memory: Long-Term Open-Domain Conversation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we collect and release a human-human dataset consisting of multiple chat sessions whereby the speaking partners learn about each other’s interests and discuss the things they have learnt from past sessions. |
Jing Xu; Arthur Szlam; Jason Weston; |
357 | ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present ReCLIP, a simple but strong zero-shot baseline that repurposes CLIP, a state-of-the-art large-scale model, for ReC. |
Sanjay Subramanian; William Merrill; Trevor Darrell; Matt Gardner; Sameer Singh; Anna Rohrbach; |
358 | Dynamic Prefix-Tuning for Generative Template-based Event Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a generative template-based event extraction method with dynamic prefix (GTEE-DynPref) by integrating context information with type-specific prefixes to learn a context-specific prefix for each context. |
Xiao Liu; Heyan Huang; Ge Shi; Bo Wang; |
359 | E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes an effective dynamic inference approach, called E-LANG, which distributes the inference between large accurate Super-models and light-weight Swift models. |
Mohammad Akbari; Amin Banitalebi-Dehkordi; Yong Zhang; |
360 | PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data. |
Wen Xiao; Iz Beltagy; Giuseppe Carenini; Arman Cohan; |
361 | Dynamic Global Memory for Document-level Argument Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While recent work on document-level extraction has gone beyond single-sentence and increased the cross-sentence inference capability of end-to-end models, they are still restricted by certain input sequence length constraints and usually ignore the global context between events. To tackle this issue, we introduce a new global neural generation-based framework for document-level event argument extraction by constructing a document memory store to record the contextual event information and leveraging it to implicitly and explicitly help with decoding of arguments for later events. |
Xinya Du; Sha Li; Heng Ji; |
362 | Measuring The Impact of (Psycho-)Linguistic and Readability Features and Their Spill Over Effects on The Prediction of Eye Movement Patterns Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we report on experiments with two eye-tracking corpora of naturalistic reading and two language models (BERT and GPT-2). |
Daniel Wiechmann; Elma Kerz; |
363 | Alternative Input Signals Ease Transfer in Multilingual Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we tackle inhibited transfer by augmenting the training data with alternative signals that unify different writing systems, such as phonetic, romanized, and transliterated input. |
Simeng Sun; Angela Fan; James Cross; Vishrav Chaudhary; Chau Tran; Philipp Koehn; Francisco Guzmán; |
364 | Phone-ing It In: Towards Flexible Multi-Modal Language Model Training By Phonetic Representations of Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a multi-modal approach to train language models using whatever text and/or audio data might be available in a language. |
Colin Leong; Daniel Whitenack; |
365 | Noisy Channel Language Model Prompting for Few-Shot Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a noisy channel approach for language model prompting in few-shot text classification. |
Sewon Min; Mike Lewis; Hannaneh Hajishirzi; Luke Zettlemoyer; |
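The channel direction in entry 365 can be sketched with any causal language model: rather than scoring P(label | input), score P(input | label). A minimal sketch using GPT-2 via Hugging Face Transformers follows; the prompt template and verbalizers are illustrative assumptions, not the paper's exact setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def log_p(prefix, continuation):
    # log P(continuation | prefix) under the causal LM
    p_ids = tok(prefix, return_tensors="pt").input_ids
    c_ids = tok(continuation, return_tensors="pt").input_ids
    ids = torch.cat([p_ids, c_ids], dim=1)
    with torch.no_grad():
        logits = lm(ids).logits.log_softmax(-1)
    # position t predicts token t+1, so slice from the last prefix position
    step = logits[0, p_ids.size(1) - 1 : -1]
    return step.gather(1, c_ids[0].unsqueeze(1)).sum().item()

def channel_classify(text, verbalizers=("terrible", "great")):
    # channel scoring: pick the label whose verbalizer best "generates" the input
    return max(verbalizers, key=lambda v: log_p(f"It was {v}. ", text))

print(channel_classify("A film of rare warmth and wit."))
```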
366 | Multilingual Unsupervised Sequence Segmentation Transfers to Extremely Low-resource Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that unsupervised sequence-segmentation performance can be transferred to extremely low-resource languages by pre-training a Masked Segmental Language Model (Downey et al., 2021) multilingually. |
C. Downey; Shannon Drizin; Levon Haroutunian; Shivin Thukral; |
367 | KinyaBERT: A Morphology-aware Kinyarwanda Language Model Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Even given a morphological analyzer, naive sequencing of morphemes into a standard BERT architecture is inefficient at capturing morphological compositionality and expressing word-relative syntactic regularities. We address these challenges by proposing a simple yet effective two-tier BERT architecture that leverages a morphological analyzer and explicitly represents morphological compositionality. |
Antoine Nzeyimana; Andre Niyongabo Rubungo; |
368 | On The Calibration of Pre-trained Language Models Using Mixup Guided By Area Under The Margin and Saliency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we explore mixup for model calibration on several NLU tasks and propose a novel mixup strategy for pre-trained language models that improves model calibration further. |
Seo Yeon Park; Cornelia Caragea; |
369 | IMPLI: Investigating NLI Models’ Performance on Figurative Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the IMPLI (Idiomatic and Metaphoric Paired Language Inference) dataset, an English dataset consisting of paired sentences spanning idioms and metaphors. |
Kevin Stowe; Prasetya Utama; Iryna Gurevych; |
370 | QAConv: Question Answering on Informative Conversations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces QAConv, a new question answering (QA) dataset that uses conversations as a knowledge source. |
Chien-Sheng Wu; Andrea Madotto; Wenhao Liu; Pascale Fung; Caiming Xiong; |
371 | Prix-LM: Pretraining for Multilingual Knowledge Base Construction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To achieve this, it is crucial to represent multilingual knowledge in a shared/unified space. To this end, we propose a unified representation model, Prix-LM, for multilingual KB construction and completion. |
Wenxuan Zhou; Fangyu Liu; Ivan Vulic; Nigel Collier; Muhao Chen; |
372 | Semantic Composition with PSHRG for Derivation Tree Reconstruction from Graph-Based Meaning Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a data-driven approach to generating derivation trees from meaning representation graphs with probabilistic synchronous hyperedge replacement grammar (PSHRG). |
Chun Hei Lo; Wai Lam; Hong Cheng; |
373 | HOLM: Hallucinating Objects with Language Models for Referring Expression Recognition in Partially-Observed Scenes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce HOLM, Hallucinating Objects with Language Models, to address the challenge of partial observability. |
Volkan Cirik; Louis-Philippe Morency; Taylor Berg-Kirkpatrick; |
374 | Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we build upon some of the existing techniques for predicting the zero-shot performance on a task, by modeling it as a multi-task learning problem. |
Kabir Ahuja; Shanu Kumar; Sandipan Dandapat; Monojit Choudhury; |
375 | ∞-former: Infinite Memory Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the ∞-former, which extends the vanilla transformer with an unbounded long-term memory. |
Pedro Henrique Martins; Zita Marinho; Andre Martins; |
376 | Systematic Inequalities in Language Technology Performance Across The World’s Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a framework for estimating the global utility of language technologies as revealed in a comprehensive snapshot of recent publications in NLP. |
Damian Blasi; Antonios Anastasopoulos; Graham Neubig; |
377 | CaMEL: Case Marker Extraction Without Labels Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce CaMEL (Case Marker Extraction without Labels), a novel and challenging task in computational morphology that is especially relevant for low-resource languages. |
Leonie Weissweiler; Valentin Hofmann; Masoud Jalili Sabet; Hinrich Schuetze; |
378 | Improving Generalizability in Implicitly Abusive Language Detection with Concept Activation Vectors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that general abusive language classifiers tend to be fairly reliable in detecting out-of-domain explicitly abusive utterances but fail to detect new types of more subtle, implicit abuse. |
Isar Nejadgholi; Kathleen Fraser; Svetlana Kiritchenko; |
379 | Reports of Personal Experiences and Stories in Argumentation: Datasets and Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The impact of personal reports and stories in argumentation has been studied in the Social Sciences, but it is still largely underexplored in NLP. Our work is the first step towards filling this gap: our goal is to develop robust classifiers to identify documents containing personal experiences and reports. |
Neele Falk; Gabriella Lapesa; |
380 | Non-neural Models Matter: A Re-evaluation of Neural Referring Expression Generation Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, the task of generating referring expressions in linguistic context is used as an example. |
Fahime Same; Guanyi Chen; Kees Van Deemter; |
381 | Bridging The Generalization Gap in Text-to-SQL Parsing with Schema Expansion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To study this problem, we first propose a synthetic dataset along with a re-purposed train/test split of the Squall dataset (Shi et al., 2020) as new benchmarks to quantify domain generalization over column operations, and find existing state-of-the-art parsers struggle in these benchmarks. We propose to address this problem by incorporating prior domain knowledge by preprocessing table schemas, and design a method that consists of two components: schema expansion and schema pruning. |
Chen Zhao; Yu Su; Adam Pauls; Emmanouil Antonios Platanios; |
382 | Predicate-Argument Based Bi-Encoder for Paraphrase Identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we adopt a bi-encoder approach to the paraphrase identification task, and investigate the impact of explicitly incorporating predicate-argument information into SBERT through weighted aggregation. |
Qiwei Peng; David Weir; Julie Weeds; Yekun Chai; |
383 | MINER: Improving Out-of-Vocabulary Named Entity Recognition from An Information Theoretic Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, recent studies show that previous approaches may over-rely on entity mention information, resulting in poor performance on out-of-vocabulary (OOV) entity recognition. In this work, we propose MINER, a novel NER learning framework, to remedy this issue from an information-theoretic perspective. |
Xiao Wang; Shihan Dou; Limao Xiong; Yicheng Zou; Qi Zhang; Tao Gui; Liang Qiao; Zhanzhan Cheng; Xuanjing Huang; |
384 | Leveraging Wikipedia Article Evolution for Promotional Tone Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work we introduce WikiEvolve, a dataset for document-level promotional tone detection. |
Christine De Kock; Andreas Vlachos; |
385 | From Text to Talk: Harnessing Conversational Corpora for Humane and Diversity-aware Language Technology Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show how interactional data from 63 languages (26 families) harbours insights about turn-taking, timing, sequential structure and social action, with implications for language technology, natural language understanding, and the design of conversational interfaces. |
Mark Dingemanse; Andreas Liesenfeld; |
386 | Flooding-X: Improving BERT’s Resistance to Adversarial Attacks Via Loss-Restricted Fine-Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the tradition of generating adversarial perturbations for each input embedding (in the settings of NLP) scales up the training computational complexity by the number of gradient steps it takes to obtain the adversarial samples. To address this problem, we leverage the Flooding method, which primarily aims at better generalization, and we find it promising for defending against adversarial attacks. |
Qin Liu; Rui Zheng; Bao Rong; Jingyi Liu; ZhiHua Liu; Zhanzhan Cheng; Liang Qiao; Tao Gui; Qi Zhang; Xuanjing Huang; |
387 | RoMe: A Robust Metric for Evaluating Natural Language Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an automatic evaluation metric incorporating several core aspects of natural language understanding (language competence, syntactic and semantic variation). |
Md Rashad Al Hasan Rony; Liubov Kovriguina; Debanjan Chaudhuri; Ricardo Usbeck; Jens Lehmann; |
388 | Finding Structural Knowledge in Multimodal-BERT Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the knowledge learned in the embeddings of multimodal-BERT models. |
Victor Milewski; Miryam de Lhoneux; Marie-Francine Moens; |
389 | Fully Hyperbolic Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a fully hyperbolic framework to build hyperbolic networks based on the Lorentz model by adapting the Lorentz transformations (including boost and rotation) to formalize essential operations of neural networks. |
Weize Chen; Xu Han; Yankai Lin; Hexu Zhao; Zhiyuan Liu; Peng Li; Maosong Sun; Jie Zhou; |
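Entry 389 builds on the Lorentz (hyperboloid) model of hyperbolic space. As background, here is a minimal sketch of the standard Lorentz-model primitives (the Lorentzian inner product, geodesic distance, and a lift from Euclidean coordinates); the paper's boost and rotation layers are built on top of these and are not reproduced here.

```python
import numpy as np

def lorentz_inner(x, y):
    # Lorentzian inner product <x, y>_L = -x0*y0 + x1*y1 + ... + xn*yn
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def lorentz_distance(x, y, k=1.0):
    # geodesic distance on the hyperboloid <x, x>_L = -1/k (clip for stability)
    return np.arccosh(np.clip(-k * lorentz_inner(x, y), 1.0, None)) / np.sqrt(k)

def lift(v, k=1.0):
    # place a Euclidean vector onto the hyperboloid by solving for x0
    x0 = np.sqrt(1.0 / k + np.dot(v, v))
    return np.concatenate([[x0], v])

x, y = lift(np.array([0.1, 0.2])), lift(np.array([-0.3, 0.5]))
print(lorentz_distance(x, y))
```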
390 | Neural Machine Translation with Phrase-Level Universal Visual Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a phrase-level retrieval-based method for MMT to get visual information for the source input from existing sentence-image data sets so that MMT can break the limitation of paired sentence-image input. |
Qingkai Fang; Yang Feng; |
391 | M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a Multi-modal Multi-scene Multi-label Emotional Dialogue dataset, M3ED, which contains 990 dyadic emotional dialogues from 56 different TV series, a total of 9,082 turns and 24,449 utterances. |
Jinming Zhao; Tenggan Zhang; Jingwen Hu; Yuchen Liu; Qin Jin; Xinchao Wang; Haizhou Li; |
392 | Few-shot Named Entity Recognition with Self-describing Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a self-describing mechanism for few-shot NER, which can effectively leverage illustrative instances and precisely transfer knowledge from external resources by describing both entity types and mentions using a universal concept set. |
Jiawei Chen; Qing Liu; Hongyu Lin; Xianpei Han; Le Sun; |
393 | SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Motivated by the success of T5 (Text-To-Text Transfer Transformer) in pre-trained natural language processing models, we propose a unified-modal SpeechT5 framework that explores the encoder-decoder pre-training for self-supervised speech/text representation learning. |
Junyi Ao; Rui Wang; Long Zhou; Chengyi Wang; Shuo Ren; Yu Wu; Shujie Liu; Tom Ko; Qing Li; Yu Zhang; Zhihua Wei; Yao Qian; Jinyu Li; Furu Wei; |
394 | Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In recent years, machine learning models have rapidly become better at generating clinical consultation notes; yet, there is little work on how to properly evaluate the generated consultation notes to understand the impact they may have on both the clinician using them and the patient’s clinical safety. To address this, we present an extensive human evaluation study of consultation notes where 5 clinicians (i) listen to 57 mock consultations, (ii) write their own notes, (iii) post-edit a number of automatically generated notes, and (iv) extract all the errors, both quantitative and qualitative. |
Francesco Moramarco; Alex Papadopoulos Korfiatis; Mark Perera; Damir Juric; Jack Flann; Ehud Reiter; Anya Belz; Aleksandar Savkov; |
395 | Unified Structure Generation for Universal Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified text-to-structure generation framework, namely UIE, which can universally model different IE tasks, adaptively generate targeted structures, and collaboratively learn general IE abilities from different knowledge sources. |
Yaojie Lu; Qing Liu; Dai Dai; Xinyan Xiao; Hongyu Lin; Xianpei Han; Le Sun; Hua Wu; |
396 | Subgraph Retrieval Enhanced Model for Multi-hop Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a trainable subgraph retriever (SR) decoupled from the subsequent reasoning process, which enables a plug-and-play framework to enhance any subgraph-oriented KBQA model. |
Jing Zhang; Xiaokang Zhang; Jifan Yu; Jian Tang; Jie Tang; Cuiping Li; Hong Chen; |
397 | Pre-training to Match for Unified Low-shot Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Multi-Choice Matching Networks to unify low-shot relation extraction. |
Fangchao Liu; Hongyu Lin; Xianpei Han; Boxi Cao; Le Sun; |
398 | Can Prompt Probe Pretrained Language Models? Understanding The Invisible Risks from A Causal View Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To discover, understand and quantify the risks, this paper investigates the prompt-based probing from a causal view, highlights three critical biases which could induce biased results and conclusions, and proposes to conduct debiasing via causal intervention. |
Boxi Cao; Hongyu Lin; Xianpei Han; Fangchao Liu; Le Sun; |
399 | Evaluating Extreme Hierarchical Multi-label Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We analyze the state of the art of evaluation metrics based on a set of formal properties and we define an information theoretic based metric inspired by the Information Contrast Model (ICM). |
Enrique Amigo; Agustín Delgado; |
400 | What Does The Sea Say to The Shore? A BERT Based DST Style Approach for Speaker to Dialogue Attribution in Novels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a complete pipeline to extract characters in a novel and link them to their direct-speech utterances. |
Carolina Cuesta-Lazaro; Animesh Prasad; Trevor Wood; |
401 | Measuring Fairness of Text Classifiers Via Prediction Sensitivity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a new formulation – accumulated prediction sensitivity, which measures fairness in machine learning models based on the model’s prediction sensitivity to perturbations in input features. |
Satyapriya Krishna; Rahul Gupta; Apurv Verma; Jwala Dhamala; Yada Pruksachatkun; Kai-Wei Chang; |
402 | RotateQVS: Representing Temporal Information As Rotations in Quaternion Vector Space for Temporal Knowledge Graph Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel temporal modeling method which represents temporal entities as Rotations in Quaternion Vector Space (RotateQVS) and relations as complex vectors in Hamilton’s quaternion space. |
Kai Chen; Ye Wang; Yitong Li; Aiping Li; |
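Entry 402 represents temporal information as quaternion rotations. A minimal sketch of the underlying operation, the Hamilton product and the conjugation-based rotation e_t = q e q^{-1}, follows; the variable names are illustrative and the paper's full scoring function is not reproduced here.

```python
import numpy as np

def hamilton(q, r):
    # Hamilton product of quaternions given as (w, x, y, z)
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def rotate(entity, time_q):
    # rotate an entity quaternion by a unit "time" quaternion: e_t = q e q^{-1}
    conj = time_q * np.array([1, -1, -1, -1])  # conjugate = inverse for unit q
    return hamilton(hamilton(time_q, entity), conj)

t = np.array([0.9, 0.1, 0.3, 0.2]); t /= np.linalg.norm(t)  # unit time quaternion
e = np.array([0.0, 1.0, 0.0, 0.0])                          # one 4-d entity block
print(rotate(e, t))
```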
403 | Feeding What You Need By Understanding What You Learned Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, such a paradigm lacks sufficient interpretation of model capability and cannot efficiently train a model with a large corpus. In this paper, we argue that a deep understanding of model capabilities and data properties can help us feed a model with appropriate training data based on its learning status. |
Xiaoqiang Wang; Bang Liu; Fangli Xu; Bo Long; Siliang Tang; Lingfei Wu; |
404 | Probing Simile Knowledge from Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we probe simile knowledge from PLMs to solve the SI and SG tasks in the unified framework of simile triple completion for the first time. |
Weijie Chen; Yongzhu Chang; Rongsheng Zhang; Jiashu Pu; Guandan Chen; Le Zhang; Yadong Xi; Yijiang Chen; Chang Su; |
405 | An Effective and Efficient Entity Alignment Decoding Algorithm Via Third-Order Tensor Isomorphism Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an effective and efficient EA Decoding Algorithm via Third-order Tensor Isomorphism (DATTI). |
Xin Mao; Meirong Ma; Hao Yuan; Jianchao Zhu; ZongYu Wang; Rui Xie; Wei Wu; Man Lan; |
406 | Entailment Graph Learning with Textual Entailment and Soft Transitivity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a two-stage method, Entailment Graph with Textual Entailment and Transitivity (EGT2). |
Zhibin Chen; Yansong Feng; Dongyan Zhao; |
407 | Logic Traps in Evaluating Attribution Scores Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper systematically reviews existing methods for evaluating attribution scores and summarizes the logic traps in these methods. |
Yiming Ju; Yuanzhe Zhang; Zhao Yang; Zhongtao Jiang; Kang Liu; Jun Zhao; |
408 | Continual Pre-training of Language Models for Math Problem Understanding with Syntax-Aware Memory Network Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study how to continually pre-train language models for improving the understanding of math problems. |
Zheng Gong; Kun Zhou; Xin Zhao; Jing Sha; Shijin Wang; Ji-Rong Wen; |
409 | Multitasking Framework for Unsupervised Simple Definition Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel task of Simple Definition Generation (SDG) to help language learners and low literacy readers. |
Cunliang Kong; Yun Chen; Hengyuan Zhang; Liner Yang; Erhong Yang; |
410 | Learning to Reason Deductively: Math Word Problem Solving As Complex Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we view the task as a complex relation extraction problem, proposing a novel approach that presents explainable deductive reasoning steps to iteratively construct target expressions, where each step involves a primitive operation over two quantities defining their relation. |
Zhanming Jie; Jierui Li; Wei Lu; |
411 | When Did You Become So Smart, Oh Wise One?! Sarcasm Explanation in Multi-modal Multi-party Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study the discourse structure of sarcastic conversations and propose a novel task – Sarcasm Explanation in Dialogue (SED). |
Shivani Kumar; Atharva Kulkarni; Md Shad Akhtar; Tanmoy Chakraborty; |
412 | Toward Interpretable Semantic Textual Similarity Via Optimal Transport-based Contrastive Sentence Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explicitly describe the sentence distance as the weighted sum of contextualized token distances on the basis of a transportation problem, and then present the optimal transport-based distance measure, named RCMD; it identifies and leverages semantically-aligned token pairs. |
Seonghyeon Lee; Dongha Lee; Seongbo Jang; Hwanjo Yu; |
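Entry 412 casts sentence distance as a transportation problem over contextualized token embeddings. A generic entropic-OT (Sinkhorn) sketch of that idea follows; RCMD's exact cost and weighting scheme differ, so treat this as background rather than the authors' method.

```python
import numpy as np

def sinkhorn(C, a, b, reg=0.1, iters=200):
    # entropic-regularised optimal transport between token weight vectors a, b
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]  # transport plan

def token_ot_distance(X, Y):
    # X, Y: contextualised token embeddings of the two sentences; cost = cosine distance
    C = 1 - (X @ Y.T) / (np.linalg.norm(X, axis=1)[:, None] * np.linalg.norm(Y, axis=1)[None, :])
    a = np.full(len(X), 1 / len(X))
    b = np.full(len(Y), 1 / len(Y))
    P = sinkhorn(C, a, b)
    return float((P * C).sum())  # weighted sum of aligned token distances

X = np.random.default_rng(0).normal(size=(5, 16))
Y = np.random.default_rng(1).normal(size=(7, 16))
print(token_ot_distance(X, Y))
```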
413 | Pre-training and Fine-tuning Neural Topic Model: A Simple Yet Effective Approach to Incorporating External Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel strategy to incorporate external knowledge into neural topic modeling where the neural topic model is pre-trained on a large corpus and then fine-tuned on the target dataset. |
Linhai Zhang; Xuemeng Hu; Boyu Wang; Deyu Zhou; Qian-Wen Zhang; Yunbo Cao; |
414 | Multi-View Document Representation Learning for Open-Domain Dense Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A single vector representation of a document is therefore hard to match with multi-view queries and faces a semantic mismatch problem. This paper proposes a multi-view document representation learning framework, aiming to produce multi-view embeddings to represent documents and enforce them to align with different queries. |
Shunyu Zhang; Yaobo Liang; Ming Gong; Daxin Jiang; Nan Duan; |
415 | Graph Pre-training for AMR Parsing and Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, PLMs are typically pre-trained on textual data and are thus sub-optimal for modeling structural knowledge. To this end, we investigate graph self-supervised training to improve the structure awareness of PLMs over AMR graphs. |
Xuefeng Bai; Yulong Chen; Yue Zhang; |
416 | Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning Skills Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to leverage semi-structured tables, and automatically generate at scale question-paragraph pairs, where answering the question requires reasoning over multiple facts in the paragraph. |
Ori Yoran; Alon Talmor; Jonathan Berant; |
417 | RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present RnG-KBQA, a Rank-and-Generate approach for KBQA, which remedies the coverage issue with a generation model while preserving a strong generalization capability. |
Xi Ye; Semih Yavuz; Kazuma Hashimoto; Yingbo Zhou; Caiming Xiong; |
418 | Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We instead use a basic model architecture and show significant improvements over the state of the art within the same training regime. We then design a harder self-supervision objective by increasing the ratio of negative samples within a contrastive learning setup, and enhance the model further through automatic hard negative mining coupled with a large global negative queue encoded by a momentum encoder. |
Prathyusha Jwalapuram; Shafiq Joty; Xiang Lin; |
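The "large global negative queue encoded by a momentum encoder" in entry 418 is a MoCo-style contrastive setup. Below is a minimal InfoNCE sketch with an external negative queue, under assumed tensor shapes rather than the paper's architecture.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, queue, tau=0.07):
    # InfoNCE with an enlarged negative set drawn from a momentum-encoded queue;
    # anchor/positive: (B, D), queue: (K, D)
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    queue = F.normalize(queue, dim=-1)
    pos = (anchor * positive).sum(-1, keepdim=True) / tau  # (B, 1)
    neg = anchor @ queue.T / tau                           # (B, K)
    logits = torch.cat([pos, neg], dim=1)
    labels = torch.zeros(len(anchor), dtype=torch.long)    # positive is column 0
    return F.cross_entropy(logits, labels)

B, K, D = 8, 128, 64
print(contrastive_loss(torch.randn(B, D), torch.randn(B, D), torch.randn(K, D)))
# in training, the queue would be refreshed with momentum-encoder outputs
# and truncated to its last K entries after each step
```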
419 | Just Rank: Rethinking Evaluation with Word and Sentence Similarities Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper first points out the problems using semantic similarity as the gold standard for word and sentence embedding evaluations. Further, we propose a new intrinsic evaluation method called EvalRank, which shows a much stronger correlation with downstream tasks. |
Bin Wang; C.-c. Kuo; Haizhou Li; |
420 | MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose MarkupLM for document understanding tasks with markup languages as the backbone, such as HTML/XML-based documents, where text and markup information is jointly pre-trained. |
Junlong Li; Yiheng Xu; Lei Cui; Furu Wei; |
421 | CLIP Models Are Few-Shot Learners: Empirical Studies on VQA and Visual Entailment Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we empirically show that CLIP can be a strong vision-language few-shot learner by leveraging the power of language. |
Haoyu Song; Li Dong; Weinan Zhang; Ting Liu; Furu Wei; |
422 | KQA Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering Over Knowledge Base Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing benchmarks have some shortcomings that limit the development of Complex KBQA: 1) they only provide QA pairs without explicit reasoning processes; 2) questions are poor in diversity or scale. To this end, we introduce KQA Pro, a dataset for Complex KBQA including around 120K diverse natural language questions. |
Shulin Cao; Jiaxin Shi; Liangming Pan; Lunyiu Nie; Yutong Xiang; Lei Hou; Juanzi Li; Bin He; Hanwang Zhang; |
423 | Debiased Contrastive Learning of Unsupervised Sentence Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Such an approach can introduce sampling bias, where improper negatives (false negatives and anisotropic representations) are used to learn sentence representations, which hurts the uniformity of the representation space. To address this, we present a new framework, DCLR (Debiased Contrastive Learning of unsupervised sentence Representations), to alleviate the influence of these improper negatives. |
Kun Zhou; Beichen Zhang; Xin Zhao; Ji-Rong Wen; |
424 | MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Multi-Stage Prompting, a simple and automatic approach for applying pre-trained language models to translation tasks. |
Zhixing Tan; Xiangwen Zhang; Shuo Wang; Yang Liu; |
425 | SalesBot: Transitioning from Chit-Chat to Task-Oriented Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Hence, this paper focuses on investigating conversations that start from open-domain social chatting and then gradually transition to task-oriented purposes, and releases a large-scale dataset with detailed annotations to encourage this research direction. To achieve this goal, this paper proposes a framework to automatically generate many dialogues without human involvement, in which any powerful open-domain dialogue generation model can be easily leveraged. |
Ssu Chiu; Maolin Li; Yen-Ting Lin; Yun-Nung Chen; |
426 | UCTopic: Unsupervised Contrastive Learning for Phrase Representations and Topic Mining Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose UCTopic, a novel unsupervised contrastive learning framework for context-aware phrase representations and topic mining. |
Jiacheng Li; Jingbo Shang; Julian McAuley; |
427 | XLM-E: Cross-lingual Language Model Pre-training Via ELECTRA Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. |
Zewen Chi; Shaohan Huang; Li Dong; Shuming Ma; Bo Zheng; Saksham Singhal; Payal Bajaj; Xia Song; Xian-Ling Mao; Heyan Huang; Furu Wei; |
428 | Nested Named Entity Recognition As Latent Lexicalized Constituency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we resort to more expressive structures, lexicalized constituency trees in which constituents are annotated by headwords, to model nested entities. |
Chao Lou; Songlin Yang; Kewei Tu; |
429 | Can Explanations Be Useful for Calibrating Black Box Models? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study how to improve a black box model’s performance on a new domain by leveraging explanations of the model’s behavior. |
Xi Ye; Greg Durrett; |
430 | OIE@OIA: An Adaptable and Efficient Open Information Extraction Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper discusses the adaptability problem in existing OIE systems and designs a new adaptable and efficient OIE system – OIE@OIA as a solution. |
Xin Wang; Minlong Peng; Mingming Sun; Ping Li; |
431 | ReACC: A Retrieval-Augmented Code Completion Framework Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. |
Shuai Lu; Nan Duan; Hojae Han; Daya Guo; Seung-won Hwang; Alexey Svyatkovskiy; |
432 | Does Recommend-Revise Produce Reliable Annotations? An Analysis on Missing Instances in DocRED Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Furthermore, we observe that the models trained on DocRED have low recall on our relabeled dataset and inherit the same bias in the training data. Through the analysis of annotators’ behaviors, we figure out the underlying reason for the problems above: the scheme actually discourages annotators from supplementing adequate instances in the revision phase. |
Quzhe Huang; Shibo Hao; Yuan Ye; Shengqi Zhu; Yansong Feng; Dongyan Zhao; |
433 | UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In light of model diversity and the difficulty of model selection, we propose a unified framework, UniPELT, which incorporates different PELT methods as submodules and learns to activate the ones that best suit the current data or task setup via a gating mechanism. |
Yuning Mao; Lambert Mathias; Rui Hou; Amjad Almahairi; Hao Ma; Jiawei Han; Scott Yih; Madian Khabsa; |
434 | An Empirical Study of Memorization in NLP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we use three different NLP tasks to check if the long-tail theory holds. |
Xiaosen Zheng; Jing Jiang; |
435 | AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, prior work evaluating performance on unseen languages has largely been limited to low-level, syntactic tasks, and it remains unclear if zero-shot learning of high-level, semantic tasks is possible for unseen languages. To explore this question, we present AmericasNLI, an extension of XNLI (Conneau et al., 2018) to 10 Indigenous languages of the Americas. |
Abteen Ebrahimi; Manuel Mager; Arturo Oncevay; Vishrav Chaudhary; Luis Chiruzzo; Angela Fan; John Ortega; Ricardo Ramos; Annette Rios; Ivan Vladimir Meza Ruiz; Gustavo Giménez-Lugo; Elisabeth Mager; Graham Neubig; Alexis Palmer; Rolando Coto-Solano; Thang Vu; Katharina Kann; |
436 | Towards Learning (Dis)-Similarity of Source Code from Program Contrasts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present DISCO (DIS-similarity of COde), a novel self-supervised model focusing on identifying (dis)similar functionalities of source code. |
Yangruibo Ding; Luca Buratti; Saurabh Pujar; Alessandro Morari; Baishakhi Ray; Saikat Chakraborty; |
437 | Guided Attention Multimodal Multitask Financial Forecasting with Inter-Company Relationships and Global and Local News Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a model that captures both global and local multimodal information for investment and risk management-related forecasting tasks. |
Gary Ang; Ee-Peng Lim; |
438 | On Vision Features in Multimodal Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the impact of vision models on MMT. |
Bei Li; Chuanhao Lv; Zefan Zhou; Tao Zhou; Tong Xiao; Anxiang Ma; JingBo Zhu; |
439 | CONTaiNER: Few-Shot Named Entity Recognition Via Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This affects generalizability to unseen target domains, resulting in suboptimal performances. To this end, we present CONTaiNER, a novel contrastive learning technique that optimizes the inter-token distribution distance for Few-Shot NER. |
Sarkar Snigdha Sarathi Das; Arzoo Katiyar; Rebecca Passonneau; Rui Zhang; |
440 | Cree Corpus: A Collection of Nehiyawewin Resources Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To support nehiyawewin revitalization and preservation, we developed a corpus covering diverse genres, time periods, and texts for a variety of intended audiences. |
Daniela Teodorescu; Josie Matalski; Delaney Lothian; Denilson Barbosa; Carrie Demmans Epp; |
441 | Learning to Rank Visual Stories From Human Ranking Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present the VHED (VIST Human Evaluation Data) dataset, which first re-purposes human evaluation results for automatic evaluation; hence we develop Vrank (VIST Ranker), a novel reference-free VIST metric for story evaluation. |
Chi-Yang Hsu; Yun-Wei Chu; Vincent Chen; Kuan-Chieh Lo; Chacha Chen; Ting-Hao Huang; Lun-Wei Ku; |
442 | Universal Conditional Masked Language Pre-training for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose CeMAT, a conditional masked language model pre-trained on large-scale bilingual and monolingual corpora in many languages. |
Pengfei Li; Liangyou Li; Meng Zhang; Minghao Wu; Qun Liu; |
443 | CARETS: A Consistency And Robustness Evaluative Test Suite for VQA Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce CARETS, a systematic test suite to measure consistency and robustness of modern VQA models through a series of six fine-grained capability tests. |
Carlos Jimenez; Olga Russakovsky; Karthik Narasimhan; |
444 | Phrase-aware Unsupervised Constituency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we revisit LM-based constituency parsing from a phrase-centered perspective. |
Xiaotao Gu; Yikang Shen; Jiaming Shen; Jingbo Shang; Jiawei Han; |
445 | Achieving Reliable Human Assessment of Open-Domain Dialogue Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Answering the distress call of competitions that have emphasized the urgent need for better evaluation techniques in dialogue, we present the successful development of human evaluation that is highly reliable while still remaining feasible and low cost. |
Tianbo Ji; Yvette Graham; Gareth Jones; Chenyang Lyu; Qun Liu; |
446 | Updated Headline Generation: Creating Updated Summaries for Evolving News Stories Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the task of updated headline generation, in which a system generates a headline for an updated article, considering both the previous article and headline. |
Sheena Panthaplackel; Adrian Benton; Mark Dredze; |
447 | SaFeRDialogues: Taking Feedback Gracefully After Conversational Safety Failures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proposes SaFeRDialogues, a task and dataset of graceful responses to conversational feedback about safety failures. We collect a dataset of 8k dialogues demonstrating safety failures, feedback signaling them, and a response acknowledging the feedback. |
Megan Ung; Jing Xu; Y-Lan Boureau; |
448 | Compositional Generalization in Dependency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a gold-standard set of dependency parses for CFQ, and use this to analyze the behaviour of a state-of-the art dependency parser (Qi et al., 2020) on the CFQ dataset. |
Emily Goodwin; Siva Reddy; Timothy O’Donnell; Dzmitry Bahdanau; |
449 | ASPECTNEWS: Aspect-Oriented Summarization of News Documents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we collect a dataset of realistic aspect-oriented summaries, AspectNews, which covers different subtopics about articles in news sub-domains. |
Ojas Ahuja; Jiacheng Xu; Akshay Gupta; Kevin Horecka; Greg Durrett; |
450 | MemSum: Extractive Summarization of Long Documents Using Multi-Step Episodic Markov Decision Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce MemSum (Multi-step Episodic Markov decision process extractive SUMmarizer), a reinforcement-learning-based extractive summarizer enriched at each step with information on the current extraction history. |
Nianlong Gu; Elliott Ash; Richard Hahnloser; |
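Entry 450 frames extraction as a multi-step episodic MDP in which each step conditions on the extraction history. Here is a toy sketch of such a loop; score_fn stands in for MemSum's learned policy, and the fixed threshold is a placeholder for its learned stop action.

```python
def extract(sentences, score_fn, max_steps=5, stop_threshold=0.5):
    # stepwise extraction: each step conditions on the extraction history
    history, remaining = [], list(range(len(sentences)))
    while remaining and len(history) < max_steps:
        scores = {i: score_fn(sentences, i, history) for i in remaining}
        best = max(scores, key=scores.get)
        if scores[best] < stop_threshold:  # stand-in for a learned stop action
            break
        history.append(best)
        remaining.remove(best)
    return [sentences[i] for i in sorted(history)]

def toy_score(sents, i, hist):
    # illustrative only: prefer long sentences not redundant with the history
    overlap = max((len(set(sents[i].split()) & set(sents[j].split())) for j in hist),
                  default=0)
    return len(sents[i].split()) / 20 - overlap / 5

sents = ["the cat sat on the mat", "a dog barked loudly at the cat",
         "stock markets rallied on tuesday", "markets rallied again on wednesday"]
print(extract(sents, toy_score, stop_threshold=0.0))
```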
451 | CLUES: A Benchmark for Learning Classifiers Using Natural Language Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we explore training zero-shot classifiers for structured data purely from language. For this, we introduce CLUES, a benchmark for Classifier Learning Using natural language ExplanationS, consisting of a range of classification tasks over structured data along with natural language supervision in the form of explanations. |
Rakesh Menon; Sayan Ghosh; Shashank Srivastava; |
452 | Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present substructure distribution projection (SubDP), a technique that projects a distribution over structures in one domain to another, by projecting substructure distributions separately. |
Freda Shi; Kevin Gimpel; Karen Livescu; |
453 | Multilingual Detection of Personal Employment Status on Twitter Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we examine three Active Learning (AL) strategies in real-world settings of extreme class imbalance, and identify five types of disclosures about individuals’ employment status (e.g. job loss) in three languages using BERT-based classification models. |
Manuel Tonneau; Dhaval Adjodah; Joao Palotti; Nir Grinberg; Samuel Fraiberger; |
454 | MultiHiertt: Numerical Reasoning Over Multi Hierarchical Tabular and Textual Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To facilitate data analytical progress, we construct a new large-scale benchmark, MultiHiertt, with QA pairs over Multi Hierarchical Tabular and Textual data. |
Yilun Zhao; Yunxiang Li; Chenying Li; Rui Zhang; |
455 | Transformers in The Loop: Polarity in Neural Models of Language Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Representation of linguistic phenomena in computational language models is typically assessed against the predictions of existing linguistic theories of these phenomena. Using the notion of polarity as a case study, we show that this is not always the most adequate set-up. |
Lisa Bylinina; Alexey Tikhonov; |
456 | Bridging The Data Gap Between Training and Inference for Unsupervised Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To narrow the data gap, we propose an online self-training approach, which simultaneously uses the pseudo parallel data {natural source, translated target} to mimic the inference scenario. |
Zhiwei He; Xing Wang; Rui Wang; Shuming Shi; Zhaopeng Tu; |
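The online self-training loop in entry 456 can be outlined schematically. In the sketch below, translate and train_step are placeholders for any UNMT system's decoding and update routines; only the data flow, training on {natural source, translated target} pairs, follows the highlight.

```python
def online_self_training_step(model, mono_source_batch, translate, train_step):
    # generate pseudo-parallel pairs {natural source, translated target} with the
    # current model, then immediately train on them to mimic the inference scenario
    pairs = [(src, translate(model, src)) for src in mono_source_batch]
    train_step(model, pairs)
    return pairs
```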
457 | SDR: Efficient Neural Re-ranking Using Succinct Document Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose the Succinct Document Representation (SDR) scheme that computes highly compressed intermediate document representations, mitigating the storage/network issue. |
Nachshon Cohen; Amit Portnoy; Besnik Fetahu; Amir Ingber; |
458 | The AI Doctor Is In: A Survey of Task-Oriented Dialogue Systems for Healthcare Applications Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To fill this gap, we investigated an initial pool of 4070 papers from well-known computer science, natural language processing, and artificial intelligence venues, identifying 70 papers discussing the system-level implementation of task-oriented dialogue systems for healthcare applications. We conducted a comprehensive technical review of these papers, and present our key findings including identified gaps and corresponding recommendations. |
Mina Valizadeh; Natalie Parde; |
459 | SHIELD: Defending Textual Neural Networks Against Multiple Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In particular, the state-of-the-art transformer models (e.g., BERT, RoBERTa) require considerable time and computational resources. Borrowing an idea from software engineering to address these limitations, we propose a novel algorithm, SHIELD, which modifies and re-trains only the last layer of a textual NN, thus patching and transforming it into a stochastic weighted ensemble of multi-expert prediction heads. |
Thai Le; Noseong Park; Dongwon Lee; |
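The "stochastic weighted ensemble of multi-expert prediction heads" in entry 459 suggests a last-layer module along the lines of the following sketch; the number of experts and the uniform-random mixing are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class StochasticMultiExpertHead(nn.Module):
    # several prediction heads; each forward pass mixes them with random weights,
    # so a black-box attacker never queries the same fixed decision boundary
    def __init__(self, dim, n_classes, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, n_classes) for _ in range(n_experts))

    def forward(self, h):
        w = torch.rand(len(self.experts), device=h.device)
        w = w / w.sum()
        return sum(wi * expert(h) for wi, expert in zip(w, self.experts))

head = StochasticMultiExpertHead(dim=768, n_classes=2)
print(head(torch.randn(4, 768)).shape)
```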
460 | Accurate Online Posterior Alignments for Principled Lexically-Constrained Decoding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel posterior alignment technique that is truly online in its execution and superior in terms of alignment error rates compared to existing methods. |
Soumya Chatterjee; Sunita Sarawagi; Preethi Jyothi; |
461 | Leveraging Task Transferability to Meta-learning for Clinical Section Classification with Limited Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The present paper proposes an algorithmic way to improve the task transferability of meta-learning-based text classification in order to address the issue of low-resource target data. |
Zhuohao Chen; Jangwon Kim; Ram Bhakta; Mustafa Sir; |
462 | Reinforcement Guided Multi-Task Learning Framework for Low-Resource Stereotype Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we annotate a focused evaluation set for ‘Stereotype Detection’ that addresses those pitfalls by de-constructing various ways in which stereotypes manifest in text. |
Rajkumar Pujari; Erik Oveson; Priyanka Kulkarni; Elnaz Nouri; |
463 | Letters From The Past: Modeling Historical Sound Change Through Diachronic Character Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we address the detection of sound change through historical spelling. |
Sidsel Boldsen; Patrizia Paggio; |
464 | A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, ground-truth references may not be readily available for many free-form text generation applications, and sentence- or document-level detection may fail to provide the fine-grained signals that would prevent fallacious content in real time. As a first step to addressing these issues, we propose a novel token-level, reference-free hallucination detection task and an associated annotated dataset named HaDeS (HAllucination DEtection dataSet). |
Tianyu Liu; Yizhe Zhang; Chris Brockett; Yi Mao; Zhifang Sui; Weizhu Chen; Bill Dolan; |
465 | Low-Rank Softmax Can Have Unargmaxable Classes in Theory But Rarely in Practice Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In theory, the result is that some words may be impossible to predict via argmax, irrespective of input features, and empirically, there is evidence that this happens in small language models (Demeter et al., 2020). In this paper we ask whether it can happen in practical large language models and translation models. |
Andreas Grivas; Nikolay Bogoychev; Adam Lopez; |
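The argmaxability question in entry 465 can be phrased as a linear-program feasibility check: a class is predictable via argmax iff some input strictly separates its row of the softmax weight matrix from all others. A small sketch of that check (not the paper's exact algorithm) using scipy:

```python
import numpy as np
from scipy.optimize import linprog

def argmaxable(W, cls, eps=1e-9):
    # class `cls` is argmaxable iff some input x makes it win the argmax:
    # maximise t subject to (w_j - w_cls) @ x + t <= 0 for all j != cls, t <= 1
    d = W.shape[1]
    others = np.delete(W, cls, axis=0)
    A = np.hstack([others - W[cls], np.ones((len(others), 1))])
    res = linprog(c=[0.0] * d + [-1.0], A_ub=A, b_ub=np.zeros(len(others)),
                  bounds=[(None, None)] * d + [(None, 1.0)])
    return res.status == 0 and -res.fun > eps

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 2)) @ rng.normal(size=(2, 8))  # rank-2 bottleneck, 10 classes
print([argmaxable(W, i) for i in range(10)])
```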
466 | Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an effective yet efficient model PAIE for both sentence-level and document-level Event Argument Extraction (EAE), which also generalizes well when there is a lack of training data. |
Yubo Ma; Zehao Wang; Yixin Cao; Mukai Li; Meiqi Chen; Kun Wang; Jing Shao; |
467 | Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first analyze the phenomenon of position bias in SiMT, and develop a Length-Aware Framework to reduce the position bias by bridging the structural gap between SiMT and full-sentence MT. |
Shaolei Zhang; Yang Feng; |
468 | A Statutory Article Retrieval Dataset in French Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While recent advances in natural language processing have sparked considerable interest in many legal tasks, statutory article retrieval remains primarily untouched due to the scarcity of large-scale and high-quality annotated datasets. To address this bottleneck, we introduce the Belgian Statutory Article Retrieval Dataset (BSARD), which consists of 1,100+ French native legal questions labeled by experienced jurists with relevant articles from a corpus of 22,600+ Belgian law articles. |
Antoine Louis; Gerasimos Spanakis; |
469 | ParaDetox: Detoxification with Parallel Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a novel pipeline for the collection of parallel data for the detoxification task. |
Varvara Logacheva; Daryna Dementieva; Sergey Ustyantsev; Daniil Moskovskiy; David Dale; Irina Krotova; Nikita Semenov; Alexander Panchenko; |
470 | Interpreting Character Embeddings With Perceptual Representations: The Case of Shape, Sound, and Color Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We leverage perceptual representations in the form of shape, sound, and color embeddings and perform a representational similarity analysis to evaluate their correlation with textual representations in five languages. |
Sidsel Boldsen; Manex Agirrezabal; Nora Hollenstein; |
471 | Fine-Grained Controllable Text Generation Using Non-Residual Prompting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To alleviate this trade-off, we propose an encoder-decoder architecture that enables intermediate text prompts at arbitrary time steps. We propose a resource-efficient method for converting a pre-trained CLM into this architecture, and demonstrate its potential on various experiments, including the novel task of contextualized word inclusion. |
Fredrik Carlsson; Joey Öhman; Fangyu Liu; Severine Verlinden; Joakim Nivre; Magnus Sahlgren; |
472 | Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we use embeddings derived from articulatory vectors rather than embeddings derived from phoneme identities to learn phoneme representations that hold across languages. |
Florian Lux; Thang Vu; |
473 | TwittIrish: A Universal Dependencies Treebank of Tweets in Modern Irish Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we explore the differences between Irish tweets and standard Irish text, and the challenges associated with dependency parsing of Irish tweets. |
Lauren Cassidy; Teresa Lynn; James Barry; Jennifer Foster; |
474 | Length Control in Abstractive Summarization By Pretraining Information Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a length-aware attention mechanism (LAAM) to adapt the encoding of the source based on the desired length. |
Yizhu Liu; Qi Jia; Kenny Zhu; |
475 | CQG: A Simple and Effective Controlled Generation Framework for Multi-hop Question Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, most models cannot ensure the complexity of the generated questions, so they may generate shallow questions that can be answered without multi-hop reasoning. To address this challenge, we propose CQG, a simple and effective controlled generation framework. |
Zichu Fei; Qi Zhang; Tao Gui; Di Liang; Sirui Wang; Wei Wu; Xuanjing Huang; |
476 | Word Order Does Matter and Shuffled Language Models Know It Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We probe these language models for word order information and investigate what position embeddings learned from shuffled text encode, showing that these models retain a notion of word order information. We show this is in part due to a subtlety in how shuffling is implemented in previous work – before rather than after subword segmentation. |
Mostafa Abdou; Vinit Ravishankar; Artur Kulmizev; Anders Søgaard; |
477 | An Empirical Study on Explanations in Out-of-Domain Settings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we conduct an extensive empirical study that examines: (1) the out-of-domain faithfulness of post-hoc explanations, generated by five feature attribution methods; and (2) the out-of-domain performance of two inherently faithful models over six datasets. |
George Chrysostomou; Nikolaos Aletras; |
478 | MILIE: Modular & Iterative Multilingual Open Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast, we explore the hypothesis that it may be beneficial to extract triple slots iteratively: first extract easy slots, followed by the difficult ones by conditioning on the easy slots, and therefore achieve a better overall extraction. Based on this hypothesis, we propose a neural OpenIE system, MILIE, that operates in an iterative fashion. |
Bhushan Kotnis; Kiril Gashteovski; Daniel Rubio; Ammar Shaker; Vanesa Rodriguez-Tembras; Makoto Takamoto; Mathias Niepert; Carolin Lawrence; |
479 | What Makes Reading Comprehension Questions Difficult? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we crowdsource multiple-choice reading comprehension questions for passages taken from seven qualitatively distinct sources, analyzing what attributes of passages contribute to the difficulty and question types of the collected examples. |
Saku Sugawara; Nikita Nangia; Alex Warstadt; Samuel Bowman; |
480 | From Simultaneous to Streaming Machine Translation By Leveraging Streaming History Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work proposes a stream-level adaptation of the current latency measures based on a re-segmentation approach applied to the output translation, which is successfully evaluated on streaming conditions for a reference IWSLT task. |
Javier Iranzo Sanchez; Jorge Civera; Alfons Juan-Císcar; |
481 | A Rationale-Centric Framework for Human-in-the-loop Machine Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a novel rationale-centric framework with human-in-the-loop – Rationales-centric Double-robustness Learning (RDL) – to boost model out-of-distribution performance in few-shot learning scenarios. |
Jinghui Lu; Linyi Yang; Brian Namee; Yue Zhang; |
482 | Challenges and Strategies in Cross-Cultural NLP Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Analogous to cross-lingual and multilingual NLP, cross-cultural and multicultural NLP considers these differences in order to better serve users of NLP systems. We propose a principled framework to frame these efforts, and survey existing and potential strategies. |
Daniel Hershcovich; Stella Frank; Heather Lent; Miryam de Lhoneux; Mostafa Abdou; Stephanie Brandl; Emanuele Bugliarello; Laura Cabello Piqueras; Ilias Chalkidis; Ruixiang Cui; Constanza Fierro; Katerina Margatina; Phillip Rust; Anders Søgaard; |
483 | Prototypical Verbalizer for Prompt-based Few-shot Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose the prototypical verbalizer (ProtoVerb) which is built directly from training data. Specifically, ProtoVerb learns prototype vectors as verbalizers by contrastive learning. |
Ganqu Cui; Shengding Hu; Ning Ding; Longtao Huang; Zhiyuan Liu; |
484 | Clickbait Spoiling Via Question Answering and Passage Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce and study the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post. |
Matthias Hagen; Maik Fröbe; Artur Jurk; Martin Potthast; |
485 | BERT Learns to Teach: Knowledge Distillation with Meta Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Knowledge Distillation with Meta Learning (MetaDistil), a simple yet effective alternative to traditional knowledge distillation (KD) methods where the teacher model is fixed during training. |
Wangchunshu Zhou; Canwen Xu; Julian McAuley; |
486 | STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing techniques often attempt to transfer powerful machine translation (MT) capabilities to ST, but neglect the representation discrepancy across modalities. In this paper, we propose the Speech-TExt Manifold Mixup (STEMM) method to calibrate such discrepancy. |
Qingkai Fang; Rong Ye; Lei Li; Yang Feng; Mingxuan Wang; |
487 | Integrating Vectorized Lexical Constraints for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Due to the representation gap between discrete constraints and continuous vectors in NMT models, most existing works choose to construct synthetic data or modify the decoding algorithm to impose lexical constraints, treating the NMT model as a black box. In this work, we propose to open this black box by directly integrating the constraints into NMT models. |
Shuo Wang; Zhixing Tan; Yang Liu; |
488 | MPII: Multi-Level Mutual Promotion for Inference and Interpretation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a multi-level Mutual Promotion mechanism for self-evolved Inference and sentence-level Interpretation (MPII). |
Yan Liu; Sanyuan Chen; Yazheng Yang; Qi Dai; |
489 | StableMoE: Stable Routing Strategy for Mixture of Experts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose StableMoE with two training stages to address the routing fluctuation problem. |
Damai Dai; Li Dong; Shuming Ma; Bo Zheng; Zhifang Sui; Baobao Chang; Furu Wei; |
490 | Boundary Smoothing for Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by label smoothing and driven by the ambiguity of boundary annotation in NER engineering, we propose boundary smoothing as a regularization technique for span-based neural NER models. |
Enwei Zhu; Jinpeng Li; |
491 | Incorporating Hierarchy Into Text Encoder: A Contrastive Learning Approach for Hierarchical Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing methods encode text and label hierarchy separately and mix their representations for classification, where the hierarchy remains unchanged for all input text. Instead of modeling them separately, in this work, we propose Hierarchy-guided Contrastive Learning (HGCLR) to directly embed the hierarchy into a text encoder. |
Zihan Wang; Peiyi Wang; Lianzhe Huang; Xin Sun; Houfeng Wang; |
492 | Signal in Noise: Exploring Meaning Encoded in Random Character Sequences with Character-Aware Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose that n-grams composed of random character sequences, or garble, provide a novel context for studying word meaning both within and beyond extant language. |
Mark Chu; Bhargav Srinivasa Desikan; Ethan Nadler; Donald Ruggiero Lo Sardo; Elise Darragh-Ford; Douglas Guilbeault; |
493 | Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, there still remains a large discrepancy between the provided upstream signals and the downstream question-passage relevance, which limits the improvement. To bridge this gap, we propose HyperLink-induced Pre-training (HLP), a method to pre-train the dense retriever with the text relevance induced by hyperlink-based topology within Web documents. |
Jiawei Zhou; Xiaoguang Li; Lifeng Shang; Lan Luo; Ke Zhan; Enrui Hu; Xinyu Zhang; Hao Jiang; Zhao Cao; Fan Yu; Xin Jiang; Qun Liu; Lei Chen; |
494 | AdaLoGN: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To meet the challenge, we present a neural-symbolic approach which, to predict an answer, passes messages over a graph representing logical relations between text units. |
Xiao Li; Gong Cheng; Ziheng Chen; Yawei Sun; Yuzhong Qu; |
495 | CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To retain ensemble benefits while maintaining a low memory cost, we propose a consistency-regularized ensemble learning approach based on perturbed models, named CAMERO. |
Chen Liang; Pengcheng He; Yelong Shen; Weizhu Chen; Tuo Zhao; |
496 | Interpretability for Language Learners Using Example-Based Grammatical Error Correction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we introduce an Example-Based GEC (EB-GEC) that presents examples to language learners as a basis for a correction result. |
Masahiro Kaneko; Sho Takase; Ayana Niwa; Naoaki Okazaki; |
497 | Rethinking Negative Sampling for Handling Missing Entity Annotations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: One of our contributions is an analysis of why negative sampling works, introducing two insightful concepts: missampling and uncertainty. |
Yangming Li; Lemao Liu; Shuming Shi; |
498 | Distantly Supervised Named Entity Recognition Via Confidence-Based Multi-Class Positive and Unlabeled Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the named entity recognition (NER) problem under distant supervision. |
Kang Zhou; Yuepei Li; Qi Li; |
499 | UniXcoder: Unified Cross-Modal Pre-training for Code Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present UniXcoder, a unified cross-modal pre-trained model for programming language. |
Daya Guo; Shuai Lu; Nan Duan; Yanlin Wang; Ming Zhou; Jian Yin; |
500 | One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Focusing on the languages spoken in Indonesia, the second most linguistically diverse and the fourth most populous nation of the world, we provide an overview of the current state of NLP research for Indonesia’s 700+ languages. We highlight challenges in Indonesian NLP and how these affect the performance of current NLP systems. |
Alham Aji; Genta Indra Winata; Fajri Koto; Samuel Cahyawijaya; Ade Romadhony; Rahmad Mahendra; Kemal Kurniawan; David Moeljadi; Radityo Eko Prasojo; Timothy Baldwin; Jey Han Lau; Sebastian Ruder; |
501 | Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As errors in machine generations become ever subtler and harder to spot, they pose a new challenge to the research community for robust machine text evaluation. We propose a new framework called Scarecrow for scrutinizing machine text via crowd annotation. |
Yao Dou; Maxwell Forbes; Rik Koncel-Kedziorski; Noah Smith; Yejin Choi; |
502 | Transkimmer: Transformer Learns to Layer-wise Skim Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they lack effective, end-to-end optimization of the discrete skimming predictor. To address the above limitations, we propose the Transkimmer architecture, which learns to identify hidden state tokens that are not required by each layer. |
Yue Guan; Zhengyi Li; Jingwen Leng; Zhouhan Lin; Minyi Guo; |
503 | SkipBERT: Efficient Inference with Shallow Layer Skipping Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose SkipBERT to accelerate BERT inference by skipping the computation of shallow layers. |
Jue Wang; Ke Chen; Gang Chen; Lidan Shou; Julian McAuley; |
504 | Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We investigate what kind of structural knowledge learned in neural network encoders is transferable to processing natural language. |
Ryokan Ri; Yoshimasa Tsuruoka; |
505 | mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we explore the effectiveness of leveraging entity representations for downstream cross-lingual tasks. |
Ryokan Ri; Ikuya Yamada; Yoshimasa Tsuruoka; |
506 | Evaluating Factuality in Text Simplification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a taxonomy of errors that we use to analyze both references drawn from standard simplification datasets and state-of-the-art model outputs. |
Ashwin Devaraj; William Sheffield; Byron Wallace; Junyi Jessy Li; |
507 | Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper describes the motivation and development of speech synthesis systems for the purposes of language revitalization. |
Aidan Pine; Dan Wells; Nathan Brinklow; Patrick Littell; Korin Richmond; |
508 | Sharpness-Aware Minimization Improves Language Model Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show that Sharpness-Aware Minimization (SAM), a recently proposed optimization procedure that encourages convergence to flatter minima, can substantially improve the generalization of language models without much computational overhead. |
Dara Bahri; Hossein Mobahi; Yi Tay; |
509 | Adversarial Authorship Attribution for Deobfuscation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, they are not evaluated against adversarially trained authorship attributors that are aware of potential obfuscation. To fill this gap, we investigate the problem of adversarial authorship attribution for deobfuscation. |
Wanyue Zhai; Jonathan Rusert; Zubair Shafiq; Padmini Srinivasan; |
510 | Weakly Supervised Word Segmentation for Computational Language Documentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper studies how such a weak supervision can be taken advantage of in Bayesian non-parametric models of segmentation. |
Shu Okabe; Laurent Besacier; François Yvon; |
511 | SciNLI: A Corpus for Natural Language Inference on Scientific Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce SciNLI, a large dataset for NLI that captures the formality in scientific text and contains 107,412 sentence pairs extracted from scholarly papers on NLP and computational linguistics. |
Mobashir Sadat; Cornelia Caragea; |
512 | Neural Reality of Argument Structure Constructions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we adapt several psycholinguistic studies to probe for the existence of argument structure constructions (ASCs) in Transformer-based language models (LMs). |
Bai Li; Zining Zhu; Guillaume Thomas; Frank Rudzicz; Yang Xu; |
513 | On The Robustness of Offensive Language Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Prior work in this space is limited to studying robustness of offensive language classifiers against primitive attacks such as misspellings and extraneous spaces. To address this gap, we systematically analyze the robustness of state-of-the-art offensive language classifiers against more crafty adversarial attacks that leverage greedy- and attention-based word selection and context-aware embeddings for word replacement. |
Jonathan Rusert; Zubair Shafiq; Padmini Srinivasan; |
514 | Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we study a relevant low-resource setting: style transfer for languages where no style-labelled corpora are available. |
Kalpesh Krishna; Deepak Nathani; Xavier Garcia; Bidisha Samanta; Partha Talukdar; |
515 | ABC: Attention with Bounded-memory Control Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: One way to improve the efficiency is to bound the memory size. We show that disparate approaches can be subsumed into one abstraction, attention with bounded-memory control (ABC), and they vary in their organization of the memory. |
Hao Peng; Jungo Kasai; Nikolaos Pappas; Dani Yogatama; Zhaofeng Wu; Lingpeng Kong; Roy Schwartz; Noah Smith; |
516 | The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It also limits our ability to prepare for the potentially enormous impacts of more distant future advances. This paper urges researchers to be careful about these claims and suggests some research directions and communication strategies that will make it easier to avoid or rebut them. |
Samuel Bowman; |
517 | RELiC: Retrieving Evidence for Literary Claims Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We collect a large-scale dataset (RELiC) of 78K literary quotations and surrounding critical analysis and use it to formulate the novel task of literary evidence retrieval, in which models are given an excerpt of literary analysis surrounding a masked quotation and asked to retrieve the quoted passage from the set of all passages in the work. |
Katherine Thai; Yapei Chang; Kalpesh Krishna; Mohit Iyyer; |
518 | Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We focus on VLN in outdoor scenarios and find that in contrast to indoor VLN, most of the gain in outdoor VLN on unseen data is due to features like junction type embedding or heading delta that are specific to the respective environment graph, while image information plays a very minor role in generalizing VLN to unseen outdoor areas. |
Raphael Schumann; Stefan Riezler; |
519 | Adapting Coreference Resolution Models Through Active Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper explores how to actively label coreference, examining sources of model uncertainty and document reading costs. |
Michelle Yuan; Patrick Xia; Chandler May; Benjamin Van Durme; Jordan Boyd-Graber; |
520 | An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a framework for training non-autoregressive sequence-to-sequence models for editing tasks, where the original input sequence is iteratively edited to produce the output. |
Sweta Agrawal; Marine Carpuat; |
521 | Memorisation Versus Generalisation in Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, our experiments also show that they mainly learn from high-frequency patterns and largely fail when tested on low-resource tasks such as few-shot learning and rare entity recognition. To mitigate such limitations, we propose an extension based on prototypical networks that improves performance in low-resource named entity recognition tasks. |
Michael Tänzer; Sebastian Ruder; Marek Rei; |
522 | ChatMatch: Evaluating Chatbots By Autonomous Chat Tournaments Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In our work, we propose an interactive chatbot evaluation framework in which chatbots compete with each other like in a sports tournament, using flexible scoring metrics. |
Ruolan Yang; Zitong Li; Haifeng Tang; Kenny Zhu; |
523 | Do Self-supervised Speech Models Develop Human-like Perception Biases? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Does the same thing happen in self-supervised models? We examine the representational spaces of three kinds of state-of-the-art self-supervised models: wav2vec, HuBERT and contrastive predictive coding (CPC), and compare them with the perceptual spaces of French-speaking and English-speaking human listeners, both globally and taking into account the behavioural differences between the two language groups. |
Juliette Millet; Ewan Dunbar; |
524 | Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we review contemporary studies in the emerging field of VLN, covering tasks, evaluation metrics, methods, etc. |
Jing Gu; Eliana Stefani; Qi Wu; Jesse Thomason; Xin Wang; |
525 | Learning to Generate Programs for Table Fact Verification Via Structure-Aware Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we address the challenge by leveraging both lexical features and structure features for program generation. |
Suixin Ou; Yongmei Liu; |
526 | Cluster & Tune: Boost Cold Start Performance in Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In such cases, the common practice of fine-tuning pre-trained models, such as BERT, for a target classification task, is prone to produce poor performance. We suggest a method to boost the performance of such models by adding an intermediate unsupervised classification task, between the pre-training and fine-tuning phases. |
Eyal Shnarch; Ariel Gera; Alon Halfon; Lena Dankin; Leshem Choshen; Ranit Aharonov; Noam Slonim; |
527 | Overcoming A Theoretical Limitation of Self-Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Hahn shows that for languages where acceptance depends on a single input symbol, a transformer’s classification decisions get closer and closer to random guessing (that is, a cross-entropy of 1) as input strings get longer and longer. We examine this limitation using two languages: PARITY, the language of bit strings with an odd number of 1s, and FIRST, the language of bit strings starting with a 1. |
David Chiang; Peter Cholak; |
528 | Prediction Difference Regularization Against Perturbation for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we utilize prediction difference for ground-truth tokens to analyze the fitting of token-level samples and find that under-fitting is almost as common as over-fitting. |
Dengji Guo; Zhengrui Ma; Min Zhang; Yang Feng; |
529 | Make The Best of Cross-lingual Transfer: Evidence from POS Tagging with Over 100 Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore a more extensive transfer learning setup with 65 different source languages and 105 target languages for part-of-speech tagging. |
Wietse de Vries; Martijn Wieling; Malvina Nissim; |
530 | Should A Chatbot Be Sarcastic? Understanding User Preferences Towards Sarcasm Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we argue that we should first turn our attention to the question of when sarcasm should be generated, finding that humans consider sarcastic responses inappropriate to many input utterances. |
Silviu Vlad Oprea; Steven Wilson; Walid Magdy; |
531 | How Do Seq2Seq Models Perform on End-to-End Data-to-Text Generation? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In order to better understand the ability of Seq2Seq models, evaluate their performance and analyze the results, we choose to use Multidimensional Quality Metric (MQM) to evaluate several representative Seq2Seq models on end-to-end data-to-text generation. |
Xunjian Yin; Xiaojun Wan; |
532 | Probing for Labeled Dependency Trees Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work introduces DepProbe, a linear probe which can extract labeled and directed dependency parse trees from embeddings while using fewer parameters and compute than prior methods. |
Max Müller-Eberstein; Rob Goot; Barbara Plank; |
533 | DoCoGen: Domain Counterfactual Generation for Low Resource Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Natural language processing (NLP) algorithms have become very successful, but they still struggle when applied to out-of-distribution examples. In this paper we propose a controllable generation approach in order to deal with this domain adaptation (DA) challenge. |
Nitay Calderon; Eyal Ben-David; Amir Feder; Roi Reichart; |
534 | LiLT: A Simple Yet Effective Language-Independent Layout Transformer for Structured Document Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, most existing related models can only deal with the document data of specific language(s) (typically English) included in the pre-training collection, which is extremely limited. To address this issue, we propose a simple yet effective Language-independent Layout Transformer (LiLT) for structured document understanding. |
Jiapeng Wang; Lianwen Jin; Kai Ding; |
535 | Dependency-based Mixture Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce the Dependency-based Mixture Language Models. |
Zhixian Yang; Xiaojun Wan; |
536 | Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While pretrained Transformer-based Language Models (LM) have been shown to provide state-of-the-art results over different NLP tasks, the scarcity of manually annotated data and the highly domain-dependent nature of argumentation restrict the capabilities of such models. In this work, we propose a novel transfer learning strategy to overcome these challenges. |
Subhabrata Dutta; Jeevesh Juneja; Dipankar Das; Tanmoy Chakraborty; |
537 | Entity-based Neural Local Coherence Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an entity-based neural local coherence model which is linguistically more sound than previously proposed neural coherence models. |
Sungho Jeon; Michael Strube; |
538 | That Is A Suspicious Reaction!: Interpreting Logits Variation to Detect NLP Adversarial Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our work presents a model-agnostic detector of adversarial text examples. |
Edoardo Mosca; Shreyash Agarwal; Javier Rando Ramírez; Georg Groh; |
539 | Local Languages, Third Spaces, and Other High-Resource Scenarios Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These are often subsumed under the label of under-resourced languages even though they have distinct functions and prospects. I explore this position and propose some ecologically-aware language technology agendas. |
Steven Bird; |
540 | That Slepen Al The Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although the Chinese language has a long history, previous Chinese natural language processing research has primarily focused on tasks within a specific era. Therefore, we propose a cross-era learning framework for Chinese word segmentation (CWS), CROSSWISE, which uses the Switch-memory (SM) module to incorporate era-specific linguistic knowledge. |
Xuemei Tang; Qi Su; |
541 | Fair and Argumentative Language Modeling for Computational Argumentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we address this research gap and conduct a thorough investigation of bias in argumentative language models. |
Carolin Holtermann; Anne Lauscher; Simone Ponzetto; |
542 | Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes an adaptive segmentation policy for end-to-end ST. Inspired by human interpreters, the policy learns to segment the source streaming speech into meaningful units by considering both acoustic features and translation history, maintaining consistency between the segmentation and translation. |
Ruiqing Zhang; Zhongjun He; Hua Wu; Haifeng Wang; |
543 | Can Pre-trained Language Models Interpret Similes As Smart As Human? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the ability of PLMs in simile interpretation by designing a novel task named Simile Property Probing, i.e., to let the PLMs infer the shared properties of similes. |
Qianyu He; Sijie Cheng; Zhixu Li; Rui Xie; Yanghua Xiao; |
544 | CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To facilitate research in this direction, we collect real-world biomedical data and present the first Chinese Biomedical Language Understanding Evaluation (CBLUE) benchmark: a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, single-sentence/sentence-pair classification, and an associated online platform for model evaluation, comparison, and analysis. |
Ningyu Zhang; Mosha Chen; Zhen Bi; Xiaozhuan Liang; Lei Li; Xin Shang; Kangping Yin; Chuanqi Tan; Jian Xu; Fei Huang; Luo Si; Yuan Ni; Guotong Xie; Zhifang Sui; Baobao Chang; Hui Zong; Zheng Yuan; Linfeng Li; Jun Yan; Hongying Zan; Kunli Zhang; Buzhou Tang; Qingcai Chen; |
545 | Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a Non-Autoregressive Unsupervised Summarization (NAUS) approach, which does not require parallel data for training. |
Puyuan Liu; Chenyang Huang; Lili Mou; |
546 | Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT), which augments each training instance with an adjacency semantic region that could cover adequate variants of literal expression under the same meaning. |
Xiangpeng Wei; Heng Yu; Yue Hu; Rongxiang Weng; Weihua Luo; Rong Jin; |
547 | Lexical Knowledge Internalization for Neural Dialog Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose knowledge internalization (KI), which aims to integrate lexical knowledge into neural dialog models. |
Zhiyong Wu; Wei Bi; Xiang Li; Lingpeng Kong; Ben Kao; |
548 | Modeling Syntactic-Semantic Dependency Correlations in Semantic Role Labeling Using Mixture Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a mixture model-based end-to-end method to model the syntactic-semantic dependency correlation in Semantic Role Labeling (SRL). |
Junjie Chen; Xiangheng He; Yusuke Miyao; |
549 | Learning The Beauty in Songs: Neural Singing Voice Beautifier Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In NSVB, we propose a novel time-warping approach for pitch correction: Shape-Aware Dynamic Time Warping (SADTW), which improves on the robustness of existing time-warping approaches to synchronize the amateur recording with the template pitch curve. |
Jinglin Liu; Chengxi Li; Yi Ren; Zhiying Zhu; Zhou Zhao; |
550 | A Model-agnostic Data Manipulation Method for Persona-based Dialogue Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We point out that the data challenges of this generation task lie in two aspects: first, it is expensive to scale up current persona-based dialogue datasets; second, each data sample in this task is more complex to learn with than conventional dialogue data. To alleviate these data issues, we propose a model-agnostic data manipulation method that can be paired with any persona-based dialogue generation model to improve its performance. |
Yu Cao; Wei Bi; Meng Fang; Shuming Shi; Dacheng Tao; |
551 | LinkBERT: Pretraining Language Models with Document Links Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose LinkBERT, an LM pretraining method that leverages links between documents, e.g., hyperlinks. |
Michihiro Yasunaga; Jure Leskovec; Percy Liang; |
552 | Improving Time Sensitivity for Question Answering Over Temporal Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a time-sensitive question answering (TSQA) framework to tackle these problems. |
Chao Shang; Guangtao Wang; Peng Qi; Jing Huang; |
553 | Self-supervised Semantic-driven Phoneme Discovery for Zero-resource Speech Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we bridge the gap between the linguistic and statistical definition of phonemes and propose a novel neural discrete representation learning model for self-supervised learning of phoneme inventory with raw speech and word labels. |
Liming Wang; Siyuan Feng; Mark Hasegawa-Johnson; Chang Yoo; |
554 | Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, we discover that this single hidden state cannot produce all probability distributions regardless of the LM size or training data size because the single hidden state embedding cannot be close to the embeddings of all the possible next words simultaneously when there are other interfering word embeddings between them. In this work, we demonstrate the importance of this limitation both theoretically and practically. |
Haw-Shiuan Chang; Andrew McCallum; |
555 | Ditch The Gold Standard: Re-evaluating Conversational Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we conduct the first large-scale human evaluation of state-of-the-art conversational QA systems, where human evaluators converse with models and judge the correctness of their answers. |
Huihan Li; Tianyu Gao; Manan Goenka; Danqi Chen; |
556 | Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While one could use a development set to determine which permutations are performant, this would deviate from the true few-shot setting as it requires additional annotated data. Instead, we use the generative nature of language models to construct an artificial development set, and based on entropy statistics of the candidate permutations on this set, we identify performant prompts. |
Yao Lu; Max Bartolo; Alastair Moore; Sebastian Riedel; Pontus Stenetorp; |
557 | Situated Dialogue Learning Through Procedural Environment Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We teach goal-driven agents to interactively act and speak in situated environments by training on generated curriculums. |
Prithviraj Ammanabrolu; Renee Jia; Mark Riedl; |
558 | UniTE: Unified Translation Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose UniTE, the first unified framework with the ability to handle all three evaluation tasks. |
Yu Wan; Dayiheng Liu; Baosong Yang; Haibo Zhang; Boxing Chen; Derek Wong; Lidia Chao; |
559 | Program Transfer for Answering Complex Questions Over Knowledge Bases Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the approach of program transfer, which aims to leverage the valuable program annotations on the rich-resourced KBs as external supervision signals to aid program induction for the low-resourced KBs that lack program annotations. |
Shulin Cao; Jiaxin Shi; Zijun Yao; Xin Lv; Jifan Yu; Lei Hou; Juanzi Li; Zhiyuan Liu; Jinghui Xiao; |
560 | EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multi-lingual Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, since exactly identical sentences from different language pairs are scarce, the power of the multi-way aligned corpus is limited by its scale. To handle this problem, this paper proposes Extract and Generate (EAG), a two-step approach to construct large-scale and high-quality multi-way aligned corpus from bilingual data. |
Yulin Xu; Zhen Yang; Fandong Meng; Jie Zhou; |
561 | Using Context-to-Vector with Graph Retrofitting to Improve Word Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we aim to improve word embeddings by 1) incorporating more contextual information from existing pre-trained models into the Skip-gram framework, which we call Context-to-Vec; 2) proposing a post-processing retrofitting method for static embeddings independent of training by employing priori synonym knowledge and weighted vector distribution. |
Jiangbin Zheng; Yile Wang; Ge Wang; Jun Xia; Yufei Huang; Guojiang Zhao; Yue Zhang; Stan Li; |
562 | Multimodal Sarcasm Target Identification in Tweets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce multimodality to STI and present Multimodal Sarcasm Target Identification (MSTI) task. |
Jiquan Wang; Lin Sun; Yi Liu; Meizhi Shao; Zengwei Zheng; |
563 | Flexible Generation from Fragmentary Linguistic Input Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We hypothesize that human performance is better characterized by flexible inference through composition of basic computational motifs available to the human language user. To test this hypothesis, we formulate a set of novel fragmentary text completion tasks, and compare the behavior of three direct-specialization models against a new model we introduce, GibbsComplete, which composes two basic computational motifs central to contemporary models: masked and autoregressive word prediction. |
Peng Qian; Roger Levy; |
564 | Revisiting Over-Smoothness in Text to Speech Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: One limitation of NAR-TTS models is that they ignore the correlation in time and frequency domains while generating speech mel-spectrograms, and thus cause blurry and over-smoothed results. In this work, we revisit this over-smoothing problem from a novel perspective: the degree of over-smoothness is determined by the gap between the complexity of data distributions and the capability of modeling methods. |
Yi Ren; Xu Tan; Tao Qin; Zhou Zhao; Tie-Yan Liu; |
565 | Coherence Boosting: When Your Pretrained Language Model Is Not Paying Enough Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present coherence boosting, an inference procedure that increases an LM’s focus on a long context. |
Nikolay Malkin; Zhen Wang; Nebojsa Jojic; |
566 | Uncertainty Estimation of Transformer Predictions for Misclassification Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Little attention has been paid to UE in natural language processing. To fill this gap, we perform a vast empirical investigation of state-of-the-art UE methods for Transformer models on misclassification detection in named entity recognition and text classification tasks and propose two computationally efficient modifications, one of which approaches or even outperforms computationally intensive methods. |
Artem Vazhentsev; Gleb Kuzmin; Artem Shelmanov; Akim Tsvigun; Evgenii Tsymbalov; Kirill Fedyanin; Maxim Panov; Alexander Panchenko; Gleb Gusev; Mikhail Burtsev; Manvel Avetisian; Leonid Zhukov; |
567 | VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena. |
Letitia Parcalabescu; Michele Cafagna; Lilitta Muradjan; Anette Frank; Iacer Calixto; Albert Gatt; |
568 | The Grammar-Learning Trajectories of Neural Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that NLMs with different initialization, architecture, and training data acquire linguistic phenomena in a similar order, despite their different end performance. |
Leshem Choshen; Guy Hacohen; Daphna Weinshall; Omri Abend; |
569 | Generating Scientific Definitions with Controllable Complexity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a novel reranking approach and find in human evaluations that it offers superior fluency while also controlling complexity, compared to several controllable generation baselines. |
Tal August; Katharina Reinecke; Noah Smith; |
570 | Label Semantic Aware Pre-training for Few-shot Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, use of label-semantics during pre-training has not been extensively explored. We therefore propose Label Semantic Aware Pre-training (LSAP) to improve the generalization and data efficiency of text classification systems. |
Aaron Mueller; Jason Krone; Salvatore Romeo; Saab Mansour; Elman Mansimov; Yi Zhang; Dan Roth; |
571 | ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper explores a deeper relationship between Transformer and numerical ODE methods. |
Bei Li; Quan Du; Tao Zhou; Yi Jing; Shuhan Zhou; Xin Zeng; Tong Xiao; JingBo Zhu; Xuebo Liu; Min Zhang; |
572 | A Comparison of Strategies for Source-Free Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We take algorithms that traditionally assume access to the source-domain training data (active learning, self-training, and data augmentation) and adapt them for source-free domain adaptation. |
Xin Su; Yiyun Zhao; Steven Bethard; |
573 | Ethics Sheets for AI Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this position paper, I make a case for thinking about ethical considerations not just at the level of individual models and datasets, but also at the level of AI tasks. |
Saif Mohammad; |
574 | Learning Disentangled Representations of Negation and Uncertainty Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, previous works on representation learning do not explicitly model this independence. We therefore attempt to disentangle the representations of negation, uncertainty, and content using a Variational Autoencoder. |
Jake Vasilakes; Chrysoula Zerva; Makoto Miwa; Sophia Ananiadou; |
575 | Latent-GLAT: Glancing at Latent Variables for Parallel Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose latent-GLAT, which employs discrete latent variables to capture word categorical information and invokes an advanced curriculum learning technique, alleviating the multi-modality problem. |
Yu Bao; Hao Zhou; Shujian Huang; Dongqi Wang; Lihua Qian; Xinyu Dai; Jiajun Chen; Lei Li; |
576 | PPT: Pre-trained Prompt Tuning for Few-shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We attribute this low performance to the manner of initializing soft prompts. Therefore, in this work, we propose to pre-train prompts by adding soft prompts into the pre-training stage to obtain a better initialization. |
Yuxian Gu; Xu Han; Zhiyuan Liu; Minlie Huang; |
577 | Deduplicating Training Data Makes Language Models Better Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop two tools that allow us to deduplicate training datasets, for example removing from C4 a single 61-word English sentence that is repeated over 60,000 times. |
Katherine Lee; Daphne Ippolito; Andrew Nystrom; Chiyuan Zhang; Douglas Eck; Chris Callison-Burch; Nicholas Carlini; |
578 | Improving The Generalizability of Depression Detection By Leveraging Clinical Questionnaires Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose approaches for depression detection that are constrained to different degrees by the presence of symptoms described in PHQ9, a questionnaire used by clinicians in the depression screening process. |
Thong Nguyen; Andrew Yates; Ayah Zirikly; Bart Desmet; Arman Cohan; |
579 | Internet-Augmented Dialogue Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The largest store of continually updating knowledge on our planet can be accessed via internet search. In this work we study giving access to this information to conversational agents. |
Mojtaba Komeili; Kurt Shuster; Jason Weston; |
580 | SUPERB-SG: Enhanced Speech Processing Universal PERformance Benchmark for Semantic and Generative Capabilities Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce SUPERB-SG, a new benchmark focusing on evaluating the semantic and generative capabilities of pre-trained models by increasing task diversity and difficulty over SUPERB. |
Hsiang-Sheng Tsai; Heng-Jui Chang; Wen-Chin Huang; Zili Huang; Kushal Lakhotia; Shu-wen Yang; Shuyan Dong; Andy Liu; Cheng-I Lai; Jiatong Shi; Xuankai Chang; Phil Hall; Hsuan-Jui Chen; Shang-Wen Li; Shinji Watanabe; Abdelrahman Mohamed; Hung-yi Lee; |
581 | Knowledge Neurons in Pretrained Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present preliminary studies on how factual knowledge is stored in pretrained Transformers by introducing the concept of knowledge neurons. |
Damai Dai; Li Dong; Yaru Hao; Zhifang Sui; Baobao Chang; Furu Wei; |
582 | Meta-Learning for Fast Cross-Lingual Adaptation in Dependency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We apply model-agnostic meta-learning (MAML) to the task of cross-lingual dependency parsing. |
Anna Langedijk; Verna Dankers; Phillip Lippe; Sander Bos; Bryan Cardenas Guevara; Helen Yannakoudakis; Ekaterina Shutova; |
583 | French CrowS-Pairs: Extending A Challenge Dataset for Measuring Social Bias in Masked Language Models to A Language Other Than English Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce 1,679 sentence pairs in French that cover stereotypes in ten types of bias like gender and age. |
Aurélie Névéol; Yoann Dupont; Julien Bezançon; Karën Fort; |
584 | Few-Shot Learning with Siamese Networks and Label Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we show that with proper pre-training, Siamese Networks that embed texts and labels offer a competitive alternative. |
Thomas Müller; Guillermo Pérez-Torró; Marc Franco-Salvador; |
585 | Inferring Rewards from Language in Context Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a model that infers rewards from language pragmatically: reasoning about how speakers choose utterances not only to elicit desired actions, but also to reveal information about their preferences. |
Jessy Lin; Daniel Fried; Dan Klein; Anca Dragan; |
586 | Generating Biographies on Wikipedia: The Impact of Gender Bias on The Retrieval-Based Generation of Women Biographies Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Generating factual, long-form text such as Wikipedia articles raises three key challenges: how to gather relevant evidence, how to structure information into well-formed text, and how to ensure that the generated text is factually correct. We address these by developing a model for English text that uses a retrieval mechanism to identify relevant supporting information on the web and a cache-based pre-trained encoder-decoder to generate long-form biographies section by section, including citation information. |
Angela Fan; Claire Gardent; |
587 | Your Answer Is Incorrect… Would You Like to Know Why? Introducing A Bilingual Short Answer Feedback Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To encourage research on explainable and understandable feedback systems, we present the Short Answer Feedback dataset (SAF). |
Anna Filighera; Siddharth Parihar; Tim Steuer; Tobias Meuser; Sebastian Ochs; |
588 | Towards Better Characterization of Paraphrases Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To effectively characterize the nature of paraphrase pairs without expert human annotation, we propose two new metrics: word position deviation (WPD) and lexical deviation (LD). |
Timothy Liu; De Wen Soh; |
589 | SummScreen: A Dataset for Abstractive Screenplay Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce SummScreen, a summarization dataset comprised of pairs of TV series transcripts and human written recaps. |
Mingda Chen; Zewei Chu; Sam Wiseman; Kevin Gimpel; |
590 | Sparsifying Transformer Models with Trainable Representation Pooling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel method to sparsify attention in the Transformer model by learning to select the most-informative token representations during the training process, thus focusing on the task-specific parts of an input. |
Michal Pietruszka; Lukasz Borchmann; Lukasz Garncarek; |
591 | Uncertainty Determines The Adequacy of The Mode and The Tractability of Decoding in Sequence-to-Sequence Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, we show that well-known pathologies such as a high number of beam search errors, the inadequacy of the mode, and the drop in system performance with large beam sizes apply to tasks with high level of ambiguity such as MT but not to less uncertain tasks such as GEC. |
Felix Stahlberg; Ilia Kulikov; Shankar Kumar; |
592 | FlipDA: Effective and Robust Data Augmentation for Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Under this setting, we reproduced a large number of previous augmentation methods and found that these methods bring marginal gains at best and sometimes substantially degrade performance. To address this challenge, we propose a novel data augmentation method, FlipDA, that jointly uses a generative model and a classifier to generate label-flipped data. |
Jing Zhou; Yanan Zheng; Jie Tang; Li Jian; Zhilin Yang; |
593 | Text-Free Prosody-Aware Generative Spoken Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a prosody-aware generative spoken language model (pGSLM). |
Eugene Kharitonov; Ann Lee; Adam Polyak; Yossi Adi; Jade Copet; Kushal Lakhotia; Tu Anh Nguyen; Morgane Riviere; Abdelrahman Mohamed; Emmanuel Dupoux; Wei-Ning Hsu; |
594 | Lite Unified Modeling for Discriminative Reading Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Previous works lack a unified design suited to the full range of discriminative MRC tasks. To fill this gap, we propose a lightweight POS-Enhanced Iterative Co-Attention Network (POI-Net) as a first attempt at unified modeling, handling diverse discriminative MRC tasks synchronously. |
Yilin Zhao; Hai Zhao; Libin Shen; Yinggong Zhao; |
595 | Bilingual Alignment Transfers to Multilingual Alignment for Unsupervised Parallel Text Mining Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work presents methods for learning cross-lingual sentence representations using paired or unpaired bilingual texts. |
Chih-chan Tien; Shane Steinert-Threlkeld; |
596 | End-to-End Modeling Via Information Tree for One-Shot Natural Language Spatial Video Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Another challenge relates to the limited supervision, which might result in ineffective representation learning. To address these challenges, we designed an end-to-end model via Information Tree for One-Shot video grounding (IT-OS). |
Mengze Li; Tianbao Wang; Haoyu Zhang; Shengyu Zhang; Zhou Zhao; Jiaxu Miao; Wenqiao Zhang; Wenming Tan; Jin Wang; Peng Wang; Shiliang Pu; Fei Wu; |
597 | RNSum: A Large-Scale Dataset for Automatic Release Note Generation Via Commit Logs Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a new dataset called RNSum, which contains approximately 82,000 English release notes and the associated commit messages derived from the online repositories in GitHub. |
Hisashi Kamezawa; Noriki Nishida; Nobuyuki Shimizu; Takashi Miyazaki; Hideki Nakayama; |
598 | Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper aims to extract a new kind of structured knowledge from scripts and use it to improve MRC. |
Kai Sun; Dian Yu; Jianshu Chen; Dong Yu; Claire Cardie; |
599 | Modeling Persuasive Discourse to Adaptively Support Students’ Argumentative Writing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce an argumentation annotation approach to model the structure of argumentative discourse in student-written business model pitches. |
Thiemo Wambsganss; Christina Niklaus; |
600 | Active Evaluation: Efficient NLG Evaluation with Few Pairwise Comparisons Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce Active Evaluation, a framework to efficiently identify the top-ranked system by actively choosing system pairs for comparison using dueling bandit algorithms. |
Akash Kumar Mohankumar; Mitesh Khapra; |
601 | The Moral Debater: A Study on The Computational Generation of Morally Framed Arguments Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Following the moral foundation theory, we propose a system that effectively generates arguments focusing on different morals. |
Milad Alshomary; Roxanne El Baff; Timon Gurcke; Henning Wachsmuth; |
602 | Pyramid-BERT: Reducing Complexity Via Successive Core-set Based Token Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A recent line of work uses various heuristics to successively shorten sequence length while transforming tokens through encoders, in tasks such as classification and ranking that require a single token embedding for prediction. We present a novel solution to this problem, called Pyramid-BERT, where we replace previously used heuristics with a core-set based token selection method justified by theoretical results. |
Xin Huang; Ashish Khetan; Rene Bidart; Zohar Karnin; |
603 | Probing for The Usage of Grammatical Number Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we try to find an encoding that the model actually uses, introducing a usage-based probing setup. |
Karim Lasri; Tiago Pimentel; Alessandro Lenci; Thierry Poibeau; Ryan Cotterell; |
604 | BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset of them) are being modified. |
Elad Ben Zaken; Yoav Goldberg; Shauli Ravfogel; |
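To make the BitFit recipe concrete, here is a minimal PyTorch sketch of bias-only fine-tuning: freeze everything except parameters whose names end in "bias", then train as usual. The encoder layer below stands in for a real pretrained model; this illustrates the idea rather than reproducing the paper's exact setup.

```python
import torch
from torch import nn

# Stand-in for a pretrained Transformer encoder (e.g., one BERT layer).
model = nn.TransformerEncoderLayer(d_model=768, nhead=12)

# BitFit-style sparse fine-tuning: only bias terms remain trainable.
for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
print(sum(p.numel() for p in trainable), "trainable parameters")  # a tiny fraction
```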
605 | Are Shortest Rationales The Best Explanations for Human Understanding? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Is the shortest rationale indeed the most human-understandable? To answer this question, we design a self-explaining model, LimitedInk, which allows users to extract rationales at any target length. |
Hua Shen; Tongshuang Wu; Wenbo Guo; Ting-Hao Huang; |
606 | Analyzing Wrap-Up Effects Through An Information-Theoretic Lens Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Consequently, the understanding of the cognitive processes that might be involved in these effects is limited. In this work, we attempt to learn more about these processes by looking for the existence (or absence) of a link between wrap-up effects and information-theoretic quantities, such as word and context information content. |
Clara Meister; Tiago Pimentel; Thomas Clark; Ryan Cotterell; Roger Levy; |
607 | Have My Arguments Been Replied To? Argument Pair Extraction As Machine Reading Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing studies typically identify argument pairs indirectly by predicting sentence-level relations between two documents, neglecting the modeling of the holistic argument-level interactions. Towards this issue, we propose to address APE via a machine reading comprehension (MRC) framework with two phases. |
Jianzhu Bao; Jingyi Sun; Qinglin Zhu; Ruifeng Xu; |
608 | High Probability or Low Information? The Probability-quality Paradox in Language Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Rather, mode-seeking decoding methods can lead to incredibly unnatural language, while stochastic methods produce text perceived as much more human-like. In this note, we offer an explanation for this phenomenon by analyzing language as a means of communication in the information-theoretic sense. |
Clara Meister; Gian Wiher; Tiago Pimentel; Ryan Cotterell; |
609 | Disentangled Knowledge Transfer for OOD Intent Discovery with Unified Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Different from existing work based on shared intent representation, we propose a novel disentangled knowledge transfer method via a unified multi-head contrastive learning framework. |
Yutao Mou; Keqing He; Yanan Wu; Zhiyuan Zeng; Hong Xu; Huixing Jiang; Wei Wu; Weiran Xu; |
610 | Voxel-informed Language Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the Voxel-informed Language Grounder (VLG), a language grounding model that leverages 3D geometric information in the form of voxel maps derived from the visual input using a volumetric reconstruction model. |
Rodolfo Corona; Shizhan Zhu; Dan Klein; Trevor Darrell; |
611 | P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel empirical finding that properly optimized prompt tuning can be universally effective across a wide range of model scales and NLU tasks. |
Xiao Liu; Kaixuan Ji; Yicheng Fu; Weng Tam; Zhengxiao Du; Zhilin Yang; Jie Tang; |
612 | On Efficiently Acquiring Annotations for Multilingual Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we show that the strategy of joint learning across multiple languages using a single model performs substantially better than the aforementioned alternatives. |
Joel Moniz; Barun Patra; Matthew Gormley; |
613 | Automatic Detection of Entity-Manipulated Text Using Factual Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on the problem of distinguishing a human written news article from a news article that is created by manipulating entities in a human written news article (e.g., replacing entities with factually incorrect entities). |
Ganesh Jawahar; Muhammad Abdul-Mageed; Laks Lakshmanan; |
614 | Does BERT Know That The IS-A Relation Is Transitive? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We aim to quantify how much BERT agrees with the transitive property of IS-A relations, via a minimalist probing setting. |
Ruixi Lin; Hwee Tou Ng; |
615 | Buy Tesla, Sell Ford: Assessing Implicit Stock Market Preference in Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we assess the implicit stock market preferences in BERT and its finance domain-specific model FinBERT. |
Chengyu Chuang; Yi Yang; |
616 | Pixie: Preference in Implicit and Explicit Comparisons Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Pixie, a manually annotated dataset for preference classification comprising 8,890 sentences drawn from app reviews. |
Amanul Haque; Vaibhav Garg; Hui Guo; Munindar Singh; |
617 | Counterfactual Explanations for Natural Language Interfaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel approach for generating explanations of a natural language interface based on semantic parsing. |
George Tolkachev; Stephen Mell; Stephan Zdancewic; Osbert Bastani; |
618 | Predicting Difficulty and Discrimination of Natural Language Questions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: More recently, IRT has been used to similarly characterize item difficulty and discrimination for natural language models across various datasets (Lalor et al., 2019; Vania et al., 2021; Rodriguez et al., 2021). In this work, we explore predictive models for directly estimating and explaining these traits for natural language questions in a question-answering context. |
Matthew Byrd; Shashank Srivastava; |
619 | How Does The Pre-training Objective Affect What Large Language Models Learn About Linguistic Properties? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We hypothesize that linguistically motivated objectives such as MLM should help BERT acquire better linguistic knowledge than non-linguistically motivated objectives, for which the association between the input and the label to be predicted is unintuitive or hard for humans to guess. To this end, we pre-train BERT with two linguistically motivated objectives and three non-linguistically motivated ones. |
Ahmed Alajrami; Nikolaos Aletras; |
620 | The Power of Prompt Tuning for Low-Resource Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate prompt tuning for semantic parsing-the task of mapping natural language utterances onto formal meaning representations. |
Nathan Schucher; Siva Reddy; Harm de Vries; |
621 | Data Contamination: From Memorization to Exploitation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: It is not clear to what extent models exploit the contaminated data for downstream tasks. We present a principled method to study this question. |
Inbal Magar; Roy Schwartz; |
622 | Detecting Annotation Errors in Morphological Data with The Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we evaluate the feasibility of using the Transformer model to detect various types of annotator errors in morphological data sets that contain inflected word forms. |
Ling Liu; Mans Hulden; |
623 | Estimating The Entropy of Linguistic Distributions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While entropy estimation is a well-studied problem in other fields, there is not yet a comprehensive exploration of the efficacy of entropy estimators for use with linguistic data. In this work, we fill this void, studying the empirical effectiveness of different entropy estimators for linguistic distributions. |
Aryaman Arora; Clara Meister; Ryan Cotterell; |
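As background for the estimator comparison above, the following sketch shows two classic entropy estimators such a study would include: the naive plug-in (MLE) estimator and its Miller-Madow bias correction. These are standard textbook formulas, not the paper's own contribution.

```python
import math
from collections import Counter

def entropy_mle(sample):
    """Plug-in estimator: entropy of the empirical distribution, in bits."""
    n = len(sample)
    return -sum(c / n * math.log2(c / n) for c in Counter(sample).values())

def entropy_miller_madow(sample):
    """MLE plus the (K - 1) / (2N) bias correction (converted to bits)."""
    k = len(set(sample))  # observed support size
    return entropy_mle(sample) + (k - 1) / (2 * len(sample) * math.log(2))

print(entropy_mle("abracadabra"), entropy_miller_madow("abracadabra"))
```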
624 | Morphological Reinflection with Multiple Arguments: An Extended Annotation Schema and A Georgian Case Study Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the flat structure of the current morphological annotation makes the treatment of some languages quirky, if not impossible, specifically in cases of polypersonal agreement. In this paper we propose a general solution for such cases and expand the UniMorph annotation schema to naturally address this phenomenon, in which verbs agree with multiple arguments using true affixes. |
David Guriel; Omer Goldman; Reut Tsarfaty; |
625 | DQ-BART: Efficient Sequence-to-Sequence Model Via Joint Distillation and Quantization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such models pose a great challenge in resource-constrained scenarios owing to their large memory requirements and high latency. To alleviate this issue, we propose to jointly distill and quantize the model, where knowledge is transferred from the full-precision teacher model to the quantized and distilled low-precision student model. |
Zheng Li; Zijian Wang; Ming Tan; Ramesh Nallapati; Parminder Bhatia; Andrew Arnold; Bing Xiang; Dan Roth; |
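The two ingredients named in the DQ-BART highlight can be sketched independently: a straight-through fake-quantizer for the student's weights, and a distillation loss mixing soft teacher targets with the task loss. This is a generic illustration of joint distillation and quantization, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def fake_quantize(w, bits=8):
    """Simulate low-precision weights with a straight-through estimator."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    q = torch.round(w / scale).clamp(-qmax - 1, qmax)
    return w + (q * scale - w).detach()  # forward: quantized; backward: identity

def distill_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-label KL distillation with the hard-label task loss."""
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * T * T
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```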
626 | Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Towards the objective of pre-training a zero-shot dialogue comprehension model, we develop a novel narrative-guided pre-training strategy that learns by narrating the key information from a dialogue input. |
Chao Zhao; Wenlin Yao; Dian Yu; Kaiqiang Song; Dong Yu; Jianshu Chen; |
627 | Kronecker Decomposition for GPT Compression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we use Kronecker decomposition to compress the linear mappings of the GPT-2 model. |
Ali Edalati; Marzieh Tahaei; Ahmad Rashid; Vahid Nia; James Clark; Mehdi Rezagholizadeh; |
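A minimal sketch of the underlying compression idea: replace a dense linear map with the Kronecker product of two much smaller factors, so only the factors are stored as parameters. Shapes here are illustrative, and real systems avoid materializing the full product; this is not the paper's exact layer design.

```python
import torch
from torch import nn

class KroneckerLinear(nn.Module):
    """A linear layer whose weight is A ⊗ B; only A and B are parameters."""
    def __init__(self, m1, n1, m2, n2):
        super().__init__()
        self.A = nn.Parameter(torch.randn(m1, n1) * 0.02)
        self.B = nn.Parameter(torch.randn(m2, n2) * 0.02)

    def forward(self, x):                    # x: (..., n1 * n2)
        W = torch.kron(self.A, self.B)       # (m1*m2, n1*n2), built on the fly
        return x @ W.T

layer = KroneckerLinear(32, 32, 24, 24)      # acts like a 768 x 768 linear map
out = layer(torch.randn(4, 32 * 24))         # ~1.6K parameters instead of ~590K
```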
628 | Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product Attribute Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We thus propose simple knowledge-driven query expansion based on possible answers (values) of a query (attribute) for QA-based AVE. |
Keiji Shinzato; Naoki Yoshinaga; Yandi Xia; Wei-Te Chen; |
629 | Event-Event Relation Extraction Using Probabilistic Box Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to modify the underlying ERE model to guarantee coherence by representing each event as a box representation (BERE) without applying explicit constraints. |
EunJeong Hwang; Jay-Yoon Lee; Tianyi Yang; Dhruvesh Patel; Dongxu Zhang; Andrew McCallum; |
630 | Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a novel approach to data augmentation that leverages audio alignments, linguistic properties, and translation. |
Tsz Kin Lam; Shigehiko Schamoni; Stefan Riezler; |
631 | Predicting Sentence Deletions for Text Simplification Using A Functional Discourse Structure Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on sentence deletions for text simplification and use a news genre-specific functional discourse structure, which categorizes sentences based on their contents and their function roles in telling a news story, for predicting sentence deletion. |
Bohan Zhang; Prafulla Kumar Choubey; Ruihong Huang; |
632 | Multilingual Pre-training with Language and Task Adaptation for Multilingual Text Style Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Besides, in view of the general scarcity of parallel data, we propose a modular approach for multilingual formality transfer, which consists of two training strategies that target adaptation to both language and task. |
Huiyuan Lai; Antonio Toral; Malvina Nissim; |
633 | When to Use Multi-Task Learning Vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we compare all three TL methods in a comprehensive analysis on the GLUE dataset suite. |
Orion Weller; Kevin Seppi; Matt Gardner; |
634 | Leveraging Explicit Lexico-logical Alignments in Text-to-SQL Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new approach to leveraging explicit lexico-logical alignments. |
Runxin Sun; Shizhu He; Chong Zhu; Yaohan He; Jinlong Li; Jun Zhao; Kang Liu; |
635 | Complex Evolutional Pattern Learning for Temporal Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thus, we propose a new model, called Complex Evolutional Network (CEN), which uses a length-aware Convolutional Neural Network (CNN) to handle evolutional patterns of different lengths via an easy-to-difficult curriculum learning strategy. |
Zixuan Li; Saiping Guan; Xiaolong Jin; Weihua Peng; Yajuan Lyu; Yong Zhu; Long Bai; Wei Li; Jiafeng Guo; Xueqi Cheng; |
636 | Mismatch Between Multi-turn Dialogue and Its Evaluation Metric in Dialogue State Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The trained model predicts accumulated belief states at every turn, and joint goal accuracy and slot accuracy are mainly used to evaluate the predictions; however, we show that the current evaluation metrics have a critical limitation when evaluating belief states accumulated as the dialogue proceeds, especially in the most widely used MultiWOZ dataset. |
Takyoung Kim; Hoonsang Yoon; Yukyung Lee; Pilsung Kang; Misuk Kim; |
637 | LM-BFF-MS: Improving Few-Shot Fine-tuning of Language Models Based on Multiple Soft Demonstration Memory Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To improve upon LM-BFF, this paper proposes LM-BFF-MS (better few-shot fine-tuning of language models with multiple soft demonstrations), which extends LM-BFF with 1) prompts with multiple demonstrations based on automatic generation of multiple label words; and 2) a soft demonstration memory consisting of multiple sequences of globally shared word embeddings for a similar context. |
Eunhwan Park; Donghyeon Jeon; Seonhoon Kim; Inho Kang; Seung-Hoon Na; |
638 | Towards Fair Evaluation of Dialogue State Tracking By Flexible Incorporation of Turn-level Performances Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we discuss various evaluation metrics used for DST along with their shortcomings. |
Suvodip Dey; Ramamohan Kummara; Maunendra Desarkar; |
639 | Exploiting Language Model Prompts Using Similarity Measures: A Case Study on The Word-in-Context Task Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, none of the existing few-shot approaches (including the in-context learning of GPT-3) can attain a performance that is meaningfully different from the random baseline. Trying to fill this gap, we propose a new prompting technique, based on similarity metrics, which boosts few-shot performance to the level of fully supervised methods. |
Mohsen Tabasi; Kiamehr Rezaee; Mohammad Taher Pilehvar; |
640 | Hierarchical Curriculum Learning for AMR Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, there exists a gap between their flat training objective (i.e., treating all output tokens equally) and the hierarchical AMR structure, which limits model generalization. To bridge this gap, we propose a Hierarchical Curriculum Learning (HCL) framework with Structure-level (SC) and Instance-level Curricula (IC). |
Peiyi Wang; Liang Chen; Tianyu Liu; Damai Dai; Yunbo Cao; Baobao Chang; Zhifang Sui; |
641 | PARE: A Simple and Strong Baseline for Monolingual and Multilingual Distantly Supervised Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In response, we explore a simple baseline approach (PARE) in which all sentences of a bag are concatenated into a passage of sentences, and encoded jointly using BERT. |
Vipul Rathore; Kartikeya Badola; Parag Singla; Mausam; |
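The PARE baseline as described in the highlight is simple enough to sketch directly: join the bag's sentences into one passage and take BERT's [CLS] vector as the bag representation. A hedged illustration using Hugging Face transformers, with the relation classifier head omitted:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-cased")
encoder = AutoModel.from_pretrained("bert-base-cased").eval()

def encode_bag(sentences):
    """Encode all sentences of a bag jointly as a single passage."""
    passage = " ".join(sentences)
    enc = tok(passage, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        return encoder(**enc).last_hidden_state[:, 0]  # [CLS] bag embedding

bag_vec = encode_bag(["Obama was born in Hawaii.", "Obama visited Hawaii."])
# bag_vec would then feed a relation classifier.
```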
642 | To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a debiased dataset for the Person-centric Visual Grounding (PCVG) task first proposed by Cui et al. (2021) in the Who’s Waldo dataset. |
Yiran Luo; Pratyay Banerjee; Tejas Gokhale; Yezhou Yang; Chitta Baral; |
643 | Translate-Train Embracing Translationese Artifacts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, its performance is often hampered by the artifacts in the translated texts (translationese). We discover that such artifacts have common patterns in different languages and can be modeled by deep learning, and subsequently propose an approach to conduct translate-train using Translationese Embracing the effect of Artifacts (TEA). |
Sicheng Yu; Qianru Sun; Hao Zhang; Jing Jiang; |
644 | C-MORE: Pretraining to Answer Open-Domain Questions By Consulting Millions of References Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we automatically construct a large-scale corpus that meets all three criteria by consulting millions of references cited within Wikipedia. |
Xiang Yue; Xiaoman Pan; Wenlin Yao; Dian Yu; Dong Yu; Jianshu Chen; |
645 | K-Rater Reliability: The Correct Unit of Reliability for Aggregated Human Annotations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We conduct two replications of the WordSim-353 benchmark, and present empirical, analytical, and bootstrap-based methods for computing kRR on WordSim-353. |
Ka Wong; Praveen Paritosh; |
646 | An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce FLOTA (Few Longest Token Approximation), a simple yet effective method to improve the tokenization of pretrained language models (PLMs). |
Valentin Hofmann; Hinrich Schuetze; Janet Pierrehumbert; |
647 | SCD: Self-Contrastive Decorrelation of Sentence Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Self-Contrastive Decorrelation (SCD), a self-supervised approach. |
Tassilo Klein; Moin Nabi; |
648 | Problems with Cosine As A Measure of Embedding Similarity for High Frequency Words Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we uncover systematic ways in which word similarities estimated by cosine over BERT embeddings are understated and trace this effect to training data frequency. |
Kaitlyn Zhou; Kawin Ethayarajh; Dallas Card; Dan Jurafsky; |
649 | Revisiting The Compositional Generalization Abilities of Neural Sequence Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on one-shot primitive generalization as introduced by the popular SCAN benchmark. |
Arkil Patel; Satwik Bhattamishra; Phil Blunsom; Navin Goyal; |
650 | A Copy-Augmented Generative Model for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this article, we focus on improving the effectiveness of the reader module and propose a novel copy-augmented generative approach that integrates the merits of both extractive and generative readers. |
Shuang Liu; Dong Wang; Xiaoguang Li; Minghui Huang; Meizhen Ding; |
651 | Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, dense models require a vast amount of labeled training data for notable performance, whereas it is often challenging to acquire query-document pairs annotated by humans. To tackle this problem, we propose a simple but effective Document Augmentation for dense Retrieval (DAR) framework, which augments the representations of documents with their interpolation and perturbation. |
Soyeong Jeong; Jinheon Baek; Sukmin Cho; Sung Ju Hwang; Jong Park; |
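The two augmentation operations named in the DAR highlight lend themselves to a tiny sketch: interpolating a document embedding with that of a neighbor and adding small random perturbations. The coefficients are illustrative assumptions, not the paper's tuned values.

```python
import torch

def interpolate(doc_emb, neighbor_emb, beta=0.5):
    """Mix a document representation with that of a related document."""
    return beta * doc_emb + (1 - beta) * neighbor_emb

def perturb(doc_emb, sigma=0.01):
    """Add small Gaussian noise to create another augmented view."""
    return doc_emb + sigma * torch.randn_like(doc_emb)

d, n = torch.randn(768), torch.randn(768)
augmented = [interpolate(d, n), perturb(d)]  # extra positives for dense retrieval
```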
652 | WLASL-LEX: A Dataset for Recognising Phonological Properties in American Sign Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we bring to attention the task of modelling the phonology of sign languages. |
Federico Tavella; Viktor Schlegel; Marta Romeo; Aphrodite Galata; Angelo Cangelosi; |
653 | Investigating Person-specific Errors in Chat-oriented Dialogue Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We found that person-specific errors can be divided into two types: errors in attributes and errors in relations, each of which can be divided into two levels: self and other. |
Koh Mitsuda; Ryuichiro Higashinaka; Tingxuan Li; Sen Yoshida; |
654 | Direct Parsing to Sentiment Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper demonstrates how a graph-based semantic parser can be applied to the task of structured sentiment analysis, directly predicting sentiment graphs from text. |
David Samuel; Jeremy Barnes; Robin Kurtz; Stephan Oepen; Lilja Øvrelid; Erik Velldal; |
655 | XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This study explores distilling visual information from pretrained multimodal transformers to pretrained language encoders. |
Chan-Jan Hsu; Hung-yi Lee; Yu Tsao; |
656 | As Little As Possible, As Much As Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Omission and addition of content is a typical issue in neural machine translation. We propose a method for detecting such phenomena with off-the-shelf translation models. |
Jannis Vamvas; Rico Sennrich; |
657 | How Distributed Are Distributed Representations? An Observation on The Locality of Syntactic Information in Verb Agreement Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work addresses the question of where the syntactic information encoded in transformer representations is localized. |
Bingzhi Li; Guillaume Wisniewski; Benoit Crabbé; |
658 | Machine Translation for Livonian: Catering to 20 Speakers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we tackle the task of developing neural machine translation (NMT) between Livonian and English, with a two-fold aim: on the one hand, preserving the language, and on the other, enabling access to Livonian folklore, life stories and other textual intangible heritage, as well as making it easier to create further parallel corpora. |
Matiss Rikters; Marili Tomingas; Tuuli Tuisk; Valts Ernštreits; Mark Fishel; |
659 | Fire Burns, Sword Cuts: Commonsense Inductive Bias for Exploration in Text-based Games Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose CommExpl, an exploration technique that injects external commonsense knowledge, via a pretrained language model (LM), into the agent during training when the agent is the most uncertain about its next action. |
Dongwon Ryu; Ehsan Shareghi; Meng Fang; Yunqiu Xu; Shirui Pan; Reza Haf; |
660 | A Simple But Effective Pluggable Entity Lookup Table for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to build a simple but effective Pluggable Entity Lookup Table (PELT) on demand by aggregating the entity’s output representations of multiple occurrences in the corpora. |
Deming Ye; Yankai Lin; Peng Li; Maosong Sun; Zhiyuan Liu; |
661 | S4-Tuning: A Simple Cross-lingual Sub-network Tuning Method Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, vanilla fine-tuning tends to achieve degenerated and unstable results, owing to Language Interference among different languages and Parameter Overload under few-sample transfer learning scenarios. To address these two problems elegantly, we propose S4-Tuning, a Simple Cross-lingual Sub-network Tuning method. |
Runxin Xu; Fuli Luo; Baobao Chang; Songfang Huang; Fei Huang; |
662 | Region-dependent Temperature Scaling for Certainty Calibration and Application to Class-imbalanced Token Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a region-balanced calibration error metric that weights all certainty regions equally. |
Hillary Dawkins; Isar Nejadgholi; |
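To see what "weights all certainty regions equally" means, compare standard expected calibration error, which weights each confidence bin by its population, with a region-balanced variant that averages over bins uniformly. A sketch under the usual equal-width-bin assumption:

```python
import numpy as np

def calibration_errors(confidences, correct, n_bins=10):
    """Return (population-weighted ECE, region-balanced calibration error)."""
    bins = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece, balanced, used = 0.0, 0.0, 0
    for b in range(n_bins):
        idx = bins == b
        if not idx.any():
            continue
        gap = abs(confidences[idx].mean() - correct[idx].mean())
        ece += idx.mean() * gap   # weighted by how many points fall in the bin
        balanced += gap           # every certainty region counts equally
        used += 1
    return ece, balanced / used

conf = np.random.rand(1000)
corr = (np.random.rand(1000) < conf).astype(float)  # roughly calibrated toy data
print(calibration_errors(conf, corr))
```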
663 | Developmental Negation Processing in Transformer Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While previous studies have used the tools of psycholinguistics to probe a transformer’s ability to reason over negation, none have focused on the types of negation studied in developmental psychology. We explore how well transformers can process such categories of negation, by framing the problem as a natural language inference (NLI) task. |
Antonio Laverghetta Jr.; John Licato; |
664 | Canary Extraction in Natural Language Understanding Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Recent literature has focused on Model Inversion Attacks (ModIvA) that can extract training data from model parameters. In this work, we present a version of such an attack by extracting canaries inserted in NLU training data. |
Rahil Parikh; Christophe Dupuy; Rahul Gupta; |
665 | On The Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we conduct an extensive correlation study between intrinsic and extrinsic metrics across bias notions using 19 contextualized language models. |
Yang Cao; Yada Pruksachatkun; Kai-Wei Chang; Rahul Gupta; Varun Kumar; Jwala Dhamala; Aram Galstyan; |
666 | Sequence-to-sequence AMR Parsing with Ancestor Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we design several strategies to add the important ancestor information into the Transformer Decoder. |
Chen Yu; Daniel Gildea; |
667 | Zero-Shot Dependency Parsing with Worst-Case Aware Automated Curriculum Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, source and training languages are rarely related, when parsing truly low-resource languages. To close this gap, we adopt a method from multi-task learning, which relies on automated curriculum learning, to dynamically optimize for parsing performance on outlier languages. |
Miryam de Lhoneux; Sheng Zhang; Anders Søgaard; |
668 | PriMock57: A Dataset Of Primary Care Mock Consultations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We detail the development of a public-access, high-quality dataset comprising 57 mock primary care consultations, including audio recordings, their manual utterance-level transcriptions, and the associated consultation notes. |
Alex Papadopoulos Korfiatis; Francesco Moramarco; Radmila Sarac; Aleksandar Savkov; |
669 | UniGDD: A Unified Generative Framework for Goal-Oriented Document-Grounded Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such pipeline methods would unavoidably suffer from the error propagation issue. This paper proposes to unify these two sub-tasks via sequentially generating the grounding knowledge and the response. |
Chang Gao; Wenxuan Zhang; Wai Lam; |
670 | DMix: Adaptive Distance-aware Interpolative Mixup Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Interpolation-based regularisation methods such as Mixup, which generate virtual training samples, have proven to be effective for various tasks and modalities. We extend Mixup and propose DMix, an adaptive distance-aware interpolative Mixup that selects samples based on their diversity in the embedding space. |
Ramit Sawhney; Megh Thakkar; Shrey Pandit; Ritesh Soun; Di Jin; Diyi Yang; Lucie Flek; |
671 | Sub-Word Alignment Is Still Useful: A Vest-Pocket Method for Enhancing Low-Resource Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We leverage embedding duplication between aligned sub-words to extend the Parent-Child transfer learning method, so as to improve low-resource machine translation. |
Minhan Xu; Yu Hong; |
672 | HYPHEN: Hyperbolic Hawkes Attention For Text Streams Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a Hyperbolic Hawkes Attention Network (HYPHEN), which learns a data-driven hyperbolic space and models irregular power-law excitations using a hyperbolic Hawkes process. |
Shivam Agarwal; Ramit Sawhney; Sanchit Ahuja; Ritesh Soun; Sudheer Chava; |
673 | A Risk-Averse Mechanism for Suicidality Assessment on Social Media Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose SASI, a risk-averse and self-aware transformer-based hierarchical attention classifier, augmented to refrain from making uncertain predictions. |
Ramit Sawhney; Atula Neerkaje; Manas Gaur; |
674 | When Classifying Grammatical Role, BERT Doesn’t Care About Word Order… Except When It Matters Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent work has shown large language models to be surprisingly word order invariant, but crucially has largely considered natural prototypical inputs, where compositional meaning mostly matches lexical expectations. To overcome this confound, we probe grammatical role representation in English BERT and GPT-2, on instances where lexical expectations are not sufficient, and word order knowledge is necessary for correct classification. |
Isabel Papadimitriou; Richard Futrell; Kyle Mahowald; |
675 | Triangular Transfer: Freezing The Pivot for Triangular Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a transfer-learning-based approach that utilizes all types of auxiliary data. |
Meng Zhang; Liangyou Li; Qun Liu; |
676 | Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a theory-based evaluation method for investigating to what degree models pretrained on the VisDial dataset incrementally build representations that appropriately do scorekeeping. |
Brielen Madureira; David Schlangen; |
677 | Focus on The Target’s Vocabulary: Masked Label Smoothing for Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: When allocating smoothed probability, original label smoothing treats the source-side words that would never appear in the target language equally to the real target-side words, which could bias the translation model. To address this issue, we propose Masked Label Smoothing (MLS), a new mechanism that masks the soft label probability of source-side words to zero. |
Liang Chen; Runxin Xu; Baobao Chang; |
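A hedged sketch of the masking idea behind MLS: build the usual label-smoothed target distribution, but assign zero smoothing mass to token ids that occur only on the source side. `source_only_ids` is a hypothetical precomputed set; this illustrates the mechanism, not the authors' code.

```python
import torch

def masked_label_smoothing(target_ids, vocab_size, source_only_ids, eps=0.1):
    """Label smoothing that gives source-only tokens zero soft probability."""
    n = len(target_ids)
    mask = torch.ones(vocab_size)
    mask[list(source_only_ids)] = 0.0            # exclude source-only tokens
    dist = mask.repeat(n, 1)
    dist[torch.arange(n), target_ids] = 0.0      # gold token handled below
    dist = eps * dist / dist.sum(dim=1, keepdim=True)  # spread eps over the rest
    dist[torch.arange(n), target_ids] = 1.0 - eps
    return dist                                  # each row sums to 1

targets = torch.tensor([3, 7])
print(masked_label_smoothing(targets, vocab_size=10, source_only_ids={1, 2}).sum(1))
```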
678 | Contrastive Learning-Enhanced Nearest Neighbor Mechanism for Multi-Label Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previous studies mainly focus on learning text representation and modeling label correlation but neglect the rich knowledge from the existing similar instances when predicting labels of a specific text. To make up for this oversight, we propose a k nearest neighbor (kNN) mechanism which retrieves several neighbor instances and interpolates the model output with their labels. |
Xi'ao Su; Ran Wang; Xinyu Dai; |
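The retrieval-and-interpolation mechanism in the highlight above can be sketched in a few lines: find the k nearest training instances in embedding space and mix the average of their label vectors into the classifier's own probabilities. Names and the mixing weight are illustrative assumptions.

```python
import numpy as np

def knn_interpolate(query_emb, train_embs, train_labels, model_probs, k=5, alpha=0.7):
    """Blend model predictions with labels of the k nearest training neighbors."""
    sims = train_embs @ query_emb / (
        np.linalg.norm(train_embs, axis=1) * np.linalg.norm(query_emb) + 1e-8)
    topk = np.argsort(-sims)[:k]
    neighbor_probs = train_labels[topk].mean(axis=0)   # average binary label vectors
    return alpha * model_probs + (1 - alpha) * neighbor_probs
```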
679 | NoisyTune: A Little Noise Can Help You Finetune Pretrained Language Models Better Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a very simple yet effective method named NoisyTune to help better finetune PLMs on downstream tasks by adding some noise to the parameters of PLMs before fine-tuning. |
Chuhan Wu; Fangzhao Wu; Tao Qi; Yongfeng Huang; |
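The NoisyTune recipe is a one-loop change to the fine-tuning pipeline: before training, perturb each pretrained parameter matrix with uniform noise scaled by that matrix's own standard deviation. The intensity below is an assumed hyperparameter, and the encoder layer stands in for a real PLM.

```python
import torch
from torch import nn

model = nn.TransformerEncoderLayer(d_model=768, nhead=12)  # stand-in for a PLM

lam = 0.15  # noise intensity (assumed)
with torch.no_grad():
    for param in model.parameters():
        noise = (torch.rand_like(param) - 0.5) * 2   # uniform in [-1, 1]
        param.add_(noise * param.std() * lam)
# ...then fine-tune on the downstream task as usual.
```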
680 | Adjusting The Precision-Recall Trade-Off with Align-and-Predict Decoding for Grammatical Error Correction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a simple yet effective counterpart – Align-and-Predict Decoding (APD) for the most popular sequence-to-sequence models to offer more flexibility for the precision-recall trade-off. |
Xin Sun; Houfeng Wang; |
681 | On The Effect of Isotropy on VAE Representations of Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In parallel, VAEs have been successful in areas of NLP, but are known for their sub-optimal utilisation of the representation space. To address an aspect of this, we investigate the impact of injecting isotropy during training of VAEs. |
Lan Zhang; Wray Buntine; Ehsan Shareghi; |
682 | Efficient Classification of Long Documents Using Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we provide a comprehensive evaluation of the relative efficacy of long-document classification models, measured against various baselines and diverse datasets, in terms of both accuracy and time and space overheads. |
Hyunji Park; Yogarshi Vyas; Kashif Shah; |
683 | Rewarding Semantic Similarity Under Optimized Alignments for AMR-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Furthermore, past approaches featuring semantic similarity rewards suffer from repetitive outputs and overfitting. We address these issues by proposing metrics that replace the greedy alignments in BERTScore with optimized ones. |
Lisa Jin; Daniel Gildea; |
684 | An Analysis of Negation in Natural Language Understanding Corpora Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper analyzes negation in eight popular corpora spanning six natural language understanding tasks. |
Md Mosharaf Hossain; Dhivya Chinnappa; Eduardo Blanco; |
685 | Primum Non Nocere: Before Working with Indigenous Data, The ACL Must Confront Ongoing Colonialism Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we challenge the ACL community to reckon with historical and ongoing colonialism by adopting a set of ethical obligations and best practices drawn from the Indigenous studies literature. |
Lane Schwartz; |
686 | Unsupervised Multiple-choice Question Generation for Out-of-domain Q&A Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an approach for generating a fine-tuning dataset using a rule-based algorithm that produces questions and answers from unannotated sentences. |
Guillaume Le Berre; Christophe Cerisara; Philippe Langlais; Guy Lapalme; |
687 | Can A Transformer Pass The Wug Test? Tuning Copying Bias in Neural Morphological Inflection Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Surprisingly, we find that standard models such as the Transformer almost completely fail to generalize inflection patterns when trained on a limited number of lemmata and asked to inflect previously unseen lemmata, i.e., under wug-test-like circumstances. |
Ling Liu; Mans Hulden; |
688 | Probing The Robustness of Trained Metrics for Conversational Dialogue Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces an adversarial method to stress-test trained metrics for the evaluation of conversational dialogue systems. |
Jan Deriu; Don Tuggener; Pius Von Däniken; Mark Cieliebak; |
689 | Rethinking and Refining The Distinct Metric Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We refine the calculation of distinct scores by scaling the number of distinct tokens based on their expectations. |
Siyang Liu; Sahand Sabour; Yinhe Zheng; Pei Ke; Xiaoyan Zhu; Minlie Huang; |
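One plausible reading of that refinement, sketched with heavy hedging: plain distinct-1 divides distinct tokens by total tokens, while an expectation-adjusted variant divides by the number of distinct tokens one would expect when drawing the same number of tokens uniformly from a vocabulary of size V. The normalizer below is the exact expectation under that uniform model; it illustrates the general recipe, not necessarily the paper's final formula.

```python
def distinct_1(tokens):
    """Classic distinct-1: distinct tokens over total tokens."""
    return len(set(tokens)) / len(tokens)

def expectation_adjusted_distinct(tokens, vocab_size):
    """Scale the distinct count by its expectation under uniform sampling."""
    c = len(tokens)
    expected = vocab_size * (1 - ((vocab_size - 1) / vocab_size) ** c)
    return len(set(tokens)) / expected

sample = "the cat sat on the mat".split()
print(distinct_1(sample), expectation_adjusted_distinct(sample, vocab_size=10000))
```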
690 | How Reparametrization Trick Broke Differentially-private Text Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this short paper, we formally analyze several recent NLP papers proposing text representation learning using DPText (Beigi et al., 2019a,b; Alnasser et al., 2021; Beigi et al., 2021) and reveal their false claims of being differentially private. |
Ivan Habernal; |
691 | Towards Consistent Document-level Entity Linking: Joint Models for Entity Linking and Coreference Resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We aim to leverage explicit connections among mentions within the document itself: we propose to join EL and coreference resolution (coref) in a single structured prediction task over directed trees and use a globally normalized model to solve it. |
Klim Zaporojets; Johannes Deleu; Yiwei Jiang; Thomas Demeester; Chris Develder; |
692 | A Flexible Multi-Task Model for BERT Serving Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an efficient BERT-based multi-task (MT) framework that is particularly suitable for iterative and incremental development of the tasks. |
Tianwen Wei; Jianwei Qi; Shenghuan He; |
693 | Understanding Game-Playing Agents with Natural Language Annotations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a new dataset containing 10K human-annotated games of Go and show how these natural language annotations can be used as a tool for model interpretability. |
Nicholas Tomlin; Andre He; Dan Klein; |
694 | Code Synonyms Do Matter: Multiple Synonyms Matching Network for Automatic ICD Coding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing methods usually apply label attention with code representations to match related text snippets. Unlike these works, which model the label with the code hierarchy or description, we argue that code synonyms can provide more comprehensive knowledge, based on the observation that the code expressions in EMRs vary from their descriptions in ICD. |
Zheng Yuan; Chuanqi Tan; Songfang Huang; |
695 | CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce CoDA21 (Context Definition Alignment), a challenging benchmark that measures natural language understanding (NLU) capabilities of PLMs: Given a definition and a context each for k words, but not the words themselves, the task is to align the k definitions with the k contexts. |
Lütfi Kerem Senel; Timo Schick; Hinrich Schuetze; |
696 | On The Importance of Effectively Adapting Pretrained Language Models for Active Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent active learning (AL) approaches in Natural Language Processing (NLP) proposed using off-the-shelf pretrained language models (LMs). In this paper, we argue that these LMs are not adapted effectively to the downstream task during AL and we explore ways to address this issue. |
Katerina Margatina; Loic Barrault; Nikolaos Aletras; |
697 | A Recipe for Arbitrary Text Style Transfer with Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we leverage large language models (LLMs) to perform zero-shot text style transfer. |
Emily Reif; Daphne Ippolito; Ann Yuan; Andy Coenen; Chris Callison-Burch; Jason Wei; |
698 | DiS-ReX: A Multilingual Dataset for Distantly Supervised Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our goal is to study the novel task of distant supervision for multilingual relation extraction (Multi DS-RE). |
Abhyuday Bhartiya; Kartikeya Badola; Mausam; |
699 | (Un)solving Morphological Inflection: Lemma Overlap Artificially Inflates Models’ Performance Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to re-evaluate morphological inflection models by employing harder train-test splits that will challenge the generalization capacity of the models. |
Omer Goldman; David Guriel; Reut Tsarfaty; |
700 | Text Smoothing: Enhance Various Data Augmentation Methods on Text Classification Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an efficient data augmentation method, dubbed text smoothing, which converts a sentence from its one-hot representation to a controllable smoothed representation. |
Xing Wu; Chaochen Gao; Meng Lin; Liangjun Zang; Songlin Hu; |
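A hedged sketch of one way to realize such a smoothed representation: interpolate each token's one-hot vector with a masked language model's predicted distribution at that position. This simplified variant queries the MLM on the unmasked sentence, and the mixing weight is an assumed hyperparameter.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def smooth_representation(sentence, lam=0.5):
    """Mix one-hot token vectors with the MLM's per-position distribution."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        probs = mlm(**enc).logits.softmax(-1)[0]          # (seq_len, vocab)
    onehot = torch.nn.functional.one_hot(
        enc["input_ids"][0], num_classes=probs.size(-1)).float()
    return lam * onehot + (1 - lam) * probs               # smoothed input mixture

smoothed = smooth_representation("the movie was great")   # feed to a classifier
```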