Paper Digest: EMNLP 2020 (Main Track) Highlights
The Conference on Empirical Methods in Natural Language Processing (EMNLP) is one of the top natural language processing conferences in the world. In 2020, the conference was held online due to the COVID-19 pandemic.
An innovation for EMNLP 2020 is a new acceptance category, which allows more high-quality papers (short and long) to be accepted than usual. EMNLP 2020 has created a new sister publication, Findings of ACL: EMNLP 2020 (hereafter Findings), which serves as an online companion publication for papers that were not accepted for publication in the main conference, but were nonetheless assessed by the programme committee as solid work with sufficient substance, quality and novelty to warrant publication. All Findings track papers are listed on a separate page: Findings Track Paper Highlights.
Readers can browse all EMNLP 2020 papers, from both the main track and the Findings track, on our console, which allows users to filter papers by keyword and to find related papers and patents.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: Paper Digest: EMNLP 2020 (Main Track) Highlights
No. | Paper | Author(s) | Code |
---|---|---|---|
1 | Detecting Attackable Sentences In Arguments Highlight: We present a first large-scale analysis of sentence attackability in online arguments. | Yohan Jo; Seojin Bang; Emaad Manzoor; Eduard Hovy; Chris Reed; | |
2 | Extracting Implicitly Asserted Propositions In Argumentation Highlight: In this paper, we examine a wide range of computational methods for extracting propositions that are implicitly asserted in questions, reported speech, and imperatives in argumentation. | Yohan Jo; Jacky Visser; Chris Reed; Eduard Hovy; | |
3 | Quantitative Argument Summarization And Beyond: Cross-domain Key Point Analysis Highlight: The current work advances key point analysis in two important respects: first, we develop a method for automatic extraction of key points, which enables fully automatic analysis, and is shown to achieve performance comparable to a human expert. Second, we demonstrate that the applicability of key point analysis goes well beyond argumentation data. | Roy Bar-Haim; Yoav Kantor; Lilach Eden; Roni Friedman; Dan Lahav; Noam Slonim; | |
4 | Unsupervised Stance Detection For Arguments From Consequences Highlight: In this paper, we propose an unsupervised method to detect the stance of argumentative claims with respect to a topic. | Jonathan Kobbe; Ioana Hulpuș; Heiner Stuckenschmidt; | |
5 | BLEU Might Be Guilty But References Are Not Innocent Highlight: We study different methods to collect references and compare their value in automated evaluation by reporting correlation with human evaluation for a variety of systems and metrics. | Markus Freitag; David Grangier; Isaac Caswell; | |
6 | Statistical Power And Translationese In Machine Translation Evaluation Highlight: The term translationese has been used to describe features of translated text, and in this paper, we provide detailed analysis of potential adverse effects of translationese on machine translation evaluation. | Yvette Graham; Barry Haddow; Philipp Koehn; | |
7 | Simulated Multiple Reference Training Improves Low-resource Machine Translation Highlight: We introduce Simulated Multiple Reference Training (SMRT), a novel MT training method that approximates the full space of possible translations by sampling a paraphrase of the reference sentence from a paraphraser and training the MT model to predict the paraphraser’s distribution over possible tokens. | Huda Khayrallah; Brian Thompson; Matt Post; Philipp Koehn; | |
8 | Automatic Machine Translation Evaluation In Many Languages Via Zero-Shot Paraphrasing Highlight: We propose training the paraphraser as a multilingual NMT system, treating paraphrasing as a zero-shot translation task (e.g., Czech to Czech). | Brian Thompson; Matt Post; | |
9 | PRover: Proof Generation For Interpretable Reasoning Over Rules Highlight: In our work, we take a step closer to emulating formal theorem provers, by proposing PRover, an interpretable transformer-based model that jointly answers binary questions over rule-bases and generates the corresponding proofs. | Swarnadeep Saha; Sayan Ghosh; Shashank Srivastava; Mohit Bansal; | |
10 | Learning To Explain: Datasets And Models For Identifying Valid Reasoning Chains In Multihop Question-Answering Highlight: To address this, we introduce three explanation datasets in which explanations formed from corpus facts are annotated. | Harsh Jhamtani; Peter Clark; | |
11 | Self-Supervised Knowledge Triplet Learning For Zero-Shot Question Answering Highlight: This work proposes Knowledge Triplet Learning (KTL), a self-supervised task over knowledge graphs. | Pratyay Banerjee; Chitta Baral; | |
12 | More Bang For Your Buck: Natural Perturbation For Robust Question Answering Highlight: As an alternative to the traditional approach of creating new instances by repeating the process of creating one instance, we propose doing so by first collecting a set of seed examples and then applying human-driven natural perturbations (as opposed to rule-based machine perturbations), which often change the gold label as well. | Daniel Khashabi; Tushar Khot; Ashish Sabharwal; | |
13 | A Matter Of Framing: The Impact Of Linguistic Formalism On Probing Results Highlight: To investigate, we conduct an in-depth cross-formalism layer probing study in role semantics. | Ilia Kuznetsov; Iryna Gurevych; | |
14 | Information-Theoretic Probing With Minimum Description Length Highlight: Instead, we propose an alternative to the standard probes, information-theoretic probing with minimum description length (MDL). | Elena Voita; Ivan Titov; | |
15 | Intrinsic Probing Through Dimension Selection Highlight: To enable intrinsic probing, we propose a novel framework based on a decomposable multivariate Gaussian probe that allows us to determine whether the linguistic information in word embeddings is dispersed or focal. | Lucas Torroba Hennigen; Adina Williams; Ryan Cotterell; | |
16 | Learning Which Features Matter: RoBERTa Acquires A Preference For Linguistic Generalizations (Eventually) Highlight: With this goal in mind, we introduce a new English-language diagnostic set called MSGS (the Mixed Signals Generalization Set), which consists of 20 ambiguous binary classification tasks that we use to test whether a pretrained model prefers linguistic or surface generalizations during finetuning. | Alex Warstadt; Yian Zhang; Xiaocheng Li; Haokun Liu; Samuel R. Bowman; | |
17 | Repulsive Attention: Rethinking Multi-head Attention As Bayesian Inference Highlight: In this paper, for the first time, we provide a novel understanding of multi-head attention from a Bayesian perspective. | Bang An; Jie Lyu; Zhenyi Wang; Chunyuan Li; Changwei Hu; Fei Tan; Ruiyi Zhang; Yifan Hu; Changyou Chen; | |
18 | KERMIT: Complementing Transformer Architectures With Encoders Of Explicit Syntactic Interpretations Highlight: In this paper, we propose KERMIT (Kernel-inspired Encoder with Recursive Mechanism for Interpretable Trees) to embed symbolic syntactic parse trees into artificial neural networks and to visualize how syntax is used in inference. | Fabio Massimo Zanzotto; Andrea Santilli; Leonardo Ranaldi; Dario Onorati; Pierfrancesco Tommasino; Francesca Fallucchi; | |
19 | ETC: Encoding Long And Structured Inputs In Transformers Highlight: In this paper, we present a new Transformer architecture, Extended Transformer Construction (ETC), that addresses two key challenges of standard Transformer architectures, namely scaling input length and encoding structured inputs. | Joshua Ainslie; Santiago Ontanon; Chris Alberti; Vaclav Cvicek; Zachary Fisher; Philip Pham; Anirudh Ravula; Sumit Sanghai; Qifan Wang; Li Yang; | |
20 | Pre-Training Transformers As Energy-Based Cloze Models Highlight: We introduce Electric, an energy-based cloze model for representation learning over text. | Kevin Clark; Minh-Thang Luong; Quoc Le; Christopher D. Manning; | |
21 | Calibration Of Pre-trained Transformers Highlight: We focus on BERT and RoBERTa in this work, and analyze their calibration across three tasks: natural language inference, paraphrase detection, and commonsense reasoning. | Shrey Desai; Greg Durrett; | |
22 | Near-imperceptible Neural Linguistic Steganography Via Self-Adjusting Arithmetic Coding Highlight: In this study, we present a new linguistic steganography method which encodes secret messages using self-adjusting arithmetic coding based on a neural language model. | Jiaming Shen; Heng Ji; Jiawei Han; | |
23 | Multi-Dimensional Gender Bias Classification Highlight: In this work, we propose a novel, general framework that decomposes gender bias in text along several pragmatic and semantic dimensions: bias from the gender of the person being spoken about, bias from the gender of the person being spoken to, and bias from the gender of the speaker. | Emily Dinan; Angela Fan; Ledell Wu; Jason Weston; Douwe Kiela; Adina Williams; | |
24 | FIND: Human-in-the-Loop Debugging Deep Text Classifiers Highlight: In this paper, we propose FIND – a framework which enables humans to debug deep learning text classifiers by disabling irrelevant hidden features. | Piyawat Lertvittayakumjorn; Lucia Specia; Francesca Toni; | |
25 | Conversational Document Prediction To Assist Customer Care Agents Highlight: We study the task of predicting the documents that customer care agents can use to facilitate users’ needs. | Jatin Ganhotra; Haggai Roitman; Doron Cohen; Nathaniel Mills; Chulaka Gunasekara; Yosi Mass; Sachindra Joshi; Luis Lastras; David Konopnicki; | |
26 | Incremental Processing In The Age Of Non-Incremental Encoders: An Empirical Assessment Of Bidirectional Models For Incremental NLU Highlight: We investigate how they behave under incremental interfaces, when partial output must be provided based on partial input seen up to a certain time step, which may happen in interactive systems. | Brielen Madureira; David Schlangen; | |
27 | Augmented Natural Language For Generative Sequence Labeling Highlight: We propose a generative framework for joint sequence labeling and sentence-level classification. | Ben Athiwaratkun; Cicero Nogueira dos Santos; Jason Krone; Bing Xiang; | |
28 | Dialogue Response Ranking Training With Large-Scale Human Feedback Data Highlight: We leverage social media feedback data (number of replies and upvotes) to build a large-scale training dataset for feedback prediction. | Xiang Gao; Yizhe Zhang; Michel Galley; Chris Brockett; Bill Dolan; | |
29 | Semantic Evaluation For Text-to-SQL With Distilled Test Suites Highlight: We propose test suite accuracy to approximate semantic accuracy for Text-to-SQL models. | Ruiqi Zhong; Tao Yu; Dan Klein; | |
30 | Cross-Thought For Sentence Encoder Pre-training Highlight: In this paper, we propose Cross-Thought, a novel approach to pre-training sequence encoder, which is instrumental in building reusable sequence embeddings for large-scale NLP tasks such as question answering. | Shuohang Wang; Yuwei Fang; Siqi Sun; Zhe Gan; Yu Cheng; Jingjing Liu; Jing Jiang; | |
31 | AutoQA: From Databases To QA Semantic Parsers With Only Synthetic Training Data Highlight: We propose AutoQA, a methodology and toolkit to generate semantic parsers that answer questions on databases, with no manual effort. | Silei Xu; Sina Semnani; Giovanni Campagna; Monica Lam; | |
32 | A Spectral Method For Unsupervised Multi-Document Summarization Highlight: In this paper, we propose a spectral-based hypothesis, which states that the goodness of a summary candidate is closely linked to its so-called spectral impact. | Kexiang Wang; Baobao Chang; Zhifang Sui; | |
33 | What Have We Achieved On Text Summarization? Highlight: Aiming to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metric (MQM) and quantify 8 major sources of errors on 10 representative summarization models manually. | Dandan Huang; Leyang Cui; Sen Yang; Guangsheng Bao; Kun Wang; Jun Xie; Yue Zhang; | |
34 | Q-learning With Language Model For Edit-based Unsupervised Summarization Highlight: In this paper, we propose a new approach based on Q-learning with an edit-based summarization. | Ryosuke Kohita; Akifumi Wachi; Yang Zhao; Ryuki Tachibana; | |
35 | Friendly Topic Assistant For Transformer Based Abstractive Summarization Highlight: To this end, we rearrange and explore the semantics learned by a topic model, and then propose a topic assistant (TA) including three modules. | Zhengjue Wang; Zhibin Duan; Hao Zhang; Chaojie Wang; Long Tian; Bo Chen; Mingyuan Zhou; | |
36 | Contrastive Distillation On Intermediate Representations For Language Model Compression Highlight: To achieve better distillation efficacy, we propose Contrastive Distillation on Intermediate Representations (CoDIR), a principled knowledge distillation framework where the student is trained to distill knowledge through intermediate layers of the teacher via a contrastive objective. | Siqi Sun; Zhe Gan; Yuwei Fang; Yu Cheng; Shuohang Wang; Jingjing Liu; | |
37 | TernaryBERT: Distillation-aware Ultra-low Bit BERT Highlight: In this work, we propose TernaryBERT, which ternarizes the weights in a fine-tuned BERT model. | Wei Zhang; Lu Hou; Yichun Yin; Lifeng Shang; Xiao Chen; Xin Jiang; Qun Liu; | |
38 | Self-Supervised Meta-Learning For Few-Shot Natural Language Classification Tasks Highlight: This paper proposes a self-supervised approach to generate a large, rich, meta-learning task distribution from unlabeled text. | Trapit Bansal; Rishikesh Jha; Tsendsuren Munkhdalai; Andrew McCallum; | |
39 | Efficient Meta Lifelong-Learning With Limited Memory Highlight: In this paper, we identify three common principles of lifelong learning methods and propose an efficient meta-lifelong framework that combines them in a synergistic fashion. | Zirui Wang; Sanket Vaibhav Mehta; Barnabas Poczos; Jaime Carbonell; | |
40 | Don’t Use English Dev: On The Zero-Shot Cross-Lingual Evaluation Of Contextual Embeddings Highlight: We show that the standard practice of using English dev accuracy for model selection in the zero-shot setting makes it difficult to obtain reproducible results on the MLDoc and XNLI tasks. | Phillip Keung; Yichao Lu; Julian Salazar; Vikas Bhardwaj; | |
41 | A Supervised Word Alignment Method Based On Cross-Language Span Prediction Using Multilingual BERT Highlight: We present a novel supervised word alignment method based on cross-language span prediction. | Masaaki Nagata; Katsuki Chousa; Masaaki Nishino; | |
42 | Accurate Word Alignment Induction From Neural Machine Translation Highlight: In this paper, we show that attention weights do capture accurate word alignments and propose two novel word alignment induction methods Shift-Att and Shift-AET. | Yun Chen; Yang Liu; Guanhua Chen; Xin Jiang; Qun Liu; | |
43 | ChrEn: Cherokee-English Machine Translation For Endangered Language Revitalization Highlight: To help save this endangered language, we introduce ChrEn, a Cherokee-English parallel dataset, to facilitate machine translation research between Cherokee and English. | Shiyue Zhang; Benjamin Frey; Mohit Bansal; | |
44 | Unsupervised Discovery Of Implicit Gender Bias Highlight: We take an unsupervised approach to identifying gender bias against women at a comment level and present a model that can surface text likely to contain bias. | Anjalie Field; Yulia Tsvetkov; | |
45 | Condolence And Empathy In Online Communities Highlight: Here, we develop computational tools to create a massive dataset of 11.4M expressions of distress and 2.8M corresponding offerings of condolence in order to examine the dynamics of condolence online. | Naitian Zhou; David Jurgens; | |
46 | An Embedding Model For Estimating Legislative Preferences From The Frequency And Sentiment Of Tweets Highlight: In this paper we introduce a method of measuring more specific legislator attitudes using an alternative expression of preferences: tweeting. | Gregory Spell; Brian Guay; Sunshine Hillygus; Lawrence Carin; | |
47 | Measuring Information Propagation In Literary Social Networks Highlight: We describe a new pipeline for measuring information propagation in this domain and publish a new dataset for speaker attribution, enabling the evaluation of an important component of this pipeline on a wider range of literary texts than previously studied. | Matthew Sims; David Bamman; | |
48 | Social Chemistry 101: Learning To Reason About Social And Moral Norms Highlight: We present SOCIAL CHEMISTRY, a new conceptual formalism to study people’s everyday social norms and moral judgments over a rich spectrum of real life situations described in natural language. | Maxwell Forbes; Jena D. Hwang; Vered Shwartz; Maarten Sap; Yejin Choi; | |
49 | Event Extraction By Answering (Almost) Natural Questions Highlight: To avoid this issue, we introduce a new paradigm for event extraction by formulating it as a question answering (QA) task that extracts the event arguments in an end-to-end manner. | Xinya Du; Claire Cardie; | |
50 | Connecting The Dots: Event Graph Schema Induction With Path Language Modeling Highlight: We propose a new Event Graph Schema, where two event types are connected through multiple paths involving entities that fill important roles in a coherent story. | Manling Li; Qi Zeng; Ying Lin; Kyunghyun Cho; Heng Ji; Jonathan May; Nathanael Chambers; Clare Voss; | |
51 | Joint Constrained Learning For Event-Event Relation Extraction Highlight: Due to the lack of jointly labeled data for these relational phenomena and the restriction on the structures they articulate, we propose a joint constrained learning framework for modeling event-event relations. | Haoyu Wang; Muhao Chen; Hongming Zhang; Dan Roth; | |
52 | Incremental Event Detection Via Knowledge Consolidation Networks Highlight: In this paper, we propose a Knowledge Consolidation Network (KCN) to address the above issues. | Pengfei Cao; Yubo Chen; Jun Zhao; Taifeng Wang; | |
53 | Semi-supervised New Event Type Induction And Event Detection Highlight: In this paper, we work on a new task of semi-supervised event type induction, aiming to automatically discover a set of unseen types from a given corpus by leveraging annotations available for a few seen types. | Lifu Huang; Heng Ji; | |
54 | Language Generation With Multi-Hop Reasoning On Commonsense Knowledge Graph Highlight: In this paper, we propose Generation with Multi-Hop Reasoning Flow (GRF) that enables pre-trained models with dynamic multi-hop reasoning on multi-relational paths extracted from the external commonsense knowledge graph. | Haozhe Ji; Pei Ke; Shaohan Huang; Furu Wei; Xiaoyan Zhu; Minlie Huang; | |
55 | Reformulating Unsupervised Style Transfer As Paraphrase Generation Highlight: In this paper, we reformulate unsupervised style transfer as a paraphrase generation problem, and present a simple methodology based on fine-tuning pretrained language models on automatically generated paraphrase data. | Kalpesh Krishna; John Wieting; Mohit Iyyer; | |
56 | De-Biased Court’s View Generation With Causality Highlight: In this paper, we propose a novel Attentional and Counterfactual based Natural Language Generation (AC-NLG) method, consisting of an attentional encoder and a pair of innovative counterfactual decoders. | Yiquan Wu; Kun Kuang; Yating Zhang; Xiaozhong Liu; Changlong Sun; Jun Xiao; Yueting Zhuang; Luo Si; Fei Wu; | |
57 | PAIR: Planning And Iterative Refinement In Pre-trained Transformers For Long Text Generation Highlight: In this work, we present a novel content-controlled text generation framework, PAIR, with planning and iterative refinement, which is built upon a large model, BART. | Xinyu Hua; Lu Wang; | |
58 | Back To The Future: Unsupervised Backprop-based Decoding For Counterfactual And Abductive Commonsense Reasoning Highlight: In this paper, we propose DeLorean, a new unsupervised decoding algorithm that can flexibly incorporate both the past and future contexts using only off-the-shelf, left-to-right language models and no supervision. | Lianhui Qin; Vered Shwartz; Peter West; Chandra Bhagavatula; Jena D. Hwang; Ronan Le Bras; Antoine Bosselut; Yejin Choi; | |
59 | Where Are You? Localization From Embodied Dialog Highlight: In this paper, we focus on the LED task – providing a strong baseline model with detailed ablations characterizing both dataset biases and the importance of various modeling choices. | Meera Hahn; Jacob Krantz; Dhruv Batra; Devi Parikh; James Rehg; Stefan Lee; Peter Anderson; | |
60 | Learning To Represent Image And Text With Denotation Graph Highlight: In this paper, we propose learning representations from a set of implied, visually grounded expressions between image and text, automatically mined from those datasets. | Bowen Zhang; Hexiang Hu; Vihan Jain; Eugene Ie; Fei Sha; | code |
61 | Video2Commonsense: Generating Commonsense Descriptions To Enrich Video Captioning Highlight: We present the first work on generating commonsense captions directly from videos, to describe latent aspects such as intentions, effects, and attributes. | Zhiyuan Fang; Tejas Gokhale; Pratyay Banerjee; Chitta Baral; Yezhou Yang; | |
62 | Does My Multimodal Model Learn Cross-modal Interactions? It’s Harder To Tell Than You Might Think! Highlight: We propose a new diagnostic tool, empirical multimodally-additive function projection (EMAP), for isolating whether or not cross-modal interactions improve performance for a given model on a given task. | Jack Hessel; Lillian Lee; | |
63 | MUTANT: A Training Paradigm For Out-of-Distribution Generalization In Visual Question Answering Highlight: In this paper, we present MUTANT, a training paradigm that exposes the model to perceptually similar, yet semantically distinct mutations of the input, to improve OOD generalization, such as the VQA-CP challenge. | Tejas Gokhale; Pratyay Banerjee; Chitta Baral; Yezhou Yang; | |
64 | Mitigating Gender Bias For Neural Dialogue Generation With Adversarial Learning Highlight: In this paper, we propose a novel adversarial learning framework Debiased-Chat to train dialogue models free from gender bias while keeping their performance. | Haochen Liu; Wentao Wang; Yiqi Wang; Hui Liu; Zitao Liu; Jiliang Tang; | |
65 | Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-Consciousness Highlight: We explore the task of improving persona consistency of dialogue agents. | Hyunwoo Kim; Byeongchang Kim; Gunhee Kim; | |
66 | TOD-BERT: Pre-trained Natural Language Understanding For Task-Oriented Dialogue Highlight: In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling. | Chien-Sheng Wu; Steven C.H. Hoi; Richard Socher; Caiming Xiong; | |
67 | RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset With Rich Semantic Annotations For Task-Oriented Dialogue Modeling Highlight: In order to alleviate the shortage of multi-domain data and to capture discourse phenomena for task-oriented dialogue modeling, we propose RiSAWOZ, a large-scale multi-domain Chinese Wizard-of-Oz dataset with Rich Semantic Annotations. | Jun Quan; Shian Zhang; Qian Cao; Zizhong Li; Deyi Xiong; | |
68 | Filtering Noisy Dialogue Corpora By Connectivity And Content Relatedness Highlight: In this paper, we propose a method for scoring the quality of utterance pairs in terms of their connectivity and relatedness. | Reina Akama; Sho Yokoi; Jun Suzuki; Kentaro Inui; | |
69 | Latent Geographical Factors For Analyzing The Evolution Of Dialects In Contact Highlight: In this paper, we propose a probabilistic generative model that represents latent factors as geographical distributions. | Yugo Murawaki; | |
70 | Predicting Reference: What Do Language Models Learn About Discourse Models? Highlight: We address this question by drawing on a rich psycholinguistic literature that has established how different contexts affect referential biases concerning who is likely to be referred to next. | Shiva Upadhye; Leon Bergen; Andrew Kehler; | |
71 | Word Class Flexibility: A Deep Contextualized Approach Highlight: We propose a principled methodology to explore regularity in word class flexibility. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bai Li; Guillaume Thomas; Yang Xu; Frank Rudzicz; | |
72 | Shallow-to-Deep Training For Neural Machine Translation Highlight: In this paper, we investigate the behavior of a well-tuned deep Transformer system. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bei Li; Ziyang Wang; Hui Liu; Yufan Jiang; Quan Du; Tong Xiao; Huizhen Wang; Jingbo Zhu; | code |
73 | Iterative Refinement In The Continuous Space For Non-Autoregressive Neural Machine Translation Highlight: We propose an efficient inference procedure for non-autoregressive machine translation that iteratively refines translation purely in the continuous space. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jason Lee; Raphael Shu; Kyunghyun Cho; | |
74 | Why Skip If You Can Combine: A Simple Knowledge Distillation Technique For Intermediate Layers Highlight: In this paper, we target low-resource settings and evaluate our translation engines for Portuguese?English, Turkish?English, and English?German directions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yimeng Wu; Peyman Passban; Mehdi Rezagholizadeh; Qun Liu; | |
75 | Multi-task Learning For Multilingual Neural Machine Translation Highlight: In this work, we propose a multi-task learning (MTL) framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yiren Wang; ChengXiang Zhai; Hany Hassan; | |
76 | Token-level Adaptive Training For Neural Machine Translation Highlight: In this paper, we explore target token-level adaptive objectives based on token frequencies to assign appropriate weights for each target token during training. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shuhao Gu; Jinchao Zhang; Fandong Meng; Yang Feng; Wanying Xie; Jie Zhou; Dong Yu; | |
77 | Multi-Unit Transformers For Neural Machine Translation Highlight: In this paper, we propose the Multi-Unit Transformer (MUTE), which aims to promote the expressiveness of the Transformer by introducing diverse and complementary units. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jianhao Yan; Fandong Meng; Jie Zhou; | |
78 | On The Sparsity Of Neural Machine Translation Models Highlight: In response to this problem, we empirically investigate whether the redundant parameters can be reused to achieve better performance. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yong Wang; Longyue Wang; Victor Li; Zhaopeng Tu; | |
79 | Incorporating A Local Translation Mechanism Into Non-autoregressive Translation Highlight: In this work, we introduce a novel local autoregressive translation (LAT) mechanism into non-autoregressive translation (NAT) models so as to capture local dependencies among target outputs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiang Kong; Zhisong Zhang; Eduard Hovy; | |
80 | Self-Paced Learning For Neural Machine Translation Highlight: We make this procedure more flexible by proposing self-paced learning, where the NMT model is allowed to 1) automatically quantify the learning confidence over training examples; and 2) flexibly govern its learning via regulating the loss in each iteration step. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yu Wan; Baosong Yang; Derek F. Wong; Yikai Zhou; Lidia S. Chao; Haibo Zhang; Boxing Chen; | |
81 | Long-Short Term Masking Transformer: A Simple But Effective Baseline For Document-level Neural Machine Translation Highlight: In this paper, we extensively study the pros and cons of the standard Transformer in document-level translation, and find that the auto-regressive property can simultaneously bring both the advantage of consistency and the disadvantage of error accumulation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Pei Zhang; Boxing Chen; Niyu Ge; Kai Fan; | |
82 | Generating Diverse Translation From Model Distribution With Dropout Highlight: In this paper, we propose to generate diverse translations by deriving a large number of possible models with Bayesian modelling and sampling models from them for inference. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xuanfu Wu; Yang Feng; Chenze Shao; | |
83 | Non-Autoregressive Machine Translation With Latent Alignments Highlight: This paper presents two strong methods, CTC and Imputer, for non-autoregressive machine translation that model latent alignments with dynamic programming. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Chitwan Saharia; William Chan; Saurabh Saxena; Mohammad Norouzi; | |
84 | Look At The First Sentence: Position Bias In Question Answering Highlight: In this study, we hypothesize that when the distribution of the answer positions is highly skewed in the training set (e.g., answers lie only in the k-th sentence of each passage), QA models predicting answers as positions can learn spurious positional cues and fail to give answers in different positions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Miyoung Ko; Jinhyuk Lee; Hyunjae Kim; Gangwoo Kim; Jaewoo Kang; | |
85 | ProtoQA: A Question Answering Dataset For Prototypical Common-Sense Reasoning Highlight: This paper introduces a new question answering dataset for training and evaluating common sense reasoning capabilities of artificial intelligence systems in such prototypical situations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Michael Boratko; Xiang Li; Tim O’Gorman; Rajarshi Das; Dan Le; Andrew McCallum; | |
86 | IIRC: A Dataset Of Incomplete Information Reading Comprehension Questions Highlight: To fill this gap, we present a dataset, IIRC, with more than 13K questions over paragraphs from English Wikipedia that provide only partial information to answer them, with the missing information occurring in one or more linked documents. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
James Ferguson; Matt Gardner; Hannaneh Hajishirzi; Tushar Khot; Pradeep Dasigi; | code |
87 | Unsupervised Adaptation Of Question Answering Systems Via Generative Self-training Highlight: In this paper, we investigate the iterative generation of synthetic QA pairs as a way to realize unsupervised self-adaptation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Steven Rennie; Etienne Marcheret; Neil Mallinar; David Nahamoo; Vaibhava Goel; | |
88 | TORQUE: A Reading Comprehension Dataset Of Temporal Ordering Questions Highlight: We introduce TORQUE, a new English reading comprehension benchmark built on 3.2k news snippets with 21k human-generated questions querying temporal relationships. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Qiang Ning; Hao Wu; Rujun Han; Nanyun Peng; Matt Gardner; Dan Roth; | |
89 | ToTTo: A Controlled Table-To-Text Generation Dataset Highlight: We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ankur Parikh; Xuezhi Wang; Sebastian Gehrmann; Manaal Faruqui; Bhuwan Dhingra; Diyi Yang; Dipanjan Das; | |
90 | ENT-DESC: Entity Description Generation By Exploring Knowledge Graph Highlight: In this paper, we introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Liying Cheng; Dekun Wu; Lidong Bing; Yan Zhang; Zhanming Jie; Wei Lu; Luo Si; | |
91 | Small But Mighty: New Benchmarks For Split And Rephrase Highlight: We find that the widely used benchmark dataset universally contains easily exploitable syntactic cues caused by its automatic generation process. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Li Zhang; Huaiyu Zhu; Siddhartha Brahma; Yunyao Li; | |
92 | Online Back-Parsing For AMR-to-Text Generation Highlight: We propose a decoder that back predicts projected AMR graphs on the target sentence during text generation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xuefeng Bai; Linfeng Song; Yue Zhang; | |
93 | Reading Between The Lines: Exploring Infilling In Visual Narratives Highlight: In this paper, we tackle this problem by using infilling techniques involving prediction of missing steps in a narrative while generating textual descriptions from a sequence of images. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Khyathi Raghavi Chandu; Ruo-Ping Dong; Alan W Black; | code |
94 | Acrostic Poem Generation Highlight: We propose a new task in the area of computational creativity: acrostic poem generation in English. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rajat Agarwal; Katharina Kann; | |
95 | Local Additivity Based Data Augmentation For Semi-supervised NER Highlight: In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by interpolating sequences close to each other. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiaao Chen; Zhenghui Wang; Ran Tian; Zichao Yang; Diyi Yang; | code |
96 | Grounded Compositional Outputs For Adaptive Language Modeling Highlight: In this work, we go one step beyond and propose a fully compositional output embedding layer for language models, which is further grounded in information from a structured lexicon (WordNet), namely semantically related words and free-text definitions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nikolaos Pappas; Phoebe Mulcaire; Noah A. Smith; | |
97 | SSMBA: Self-Supervised Manifold Based Data Augmentation For Improving Out-of-Domain Robustness Highlight: We introduce SSMBA, a data augmentation method for generating synthetic training examples by using a pair of corruption and reconstruction functions to move randomly on a data manifold. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nathan Ng; Kyunghyun Cho; Marzyeh Ghassemi; | |
98 | SetConv: A New Approach For Learning From Imbalanced Data Highlight: To address this problem, we propose a set convolution (SetConv) operation and an episodic training strategy to extract a single representative for each class, so that classifiers can later be trained on a balanced class distribution. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yang Gao; Yi-Fan Li; Yu Lin; Charu Aggarwal; Latifur Khan; | |
99 | Scalable Multi-Hop Relational Reasoning For Knowledge-Aware Question Answering Highlight: In this paper, we propose a novel knowledge-aware approach that equips pre-trained language models (PTLMs) with a multi-hop relational reasoning module, named multi-hop graph relation network (MHGRN). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yanlin Feng; Xinyue Chen; Bill Yuchen Lin; Peifeng Wang; Jun Yan; Xiang Ren; | |
100 | Improving Bilingual Lexicon Induction For Low Frequency Words Highlight: This paper designs a Monolingual Lexicon Induction task and observes that two factors accompany the degraded accuracy of bilingual lexicon induction for rare words. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiaji Huang; Xingyu Cai; Kenneth Church; | |
101 | Learning VAE-LDA Models With Rounded Reparameterization Trick Highlight: In this work, we propose a new method, which we call Rounded Reparameterization Trick (RRT), to reparameterize Dirichlet distributions for the learning of VAE-LDA models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Runzhi Tian; Yongyi Mao; Richong Zhang; | |
102 | Calibrated Language Model Fine-Tuning For In- And Out-of-Distribution Data Highlight: To mitigate this issue, we propose a regularized fine-tuning method. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Lingkai Kong; Haoming Jiang; Yuchen Zhuang; Jie Lyu; Tuo Zhao; Chao Zhang; | code |
103 | Scaling Hidden Markov Language Models Highlight: We propose methods for scaling HMMs to massive state spaces while maintaining efficient exact inference, a compact parameterization, and effective regularization. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Justin Chiu; Alexander Rush; | |
104 | Coding Textual Inputs Boosts The Accuracy Of Neural Networks Highlight: As alternatives to a text representation, we introduce Soundex, MetaPhone, NYSIIS, and logograms to NLP, and develop fixed-output-length coding and its extension using Huffman coding. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Abdul Rafae Khan; Jia Xu; Weiwei Sun; | code |
105 | Learning From Task Descriptions Highlight: To take a step toward closing this gap, we introduce a framework for developing NLP systems that solve new tasks after reading their descriptions, synthesizing prior work in this area. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Orion Weller; Nicholas Lourie; Matt Gardner; Matthew Peters; | |
106 | Hashtags, Emotions, And Comments: A Large-Scale Dataset To Understand Fine-Grained Social Emotions To Online Topics Highlight: This paper studies social emotions toward online discussion topics. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Keyang Ding; Jing Li; Yuji Zhang; | |
107 | Named Entity Recognition For Social Media Texts With Semantic Augmentation Highlight: In this paper, we propose a neural-based approach to NER for social media texts where both local (from running text) and augmented semantics are taken into account. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuyang Nie; Yuanhe Tian; Xiang Wan; Yan Song; Bo Dai; | |
108 | Coupled Hierarchical Transformer For Stance-Aware Rumor Verification In Social Media Conversations Highlight: Therefore, in this paper, to extend BERT to obtain thread representations, we first propose a Hierarchical Transformer, which divides each long thread into shorter subthreads, and employs BERT to separately represent each subthread, followed by a global Transformer layer to encode all the subthreads. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jianfei Yu; Jing Jiang; Ling Min Serena Khoo; Hai Leong Chieu; Rui Xia; | |
109 | Social Media Attributions In The Context Of Water Crisis Highlight: In this paper, we explore the viability of using unstructured, noisy social media data to complement traditional surveys through automatically extracting attribution factors. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rupak Sarkar; Sayantan Mahinder; Hirak Sarkar; Ashiqur KhudaBukhsh; | |
110 | On The Reliability And Validity Of Detecting Approval Of Political Actors In Tweets Highlight: In this work, we attempt to gauge the efficacy of untargeted sentiment, targeted sentiment, and stance detection methods in labeling various political actors’ approval by benchmarking them across several datasets. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Indira Sen; Fabian Flöck; Claudia Wagner; | |
111 | Towards Medical Machine Reading Comprehension With Structural Knowledge And Plain Text Highlight: As an effort, we first collect a large-scale medical multi-choice question dataset (more than 21k instances) for the National Licensed Pharmacist Examination in China. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dongfang Li; Baotian Hu; Qingcai Chen; Weihua Peng; Anqi Wang; | |
112 | Generating Radiology Reports Via Memory-driven Transformer Highlight: In this paper, we propose to generate radiology reports with memory-driven Transformer, where a relational memory is designed to record key information of the generation process and a memory-driven conditional layer normalization is applied to incorporating the memory into the decoder of Transformer. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhihong Chen; Yan Song; Tsung-Hui Chang; Xiang Wan; | |
113 | Planning And Generating Natural And Diverse Disfluent Texts As Augmentation For Disfluency Detection Highlight: In this work, we propose a simple Planner-Generator based disfluency generation model to generate natural and diverse disfluent texts as augmented data, where the Planner decides on where to insert disfluent segments and the Generator follows the prediction to generate corresponding disfluent segments. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jingfeng Yang; Diyi Yang; Zhaoran Ma; | |
114 | Predicting Clinical Trial Results By Implicit Evidence Integration Highlight: To optimize the design of clinical trials, we introduce a novel Clinical Trial Result Prediction (CTRP) task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Qiao Jin; Chuanqi Tan; Mosha Chen; Xiaozhong Liu; Songfang Huang; | |
115 | Explainable Clinical Decision Support From Text Highlight: We propose a hierarchical CNN-transformer model with explicit attention as an interpretable, multi-task clinical language model, which achieves an AUROC of 0.75 and 0.78 on sepsis and mortality prediction, respectively. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jinyue Feng; Chantal Shaib; Frank Rudzicz; | |
116 | A Knowledge-driven Generative Model For Multi-implication Chinese Medical Procedure Entity Normalization Highlight: In this paper, we focus on Chinese medical procedure entity normalization. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jinghui Yan; Yining Wang; Lu Xiang; Yu Zhou; Chengqing Zong; | |
117 | Combining Automatic Labelers And Expert Annotations For Accurate Radiology Report Labeling Using BERT Highlight: In this work, we introduce a BERT-based approach to medical image report labeling that exploits both the scale of available rule-based systems and the quality of expert annotations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Akshay Smit; Saahil Jain; Pranav Rajpurkar; Anuj Pareek; Andrew Ng; Matthew Lungren; | |
118 | Benchmarking Meaning Representations In Neural Semantic Parsing Highlight: Upon identifying these gaps, we propose a new unified benchmark on meaning representations, by integrating existing semantic parsing datasets, completing the missing logical forms, and implementing the missing execution engines. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiaqi Guo; Qian Liu; Jian-Guang Lou; Zhenwen Li; Xueqing Liu; Tao Xie; Ting Liu; | code |
119 | Analogous Process Structure Induction For Sub-event Sequence Prediction Highlight: In this paper, we propose an Analogous Process Structure Induction (APSI) framework, which leverages analogies among processes and conceptualization of sub-event instances to predict the whole sub-event sequence of previously unseen open-domain processes. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hongming Zhang; Muhao Chen; Haoyu Wang; Yangqiu Song; Dan Roth; | |
120 | SLM: Learning A Discourse Language Representation With Sentence Unshuffling Highlight: We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Haejun Lee; Drew A. Hudson; Kangwook Lee; Christopher D. Manning; | |
121 | Detecting Fine-Grained Cross-Lingual Semantic Divergences Without Supervision By Learning To Rank Highlight: We introduce a training strategy for multilingual BERT models by learning to rank synthetic divergent examples of varying granularity. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Eleftheria Briakou; Marine Carpuat; | |
122 | A Bilingual Generative Transformer For Semantic Sentence Embedding Highlight: We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
John Wieting; Graham Neubig; Taylor Berg-Kirkpatrick; | |
123 | Semantically Inspired AMR Alignment For The Portuguese Language Highlight: Aiming to fill this gap, we developed an alignment method for the Portuguese language based on a more semantically matched word-concept pair. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rafael Anchiêta; Thiago Pardo; | |
124 | An Unsupervised Sentence Embedding Method By Mutual Information Maximization Highlight: In this paper, we propose a lightweight extension on top of BERT and a novel self-supervised learning objective based on mutual information maximization strategies to derive meaningful sentence embeddings in an unsupervised manner. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yan Zhang; Ruidan He; Zuozhu Liu; Kwan Hui Lim; Lidong Bing; | |
125 | Compositional Phrase Alignment And Beyond Highlight: We address the phrase alignment problem by combining an unordered tree mapping algorithm and phrase representation modelling that explicitly embeds the similarity distribution in the sentences onto powerful contextualized representations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuki Arase; Jun’ichi Tsujii; | |
126 | Table Fact Verification With Structure-Aware Transformer Highlight: To better utilize pre-trained transformers for table representation, we propose a Structure-Aware Transformer (SAT), which injects the table structural information into the mask of the self-attention layer. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hongzhi Zhang; Yingyao Wang; Sirui Wang; Xuezhi Cao; Fuzheng Zhang; Zhongyuan Wang; | |
127 | Double Graph Based Reasoning For Document-level Relation Extraction Highlight: In this paper, we propose Graph Aggregation-and-Inference Network (GAIN), a method to recognize such relations for long paragraphs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shuang Zeng; Runxin Xu; Baobao Chang; Lei Li; | code |
128 | Event Extraction As Machine Reading Comprehension Highlight: In this paper, we propose a new learning paradigm of EE, by explicitly casting it as a machine reading comprehension (MRC) problem. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jian Liu; Yubo Chen; Kang Liu; Wei Bi; Xiaojiang Liu; | |
129 | MAVEN: A Massive General Domain Event Detection Dataset Highlight: To alleviate these problems, we present a MAssive eVENt detection dataset (MAVEN), which contains 4,480 Wikipedia documents, 118,732 event mention instances, and 168 event types. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiaozhi Wang; Ziqi Wang; Xu Han; Wangyi Jiang; Rong Han; Zhiyuan Liu; Juanzi Li; Peng Li; Yankai Lin; Jie Zhou; | code |
130 | Knowledge Graph Alignment With Entity-Pair Embedding Highlight: In this work, we present a new approach that directly learns embeddings of entity-pairs for KG alignment. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhichun Wang; Jinjian Yang; Xiaoju Ye; | |
131 | Adaptive Attentional Network For Few-Shot Knowledge Graph Completion Highlight: This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiawei Sheng; Shu Guo; Zhenyu Chen; Juwei Yue; Lihong Wang; Tingwen Liu; Hongbo Xu; | code |
132 | Pre-training Entity Relation Encoder With Intra-span And Inter-span Information Highlight: In this paper, we integrate span-related information into pre-trained encoder for entity relation extraction task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yijun Wang; Changzhi Sun; Yuanbin Wu; Junchi Yan; Peng Gao; Guotong Xie; | |
133 | Two Are Better Than One: Joint Entity And Relation Extraction With Table-Sequence Encoders Highlight: In this work, we propose novel table-sequence encoders, in which two different encoders, a table encoder and a sequence encoder, are designed to help each other in the representation learning process. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jue Wang; Wei Lu; | |
134 | Beyond [CLS] Through Ranking By Generation Highlight: In this work, we revisit the generative framework for information retrieval and show that our generative approaches are as effective as state-of-the-art semantic similarity-based discriminative models for the answer selection task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Cicero Nogueira dos Santos; Xiaofei Ma; Ramesh Nallapati; Zhiheng Huang; Bing Xiang; | |
135 | Tired Of Topic Models? Clusters Of Pretrained Word Embeddings Make For Fast And Good Topics Too! Highlight: The dominant approach is to use probabilistic topic models that posit a generative story, but in this paper we propose an alternative way to obtain topics: clustering pre-trained word embeddings while incorporating document information for weighted clustering and reranking top words. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Suzanna Sia; Ayush Dalmia; Sabrina J. Mielke; | |
136 | Multi-document Summarization With Maximal Marginal Relevance-guided Reinforcement Learning Highlight: To close the gap, we present RL-MMR, Maximal Marginal Relevance-guided Reinforcement Learning for MDS, which unifies advanced neural SDS methods and statistical measures used in classical MDS. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuning Mao; Yanru Qu; Yiqing Xie; Xiang Ren; Jiawei Han; | |
137 | Improving Neural Topic Models Using Knowledge Distillation Highlight: We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Alexander Miserlis Hoyle; Pranav Goel; Philip Resnik; | |
138 | Short Text Topic Modeling With Topic Distribution Quantization And Negative Sampling Decoder Highlight: In this paper, to address this issue, we propose a novel neural topic model in the framework of autoencoding with a new topic distribution quantization approach generating peakier distributions that are more appropriate for modeling short texts. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiaobao Wu; Chunping Li; Yan Zhu; Yishu Miao; | |
139 | Querying Across Genres For Medical Claims In News Highlight: We present a query-based biomedical information retrieval task across two vastly different genres – newswire and research literature – where the goal is to find the research publication that supports the primary claim made in a health-related news article. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Chaoyuan Zuo; Narayan Acharya; Ritwik Banerjee; | |
140 | Incorporating Multimodal Information In Open-Domain Web Keyphrase Extraction Highlight: In this work, we propose a modeling approach that leverages these multi-modal signals to aid in the KPE task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yansen Wang; Zhen Fan; Carolyn Rose; | |
141 | CMU-MOSEAS: A Multimodal Language Dataset For Spanish, Portuguese, German And French Highlight: As a step towards building more equitable and inclusive multimodal systems, we introduce the first large-scale multimodal language dataset for Spanish, Portuguese, German and French. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
AmirAli Bagher Zadeh; Yansheng Cao; Simon Hessner; Paul Pu Liang; Soujanya Poria; Louis-Philippe Morency; | |
142 | Combining Self-Training And Self-Supervised Learning For Unsupervised Disfluency Detection Highlight: In this work, we explore the unsupervised learning paradigm which can potentially work with unlabeled text corpora that are cheaper and easier to obtain. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shaolei Wang; Zhongyuan Wang; Wanxiang Che; Ting Liu; | |
143 | Multimodal Routing: Improving Local And Global Interpretability Of Multimodal Language Analysis Highlight: In this paper we propose Multimodal Routing, which dynamically adjusts weights between input modalities and output representations differently for each input sample. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yao-Hung Hubert Tsai; Martin Ma; Muqiao Yang; Ruslan Salakhutdinov; Louis-Philippe Morency; | |
144 | Multistage Fusion With Forget Gate For Multimodal Summarization In Open-Domain Videos Highlight: To address these two issues, we propose a multistage fusion network with the fusion forget gate module, which builds upon this approach by modeling fine-grained interactions between the modalities through a multistep fusion schema and controlling the flow of redundant information between multimodal long sequences via a forgetting module. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nayu Liu; Xian Sun; Hongfeng Yu; Wenkai Zhang; Guangluan Xu; | |
145 | BiST: Bi-directional Spatio-Temporal Reasoning For Video-Grounded Dialogues Highlight: To address this drawback, we propose Bi-directional Spatio-Temporal Learning (BiST), a vision-language neural framework for high-resolution queries in videos based on textual cues. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hung Le; Doyen Sahoo; Nancy Chen; Steven C.H. Hoi; | |
146 | UniConv: A Unified Conversational Neural Architecture For Multi-domain Task-oriented Dialogues Highlight: Unlike the existing approaches that are often designed to train each module separately, we propose UniConv – a novel unified neural architecture for end-to-end conversational systems in multi-domain task-oriented dialogues, which is designed to jointly train (i) a Bi-level State Tracker which tracks dialogue states by learning signals at both slot and domain level independently, and (ii) a Joint Dialogue Act and Response Generator which incorporates information from various input components and models dialogue acts and target responses simultaneously. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hung Le; Doyen Sahoo; Chenghao Liu; Nancy Chen; Steven C.H. Hoi; | |
147 | GraphDialog: Integrating Graph Knowledge Into End-to-End Task-Oriented Dialogue Systems Highlight: In this paper, we address these two challenges by exploiting the graph structural information in the knowledge base and in the dependency parsing tree of the dialogue. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shiquan Yang; Rui Zhang; Sarah Erfani; | |
148 | Structured Attention For Unsupervised Dialogue Structure Induction Highlight: In this work, we propose to incorporate structured attention layers into a Variational Recurrent Neural Network (VRNN) model with discrete latent states to learn dialogue structure in an unsupervised fashion. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Liang Qiu; Yizhou Zhao; Weiyan Shi; Yuan Liang; Feng Shi; Tao Yuan; Zhou Yu; Song-Chun Zhu; | |
149 | Cross Copy Network For Dialogue Generation Highlight: In this paper, we propose a novel network architecture – Cross Copy Networks (CCN) to explore the current dialog context and similar dialogue instances’ logical structure simultaneously. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Changzhen Ji; Xin Zhou; Yating Zhang; Xiaozhong Liu; Changlong Sun; Conghui Zhu; Tiejun Zhao; | |
150 | Multi-turn Response Selection Using Dialogue Dependency Relations Highlight: In this paper, we propose a dialogue extraction algorithm to transform a dialogue history into threads based on their dependency relations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Qi Jia; Yizhu Liu; Siyu Ren; Kenny Zhu; Haifeng Tang; | |
151 | Parallel Interactive Networks For Multi-Domain Dialogue State Generation Highlight: In this study, we argue that the incorporation of these dependencies is crucial for the design of MDST and propose Parallel Interactive Networks (PIN) to model these dependencies. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Junfan Chen; Richong Zhang; Yongyi Mao; Jie Xu; | |
152 | SlotRefine: A Fast Non-Autoregressive Model For Joint Intent Detection And Slot Filling Highlight: In this paper, we propose a novel non-autoregressive model named SlotRefine for joint intent detection and slot filling. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Di Wu; Liang Ding; Fan Lu; Jian Xie; | |
153 | An Information Bottleneck Approach For Controlling Conciseness In Rationale Extraction Highlight: In this paper, we show that it is possible to better manage the trade-off between concise explanations and high task accuracy by optimizing a bound on the Information Bottleneck (IB) objective. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bhargavi Paranjape; Mandar Joshi; John Thickstun; Hannaneh Hajishirzi; Luke Zettlemoyer; | |
154 | CrowS-Pairs: A Challenge Dataset For Measuring Social Biases In Masked Language Models Highlight: To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nikita Nangia; Clara Vania; Rasika Bhalerao; Samuel R. Bowman; | |
155 | LOGAN: Local Group Bias Detection By Clustering Highlight: To analyze and detect such local bias, we propose LOGAN, a new bias detection technique based on clustering. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jieyu Zhao; Kai-Wei Chang; | |
156 | RNNs Can Generate Bounded Hierarchical Languages With Optimal Memory Highlight: We introduce Dyck-$(k,m)$, the language of well-nested brackets (of $k$ types) and $m$-bounded nesting depth, reflecting the bounded memory needs and long-distance dependencies of natural language syntax. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
John Hewitt; Michael Hahn; Surya Ganguli; Percy Liang; Christopher D. Manning; | |
157 | Detecting Independent Pronoun Bias With Partially-Synthetic Data Generation Highlight: We introduce a new technique for measuring bias in models, using Bayesian approximations to generate partially-synthetic data from the model itself. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Robert Munro; Alex (Carmen) Morrison; | |
158 | Visually Grounded Continual Learning Of Compositional Phrases Highlight: To study this human-like language acquisition ability, we present VisCOLL, a visually grounded language learning task, which simulates the continual acquisition of compositional phrases from streaming visual scenes. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xisen Jin; Junyi Du; Arka Sadhu; Ram Nevatia; Xiang Ren; | |
159 | MAF: Multimodal Alignment Framework For Weakly-Supervised Phrase Grounding Highlight: Given difficulties in annotating phrase-to-object datasets at scale, we develop a Multimodal Alignment Framework (MAF) to leverage more widely-available caption-image datasets, which can then be used as a form of weak supervision. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Qinxin Wang; Hao Tan; Sheng Shen; Michael Mahoney; Zhewei Yao; | |
160 | Domain-Specific Lexical Grounding In Noisy Visual-Textual Documents Highlight: We present a simple unsupervised clustering-based method that increases precision and recall beyond object detection and image tagging baselines when evaluated on labeled subsets of the dataset. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Gregory Yauney; Jack Hessel; David Mimno; | |
161 | HERO: Hierarchical Encoder For Video+Language Omni-representation Pre-training Highlight: We present HERO, a novel framework for large-scale video+language omni-representation learning. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Linjie Li; Yen-Chun Chen; Yu Cheng; Zhe Gan; Licheng Yu; Jingjing Liu; | |
162 | Vokenization: Improving Language Understanding With Contextualized, Visual-Grounded Supervision Highlight: Therefore, we develop a technique named vokenization that extrapolates multimodal alignments to language-only data by contextually mapping language tokens to their related images (which we call vokens). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hao Tan; Mohit Bansal; | |
163 | Detecting Cross-Modal Inconsistency To Defend Against Neural Fake News Highlight: In this paper, we introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Reuben Tan; Bryan Plummer; Kate Saenko; | |
164 | Enhancing Aspect Term Extraction With Soft Prototypes Highlight: In this paper, we propose to tackle this problem by correlating words with each other through soft prototypes. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhuang Chen; Tieyun Qian; | |
165 | FedED: Federated Learning Via Ensemble Distillation For Medical Relation Extraction Highlight: In this paper, we propose a privacy-preserving medical relation extraction model based on federated learning, which enables training a central model with no single piece of private local data being shared or exchanged. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dianbo Sui; Yubo Chen; Jun Zhao; Yantao Jia; Yuantao Xie; Weijian Sun; | |
166 | Multimodal Joint Attribute Prediction And Value Extraction For E-commerce Product Highlight: In this paper, we propose a multimodal method to jointly predict product attributes and extract values from textual product descriptions with the help of the product images. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tiangang Zhu; Yue Wang; Haoran Li; Youzheng Wu; Xiaodong He; Bowen Zhou; | code |
167 | A Predicate-Function-Argument Annotation Of Natural Language For Open-Domain Information EXpression Highlight: This paper proposes a new pipeline to build OIE systems, where an Open-domain Information eXpression (OIX) task is proposed to provide a platform for all OIE strategies. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Mingming Sun; Wenyue Hua; Zoey Liu; Xin Wang; Kangjie Zheng; Ping Li; | |
168 | Retrofitting Structure-aware Transformer Language Model For End Tasks Highlight: We consider retrofitting structure-aware Transformer language model for facilitating end tasks by proposing to exploit syntactic distance to encode both the phrasal constituency and dependency connection into the language model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hao Fei; Yafeng Ren; Donghong Ji; | |
169 | Lightweight, Dynamic Graph Convolutional Networks For AMR-to-Text Generation Highlight: In this paper, we introduce a dynamic fusion mechanism, proposing Lightweight Dynamic Graph Convolutional Networks (LDGCNs) that capture richer non-local interactions by synthesizing higher order information from the input graphs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yan Zhang; Zhijiang Guo; Zhiyang Teng; Wei Lu; Shay B. Cohen; Zuozhu Liu; Lidong Bing; | |
170 | If Beam Search Is The Answer, What Was The Question? Highlight: We frame beam search as the exact solution to a different decoding objective in order to gain insights into why high probability under a model alone may not indicate adequacy. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Clara Meister; Ryan Cotterell; Tim Vieira; | |
171 | Understanding The Mechanics Of SPIGOT: Surrogate Gradients For Latent Structure Learning Highlight: In this paper, we focus on surrogate gradients, a popular strategy to deal with this problem. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tsvetomila Mihaylova; Vlad Niculae; André F. T. Martins; | |
172 | Is The Best Better? Bayesian Statistical Model Comparison For Natural Language Processing Highlight: We propose a Bayesian statistical model comparison technique which uses k-fold cross-validation across multiple data sets to estimate the likelihood that one model will outperform the other, or that the two will produce practically equivalent results. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Piotr Szymański; Kyle Gorman; | |
173 | Exploring Logically Dependent Multi-task Learning With Causal Inference Highlight: In this paper, we view logically dependent MTL from the perspective of causal inference and suggest a mediation assumption instead of the confounding assumption in conventional MTL models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenqing Chen; Jidong Tian; Liqiang Xiao; Hao He; Yaohui Jin; | |
174 | Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Highlight: We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Mengjie Zhao; Tao Lin; Fei Mi; Martin Jaggi; Hinrich Schütze; | |
175 | Dynamic Context Selection For Document-level Neural Machine Translation Via Reinforcement Learning Highlight: To address this problem, we propose an effective approach to select dynamic context so that the document-level translation model can utilize the more useful selected context sentences to produce better translations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiaomian Kang; Yang Zhao; Jiajun Zhang; Chengqing Zong; | |
176 | Data Rejuvenation: Exploiting Inactive Training Examples For Neural Machine Translation Highlight: In this work, we explore to identify the inactive training examples which contribute less to the model performance, and show that the existence of inactive examples depends on the data distribution. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenxiang Jiao; Xing Wang; Shilin He; Irwin King; Michael Lyu; Zhaopeng Tu; | |
177 | Pronoun-Targeted Fine-tuning For NMT With Hybrid Losses Highlight: We introduce a class of conditional generative-discriminative hybrid losses that we use to fine-tune a trained machine translation model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Prathyusha Jwalapuram; Shafiq Joty; Youlin Shen; | |
178 | Learning Adaptive Segmentation Policy For Simultaneous Translation Highlight: Inspired by human interpreters, we propose a novel adaptive segmentation policy for simultaneous translation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ruiqing Zhang; Chuanqiang Zhang; Zhongjun He; Hua Wu; Haifeng Wang; | |
179 | Learn To Cross-lingual Transfer With Meta Graph Learning Across Heterogeneous Languages Highlight: To address the issues, we propose a meta graph learning (MGL) method. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zheng Li; Mukul Kumar; William Headden; Bing Yin; Ying Wei; Yu Zhang; Qiang Yang; | |
180 | UDapter: Language Adaptation For Truly Universal Dependency Parsing Highlight: To address this, we propose a novel multilingual task adaptation approach based on contextual parameter generation and adapter modules. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ahmet Üstün; Arianna Bisazza; Gosse Bouma; Gertjan van Noord; | |
181 | Uncertainty-Aware Label Refinement For Sequence Labeling Highlight: In this work, we introduce a novel two-stage label decoding framework to model long-term label dependencies, while being much more computationally efficient. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tao Gui; Jiacheng Ye; Qi Zhang; Zhengyan Li; Zichu Fei; Yeyun Gong; Xuanjing Huang; | |
182 | Adversarial Attack And Defense Of Structured Prediction Models Highlight: In this paper, we investigate attacks and defenses for structured prediction tasks in NLP. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenjuan Han; Liwen Zhang; Yong Jiang; Kewei Tu; | |
183 | Position-Aware Tagging For Aspect Sentiment Triplet Extraction Highlight: In this work, we propose the first end-to-end model with a novel position-aware tagging scheme that is capable of jointly extracting the triplets. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Lu Xu; Hao Li; Wei Lu; Lidong Bing; | |
184 | Simultaneous Machine Translation With Visual Context Highlight: In this paper, we seek to understand whether the addition of visual information can compensate for the missing source context. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ozan Caglayan; Julia Ive; Veneta Haralampieva; Pranava Madhyastha; Loïc Barrault; Lucia Specia; | |
185 | XCOPA: A Multilingual Dataset For Causal Commonsense Reasoning Highlight: Motivated by both demands, we introduce Cross-lingual Choice of Plausible Alternatives (XCOPA), a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages, which includes resource-poor languages like Eastern Apurímac Quechua and Haitian Creole. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Edoardo Maria Ponti; Goran Glavaš; Olga Majewska; Qianchu Liu; Ivan Vulić; Anna Korhonen; | |
186 | The Secret Is In The Spectra: Predicting Cross-lingual Task Performance With Spectral Similarity Measures Highlight: In this work we present a large-scale study focused on the correlations between monolingual embedding space similarity and task performance, covering thousands of language pairs and four different tasks: BLI, parsing, POS tagging and MT. We hypothesize that statistics of the spectrum of each monolingual embedding space indicate how well they can be aligned. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Haim Dubossarsky; Ivan Vulić; Roi Reichart; Anna Korhonen; | |
187 | Bridging Linguistic Typology And Multilingual Machine Translation With Multi-View Language Representations Highlight: We propose to fuse both views using singular vector canonical correlation analysis and study what kind of information is induced from each source. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Arturo Oncevay; Barry Haddow; Alexandra Birch; | |
188 | AnswerFact: Fact Checking In Product Question Answering Highlight: To tackle this issue, we investigate predicting the veracity of answers and introduce AnswerFact, a large-scale fact checking dataset from product question answering forums. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenxuan Zhang; Yang Deng; Jing Ma; Wai Lam; | |
189 | Context-Aware Answer Extraction In Question Answering Highlight: To resolve this issue, we propose BLANC (BLock AttentioN for Context prediction) based on two main ideas: context prediction as an auxiliary task in a multi-task learning manner, and a block attention method that learns the context prediction task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yeon Seonwoo; Ji-Hoon Kim; Jung-Woo Ha; Alice Oh; | |
190 | What Do Models Learn From Question Answering Datasets? Highlight: In this paper, we investigate if models are learning reading comprehension from QA datasets by evaluating BERT-based models across five datasets. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Priyanka Sen; Amir Saffari; | code |
191 | Discern: Discourse-Aware Entailment Reasoning Network For Conversational Machine Reading Highlight: In this work, we propose Discern, a discourse-aware entailment reasoning network to strengthen the connection and enhance the understanding of both document and dialog. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yifan Gao; Chien-Sheng Wu; Jingjing Li; Shafiq Joty; Steven C.H. Hoi; Caiming Xiong; Irwin King; Michael Lyu; | code |
192 | A Method For Building A Commonsense Inference Dataset Based On Basic Events Highlight: We present a scalable, low-bias, and low-cost method for building a commonsense inference dataset that combines automatic extraction from a corpus and crowdsourcing. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kazumasa Omura; Daisuke Kawahara; Sadao Kurohashi; | |
193 | Neural Deepfake Detection With Factual Structure Of Text Highlight: To address this, we propose a graph-based model that utilizes the factual structure of a document for deepfake detection of text. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wanjun Zhong; Duyu Tang; Zenan Xu; Ruize Wang; Nan Duan; Ming Zhou; Jiahai Wang; Jian Yin; | |
194 | MultiCQA: Zero-Shot Transfer Of Self-Supervised Text Matching Models On A Massive Scale Highlight: We propose to incorporate self-supervised with supervised multi-task learning on all available source domains. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Andreas Rücklé; Jonas Pfeiffer; Iryna Gurevych; | |
195 | XL-AMR: Enabling Cross-Lingual AMR Parsing With Transfer Learning Techniques Highlight: In this work we tackle these two problems so as to enable cross-lingual AMR parsing: we explore different transfer learning techniques for producing automatic AMR annotations across languages and develop a cross-lingual AMR parser, XL-AMR. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rexhina Blloshmi; Rocco Tripodi; Roberto Navigli; | |
196 | Improving AMR Parsing With Sequence-to-Sequence Pre-training Highlight: In this paper, we focus on sequence-to-sequence (seq2seq) AMR parsing and propose a seq2seq pre-training approach to build pre-trained models in both single and joint way on three relevant tasks, i.e., machine translation, syntactic parsing, and AMR parsing itself. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dongqin Xu; Junhui Li; Muhua Zhu; Min Zhang; Guodong Zhou; | code |
197 | Hate-Speech And Offensive Language Detection In Roman Urdu Highlight: In this study, we: (1) Present a lexicon of hateful words in RU, (2) Develop an annotated dataset called RUHSOLD consisting of 10,012 tweets in RU with both coarse-grained and fine-grained labels of hate-speech and offensive language, (3) Explore the feasibility of transfer learning of five existing embedding models to RU, (4) Propose a novel deep learning architecture called CNN-gram for hate-speech and offensive language detection and compare its performance with seven current baseline approaches on RUHSOLD dataset, and (5) Train domain-specific embeddings on more than 4.7 million tweets and make them publicly available. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hammad Rizwan; Muhammad Haroon Shakeel; Asim Karim; | |
198 | Suicidal Risk Detection For Military Personnel Highlight: We analyze social media for detecting the suicidal risk of military personnel, which is especially crucial for countries with compulsory military service such as the Republic of Korea. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sungjoon Park; Kiwoong Park; Jaimeen Ahn; Alice Oh; | |
199 | Comparative Evaluation Of Label-Agnostic Selection Bias In Multilingual Hate Speech Datasets Highlight: We examine selection bias in hate speech in a language and label independent fashion. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nedjma Ousidhoum; Yangqiu Song; Dit-Yan Yeung; | |
200 | HENIN: Learning Heterogeneous Neural Interaction Networks For Explainable Cyberbullying Detection On Social Media Highlight: In this paper, therefore, we propose a novel deep model, HEterogeneous Neural Interaction Networks (HENIN), for explainable cyberbullying detection. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hsin-Yu Chen; Cheng-Te Li; | |
201 | Reactive Supervision: A New Method For Collecting Sarcasm Data Highlight: We introduce reactive supervision, a novel data collection method that utilizes the dynamics of online conversations to overcome the limitations of existing data collection techniques. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Boaz Shmueli; Lun-Wei Ku; Soumya Ray; | |
202 | Self-Induced Curriculum Learning In Self-Supervised Neural Machine Translation Highlight: In this study, we provide an in-depth analysis of the sampling choices the SSNMT model makes during training. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dana Ruiter; Josef van Genabith; Cristina España-Bonet; | |
203 | Towards Reasonably-Sized Character-Level Transformer NMT By Finetuning Subword Systems Highlight: We show that by initially training a subword model and then finetuning it on characters, we can obtain a neural machine translation model that works at the character level without requiring token segmentation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jindřich Libovický; Alexander Fraser; | |
204 | Transfer Learning And Distant Supervision For Multilingual Transformer Models: A Study On African Languages Highlight: In this work, we study trends in performance for different amounts of available resources for the three African languages Hausa, isiXhosa and Yorùbá on both NER and topic classification. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Michael A. Hedderich; David Adelani; Dawei Zhu; Jesujoba Alabi; Udia Markus; Dietrich Klakow; | |
205 | Translation Quality Estimation By Jointly Learning To Score And Rank Highlight: In order to make use of different types of human evaluation data for supervised learning, we present a multi-task learning QE model that jointly learns two tasks: score a translation and rank two translations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jingyi Zhang; Josef van Genabith; | |
206 | Direct Segmentation Models For Streaming Speech Translation Highlight: This work proposes novel segmentation models for streaming ST that incorporate not only textual, but also acoustic information to decide when the ASR output is split into a chunk. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Javier Iranzo-Sánchez; Adrià Giménez Pastor; Joan Albert Silvestre-Cerdà; Pau Baquero-Arnal; Jorge Civera Saiz; Alfons Juan; | |
207 | Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, And New Datasets For Bengali-English Machine Translation Highlight: In this work, we build a customized sentence segmenter for Bengali and propose two novel methods for parallel corpus creation on low-resource setups: aligner ensembling and batch filtering. We release the segmenter, parallel corpus, and the evaluation set, thus elevating Bengali from its low-resource status. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tahmid Hasan; Abhik Bhattacharjee; Kazi Samin; Masum Hasan; Madhusudan Basak; M. Sohel Rahman; Rifat Shahriyar; | code |
208 | CSP:Code-Switching Pre-training For Neural Machine Translation Highlight: This paper proposes a new pre-training method, called Code-Switching Pre-training (CSP for short) for Neural Machine Translation (NMT). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhen Yang; Bojie Hu; Ambyera Han; Shen Huang; Qi Ju; | |
209 | Type B Reflexivization As An Unambiguous Testbed For Multilingual Multi-Task Gender Bias Highlight: We present a multilingual, multi-task challenge dataset, which spans four languages and four NLP tasks and focuses only on this phenomenon. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ana Valeria González; Maria Barrett; Rasmus Hvingelby; Kellie Webster; Anders Søgaard; | |
210 | Pre-training Multilingual Neural Machine Translation By Leveraging Alignment Information Highlight: We propose mRASP, an approach to pre-train a universal multilingual neural machine translation model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zehui Lin; Xiao Pan; Mingxuan Wang; Xipeng Qiu; Jiangtao Feng; Hao Zhou; Lei Li; | code |
211 | Losing Heads In The Lottery: Pruning Transformer Attention In Neural Machine Translation Highlight: In this paper, we apply the lottery ticket hypothesis to prune heads in the early stages of training. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Maximiliana Behnke; Kenneth Heafield; | |
212 | Towards Enhancing Faithfulness For Neural Machine Translation Highlight: In this paper, we propose a novel training strategy with a multi-task learning paradigm to build a faithfulness enhanced NMT model (named FEnmt). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rongxiang Weng; Heng Yu; Xiangpeng Wei; Weihua Luo; | |
213 | COMET: A Neural Framework For MT Evaluation Highlight: We present COMET, a neural framework for training multilingual machine translation evaluation models which obtains new state-of-the-art levels of correlation with human judgements. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ricardo Rei; Craig Stewart; Ana C Farinha; Alon Lavie; | |
214 | Reusing A Pretrained Language Model On Languages With Limited Corpora For Unsupervised NMT Highlight: We present an effective approach that reuses an LM that is pretrained only on the high-resource language. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Alexandra Chronopoulou; Dario Stojanovski; Alexander Fraser; | |
215 | LNMap: Departures From Isomorphic Assumption In Bilingual Lexicon Induction Through Non-Linear Mapping In Latent Space Highlight: In this work, we propose a novel semi-supervised method to learn cross-lingual word embeddings for BLI. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tasnim Mohiuddin; M Saiful Bari; Shafiq Joty; | |
216 | Uncertainty-Aware Semantic Augmentation For Neural Machine Translation Highlight: To address this problem, we propose uncertainty-aware semantic augmentation, which explicitly captures the universal semantic information among multiple semantically-equivalent source sentences and enhances the hidden representations with this information for better translations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiangpeng Wei; Heng Yu; Yue Hu; Rongxiang Weng; Luxi Xing; Weihua Luo; | |
217 | Can Automatic Post-Editing Improve NMT? Highlight: We hypothesize that APE models have been underperforming in improving NMT translations due to the lack of adequate supervision. To ascertain our hypothesis, we compile a larger corpus of human post-edits of English to German NMT. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shamil Chollampatt; Raymond Hendy Susanto; Liling Tan; Ewa Szymanska; | code |
218 | Parsing Gapping Constructions Based On Grammatical And Semantic Roles Highlight: This paper proposes a method of parsing sentences with gapping to recover elided elements. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yoshihide Kato; Shigeki Matsubara; | |
219 | Span-based Discontinuous Constituency Parsing: A Family Of Exact Chart-based Algorithms With Time Complexities From O(n^6) Down To O(n^3) Highlight: We introduce a novel chart-based algorithm for span-based parsing of discontinuous constituency trees of block degree two, including ill-nested structures. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Caio Corro; | |
220 | Some Languages Seem Easier To Parse Because Their Treebanks Leak Highlight: We compute graph isomorphisms, and show that, treebank size aside, overlap between training and test graphs explains more of the observed variation than standard explanations such as the above. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Anders Søgaard; | |
221 | Discontinuous Constituent Parsing As Sequence Labeling Highlight: This paper reduces discontinuous parsing to sequence labeling. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
David Vilares; Carlos Gómez-Rodríguez; | |
222 | Modularized Syntactic Neural Networks For Sentence Classification Highlight: This paper focuses on tree-based modeling for the sentence classification task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Haiyan Wu; Ying Liu; Shaoyun Shi; | |
223 | TED-CDB: A Large-Scale Chinese Discourse Relation Dataset On TED Talks Highlight: As different genres are known to differ in their communicative properties and as previously, for Chinese, discourse relations have only been annotated over news text, we have created the TED-CDB dataset. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wanqiu Long; Bonnie Webber; Deyi Xiong; | |
224 | QADiscourse – Discourse Relations As QA Pairs: Representation, Crowdsourcing And Baselines Highlight: This paper proposes a novel representation of discourse relations as QA pairs, which in turn allows us to crowd-source wide-coverage data annotated with discourse relations, via an intuitively appealing interface for composing such questions and answers. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Valentina Pyatkin; Ayal Klein; Reut Tsarfaty; Ido Dagan; | |
225 | Discourse Self-Attention For Discourse Element Identification In Argumentative Student Essays Highlight: This paper proposes to adapt self-attention to discourse level for modeling discourse elements in argumentative student essays. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wei Song; Ziyao Song; Ruiji Fu; Lizhen Liu; Miaomiao Cheng; Ting Liu; | |
226 | MEGATRON-CNTRL: Controllable Story Generation With External Knowledge Using Large-Scale Language Models Highlight: In this paper, we propose MEGATRON-CNTRL, a novel framework that uses large-scale language models and adds control to text generation by incorporating an external knowledge base. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Peng Xu; Mostofa Patwary; Mohammad Shoeybi; Raul Puri; Pascale Fung; Anima Anandkumar; Bryan Catanzaro; | |
227 | Incomplete Utterance Rewriting As Semantic Segmentation Highlight: In this paper, we present a novel and extensive approach, which formulates incomplete utterance rewriting as a semantic segmentation task. |
Qian Liu; Bei Chen; Jian-Guang Lou; Bin Zhou; Dongmei Zhang; | |
228 | Improving Grammatical Error Correction Models With Purpose-Built Adversarial Examples Highlight: We propose a method inspired by adversarial training to generate more meaningful and valuable training examples by continually identifying the weak spots of a model, and to enhance the model by gradually adding the generated adversarial examples to the training set. |
Lihao Wang; Xiaoqing Zheng; | |
229 | Homophonic Pun Generation With Lexically Constrained Rewriting Highlight: In this paper, we focus on the task of generating a pun sentence given a pair of homophones. |
Zhiwei Yu; Hongyu Zang; Xiaojun Wan; | |
230 | How To Make Neural Natural Language Generation As Reliable As Templates In Task-Oriented Dialogue Highlight: To overcome this issue, we propose a data augmentation approach which allows us to restrict the output of a network and guarantee reliability. |
Henry Elder; Alexander O’Connor; Jennifer Foster; | |
231 | Multilingual AMR-to-Text Generation Highlight: In this work, we focus on Abstract Meaning Representations (AMRs) as structured input, where previous research has overwhelmingly focused on generating only into English. |
Angela Fan; Claire Gardent; | |
232 | Exploring The Linear Subspace Hypothesis In Gender Bias Mitigation Highlight: In this work, we generalize their method to a kernelized, non-linear version. |
Francisco Vargas; Ryan Cotterell; | |
233 | Lifelong Language Knowledge Distillation Highlight: To address this issue, we present Lifelong Language Knowledge Distillation (L2KD), a simple but efficient method that can be easily applied to existing LLL architectures in order to mitigate the degradation. |
Yung-Sung Chuang; Shang-Yu Su; Yun-Nung Chen; | |
234 | Sparse Parallel Training Of Hierarchical Dirichlet Process Topic Models Highlight: In this work, we study data-parallel training for the hierarchical Dirichlet process (HDP) topic model. |
Alexander Terenin; Måns Magnusson; Leif Jonsson; | |
235 | Multi-label Few/Zero-shot Learning With Knowledge Aggregated From Multiple Label Graphs Highlight: In this paper, we present a simple multi-graph aggregation model that fuses knowledge from multiple label graphs encoding different semantic label relationships in order to study how the aggregated knowledge can benefit multi-label zero/few-shot document classification. |
Jueqing Lu; Lan Du; Ming Liu; Joanna Dipnall; | |
236 | Word Rotator’s Distance Highlight: Accordingly, we propose decoupling word vectors into their norm and direction, then computing the alignment-based similarity with the help of earth mover’s distance (optimal transport), which we refer to as word rotator’s distance. |
Sho Yokoi; Ryo Takahashi; Reina Akama; Jun Suzuki; Kentaro Inui; | code |
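The word rotator’s distance highlight describes a concrete computation: use vector norms as transport mass and the cosine distance between directions as transport cost, then solve an optimal-transport problem. A minimal sketch of that idea, assuming an LP-based earth mover’s distance solver (this is an illustration of the described recipe, not the authors’ implementation):

```python
import numpy as np
from scipy.optimize import linprog

def word_rotators_distance(X, Y):
    """Sketch of a word-rotator-style distance between two sentences,
    given word vectors X (n, d) and Y (m, d)."""
    # Decouple each vector into norm (used as probability mass) and direction.
    nx = np.linalg.norm(X, axis=1)
    ny = np.linalg.norm(Y, axis=1)
    a = nx / nx.sum()
    b = ny / ny.sum()
    Xd = X / nx[:, None]
    Yd = Y / ny[:, None]
    # Transport cost between words = cosine distance of their directions.
    C = 1.0 - Xd @ Yd.T                      # shape (n, m)
    n, m = C.shape
    # Solve the optimal-transport LP: min <T, C> s.t. row sums = a, col sums = b.
    A_eq = []
    for i in range(n):
        row = np.zeros((n, m)); row[i, :] = 1.0
        A_eq.append(row.ravel())
    for j in range(m):
        col = np.zeros((n, m)); col[:, j] = 1.0
        A_eq.append(col.ravel())
    res = linprog(C.ravel(), A_eq=np.array(A_eq),
                  b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.fun
```

Identical sentences get distance 0, and the measure weights each word’s contribution by its vector norm, which is the decoupling the highlight refers to.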
237 | Disentangle-based Continual Graph Representation Learning Highlight: To address this issue, we study the problem of continual graph representation learning, which aims to continually train a GE model on new data to learn incessantly emerging multi-relational data while avoiding catastrophic forgetting of previously learned knowledge. |
Xiaoyu Kou; Yankai Lin; Shaobo Liu; Peng Li; Jie Zhou; Yan Zhang; | code |
238 | Semi-Supervised Bilingual Lexicon Induction With Two-way Interaction Highlight: In this paper, we propose a new semi-supervised BLI framework to encourage the interaction between the supervised signal and unsupervised alignment. |
Xu Zhao; Zihao Wang; Hao Wu; Yong Zhang; | |
239 | Wasserstein Distance Regularized Sequence Representation For Text Matching In Asymmetrical Domains Highlight: In this paper, we propose a novel method tailored for text matching in asymmetrical domains, called WD-Match. |
Weijie Yu; Chen Xu; Jun Xu; Liang Pang; Xiaopeng Gao; Xiaozhao Wang; Ji-Rong Wen; | |
240 | A Simple Approach To Learning Unsupervised Multilingual Embeddings Highlight: In contrast, we propose a simple approach by decoupling the above two sub-problems and solving them separately, one after another, using existing techniques. |
Pratik Jawanpuria; Mayank Meghwanshi; Bamdev Mishra; | |
241 | Bootstrapped Q-learning With Context Relevant Observation Pruning To Generalize In Text-based Games Highlight: To address this issue, we propose Context Relevant Episodic State Truncation (CREST) for irrelevant token removal in observation text for improved generalization. |
Subhajit Chaudhury; Daiki Kimura; Kartik Talamadupula; Michiaki Tatsubori; Asim Munawar; Ryuki Tachibana; | |
242 | BERT-EMD: Many-to-Many Layer Mapping For BERT Compression With Earth Mover’s Distance Highlight: In this paper, we propose a novel BERT distillation method based on many-to-many layer mapping, which allows each intermediate student layer to learn from any intermediate teacher layers. |
Jianquan Li; Xiaokang Liu; Honghong Zhao; Ruifeng Xu; Min Yang; Yaohong Jin; | |
243 | Slot Attention With Value Normalization For Multi-Domain Dialogue State Tracking Highlight: In this paper, we propose a new architecture to cleverly exploit ontology, which consists of Slot Attention (SA) and Value Normalization (VN), referred to as SAVN. |
Yexiang Wang; Yi Guo; Siqi Zhu; | |
244 | Don’t Read Too Much Into It: Adaptive Computation For Open-Domain Question Answering Highlight: To reduce this cost, we propose the use of adaptive computation to control the computational budget allocated for the passages to be read. |
Yuxiang Wu; Sebastian Riedel; Pasquale Minervini; Pontus Stenetorp; | |
245 | Multi-Step Inference For Reasoning Over Paragraphs Highlight: We present a middle ground between these two extremes: a compositional model reminiscent of neural module networks that can perform chained logical reasoning. |
Jiangming Liu; Matt Gardner; Shay B. Cohen; Mirella Lapata; | |
246 | Learning A Cost-Effective Annotation Policy For Question Answering Highlight: As a remedy, we propose a novel framework for annotating QA datasets that entails learning a cost-effective annotation policy and a semi-supervised annotation scheme. |
Bernhard Kratzwald; Stefan Feuerriegel; Huan Sun; | |
247 | Scene Restoring For Narrative Machine Reading Comprehension Highlight: Inspired by this behavior of humans, we propose a method to let the machine imagine a scene while reading a narrative, for better comprehension. |
Zhixing Tian; Yuanzhe Zhang; Kang Liu; Jun Zhao; Yantao Jia; Zhicheng Sheng; | |
248 | A Simple And Effective Model For Answering Multi-span Questions Highlight: In this work, we propose a simple architecture for answering multi-span questions by casting the task as a sequence tagging problem, namely, predicting for each input token whether it should be part of the output or not. |
Elad Segal; Avia Efrat; Mor Shoham; Amir Globerson; Jonathan Berant; | |
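The multi-span QA highlight reduces answer extraction to per-token tagging. A toy sketch of the decoding side of such a formulation, assuming a standard BIO tag set (the paper’s exact tagging scheme and model are not shown here; `decode_spans` is a hypothetical helper):

```python
def decode_spans(tokens, tags):
    """Collect contiguous answer spans from per-token BIO tags."""
    spans, cur = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B":                 # a new span starts here
            if cur:
                spans.append(" ".join(cur))
            cur = [tok]
        elif tag == "I" and cur:       # continue the current span
            cur.append(tok)
        else:                          # "O" (or stray "I") closes any open span
            if cur:
                spans.append(" ".join(cur))
            cur = []
    if cur:
        spans.append(" ".join(cur))
    return spans
```

With tags `O O O B O O B I` over "The capital is Paris and also New York", this yields the two answer spans "Paris" and "New York", which is exactly why tagging handles multi-span questions naturally.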
249 | Top-Rank-Focused Adaptive Vote Collection For The Evaluation Of Domain-Specific Semantic Models Highlight: In this work, we give a threefold contribution to address these requirements: (i) we define a protocol for the construction, based on adaptive pairwise comparisons, of a relatedness-based evaluation dataset tailored to the available resources and optimized to be particularly accurate in top-rank evaluation; (ii) we define appropriate metrics, extensions of well-known ranking correlation coefficients, to evaluate a semantic model via the aforementioned dataset by taking into account the greater significance of top ranks. |
Pierangelo Lombardo; Alessio Boiardi; Luca Colombo; Angelo Schiavone; Nicolò Tamagnone; | |
250 | Meta Fine-Tuning Neural Language Models For Multi-Domain Text Mining Highlight: In this paper, we propose an effective learning procedure named Meta Fine-Tuning (MFT), serving as a meta-learner to solve a group of similar NLP tasks for neural language models. |
Chengyu Wang; Minghui Qiu; Jun Huang; Xiaofeng He; | |
251 | Incorporating Behavioral Hypotheses For Query Generation Highlight: This paper induces these behavioral biases as hypotheses for query generation, where a generic encoder-decoder Transformer framework is presented to aggregate arbitrary hypotheses of choice. |
Ruey-Cheng Chen; Chia-Jung Lee; | |
252 | Conditional Causal Relationships Between Emotions And Causes In Texts Highlight: To address such an issue, we propose a new task of determining whether or not an input pair of emotion and cause has a valid causal relationship under different contexts, and construct a corresponding dataset via manual annotation and negative sampling based on an existing benchmark dataset. |
Xinhong Chen; Qing Li; Jianping Wang; | |
253 | COMETA: A Corpus For Medical Entity Linking In The Social Media Highlight: To address this, we introduce a new corpus called COMETA, consisting of 20k English biomedical entity mentions from Reddit, expert-annotated with links to SNOMED CT, a widely-used medical knowledge graph. |
Marco Basaldella; Fangyu Liu; Ehsan Shareghi; Nigel Collier; | |
254 | Pareto Probing: Trading Off Accuracy For Complexity Highlight: In our contribution to this discussion, we argue, first, for a probe metric that reflects the trade-off between probe complexity and performance: the Pareto hypervolume. |
Tiago Pimentel; Naomi Saphra; Adina Williams; Ryan Cotterell; | |
255 | Interpretation Of NLP Models Through Input Marginalization Highlight: In this study, we raise the out-of-distribution problem induced by the existing interpretation methods and present a remedy; we propose to marginalize each token out. |
Siwon Kim; Jihun Yi; Eunji Kim; Sungroh Yoon; | |
256 | Generating Label Cohesive And Well-Formed Adversarial Claims Highlight: We extend the HotFlip attack algorithm used for universal trigger generation by jointly minimizing the target class loss of a fact checking model and the entailment class loss of an auxiliary natural language inference model. |
Pepa Atanasova; Dustin Wright; Isabelle Augenstein; | code |
257 | Are All Good Word Vector Spaces Isomorphic? Highlight: In this work, we ask whether non-isomorphism is also crucially a sign of degenerate word vector spaces. |
Ivan Vulić; Sebastian Ruder; Anders Søgaard; | |
258 | Cold-Start And Interpretability: Turning Regular Expressions Into Trainable Recurrent Neural Networks Highlight: In this paper, we propose a type of recurrent neural networks called FA-RNNs that combine the advantages of neural networks and regular expression rules. |
Chengyue Jiang; Yinggong Zhao; Shanbo Chu; Libin Shen; Kewei Tu; | |
259 | When BERT Plays The Lottery, All Tickets Are Winning Highlight: For fine-tuned BERT, we show that (a) it is possible to find subnetworks achieving performance that is comparable with that of the full model, and (b) similarly-sized subnetworks sampled from the rest of the model perform worse. |
Sai Prasanna; Anna Rogers; Anna Rumshisky; | |
260 | On The Weak Link Between Importance And Prunability Of Attention Heads Highlight: Given the success of Transformer-based models, two directions of study have emerged: interpreting the role of individual attention heads and down-sizing the models for efficiency. Our work straddles these two streams: we analyse the importance of basing pruning strategies on the interpreted role of the attention heads. |
Aakriti Budhraja; Madhura Pande; Preksha Nema; Pratyush Kumar; Mitesh M. Khapra; | |
261 | Towards Interpreting BERT For Reading Comprehension Based QA Highlight: In this work, we attempt to interpret BERT for RCQA. |
Sahana Ramnath; Preksha Nema; Deep Sahni; Mitesh M. Khapra; | code |
262 | How Do Decisions Emerge Across Layers In Neural Models? Interpretation With Differentiable Masking Highlight: To deal with these challenges, we introduce Differentiable Masking. |
Nicola De Cao; Michael Sejr Schlichtkrull; Wilker Aziz; Ivan Titov; | |
263 | A Diagnostic Study Of Explainability Techniques For Text Classification Highlight: In this paper, we develop a comprehensive list of diagnostic properties for evaluating existing explainability techniques. |
Pepa Atanasova; Jakob Grue Simonsen; Christina Lioma; Isabelle Augenstein; | code |
264 | STL-CQA: Structure-based Transformers With Localization And Encoding For Chart Question Answering Highlight: We propose STL-CQA, which improves question answering over charts through sequential element localization, question encoding and then a structural transformer-based learning approach. |
Hrituraj Singh; Sumit Shekhar; | |
265 | Learning To Contrast The Counterfactual Samples For Robust Visual Question Answering Highlight: Therefore, we introduce a novel self-supervised contrastive learning mechanism to learn the relationship between original samples, factual samples and counterfactual samples. |
Zujie Liang; Weitao Jiang; Haifeng Hu; Jiaying Zhu; | |
266 | Learning Physical Common Sense As Knowledge Graph Completion Via BERT Data Augmentation And Constrained Tucker Factorization Highlight: In this paper, we formulate physical commonsense learning as a knowledge graph completion problem to better use the latent relationships among training samples. |
Zhenjie Zhao; Evangelos Papalexakis; Xiaojuan Ma; | |
267 | A Visually-grounded First-person Dialogue Dataset With Verbal And Non-verbal Responses Highlight: In this paper, we propose a visually-grounded first-person dialogue (VFD) dataset with verbal and non-verbal responses. |
Hisashi Kamezawa; Noriki Nishida; Nobuyuki Shimizu; Takashi Miyazaki; Hideki Nakayama; | |
268 | Cross-Media Keyphrase Prediction: A Unified Framework With Multi-Modality Multi-Head Attention And Image Wordings Highlight: In this work, we explore the joint effects of texts and images in predicting the keyphrases for a multimedia post. |
Yue Wang; Jing Li; Michael Lyu; Irwin King; | |
269 | VD-BERT: A Unified Vision And Dialog Transformer With BERT Highlight: By contrast, in this work, we propose VD-BERT, a simple yet effective framework of unified vision-dialog Transformer that leverages the pretrained BERT language models for Visual Dialog tasks. |
Yue Wang; Shafiq Joty; Michael Lyu; Irwin King; Caiming Xiong; Steven C.H. Hoi; | code |
270 | The Grammar Of Emergent Languages Highlight: In this paper, we consider the syntactic properties of languages that emerge in referential games, using unsupervised grammar induction (UGI) techniques originally designed to analyse natural language. |
Oskar van der Wal; Silvan de Boer; Elia Bruni; Dieuwke Hupkes; | |
271 | Sub-Instruction Aware Vision-and-Language Navigation Highlight: In this work, we focus on the granularity of the visual and language sequences as well as the traceability of agents through the completion of an instruction. |
Yicong Hong; Cristian Rodriguez; Qi Wu; Stephen Gould; | code |
272 | Knowledge-Grounded Dialogue Generation With Pre-trained Language Models Highlight: To leverage the redundant external knowledge under capacity constraint, we propose equipping response generation defined by a pre-trained language model with a knowledge selection module, and an unsupervised approach to jointly optimizing knowledge selection and response generation with unlabeled dialogues. |
Xueliang Zhao; Wei Wu; Can Xu; Chongyang Tao; Dongyan Zhao; Rui Yan; | |
273 | MinTL: Minimalist Transfer Learning For Task-Oriented Dialogue Systems Highlight: In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems and alleviate the over-dependency on annotated data. |
Zhaojiang Lin; Andrea Madotto; Genta Indra Winata; Pascale Fung; | |
274 | Variational Hierarchical Dialog Autoencoder For Dialog State Tracking Data Augmentation Highlight: In this work, we extend this approach to the task of dialog state tracking for goal-oriented dialogs. |
Kang Min Yoo; Hanbit Lee; Franck Dernoncourt; Trung Bui; Walter Chang; Sang-goo Lee; | |
275 | Bridging The Gap Between Prior And Posterior Knowledge Selection For Knowledge-Grounded Dialogue Generation Highlight: Here, we deal with these issues in two respects: (1) We enhance the prior selection module with the necessary posterior information obtained from the specially designed Posterior Information Prediction Module (PIPM); (2) We propose a Knowledge Distillation Based Training Strategy (KDBTS) to train the decoder with the knowledge selected from the prior distribution, removing the exposure bias of knowledge selection. |
Xiuyi Chen; Fandong Meng; Peng Li; Feilong Chen; Shuang Xu; Bo Xu; Jie Zhou; | |
276 | Counterfactual Off-Policy Training For Neural Dialogue Generation Highlight: In this paper, we propose to explore potential responses by counterfactual reasoning. |
Qingfu Zhu; Wei-Nan Zhang; Ting Liu; William Yang Wang; | |
277 | Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data Highlight: To address this data dilemma, we propose a novel data augmentation method for training open-domain dialogue models by utilizing unpaired data. |
Rongsheng Zhang; Yinhe Zheng; Jianzhi Shao; Xiaoxi Mao; Yadong Xi; Minlie Huang; | |
278 | Task-Completion Dialogue Policy Learning Via Monte Carlo Tree Search With Dueling Network Highlight: We introduce a framework of Monte Carlo Tree Search with Double-q Dueling network (MCTS-DDU) for task-completion dialogue policy learning. |
Sihan Wang; Kaijie Zhou; Kunfeng Lai; Jianping Shen; | |
279 | Learning A Simple And Effective Model For Multi-turn Response Generation With Auxiliary Tasks Highlight: In this work, we pursue a model that has a simple structure yet can effectively leverage conversation contexts for response generation. |
Yufan Zhao; Can Xu; Wei Wu; | |
280 | AttnIO: Knowledge Graph Exploration With In-and-Out Attention Flow For Knowledge-Grounded Dialogue Highlight: To this effect, we present AttnIO, a new dialog-conditioned path traversal model that makes full use of the rich structural information in a KG based on two directions of attention flows. |
Jaehun Jung; Bokyung Son; Sungwon Lyu; | |
281 | Amalgamating Knowledge From Two Teachers For Task-oriented Dialogue System With Adversarial Training Highlight: In this paper, we propose a Two-Teacher One-Student learning framework (TTOS) for task-oriented dialogue, with the goal of retrieving accurate KB entities and generating human-like responses simultaneously. |
Wanwei He; Min Yang; Rui Yan; Chengming Li; Ying Shen; Ruifeng Xu; | |
282 | Task-oriented Domain-specific Meta-Embedding For Text Classification Highlight: In this paper, we propose a method to incorporate both domain-specific and task-oriented information into meta-embeddings. |
Xin Wu; Yi Cai; Yang Kai; Tao Wang; Qing Li; | |
283 | Don’t Neglect The Obvious: On The Role Of Unambiguous Words In Word Sense Disambiguation Highlight: In this paper, we propose a simple method to provide annotations for most unambiguous words in a large corpus. |
Daniel Loureiro; Jose Camacho-Collados; | |
284 | Within-Between Lexical Relation Classification Highlight: We propose the novel Within-Between Relation model for recognizing lexical-semantic relations between words. |
Oren Barkan; Avi Caciularu; Ido Dagan; | |
285 | With More Contexts Comes Better Performance: Contextualized Sense Embeddings For All-Round Word Sense Disambiguation Highlight: In this paper we present ARES (context-AwaRe Embeddings of Senses), a semi-supervised approach to producing sense embeddings for the lexical meanings within a lexical knowledge base, which lie in a space comparable to that of contextualized word vectors. |
Bianca Scarlini; Tommaso Pasini; Roberto Navigli; | code |
286 | Convolution Over Hierarchical Syntactic And Lexical Graphs For Aspect Level Sentiment Analysis Highlight: To tackle the above two limitations, we propose a novel architecture which convolves over hierarchical syntactic and lexical graphs. |
Mi Zhang; Tieyun Qian; | |
287 | Multi-Instance Multi-Label Learning Networks For Aspect-Category Sentiment Analysis Highlight: In this paper, we propose a Multi-Instance Multi-Label Learning Network for Aspect-Category sentiment analysis (AC-MIMLLN), which treats sentences as bags, words as instances, and the words indicating an aspect category as the key instances of the aspect category. |
Yuncong Li; Cunxiang Yin; Sheng-hua Zhong; Xu Pan; | |
288 | Aspect Sentiment Classification With Aspect-Specific Opinion Spans Highlight: In this paper, we present a neat and effective structured attention model by aggregating multiple linear-chain CRFs. |
Lu Xu; Lidong Bing; Wei Lu; Fei Huang; | |
289 | Emotion-Cause Pair Extraction As Sequence Labeling Based On A Novel Tagging Scheme Highlight: Targeting this issue, we regard the task as a sequence labeling problem and propose a novel tagging scheme that codes the distance between linked components into the tags, so that emotions and the corresponding causes can be extracted simultaneously. |
Chaofa Yuan; Chuang Fan; Jianzhu Bao; Ruifeng Xu; | |
290 | End-to-End Emotion-Cause Pair Extraction Based On Sliding Window Multi-Label Learning Highlight: To tackle these shortcomings, we propose two joint frameworks for ECPE: 1) multi-label learning for the extraction of the cause clauses corresponding to the specified emotion clause (CMLL) and 2) multi-label learning for the extraction of the emotion clauses corresponding to the specified cause clause (EMLL). |
Zixiang Ding; Rui Xia; Jianfei Yu; | |
291 | Multi-modal Multi-label Emotion Detection With Modality And Label Dependence Highlight: In this paper, we focus on multi-label emotion detection in a multi-modal scenario. |
Dong Zhang; Xincheng Ju; Junhui Li; Shoushan Li; Qiaoming Zhu; Guodong Zhou; | |
292 | Tasty Burgers, Soggy Fries: Probing Aspect Robustness In Aspect-Based Sentiment Analysis Highlight: To solve this problem, we develop a simple but effective approach to enrich ABSA test sets. |
Xiaoyu Xing; Zhijing Jin; Di Jin; Bingning Wang; Qi Zhang; Xuanjing Huang; | code |
293 | Modeling Content Importance For Summarization With Pre-trained Language Models Highlight: In this work, we apply information theory on top of pre-trained language models and define the concept of importance from the perspective of information amount. |
Liqiang Xiao; Lu Wang; Hao He; Yaohui Jin; | |
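The content-importance highlight ties importance to information amount under a language model. One common information-theoretic reading of that idea is total token surprisal; the sketch below is an illustrative assumption, not necessarily the paper’s exact definition, and `surprisal_importance` is a hypothetical helper:

```python
import math

def surprisal_importance(sentence_token_probs):
    """Rank sentences by total surprisal under a language model.

    `sentence_token_probs` maps each sentence to the list of probabilities
    a language model assigned to its tokens. A sentence's information
    amount is taken to be the sum of token surprisals, -log2 p(token).
    """
    scores = {
        sent: sum(-math.log2(p) for p in probs)
        for sent, probs in sentence_token_probs.items()
    }
    # Most informative (highest-surprisal) sentences first.
    return sorted(scores, key=scores.get, reverse=True)
```

Under this reading, sentences the model finds predictable carry little information and rank low, which is the kind of importance signal a summarizer could use for content selection.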
294 | Unsupervised Reference-Free Summary Quality Evaluation Via Contrastive Learning Highlight: In this work, we propose to evaluate summary quality without reference summaries by unsupervised contrastive learning. |
Hanlu Wu; Tengfei Ma; Lingfei Wu; Tariro Manyumwa; Shouling Ji; | |
295 | Neural Extractive Summarization With Hierarchical Attentive Heterogeneous Graph Network Highlight: In this paper, we propose HAHSum (shorthand for Hierarchical Attentive Heterogeneous Graph for Text Summarization), which models different levels of information, including words and sentences, and spotlights redundancy dependencies between sentences. |
Ruipeng Jia; Yanan Cao; Hengzhu Tang; Fang Fang; Cong Cao; Shi Wang; | |
296 | Coarse-to-Fine Query Focused Multi-Document Summarization Highlight: We propose a coarse-to-fine modeling framework which employs progressively more accurate modules for estimating whether text segments are relevant, likely to contain an answer, and central. |
Yumo Xu; Mirella Lapata; | |
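The coarse-to-fine highlight describes a cascade of progressively more accurate filters over text segments. A minimal sketch of that control flow, assuming three caller-supplied scoring functions (the function and parameter names are hypothetical, not the paper’s API):

```python
def coarse_to_fine(segments, relevance, has_answer, centrality, k=(50, 10, 3)):
    """Filter segments through a cascade of increasingly precise scorers.

    Each stage keeps only its top-k segments, so cheaper coarse scorers
    prune the pool before more accurate (and expensive) ones run.
    """
    for scorer, budget in zip((relevance, has_answer, centrality), k):
        segments = sorted(segments, key=scorer, reverse=True)[:budget]
    return segments
```

The design point is the budget schedule: each stage sees only what the previous stage kept, which is what makes progressively more accurate modules affordable over a multi-document input.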
297 | Pre-training For Abstractive Document Summarization By Reinstating Source Text Highlight: This paper presents three sequence-to-sequence pre-training (in shorthand, STEP) objectives which allow us to pre-train a SEQ2SEQ based abstractive summarization model on unlabeled text. |
Yanyan Zou; Xingxing Zhang; Wei Lu; Furu Wei; Ming Zhou; | |
298 | Learning From Context Or Names? An Empirical Study On Neural Relation Extraction Highlight: Based on the analyses, we propose an entity-masked contrastive pre-training framework for RE to gain a deeper understanding of both textual context and type information while avoiding rote memorization of entities or use of superficial cues in mentions. |
Hao Peng; Tianyu Gao; Xu Han; Yankai Lin; Peng Li; Zhiyuan Liu; Maosong Sun; Jie Zhou; | code |
299 | SelfORE: Self-supervised Relational Feature Learning For Open Relation Extraction Highlight: In this work, we propose a self-supervised framework named SelfORE, which exploits weak, self-supervised signals by leveraging a large pretrained language model for adaptive clustering on contextualized relational features, and bootstraps the self-supervised signals by improving contextualized features in relation classification. |
Xuming Hu; Lijie Wen; Yusong Xu; Chenwei Zhang; Philip Yu; | |
300 | Denoising Relation Extraction From Document-level Distant Supervision Highlight: To alleviate this issue, we propose a novel pre-trained model for DocRE, which de-emphasizes noisy DS data via multiple pre-training tasks. |
Chaojun Xiao; Yuan Yao; Ruobing Xie; Xu Han; Zhiyuan Liu; Maosong Sun; Fen Lin; Leyu Lin; | |
301 | Let’s Stop Incorrect Comparisons In End-to-end Relation Extraction! Highlight: In this paper, we first identify several patterns of invalid comparisons in published papers and describe them to avoid their propagation. We then propose a small empirical study to quantify the most common mistake’s impact, and show that it leads to overestimating the final RE performance by around 5% on ACE05. |
Bruno Taillé; Vincent Guigue; Geoffrey Scoutheeten; Patrick Gallinari; | |
302 | Exposing Shallow Heuristics Of Relation Extraction Models With Challenge Data Highlight: We identify failure modes of SOTA relation extraction (RE) models trained on TACRED, which we attribute to limitations in the data annotation process. |
Shachar Rosenman; Alon Jacovi; Yoav Goldberg; | |
303 | Global-to-Local Neural Networks For Document-Level Relation Extraction Highlight: In this paper, we propose a novel model for document-level RE, by encoding the document information in terms of entity global and local representations as well as context relation representations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Difeng Wang; Wei Hu; Ermei Cao; Weijian Sun; | |
304 | Recurrent Interaction Network For Jointly Extracting Entities And Classifying Relations Highlight: As a solution, we design a multi-task learning model which we refer to as recurrent interaction network which allows the learning of interactions dynamically, to effectively model task-specific features for classification. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kai Sun; Richong Zhang; Samuel Mensah; Yongyi Mao; Xudong Liu; | |
305 | Temporal Knowledge Base Completion: New Algorithms And Evaluation Protocols Highlight: In response, we propose improved TKBC evaluation protocols for both link and time prediction tasks, dealing with subtle issues that arise from the partial overlap of time intervals in gold instances and system predictions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Prachi Jain; Sushant Rathi; Mausam; Soumen Chakrabarti; | |
306 | OpenIE6: Iterative Grid Labeling And Coordination Analysis For Open Information Extraction Highlight: In this paper, we bridge this trade-off by presenting an iterative labeling-based system that establishes a new state of the art for OpenIE, while extracting 10x faster. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Keshav Kolluru; Vaibhav Adlakha; Samarth Aggarwal; Mausam; Soumen Chakrabarti; | |
307 | Public Sentiment Drift Analysis Based On Hierarchical Variational Auto-encoder Highlight: In this paper, we focus on distribution learning by proposing a novel Hierarchical Variational Auto-Encoder (HVAE) model to learn better distribution representation, and design a new drift measure to directly evaluate distribution changes between historical data and new data. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenyue Zhang; Xiaoli Li; Yang Li; Suge Wang; Deyu Li; Jian Liao; Jianxing Zheng; | |
308 | Point To The Expression: Solving Algebraic Word Problems Using The Expression-Pointer Transformer Model Highlight: To address each of these two issues, we propose a pure neural model, Expression-Pointer Transformer (EPT), which uses (1) an ‘Expression’ token and (2) operand-context pointers when generating solution equations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bugeun Kim; Kyung Seo Ki; Donggeon Lee; Gahgene Gweon; | |
309 | Semantically-Aligned Universal Tree-Structured Solver For Math Word Problems Highlight: Herein, we propose a simple but efficient method called Universal Expression Tree (UET) to make the first attempt to represent the equations of various MWPs uniformly. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jinghui Qin; Lihui Lin; Xiaodan Liang; Rumin Zhang; Liang Lin; | |
310 | Neural Topic Modeling By Incorporating Document Relationship Graph Highlight: In this paper, we propose Graph Topic Model (GTM), a GNN based neural topic model that represents a corpus as a document relationship graph. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Deyu Zhou; Xuemeng Hu; Rui Wang; | |
311 | Routing Enforced Generative Model For Recipe Generation Highlight: In this work, we propose a routing method to dive into the content selection under the internal restrictions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhiwei Yu; Hongyu Zang; Xiaojun Wan; | |
312 | Assessing The Helpfulness Of Learning Materials With Inference-Based Learner-Like Agent Highlight: Thus, we propose the inference-based learner-like agent to mimic learner behavior and identify good learning materials by examining the agent’s performance. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yun-Hsuan Jen; Chieh-Yang Huang; MeiHua Chen; Ting-Hao Huang; Lun-Wei Ku; | |
313 | Selection And Generation: Learning Towards Multi-Product Advertisement Post Generation Highlight: We propose a novel end-to-end model named S-MG Net to generate the AD post. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhangming Chan; Yuchi Zhang; Xiuying Chen; Shen Gao; Zhiqiang Zhang; Dongyan Zhao; Rui Yan; | |
314 | Form2Seq: A Framework For Higher-Order Form Structure Extraction Highlight: To mitigate this, we propose Form2Seq, a novel sequence-to-sequence (Seq2Seq) inspired framework for structure extraction using text, with a specific focus on forms, which leverages relative spatial arrangement of structures. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Milan Aggarwal; Hiresh Gupta; Mausoom Sarkar; Balaji Krishnamurthy; | |
315 | Domain Adaptation Of Thai Word Segmentation Models Using Stacked Ensemble Highlight: We propose a filter-and-refine solution based on the stacked-ensemble learning paradigm to address this black-box limitation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Peerat Limkonchotiwat; Wannaphong Phatthiyaphaibun; Raheem Sarwar; Ekapol Chuangsuwanich; Sarana Nutanong; | |
316 | DagoBERT: Generating Derivational Morphology With A Pretrained Language Model Highlight: We present the first study investigating this question, taking BERT as the example PLM. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Valentin Hofmann; Janet Pierrehumbert; Hinrich Schütze; | |
317 | Attention Is All You Need For Chinese Word Segmentation Highlight: Taking greedy decoding as given, this work focuses on further strengthening the model itself for Chinese word segmentation (CWS), resulting in an even faster and more accurate CWS model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sufeng Duan; Hai Zhao; | |
318 | A Joint Multiple Criteria Model In Transfer Learning For Cross-domain Chinese Word Segmentation Highlight: To this end, we propose a joint multiple criteria model that shares all parameters to integrate different segmentation criteria into one model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kaiyu Huang; Degen Huang; Zhuang Liu; Fengran Mo; | |
319 | Alignment-free Cross-lingual Semantic Role Labeling Highlight: We propose a cross-lingual SRL model which only requires annotations in a source language and access to raw text in the form of a parallel corpus. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rui Cai; Mirella Lapata; | |
320 | Leveraging Declarative Knowledge In Text And First-Order Logic For Fine-Grained Propaganda Detection Highlight: Instead of merely learning from input-output datapoints in training data, we introduce an approach to inject declarative knowledge of fine-grained propaganda techniques. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ruize Wang; Duyu Tang; Nan Duan; Wanjun Zhong; Zhongyu Wei; Xuanjing Huang; Daxin Jiang; Ming Zhou; | |
321 | X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset Highlight: In this work we propose a method to automatically construct an SRL corpus that is parallel in four languages: English, French, German, Spanish, with unified predicate and role annotations that are fully comparable across languages. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Angel Daza; Anette Frank; | |
322 | Graph Convolutions Over Constituent Trees For Syntax-Aware Semantic Role Labeling Highlight: In contrast, we show how graph convolutional networks (GCNs) can be used to encode constituent structures and inform an SRL system. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Diego Marcheggiani; Ivan Titov; | |
323 | Fast Semantic Parsing With Well-typedness Guarantees Highlight: We describe an A* parser and a transition-based parser for AM dependency parsing which guarantee well-typedness and improve parsing speed by up to 3 orders of magnitude, while maintaining or improving accuracy. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Matthias Lindemann; Jonas Groschwitz; Alexander Koller; | |
324 | Improving Out-of-Scope Detection In Intent Classification By Using Embeddings Of The Word Graph Space Of The Classes Highlight: This paper explores how intent classification can be improved by representing the class labels not as a discrete set of symbols but as a space where the word graphs associated with each class are mapped using typical graph embedding techniques. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Paulo Cavalin; Victor Henrique Alves Ribeiro; Ana Appel; Claudio Pinhanez; | |
325 | Supervised Seeded Iterated Learning For Interactive Language Learning Highlight: Given these observations, we introduce Supervised Seeded Iterated Learning (SSIL) to combine both methods to minimize their respective weaknesses. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuchen Lu; Soumye Singhal; Florian Strub; Olivier Pietquin; Aaron Courville; | |
326 | Spot The Bot: A Robust And Efficient Framework For The Evaluation Of Conversational Dialogue Systems Highlight: In this work, we introduce Spot The Bot, a cost-efficient and robust evaluation framework that replaces human-bot conversations with conversations between bots. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jan Deriu; Don Tuggener; Pius von Däniken; Jon Ander Campos; Alvaro Rodrigo; Thiziri Belkacem; Aitor Soroa; Eneko Agirre; Mark Cieliebak; | |
327 | Human-centric Dialog Training Via Offline Reinforcement Learning Highlight: We address this challenge by developing a novel class of offline RL algorithms. These algorithms use KL-control to penalize divergence from a pre-trained prior language model, and use a new strategy to make the algorithm pessimistic, instead of optimistic, in the face of uncertainty. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Natasha Jaques; Judy Hanwen Shen; Asma Ghandeharioun; Craig Ferguson; Agata Lapedriza; Noah Jones; Shixiang Gu; Rosalind Picard; | |
328 | Speakers Fill Lexical Semantic Gaps With Context Highlight: To investigate whether this is the case, we operationalise the lexical ambiguity of a word as the entropy of meanings it can take, and provide two ways to estimate it: one that requires human annotation (using WordNet) and one that does not (using BERT), making it readily applicable to a large number of languages. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tiago Pimentel; Rowan Hall Maudslay; Damian Blasi; Ryan Cotterell; | |
329 | Investigating Cross-Linguistic Adjective Ordering Tendencies With A Latent-Variable Model Highlight: We present the first purely corpus-driven model of multi-lingual adjective ordering in the form of a latent-variable model that can accurately order adjectives across 24 different languages, even when the training and testing languages are different. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jun Yen Leung; Guy Emerson; Ryan Cotterell; | |
330 | Surprisal Predicts Code-Switching In Chinese-English Bilingual Text Highlight: We describe and model a new dataset of Chinese-English text with 1476 clean code-switched sentences, translated back into Chinese. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jesús Calvillo; Le Fang; Jeremy Cole; David Reitter; | |
331 | Word Frequency Does Not Predict Grammatical Knowledge In Language Models Highlight: In this work, we investigate whether there are systematic sources of variation in the language models’ accuracy. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Charles Yu; Ryan Sie; Nicolas Tedeschi; Leon Bergen; | |
332 | Improving Word Sense Disambiguation With Translations Highlight: In this paper, we present a novel approach that improves the performance of a base WSD system using machine translation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yixing Luan; Bradley Hauer; Lili Mou; Grzegorz Kondrak; | |
333 | Towards Better Context-aware Lexical Semantics: Adjusting Contextualized Representations Through Static Anchors Highlight: In this paper, we present a post-processing technique that enhances these representations by learning a transformation through static anchors. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Qianchu Liu; Diana McCarthy; Anna Korhonen; | |
334 | Compositional Demographic Word Embeddings Highlight: We propose a new form of personalized word embeddings that use demographic-specific word representations derived compositionally from full or partial demographic information for a user (i.e., gender, age, location, religion). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Charles Welch; Jonathan K. Kummerfeld; Verónica Pérez-Rosas; Rada Mihalcea; | |
335 | Do “Undocumented Workers” == “Illegal Aliens”? Differentiating Denotation And Connotation In Vector Spaces Highlight: In this study, we propose an adversarial neural network that decomposes a pretrained representation into independent denotation and connotation representations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Albert Webson; Zhizhong Chen; Carsten Eickhoff; Ellie Pavlick; | |
336 | Multi-View Sequence-to-Sequence Models With Conversational Structure For Abstractive Dialogue Summarization Highlight: This work proposes a multi-view sequence-to-sequence model by first extracting conversational structures of unstructured daily chats from different views to represent conversations and then utilizing a multi-view decoder to incorporate different views to generate dialogue summaries. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiaao Chen; Diyi Yang; | code |
337 | Few-Shot Learning For Opinion Summarization Highlight: In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text with all expected properties, such as writing style, informativeness, fluency, and sentiment preservation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Arthur Bražinskas; Mirella Lapata; Ivan Titov; | |
338 | Learning To Fuse Sentences With Transformers For Summarization Highlight: In this paper, we explore the ability of Transformers to fuse sentences and propose novel algorithms to enhance their ability to perform sentence fusion by leveraging the knowledge of points of correspondence between sentences. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Logan Lebanoff; Franck Dernoncourt; Doo Soon Kim; Lidan Wang; Walter Chang; Fei Liu; | |
339 | Stepwise Extractive Summarization And Planning With Structured Transformers Highlight: We propose encoder-centric stepwise models for extractive summarization using structured transformers – HiBERT and Extended Transformers. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shashi Narayan; Joshua Maynez; Jakub Adamek; Daniele Pighin; Blaz Bratanic; Ryan McDonald; | |
340 | CLIRMatrix: A Massively Large Collection Of Bilingual And Multilingual Datasets For Cross-Lingual Information Retrieval Highlight: We present CLIRMatrix, a massively large collection of bilingual and multilingual datasets for Cross-Lingual Information Retrieval extracted automatically from Wikipedia. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shuo Sun; Kevin Duh; | |
341 | SLEDGE-Z: A Zero-Shot Baseline For COVID-19 Literature Search Highlight: In this work, we present a zero-shot ranking algorithm that adapts to COVID-related scientific literature. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sean MacAvaney; Arman Cohan; Nazli Goharian; | |
342 | Modularized Transformer-based Ranking Framework Highlight: In this work, we modularize the Transformer ranker into separate modules for text representation and interaction. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Luyu Gao; Zhuyun Dai; Jamie Callan; | |
343 | Ad-hoc Document Retrieval Using Weak-Supervision With BERT And GPT2 Highlight: We describe a weakly-supervised method for training deep learning models for the task of ad-hoc document retrieval. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yosi Mass; Haggai Roitman; | |
344 | Adversarial Semantic Collisions Highlight: We develop gradient-based approaches for generating semantic collisions and demonstrate that state-of-the-art models for many tasks which rely on analyzing the meaning and similarity of texts (including paraphrase identification, document retrieval, response suggestion, and extractive summarization) are vulnerable to semantic collisions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Congzheng Song; Alexander Rush; Vitaly Shmatikov; | code |
345 | Learning Explainable Linguistic Expressions With Neural Inductive Logic Programming For Sentence Classification Highlight: We present RuleNN, a neural network architecture for learning transparent models for sentence classification. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Prithviraj Sen; Marina Danilevsky; Yunyao Li; Siddhartha Brahma; Matthias Boehm; Laura Chiticariu; Rajasekar Krishnamurthy; | |
346 | AutoPrompt: Eliciting Knowledge From Language Models With Automatically Generated Prompts Highlight: To address this, we develop AutoPrompt, an automated method to create prompts for a diverse set of tasks, based on a gradient-guided search. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Taylor Shin; Yasaman Razeghi; Robert L. Logan IV; Eric Wallace; Sameer Singh; | |
347 | Learning Variational Word Masks To Improve The Interpretability Of Neural Text Classifiers Highlight: To address this limitation, we propose the variational word mask (VMASK) method to automatically learn task-specific important words and reduce irrelevant information on classification, which ultimately improves the interpretability of model predictions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hanjie Chen; Yangfeng Ji; | |
348 | Sparse Text Generation Highlight: In this paper, we use the recently introduced entmax transformation to train and sample from a natively sparse language model, avoiding this mismatch. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Pedro Henrique Martins; Zita Marinho; André F. T. Martins; | |
349 | PlotMachines: Outline-Conditioned Generation With Dynamic Plot State Tracking Highlight: We present PlotMachines, a neural narrative model that learns to transform an outline into a coherent story by tracking the dynamic plot states. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hannah Rashkin; Asli Celikyilmaz; Yejin Choi; Jianfeng Gao; | |
350 | Do Sequence-to-sequence VAEs Learn Global Features Of Sentences? Highlight: To alleviate this, we investigate alternative architectures based on bag-of-words assumptions and language model pretraining. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tom Bosc; Pascal Vincent; | |
351 | Content Planning For Neural Story Generation With Aristotelian Rescoring Highlight: We utilize a plot-generation language model along with an ensemble of rescoring models that each implement an aspect of good story-writing as detailed in Aristotle’s Poetics. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Seraphina Goldfarb-Tarrant; Tuhin Chakrabarty; Ralph Weischedel; Nanyun Peng; | |
352 | Generating Dialogue Responses From A Semantic Latent Space Highlight: In this work, we hypothesize that the current models are unable to integrate information from multiple semantically similar valid responses of a prompt, resulting in the generation of generic and uninformative responses. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wei-Jen Ko; Avik Ray; Yilin Shen; Hongxia Jin; | |
353 | Refer, Reuse, Reduce: Generating Subsequent References In Visual And Conversational Contexts Highlight: We propose a generation model that produces referring utterances grounded in both the visual and the conversational context. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ece Takmaz; Mario Giulianelli; Sandro Pezzelle; Arabella Sinclair; Raquel Fernández; | |
354 | Visually Grounded Compound PCFGs Highlight: In this work, we study visually grounded grammar induction and learn a constituency parser from both unlabeled text and its visual groundings. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yanpeng Zhao; Ivan Titov; | |
355 | ALICE: Active Learning With Contrastive Natural Language Explanations Highlight: We propose Active Learning with Contrastive Explanations (ALICE), an expert-in-the-loop training framework that utilizes contrastive natural language explanations to improve data efficiency in learning. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Weixin Liang; James Zou; Zhou Yu; | |
356 | Room-Across-Room: Multilingual Vision-and-Language Navigation With Dense Spatiotemporal Grounding Highlight: We introduce Room-Across-Room (RxR), a new Vision-and-Language Navigation (VLN) dataset. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Alexander Ku; Peter Anderson; Roma Patel; Eugene Ie; Jason Baldridge; | |
357 | SSCR: Iterative Language-Based Image Editing Via Self-Supervised Counterfactual Reasoning Highlight: In this paper, we introduce a Self-Supervised Counterfactual Reasoning (SSCR) framework that incorporates counterfactual thinking to overcome data scarcity. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tsu-Jui Fu; Xin Wang; Scott Grafton; Miguel Eckstein; William Yang Wang; | |
358 | Identifying Elements Essential For BERT’s Multilinguality Highlight: We aim to identify architectural properties of BERT and linguistic properties of languages that are necessary for BERT to become multilingual. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Philipp Dufter; Hinrich Schütze; | |
359 | On Negative Interference In Multilingual Models: Findings And A Meta-Learning Treatment Highlight: In this paper, we present the first systematic study of negative interference. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zirui Wang; Zachary C. Lipton; Yulia Tsvetkov; | |
360 | Pre-tokenization Of Multi-word Expressions In Cross-lingual Word Embeddings Highlight: We propose a simple method for word translation of MWEs to and from English in ten languages: we first compile lists of MWEs in each language and then tokenize the MWEs as single tokens before training word embeddings. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Naoki Otani; Satoru Ozaki; Xingyuan Zhao; Yucen Li; Micaelah St Johns; Lori Levin; | |
361 | Monolingual Adapters For Zero-Shot Neural Machine Translation Highlight: We propose a novel adapter layer formalism for adapting multilingual models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jerin Philip; Alexandre Berard; Matthias Gallé; Laurent Besacier; | |
362 | Do Explicit Alignments Robustly Improve Multilingual Encoders? Highlight: In this paper, we propose a new contrastive alignment objective that can better utilize such signal, and examine whether these previous alignment methods can be adapted to noisier sources of aligned data: a randomly sampled 1 million pair subset of the OPUS collection. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shijie Wu; Mark Dredze; | |
363 | From Zero To Hero: On The Limitations Of Zero-Shot Language Transfer With Multilingual Transformers Highlight: In this work, we analyze the limitations of downstream language transfer with MMTs, showing that, much like cross-lingual word embeddings, they are substantially less effective in resource-lean scenarios and for distant languages. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Anne Lauscher; Vinit Ravishankar; Ivan Vulić; Goran Glavaš; | |
364 | Distilling Multiple Domains For Neural Machine Translation Highlight: In this paper, we propose a framework for training a single multi-domain neural machine translation model that is able to translate several domains without increasing inference time or memory usage. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Anna Currey; Prashant Mathur; Georgiana Dinu; | |
365 | Making Monolingual Sentence Embeddings Multilingual Using Knowledge Distillation Highlight: We present an easy and efficient method to extend existing sentence embedding models to new languages. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nils Reimers; Iryna Gurevych; | |
366 | A Streaming Approach For Efficient Batched Beam Search Highlight: We propose an efficient batching strategy for variable-length decoding on GPU architectures. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kevin Yang; Violet Yao; John DeNero; Dan Klein; | |
367 | Improving Multilingual Models With Language-Clustered Vocabularies Highlight: In this work, we introduce a novel procedure for multilingual vocabulary generation that combines the separately trained vocabularies of several automatically derived language clusters, thus balancing the trade-off between cross-lingual subword sharing and language-specific vocabularies. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hyung Won Chung; Dan Garrette; Kiat Chuan Tan; Jason Riesa; | |
368 | Zero-Shot Cross-Lingual Transfer With Meta Learning Highlight: We show that this challenging setup can be approached using meta-learning: in addition to training a source language model, another model learns to select which training instances are the most beneficial to the first. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Farhad Nooralahzadeh; Giannis Bekoulis; Johannes Bjerva; Isabelle Augenstein; | code |
369 | The Multilingual Amazon Reviews Corpus Highlight: We present the Multilingual Amazon Reviews Corpus (MARC), a large-scale collection of Amazon reviews for multilingual text classification. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Phillip Keung; Yichao Lu; György Szarvas; Noah A. Smith; | |
370 | GLUCOSE: GeneraLized And COntextualized Story Explanations Highlight: As a step toward AI systems that can build similar mental models, we introduce GLUCOSE, a large-scale dataset of implicit commonsense causal knowledge, encoded as causal mini-theories about the world, each grounded in a narrative context. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nasrin Mostafazadeh; Aditya Kalyanpur; Lori Moon; David Buchanan; Lauren Berkowitz; Or Biran; Jennifer Chu-Carroll; | |
371 | Character-level Representations Improve DRS-based Semantic Parsing Even In The Age Of BERT Highlight: We combine character-level and contextual language model representations to improve performance on Discourse Representation Structure parsing. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rik van Noord; Antonio Toral; Johan Bos; | |
372 | Infusing Disease Knowledge Into BERT For Health Question Answering, Medical Inference And Disease Name Recognition Highlight: Specifically, we propose a new disease knowledge infusion training procedure and evaluate it on a suite of BERT models including BERT, BioBERT, SciBERT, ClinicalBERT, BlueBERT, and ALBERT. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yun He; Ziwei Zhu; Yin Zhang; Qin Chen; James Caverlee; | |
373 | Unsupervised Commonsense Question Answering With Self-Talk Highlight: We propose an unsupervised framework based on self-talk as a novel alternative to multiple-choice commonsense tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Vered Shwartz; Peter West; Ronan Le Bras; Chandra Bhagavatula; Yejin Choi; | |
374 | Reasoning About Goals, Steps, And Temporal Ordering With WikiHow Highlight: We propose a suite of reasoning tasks on two types of relations between procedural events: goal-step relations (“learn poses” is a step in the larger goal of doing yoga) and step-step temporal relations (“buy a yoga mat” typically precedes “learn poses”). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Li Zhang; Qing Lyu; Chris Callison-Burch; | |
375 | Structural Supervision Improves Few-Shot Learning And Syntactic Generalization In Neural Language Models Highlight: We find that in most cases, the neural models are able to induce the proper syntactic generalizations after minimal exposure, often from just two examples during training, and that the two structurally supervised models generalize more accurately than the LSTM model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ethan Wilcox; Peng Qian; Richard Futrell; Ryosuke Kohita; Roger Levy; Miguel Ballesteros; | |
376 | Investigating Representations Of Verb Bias In Neural Language Models Highlight: Here we introduce DAIS, a large benchmark dataset containing 50K human judgments for 5K distinct sentence pairs in the English dative alternation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Robert Hawkins; Takateru Yamakoshi; Thomas Griffiths; Adele Goldberg; | |
377 | Generating Image Descriptions Via Sequential Cross-Modal Alignment Guided By Human Gaze Highlight: In this paper, we investigate such sequential cross-modal alignment by modelling the image description generation process computationally. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ece Takmaz; Sandro Pezzelle; Lisa Beinborn; Raquel Fernández; | |
378 | Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Highlight: In this paper, we propose the first large-scale language VAE model Optimus (Organizing sentences via Pre-Trained Modeling of a Universal Space). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Chunyuan Li; Xiang Gao; Yuan Li; Baolin Peng; Xiujun Li; Yizhe Zhang; Jianfeng Gao; | |
379 | BioMegatron: Larger Biomedical Domain Language Model Highlight: We empirically study and evaluate several factors that can affect performance on domain language applications, such as the sub-word vocabulary set, model size, pre-training corpus, and domain transfer. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hoo-Chang Shin; Yang Zhang; Evelina Bakhturina; Raul Puri; Mostofa Patwary; Mohammad Shoeybi; Raghav Mani; | |
380 | Text Segmentation By Cross Segment Attention Highlight: In this work, we propose three transformer-based architectures and provide comprehensive comparisons with previously proposed approaches on three standard datasets. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Michal Lukasik; Boris Dadachev; Kishore Papineni; Gonçalo Simões; | |
381 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark Highlight: In this paper, we introduce an advanced Russian general language understanding evaluation benchmark – Russian SuperGLUE. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tatiana Shavrina; Alena Fenogenova; Emelyanov Anton; Denis Shevelev; Ekaterina Artemova; Valentin Malykh; Vladislav Mikhailov; Maria Tikhonova; Andrey Chertok; Andrey Evlampiev; | |
382 | An Empirical Study Of Pre-trained Transformers For Arabic Information Extraction Highlight: In this paper, we pre-train a customized bilingual BERT, dubbed GigaBERT, that is designed specifically for Arabic NLP and English-to-Arabic zero-shot transfer learning. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wuwei Lan; Yang Chen; Wei Xu; Alan Ritter; | code |
383 | TNT: Text Normalization Based Pre-training Of Transformers For Content Moderation Highlight: In this work, we present a new language pre-training model TNT (Text Normalization based pre-training of Transformers) for content moderation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Fei Tan; Yifan Hu; Changwei Hu; Keqian Li; Kevin Yen; | |
384 | Methods For Numeracy-Preserving Word Embeddings Highlight: We propose a new methodology to assign and learn embeddings for numbers. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dhanasekar Sundararaman; Shijing Si; Vivek Subramanian; Guoyin Wang; Devamanyu Hazarika; Lawrence Carin; | |
385 | An Empirical Investigation Of Contextualized Number Prediction Highlight: Specifically, we introduce a suite of output distribution parameterizations that incorporate latent variables to add expressivity and better fit the natural distribution of numeric values in running text, and combine them with both recurrent and transformer-based encoder architectures. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Taylor Berg-Kirkpatrick; Daniel Spokoyny; | |
386 | Modeling The Music Genre Perception Across Language-Bound Cultures Highlight: In this work, we study the feasibility of obtaining relevant cross-lingual, culture-specific music genre annotations based only on language-specific semantic representations, namely distributed concept embeddings and ontologies. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Elena V. Epure; Guillaume Salha; Manuel Moussallam; Romain Hennequin; | |
387 | Joint Estimation And Analysis Of Risk Behavior Ratings In Movie Scripts Highlight: To address this limitation, we propose a model that estimates content ratings based on the language use in movie scripts, making our solution available at the earlier stages of creative production. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Victor Martinez; Krishna Somandepalli; Yalda Tehranian-Uhls; Shrikanth Narayanan; | |
388 | Keep It Surprisingly Simple: A Simple First Order Graph Based Parsing Model For Joint Morphosyntactic Parsing In Sanskrit Highlight: We propose a graph-based model for joint morphological parsing and dependency parsing in Sanskrit. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Amrith Krishna; Ashim Gupta; Deepak Garasangi; Pavankumar Satuluri; Pawan Goyal; | |
389 | Unsupervised Parsing Via Constituency Tests Highlight: We propose a method for unsupervised parsing based on the linguistic notion of a constituency test. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Steven Cao; Nikita Kitaev; Dan Klein; | |
390 | Please Mind The Root: Decoding Arborescences For Dependency Parsing Highlight: We analyzed the output of state-of-the-art parsers on many languages from the Universal Dependency Treebank: although these parsers are often able to learn that trees which violate the constraint should be assigned lower probabilities, their ability to do so unsurprisingly degrades as the size of the training set decreases. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ran Zmigrod; Tim Vieira; Ryan Cotterell; | |
391 | Unsupervised Cross-Lingual Part-of-Speech Tagging For Truly Low-Resource Scenarios Highlight: We describe a fully unsupervised cross-lingual transfer approach for part-of-speech (POS) tagging under a truly low resource scenario. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ramy Eskander; Smaranda Muresan; Michael Collins; | |
392 | Unsupervised Parsing With S-DIORA: Single Tree Encoding For Deep Inside-Outside Recursive Autoencoders Highlight: In this paper, we discover that while DIORA exhaustively encodes all possible binary trees of a sentence with a soft dynamic program, its vector averaging approach is locally greedy and cannot recover from errors when computing the highest scoring parse tree in bottom-up chart parsing. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Andrew Drozdov; Subendhu Rongali; Yi-Pei Chen; Tim O’Gorman; Mohit Iyyer; Andrew McCallum; | |
393 | Utility Is In The Eye Of The User: A Critique Of NLP Leaderboards Highlight: In this opinion paper, we study the divergence between what is incentivized by leaderboards and what is useful in practice through the lens of microeconomic theory. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kawin Ethayarajh; Dan Jurafsky; | |
394 | An Empirical Investigation Towards Efficient Multi-Domain Language Model Pre-training Highlight: In this paper we conduct an empirical investigation into known methods to mitigate CF. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kristjan Arumae; Qing Sun; Parminder Bhatia; | |
395 | Analyzing Individual Neurons In Pre-trained Language Models Highlight: We found small subsets of neurons to predict linguistic tasks, with lower level tasks (such as morphology) localized in fewer neurons, compared to higher level task of predicting syntax. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nadir Durrani; Hassan Sajjad; Fahim Dalvi; Yonatan Belinkov; | |
396 | Dissecting Span Identification Tasks With Performance Prediction Highlight: Our contributions are: (a) we identify key properties of span ID tasks that can inform performance prediction; (b) we carry out a large-scale experiment on English data, building a model to predict performance for unseen span ID tasks that can support architecture choices; (c) we investigate the parameters of the meta model, yielding new insights on how model and task properties interact to affect span ID performance. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sean Papay; Roman Klinger; Sebastian Padó; | |
397 | Assessing Phrasal Representation And Composition In Transformers Highlight: In this paper, we present systematic analysis of phrasal representations in state-of-the-art pre-trained transformers. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Lang Yu; Allyson Ettinger; | |
398 | Analyzing Redundancy In Pretrained Transformer Models Highlight: In this paper, we study the cause of these limitations by defining a notion of Redundancy, which we categorize into two classes: General Redundancy and Task-specific Redundancy. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Fahim Dalvi; Hassan Sajjad; Nadir Durrani; Yonatan Belinkov; | |
399 | Be More With Less: Hypergraph Attention Networks For Inductive Text Classification Highlight: To address those issues, in this paper, we propose a principled model – hypergraph attention networks (HyperGAT), which can obtain more expressive power with less computational consumption for text representation learning. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kaize Ding; Jianling Wang; Jundong Li; Dingcheng Li; Huan Liu; | |
400 | Entities As Experts: Sparse Memory Access With Entity Supervision Highlight: We introduce a new model-Entities as Experts (EaE)-that can access distinct memories of the entities mentioned in a piece of text. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Thibault Févry; Livio Baldini Soares; Nicholas FitzGerald; Eunsol Choi; Tom Kwiatkowski; | |
401 | H2KGAT: Hierarchical Hyperbolic Knowledge Graph Attention Network Highlight: To fill this gap, in this paper, we propose Hierarchical Hyperbolic Knowledge Graph Attention Network (H2KGAT), a novel knowledge graph embedding framework, which is able to better model and infer hierarchical relation patterns. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shen Wang; Xiaokai Wei; Cicero Nogueira dos Santos; Zhiguo Wang; Ramesh Nallapati; Andrew Arnold; Bing Xiang; Philip S. Yu; | |
402 | Does The Objective Matter? Comparing Training Objectives For Pronoun Resolution Highlight: In this work, we make a fair comparison of the performance and seed-wise stability of four models that represent the four categories of objectives. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yordan Yordanov; Oana-Maria Camburu; Vid Kocijan; Thomas Lukasiewicz; | |
403 | On Losses For Modern Language Models Highlight: In this paper, we 1) clarify NSP’s effect on BERT pre-training, 2) explore fourteen possible auxiliary pre-training tasks, of which seven are novel to modern language models, and 3) investigate different ways to include multiple tasks into pre-training. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Stéphane Aroca-Ouellette; Frank Rudzicz; | |
404 | We Can Detect Your Bias: Predicting The Political Ideology Of News Articles Highlight: From a modeling perspective, we propose an adversarial media adaptation, as well as a specially adapted triplet loss. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ramy Baly; Giovanni Da San Martino; James Glass; Preslav Nakov; | |
405 | Semantic Label Smoothing For Sequence To Sequence Problems Highlight: Unlike these works, in this paper, we propose a technique that smooths over \textit{well formed} relevant sequences that not only have sufficient n-gram overlap with the target sequence, but are also \textit{semantically similar}. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Michal Lukasik; Himanshu Jain; Aditya Menon; Seungyeon Kim; Srinadh Bhojanapalli; Felix Yu; Sanjiv Kumar; | |
406 | Training For Gibbs Sampling On Conditional Random Fields With Neural Scoring Factors Highlight: In this work, we present an approach for efficiently training and decoding hybrids of graphical models and neural networks based on Gibbs sampling. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sida Gao; Matthew R. Gormley; | |
407 | Multilevel Text Alignment With Cross-Document Attention Highlight: We propose a new learning approach that equips previously established hierarchical attention encoders for representing documents with a cross-document attention component, enabling structural comparisons across different levels (document-to-document and sentence-to-document). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xuhui Zhou; Nikolaos Pappas; Noah A. Smith; | |
408 | Conversational Semantic Parsing Highlight: In this paper, we propose a semantic representation for such task-oriented conversational systems that can represent concepts such as co-reference and context carryover, enabling comprehensive understanding of queries in a session. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Armen Aghajanyan; Jean Maillard; Akshat Shrivastava; Keith Diedrick; Michael Haeger; Haoran Li; Yashar Mehdad; Veselin Stoyanov; Anuj Kumar; Mike Lewis; Sonal Gupta; | |
409 | Probing Task-Oriented Dialogue Representation From Language Models Highlight: The goals of this empirical paper are to 1) investigate probing techniques, especially from the unsupervised mutual information aspect, 2) provide guidelines of pre-trained language model selection for the dialogue research community, 3) find insights of pre-training factors for dialogue application that may be the key to success. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Chien-Sheng Wu; Caiming Xiong; | |
410 | End-to-End Slot Alignment And Recognition For Cross-Lingual NLU Highlight: In this work, we propose a novel end-to-end model that learns to align and predict target slot labels jointly for cross-lingual transfer. We introduce MultiATIS++, a new multilingual NLU corpus that extends the Multilingual ATIS corpus to nine languages across four language families, and evaluate our method using the corpus. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Weijia Xu; Batool Haider; Saab Mansour; | |
411 | Discriminative Nearest Neighbor Few-Shot Intent Detection By Transferring Natural Language Inference Highlight: In this paper, we present a simple yet effective approach, discriminative nearest neighbor classification with deep self-attention. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jianguo Zhang; Kazuma Hashimoto; Wenhao Liu; Chien-Sheng Wu; Yao Wan; Philip Yu; Richard Socher; Caiming Xiong; | code |
412 | Simple Data Augmentation With The Mask Token Improves Domain Adaptation For Dialog Act Tagging Highlight: In this work, we investigate how to better adapt DA taggers to desired target domains with only unlabeled data. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Semih Yavuz; Kazuma Hashimoto; Wenhao Liu; Nitish Shirish Keskar; Richard Socher; Caiming Xiong; | |
413 | Low-Resource Domain Adaptation For Compositional Task-Oriented Semantic Parsing Highlight: In this paper, we focus on adapting task-oriented semantic parsers to low-resource domains, and propose a novel method that outperforms a supervised neural model at a 10-fold data reduction. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xilun Chen; Asish Ghoshal; Yashar Mehdad; Luke Zettlemoyer; Sonal Gupta; | |
414 | Sound Natural: Content Rephrasing In Dialog Systems Highlight: In this paper, we study the problem of rephrasing with messaging as a use case and release a dataset of 3000 pairs of original query and rephrased query. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Arash Einolghozati; Anchit Gupta; Keith Diedrick; Sonal Gupta; | |
415 | Zero-Shot Crosslingual Sentence Simplification Highlight: We propose a zero-shot modeling framework which transfers simplification knowledge from English to another language (for which no parallel simplification corpus exists) while generalizing across languages and tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jonathan Mallinson; Rico Sennrich; Mirella Lapata; | |
416 | Facilitating The Communication Of Politeness Through Fine-Grained Paraphrasing Highlight: In this work, we take the first steps towards automatically assisting people in adjusting their language to a specific communication circumstance. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Liye Fu; Susan Fussell; Cristian Danescu-Niculescu-Mizil; | |
417 | CAT-Gen: Improving Robustness In NLP Models Via Controlled Adversarial Text Generation Highlight: In this work, we present a Controlled Adversarial Text Generation (CAT-Gen) model that, given an input text, generates adversarial texts through controllable attributes that are known to be invariant to task labels. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tianlu Wang; Xuezhi Wang; Yao Qin; Ben Packer; Kang Li; Jilin Chen; Alex Beutel; Ed Chi; | |
418 | Seq2Edits: Sequence Transduction Using Span-level Edit Operations Highlight: We propose Seq2Edits, an open-vocabulary approach to sequence editing for natural language processing (NLP) tasks with a high degree of overlap between input and output texts. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Felix Stahlberg; Shankar Kumar; | |
419 | Controllable Meaning Representation To Text Generation: Linearization And Data Augmentation Strategies Highlight: We study the degree to which neural sequence-to-sequence models exhibit fine-grained controllability when performing natural language generation from a meaning representation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Chris Kedzie; Kathleen McKeown; | |
420 | Blank Language Models Highlight: We propose Blank Language Model (BLM), a model that generates sequences by dynamically creating and filling in blanks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tianxiao Shen; Victor Quach; Regina Barzilay; Tommi Jaakkola; | |
421 | COD3S: Diverse Generation With Discrete Semantic Signatures Highlight: We present COD3S, a novel method for generating semantically diverse sentences using neural sequence-to-sequence (seq2seq) models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nathaniel Weir; João Sedoc; Benjamin Van Durme; | |
422 | Automatic Extraction Of Rules Governing Morphological Agreement Highlight: In this paper, we take steps towards automating this process by devising an automated framework for extracting a first-pass grammatical specification from raw text in a concise, human- and machine-readable format. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Aditi Chaudhary; Antonios Anastasopoulos; Adithya Pratapa; David R. Mortensen; Zaid Sheikh; Yulia Tsvetkov; Graham Neubig; | code |
423 | Tackling The Low-resource Challenge For Canonical Segmentation Highlight: We explore two new models for the task, borrowing from the closely related area of morphological generation: an LSTM pointer-generator and a sequence-to-sequence model with hard monotonic attention trained with imitation learning. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Manuel Mager; Özlem Çetinoğlu; Katharina Kann; | |
424 | IGT2P: From Interlinear Glossed Texts To Paradigms Highlight: We introduce a new task that speeds this process and automatically generates new morphological resources for natural language processing systems: IGT-to-paradigms (IGT2P). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sarah Moeller; Ling Liu; Changbing Yang; Katharina Kann; Mans Hulden; | |
425 | A Computational Approach To Understanding Empathy Expressed In Text-Based Mental Health Support Highlight: In this work, we present a computational approach to understanding how empathy is expressed in online mental health platforms. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ashish Sharma; Adam Miner; David Atkins; Tim Althoff; | |
426 | Modeling Protagonist Emotions For Emotion-Aware Storytelling Highlight: In this paper, we present the first study on modeling the emotional trajectory of the protagonist in neural storytelling. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Faeze Brahman; Snigdha Chaturvedi; | |
427 | Help! Need Advice On Identifying Advice Highlight: We present preliminary models showing that while pre-trained language models are able to capture advice better than rule-based systems, advice identification is challenging, and we identify directions for future research. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Venkata Subrahmanyan Govindarajan; Benjamin Chen; Rebecca Warholic; Katrin Erk; Junyi Jessy Li; | |
428 | Quantifying Intimacy In Language Highlight: Here, we introduce a new computational framework for studying expressions of intimacy in language with an accompanying dataset and deep learning model for accurately predicting the intimacy level of questions (Pearson r = 0.87). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiaxin Pei; David Jurgens; | |
429 | Writing Strategies For Science Communication: Data And Computational Analysis Highlight: We compile a set of writing strategies drawn from a wide range of prescriptive sources and develop an annotation scheme allowing humans to recognize them. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tal August; Lauren Kim; Katharina Reinecke; Noah A. Smith; | |
430 | Weakly Supervised Subevent Knowledge Acquisition Highlight: Acknowledging the scarcity of subevent knowledge, we propose a weakly supervised approach to extract subevent relation tuples from text and build the first large scale subevent knowledge base. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenlin Yao; Zeyu Dai; Maitreyi Ramaswamy; Bonan Min; Ruihong Huang; | |
431 | Biomedical Event Extraction As Sequence Labeling Highlight: We introduce Biomedical Event Extraction as Sequence Labeling (BeeSL), a joint end-to-end neural information extraction model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Alan Ramponi; Rob van der Goot; Rosario Lombardo; Barbara Plank; | |
432 | Annotating Temporal Dependency Graphs Via Crowdsourcing Highlight: We present the construction of a corpus of 500 Wikinews articles annotated with temporal dependency graphs (TDGs) that can be used to train systems to understand temporal relations in text. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiarui Yao; Haoling Qiu; Bonan Min; Nianwen Xue; | |
433 | Introducing A New Dataset For Event Detection In Cybersecurity Texts Highlight: In particular, to facilitate the future research, we introduce a new dataset for this problem, characterizing the manual annotation for 30 important cybersecurity event types and a large dataset size to develop deep learning models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hieu Man Duc Trong; Duc Trong Le; Amir Pouran Ben Veyseh; Thuat Nguyen; Thien Huu Nguyen; | |
434 | CHARM: Inferring Personal Attributes From Conversations Highlight: This paper overcomes this limitation by devising CHARM: a zero-shot learning method that creatively leverages keyword extraction and document retrieval in order to predict attribute values that were never seen during training. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Anna Tigunova; Andrew Yates; Paramita Mirza; Gerhard Weikum; | |
435 | Event Detection: Gate Diversity And Syntactic Importance Scores For Graph Convolution Neural Networks Highlight: In this study, we propose a novel gating mechanism to filter noisy information in the hidden vectors of the GCN models for ED based on the information from the trigger candidate. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Viet Dac Lai; Tuan Ngo Nguyen; Thien Huu Nguyen; | |
436 | Severing The Edge Between Before And After: Neural Architectures For Temporal Ordering Of Events Highlight: In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Miguel Ballesteros; Rishita Anubhai; Shuai Wang; Nima Pourdamghani; Yogarshi Vyas; Jie Ma; Parminder Bhatia; Kathleen McKeown; Yaser Al-Onaizan; | |
437 | How Much Knowledge Can You Pack Into The Parameters Of A Language Model? Highlight: In this short paper, we measure the practical utility of this approach by fine-tuning pre-trained models to answer questions without access to any external context or knowledge. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Adam Roberts; Colin Raffel; Noam Shazeer; | |
438 | EXAMS: A Multi-subject High School Examinations Dataset For Cross-lingual And Multilingual Question Answering Highlight: We propose EXAMS – a new benchmark dataset for cross-lingual and multilingual question answering for high school examinations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Momchil Hardalov; Todor Mihaylov; Dimitrina Zlatkova; Yoan Dinkov; Ivan Koychev; Preslav Nakov; | code |
439 | End-to-End Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Highlight: We propose an end-to-end approach for synthetic QA data generation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Siamak Shakeri; Cicero Nogueira dos Santos; Henghui Zhu; Patrick Ng; Feng Nan; Zhiguo Wang; Ramesh Nallapati; Bing Xiang; | |
440 | Multi-Stage Pre-training For Low-Resource Domain Adaptation Highlight: We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rong Zhang; Revanth Gangi Reddy; Md Arafat Sultan; Vittorio Castelli; Anthony Ferritto; Radu Florian; Efsun Sarioglu Kayi; Salim Roukos; Avi Sil; Todd Ward; | |
441 | ISAAQ – Mastering Textbook Questions With Pre-trained Transformers And Bottom-Up And Top-Down Attention Highlight: For the first time, this paper taps on the potential of transformer language models and bottom-up and top-down attention to tackle the language and visual understanding challenges this task entails. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jose Manuel Gomez-Perez; Raúl Ortega; | |
442 | SubjQA: A Dataset For Subjectivity And Review Comprehension Highlight: We find that subjectivity is an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance than found in previous work on sentiment analysis. We develop a new dataset which allows us to investigate this relationship. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Johannes Bjerva; Nikita Bhutani; Behzad Golshan; Wang-Chiew Tan; Isabelle Augenstein; | code |
443 | Widget Captioning: Generating Natural Language Description For Mobile User Interface Elements Highlight: We propose widget captioning, a novel task for automatically generating language descriptions for UI elements from multimodal input including both the image and the structural representations of user interfaces. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yang Li; Gang Li; Luheng He; Jingjie Zheng; Hong Li; Zhiwei Guan; | |
444 | Unsupervised Natural Language Inference Via Decoupled Multimodal Contrastive Learning Highlight: In this paper, we propose Multimodal Aligned Contrastive Decoupled learning (MACD) network. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wanyun Cui; Guangyu Zheng; Wei Wang; | |
445 | Digital Voicing Of Silent Speech Highlight: In this paper, we consider the task of digitally voicing silent speech, where silently mouthed words are converted to audible speech based on electromyography (EMG) sensor measurements that capture muscle impulses. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
David Gaddy; Dan Klein; | |
446 | Imitation Attacks And Defenses For Black-box Machine Translation Systems Highlight: To mitigate these vulnerabilities, we propose a defense that modifies translation outputs in order to misdirect the optimization of imitation models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Eric Wallace; Mitchell Stern; Dawn Song; | |
447 | Sequence-Level Mixed Sample Data Augmentation Highlight: This work proposes a simple data augmentation approach to encourage compositional behavior in neural models for sequence-to-sequence problems. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Demi Guo; Yoon Kim; Alexander Rush; | |
448 | Consistency Of A Recurrent Language Model With Respect To Incomplete Decoding Highlight: Based on these insights, we propose two remedies which address inconsistency: consistent variants of top-k and nucleus sampling, and a self-terminating recurrent language model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sean Welleck; Ilia Kulikov; Jaedeok Kim; Richard Yuanzhe Pang; Kyunghyun Cho; | |
449 | An Exploration Of Arbitrary-Order Sequence Labeling Via Energy-Based Inference Networks Highlight: In this work, we propose several high-order energy terms to capture complex dependencies among labels in sequence labeling, including several that consider the entire label sequence. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Lifu Tu; Tianyu Liu; Kevin Gimpel; | |
450 | Ensemble Distillation For Structured Prediction: Calibrated, Accurate, Fast—Choose Three Highlight: In this paper, we study \textit{ensemble distillation} as a general framework for producing well-calibrated structured prediction models while avoiding the prohibitive inference-time cost of ensembles. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Steven Reich; David Mueller; Nicholas Andrews; | |
451 | Inducing Target-Specific Latent Structures For Aspect Sentiment Classification Highlight: We propose gating mechanisms to dynamically combine information from word dependency graphs and latent graphs which are learned by self-attention networks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Chenhua Chen; Zhiyang Teng; Yue Zhang; | |
452 | Affective Event Classification With Discourse-enhanced Self-training Highlight: Our research introduces new classification models to assign affective polarity to event phrases. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuan Zhuang; Tianyu Jiang; Ellen Riloff; | |
453 | Deep Weighted MaxSAT For Aspect-based Opinion Extraction Highlight: We adopt the MaxSAT semantics to model logic inference process and smoothly incorporate a weighted version of MaxSAT that connects deep neural networks and a graphical model in a joint framework. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Meixi Wu; Wenya Wang; Sinno Jialin Pan; | |
454 | Multi-view Story Characterization From Movie Plot Synopses And Reviews Highlight: This paper considers the problem of characterizing stories by inferring properties such as theme and style using written synopses and reviews of movies. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sudipta Kar; Gustavo Aguilar; Mirella Lapata; Thamar Solorio; | code |
455 | Mind Your Inflections! Improving NLP For Non-Standard Englishes With Base-Inflection Encoding Highlight: We propose Base-Inflection Encoding (BITE), a method to tokenize English text by reducing inflected words to their base forms before reinjecting the grammatical information as special symbols. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Samson Tan; Shafiq Joty; Lav Varshney; Min-Yen Kan; | |
456 | Measuring The Similarity Of Grammatical Gender Systems By Comparing Partitions Highlight: To quantify the similarity, we define gender systems extensionally, thereby reducing the problem of comparisons between languages’ gender systems to cluster evaluation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Arya D. McCarthy; Adina Williams; Shijia Liu; David Yarowsky; Ryan Cotterell; | |
457 | RethinkCWS: Is Chinese Word Segmentation A Solved Task? Highlight: In this paper, we take stock of what we have achieved and rethink what’s left in the CWS task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jinlan Fu; Pengfei Liu; Qi Zhang; Xuanjing Huang; | code |
458 | Learning To Pronounce Chinese Without A Pronunciation Dictionary Highlight: We demonstrate a program that learns to pronounce Chinese text in Mandarin, without a pronunciation dictionary. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Christopher Chu; Scot Fang; Kevin Knight; | |
459 | Dynamic Anticipation And Completion For Multi-Hop Reasoning Over Sparse Knowledge Graph Highlight: To solve these problems, we propose a multi-hop reasoning model over sparse KGs, by applying novel dynamic anticipation and completion strategies: (1) The anticipation strategy utilizes the latent prediction of embedding-based models to make our model perform more potential path search over sparse KGs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xin Lv; Xu Han; Lei Hou; Juanzi Li; Zhiyuan Liu; Wei Zhang; Yichi Zhang; Hao Kong; Suhui Wu; | code |
460 | Knowledge Association With Hyperbolic Knowledge Graph Embeddings Highlight: We propose a hyperbolic relational graph neural network for KG embedding and capture knowledge associations with a hyperbolic transformation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zequn Sun; Muhao Chen; Wei Hu; Chengming Wang; Jian Dai; Wei Zhang; | |
461 | Domain Knowledge Empowered Structured Neural Net For End-to-End Event Temporal Relation Extraction Highlight: To address these issues, we propose a framework that enhances deep neural network with distributional constraints constructed by probabilistic domain knowledge. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rujun Han; Yichao Zhou; Nanyun Peng; | |
462 | TeMP: Temporal Message Passing For Temporal Knowledge Graph Completion Highlight: We propose the Temporal Message Passing (TeMP) framework to address these challenges by combining graph neural networks, temporal dynamics models, data imputation and frequency-based gating techniques. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiapeng Wu; Meng Cao; Jackie Chi Kit Cheung; William L. Hamilton; | |
463 | Understanding The Difficulty Of Training Transformers Highlight: Our objective here is to understand what complicates Transformer training from both empirical and theoretical perspectives. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Liyuan Liu; Xiaodong Liu; Jianfeng Gao; Weizhu Chen; Jiawei Han; | |
464 | An Empirical Study Of Generation Order For Machine Translation Highlight: In this work, we present an empirical study of generation order for machine translation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
William Chan; Mitchell Stern; Jamie Kiros; Jakob Uszkoreit; | |
465 | Inference Strategies For Machine Translation With Conditional Masking Highlight: We identify a thresholding strategy that has advantages over the standard mask-predict algorithm, and provide analyses of its behavior on machine translation tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Julia Kreutzer; George Foster; Colin Cherry; | |
466 | AmbigQA: Answering Ambiguous Open-domain Questions Highlight: In this paper, we introduce AmbigQA, a new open-domain question answering task which involves finding every plausible answer, and then rewriting the question for each one to resolve the ambiguity. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sewon Min; Julian Michael; Hannaneh Hajishirzi; Luke Zettlemoyer; | code |
467 | Tell Me How To Ask Again: Question Data Augmentation With Controllable Rewriting In Continuous Space Highlight: In this paper, we propose a novel data augmentation method, referred to as Controllable Rewriting based Question Data Augmentation (CRQDA), for machine reading comprehension (MRC), question generation, and question-answering natural language inference tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dayiheng Liu; Yeyun Gong; Jie Fu; Yu Yan; Jiusheng Chen; Jiancheng Lv; Nan Duan; Ming Zhou; | |
468 | Training Question Answering Models From Synthetic Data Highlight: This work aims to narrow this gap by taking advantage of large language models and explores several factors such as model size, quality of pretrained models, scale of data synthesized, and algorithmic choices. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Raul Puri; Ryan Spring; Mohammad Shoeybi; Mostofa Patwary; Bryan Catanzaro; | |
469 | Few-Shot Complex Knowledge Base Question Answering Via Meta Reinforcement Learning Highlight: This paper proposes a meta-reinforcement learning approach to program induction in CQA to tackle the potential distributional bias in questions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuncheng Hua; Yuan-Fang Li; Gholamreza Haffari; Guilin Qi; Tongtong Wu; | code |
470 | Multilingual Offensive Language Identification With Cross-lingual Embeddings Highlight: In this paper, we take advantage of English data available by applying cross-lingual contextual word embeddings and transfer learning to make predictions in languages with less resources. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tharindu Ranasinghe; Marcos Zampieri; | |
471 | Solving Historical Dictionary Codes With A Neural Language Model Highlight: We solve difficult word-based substitution codes by constructing a decoding lattice and searching that lattice with a neural language model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Christopher Chu; Raphael Valenti; Kevin Knight; | |
472 | Toward Micro-Dialect Identification In Diaglossic And Code-Switched Environments Highlight: Inspired by geolocation research, we propose the novel task of Micro-Dialect Identification (MDI) and introduce MARBERT, a new language model with striking abilities to predict a fine-grained variety (as small as that of a city) given a single, short message. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Muhammad Abdul-Mageed; Chiyu Zhang; AbdelRahim Elmadany; Lyle Ungar; | |
473 | Investigating African-American Vernacular English In Transformer-Based Text Generation Highlight: We investigate the performance of GPT-2 on AAVE text by creating a dataset of intent-equivalent parallel AAVE/SAE tweet pairs, thereby isolating syntactic structure and AAVE- or SAE-specific language for each pair. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sophie Groenwold; Lily Ou; Aesha Parekh; Samhita Honnavalli; Sharon Levy; Diba Mirza; William Yang Wang; | |
474 | Iterative Domain-Repaired Back-Translation Highlight: In this paper, we focus on the domain-specific translation with low resources, where in-domain parallel corpora are scarce or nonexistent. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hao-Ran Wei; Zhirui Zhang; Boxing Chen; Weihua Luo; | |
475 | Dynamic Data Selection And Weighting For Iterative Back-Translation Highlight: In this paper, we provide insights into this commonly used approach and generalize it to a dynamic curriculum learning strategy, which is applied to iterative back-translation models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zi-Yi Dou; Antonios Anastasopoulos; Graham Neubig; | |
476 | Revisiting Modularized Multilingual NMT To Meet Industrial Demands Highlight: In this study, we revisit the multilingual neural machine translation model that shares modules only among the same languages (M2) as a practical alternative to 1-1 to satisfy industrial requirements. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sungwon Lyu; Bokyung Son; Kichang Yang; Jaekyoung Bae; | |
477 | LAReQA: Language-Agnostic Answer Retrieval From A Multilingual Pool Highlight: We present LAReQA, a challenging new benchmark for language-agnostic answer retrieval from a multilingual candidate pool. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Uma Roy; Noah Constant; Rami Al-Rfou; Aditya Barua; Aaron Phillips; Yinfei Yang; | code |
478 | OCR Post Correction For Endangered Language Texts Highlight: In this work, we address the task of extracting text from these resources. We create a benchmark dataset of transcriptions for scanned books in three critically endangered languages and present a systematic analysis of how general-purpose OCR tools are not robust to the data-scarce setting of endangered languages. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shruti Rijhwani; Antonios Anastasopoulos; Graham Neubig; | |
479 | X-FACTR: Multilingual Factual Knowledge Retrieval From Pretrained Language Models Highlight: To assess factual knowledge retrieval in LMs in different languages, we create a multilingual benchmark of cloze-style probes for typologically diverse languages. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhengbao Jiang; Antonios Anastasopoulos; Jun Araki; Haibo Ding; Graham Neubig; | code |
480 | CCAligned: A Massive Collection Of Cross-Lingual Web-Document Pairs Highlight: In this paper, we exploit the signals embedded in URLs to label web documents at scale with an average precision of 94.5% across different language pairs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ahmed El-Kishky; Vishrav Chaudhary; Francisco Guzmán; Philipp Koehn; | |
481 | Localizing Open-Ontology QA Semantic Parsers In A Day Using Machine Translation Highlight: We propose Semantic Parser Localizer (SPL), a toolkit that leverages Neural Machine Translation (NMT) systems to localize a semantic parser for a new language. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Mehrad Moradshahi; Giovanni Campagna; Sina Semnani; Silei Xu; Monica Lam; | |
482 | Interactive Refinement Of Cross-Lingual Word Embeddings Highlight: We introduce CLIME, an interactive system to quickly refine cross-lingual word embeddings for a given classification problem. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Michelle Yuan; Mozhi Zhang; Benjamin Van Durme; Leah Findlater; Jordan Boyd-Graber; | |
483 | Exploiting Sentence Order In Document Alignment Highlight: We present a simple document alignment method that incorporates sentence order information in both candidate generation and candidate re-scoring. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Brian Thompson; Philipp Koehn; | |
484 | XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Highlight: In this paper, we introduce XGLUE, a new benchmark dataset to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora, and evaluate their performance across a diverse set of cross-lingual tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yaobo Liang; Nan Duan; Yeyun Gong; Ning Wu; Fenfei Guo; Weizhen Qi; Ming Gong; Linjun Shou; Daxin Jiang; Guihong Cao; Xiaodong Fan; Ruofei Zhang; Rahul Agrawal; Edward Cui; Sining Wei; Taroon Bharti; Ying Qiao; Jiun-Hung Chen; Winnie Wu; Shuguang Liu; Fan Yang; Daniel Campos; Rangan Majumder; Ming Zhou; | |
485 | AIN: Fast And Accurate Sequence Labeling With Approximate Inference Network Highlight: In this paper, we propose to employ a parallelizable approximate variational inference algorithm for the CRF model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xinyu Wang; Yong Jiang; Nguyen Bach; Tao Wang; Zhongqiang Huang; Fei Huang; Kewei Tu; | |
486 | HIT: Nested Named Entity Recognition Via Head-Tail Pair And Token Interaction Highlight: To address this issue, we present a novel nested NER model named HIT. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yu Wang; Yun Li; Hanghang Tong; Ziye Zhu; | |
487 | Supertagging Combinatory Categorial Grammar With Attentive Graph Convolutional Networks Highlight: In this paper, we propose attentive graph convolutional networks to enhance neural CCG supertagging through a novel solution of leveraging contextual information. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuanhe Tian; Yan Song; Fei Xia; | |
488 | DAGA: Data Augmentation With A Generation Approach For Low-resource Tagging Tasks Highlight: In this work, we propose a novel augmentation method to generate high quality synthetic data for low-resource tagging tasks with language models trained on the linearized labeled sentences. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bosheng Ding; Linlin Liu; Lidong Bing; Canasai Kruengkrai; Thien Hai Nguyen; Shafiq Joty; Luo Si; Chunyan Miao; | |
489 | Interpretable Multi-dataset Evaluation For Named Entity Recognition Highlight: In this paper, we present a general methodology for interpretable evaluation for the named entity recognition (NER) task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jinlan Fu; Pengfei Liu; Graham Neubig; | code |
490 | Adversarial Semantic Decoupling For Recognizing Open-Vocabulary Slots Highlight: In this paper, we propose a robust adversarial model-agnostic slot filling method that explicitly decouples local semantics inherent in open-vocabulary slot words from the global context. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuanmeng Yan; Keqing He; Hong Xu; Sihong Liu; Fanyu Meng; Min Hu; Weiran Xu; | |
491 | Plug And Play Autoencoders For Conditional Text Generation Highlight: We propose methods which are plug and play, where any pretrained autoencoder can be used, and only require learning a mapping within the autoencoder’s embedding space, training embedding-to-embedding (Emb2Emb). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Florian Mai; Nikolaos Pappas; Ivan Montero; Noah A. Smith; James Henderson; | |
492 | Structure Aware Negative Sampling In Knowledge Graphs Highlight: In this paper, we propose Structure Aware Negative Sampling (SANS), an inexpensive negative sampling strategy that utilizes the rich graph structure by selecting negative samples from a node’s k-hop neighborhood. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kian Ahrabian; Aarash Feizi; Yasmin Salehi; William L. Hamilton; Avishek Joey Bose; | |
493 | Neural Mask Generator: Learning To Generate Adaptive Word Maskings For Language Model Adaptation Highlight: We propose a method to automatically generate domain- and task-adaptive maskings of the given text for self-supervised pre-training, such that we can effectively adapt the language model to a particular target task (e.g. question answering). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Minki Kang; Moonsu Han; Sung Ju Hwang; | |
494 | Autoregressive Knowledge Distillation Through Imitation Learning Highlight: We develop a compression technique for autoregressive models that is driven by an imitation learning perspective on knowledge distillation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Alexander Lin; Jeremy Wohlwend; Howard Chen; Tao Lei; | |
495 | T3: Tree-Autoencoder Constrained Adversarial Text Generation For Targeted Attack Highlight: To handle these challenges, we propose a target-controllable adversarial attack framework T3, which is applicable to a range of NLP tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Boxin Wang; Hengzhi Pei; Boyuan Pan; Qian Chen; Shuohang Wang; Bo Li; | code |
496 | Structured Pruning Of Large Language Models Highlight: We present a generic, structured pruning approach by parameterizing each weight matrix using its low-rank factorization, and adaptively removing rank-1 components during training. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ziheng Wang; Jeremy Wohlwend; Tao Lei; | |
497 | Effective Unsupervised Domain Adaptation With Adversarially Trained Language Models Highlight: In this paper, we show that careful masking strategies can bridge the knowledge gap of masked language models (MLMs) about the domains more effectively by allocating self-supervision where it is needed. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Thuy-Trang Vu; Dinh Phung; Gholamreza Haffari; | |
498 | BAE: BERT-based Adversarial Examples For Text Classification Highlight: We present BAE, a black box attack for generating adversarial examples using contextual perturbations from a BERT masked language model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Siddhant Garg; Goutham Ramakrishnan; | |
499 | Adversarial Self-Supervised Data-Free Distillation For Text Classification Highlight: To tackle this problem, we propose a novel two-stage data-free distillation method, named Adversarial self-Supervised Data-Free Distillation (AS-DFD), which is designed for compressing large-scale transformer-based models (e.g., BERT). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xinyin Ma; Yongliang Shen; Gongfan Fang; Chen Chen; Chenghao Jia; Weiming Lu; | |
500 | BERT-ATTACK: Adversarial Attack Against BERT Using BERT Highlight: In this paper, we propose BERT-Attack, a high-quality and effective method to generate adversarial samples using pre-trained masked language models exemplified by BERT. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Linyang Li; Ruotian Ma; Qipeng Guo; Xiangyang Xue; Xipeng Qiu; | code |
501 | The Thieves On Sesame Street Are Polyglots – Extracting Multilingual Models From Monolingual APIs Highlight: We discover that this extraction process extends to local copies initialized from a pre-trained, multilingual model while the victim remains monolingual. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nitish Shirish Keskar; Bryan McCann; Caiming Xiong; Richard Socher; | |
502 | When Hearst Is Not Enough: Improving Hypernymy Detection From Corpus With Distributional Models Highlight: We address hypernymy detection, i.e., whether an is-a relationship exists between words (x, y), with the help of large textual corpora. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Changlong Yu; Jialong Han; Peifeng Wang; Yangqiu Song; Hongming Zhang; Wilfred Ng; Shuming Shi; | |
503 | Interpreting Open-Domain Modifiers: Decomposition Of Wikipedia Categories Into Disambiguated Property-Value Pairs Highlight: This paper proposes an open-domain method for automatically annotating modifier constituents (“20th-century”) within Wikipedia categories (“20th-century male writers”) with properties (date of birth). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Marius Pasca; | |
504 | A Synset Relation-enhanced Framework With A Try-again Mechanism For Word Sense Disambiguation Highlight: In this paper, we propose a Synset Relation-Enhanced Framework (SREF) that leverages sense relations for both sense embedding enhancement and a try-again mechanism that implements WSD again, after obtaining basic sense embeddings from augmented WordNet glosses. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ming Wang; Yinglin Wang; | |
505 | Diverse, Controllable, And Keyphrase-Aware: A Corpus And Method For News Multi-Headline Generation Highlight: In this paper, we propose generating multiple headlines with keyphrases of user interests, whose main idea is to generate multiple keyphrases of interest to users for the news first, and then generate multiple keyphrase-relevant headlines. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dayiheng Liu; Yeyun Gong; Yu Yan; Jie Fu; Bo Shao; Daxin Jiang; Jiancheng Lv; Nan Duan; | |
506 | Factual Error Correction For Abstractive Summarization Models Highlight: We propose a post-editing corrector module to address this issue by identifying and correcting factual errors in generated summaries. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Meng Cao; Yue Dong; Jiapeng Wu; Jackie Chi Kit Cheung; | |
507 | Compressive Summarization With Plausibility And Salience Modeling Highlight: In this work, we propose to relax these explicit syntactic constraints on candidate spans, and instead leave the decision about what to delete to two data-driven criteria: plausibility and salience. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shrey Desai; Jiacheng Xu; Greg Durrett; | |
508 | Understanding Neural Abstractive Summarization Models Via Uncertainty Highlight: In this work, we analyze summarization decoders in both blackbox and whitebox ways by studying on the entropy, or uncertainty, of the model’s token-level predictions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiacheng Xu; Shrey Desai; Greg Durrett; | |
509 | Better Highlighting: Creating Sub-Sentence Summary Highlights Highlight: In this paper, we aim to generate summary highlights to be overlaid on the original documents to make it easier for readers to sift through a large amount of text. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sangwoo Cho; Kaiqiang Song; Chen Li; Dong Yu; Hassan Foroosh; Fei Liu; | |
510 | Summarizing Text On Any Aspects: A Knowledge-Informed Weakly-Supervised Approach Highlight: In this work, we study summarizing on arbitrary aspects relevant to the document, which significantly expands the application of the task in practice. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bowen Tan; Lianhui Qin; Eric Xing; Zhiting Hu; | |
511 | BERT-enhanced Relational Sentence Ordering Network Highlight: In this paper, we introduce a novel BERT-enhanced Relational Sentence Ordering Network (referred to as BRSON) by leveraging BERT for capturing better dependency relationship among sentences to enhance the coherence modeling for the entire paragraph. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Baiyun Cui; Yingming Li; Zhongfei Zhang; | |
512 | Online Conversation Disentanglement With Pointer Networks Highlight: In this work, we propose an end-to-end online framework for conversation disentanglement that avoids time-consuming domain-specific feature engineering. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tao Yu; Shafiq Joty; | |
513 | VCDM: Leveraging Variational Bi-encoding And Deep Contextualized Word Representations For Improved Definition Modeling Highlight: In this paper, we tackle the task of definition modeling, where the goal is to learn to generate definitions of words and phrases. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Machel Reid; Edison Marrese-Taylor; Yutaka Matsuo; | |
514 | Coarse-to-Fine Pre-training For Named Entity Recognition Highlight: To this end, we propose a NER-specific pre-training framework to inject coarse-to-fine automatically mined entity knowledge into pre-trained models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xue Mengge; Bowen Yu; Zhenyu Zhang; Tingwen Liu; Yue Zhang; Bin Wang; | |
515 | Exploring And Evaluating Attributes, Values, And Structures For Entity Alignment Highlight: In this paper, we propose to utilize an attributed value encoder and partition the KG into subgraphs to model the various types of attribute triples efficiently. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhiyuan Liu; Yixin Cao; Liangming Pan; Juanzi Li; Zhiyuan Liu; Tat-Seng Chua; | code |
516 | Simple And Effective Few-Shot Named Entity Recognition With Structured Nearest Neighbor Learning Highlight: We present a simple few-shot named entity recognition (NER) system based on nearest neighbor learning and structured inference. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yi Yang; Arzoo Katiyar; | |
517 | Learning Structured Representations Of Entity Names Using Active Learning And Weak Supervision Highlight: In this paper, we present a novel learning framework that combines active learning and weak supervision to solve this problem. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kun Qian; Poornima Chozhiyath Raman; Yunyao Li; Lucian Popa; | |
518 | Entity Enhanced BERT Pre-training For Chinese NER Highlight: To integrate the lexicon into pre-trained LMs for Chinese NER, we investigate a semi-supervised entity enhanced BERT pre-training method. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Chen Jia; Yuefeng Shi; Qinrong Yang; Yue Zhang; | |
519 | Scalable Zero-shot Entity Linking With Dense Entity Retrieval Highlight: This paper introduces a conceptually simple, scalable, and highly effective BERT-based entity linking model, along with an extensive evaluation of its accuracy-speed trade-off. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ledell Wu; Fabio Petroni; Martin Josifoski; Sebastian Riedel; Luke Zettlemoyer; | code |
520 | A Dataset For Tracking Entities In Open Domain Procedural Text Highlight: We present the first dataset for tracking state changes in procedural text from arbitrary domains by using an unrestricted (open) vocabulary. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Niket Tandon; Keisuke Sakaguchi; Bhavana Dalvi; Dheeraj Rajagopal; Peter Clark; Michal Guerquin; Kyle Richardson; Eduard Hovy; | |
521 | Design Challenges In Low-resource Cross-lingual Entity Linking Highlight: This paper provides a thorough analysis of low-resource XEL techniques, focusing on the key step of identifying candidate English Wikipedia titles that correspond to a given foreign language mention. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xingyu Fu; Weijia Shi; Xiaodong Yu; Zian Zhao; Dan Roth; | |
522 | Efficient One-Pass End-to-End Entity Linking For Questions Highlight: We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Belinda Z. Li; Sewon Min; Srinivasan Iyer; Yashar Mehdad; Wen-tau Yih; | |
523 | LUKE: Deep Contextualized Entity Representations With Entity-aware Self-attention Highlight: In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ikuya Yamada; Akari Asai; Hiroyuki Shindo; Hideaki Takeda; Yuji Matsumoto; | code |
524 | Generating Similes Effortlessly Like A Pro: A Style Transfer Approach For Simile Generation Highlight: In this paper, we tackle the problem of simile generation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tuhin Chakrabarty; Smaranda Muresan; Nanyun Peng; | |
525 | STORIUM: A Dataset And Evaluation Platform For Machine-in-the-Loop Story Generation Highlight: To address these issues, we introduce a dataset and evaluation platform built from STORIUM, an online collaborative storytelling community. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nader Akoury; Shufan Wang; Josh Whiting; Stephen Hood; Nanyun Peng; Mohit Iyyer; | |
526 | Substance Over Style: Document-Level Targeted Content Transfer Highlight: In this work, we introduce the task of document-level targeted content transfer and address it in the recipe domain, with a recipe as the document and a dietary restriction (such as vegan or dairy-free) as the targeted constraint. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Allison Hegel; Sudha Rao; Asli Celikyilmaz; Bill Dolan; | |
527 | Template Guided Text Generation For Task-Oriented Dialogue Highlight: In this work, we investigate two methods for Natural Language Generation (NLG) using a single domain-independent model across a large number of APIs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Mihir Kale; Abhinav Rastogi; | |
528 | MOCHA: A Dataset For Training And Evaluating Generative Reading Comprehension Metrics Highlight: To address this, we introduce a benchmark for training and evaluating generative reading comprehension metrics: MOdeling Correctness with Human Annotations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Anthony Chen; Gabriel Stanovsky; Sameer Singh; Matt Gardner; | |
529 | Plan Ahead: Self-Supervised Text Planning For Paragraph Completion Task Highlight: To address that, we propose a self-supervised text planner SSPlanner that predicts what to say first (content prediction), then guides the pretrained language model (surface realization) using the predicted content. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dongyeop Kang; Eduard Hovy; | |
530 | Inquisitive Question Generation For High Level Text Comprehension Highlight: We introduce INQUISITIVE, a dataset of ~19K questions that are elicited while a person is reading through a document. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wei-Jen Ko; Te-yuan Chen; Yiyan Huang; Greg Durrett; Junyi Jessy Li; | |
531 | Towards Persona-Based Empathetic Conversational Models Highlight: To this end, we propose a new task towards persona-based empathetic conversations and present the first empirical study on the impact of persona on empathetic responding. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Peixiang Zhong; Chen Zhang; Hao Wang; Yong Liu; Chunyan Miao; | |
532 | Personal Information Leakage Detection In Conversations Highlight: In this work, we propose to protect personal information by warning users of detected suspicious sentences generated by conversational assistants. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Qiongkai Xu; Lizhen Qu; Zeyu Gao; Gholamreza Haffari; | |
533 | Response Selection For Multi-Party Conversations With Dynamic Topic Tracking Highlight: In this work, we frame response selection as a dynamic topic tracking task to match the topic between the response and relevant conversation context. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Weishi Wang; Steven C.H. Hoi; Shafiq Joty; | |
534 | Regularizing Dialogue Generation By Imitating Implicit Scenarios Highlight: To enable responses that are more meaningful and context-specific, we propose to improve generative dialogue systems from the scenario perspective, where both dialogue history and future conversation are taken into account to implicitly reconstruct the scenario knowledge. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shaoxiong Feng; Xuancheng Ren; Hongshen Chen; Bin Sun; Kan Li; Xu Sun; | |
535 | MovieChats: Chat Like Humans In A Closed Domain Highlight: In this work, we take a close look at the movie domain and present a large-scale high-quality corpus with fine-grained annotations in hope of pushing the limit of movie-domain chatbots. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hui Su; Xiaoyu Shen; Zhou Xiao; Zheng Zhang; Ernie Chang; Cheng Zhang; Cheng Niu; Jie Zhou; | |
536 | Conundrums In Entity Coreference Resolution: Making Sense Of The State Of The Art Highlight: We present an empirical analysis of state-of-the-art resolvers with the goal of providing the general NLP audience with a better understanding of the state of the art and coreference researchers with directions for future research. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jing Lu; Vincent Ng; | |
537 | Semantic Role Labeling Guided Multi-turn Dialogue ReWriter Highlight: In this paper, we propose to use semantic role labeling (SRL), which highlights the core semantic information of who did what to whom, to provide additional guidance for the rewriter model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kun Xu; Haochen Tan; Linfeng Song; Han Wu; Haisong Zhang; Linqi Song; Dong Yu; | |
538 | Continuity Of Topic, Interaction, And Query: Learning To Quote In Online Conversations Highlight: Here, we capture the contextual consistency of a quotation in terms of latent topics, interactions with the dialogue history, and coherence to the query turn’s existing contents. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Lingzhi Wang; Jing Li; Xingshan Zeng; Haisong Zhang; Kam-Fai Wong; | |
539 | Profile Consistency Identification For Open-domain Dialogue Agents Highlight: To facilitate the study of profile consistency identification, we create a large-scale human-annotated dataset with over 110K single-turn conversations and their key-value attribute profiles. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Haoyu Song; Yan Wang; Wei-Nan Zhang; Zhengyu Zhao; Ting Liu; Xiaojiang Liu; | |
540 | An Element-aware Multi-representation Model For Law Article Prediction Highlight: In this paper, we propose a Law Article Element-aware Multi-representation Model (LEMM), which can make full use of law article information and can be used for multi-label samples. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Huilin Zhong; Junsheng Zhou; Weiguang Qu; Yunfei Long; Yanhui Gu; | |
541 | Recurrent Event Network: Autoregressive Structure Inference Over Temporal Knowledge Graphs Highlight: This paper proposes Recurrent Event Network (RE-Net), a novel autoregressive architecture for predicting future interactions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Woojeong Jin; Meng Qu; Xisen Jin; Xiang Ren; | |
542 | Multi-resolution Annotations For Emoji Prediction Highlight: This paper annotates an emoji prediction dataset with passage-level multi-class/multi-label, and aspect-level multi-class annotations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Weicheng Ma; Ruibo Liu; Lili Wang; Soroush Vosoughi; | |
543 | Less Is More: Attention Supervision With Counterfactuals For Text Classification Highlight: We aim to leverage human and machine intelligence together for attention supervision. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Seungtaek Choi; Haeju Park; Jinyoung Yeo; Seung-won Hwang; | |
544 | MODE-LSTM: A Parameter-efficient Recurrent Network With Multi-Scale For Sentence Classification Highlight: In this paper, we propose a simple yet effective model called Multi-scale Orthogonal inDependEnt LSTM (MODE-LSTM), which not only has effective parameters and good generalization ability, but also considers multiscale n-gram features. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Qianli Ma; Zhenxi Lin; Jiangyue Yan; Zipeng Chen; Liuhong Yu; | |
545 | HSCNN: A Hybrid-Siamese Convolutional Neural Network For Extremely Imbalanced Multi-label Text Classification Highlight: We propose a hybrid solution which adapts general networks for the head categories, and few-shot techniques for the tail categories. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenshuo Yang; Jiyi Li; Fumiyo Fukumoto; Yanming Ye; | |
546 | Multi-Stage Pre-training For Automated Chinese Essay Scoring Highlight: This paper proposes a pre-training based automated Chinese essay scoring method. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wei Song; Kai Zhang; Ruiji Fu; Lizhen Liu; Ting Liu; Miaomiao Cheng; | |
547 | Multi-hop Inference For Question-driven Summarization Highlight: In this work, we propose a novel question-driven abstractive summarization method, Multi-hop Selective Generator (MSG), to incorporate multi-hop reasoning into question-driven summarization and, meanwhile, provide justifications for the generated summaries. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yang Deng; Wenxuan Zhang; Wai Lam; | |
548 | Towards Interpretable Reasoning Over Paragraph Effects In Situation Highlight: Inspired by human cognitive processes, in this paper we propose a sequential approach for this task which explicitly models each step of the reasoning process with neural network modules. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Mucheng Ren; Xiubo Geng; Tao Qin; Heyan Huang; Daxin Jiang; | |
549 | Question Directed Graph Attention Network For Numerical Reasoning Over Text Highlight: To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical reasoning over this context graph. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kunlong Chen; Weidi Xu; Xingyi Cheng; Zou Xiaochuan; Yuyu Zhang; Le Song; Taifeng Wang; Yuan Qi; Wei Chu; | |
550 | Dense Passage Retrieval For Open-Domain Question Answering Highlight: In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Vladimir Karpukhin; Barlas Oguz; Sewon Min; Patrick Lewis; Ledell Wu; Sergey Edunov; Danqi Chen; Wen-tau Yih; | |
551 | Distilling Structured Knowledge For Text-Based Relational Reasoning Highlight: In this work, we investigate how the structured knowledge of a GNN can be distilled into various NLP models in order to improve their performance. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jin Dong; Marc-Antoine Rondeau; William L. Hamilton; | |
552 | Asking Without Telling: Exploring Latent Ontologies In Contextual Representations Highlight: To investigate this, we introduce latent subclass learning (LSL): a modification to classifier-based probing that induces a latent categorization (or ontology) of the probe’s inputs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Julian Michael; Jan A. Botha; Ian Tenney; | |
553 | Pretrained Language Model Embryology: The Birth Of ALBERT Highlight: We thus investigate the developmental process from a set of randomly initialized parameters to a totipotent language model, which we refer to as the embryology of a pretrained language model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Cheng-Han Chiang; Sung-Feng Huang; Hung-yi Lee; | code |
554 | Learning Music Helps You Read: Using Transfer To Study Linguistic Structure In Language Models Highlight: We propose transfer learning as a method for analyzing the encoding of grammatical structure in neural language models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Isabel Papadimitriou; Dan Jurafsky; | |
555 | What Do Position Embeddings Learn? An Empirical Study Of Pre-Trained Language Model Positional Encoding Highlight: This paper provides new insight into pre-trained position embeddings through feature-level analysis and empirical experiments on most of the iconic NLP tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yu-An Wang; Yun-Nung Chen; | |
556 | “You Are Grounded!”: Latent Name Artifacts In Pre-trained Language Models Highlight: We focus on artifacts associated with the representation of given names (e.g., Donald), which, depending on the corpus, may be associated with specific entities, as indicated by next token prediction (e.g., Trump). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Vered Shwartz; Rachel Rudinger; Oyvind Tafjord; | |
557 | Birds Have Four Legs?! NumerSense: Probing Numerical Commonsense Knowledge Of Pre-Trained Language Models Highlight: In this paper, we investigate whether and to what extent we can induce numerical commonsense knowledge from PTLMs as well as the robustness of this process. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bill Yuchen Lin; Seyeon Lee; Rahul Khanna; Xiang Ren; | |
558 | Grounded Adaptation For Zero-shot Executable Semantic Parsing Highlight: We propose Grounded Adaptation for Zero-shot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e.g. new database schemas). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Victor Zhong; Mike Lewis; Sida I. Wang; Luke Zettlemoyer; | |
559 | An Imitation Game For Learning Semantic Parsers From User Interaction Highlight: In this paper, we suggest an alternative, human-in-the-loop methodology for learning semantic parsers directly from users. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ziyu Yao; Yiqi Tang; Wen-tau Yih; Huan Sun; Yu Su; | code |
560 | IGSQL: Database Schema Interaction Graph Based Neural Model For Context-Dependent Text-to-SQL Generation Highlight: In this work, in addition to using encoders to capture historic information of user inputs, we propose a database schema interaction graph encoder to utilize historic information of database schema items. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yitao Cai; Xiaojun Wan; | |
561 | “What Do You Mean By That?” A Parser-Independent Interactive Approach For Enhancing Text-to-SQL Highlight: In this paper, we include human in the loop and present a novel parser-independent interactive approach (PIIA) that interacts with users using multi-choice questions and can easily work with arbitrary parsers. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuntao Li; Bei Chen; Qian Liu; Yan Gao; Jian-Guang Lou; Yan Zhang; Dongmei Zhang; | |
562 | DuSQL: A Large-Scale And Pragmatic Chinese Text-to-SQL Dataset Highlight: This paper presents DuSQL, a large-scale and pragmatic Chinese dataset for the cross-domain text-to-SQL task, containing 200 databases, 813 tables, and 23,797 question/SQL pairs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Lijie Wang; Ao Zhang; Kun Wu; Ke Sun; Zhenghua Li; Hua Wu; Min Zhang; Haifeng Wang; | |
563 | Mention Extraction And Linking For SQL Query Generation Highlight: To solve these problems, this paper proposes a novel extraction-linking approach, where a unified extractor recognizes all types of slot mentions appearing in the question sentence before a linker maps the recognized columns to the table schema to generate executable SQL queries. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jianqiang Ma; Zeyu Yan; Shuai Pang; Yang Zhang; Jianping Shen; | |
564 | Re-examining The Role Of Schema Linking In Text-to-SQL Highlight: By providing a schema linking corpus based on the Spider text-to-SQL dataset, we systematically study the role of schema linking. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenqiang Lei; Weixin Wang; Zhixin Ma; Tian Gan; Wei Lu; Min-Yen Kan; Tat-Seng Chua; | |
565 | A Multi-Task Incremental Learning Framework With Category Name Embedding For Aspect-Category Sentiment Analysis Highlight: In this paper, to make multi-task learning feasible for incremental learning, we propose Category Name Embedding network (CNE-net). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zehui Dai; Cheng Peng; Huajie Chen; Yadong Ding; | |
566 | Train No Evil: Selective Masking For Task-Guided Pre-Training Highlight: In this paper, we propose a three-stage framework by adding a task-guided pre-training stage with selective masking between general pre-training and fine-tuning. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yuxian Gu; Zhengyan Zhang; Xiaozhi Wang; Zhiyuan Liu; Maosong Sun; | code |
567 | SentiLARE: Sentiment-Aware Language Representation Learning With Linguistic Knowledge Highlight: To benefit the downstream tasks in sentiment analysis, we propose a novel language representation model called SentiLARE, which introduces word-level linguistic knowledge including part-of-speech tag and sentiment polarity (inferred from SentiWordNet) into pre-trained models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Pei Ke; Haozhe Ji; Siyang Liu; Xiaoyan Zhu; Minlie Huang; | |
568 | Weakly-Supervised Aspect-Based Sentiment Analysis Via Joint Aspect-Sentiment Topic Embedding Highlight: In this paper, we propose a weakly-supervised approach for aspect-based sentiment analysis, which uses only a few keywords describing each aspect/sentiment without using any labeled examples. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiaxin Huang; Yu Meng; Fang Guo; Heng Ji; Jiawei Han; | |
569 | APE: Argument Pair Extraction From Peer Review And Rebuttal Via Multi-task Learning Highlight: In this paper, we introduce a new argument pair extraction (APE) task on peer review and rebuttal in order to study the contents, the structure and the connections between them. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Liying Cheng; Lidong Bing; Qian Yu; Wei Lu; Luo Si; | |
570 | Diversified Multiple Instance Learning For Document-Level Multi-Aspect Sentiment Classification Highlight: To this end, we propose a novel Diversified Multiple Instance Learning Network (D-MILN), which is able to achieve aspect-level sentiment classification with only document-level weak supervision. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yunjie Ji; Hao Liu; Bolei He; Xinyan Xiao; Hua Wu; Yanhua Yu; | |
571 | Identifying Exaggerated Language Highlight: We contribute to the study of hyperbole by (1) creating a corpus focusing on sentence-level hyperbole detection, (2) performing a statistical and manual analysis of our corpus, and (3) addressing the automatic hyperbole detection task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Li Kong; Chuanyi Li; Jidong Ge; Bin Luo; Vincent Ng; | |
572 | Unified Feature And Instance Based Domain Adaptation For Aspect-Based Sentiment Analysis Highlight: To resolve this limitation, we propose an end-to-end framework to jointly perform feature and instance based adaptation for the ABSA task in this paper. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Chenggong Gong; Jianfei Yu; Rui Xia; | |
573 | Compositional And Lexical Semantics In RoBERTa, BERT And DistilBERT: A Case Study On CoQA Highlight: We identify the problematic areas for the finetuned RoBERTa, BERT and DistilBERT models through systematic error analysis – basic arithmetic (counting phrases), compositional semantics (negation and Semantic Role Labeling), and lexical semantics (surprisal and antonymy). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ieva Staliūnaitė; Ignacio Iacobacci; | |
574 | Attention Is Not Only A Weight: Analyzing Transformers With Vector Norms Highlight: This paper shows that attention weights alone are only one of the two factors that determine the output of attention and proposes a norm-based analysis that incorporates the second factor, the norm of the transformed input vectors. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Goro Kobayashi; Tatsuki Kuribayashi; Sho Yokoi; Kentaro Inui; | |
575 | F1 Is Not Enough! Models And Evaluation Towards User-Centered Explainable Question Answering Highlight: As a remedy, we propose a hierarchical model and a new regularization term to strengthen the answer-explanation coupling as well as two evaluation scores to quantify the coupling. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hendrik Schuff; Heike Adel; Ngoc Thang Vu; | |
576 | On The Ability And Limitations Of Transformers To Recognize Formal Languages Highlight: In this work, we systematically study the ability of Transformers to model such languages as well as the role of its individual components in doing so. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Satwik Bhattamishra; Kabir Ahuja; Navin Goyal; | |
577 | An Unsupervised Joint System For Text Generation From Knowledge Graphs And Semantic Parsing Highlight: To this end, we present the first approach to unsupervised text generation from KGs and show simultaneously how it can be used for unsupervised semantic parsing. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Martin Schmitt; Sahand Sharifzadeh; Volker Tresp; Hinrich Schütze; | |
578 | DGST: A Dual-Generator Network For Text Style Transfer Highlight: We propose DGST, a novel and simple Dual-Generator network architecture for text Style Transfer. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiao Li; Guanyi Chen; Chenghua Lin; Ruizhe Li; | |
579 | A Knowledge-Aware Sequence-to-Tree Network For Math Word Problem Solving Highlight: To incorporate external knowledge and global expression information, we propose a novel knowledge-aware sequence-to-tree (KA-S2T) network in which the entities in the problem sequences and their categories are modeled as an entity graph. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Qinzhuo Wu; Qi Zhang; Jinlan Fu; Xuanjing Huang; | |
580 | Generating Fact Checking Briefs Highlight: To train its components, we introduce QABriefDataset. We show that fact checking with briefs – in particular QABriefs – increases the accuracy of crowdworkers by 10% while slightly decreasing the time taken. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Angela Fan; Aleksandra Piktus; Fabio Petroni; Guillaume Wenzek; Marzieh Saeidi; Andreas Vlachos; Antoine Bordes; Sebastian Riedel; | |
581 | Improving The Efficiency Of Grammatical Error Correction With Erroneous Span Detection And Correction Highlight: We propose a novel language-independent approach to improve the efficiency for Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Mengyun Chen; Tao Ge; Xingxing Zhang; Furu Wei; Ming Zhou; | |
582 | Coreferential Reasoning Learning For Language Representation Highlight: To address this issue, we present CorefBERT, a novel language representation model that can capture the coreferential relations in context. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Deming Ye; Yankai Lin; Jiaju Du; Zhenghao Liu; Peng Li; Maosong Sun; Zhiyuan Liu; | code |
583 | Is Graph Structure Necessary For Multi-hop Question Answering? Highlight: In this paper, we investigate whether the graph structure is necessary for textual multi-hop reasoning. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nan Shao; Yiming Cui; Ting Liu; Shijin Wang; Guoping Hu; | |
584 | XL-WiC: A Multilingual Benchmark For Evaluating Semantic Contextualization Highlight: We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages from varied language families and with different degrees of resource availability, opening room for evaluation scenarios such as zero-shot cross-lingual transfer. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Alessandro Raganato; Tommaso Pasini; Jose Camacho-Collados; Mohammad Taher Pilehvar; | code |
585 | Generationary Or “How We Went Beyond Word Sense Inventories And Learned To Gloss” Highlight: In this paper we show this need not be the case, and propose a unified model that is able to produce contextually appropriate definitions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Michele Bevilacqua; Marco Maru; Roberto Navigli; | code |
586 | Probing Pretrained Language Models For Lexical Semantics Highlight: In this work, we present a systematic empirical analysis across six typologically diverse languages and five different lexical tasks, addressing the following questions: 1) How do different lexical knowledge extraction strategies (monolingual versus multilingual source LM, out-of-context versus in-context encoding, inclusion of special tokens, and layer-wise averaging) impact performance? Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ivan Vulić; Edoardo Maria Ponti; Robert Litschko; Goran Glavaš; Anna Korhonen; | |
587 | Cross-lingual Spoken Language Understanding With Regularized Representation Alignment Highlight: To cope with this issue, we propose a regularization approach to further align word-level and sentence-level representations across languages without any external resource. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zihan Liu; Genta Indra Winata; Peng Xu; Zhaojiang Lin; Pascale Fung; | |
588 | SLURP: A Spoken Language Understanding Resource Package Highlight: In this paper, we release SLURP, a new SLU package containing the following: (1) A new challenging dataset in English spanning 18 domains, which is substantially bigger and linguistically more diverse than existing datasets; (2) Competitive baselines based on state-of-the-art NLU and ASR systems; (3) A new transparent metric for entity labelling which enables a detailed error analysis for identifying potential areas of improvement. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Emanuele Bastianelli; Andrea Vanzo; Pawel Swietojanski; Verena Rieser; | code |
589 | Neural Conversational QA: Learning To Reason Vs Exploiting Patterns Highlight: In this paper we share our findings about the four types of patterns in the ShARC corpus and how the neural models exploit them. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nikhil Verma; Abhishek Sharma; Dhiraj Madan; Danish Contractor; Harshit Kumar; Sachindra Joshi; | |
590 | Counterfactual Generator: A Weakly-Supervised Method For Named Entity Recognition Highlight: In this paper, we decompose the sentence into two parts: entity and context, and rethink the relationship between them and model performance from a causal perspective. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiangji Zeng; Yunliang Li; Yuchen Zhai; Yin Zhang; | code |
591 | Understanding Procedural Text Using Interactive Entity Networks Highlight: In this paper, we propose a novel Interactive Entity Network (IEN), which is a recurrent network with memory-equipped cells for state tracking. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jizhi Tang; Yansong Feng; Dongyan Zhao; | |
592 | A Rigorous Study On Named Entity Recognition: Can Fine-tuning Pretrained Model Lead To The Promised Land? Highlight: As there is no currently available dataset to investigate this problem, this paper proposes to conduct randomization test on standard benchmarks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hongyu Lin; Yaojie Lu; Jialong Tang; Xianpei Han; Le Sun; Zhicheng Wei; Nicholas Jing Yuan; | |
593 | DyERNIE: Dynamic Evolution Of Riemannian Manifold Embeddings For Temporal Knowledge Graph Completion Highlight: To this end, we propose DyERNIE, a non-Euclidean embedding approach that learns evolving entity representations in a product of Riemannian manifolds, where the composed spaces are estimated from the sectional curvatures of underlying data. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Zhen Han; Peng Chen; Yunpu Ma; Volker Tresp; | |
594 | Embedding Words In Non-Vector Space With Unsupervised Graph Learning Highlight: We introduce GraphGlove: unsupervised graph word representations which are learned end-to-end. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Max Ryabinin; Sergei Popov; Liudmila Prokhorenkova; Elena Voita; | |
595 | Debiasing Knowledge Graph Embeddings Highlight: We present a novel approach, in which all embeddings are trained to be neutral to sensitive attributes such as gender by default using an adversarial loss. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Joseph Fisher; Arpit Mittal; Dave Palfrey; Christos Christodoulopoulos; | |
596 | Message Passing For Hyper-Relational Knowledge Graphs Highlight: In this work, we propose a message passing based graph encoder – StarE – capable of modeling such hyper-relational KGs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Mikhail Galkin; Priyansh Trivedi; Gaurav Maheshwari; Ricardo Usbeck; Jens Lehmann; | |
597 | Relation-aware Graph Attention Networks With Relational Position Encodings For Emotion Recognition In Conversations Highlight: In this paper, we propose relational position encodings that provide RGAT with sequential information reflecting the relational graph structure. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Taichi Ishiwatari; Yuki Yasuda; Taro Miyazaki; Jun Goto; | |
598 | BERT Knows Punta Cana Is Not Just Beautiful, It’s Gorgeous: Ranking Scalar Adjectives With Contextualised Representations Highlight: We propose a novel BERT-based approach to intensity detection for scalar adjectives. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Aina Garí Soler; Marianna Apidianaki; | |
599 | Feature Adaptation Of Pre-Trained Language Models Across Languages And Domains With Robust Self-Training Highlight: We explore unsupervised domain adaptation (UDA) in this paper. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Hai Ye; Qingyu Tan; Ruidan He; Juntao Li; Hwee Tou Ng; Lidong Bing; | |
600 | Textual Data Augmentation For Efficient Active Learning On Tiny Datasets Highlight: In this paper we propose a novel data augmentation approach where guided outputs of a language generation model, e.g. GPT-2, when labeled, can improve the performance of text classifiers through an active learning process. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Husam Quteineh; Spyridon Samothrakis; Richard Sutcliffe; | |
601 | “I’d Rather Just Go To Bed”: Understanding Indirect Answers Highlight: We revisit a pragmatic inference problem in dialog: Understanding indirect responses to questions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Annie Louis; Dan Roth; Filip Radlinski; | |
602 | PowerTransformer: Unsupervised Controllable Revision For Biased Language Correction Highlight: To address this challenge, we adopt an unsupervised approach using auxiliary supervision with related tasks such as paraphrasing and self-supervision based on a reconstruction loss, building on pretrained language models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xinyao Ma; Maarten Sap; Hannah Rashkin; Yejin Choi; | |
603 | MEGA RST Discourse Treebanks With Structure And Nuclearity From Scalable Distant Sentiment Supervision Highlight: In this work, we present a novel scalable methodology to automatically generate discourse treebanks using distant supervision from sentiment annotated datasets, creating and publishing MEGA-DT, a new large-scale discourse-annotated corpus. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Patrick Huber; Giuseppe Carenini; | |
604 | Centering-based Neural Coherence Modeling With Hierarchical Discourse Segments Highlight: In this work, we propose a coherence model which takes discourse structural information into account without relying on human annotations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sungho Jeon; Michael Strube; | |
605 | Keeping Up Appearances: Computational Modeling Of Face Acts In Persuasion Oriented Discussions Highlight: Grounded in the politeness theory of Brown and Levinson (1978), we propose a generalized framework for modeling face acts in persuasion conversations, resulting in a reliable coding manual, an annotated corpus, and computational models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ritam Dutt; Rishabh Joshi; Carolyn Rose; | |
606 | HABERTOR: An Efficient And Effective Deep Hatespeech Detector Highlight: We present our HABERTOR model for detecting hatespeech in large scale user-generated content. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Thanh Tran; Yifan Hu; Changwei Hu; Kevin Yen; Fei Tan; Kyumin Lee; Se Rim Park; | |
607 | An Empirical Study On Large-Scale Multi-Label Text Classification Including Few And Zero-Shot Labels Highlight: Here, for the first time, we empirically evaluate a battery of LMTC methods from vanilla LWANs to hierarchical classification approaches and transfer learning, on frequent, few, and zero-shot learning on three datasets from different domains. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ilias Chalkidis; Manos Fergadiotis; Sotiris Kotitsas; Prodromos Malakasiotis; Nikolaos Aletras; Ion Androutsopoulos; | |
608 | Which *BERT? A Survey Organizing Contextualized Encoders Highlight: We present a survey on language representation learning with the aim of consolidating a series of shared lessons learned across a variety of recent efforts. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Patrick Xia; Shijie Wu; Benjamin Van Durme; | |
609 | Fact Or Fiction: Verifying Scientific Claims Highlight: We introduce scientific claim verification, a new task to select abstracts from the research literature containing evidence that SUPPORTS or REFUTES a given scientific claim, and to identify rationales justifying each decision. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
David Wadden; Shanchuan Lin; Kyle Lo; Lucy Lu Wang; Madeleine van Zuylen; Arman Cohan; Hannaneh Hajishirzi; | code |
610 | Semantic Role Labeling As Syntactic Dependency Parsing Highlight: Based on this observation, we present a conversion scheme that packs SRL annotations into dependency tree representations through joint labels that permit highly accurate recovery back to the original format. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tianze Shi; Igor Malioutov; Ozan Irsoy; | |
611 | PARADE: A New Dataset For Paraphrase Identification Requiring Computer Science Domain Knowledge Highlight: We present a new benchmark dataset called PARADE for paraphrase identification that requires specialized domain knowledge. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yun He; Zhuoer Wang; Yin Zhang; Ruihong Huang; James Caverlee; | |
612 | Causal Inference Of Script Knowledge Highlight: We argue from both a conceptual and practical sense that a purely correlation-based approach is insufficient, and instead propose an approach to script induction based on the causal effect between events, formally defined via interventions. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Noah Weber; Rachel Rudinger; Benjamin Van Durme; | |
613 | Towards Debiasing NLU Models From Unknown Biases Highlight: In this work, we present the first step to bridge this gap by introducing a self-debiasing framework that prevents models from mainly utilizing biases without knowing them in advance. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Prasetya Ajie Utama; Nafise Sadat Moosavi; Iryna Gurevych; | |
614 | On The Role Of Supervision In Unsupervised Constituency Parsing Highlight: We introduce strong baselines for them, by training an existing supervised parsing model (Kitaev and Klein, 2018) on the same labeled examples they access. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Haoyue Shi; Karen Livescu; Kevin Gimpel; | |
615 | Language Model Prior For Low-Resource Neural Machine Translation Highlight: In this work, we propose a novel approach to incorporate a LM as prior in a neural translation model (TM). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Christos Baziotis; Barry Haddow; Alexandra Birch; | |
616 | Detecting Word Sense Disambiguation Biases In Machine Translation For Model-Agnostic Adversarial Attacks Highlight: We introduce a method for the prediction of disambiguation errors based on statistical data properties, demonstrating its effectiveness across several domains and model types. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Denis Emelin; Ivan Titov; Rico Sennrich; | |
617 | MAD-X: An Adapter-Based Framework For Multi-Task Cross-Lingual Transfer Highlight: We propose MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages by learning modular language and task representations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jonas Pfeiffer; Ivan Vulić; Iryna Gurevych; Sebastian Ruder; | |
618 | Translation Artifacts In Cross-lingual Transfer Learning Highlight: In this paper, we show that such translation process can introduce subtle artifacts that have a notable impact in existing cross-lingual models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Mikel Artetxe; Gorka Labaka; Eneko Agirre; | |
619 | A Time-Aware Transformer Based Model For Suicide Ideation Detection On Social Media Highlight: In this work, we focus on identifying suicidal intent in English tweets by augmenting linguistic models with historical context. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ramit Sawhney; Harshit Joshi; Saumya Gandhi; Rajiv Ratn Shah; | |
620 | Weakly Supervised Learning Of Nuanced Frames For Analyzing Polarization In News Media Highlight: In this paper, we suggest a minimally supervised approach for identifying nuanced frames in news article coverage of politically divisive topics. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shamik Roy; Dan Goldwasser; | |
621 | Where Are The Facts? Searching For Fact-checked Information To Alleviate The Spread Of Fake News Highlight: To tackle these questions, we propose a novel framework to search for fact-checking articles, which address the content of an original tweet (that may contain misinformation) posted by online users. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Nguyen Vo; Kyumin Lee; | code |
622 | Fortifying Toxic Speech Detectors Against Veiled Toxicity Highlight: In this work, we propose a framework aimed at fortifying existing toxic speech detectors without a large labeled corpus of veiled toxicity. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiaochuang Han; Yulia Tsvetkov; | |
623 | Explainable Automated Fact-Checking For Public Health Claims Highlight: We present the first study of explainable fact-checking for claims which require specific expertise. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Neema Kotonya; Francesca Toni; | |
624 | Interactive Fiction Game Playing As Multi-Paragraph Reading Comprehension With Reinforcement Learning Highlight: We take a novel perspective of IF game solving and re-formulate it as Multi-Passage Reading Comprehension (MPRC) tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiaoxiao Guo; Mo Yu; Yupeng Gao; Chuang Gan; Murray Campbell; Shiyu Chang; | |
625 | DORB: Dynamically Optimizing Multiple Rewards With Bandits Highlight: Considering the above aspects, in our work, we automate the optimization of multiple metric rewards simultaneously via a multi-armed bandit approach (DORB), where at each round, the bandit chooses which metric reward to optimize next, based on expected arm gains. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ramakanth Pasunuru; Han Guo; Mohit Bansal; | |
626 | MedFilter: Improving Extraction Of Task-relevant Utterances Through Integration Of Discourse Structure And Ontological Knowledge Highlight: In this paper, we propose the novel modeling approach MedFilter, which addresses these insights in order to increase performance at identifying and categorizing task-relevant utterances, and in so doing, positively impacts performance at a downstream information extraction task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sopan Khosla; Shikhar Vashishth; Jill Fain Lehman; Carolyn Rose; | |
627 | Hierarchical Evidence Set Modeling For Automated Fact Extraction And Verification Highlight: Unlike the prior works, in this paper, we propose Hierarchical Evidence Set Modeling (HESM), a framework to extract evidence sets (each of which may contain multiple evidence sentences), and verify a claim to be supported, refuted or not enough info, by encoding and attending the claim and evidence sets at different levels of hierarchy. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shyam Subramanian; Kyumin Lee; | code |
628 | Program Enhanced Fact Verification With Verbalization And Graph Attention Network Highlight: In this paper, we present a Program-enhanced Verbalization and Graph Attention Network (ProgVGAT) to integrate programs and execution into textual inference models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiaoyu Yang; Feng Nie; Yufei Feng; Quan Liu; Zhigang Chen; Xiaodan Zhu; | |
629 | Constrained Fact Verification For FEVER Highlight: In this work, we propose a new methodology for fact-verification, specifically FEVER, that enforces a closed-world reliance on extracted evidence. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Adithya Pratapa; Sai Muralidhar Jayanthi; Kavya Nerella; | |
630 | Entity Linking In 100 Languages Highlight: We propose a new formulation for multilingual entity linking, where language-specific mentions resolve to a language-agnostic Knowledge Base. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jan A. Botha; Zifei Shan; Daniel Gillick; | |
631 | PatchBERT: Just-in-Time, Out-of-Vocabulary Patching Highlight: In our paper, we study a pre-trained multilingual BERT model and analyze the OOV rate on downstream tasks, how it introduces information loss, and as a side-effect, obstructs the potential of the underlying model. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sangwhan Moon; Naoaki Okazaki; | |
632 | On The Importance Of Pre-training Data Volume For Compact Language Models Highlight: In an effort towards sustainable practices, we study the impact of pre-training data volume on compact language models. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Vincent Micheli; Martin d’Hoffschmidt; François Fleuret; | |
633 | BERT-of-Theseus: Compressing BERT By Progressive Module Replacing Highlight: In this paper, we propose a novel model compression approach to effectively compress BERT by progressive module replacing. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Canwen Xu; Wangchunshu Zhou; Tao Ge; Furu Wei; Ming Zhou; | |
634 | Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Highlight: To fine-tune with less forgetting, we propose a recall and learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Sanyuan Chen; Yutai Hou; Yiming Cui; Wanxiang Che; Ting Liu; Xiangzhan Yu; | |
635 | Exploring And Predicting Transferability Across NLP Tasks Highlight: In this paper, we conduct an extensive study of the transferability between 33 NLP tasks across three broad classes of problems (text classification, question answering, and sequence labeling). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tu Vu; Tong Wang; Tsendsuren Munkhdalai; Alessandro Sordoni; Adam Trischler; Andrew Mattarella-Micke; Subhransu Maji; Mohit Iyyer; | |
636 | To BERT Or Not To BERT: Comparing Task-specific And Task-agnostic Semi-Supervised Approaches For Sequence Tagging Highlight: In this work, we investigate how to effectively use unlabeled data: by exploring the task-specific semi-supervised approach, Cross-View Training (CVT) and comparing it with task-agnostic BERT in multiple settings that include domain and task relevant English data. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kasturi Bhattacharjee; Miguel Ballesteros; Rishita Anubhai; Smaranda Muresan; Jie Ma; Faisal Ladhak; Yaser Al-Onaizan; | |
637 | Cold-start Active Learning Through Self-supervised Language Modeling Highlight: Therefore, we treat the language modeling loss as a proxy for classification uncertainty. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Michelle Yuan; Hsuan-Tien Lin; Jordan Boyd-Graber; | |
638 | Active Learning For BERT: An Empirical Study Highlight: Here, we present a large-scale empirical study on active learning techniques for BERT-based classification, addressing a diverse set of AL strategies and datasets. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Liat Ein-Dor; Alon Halfon; Ariel Gera; Eyal Shnarch; Lena Dankin; Leshem Choshen; Marina Danilevsky; Ranit Aharonov; Yoav Katz; Noam Slonim; | |
639 | Transformer Based Multi-Source Domain Adaptation Highlight: Here, we investigate the problem of unsupervised multi-source domain adaptation, where a model is trained on labelled data from multiple source domains and must make predictions on a domain for which no labelled data has been seen. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dustin Wright; Isabelle Augenstein; | code |
640 | Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework For Low-Latency Inference In NLP Applications Highlight: To address this issue, we propose a novel vector-vector-matrix architecture (VVMA), which greatly reduces the latency at inference time for NMT. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Matthew Khoury; Rumen Dangovski; Longwu Ou; Preslav Nakov; Yichen Shen; Li Jing; | |
641 | The Importance Of Fillers For Text Representations Of Speech Transcripts Highlight: We explore the possibility of representing them with deep contextualised embeddings, showing improvements on modelling spoken language and two downstream tasks – predicting a speaker’s stance and expressed confidence. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tanvi Dinkar; Pierre Colombo; Matthieu Labeau; Chloé Clavel; | |
642 | The Role Of Context In Neural Pitch Accent Detection In English Highlight: We propose a new model for pitch accent detection, inspired by the work of Stehwien et al. (2018), who presented a CNN-based model for this task. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Elizabeth Nielsen; Mark Steedman; Sharon Goldwater; | |
643 | VolTAGE: Volatility Forecasting Via Text Audio Fusion With Graph Convolution Networks For Earnings Calls Highlight: Building on existing work, we introduce a neural model for stock volatility prediction that accounts for stock interdependence via graph convolutions while fusing verbal, vocal, and financial features in a semi-supervised multi-task risk forecasting formulation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ramit Sawhney; Piyush Khanna; Arshiya Aggarwal; Taru Jain; Puneet Mathur; Rajiv Ratn Shah; | |
644 | Effectively Pretraining A Speech Translation Decoder With Machine Translation Data Highlight: In this paper, we will show that by using an adversarial regularizer, we can bring the encoder representations of the ASR and NMT tasks closer even though they are in different modalities, and how this helps us effectively use a pretrained NMT decoder for speech translation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ashkan Alinejad; Anoop Sarkar; | |
645 | A Preliminary Exploration Of GANs For Keyphrase Generation Highlight: We introduce a new keyphrase generation approach using Generative Adversarial Networks (GANs). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Avinash Swaminathan; Haimin Zhang; Debanjan Mahata; Rakesh Gosangi; Rajiv Ratn Shah; Amanda Stent; | |
646 | TESA: A Task In Entity Semantic Aggregation For Abstractive Summarization Highlight: In this paper, we present a new dataset and task aimed at the semantic aggregation of entities. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Clément Jumel; Annie Louis; Jackie Chi Kit Cheung; | |
647 | MLSUM: The Multilingual Summarization Corpus Highlight: We present MLSUM, the first large-scale MultiLingual SUMmarization dataset. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Thomas Scialom; Paul-Alexis Dray; Sylvain Lamprier; Benjamin Piwowarski; Jacopo Staiano; | |
648 | Multi-XScience: A Large-scale Dataset For Extreme Multi-document Summarization Of Scientific Articles Highlight: We propose Multi-XScience, a large-scale multi-document summarization dataset created from scientific articles. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Yao Lu; Yue Dong; Laurent Charlin; | |
649 | Intrinsic Evaluation Of Summarization Datasets Highlight: We perform the first large-scale evaluation of summarization datasets by introducing 5 intrinsic metrics and applying them to 10 popular datasets. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Rishi Bommasani; Claire Cardie; | |
650 | Iterative Feature Mining For Constraint-Based Data Collection To Increase Data Diversity And Model Robustness Highlight: We propose a general approach for guiding workers to write more diverse text by iteratively constraining their writing. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Stefan Larson; Anthony Zheng; Anish Mahendran; Rishi Tekriwal; Adrian Cheung; Eric Guldan; Kevin Leach; Jonathan K. Kummerfeld; | |
651 | Conversational Semantic Parsing For Dialog State Tracking Highlight: We describe an encoder-decoder framework for DST with hierarchical representations, which leads to ~20% improvement over state-of-the-art DST approaches that operate on a flat meaning space of slot-value pairs. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jianpeng Cheng; Devang Agrawal; Héctor Martínez Alonso; Shruti Bhargava; Joris Driesen; Federico Flego; Dain Kaplan; Dimitri Kartsaklis; Lin Li; Dhivya Piraviperumal; Jason D. Williams; Hong Yu; Diarmuid Ó Séaghdha; Anders Johannsen; | |
652 | Doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset Highlight: We introduce doc2dial, a new dataset of goal-oriented dialogues that are grounded in the associated documents. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Song Feng; Hui Wan; Chulaka Gunasekara; Siva Patel; Sachindra Joshi; Luis Lastras; | |
653 | Interview: Large-scale Modeling Of Media Dialog With Discourse Patterns And Knowledge Grounding Highlight: In this work, we perform the first large-scale analysis of discourse in media dialog and its impact on generative modeling of dialog turns, with a focus on interrogative patterns and use of external knowledge. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bodhisattwa Prasad Majumder; Shuyang Li; Jianmo Ni; Julian McAuley; | |
654 | INSPIRED: Toward Sociable Recommendation Dialog Systems Highlight: Therefore, we present INSPIRED, a new dataset of 1,001 human-human dialogs for movie recommendation with measures for successful recommendations. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shirley Anugrah Hayati; Dongyeop Kang; Qingxiaoyang Zhu; Weiyan Shi; Zhou Yu; | |
655 | Information Seeking In The Spirit Of Learning: A Dataset For Conversational Curiosity Highlight: We incorporate this knowledge into a multi-task model that reproduces human assistant policies and improves over a BERT content model by 13 mean reciprocal rank points. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Pedro Rodriguez; Paul Crook; Seungwhan Moon; Zhiguang Wang; | |
656 | Queens Are Powerful Too: Mitigating Gender Bias In Dialogue Generation Highlight: We consider three techniques to mitigate gender bias: counterfactual data augmentation, targeted data collection, and bias controlled training. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Emily Dinan; Angela Fan; Adina Williams; Jack Urbanek; Douwe Kiela; Jason Weston; | |
657 | Discriminatively-Tuned Generative Classifiers For Robust Natural Language Inference Highlight: In this paper, we focus on natural language inference (NLI). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiaoan Ding; Tianyu Liu; Baobao Chang; Zhifang Sui; Kevin Gimpel; | |
658 | New Protocols And Negative Results For Textual Entailment Data Collection Highlight: We propose four alternative protocols, each aimed at improving either the ease with which annotators can produce sound training examples or the quality and diversity of those examples. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Samuel R. Bowman; Jennimaria Palomaki; Livio Baldini Soares; Emily Pitler; | |
659 | The Curse Of Performance Instability In Analysis Datasets: Consequences, Source, And Suggestions Highlight: We find that the performance of state-of-the-art models on Natural Language Inference (NLI) and Reading Comprehension (RC) analysis/stress sets can be highly unstable. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Xiang Zhou; Yixin Nie; Hao Tan; Mohit Bansal; | |
660 | Universal Natural Language Processing With Limited Annotations: Try Few-shot Textual Entailment As A Start Highlight: In this work, we introduce Universal Few-shot textual Entailment (UFO-Entail). Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Wenpeng Yin; Nazneen Fatema Rajani; Dragomir Radev; Richard Socher; Caiming Xiong; | |
661 | ConjNLI: Natural Language Inference Over Conjunctive Sentences Highlight: Hence, we introduce ConjNLI, a challenge stress-test for natural language inference over conjunctive sentences, where the premise differs from the hypothesis by conjuncts removed, added, or replaced. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Swarnadeep Saha; Yixin Nie; Mohit Bansal; | |
662 | Data And Representation For Turkish Natural Language Inference Highlight: In this paper, we offer a positive response for natural language inference (NLI) in Turkish. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Emrah Budur; Rıza Özçelik; Tunga Gungor; Christopher Potts; | |
663 | Multitask Learning For Cross-Lingual Transfer Of Broad-coverage Semantic Dependencies Highlight: We describe a method for developing broad-coverage semantic dependency parsers for languages for which no semantically annotated resource is available. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Maryam Aminian; Mohammad Sadegh Rasooli; Mona Diab; | |
664 | Precise Task Formalization Matters In Winograd Schema Evaluations Highlight: We perform an ablation on two Winograd Schema datasets that interpolates between the formalizations used before and after this surge, and find (i) framing the task as multiple choice improves performance dramatically and (ii) several additional techniques, including the reuse of a pretrained language modeling head, can mitigate the model’s extreme sensitivity to hyperparameters. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Haokun Liu; William Huang; Dhara Mungra; Samuel R. Bowman; | |
665 | Avoiding The Hypothesis-Only Bias In Natural Language Inference Via Ensemble Adversarial Training Highlight: We show that the bias can be reduced in the sentence representations by using an ensemble of adversaries, encouraging the model to jointly decrease the accuracy of these different adversaries while fitting the data. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Joe Stacey; Pasquale Minervini; Haim Dubossarsky; Sebastian Riedel; Tim Rocktäschel; | |
666 | SynSetExpan: An Iterative Framework For Joint Entity Set Expansion And Synonym Discovery Highlight: In this work, we hypothesize that these two tasks are tightly coupled because two synonymous entities tend to have a similar likelihood of belonging to various semantic classes. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jiaming Shen; Wenda Qiu; Jingbo Shang; Michelle Vanni; Xiang Ren; Jiawei Han; | |
667 | Evaluating The Calibration Of Knowledge Graph Embeddings For Trustworthy Link Prediction Highlight: In this paper we take initial steps toward this direction by investigating the calibration of KGE models, or the extent to which they output confidence scores that reflect the expected correctness of predicted knowledge graph triples. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tara Safavi; Danai Koutra; Edgar Meij; | |
668 | Text Graph Transformer For Document Classification Highlight: We propose a mini-batch text graph sampling method that significantly reduces computing and memory costs to handle large-sized corpus. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Haopeng Zhang; Jiawei Zhang; | |
669 | CoDEx: A Comprehensive Knowledge Graph Completion Benchmark Highlight: We present CoDEx, a set of knowledge graph completion datasets extracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Tara Safavi; Danai Koutra; | code |
670 | META: Metadata-Empowered Weak Supervision For Text Classification Highlight: In this paper, we propose a novel framework, META, which goes beyond the existing paradigm and leverages metadata as an additional source of weak supervision. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Dheeraj Mekala; Xinyang Zhang; Jingbo Shang; | |
671 | Towards More Accurate Uncertainty Estimation In Text Classification Highlight: To achieve this, we aim at generating accurate uncertainty score by improving the confidence of winning scores. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jianfeng He; Xuchao Zhang; Shuo Lei; Zhiqian Chen; Fanglan Chen; Abdulaziz Alhamadani; Bei Xiao; ChangTien Lu; | |
672 | Chapter Captor: Text Segmentation In Novels Highlight: Using this annotated data as ground truth after removing structural cues, we present cut-based and neural methods for chapter segmentation, achieving an F1-score of 0.453 on the challenging task of exact break prediction over book-length documents. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Charuta Pethe; Allen Kim; Steve Skiena; | |
673 | Authorship Attribution For Neural Text Generation Highlight: In this work, in the context of this Turing Test, we investigate the so-called authorship attribution problem in three versions: (1) given two texts T1 and T2, are both generated by the same method or not? Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Adaku Uchendu; Thai Le; Kai Shu; Dongwon Lee; | code |
674 | NwQM: A Neural Quality Assessment Framework For Wikipedia Highlight: In this paper we propose Neural wikipedia Quality Monitor (NwQM), a novel deep learning model which accumulates signals from several key information sources such as article text, meta data and images to obtain improved Wikipedia article representation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Bhanu Prakash Reddy Guda; Sasi Bhushan Seelaboyina; Soumya Sarkar; Animesh Mukherjee; | |
675 | Towards Modeling Revision Requirements In WikiHow Instructions Highlight: In this work, we test whether the need for such edits can be predicted automatically. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Irshad Bhat; Talita Anthonio; Michael Roth; | |
676 | Deep Attentive Learning For Stock Movement Prediction From Social Media Text And Company Correlations Highlight: We introduce an architecture that achieves a potent blend of chaotic temporal signals from financial data, social media, and inter-stock relationships via a graph neural network in a hierarchical temporal fashion. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Ramit Sawhney; Shivam Agarwal; Arnav Wadhwa; Rajiv Ratn Shah; | |
677 | Natural Language Processing For Achieving Sustainable Development: The Case Of Neural Labelling To Enhance Community Profiling Highlight: In this paper, we show the high potential of NLP to enhance project sustainability. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Costanza Conforti; Stephanie Hirmer; Dai Morgan; Marco Basaldella; Yau Ben Or; | |
678 | To Schedule Or Not To Schedule: Extracting Task Specific Temporal Entities And Associated Negation Constraints Highlight: We showcase a novel model for extracting task-specific date-time entities along with their negation constraints. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Barun Patra; Chala Fufa; Pamela Bhattacharya; Charles Lee; | |
679 | Competence-Level Prediction And Resume & Job Description Matching Using Context-Aware Transformer Models Highlight: This paper presents a comprehensive study on resume classification to reduce the time and labor needed to screen an overwhelming number of applications significantly, while improving the selection of suitable candidates. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Changmao Li; Elaine Fisher; Rebecca Thomas; Steve Pittard; Vicki Hertzberg; Jinho D. Choi; | |
680 | Grammatical Error Correction In Low Error Density Domains: A New Benchmark And Analyses Highlight: We aim to broaden the target domain of GEC and release CWEB, a new benchmark for GEC consisting of website text generated by English speakers of varying levels of proficiency. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Simon Flachs; Ophélie Lacroix; Helen Yannakoudakis; Marek Rei; Anders Søgaard; | |
681 | Deconstructing Word Embedding Algorithms Highlight: In this work, we deconstruct Word2vec, GloVe, and others, into a common form, unveiling some of the common conditions that seem to be required for making performant word embeddings. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Kian Kenyon-Dean; Edward Newell; Jackie Chi Kit Cheung; | |
682 | Sequential Modelling Of The Evolution Of Word Representations For Semantic Change Detection Highlight: In this work, we propose three variants of sequential models for detecting semantically shifted words, effectively accounting for the changes in the word representations over time. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Adam Tsakalidis; Maria Liakata; | |
683 | Sparsity Makes Sense: Word Sense Disambiguation Using Sparse Contextualized Word Representations Highlight: In this paper, we demonstrate that by utilizing sparse word representations, it becomes possible to surpass the results of more complex task-specific models on the task of fine-grained all-words word sense disambiguation. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Gábor Berend; | |
684 | Exploring Semantic Capacity Of Terms Highlight: We introduce and study semantic capacity of terms. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Jie Huang; Zilong Wang; Kevin Chang; Wen-mei Hwu; JinJun Xiong; | |
685 | Learning To Ignore: Long Document Coreference With Bounded Memory Neural Networks Highlight: We argue that keeping all entities in memory is unnecessary, and we propose a memory-augmented neural network that tracks only a small bounded number of entities at a time, thus guaranteeing a linear runtime in length of document. Related Papers Related Patents Related Grants Related Orgs Related Experts Details |
Shubham Toshniwal; Sam Wiseman; Allyson Ettinger; Karen Livescu; Kevin Gimpel; | |
686 | Revealing The Myth Of Higher-Order Inference In Coreference Resolution Highlight: To make a comprehensive analysis, we implement an end-to-end coreference system as well as four HOI approaches, attended antecedent, entity equalization, span clustering, and cluster merging, where the latter two are our original methods. |
Liyan Xu; Jinho D. Choi; | |
687 | Pre-training Mention Representations In Coreference Models Highlight: We propose two self-supervised tasks that are closely related to coreference resolution and thus improve mention representation. |
Yuval Varkel; Amir Globerson; | |
688 | Learning Collaborative Agents With Rule Guidance For Knowledge Graph Reasoning Highlight: In this paper, we propose RuleGuider, which leverages high-quality rules generated by symbolic-based methods to provide reward supervision for walk-based agents. |
Deren Lei; Gangrong Jiang; Xiaotao Gu; Kexuan Sun; Yuning Mao; Xiang Ren; | |
689 | Exploring Contextualized Neural Language Models For Temporal Dependency Parsing Highlight: In this paper, we develop several variants of a BERT-based temporal dependency parser, and show that BERT significantly improves temporal dependency parsing (Zhang and Xue, 2018a). |
Hayley Ross; Jonathon Cai; Bonan Min; | code |
690 | Systematic Comparison Of Neural Architectures And Training Approaches For Open Information Extraction Highlight: In this work, we systematically compare different neural network architectures and training approaches, and improve the performance of the currently best models on the OIE16 benchmark (Stanovsky and Dagan, 2016) by 0.421 F1 score and 0.420 AUC-PR, respectively, in our experiments (i.e., by more than 200% in both cases). |
Patrick Hohenecker; Frank Mtumbuka; Vid Kocijan; Thomas Lukasiewicz; | |
691 | SeqMix: Augmenting Active Sequence Labeling Via Sequence Mixup Highlight: We propose a simple but effective data augmentation method to improve label efficiency of active sequence labeling. |
Rongzhi Zhang; Yue Yu; Chao Zhang; | code |
692 | AxCell: Automatic Extraction Of Results From Machine Learning Papers Highlight: In this paper, we present AxCell, an automatic machine learning pipeline for extracting results from papers. |
Marcin Kardas; Piotr Czapla; Pontus Stenetorp; Sebastian Ruder; Sebastian Riedel; Ross Taylor; Robert Stojnic; | |
693 | Knowledge-guided Open Attribute Value Extraction With Reinforcement Learning Highlight: In this work, we propose a knowledge-guided reinforcement learning (RL) framework for open attribute value extraction. |
Ye Liu; Sheng Zhang; Rui Song; Suo Feng; Yanghua Xiao; | |
694 | DualTKB: A Dual Learning Bridge Between Text And Knowledge Base Highlight: In this work, we present a dual learning approach for unsupervised text to path and path to text transfers in Commonsense Knowledge Bases (KBs). |
Pierre Dognin; Igor Melnyk; Inkit Padhi; Cicero Nogueira dos Santos; Payel Das; | |
695 | Incremental Neural Coreference Resolution In Constant Memory Highlight: In this work, we successfully convert a high-performing model (Joshi et al., 2020), asymptotically reducing its memory usage to constant space with only a 0.3% relative loss in F1 on OntoNotes 5.0. |
Patrick Xia; João Sedoc; Benjamin Van Durme; | |
696 | Improving Low Compute Language Modeling With In-Domain Embedding Initialisation Highlight: We show that for our target setting in English, initialising and freezing input embeddings using in-domain data can improve language model performance by providing a useful representation of rare words, and this pattern holds across several different domains. |
Charles Welch; Rada Mihalcea; Jonathan K. Kummerfeld; | |
697 | KGPT: Knowledge-Grounded Pre-Training For Data-to-Text Generation Highlight: In this paper, we propose to leverage pre-training and transfer learning to address this issue. |
Wenhu Chen; Yu Su; Xifeng Yan; William Yang Wang; | |
698 | POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Highlight: To address this challenge, we present POINTER (PrOgressive INsertion-based TransformER), a simple yet novel insertion-based approach for hard-constrained text generation. |
Yizhe Zhang; Guoyin Wang; Chunyuan Li; Zhe Gan; Chris Brockett; Bill Dolan; | |
699 | Unsupervised Text Style Transfer With Padded Masked Language Models Highlight: We propose Masker, an unsupervised text-editing method for style transfer. |
Eric Malmi; Aliaksei Severyn; Sascha Rothe; | |
700 | PALM: Pre-training An Autoencoding&Autoregressive Language Model For Context-conditioned Generation Highlight: This work presents PALM with a novel scheme that jointly pre-trains an autoencoding and autoregressive language model on a large unlabeled corpus, specifically designed for generating new text conditioned on context. |
Bin Bi; Chenliang Li; Chen Wu; Ming Yan; Wei Wang; Songfang Huang; Fei Huang; Luo Si; | |
701 | Gradient-guided Unsupervised Lexically Constrained Text Generation Highlight: In this paper, we propose a novel method G2LC to solve the lexically-constrained generation as an unsupervised gradient-guided optimization problem. |
Lei Sha; | |
702 | TeaForN: Teacher-Forcing With N-grams Highlight: Our proposed method, Teacher-Forcing with N-grams (TeaForN), addresses both these problems directly, through the use of a stack of N decoders trained to decode along a secondary time axis that allows model-parameter updates based on N prediction steps. |
Sebastian Goodman; Nan Ding; Radu Soricut; | |
703 | Experience Grounds Language Highlight: We posit that the present success of representation learning approaches trained on large, text-only corpora requires the parallel tradition of research on the broader physical and social context of language to address the deeper questions of communication. |
Yonatan Bisk; Ari Holtzman; Jesse Thomason; Jacob Andreas; Yoshua Bengio; Joyce Chai; Mirella Lapata; Angeliki Lazaridou; Jonathan May; Aleksandr Nisnevich; Nicolas Pinto; Joseph Turian; | |
704 | Keep CALM And Explore: Language Models For Action Generation In Text-based Games Highlight: In this paper, we propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates at each game state. |
Shunyu Yao; Rohan Rao; Matthew Hausknecht; Karthik Narasimhan; | code |
705 | CapWAP: Image Captioning With A Purpose Highlight: In this paper, we propose a new task, Captioning with A Purpose (CapWAP). |
Adam Fisch; Kenton Lee; Ming-Wei Chang; Jonathan Clark; Regina Barzilay; | |
706 | What Is More Likely To Happen Next? Video-and-Language Future Event Prediction Highlight: In this work, we explore whether AI models are able to learn to make such multimodal commonsense next-event predictions. |
Jie Lei; Licheng Yu; Tamara Berg; Mohit Bansal; | |
707 | X-LXMERT: Paint, Caption And Answer Questions With Multi-Modal Transformers Highlight: We introduce X-LXMERT, an extension to LXMERT with training refinements including: discretizing visual representations, using uniform masking with a large range of masking ratios, and aligning the right pre-training datasets to the right objectives, which enables it to paint. |
Jaemin Cho; Jiasen Lu; Dustin Schwenk; Hannaneh Hajishirzi; Aniruddha Kembhavi; | |
708 | Towards Understanding Sample Variance In Visually Grounded Language Generation: Evaluations And Observations Highlight: In this work, we set forth to design a set of experiments to understand an important but often ignored problem in visually grounded language generation: given that humans have different utilities and visual attention, how will the sample variance in multi-reference datasets affect the models’ performance? |
Wanrong Zhu; Xin Wang; Pradyumna Narayana; Kazoo Sone; Sugato Basu; William Yang Wang; | |
709 | Beyond Instructional Videos: Probing For More Diverse Visual-Textual Grounding On YouTube Highlight: We find that visual-textual grounding is indeed possible across previously unexplored video categories, and that pretraining on a more diverse set results in representations that generalize to both non-instructional and instructional domains. |
Jack Hessel; Zhenhai Zhu; Bo Pang; Radu Soricut; | |
710 | Hierarchical Graph Network For Multi-hop Question Answering Highlight: In this paper, we present Hierarchical Graph Network (HGN) for multi-hop question answering. |
Yuwei Fang; Siqi Sun; Zhe Gan; Rohit Pillai; Shuohang Wang; Jingjing Liu; | |
711 | A Simple Yet Strong Pipeline For HotpotQA Highlight: Our pipeline has three steps: 1) use BERT to identify potentially relevant sentences *independently* of each other; 2) feed the set of selected sentences as context into a standard BERT span prediction model to choose an answer; and 3) use the sentence selection model, now with the chosen answer, to produce supporting sentences. |
Dirk Groeneveld; Tushar Khot; Mausam; Ashish Sabharwal; | |
712 | Is Multihop QA In DiRe Condition? Measuring And Reducing Disconnected Reasoning Highlight: Third, our experiments suggest that there hasn’t been much progress in multi-hop QA in the reading comprehension setting. |
Harsh Trivedi; Niranjan Balasubramanian; Tushar Khot; Ashish Sabharwal; | |
713 | Unsupervised Question Decomposition For Question Answering Highlight: Specifically, we propose an algorithm for One-to-N Unsupervised Sequence transduction (ONUS) that learns to map one hard, multi-hop question to many simpler, single-hop sub-questions. |
Ethan Perez; Patrick Lewis; Wen-tau Yih; Kyunghyun Cho; Douwe Kiela; | |
714 | SRLGRN: Semantic Role Labeling Graph Reasoning Network Highlight: We propose a graph reasoning network based on the semantic structure of the sentences to learn cross paragraph reasoning paths and find the supporting facts and the answer jointly. |
Chen Zheng; Parisa Kordjamshidi; | |
715 | CancerEmo: A Dataset For Fine-Grained Emotion Detection Highlight: To this end, we introduce CancerEmo, an emotion dataset created from an online health community and annotated with eight fine-grained emotions. |
Tiberiu Sosea; Cornelia Caragea; | |
716 | Exploring The Role Of Argument Structure In Online Debate Persuasion Highlight: In this paper, we aim to further investigate the role of discourse structure of the arguments from online debates in their persuasiveness. |
Jialu Li; Esin Durmus; Claire Cardie; | |
717 | Zero-Shot Stance Detection: A Dataset And Model Using Generalized Topic Representations Highlight: In this paper, we present a new dataset for zero-shot stance detection that captures a wider range of topics and lexical variation than in previous datasets. |
Emily Allaway; Kathleen McKeown; | |
718 | Sentiment Analysis Of Tweets Using Heterogeneous Multi-layer Network Representation And Embedding Highlight: This study proposes a heterogeneous multi-layer network-based representation of tweets to generate multiple representations of a tweet and address the above issues. |
Loitongbam Gyanendro Singh; Anasua Mitra; Sanasam Ranbir Singh; | |
719 | Introducing Syntactic Structures Into Target Opinion Word Extraction With Deep Learning Highlight: In this work, we propose to incorporate the syntactic structures of the sentences into the deep learning models for TOWE, leveraging the syntax-based opinion possibility scores and the syntactic connections between the words. |
Amir Pouran Ben Veyseh; Nasim Nouri; Franck Dernoncourt; Dejing Dou; Thien Huu Nguyen; | |
720 | EmoTag1200: Understanding The Association Between Emojis And Emotions Highlight: In this paper, we seek to explore the connection between emojis and emotions by means of a new dataset consisting of human-solicited association ratings. |
Abu Awal Md Shoeb; Gerard de Melo; | |
721 | MIME: MIMicking Emotions For Empathetic Response Generation Highlight: We argue that empathetic responses often mimic the emotion of the user to a varying degree, depending on its positivity or negativity and content. |
Navonil Majumder; Pengfei Hong; Shanshan Peng; Jiankun Lu; Deepanway Ghosal; Alexander Gelbukh; Rada Mihalcea; Soujanya Poria; | code |
722 | Exploiting Structured Knowledge In Text Via Graph-Guided Representation Learning Highlight: In this work, we aim at equipping pre-trained language models with structured knowledge. |
Tao Shen; Yi Mao; Pengcheng He; Guodong Long; Adam Trischler; Weizhu Chen; | |
723 | Named Entity Recognition Only From Word Embeddings Highlight: In this work, we propose a fully unsupervised NE recognition model which only needs to take informative clues from pre-trained word embeddings. |
Ying Luo; Hai Zhao; Junlang Zhan; | |
724 | Text Classification Using Label Names Only: A Language Model Self-Training Approach Highlight: In this paper, we explore the potential of only using the label name of each class to train classification models on unlabeled data, without using any labeled documents. |
Yu Meng; Yunyi Zhang; Jiaxin Huang; Chenyan Xiong; Heng Ji; Chao Zhang; Jiawei Han; | |
725 | Neural Topic Modeling With Cycle-Consistent Adversarial Training Highlight: To overcome such limitations, we propose Topic Modeling with Cycle-consistent Adversarial Training (ToMCAT) and its supervised version sToMCAT. |
Xuemeng Hu; Rui Wang; Deyu Zhou; Yuxuan Xiong; | |
726 | Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Highlight: In this paper, we present a powerful and easy to deploy text augmentation framework, Data Boost, which augments data through reinforcement learning guided conditional generation. |
Ruibo Liu; Guangxuan Xu; Chenyan Jia; Weicheng Ma; Lili Wang; Soroush Vosoughi; | |
727 | A State-independent And Time-evolving Network For Early Rumor Detection In Social Media Highlight: In this paper, we study automatic rumor detection in social media at the event level, where an event consists of a sequence of posts organized according to the posting time. |
Rui Xia; Kaizhou Xuan; Jianfei Yu; | |
728 | PyMT5: Multi-mode Translation Of Natural Language And Python Code With Transformers Highlight: We present an analysis and modeling effort of a large-scale parallel corpus of 26 million Python methods and 7.7 million method-docstring pairs, demonstrating that for docstring and method generation, PyMT5 outperforms similarly-sized auto-regressive language models (GPT2) which were English pre-trained or randomly initialized. |
Colin Clement; Dawn Drain; Jonathan Timcheck; Alexey Svyatkovskiy; Neel Sundaresan; | |
729 | PathQG: Neural Question Generation From Facts Highlight: In this paper, we explore incorporating facts in the text for question generation in a comprehensive way. |
Siyuan Wang; Zhongyu Wei; Zhihao Fan; Zengfeng Huang; Weijian Sun; Qi Zhang; Xuanjing Huang; | |
730 | What Time Is It? Temporal Analysis Of Novels Highlight: To do so, we construct a data set of hourly time phrases from 52,183 fictional books. We then construct a time-of-day classification model that achieves an average error of 2.27 hours. |
Allen Kim; Charuta Pethe; Steve Skiena; | |
731 | COGS: A Compositional Generalization Challenge Based On Semantic Interpretation Highlight: To facilitate the evaluation of the compositional abilities of language processing architectures, we introduce COGS, a semantic parsing dataset based on a fragment of English. |
Najoung Kim; Tal Linzen; | |
732 | An Analysis Of Natural Language Inference Benchmarks Through The Lens Of Negation Highlight: In this paper, we present a new benchmark for natural language inference in which negation plays a critical role. |
Md Mosharaf Hossain; Venelin Kovatchev; Pranoy Dutta; Tiffany Kao; Elizabeth Wei; Eduardo Blanco; | |
733 | On The Sentence Embeddings From Pre-trained Language Models Highlight: In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited. |
Bohan Li; Hao Zhou; Junxian He; Mingxuan Wang; Yiming Yang; Lei Li; | code |
734 | What Can We Learn From Collective Human Opinions On Natural Language Inference Data? Highlight: We collect ChaosNLI, a dataset with a total of 464,500 annotations to study Collective HumAn OpinionS in oft-used NLI evaluation sets. |
Yixin Nie; Xiang Zhou; Mohit Bansal; | |
735 | Improving Text Generation With Student-Forcing Optimal Transport Highlight: To reduce this gap between training and testing, we propose using optimal transport (OT) to match the sequences generated in these two modes. |
Jianqiao Li; Chunyuan Li; Guoyin Wang; Hao Fu; Yuhchen Lin; Liqun Chen; Yizhe Zhang; Chenyang Tao; Ruiyi Zhang; Wenlin Wang; Dinghan Shen; Qian Yang; Lawrence Carin; | |
736 | UNION: An Unreferenced Metric For Evaluating Open-ended Story Generation Highlight: We propose an approach of constructing negative samples by mimicking the errors commonly observed in existing NLG models, including repeated plots, conflicting logic, and long-range incoherence. |
Jian Guan; Minlie Huang; | |
737 | F^2-Softmax: Diversifying Neural Text Generation Via Frequency Factorized Softmax Highlight: As a simple yet effective remedy, we propose two novel methods, F^2-Softmax and MefMax, for balanced training even with the skewed frequency distribution. |
Byung-Ju Choi; Jimin Hong; David Park; Sang Wan Lee; | |
738 | Partially-Aligned Data-to-Text Generation With Distant Supervision Highlight: To tackle this new task, we propose a novel distant supervision generation framework. |
Zihao Fu; Bei Shi; Wai Lam; Lidong Bing; Zhiyuan Liu; | code |
739 | Like Hiking? You Probably Enjoy Nature: Persona-grounded Dialog With Commonsense Expansions Highlight: In this paper, we propose to expand available persona sentences using existing commonsense knowledge bases and paraphrasing resources to imbue dialog models with access to an expanded and richer set of persona descriptions. |
Bodhisattwa Prasad Majumder; Harsh Jhamtani; Taylor Berg-Kirkpatrick; Julian McAuley; | |
740 | A Probabilistic End-To-End Task-Oriented Dialog Model With Latent Belief States Towards Semi-Supervised Learning Highlight: In this paper we aim at alleviating the reliance on belief state labels in building end-to-end dialog systems, by leveraging unlabeled dialog data towards semi-supervised learning. |
Yichi Zhang; Zhijian Ou; Min Hu; Junlan Feng; | |
741 | The World Is Not Binary: Learning To Rank With Grayscale Data For Dialogue Response Selection Highlight: In this work, we show that grayscale data can be automatically constructed without human effort. |
Zibo Lin; Deng Cai; Yan Wang; Xiaojiang Liu; Haitao Zheng; Shuming Shi; | |
742 | GRADE: Automatic Graph-Enhanced Coherence Metric For Evaluating Open-Domain Dialogue Systems Highlight: Capitalizing on the topic-level dialogue graph, we propose a new evaluation metric GRADE, which stands for Graph-enhanced Representations for Automatic Dialogue Evaluation. |
Lishan Huang; Zheng Ye; Jinghui Qin; Liang Lin; Xiaodan Liang; | |
743 | MedDialog: Large-scale Medical Dialogue Datasets Highlight: To facilitate the research and development of medical dialogue systems, we build large-scale medical dialogue datasets – MedDialog, which contain 1) a Chinese dataset with 3.4 million conversations between patients and doctors, 11.3 million utterances, 660.2 million tokens, covering 172 specialties of diseases, and 2) an English dataset with 0.26 million conversations, 0.51 million utterances, 44.53 million tokens, covering 96 specialties of diseases. |
Guangtao Zeng; Wenmian Yang; Zeqian Ju; Yue Yang; Sicheng Wang; Ruisi Zhang; Meng Zhou; Jiaqi Zeng; Xiangyu Dong; Ruoyu Zhang; Hongchao Fang; Penghui Zhu; Shu Chen; Pengtao Xie; | code |
744 | An Information Theoretic View On Selecting Linguistic Probes Highlight: We show this dichotomy is valid information-theoretically. In addition, we find that the "good probe" criteria proposed by the two papers, *selectivity* (Hewitt and Liang, 2019) and *information gain* (Pimentel et al., 2020), are equivalent – the errors of their approaches are identical (modulo irrelevant terms). |
Zining Zhu; Frank Rudzicz; | |
745 | With Little Power Comes Great Responsibility Highlight: By meta-analyzing a set of existing NLP papers and datasets, we characterize typical power for a variety of settings and conclude that underpowered experiments are common in the NLP literature. |
Dallas Card; Peter Henderson; Urvashi Khandelwal; Robin Jia; Kyle Mahowald; Dan Jurafsky; | |
746 | Dataset Cartography: Mapping And Diagnosing Datasets With Training Dynamics Highlight: We introduce Data Maps, a model-based tool to characterize and diagnose datasets. |
Swabha Swayamdipta; Roy Schwartz; Nicholas Lourie; Yizhong Wang; Hannaneh Hajishirzi; Noah A. Smith; Yejin Choi; | |
747 | Evaluating And Characterizing Human Rationales Highlight: To unpack this finding, we propose improved metrics to account for model-dependent baseline performance. |
Samuel Carton; Anirudh Rathore; Chenhao Tan; | |
748 | On Extractive And Abstractive Neural Document Summarization With Transformer Language Models Highlight: We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. |
Jonathan Pilault; Raymond Li; Sandeep Subramanian; Chris Pal; | |
749 | Multi-Fact Correction In Abstractive Text Summarization Highlight: To address this challenge, we propose Span-Fact, a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection. |
Yue Dong; Shuohang Wang; Zhe Gan; Yu Cheng; Jackie Chi Kit Cheung; Jingjing Liu; | |
750 | Evaluating The Factual Consistency Of Abstractive Text Summarization Highlight: We propose a weakly-supervised, model-based approach for verifying factual consistency and identifying conflicts between source documents and generated summaries. |
Wojciech Kryscinski; Bryan McCann; Caiming Xiong; Richard Socher; | code |
751 | Re-evaluating Evaluation In Text Summarization Highlight: In this paper, we make an attempt to re-evaluate the evaluation method for text summarization: assessing the reliability of automatic metrics using top-scoring system outputs, both abstractive and extractive, on recently popular datasets for both system-level and summary-level evaluation settings. |
Manik Bhandari; Pranav Narayan Gour; Atabak Ashfaq; Pengfei Liu; Graham Neubig; | |
752 | VMSMO: Learning To Generate Multimodal Summary For Video-based News Articles Highlight: Hence, in this paper, we propose the task of Video-based Multimodal Summarization with Multimodal Output (VMSMO) to tackle such a problem. |
Mingzhe Li; Xiuying Chen; Shen Gao; Zhangming Chan; Dongyan Zhao; Rui Yan; |