Paper Digest: ACL 2019 Highlights
Download ACL-2019-Paper-Digests.pdf – highlights of all 660 (447 long + 213 short) ACL-2019 papers.
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2019, it was held in Florence, Italy. There were 2,905 paper submissions, of which 447 were accepted as long papers and 213 as short papers.
To help the AI community quickly catch up on the work presented at this conference, the Paper Digest team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights to quickly get the main idea of each paper.
We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to get new paper updates customized to your own interests on a daily basis.
Paper Digest Team
team@paperdigest.org
TABLE 1: ACL 2019 Papers
# | Title | Authors | Highlight |
---|---|---|---|
1 | One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues | Chongyang Tao, Wei Wu, Can Xu, Wenpeng Hu, Dongyan Zhao, Rui Yan, | In this work, we let utterance-response interaction go deep by proposing an interaction-over-interaction network (IoI). |
2 | Incremental Transformer with Deliberation Decoder for Document Grounded Conversations | Zekang Li, Cheng Niu, Fandong Meng, Yang Feng, Qian Li, Jie Zhou, | In this paper, we propose a novel Transformer-based architecture for multi-turn document grounded conversations. |
3 | Improving Multi-turn Dialogue Modelling with Utterance ReWriter | Hui Su, Xiaoyu Shen, Rongzhi Zhang, Fei Sun, Pengwei Hu, Cheng Niu, Jie Zhou, | In this paper, we propose rewriting the human utterance as a pre-process to help multi-turn dialogue modelling. |
4 | Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study | Chinnadhurai Sankar, Sandeep Subramanian, Chris Pal, Sarath Chandar, Yoshua Bengio, | In this paper, we take an empirical approach to understanding how these models use the available dialog history by studying the sensitivity of the models to artificially introduced unnatural changes or perturbations to their context at test time. |
5 | Boosting Dialog Response Generation | Wenchao Du, Alan W Black, | To address this problem, we designed an iterative training process and ensemble method based on boosting. |
6 | Constructing Interpretive Spatio-Temporal Features for Multi-Turn Responses Selection | Junyu Lu, Chenbin Zhang, Zeying Xie, Guang Ling, Tom Chao Zhou, Zenglin Xu, | To address these issues, we propose a Spatio-Temporal Matching network (STM) for response selection. |
7 | Semantic Parsing with Dual Learning | Ruisheng Cao, Su Zhu, Chen Liu, Jieyu Li, Kai Yu, | In this work, we develop a semantic parsing framework with the dual learning algorithm, which enables a semantic parser to make full use of data (labeled and even unlabeled) through a dual-learning game. |
8 | Semantic Expressive Capacity with Bounded Memory | Antoine Venant, Alexander Koller, | We investigate the capacity of mechanisms for compositional semantic parsing to describe relations between sentences and semantic representations. |
9 | AMR Parsing as Sequence-to-Graph Transduction | Sheng Zhang, Xutai Ma, Kevin Duh, Benjamin Van Durme, | We propose an attention-based model that treats AMR parsing as sequence-to-graph transduction. |
10 | Generating Logical Forms from Graph Representations of Text and Entities | Peter Shaw, Philip Massey, Angelica Chen, Francesco Piccinno, Yasemin Altun, | We present an approach that uses a Graph Neural Network (GNN) architecture to incorporate information about relevant entities and their relations during parsing. |
11 | Learning Compressed Sentence Representations for On-Device Text Processing | Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin, | In this paper, we propose four different strategies to transform continuous and generic sentence embeddings into a binarized form, while preserving their rich semantic information. |
12 | The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers | Agnieszka Falenska, Jonas Kuhn, | In this paper we aim to answer the question: How much structural context are the BiLSTM representations able to capture implicitly? |
13 | Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation | Masashi Yoshikawa, Hiroshi Noji, Koji Mineshima, Daisuke Bekki, | We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatic generation of CCG corpora exploiting cheaper resources of dependency trees. |
14 | A Joint Named-Entity Recognizer for Heterogeneous Tag-sets Using a Tag Hierarchy | Genady Beryozkin, Yoel Drori, Oren Gilon, Tzvika Hartman, Idan Szpektor, | We propose to use the given tag hierarchy to jointly learn a neural network that shares its tagging layer among all tag-sets. |
15 | Massively Multilingual Transfer for NER | Afshin Rahimi, Yuan Li, Trevor Cohn, | We propose two techniques for modulating the transfer, suitable for zero-shot or few-shot learning, respectively. |
16 | Reliability-aware Dynamic Feature Composition for Name Tagging | Ying Lin, Liyuan Liu, Heng Ji, Dong Yu, Jiawei Han, | In this paper, we propose a novel reliability-aware name tagging model to tackle this issue. |
17 | Unsupervised Pivot Translation for Distant Languages | Yichong Leng, Xu Tan, Tao Qin, Xiang-Yang Li, Tie-Yan Liu, | In this work, we introduce unsupervised pivot translation for distant languages, which translates a language to a distant language through multiple hops, and the unsupervised translation on each hop is relatively easier than the original direct translation. |
18 | Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces | Barun Patra, Joel Ruben Antony Moniz, Sarthak Garg, Matthew R. Gormley, Graham Neubig, | We propose a technique to quantitatively estimate this assumption of the isometry between two embedding spaces and empirically show that this assumption weakens as the languages in question become increasingly etymologically distant. |
19 | An Effective Approach to Unsupervised Machine Translation | Mikel Artetxe, Gorka Labaka, Eneko Agirre, | In this paper, we identify and address several deficiencies of existing unsupervised SMT approaches by exploiting subword information, developing a theoretically well founded unsupervised tuning method, and incorporating a joint refinement procedure. |
20 | Effective Adversarial Regularization for Neural Machine Translation | Motoki Sato, Jun Suzuki, Shun Kiyono, | We aim to further leverage this promising methodology into more sophisticated and critical neural models in the natural language processing field, i.e., neural machine translation (NMT) models. |
21 | Revisiting Low-Resource Neural Machine Translation: A Case Study | Rico Sennrich, Biao Zhang, | In this paper, we re-assess the validity of these results, arguing that they are the result of lack of system adaptation to low-resource settings. |
22 | Domain Adaptive Inference for Neural Machine Translation | Danielle Saunders, Felix Stahlberg, Adrià de Gispert, Bill Byrne, | We investigate adaptive ensemble weighting for Neural Machine Translation, addressing the case of improving performance on a new and potentially unknown domain without sacrificing performance on the original domain. |
23 | Neural Relation Extraction for Knowledge Base Enrichment | Bayu Distiawan Trisedya, Gerhard Weikum, Jianzhong Qi, Rui Zhang, | This way, NED errors may cause extraction errors that affect the overall precision and recall. To address this problem, we propose an end-to-end relation extraction model for KB enrichment based on a neural encoder-decoder model. |
24 | Attention Guided Graph Convolutional Networks for Relation Extraction | Zhijiang Guo, Yan Zhang, Wei Lu, | In this work, we propose Attention Guided Graph Convolutional Networks (AGGCNs), a novel model which directly takes full dependency trees as inputs. |
25 | Spatial Aggregation Facilitates Discovery of Spatial Topics | Aniruddha Maiti, Slobodan Vucetic, | By looking at topic discovery through matrix factorization lenses we show that spatial aggregation allows low rank approximation of the original document-word matrix, in which spatially distinct topics are preserved and non-spatial topics are aggregated into a single topic. |
26 | Relation Embedding with Dihedral Group in Knowledge Graph | Canran Xu, Ruijiang Li, | To fill this gap, we propose a new model called DihEdral, named after the dihedral symmetry group. |
27 | Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation | Benjamin Heinzerling, Michael Strube, | In this work, we conduct an extensive evaluation comparing non-contextual subword embeddings, namely FastText and BPEmb, and a contextual representation method, namely BERT, on multilingual named entity recognition and part-of-speech tagging. |
28 | Augmenting Neural Networks with First-order Logic | Tao Li, Vivek Srikumar, | In this paper, we present a novel framework for introducing declarative knowledge to neural network architectures in order to guide training and prediction. |
29 | Self-Regulated Interactive Sequence-to-Sequence Learning | Julia Kreutzer, Stefan Riezler, | We show how self-regulation strategies that decide when to ask for which kind of feedback from a teacher (or from oneself) can be cast as a learning-to-learn problem leading to improved cost-aware sequence-to-sequence learning. |
30 | You Only Need Attention to Traverse Trees | Mahtab Ahmed, Muhammad Rifayat Samee, Robert E. Mercer, | To this end, we propose Tree Transformer, a model that captures phrase level syntax for constituency trees as well as word-level dependencies for dependency trees by doing recursive traversal only with attention. |
31 | Cross-Domain Generalization of Neural Constituency Parsers | Daniel Fried, Nikita Kitaev, Dan Klein, | We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. |
32 | Adaptive Attention Span in Transformers | Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin, | We propose a novel self-attention mechanism that can learn its optimal attention span. |
33 | Neural News Recommendation with Long- and Short-term User Representations | Mingxiao An, Fangzhao Wu, Chuhan Wu, Kun Zhang, Zheng Liu, Xing Xie, | In this paper, we propose a neural news recommendation approach which can learn both long- and short-term user representations. |
34 | Automatic Domain Adaptation Outperforms Manual Domain Adaptation for Predicting Financial Outcomes | Marina Sedinkina, Nikolas Breitkopf, Hinrich Schütze, | In this paper, we automatically create sentiment dictionaries for predicting financial outcomes. |
35 | Manipulating the Difficulty of C-Tests | Ji-Ung Lee, Erik Schwan, Christian M. Meyer, | We propose two novel manipulation strategies for increasing and decreasing the difficulty of C-tests automatically. |
36 | Towards Unsupervised Text Classification Leveraging Experts and Word Embeddings | Zied Haj-Yahia, Adrien Sieg, Léa A. Deleris, | In this work, we explore an unsupervised approach to classify documents into categories simply described by a label. |
37 | Neural Text Simplification of Clinical Letters with a Domain Specific Phrase Table | Matthew Shardlow, Raheel Nawaz, | This work uses neural text simplification methods to automatically improve the understandability of clinical letters for patients. |
38 | What You Say and How You Say It Matters: Predicting Stock Volatility Using Verbal and Vocal Cues | Yu Qin, Yi Yang, | We propose a multimodal deep regression model (MDRM) that jointly model CEO’s verbal (from text) and vocal (from audio) information in a conference call. |
39 | Detecting Concealed Information in Text and Speech | Shengli Hu, | In this work, we explore acoustic-prosodic and linguistic indicators of information concealment by collecting a unique corpus of professionals practicing for oral exams while concealing information. |
40 | Evidence-based Trustworthiness | Yi Zhang, Zachary Ives, Dan Roth, | Our key contribution is to develop a family of probabilistic models that jointly estimate the trustworthiness of sources, and the credibility of claims they assert. |
41 | Disentangled Representation Learning for Non-Parallel Text Style Transfer | Vineet John, Lili Mou, Hareesh Bahuleyan, Olga Vechtomova, | We propose a simple yet effective approach, which incorporates auxiliary multi-task and adversarial objectives, for style prediction and bag-of-words prediction, respectively. |
42 | Cross-Sentence Grammatical Error Correction | Shamil Chollampatt, Weiqi Wang, Hwee Tou Ng, | In this paper, we address this serious limitation of existing approaches and improve strong neural encoder-decoder models by appropriately modeling wider contexts. |
43 | This Email Could Save Your Life: Introducing the Task of Email Subject Line Generation | Rui Zhang, Joel Tetreault, | In this paper, we propose and study the task of *email subject line generation*: automatically generating an email subject line from the email body. |
44 | Time-Out: Temporal Referencing for Robust Modeling of Lexical Semantic Change | Haim Dubossarsky, Simon Hengchen, Nina Tahmasebi, Dominik Schlechtweg, | We show that, trained on a diachronic corpus, the skip-gram with negative sampling architecture with temporal referencing outperforms alignment models on a synthetic task as well as a manual testset. |
45 | Adversarial Attention Modeling for Multi-dimensional Emotion Regression | Suyang Zhu, Shoushan Li, Guodong Zhou, | In this paper, we propose a neural network-based approach, namely Adversarial Attention Network, to the task of multi-dimensional emotion regression, which automatically rates multiple emotion dimension scores for an input text. |
46 | Divide, Conquer and Combine: Hierarchical Feature Fusion Network with Local and Global Perspectives for Multimodal Affective Computing | Sijie Mai, Haifeng Hu, Songlong Xing, | We propose a general strategy named 'divide, conquer and combine' for multimodal fusion. |
47 | Modeling Financial Analysts’ Decision Making via the Pragmatics and Semantics of Earnings Calls | Katherine Keith, Amanda Stent, | In this paper, we examine analysts’ decision making behavior as it pertains to the language content of earnings calls. |
48 | An Interactive Multi-Task Learning Network for End-to-End Aspect-Based Sentiment Analysis | Ruidan He, Wee Sun Lee, Hwee Tou Ng, Daniel Dahlmeier, | In this paper, we propose an interactive multi-task learning network (IMN) which is able to jointly learn multiple related tasks simultaneously at both the token level as well as the document level. |
49 | Decompositional Argument Mining: A General Purpose Approach for Argument Graph Construction | Debela Gemechu, Chris Reed, | This work presents an approach that decomposes propositions into four functional components and identifies the patterns linking those components to determine argument structure. |
50 | MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations | Soujanya Poria, Devamanyu Hazarika, Navonil Majumder, Gautam Naik, Erik Cambria, Rada Mihalcea, | Thus, we propose the Multimodal EmotionLines Dataset (MELD), an extension and enhancement of EmotionLines. |
51 | Open-Domain Targeted Sentiment Analysis via Span-Based Extraction and Classification | Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li, Yiwei Lv, | To address these problems, we propose a span-based extract-then-classify framework, where multiple opinion targets are directly extracted from the sentence under the supervision of target span boundaries, and corresponding polarities are then classified using their span representations. |
52 | Transfer Capsule Network for Aspect Level Sentiment Classification | Zhuang Chen, Tieyun Qian, | In this paper, we propose a Transfer Capsule Network (TransCap) model for transferring document-level knowledge to aspect-level sentiment classification. |
53 | Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis | Jialong Tang, Ziyao Lu, Jinsong Su, Yubin Ge, Linfeng Song, Le Sun, Jiebo Luo, | In this paper, we propose a progressive self-supervised attention learning approach for neural ASC models, which automatically mines useful attention supervision information from a training corpus to refine attention mechanisms. |
54 | Classification and Clustering of Arguments with Contextualized Word Embeddings | Nils Reimers, Benjamin Schiller, Tilman Beck, Johannes Daxenberger, Christian Stab, Iryna Gurevych, | For the first time, we show how to leverage the power of contextualized word embeddings to classify and cluster topic-dependent arguments, achieving impressive results on both tasks and across multiple datasets. |
55 | Sentiment Tagging with Partial Labels using Modular Architectures | Xiao Zhang, Dan Goldwasser, | In this paper we focus on a popular class of learning problems, sequence prediction applied to several sentiment analysis tasks, and suggest a modular learning approach in which different sub-tasks are learned using separate functional modules, combined to perform the final task while sharing information. |
56 | DOER: Dual Cross-Shared RNN for Aspect Term-Polarity Co-Extraction | Huaishao Luo, Tianrui Li, Bing Liu, Junbo Zhang, | In this paper, we treat these two tasks as two sequence labeling problems and propose a novel Dual crOss-sharEd RNN framework (DOER) to generate all aspect term-polarity pairs of the input sentence simultaneously. |
57 | A Corpus for Modeling User and Language Effects in Argumentation on Online Debating | Esin Durmus, Claire Cardie, | This paper presents a dataset of 78,376 debates generated over a 10-year period along with surprisingly comprehensive participant profiles. |
58 | Topic Tensor Network for Implicit Discourse Relation Recognition in Chinese | Sheng Xu, Peifeng Li, Fang Kong, Qiaoming Zhu, Guodong Zhou, | In this paper, we propose a topic tensor network to recognize Chinese implicit discourse relations with both sentence-level and topic-level representations. |
59 | Learning from Omission | Bill McDowell, Noah Goodman, | Here, we explore whether pragmatic reasoning during training can improve the quality of learned meanings. |
60 | Multi-Task Learning for Coherence Modeling | Youmna Farag, Helen Yannakoudakis, | We propose a hierarchical neural network trained in a multi-task fashion that learns to predict a document-level coherence score (at the network’s top layers) along with word-level grammatical roles (at the bottom layers), taking advantage of inductive transfer between the two tasks. |
61 | Data Programming for Learning Discourse Structure | Sonia Badene, Kate Thompson, Jean-Pierre Lorré, Nicholas Asher, | This paper investigates the advantages and limits of data programming for the task of learning discourse structure. |
62 | Evaluating Discourse in Structured Text Representations | Elisa Ferracane, Greg Durrett, Junyi Jessy Li, Katrin Erk, | We examine this model in detail, and evaluate on additional discourse-relevant tasks and datasets, in order to assess whether the structured attention improves performance on the end task and whether it captures a text’s discourse structure. |
63 | Know What You Don’t Know: Modeling a Pragmatic Speaker that Refers to Objects of Unknown Categories | Sina Zarrieß, David Schlangen, | We combine these lines of research and model zero-shot reference games, where a speaker needs to successfully refer to a novel object in an image. |
64 | End-to-end Deep Reinforcement Learning Based Coreference Resolution | Hongliang Fei, Xu Li, Dingcheng Li, Ping Li, | In this paper, we introduce an end-to-end reinforcement learning based coreference resolution model to directly optimize coreference evaluation metrics. |
65 | Implicit Discourse Relation Identification for Open-domain Dialogues | Mingyu Derek Ma, Kevin Bowden, Jiaqi Wu, Wen Cui, Marilyn Walker, | In this paper, we designed a novel discourse relation identification pipeline specifically tuned for open-domain dialogue systems. |
66 | Coreference Resolution with Entity Equalization | Ben Kantor, Amir Globerson, | Here we provide a simple and effective approach for achieving this, via an “Entity Equalization” mechanism. |
67 | A Cross-Domain Transferable Neural Coherence Model | Peng Xu, Hamidreza Saghir, Jin Sung Kang, Teng Long, Avishek Joey Bose, Yanshuai Cao, Jackie Chi Kit Cheung, | In this work, we propose a local discriminative neural model with a much smaller negative sampling space that can efficiently learn against incorrect orderings. |
68 | MOROCO: The Moldavian and Romanian Dialectal Corpus | Andrei Butnaru, Radu Tudor Ionescu, | In this work, we introduce the MOldavian and ROmanian Dialectal COrpus (MOROCO), which is freely available for download at https://github.com/butnaruandrei/MOROCO. |
69 | Just “OneSeC” for Producing Multilingual Sense-Annotated Data | Bianca Scarlini, Tommaso Pasini, Roberto Navigli, | In this paper we formulate the assumption of One Sense per Wikipedia Category and present OneSeC, a language-independent method for the automatic extraction of hundreds of thousands of sentences in which a target word is tagged with its meaning. |
70 | How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions | Goran Glavaš, Robert Litschko, Sebastian Ruder, Ivan Vulić, | In this work, we take the first step towards a comprehensive evaluation of CLE models: we thoroughly evaluate both supervised and unsupervised CLE models, for a large number of language pairs, on BLI and three downstream tasks, providing new insights concerning the ability of cutting-edge CLE models to support cross-lingual NLP. |
71 | SP-10K: A Large-scale Evaluation Set for Selectional Preference Acquisition | Hongming Zhang, Hantian Ding, Yangqiu Song, | To provide a better evaluation method for SP models, we introduce SP-10K, a large-scale evaluation set that provides human ratings for the plausibility of 10,000 SP pairs over five SP relations, covering 2,500 most frequent verbs, nouns, and adjectives in American English. |
72 | A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains | Dominik Schlechtweg, Anna Hätty, Marco Del Tredici, Sabine Schulte im Walde, | We perform an interdisciplinary large-scale evaluation for detecting lexical semantic divergences in a diachronic and in a synchronic task: semantic sense changes across time, and semantic sense changes across domains. |
73 | Errudite: Scalable, Reproducible, and Testable Error Analysis | Tongshuang Wu, Marco Tulio Ribeiro, Jeffrey Heer, Daniel Weld, | This paper codifies model and task agnostic principles for informative error analysis, and presents Errudite, an interactive tool for better supporting this process. |
74 | DocRED: A Large-Scale Document-Level Relation Extraction Dataset | Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zhenghao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, Maosong Sun, | In order to accelerate the research on document-level RE, we introduce DocRED, a new dataset constructed from Wikipedia and Wikidata with three features: (1) DocRED annotates both named entities and relations, and is the largest human-annotated dataset for document-level RE from plain text; (2) DocRED requires reading multiple sentences in a document to extract entities and infer their relations by synthesizing all information of the document; (3) along with the human-annotated data, we also offer large-scale distantly supervised data, which enables DocRED to be adopted for both supervised and weakly supervised scenarios. |
75 | ChID: A Large-scale Chinese IDiom Dataset for Cloze Test | Chujie Zheng, Minlie Huang, Aixin Sun, | In this paper we propose a large-scale Chinese cloze test dataset ChID, which studies the comprehension of idiom, a unique language phenomenon in Chinese. |
76 | Automatic Evaluation of Local Topic Quality | Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Courtni Byun, Jordan Boyd-Graber, Kevin Seppi, | We propose a task designed to elicit human judgments of token-level topic assignments. |
77 | Crowdsourcing and Aggregating Nested Markable Annotations | Chris Madge, Juntao Yu, Jon Chamberlain, Udo Kruschwitz, Silviu Paun, Massimo Poesio, | In this paper, we present a method for identifying markables for coreference annotation that combines high-performance automatic markable detectors with checking with a Game-With-A-Purpose (GWAP) and aggregation using a Bayesian annotation model. |
78 | Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems | Chien-Sheng Wu, Andrea Madotto, Ehsan Hosseini-Asl, Caiming Xiong, Richard Socher, Pascale Fung, | In this paper, we propose a Transferable Dialogue State Generator (TRADE) that generates dialogue states from utterances using copy mechanism, facilitating transfer when predicting (domain, slot, value) triplets not encountered during training. |
79 | Multi-Task Networks with Universe, Group, and Task Feature Learning | Shiva Pentyala, Mengwen Liu, Markus Dreyer, | We present methods for multi-task learning that take advantage of natural groupings of related tasks. |
80 | Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue | Anusha Balakrishnan, Jinfeng Rao, Kartikeya Upasani, Michael White, Rajen Subba, | In this paper, we (1) propose using tree-structured semantic representations, like those used in traditional rule-based NLG systems, for better discourse-level structuring and sentence-level planning; (2) introduce a challenging dataset using this representation for the weather domain; (3) introduce a constrained decoding approach for Seq2Seq models that leverages this representation to improve semantic correctness; and (4) demonstrate promising results on our dataset and the E2E dataset. |
81 | OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs | Seungwhan Moon, Pararth Shah, Anuj Kumar, Rajen Subba, | We study a conversational reasoning model that strategically traverses through a large-scale common fact knowledge graph (KG) to introduce engaging and contextually diverse entities and attributes. |
82 | Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing | Daya Guo, Duyu Tang, Nan Duan, Ming Zhou, Jian Yin, | In this paper, we present an approach to incorporate retrieved datapoints as supporting evidence for context-dependent semantic parsing, such as generating source code conditioned on the class environment. |
83 | Knowledge-aware Pronoun Coreference Resolution | Hongming Zhang, Yan Song, Yangqiu Song, Dong Yu, | In this paper, we explore how to leverage different types of knowledge to better resolve pronoun coreference with a neural model. |
84 | Don’t Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference | Yonatan Belinkov, Adam Poliak, Stuart Shieber, Benjamin Van Durme, Alexander Rush, | We propose two probabilistic methods to build models that are more robust to such biases and better transfer across datasets. |
85 | GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification | Jie Zhou, Xu Han, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, Maosong Sun, | To alleviate this issue, we propose a graph-based evidence aggregating and reasoning (GEAR) framework which enables information to transfer on a fully-connected evidence graph and then utilizes different aggregators to collect multi-evidence information. |
86 | SherLIiC: A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference | Martin Schmitt, Hinrich Schütze, | We present SherLIiC, a testbed for lexical inference in context (LIiC), consisting of 3985 manually annotated inference rule candidates (InfCands), accompanied by (i) ~960k unlabeled InfCands, and (ii) ~190k typed textual relations between Freebase entities extracted from the large entity-linked corpus ClueWeb09. |
87 | Extracting Symptoms and their Status from Clinical Conversations | Nan Du, Kai Chen, Anjuli Kannan, Linh Tran, Yuhui Chen, Izhak Shafran, | This paper describes novel models tailored for a new application, that of extracting the symptoms mentioned in clinical conversations along with their status. |
88 | What Makes a Good Counselor? Learning to Distinguish between High-quality and Low-quality Counseling Conversations | Verónica Pérez-Rosas, Xinyi Wu, Kenneth Resnicow, Rada Mihalcea, | In this paper, we explore several linguistic aspects of the collaboration process occurring during counseling conversations. |
89 | Finding Your Voice: The Linguistic Development of Mental Health Counselors | Justine Zhang, Robert Filbin, Christine Morrison, Jaclyn Weiser, Cristian Danescu-Niculescu-Mizil, | In this work, we develop a computational framework to quantify the extent to which individuals change their linguistic behavior with experience and to study the nature of this evolution. |
90 | Towards Automating Healthcare Question Answering in a Noisy Multilingual Low-Resource Setting | Jeanne E. Daniel, Willie Brink, Ryan Eloff, Charles Copley, | We discuss ongoing work into automating a multilingual digital helpdesk service available via text messaging to pregnant and breastfeeding mothers in South Africa. |
91 | Joint Entity Extraction and Assertion Detection for Clinical Text | Parminder Bhatia, Busra Celikkaya, Mohammed Khalilia, | We consider this as a multi-task problem and present a novel end-to-end neural model to jointly extract entities and negations. |
92 | HEAD-QA: A Healthcare Dataset for Complex Reasoning | David Vilares, Carlos Gómez-Rodríguez, | We present HEAD-QA, a multi-choice question answering testbed to encourage research on complex reasoning. |
93 | Are You Convinced? Choosing the More Convincing Evidence with a Siamese Network | Martin Gleize, Eyal Shnarch, Leshem Choshen, Lena Dankin, Guy Moshkowich, Ranit Aharonov, Noam Slonim, | In this paper, we present a new data set, IBM-EviConv, of pairs of evidence labeled for convincingness, designed to be more challenging than existing alternatives. |
94 | From Surrogacy to Adoption; From Bitcoin to Cryptocurrency: Debate Topic Expansion | Roy Bar-Haim, Dalia Krieger, Orith Toledo-Ronen, Lilach Edelstein, Yonatan Bilu, Alon Halfon, Yoav Katz, Amir Menczel, Ranit Aharonov, Noam Slonim, | We present algorithms for finding both consistent and contrastive expansions and demonstrate their effectiveness empirically. |
95 | Multimodal and Multi-view Models for Emotion Recognition | Gustavo Aguilar, Viktor Rozgic, Weiran Wang, Chao Wang, | To address this challenge, we study the problem of efficiently combining acoustic and lexical modalities during training while still providing a deployable acoustic model that does not require lexical inputs. |
96 | Emotion-Cause Pair Extraction: A New Task to Emotion Analysis in Texts | Rui Xia, Zixiang Ding, | In this work, we propose a new task: emotion-cause pair extraction (ECPE), which aims to extract the potential pairs of emotions and corresponding causes in a document. |
97 | Argument Invention from First Principles | Yonatan Bilu, Ariel Gera, Daniel Hershcovich, Benjamin Sznajder, Dan Lahav, Guy Moshkowich, Anael Malet, Assaf Gavron, Noam Slonim, | In this work we aim to explicitly define a taxonomy of such principled recurring arguments, and, given a controversial topic, to automatically identify which of these arguments are relevant to the topic. |
98 | Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization | Sangwoo Cho, Logan Lebanoff, Hassan Foroosh, Fei Liu, | In this paper we seek to strengthen a DPP-based method for extractive multi-document summarization by presenting a novel similarity measure inspired by capsule networks. |
99 | Global Optimization under Length Constraint for Neural Text Summarization | Takuya Makino, Tomoya Iwakura, Hiroya Takamura, Manabu Okumura, | We propose a global optimization method under length constraint (GOLC) for neural text summarization models. |
100 | Searching for Effective Neural Extractive Summarization: What Works and What’s Next | Ming Zhong, Pengfei Liu, Danqing Wang, Xipeng Qiu, Xuanjing Huang, | In this paper, we seek to better understand how neural extractive summarization systems could benefit from different types of model architectures, transferable knowledge and learning schemas. |
101 | A Simple Theoretical Model of Importance for Summarization | Maxime Peyrard, | To this end, we propose simple but rigorous definitions of several concepts that were previously used only intuitively in summarization: Redundancy, Relevance, and Informativeness. |
102 | Multi-News: A Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model | Alexander Fabbri, Irene Li, Tianwei She, Suyi Li, Dragomir Radev, | In this paper, we introduce Multi-News, the first large-scale MDS news dataset. |
103 | Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency | Shuhuai Ren, Yihe Deng, Kun He, Wanxiang Che, | Based on the synonyms substitution strategy, we introduce a new word replacement order determined by both the word saliency and the classification probability, and propose a greedy algorithm called probability weighted word saliency (PWWS) for text adversarial attack. |
104 | Heuristic Authorship Obfuscation | Janek Bevendorff, Martin Potthast, Matthias Hagen, Benno Stein, | We deal with the adversary task, called authorship obfuscation: preventing verification by altering a to-be-obfuscated text. |
105 | Text Categorization by Learning Predominant Sense of Words as Auxiliary Task | Kazuya Shimura, Jiyi Li, Fumiyo Fukumoto, | This paper follows the assumption and presents a method for text categorization by leveraging the predominant sense of words depending on the domain, i.e., domain-specific senses. |
106 | DeepSentiPeer: Harnessing Sentiment in Review Texts to Recommend Peer Review Decisions | Tirthankar Ghosal, Rajeev Verma, Asif Ekbal, Pushpak Bhattacharyya, | Here in this work, we investigate the role of reviewer sentiment embedded within peer review texts to predict the peer review outcome. |
107 | Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion | Suyoun Kim, Siddharth Dalmia, Florian Metze, | We present a novel conversational-context aware end-to-end speech recognizer based on a gated neural network that incorporates conversational-context/word/speech embeddings. |
108 | Figurative Usage Detection of Symptom Words to Improve Personal Health Mention Detection | Adith Iyer, Aditya Joshi, Sarvnaz Karimi, Ross Sparks, Cecile Paris, | To do so, we present two methods: a pipeline-based approach and a feature augmentation-based approach. |
109 | Complex Word Identification as a Sequence Labelling Task | Sian Gooding, Ekaterina Kochmar, | In this paper, we present a novel approach to CWI based on sequence modelling. |
110 | Neural News Recommendation with Topic-Aware News Representation | Chuhan Wu, Fangzhao Wu, Mingxiao An, Yongfeng Huang, Xing Xie, | In this paper, we propose a neural news recommendation approach with topic-aware news representations. |
111 | Poetry to Prose Conversion in Sanskrit as a Linearisation Task: A Case for Low-Resource Languages | Amrith Krishna, Vishnu Sharma, Bishal Santra, Aishik Chakraborty, Pavankumar Satuluri, Pawan Goyal, | Kāvya guru, the approach we propose, essentially consists of a pipeline of two pretraining steps followed by a seq2seq model. |
112 | Learning Emphasis Selection for Written Text in Visual Media from Crowd-Sourced Label Distributions | Amirreza Shirani, Franck Dernoncourt, Paul Asente, Nedim Lipka, Seokhwan Kim, Jose Echevarria, Thamar Solorio, | We propose a model that employs end-to-end label distribution learning (LDL) on crowd-sourced data and predicts a selection distribution, capturing the inter-subjectivity (common-sense) in the audience as well as the ambiguity of the input. |
113 | Rumor Detection by Exploiting User Credibility Information, Attention and Multi-task Learning | Quanzhi Li, Qiong Zhang, Luo Si, | In this study, we propose a new multi-task learning approach for rumor detection and stance classification tasks. |
114 | Context-specific Language Modeling for Human Trafficking Detection from Online Advertisements | Saeideh Shahrokh Esfahani, Michael J. Cafarella, Maziyar Baran Pouyan, Gregory DeAngelo, Elena Eneva, Andy E. Fano, | Here, we present an approach using natural language processing to identify trafficking ads on these websites. |
115 | Self-Attentional Models for Lattice Inputs | Matthias Sperber, Graham Neubig, Ngoc-Quan Pham, Alex Waibel, | To extend such models to handle lattices, we introduce probabilistic reachability masks that incorporate lattice structure into the model and support lattice scores if available. |
116 | When a Good Translation is Wrong in Context: Context-Aware Machine Translation Improves on Deixis, Ellipsis, and Lexical Cohesion | Elena Voita, Rico Sennrich, Ivan Titov, | We create test sets targeting these phenomena, and introduce a model that is suitable for this scenario and demonstrates major gains over a context-agnostic baseline on our new benchmarks without sacrificing performance as measured with BLEU. |
117 | A Compact and Language-Sensitive Multilingual Translation Method | Yining Wang, Long Zhou, Jiajun Zhang, Feifei Zhai, Jingfang Xu, Chengqing Zong, | In this paper, we propose a compact and language-sensitive method for multilingual translation. |
118 | Unsupervised Parallel Sentence Extraction with Parallel Segment Detection Helps Machine Translation | Viktor Hangya, Alexander Fraser, | We detect continuous parallel segments in sentence pair candidates and rely on them when mining parallel sentences. |
119 | Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation | Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao, | Thus, we propose two methods that train UNMT with UBWE agreement. |
120 | Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies | Yunsu Kim, Yingbo Gao, Hermann Ney, | This paper shows effective techniques to transfer a pretrained NMT model to a new, unrelated language without shared vocabularies. |
121 | Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations | Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O.K. Li, | In this work, we address the degeneracy problem due to capturing spurious correlations by quantitatively analyzing the mutual information between language IDs of the source and decoded sentences. |
122 | Syntactically Supervised Transformers for Faster Neural Machine Translation | Nader Akoury, Kalpesh Krishna, Mohit Iyyer, | In this work, we propose the syntactically supervised Transformer (SynST), which first autoregressively predicts a chunked parse tree before generating all of the target tokens in one shot conditioned on the predicted parse. |
123 | Dynamically Composing Domain-Data Selection with Clean-Data Selection by “Co-Curricular Learning” for Neural Machine Translation | Wei Wang, Isaac Caswell, Ciprian Chelba, | This paper introduces a “co-curricular learning” method to compose dynamic domain-data selection with dynamic clean-data selection, for transfer learning across both capabilities. |
124 | On the Word Alignment from Neural Machine Translation | Xintong Li, Guanlin Li, Lemao Liu, Max Meng, Shuming Shi, | This paper thereby proposes two methods to induce word alignment which are general and agnostic to specific NMT models. |
125 | Imitation Learning for Non-Autoregressive Neural Machine Translation | Bingzhen Wei, Mingxuan Wang, Hao Zhou, Junyang Lin, Xu Sun, | In this paper, we propose an imitation learning framework for non-autoregressive machine translation, which still enjoys the fast translation speed but gives comparable translation performance compared to its auto-regressive counterpart. |
126 | Monotonic Infinite Lookback Attention for Simultaneous Machine Translation | Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, Chung-Cheng Chiu, Semih Yavuz, Ruoming Pang, Wei Li, Colin Raffel, | We present the first simultaneous translation system to learn an adaptive schedule jointly with a neural machine translation (NMT) model that attends over all source tokens read thus far. |
127 | Global Textual Relation Embedding for Relational Understanding | Zhiyu Chen, Hanwen Zha, Honglei Liu, Wenhu Chen, Xifeng Yan, Yu Su, | In this work, we investigate how to learn a general-purpose embedding of textual relations, defined as the shortest dependency path between entities. |
128 | Graph Neural Networks with Generated Parameters for Relation Extraction | Hao Zhu, Yankai Lin, Zhiyuan Liu, Jie Fu, Tat-Seng Chua, Maosong Sun, | In this paper, we propose a novel graph neural network with generated parameters (GP-GNNs). |
129 | Entity-Relation Extraction as Multi-Turn Question Answering | Xiaoya Li, Fan Yin, Zijun Sun, Xiayu Li, Arianna Yuan, Duo Chai, Mingxin Zhou, Jiwei Li, | In this paper, we propose a new paradigm for the task of entity-relation extraction. |
130 | Exploiting Entity BIO Tag Embeddings and Multi-task Learning for Relation Extraction with Imbalanced Data | Wei Ye, Bo Li, Rui Xie, Zhonghao Sheng, Long Chen, Shikun Zhang, | To mitigate this problem, we propose a multi-task architecture which jointly trains a model to perform relation identification with cross-entropy loss and relation classification with ranking loss. |
131 | Joint Type Inference on Entities and Relations via Graph Convolutional Networks | Changzhi Sun, Yeyun Gong, Yuanbin Wu, Ming Gong, Daxin Jiang, Man Lan, Shiliang Sun, Nan Duan, | To tackle the joint type inference task, we propose a novel graph convolutional network (GCN) running on an entity-relation bipartite graph. |
132 | Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers | Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, Saloni Potdar, | In this work, we focus on the task of multiple relation extractions by encoding the paragraph only once. |
133 | Unsupervised Information Extraction: Regularizing Discriminative Approaches with Relation Distribution Losses | Étienne Simon, Vincent Guigue, Benjamin Piwowarski, | To overcome this limitation, we introduce a skewness loss which encourages the classifier to predict a relation with confidence given a sentence, and a distribution distance loss enforcing that all relations are predicted on average. |
134 | Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction | Christoph Alt, Marc Hübner, Leonhard Hennig, | To address this gap, we utilize a pre-trained language model, the OpenAI Generative Pre-trained Transformer (GPT) (Radford et al., 2018). |
135 | ARNOR: Attention Regularization based Noise Reduction for Distant Supervision Relation Classification | Wei Jia, Dai Dai, Xinyan Xiao, Hua Wu, | In this paper, we propose ARNOR, a novel Attention Regularization based NOise Reduction framework for distant supervision relation classification. |
136 | GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction | Tsu-Jui Fu, Peng-Hsuan Li, Wei-Yun Ma, | In this paper, we present GraphRel, an end-to-end relation extraction model which uses graph convolutional networks (GCNs) to jointly learn named entities and relations. |
137 | DIAG-NRE: A Neural Pattern Diagnosis Framework for Distantly Supervised Neural Relation Extraction | Shun Zheng, Xu Han, Yankai Lin, Peilin Yu, Lu Chen, Ling Huang, Zhiyuan Liu, Wei Xu, | To ease the labor-intensive workload of pattern writing and enable the quick generalization to new relation types, we propose a neural pattern diagnosis framework, DIAG-NRE, that can automatically summarize and refine high-quality relational patterns from noise data with human experts in the loop. |
138 | Multi-grained Named Entity Recognition | Congying Xia, Chenwei Zhang, Tao Yang, Yaliang Li, Nan Du, Xian Wu, Wei Fan, Fenglong Ma, Philip Yu, | This paper presents a novel framework, MGNER, for Multi-Grained Named Entity Recognition where multiple entities or entity mentions in a sentence could be non-overlapping or totally nested. |
139 | ERNIE: Enhanced Language Representation with Informative Entities | Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, Qun Liu, | In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. |
140 | Multi-Channel Graph Neural Network for Entity Alignment | Yixin Cao, Zhiyuan Liu, Chengjiang Li, Zhiyuan Liu, Juanzi Li, Tat-Seng Chua, | In this paper, we propose a novel Multi-channel Graph Neural Network model (MuGNN) to learn alignment-oriented knowledge graph (KG) embeddings by robustly encoding two KGs via multiple channels. |
141 | A Neural Multi-digraph Model for Chinese NER with Gazetteers | Ruixue Ding, Pengjun Xie, Xiaoyan Zhang, Wei Lu, Linlin Li, Luo Si, | To automatically learn how to incorporate multiple gazetteers into an NER system, we propose a novel approach based on graph neural networks with a multi-digraph structure that captures the information that the gazetteers offer. |
142 | Improved Language Modeling by Decoding the Past | Siddhartha Brahma, | We propose a new regularization method based on decoding the last token in the context using the predicted distribution of the next token. |
143 | Training Hybrid Language Models by Marginalizing over Segmentations | Edouard Grave, Sainbayar Sukhbaatar, Piotr Bojanowski, Armand Joulin, | In this paper, we study the problem of hybrid language modeling, that is using models which can predict both characters and larger units such as character ngrams or words. |
144 | Improving Neural Language Models by Segmenting, Attending, and Predicting the Future | Hongyin Luo, Lan Jiang, Yonatan Belinkov, James Glass, | In this work, we propose a method that improves language modeling by learning to align the given context and the following phrase. |
145 | Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks | Yi Tay, Aston Zhang, Anh Tuan Luu, Jinfeng Rao, Shuai Zhang, Shuohang Wang, Jie Fu, Siu Cheung Hui, | This paper proposes a series of lightweight and memory efficient neural architectures for a potpourri of natural language processing (NLP) tasks. |
146 | Sparse Sequence-to-Sequence Models | Ben Peters, Vlad Niculae, André F. T. Martins, | In this paper, we propose sparse sequence-to-sequence models, rooted in a new family of α-entmax transformations, which includes softmax and sparsemax as particular cases, and is sparse for any α > 1. |
147 | On the Robustness of Self-Attentive Models | Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei Wei, Wen-Lian Hsu, Cho-Jui Hsieh, | Specifically, we investigate the attention and feature extraction mechanisms of state-of-the-art recurrent neural networks and self-attentive architectures for sentiment analysis, entailment and machine translation under adversarial attacks. |
148 | Exact Hard Monotonic Attention for Character-Level Transduction | Shijie Wu, Ryan Cotterell, | In this work, we ask the following question: Is monotonicity really a helpful inductive bias in these tasks? |
149 | A Lightweight Recurrent Network for Sequence Modeling | Biao Zhang, Rico Sennrich, | In this paper, we propose a lightweight recurrent network, or LRN. |
150 | Towards Scalable and Reliable Capsule Networks for Challenging NLP Applications | Wei Zhao, Haiyun Peng, Steffen Eger, Erik Cambria, Min Yang, | In this paper, we introduce: (i) an agreement score to evaluate the performance of routing processes at instance-level; (ii) an adaptive optimizer to enhance the reliability of routing; (iii) capsule compression and partial routing to improve the scalability of capsule networks. |
151 | Soft Representation Learning for Sparse Transfer | Haeju Park, Jinyoung Yeo, Gengyu Wang, Seung-won Hwang, | Our contribution is using adversarial training across tasks to “soft-code” shared and private spaces, preventing the shared space from becoming too sparse. |
152 | Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization | Paul Pu Liang, Zhun Liu, Yao-Hung Hubert Tsai, Qibin Zhao, Ruslan Salakhutdinov, Louis-Philippe Morency, | To address these concerns, we present a regularization method based on tensor rank minimization. |
153 | Towards Lossless Encoding of Sentences | Gabriele Prato, Mathieu Duchesneau, Sarath Chandar, Alain Tapp, | In this work, we propose a near lossless method for encoding long sequences of texts as well as all of their sub-sequences into feature rich representations. |
154 | Open Vocabulary Learning for Neural Chinese Pinyin IME | Zhuosheng Zhang, Yafang Huang, Hai Zhao, | To alleviate such inconveniences, we propose a neural P2C conversion model augmented by an online updated vocabulary with a sampling mechanism to support open vocabulary learning during IME working. |
155 | Using LSTMs to Assess the Obligatoriness of Phonological Distinctive Features for Phonotactic Learning | Nicole Mirea, Klinton Bicknell, | To ascertain the importance of phonetic information in the form of phonological distinctive features for the purpose of segment-level phonotactic acquisition, we compare the performance of two recurrent neural network models of phonotactic learning: one that has access to distinctive features at the start of the learning process, and one that does not. |
156 | Better Character Language Modeling through Morphology | Terra Blevins, Luke Zettlemoyer, | We incorporate morphological supervision into character language models (CLMs) via multitasking and show that this addition improves bits-per-character (BPC) performance across 24 languages, even when the morphology data and language modeling data are disjoint. |
157 | Historical Text Normalization with Delayed Rewards | Simon Flachs, Marcel Bollmann, Anders Søgaard, | Policy gradient training enables direct optimization for exact matches, and while the small datasets in historical text normalization make from-scratch reinforcement learning prohibitive, we show that policy gradient fine-tuning leads to significant improvements across the board. |
158 | Stochastic Tokenization with a Language Model for Neural Text Classification | Tatsuya Hiraoka, Hiroyuki Shindo, Yuji Matsumoto, | In this paper, we propose a method to simultaneously learn tokenization and text classification to address these problems. |
159 | Mitigating Gender Bias in Natural Language Processing: Literature Review | Tony Sun, Andrew Gaut, Shirlyn Tang, Yuxin Huang, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth Belding, Kai-Wei Chang, William Yang Wang, | In this paper, we review contemporary studies on recognizing and mitigating gender bias in NLP. |
160 | Gender-preserving Debiasing for Pre-trained Word Embeddings | Masahiro Kaneko, Danushka Bollegala, | Taking gender-bias as a working example, we propose a debiasing method that preserves non-discriminative gender-related information, while removing stereotypical discriminative gender biases from pre-trained word embeddings. |
161 | Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology | Ran Zmigrod, Sebastian J. Mielke, Hanna Wallach, Ryan Cotterell, | We present a novel approach for converting between masculine-inflected and feminine-inflected sentences in such languages. |
162 | A Transparent Framework for Evaluating Unintended Demographic Bias in Word Embeddings | Chris Sweeney, Maryam Najafian, | In this work, we present a transparent framework and metric for evaluating discrimination across protected groups with respect to their word embedding bias. |
163 | The Risk of Racial Bias in Hate Speech Detection | Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, Noah A. Smith, | We investigate how annotators’ insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations. |
164 | Evaluating Gender Bias in Machine Translation | Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer, | We present the first challenge set and evaluation protocol for the analysis of gender bias in machine translation (MT). |
165 | LSTMEmbed: Learning Word and Sense Representations from a Large Semantically Annotated Corpus with Long Short-Term Memories | Ignacio Iacobacci, Roberto Navigli, | In this paper we explore the capabilities of a bidirectional LSTM model to learn representations of word senses from semantically annotated corpora. |
166 | Understanding Undesirable Word Embedding Associations | Kawin Ethayarajh, David Duvenaud, Graeme Hirst, | We show that for any embedding model that implicitly does matrix factorization, debiasing vectors post hoc using subspace projection (Bolukbasi et al., 2016) is, under certain conditions, equivalent to training on an unbiased corpus. |
167 | Unsupervised Discovery of Gendered Language through Latent-Variable Modeling | Alexander Miserlis Hoyle, Lawrence Wolf-Sonkin, Hanna Wallach, Isabelle Augenstein, Ryan Cotterell, | To that end, we introduce a generative latent-variable model that jointly represents adjective (or verb) choice, with its sentiment, given the natural gender of a head (or dependent) noun. |
168 | Topic Sensitive Attention on Generic Corpora Corrects Sense Bias in Pretrained Embeddings | Vihari Piratla, Sunita Sarawagi, Soumen Chakrabarti, | Given a small corpus D_T pertaining to a limited set of focused topics, our goal is to train embeddings that accurately capture the sense of words in the topic in spite of the limited size of D_T. |
169 | SphereRE: Distinguishing Lexical Relations with Hyperspherical Relation Embeddings | Chengyu Wang, Xiaofeng He, Aoying Zhou, | In this work, we present a neural representation learning model to distinguish lexical relations among term pairs based on Hyperspherical Relation Embeddings (SphereRE). |
170 | Multilingual Factor Analysis | Francisco Vargas, Kamen Brestnichki, Alex Papadopoulos Korfiatis, Nils Hammerla, | In this work we approach the task of learning multilingual word representations in an offline manner by fitting a generative latent variable model to a multilingual dictionary. |
171 | Meaning to Form: Measuring Systematicity as Information | Tiago Pimentel, Arya D. McCarthy, Damian Blasi, Brian Roark, Ryan Cotterell, | In this work, we offer a holistic quantification of the systematicity of the sign using mutual information and recurrent neural networks. |
172 | Learning Morphosyntactic Analyzers from the Bible via Iterative Annotation Projection across 26 Languages | Garrett Nicolai, David Yarowsky, | In this paper, we address both issues simultaneously: leveraging the high accuracy of English taggers and parsers, we project morphological information onto translations of the Bible in 26 varied test languages. |
173 | Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling | Nasser Zalmout, Nizar Habash, | In this paper we explore the use of multitask learning and adversarial training to address morphological richness and dialectal variations in the context of full morphological tagging. |
174 | Neural Machine Translation with Reordering Embeddings | Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, | In this paper, we propose a reordering mechanism to learn the reordering embedding of a word based on its contextual information. |
175 | Neural Fuzzy Repair: Integrating Fuzzy Matches into Neural Machine Translation | Bram Bulte, Arda Tezcan, | We present a simple yet powerful data augmentation method for boosting Neural Machine Translation (NMT) performance by leveraging information retrieved from a Translation Memory (TM). |
176 | Learning Deep Transformer Models for Machine Translation | Qiang Wang, Bei Li, Tong Xiao, Jingbo Zhu, Changliang Li, Derek F. Wong, Lidia S. Chao, | Two strands of research are promising to improve models of this kind: the first uses wide networks (a.k.a. Transformer-Big) and has been the de facto standard for development of the Transformer system, and the other uses deeper language representation but faces the difficulty arising from learning deep networks. Here, we continue the line of research on the latter. |
177 | Generating Diverse Translations with Sentence Codes | Raphael Shu, Hideki Nakayama, Kyunghyun Cho, | In this work, we attempt to obtain diverse translations by using sentence codes to condition the sentence generation. |
178 | Self-Supervised Neural Machine Translation | Dana Ruiter, Cristina España-Bonet, Josef van Genabith, | We present a simple new method where an emergent NMT system is used for simultaneously selecting training data and learning internal NMT representations. |
179 | Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation | Elizabeth Salesky, Matthias Sperber, Alan W Black, | We show that a naive method to create compressed phoneme-like speech representations is far more effective and efficient for translation than traditional frame-level speech features. |
180 | Visually Grounded Neural Syntax Acquisition | Haoyue Shi, Jiayuan Mao, Kevin Gimpel, Karen Livescu, | We present the Visually Grounded Neural Syntax Learner (VG-NSL), an approach for learning syntactic representations and structures without any explicit supervision. |
181 | Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation | Vihan Jain, Gabriel Magalhaes, Alexander Ku, Ashish Vaswani, Eugene Ie, Jason Baldridge, | Here, we highlight shortcomings of current metrics for the Room-to-Room dataset (Anderson et al., 2018b) and propose a new metric, Coverage weighted by Length Score (CLS). |
182 | Expressing Visual Relationships via Language | Hao Tan, Franck Dernoncourt, Zhe Lin, Trung Bui, Mohit Bansal, | To push forward the research in this direction, we first introduce a new language-guided image editing dataset that contains a large number of real image pairs with corresponding editing instructions. We then propose a new relational speaker model based on an encoder-decoder architecture with static relational attention and sequential multi-head attention. |
183 | Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video | Zhenfang Chen, Lin Ma, Wenhan Luo, Kwan-Yee Kenneth Wong, | In this paper, we address a novel task, namely weakly-supervised spatio-temporally grounding natural sentence in video. |
184 | The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue | Janosch Haber, Tim Baumgärtner, Ece Takmaz, Lieke Gelderloos, Elia Bruni, Raquel Fernández, | This paper introduces the PhotoBook dataset, a large-scale collection of visually-grounded, task-oriented dialogues in English designed to investigate shared dialogue history accumulating during conversation. |
185 | Continual and Multi-Task Architecture Search | Ramakanth Pasunuru, Mohit Bansal, | In our work, we first introduce a novel continual architecture search (CAS) approach, so as to continually evolve the model parameters during the sequential training of several tasks, without losing performance on previously learned tasks (via block-sparsity and orthogonality constraints), thus enabling life-long learning. Next, we explore a multi-task architecture search (MAS) approach over ENAS for finding a unified, single cell structure that performs well across multiple tasks (via joint controller rewards), and hence allows more generalizable transfer of the cell structure knowledge to an unseen new task. |
186 | Semi-supervised Stochastic Multi-Domain Learning using Variational Inference | Yitong Li, Timothy Baldwin, Trevor Cohn, | In this paper we propose a method to distill the important domain signal as part of a multi-domain learning system, using a latent variable model in which parts of a neural model are stochastically gated based on the inferred domain. |
187 | Boosting Entity Linking Performance by Leveraging Unlabeled Documents | Phong Le, Ivan Titov, | In contrast, we propose an approach which exploits only naturally occurring information: unlabeled documents and Wikipedia. |
188 | Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following | David Gaddy, Dan Klein, | We consider the problem of learning to map from natural language instructions to state transitions (actions) in a data-efficient manner. |
189 | Reinforced Training Data Selection for Domain Adaptation | Miaofeng Liu, Yan Song, Hongbin Zou, Tong Zhang, | To make TDS self-adapted to data and task, and to combine it with model training, in this paper, we propose a reinforcement learning (RL) framework that synchronously searches for training instances relevant to the target domain and learns better representations for them. |
190 | Generating Long and Informative Reviews with Aspect-Aware Coarse-to-Fine Decoding | Junyi Li, Wayne Xin Zhao, Ji-Rong Wen, Yang Song, | In this paper, we propose a novel review generation model by characterizing an elaborately designed aspect-aware coarse-to-fine generation process. |
191 | PaperRobot: Incremental Draft Generation of Scientific Ideas | Qingyun Wang, Lifu Huang, Zhiying Jiang, Kevin Knight, Heng Ji, Mohit Bansal, Yi Luan, | We present a PaperRobot who performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, by combining graph attention and contextual text attention; (3) incrementally writing some key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities to generate a paper abstract, from the abstract to generate conclusion and future work, and finally from future work to generate a title for a follow-on paper. |
192 | Rhetorically Controlled Encoder-Decoder for Modern Chinese Poetry Generation | Zhiqiang Liu, Zuohui Fu, Jie Cao, Gerard de Melo, Yik-Cheung Tam, Cheng Niu, Jie Zhou, | In this paper, we propose a rhetorically controlled encoder-decoder for modern Chinese poetry generation. |
193 | Enhancing Topic-to-Essay Generation with External Commonsense Knowledge | Pengcheng Yang, Lei Li, Fuli Luo, Tianyu Liu, Xu Sun, | Towards filling this gap, we propose to integrate commonsense from the external knowledge base into the generator through dynamic memory mechanism. |
194 | Towards Fine-grained Text Sentiment Transfer | Fuli Luo, Peng Li, Pengcheng Yang, Jie Zhou, Yutong Tan, Baobao Chang, Zhifang Sui, Xu Sun, | In this paper, we focus on the task of fine-grained text sentiment transfer (FGST). |
195 | Data-to-text Generation with Entity Modeling | Ratish Puduppully, Li Dong, Mirella Lapata, | In this work we propose an entity-centric neural architecture for data-to-text generation. |
196 | Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation | Jiangjie Chen, Ao Wang, Haiyun Jiang, Suo Feng, Chenguang Li, Yanghua Xiao, | To solve these problems, we propose a head-modifier template based method to ensure the readability and data fidelity of generated type descriptions. |
197 | Key Fact as Pivot: A Two-Stage Model for Low Resource Table-to-Text Generation | Shuming Ma, Pengcheng Yang, Tianyu Liu, Peng Li, Jie Zhou, Xu Sun, | In this work, we consider the scenario of low resource table-to-text generation, where only limited parallel data is available. |
198 | Unsupervised Neural Text Simplification | Sai Surya, Abhijit Mishra, Anirban Laha, Parag Jain, Karthik Sankaranarayanan, | The paper presents a first attempt towards unsupervised neural text simplification that relies only on unlabeled text corpora. |
199 | Syntax-Infused Variational Autoencoder for Text Generation | Xinyuan Zhang, Yi Yang, Siyang Yuan, Dinghan Shen, Lawrence Carin, | We present a syntax-infused variational autoencoder (SIVAE), that integrates sentences with their syntactic trees to improve the grammar of generated sentences. |
200 | Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models | Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin, | In this paper, we propose to leverage several multi-level structures to learn a VAE model for generating long and coherent text. |
201 | Jointly Learning Semantic Parser and Natural Language Generator via Dual Information Maximization | Hai Ye, Wenjie Li, Lu Wang, | In this paper, we model the duality of these two tasks via a joint learning framework, and demonstrate its effectiveness of boosting the performance on both tasks. |
202 | Learning to Select, Track, and Generate for Data-to-Text | Hayate Iso, Yui Uehara, Tatsuya Ishigaki, Hiroshi Noji, Eiji Aramaki, Ichiro Kobayashi, Yusuke Miyao, Naoaki Okazaki, Hiroya Takamura, | We propose a data-to-text generation model with two modules, one for tracking and the other for text generation. |
203 | Reinforced Dynamic Reasoning for Conversational Question Generation | Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun, | Towards that end, we propose a new approach named Reinforced Dynamic Reasoning network, which is based on the general encoder-decoder framework but incorporates a reasoning procedure in a dynamic manner to better understand what has been asked and what to ask next about the passage. |
204 | TalkSumm: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks | Guy Lev, Michal Shmueli-Scheuer, Jonathan Herzig, Achiya Jerbi, David Konopnicki, | In this paper, we propose a novel method that automatically generates summaries for scientific papers, by utilizing videos of talks at scientific conferences. |
205 | Improving Abstractive Document Summarization with Salient Information Modeling | Yongjian You, Weijia Jia, Tianyi Liu, Wenmian Yang, | To tackle the above difficulties, we propose a Transformer-based encoder-decoder framework with two novel extensions for abstractive document summarization. |
206 | Unsupervised Neural Single-Document Summarization of Reviews via Learning Latent Discourse Structure and its Ranking | Masaru Isonuma, Junichiro Mori, Ichiro Sakata, | This paper focuses on the end-to-end abstractive summarization of a single product review without supervision. |
207 | BiSET: Bi-directional Selective Encoding with Template for Abstractive Summarization | Kai Wang, Xiaojun Quan, Rui Wang, | In this paper, we propose a novel Bi-directional Selective Encoding with Template (BiSET) model, which leverages template discovered from training data to softly select key information from each source article to guide its summarization process. |
208 | Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards | Hou Pong Chan, Wang Chen, Lu Wang, Irwin King, | To address this problem, we propose a reinforcement learning (RL) approach for keyphrase generation, with an adaptive reward function that encourages a model to generate both sufficient and accurate keyphrases. |
209 | Scoring Sentence Singletons and Pairs for Abstractive Summarization | Logan Lebanoff, Kaiqiang Song, Franck Dernoncourt, Doo Soon Kim, Seokhwan Kim, Walter Chang, Fei Liu, | This paper attempts to bridge the gap by ranking sentence singletons and pairs together in a unified space. |
210 | Keep Meeting Summaries on Topic: Abstractive Multi-Modal Meeting Summarization | Manling Li, Lingyu Zhang, Heng Ji, Richard J. Radke, | Specifically, we propose a multi-modal hierarchical attention across three levels: segment, utterance and word. |
211 | Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation | Francine Chen, Yan-Ying Chen, | This paper examines techniques for adapting from a labeled source domain to an unlabeled target domain in the context of an encoder-decoder model for text generation. |
212 | BIGPATENT: A Large-Scale Dataset for Abstractive and Coherent Summarization | Eva Sharma, Chen Li, Lu Wang, | In this work, we present a novel dataset, BIGPATENT, consisting of 1.3 million records of U.S. patent documents along with human written abstractive summaries. |
213 | Ranking Generated Summaries by Correctness: An Interesting but Challenging Application for Natural Language Inference | Tobias Falke, Leonardo F. R. Ribeiro, Prasetya Ajie Utama, Ido Dagan, Iryna Gurevych, | In this paper, we evaluate summaries produced by state-of-the-art models via crowdsourcing and show that such errors occur frequently, in particular with more abstractive models. |
214 | Self-Supervised Learning for Contextualized Extractive Summarization | Hong Wang, Xin Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, William Yang Wang, | In this paper, we aim to improve this task by introducing three auxiliary pre-training tasks that learn to capture the document-level context in a self-supervised fashion. |
215 | On the Summarization of Consumer Health Questions | Asma Ben Abacha, Dina Demner-Fushman, | In this paper, we study neural abstractive models for medical question summarization. |
216 | Unsupervised Rewriter for Multi-Sentence Compression | Yang Zhao, Xiaoyu Shen, Wei Bi, Akiko Aizawa, | To tackle the above-mentioned issues, we present a neural rewriter for multi-sentence compression that does not need any parallel corpus. |
217 | Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text | Jianxing Yu, Zhengjun Zha, Jian Yin, | This paper focuses on the topic of inferential machine comprehension, which aims to fully understand the meanings of given text to answer generic questions, especially the ones that need reasoning skills. |
218 | Token-level Dynamic Self-Attention Network for Multi-Passage Reading Comprehension | Yimeng Zhuang, Huadong Wang, | In this paper, we introduce the Dynamic Self-attention Network (DynSAN) for multi-passage reading comprehension task, which processes cross-passage information at token-level and meanwhile avoids substantial computational costs. |
219 | Explicit Utilization of General Knowledge in Machine Reading Comprehension | Chao Wang, Hui Jiang, | To bridge the gap between Machine Reading Comprehension (MRC) models and human beings, which is mainly reflected in the hunger for data and the robustness to noise, in this paper, we explore how to integrate the neural networks of MRC models with the general knowledge of human beings. |
220 | Multi-style Generative Reading Comprehension | Kyosuke Nishida, Itsumi Saito, Kosuke Nishida, Kazutoshi Shinoda, Atsushi Otsuka, Hisako Asano, Junji Tomita, | We propose a multi-style abstractive summarization model for question answering, called Masque. |
221 | Retrieve, Read, Rerank: Towards End-to-End Multi-Document Reading Comprehension | Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li, | In this work, we present RE³QA, a unified question answering model that combines context retrieving, reading comprehension, and answer reranking to predict the final answer. |
222 | Multi-Hop Paragraph Retrieval for Open-Domain Question Answering | Yair Feldman, Ran El-Yaniv, | We present a method for retrieving multiple supporting paragraphs, nested amidst a large knowledge base, which contain the necessary evidence to answer a given question. |
223 | E3: Entailment-driven Extracting and Editing for Conversational Machine Reading | Victor Zhong, Luke Zettlemoyer, | We present a new conversational machine reading model that jointly extracts a set of decision rules from the procedural text while reasoning about which are entailed by the conversational history and which still need to be edited to create questions for the user. |
224 | Generating Question-Answer Hierarchies | Kalpesh Krishna, Mohit Iyyer, | In this paper, we present SQUASH (Specificity-controlled Question-Answer Hierarchies), a novel and challenging text generation task that converts an input document into a hierarchy of question-answer pairs. |
225 | Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction | Kosuke Nishida, Kyosuke Nishida, Masaaki Nagata, Atsushi Otsuka, Itsumi Saito, Hisako Asano, Junji Tomita, | This study focuses on the task of explainable multi-hop QA, which requires the system to return the answer with evidence sentences by reasoning and gathering disjoint pieces of the reference texts. |
226 | Enhancing Pre-Trained Language Representations with Rich Knowledge for Machine Reading Comprehension | An Yang, Quan Wang, Jing Liu, Kai Liu, Yajuan Lyu, Hua Wu, Qiaoqiao She, Sujian Li, | In this work, we investigate the potential of leveraging external knowledge bases (KBs) to further improve BERT for MRC. |
227 | XQA: A Cross-lingual Open-domain Question Answering Dataset | Jiahua Liu, Yankai Lin, Zhiyuan Liu, Maosong Sun, | In this paper, we construct a novel dataset XQA for cross-lingual OpenQA research. |
228 | Compound Probabilistic Context-Free Grammars for Grammar Induction | Yoon Kim, Chris Dyer, Alexander Rush, | We study a formalization of the grammar induction problem that models sentences as being generated by a compound probabilistic context free grammar. |
229 | Semi-supervised Domain Adaptation for Dependency Parsing | Zhenghua Li, Xue Peng, Min Zhang, Rui Wang, Luo Si, | We propose a simple domain embedding approach to merge the source- and target-domain training data, which is shown to be more effective than both direct corpus concatenation and multi-task learning. |
230 | Head-Driven Phrase Structure Grammar Parsing on Penn Treebank | Junru Zhou, Hai Zhao, | This paper makes the first attempt to formulate a simplified HPSG by integrating constituent and dependency formal representations into head-driven phrase structure. |
231 | Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning | Minlong Peng, Xiaoyu Xing, Qi Zhang, Jinlan Fu, Xuanjing Huang, | In this work, we explore the way to perform named entity recognition (NER) using only unlabeled data and named entity dictionaries. |
232 | Multi-Task Semantic Dependency Parsing with Policy Gradient for Learning Easy-First Strategies | Shuhei Kurita, Anders Søgaard, | We propose a new iterative predicate selection (IPS) algorithm for SDP. |
233 | GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling | Yijin Liu, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie Zhou, | In this paper, we try to address these issues, and thus propose a Global Context enhanced Deep Transition architecture for sequence labeling named GCDT. |
234 | Unsupervised Learning of PCFGs with Normalizing Flow | Lifeng Jin, Finale Doshi-Velez, Timothy Miller, Lane Schwartz, William Schuler, | This paper describes a neural PCFG inducer which employs context embeddings (Peters et al., 2018) in a normalizing flow model (Dinh et al., 2015) to extend PCFG induction to use semantic and morphological information. |
235 | Variance of Average Surprisal: A Better Predictor for Quality of Grammar from Unsupervised PCFG Induction | Lifeng Jin, William Schuler, | In order to find a better indicator for quality of induced grammars, this paper correlates several linguistically- and psycholinguistically-motivated predictors to parsing accuracy on a large multilingual grammar induction evaluation data set. |
236 | Cross-Domain NER using Cross-Domain Language Modeling | Chen Jia, Xiaobo Liang, Yue Zhang, | To address this issue, we consider using cross-domain LM as a bridge across domains for NER domain adaptation, performing cross-domain and cross-task knowledge transfer by designing a novel parameter generation network. |
237 | Graph-based Dependency Parsing with Graph Neural Networks | Tao Ji, Yuanbin Wu, Man Lan, | We investigate the problem of efficiently incorporating high-order features into neural graph-based dependency parsing. |
238 | Wide-Coverage Neural A* Parsing for Minimalist Grammars | John Torr, Milos Stanojevic, Mark Steedman, Shay B. Cohen, | This paper presents the first ever application of this formalism to the task of realistic wide-coverage parsing. |
239 | Multi-Modal Sarcasm Detection in Twitter with Hierarchical Fusion Model | Yitao Cai, Huiyu Cai, Xiaojun Wan, | In this paper, we focus on multi-modal sarcasm detection for tweets consisting of texts and images in Twitter. We create a multi-modal sarcasm detection dataset based on Twitter. |
240 | Topic-Aware Neural Keyphrase Generation for Social Media Language | Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, Shuming Shi, | To facilitate automatic language understanding, we study keyphrase prediction, distilling salient information from massive posts. |
241 | #YouToo? Detection of Personal Recollections of Sexual Harassment on Social Media | Arijit Ghosh Chowdhury, Ramit Sawhney, Rajiv Ratn Shah, Debanjan Mahata, | This work attempts to aggregate such experiences of sexual abuse to facilitate a better understanding of social media constructs and to bring about social change. |
242 | Multi-task Pairwise Neural Ranking for Hashtag Segmentation | Mounica Maddela, Wei Xu, Daniel Preoţiuc-Pietro, | We build a dataset of 12,594 hashtags split into individual segments and propose a set of approaches for hashtag segmentation by framing it as a pairwise ranking problem between candidate segmentations. |
243 | Entity-Centric Contextual Affective Analysis | Anjalie Field, Yulia Tsvetkov, | We show how contextualized word embeddings can be used to capture affect dimensions in portrayals of people. |
244 | Sentence-Level Evidence Embedding for Claim Verification with Hierarchical Attention Networks | Jing Ma, Wei Gao, Shafiq Joty, Kam-Fai Wong, | In this paper, we propose a novel end-to-end hierarchical attention network focusing on learning to represent coherent evidence as well as their semantic relatedness with the claim. |
245 | Predicting Human Activities from User-Generated Content | Steven Wilson, Rada Mihalcea, | In this paper, we explore the task of predicting human activities from user-generated content. We collect a dataset containing instances of social media users writing about a range of everyday activities. |
246 | You Write like You Eat: Stylistic Variation as a Predictor of Social Stratification | Angelo Basile, Albert Gatt, Malvina Nissim, | Inspired by Labov’s seminal work on stylistic variation as a function of social stratification, we develop and compare neural models that predict a person’s presumed socio-economic status, obtained through distant supervision, from their writing style on social media. |
247 | Encoding Social Information with Graph Convolutional Networks for Political Perspective Detection in News Media | Chang Li, Dan Goldwasser, | In this paper, we highlight the importance of contextualizing social information, capturing how this information is disseminated in social networks. |
248 | Fine-Grained Spoiler Detection from Large-Scale Review Corpora | Mengting Wan, Rishabh Misra, Ndapa Nakashole, Julian McAuley, | This paper presents computational approaches for automatically detecting critical plot twists in reviews of media products. First, we created a large-scale book review dataset that includes fine-grained spoiler annotations at the sentence-level, as well as book and (anonymized) user information. |
249 | Celebrity Profiling | Matti Wiegmann, Benno Stein, Martin Potthast, | With this paper we introduce the Webis Celebrity Corpus 2019. |
250 | Dataset Creation for Ranking Constructive News Comments | Soichiro Fujita, Hayato Kobayashi, Manabu Okumura, | In this paper, we address directly evaluating the quality of comments on the basis of “constructiveness,” separately from user feedback. To this end, we create a new dataset including 100K+ Japanese comments with constructiveness scores (C-scores). |
251 | Enhancing Air Quality Prediction with Social Media and Natural Language Processing | Jyun-Yu Jiang, Xue Sun, Wei Wang, Sean Young, | In this paper, we propose to exploit social media and natural language processing techniques to enhance air quality prediction. |
252 | Twitter Homophily: Network Based Prediction of User’s Occupation | Jiaqi Pan, Rishabh Bhardwaj, Wei Lu, Hai Leong Chieu, Xinghao Pan, Ni Yi Puay, | In this paper, we investigate the importance of social network information compared to content information in the prediction of a Twitter user’s occupational class. |
253 | Domain Adaptive Dialog Generation via Meta Learning | Kun Qian, Zhou Yu, | We propose a domain adaptive dialog generation method based on meta-learning (DAML). |
254 | Strategies for Structuring Story Generation | Angela Fan, Mike Lewis, Yann Dauphin, | We explore coarse-to-fine models for creating narrative texts of several hundred words, and introduce new models which decompose stories by abstracting over actions and entities. |
255 | Argument Generation with Retrieval, Planning, and Realization | Xinyu Hua, Zhe Hu, Lu Wang, | In this paper, we study the specific problem of counter-argument generation, and present a novel framework, CANDELA. |
256 | A Simple Recipe towards Reducing Hallucination in Neural Surface Realisation | Feng Nie, Jin-Ge Yao, Jinpeng Wang, Rong Pan, Chin-Yew Lin, | To mitigate this issue, we propose to integrate a language understanding module for data refinement with self-training iterations to effectively induce strong equivalence between the input data and the paired text. |
257 | Cross-Modal Commentator: Automatic Machine Commenting Based on Cross-Modal Information | Pengcheng Yang, Zhihan Zhang, Fuli Luo, Lei Li, Chengyang Huang, Xu Sun, | To remedy this, we propose a new task: cross-modal automatic commenting (CMAC), which aims to make comments by integrating multiple modal contents. |
258 | A Working Memory Model for Task-oriented Dialog Response Generation | Xiuyi Chen, Jiaming Xu, Bo Xu, | Inspired by the psychological studies on working memory, we propose a working memory model (WMM2Seq) for dialog response generation. |
259 | Cognitive Graph for Multi-Hop Reading Comprehension at Scale | Ming Ding, Chang Zhou, Qibin Chen, Hongxia Yang, Jie Tang, | We propose a new CogQA framework for multi-hop reading comprehension question answering in web-scale documents. |
260 | Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs | Ming Tu, Guangtao Wang, Jing Huang, Yun Tang, Xiaodong He, Bowen Zhou, | In this paper, we propose a new model to tackle the multi-hop RC problem. |
261 | Explore, Propose, and Assemble: An Interpretable Model for Multi-Hop Reading Comprehension | Yichen Jiang, Nitish Joshi, Yen-Chun Chen, Mohit Bansal, | To achieve this, we propose an interpretable 3-module system called Explore-Propose-Assemble reader (EPAr). |
262 | Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA | Yichen Jiang, Mohit Bansal, | In this paper, we show that in the multi-hop HotpotQA (Yang et al., 2018) dataset, the examples often contain reasoning shortcuts through which models can directly locate the answer by word-matching the question with a sentence in the context. |
263 | Exploiting Explicit Paths for Multi-hop Reading Comprehension | Souvik Kundu, Tushar Khot, Ashish Sabharwal, Peter Clark, | We propose a novel, path-based reasoning approach for the multi-hop reading comprehension task where a system needs to combine facts from multiple passages to answer a question. |
264 | Sentence Mover’s Similarity: Automatic Evaluation for Multi-Sentence Texts | Elizabeth Clark, Asli Celikyilmaz, Noah A. Smith, | We introduce methods based on sentence mover’s similarity; our automatic metrics evaluate text in a continuous space using word and sentence embeddings. |
265 | Analysis of Automatic Annotation Suggestions for Hard Discourse-Level Tasks in Expert Domains | Claudia Schulz, Christian M. Meyer, Jan Kiesewetter, Michael Sailer, Elisabeth Bauer, Martin R. Fischer, Frank Fischer, Iryna Gurevych, | To speed up and ease annotations, we investigate the viability of automatically generated annotation suggestions for such tasks. |
266 | Deep Dominance – How to Properly Compare Deep Neural Models | Rotem Dror, Segev Shlomov, Roi Reichart, | In this paper, we propose to adapt to this problem a recently proposed test for the Almost Stochastic Dominance relation between two distributions. |
267 | We Need to Talk about Standard Splits | Kyle Gorman, Steven Bedrick, | We argue that randomly generated splits should be used in system evaluation. |
268 | Aiming beyond the Obvious: Identifying Non-Obvious Cases in Semantic Similarity Datasets | Nicole Peinelt, Maria Liakata, Dong Nguyen, | This paper proposes to distinguish obvious from non-obvious text pairs based on superficial lexical overlap and ground-truth labels. |
269 | Putting Evaluation in Context: Contextual Embeddings Improve Machine Translation Evaluation | Nitika Mathur, Timothy Baldwin, Trevor Cohn, | We propose a simple unsupervised metric, and additional supervised metrics which rely on contextual word embeddings to encode the translation and reference sentences. |
270 | Joint Effects of Context and User History for Predicting Online Conversation Re-entries | Xingshan Zeng, Jing Li, Lu Wang, Kam-Fai Wong, | Specifically, we propose a neural framework with three main layers, each modeling context, user history, and interactions between them, to explore how the conversation context and user chatting history jointly result in their re-entry behavior. |
271 | CONAN – COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech | Yi-Ling Chung, Elizaveta Kuzmenko, Serra Sinem Tekiroglu, Marco Guerini, | In this paper, we describe the creation of the first large-scale, multilingual, expert-based dataset of hate-speech/counter-narrative pairs. |
272 | Categorizing and Inferring the Relationship between the Text and Image of Twitter Posts | Alakananda Vempala, Daniel Preoţiuc-Pietro, | We show that by combining the text and image information, we can build a machine learning approach that accurately distinguishes between the relationship types. |
273 | Who Sides with Whom? Towards Computational Construction of Discourse Networks for Political Debates | Sebastian Padó, Andre Blessing, Nico Blokker, Erenay Dayanik, Sebastian Haunss, Jonas Kuhn, | This paper presents three contributions towards this goal: (a) a requirements analysis, linking the task to knowledge base population; (b) an annotated pilot corpus of migration claims based on German newspaper reports; (c) initial modeling results. |
274 | Analyzing Linguistic Differences between Owner and Staff Attributed Tweets | Daniel Preoţiuc-Pietro, Rita Devlin Marier, | In this study, we challenge this assumption and study the linguistic differences between posts signed by the account owner or attributed to their staff. |
275 | Exploring Author Context for Detecting Intended vs Perceived Sarcasm | Silviu Oprea, Walid Magdy, | We define author context as the embedded representation of their historical posts on Twitter and suggest neural models that extract these representations. |
276 | Open Domain Event Extraction Using Neural Latent Variable Models | Xiao Liu, Heyan Huang, Yue Zhang, | We consider open domain event extraction, the task of extracting unconstrained types of events from news clusters. |
277 | Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification | Zhi-Xiu Ye, Zhen-Hua Ling, | This paper presents a multi-level matching and aggregation network (MLMAN) for few-shot relation classification. |
278 | Quantifying Similarity between Relations with Fact Distribution | Weize Chen, Hao Zhu, Xu Han, Zhiyuan Liu, Maosong Sun, | We introduce a conceptually simple and effective method to quantify the similarity between relations in knowledge bases. |
279 | Matching the Blanks: Distributional Similarity for Relation Learning | Livio Baldini Soares, Nicholas FitzGerald, Jeffrey Ling, Tom Kwiatkowski, | In this paper, we build on extensions of Harris’ distributional hypothesis to relations, as well as recent advances in learning text representations (specifically, BERT), to build task agnostic relation representations solely from entity-linked text. |
280 | Fine-Grained Temporal Relation Extraction | Siddharth Vashishtha, Benjamin Van Durme, Aaron Steven White, | We present a novel semantic framework for modeling temporal relations and event durations that maps pairs of events to real-valued scales. |
281 | FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms | Henry Moss, Andrew Moore, David Leslie, Paul Rayson, | We present FIESTA, a model selection approach that significantly reduces the computational resources required to reliably identify state-of-the-art performance from large collections of candidate models. |
282 | Is Attention Interpretable? | Sofia Serrano, Noah A. Smith, | We conclude that while attention noisily predicts input components’ overall importance to a model, it is by no means a fail-safe indicator. |
283 | Correlating Neural and Symbolic Representations of Language | Grzegorz Chrupała, Afra Alishahi, | Here we present two methods based on Representational Similarity Analysis (RSA) and Tree Kernels (TK) which allow us to directly quantify how strongly the information encoded in neural activation patterns corresponds to information represented by symbolic structures such as syntax trees. |
284 | Interpretable Neural Predictions with Differentiable Binary Variables | Joost Bastings, Wilker Aziz, Ivan Titov, | We propose a latent model that mixes discrete and continuous behaviour allowing at the same time for binary selections and gradient-based training without REINFORCE. |
285 | Transformer-XL: Attentive Language Models beyond a Fixed-Length Context | Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc Le, Ruslan Salakhutdinov, | We propose a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence. |
286 | Domain Adaptation of Neural Machine Translation by Lexicon Induction | Junjie Hu, Mengzhou Xia, Graham Neubig, Jaime Carbonell, | To remedy this problem, we propose an unsupervised adaptation method which fine-tunes a pre-trained out-of-domain NMT model using a pseudo-in-domain corpus. |
287 | Reference Network for Neural Machine Translation | Han Fu, Chenghao Liu, Jianling Sun, | In this paper, we propose a Reference Network to incorporate referring process into translation decoding of NMT. |
288 | Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation | Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Xilin Chen, Jie Zhou, | In this paper, we propose two approaches to retrieve the target sequential information for NAT to enhance its translation ability while preserving the fast-decoding property. |
289 | STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework | Mingbo Ma, Liang Huang, Hao Xiong, Renjie Zheng, Kaibo Liu, Baigong Zheng, Chuanqiang Zhang, Zhongjun He, Hairong Liu, Xing Li, Hua Wu, Haifeng Wang, | Within this framework, we present a very simple yet surprisingly effective “wait-k” policy trained to generate the target sentence concurrently with the source sentence, but always k words behind. |
290 | Look Harder: A Neural Machine Translation Model with Hard Attention | Sathish Reddy Indurthi, Insoo Chung, Sangha Kim, | In this work, we propose a hard-attention based NMT model which selects a subset of source tokens for each target token to effectively handle long sequence translation. |
291 | Robust Neural Machine Translation with Joint Textual and Phonetic Embedding | Hairong Liu, Mingbo Ma, Liang Huang, Hao Xiong, Zhongjun He, | We propose to improve the robustness of NMT to homophone noises by 1) jointly embedding both textual and phonetic information of source sentences, and 2) augmenting the training dataset with homophone noises. |
292 | A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning | Gonçalo M. Correia, André F. T. Martins, | In this paper, we propose an alternative where we fine-tune pre-trained BERT models on both the encoder and decoder of an APE system, exploring several parameter sharing strategies. |
293 | Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation | Nima Pourdamghani, Nada Aldarrab, Marjan Ghazvininejad, Kevin Knight, Jonathan May, | In this work we explore this intuition by breaking translation into a two-step process: generating a rough gloss by means of a dictionary and then “translating” the resulting pseudo-translation, or “Translationese”, into a fully fluent translation. |
294 | Training Neural Machine Translation to Apply Terminology Constraints | Georgiana Dinu, Prashant Mathur, Marcello Federico, Yaser Al-Onaizan, | This paper proposes a novel method to inject custom terminology into neural machine translation at run time. |
295 | Leveraging Local and Global Patterns for Self-Attention Networks | Mingzhou Xu, Derek F. Wong, Baosong Yang, Yue Zhang, Lidia S. Chao, | To address this argument, we propose a hybrid attention mechanism to dynamically leverage both of the local and global information. |
296 | Sentence-Level Agreement for Neural Machine Translation | Mingming Yang, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Min Zhang, Tiejun Zhao, | In this paper, we propose a sentence-level agreement module to directly minimize the difference between the representation of source and target sentence. |
297 | Multilingual Unsupervised NMT using Shared Encoder and Language-Specific Decoders | Sukanta Sen, Kamal Kumar Gupta, Asif Ekbal, Pushpak Bhattacharyya, | In this paper, we propose a multilingual unsupervised NMT scheme which jointly trains multiple languages with a shared encoder and multiple decoders. |
298 | Lattice-Based Transformer Encoder for Neural Machine Translation | Fengshun Xiao, Jiangtong Li, Hai Zhao, Rui Wang, Kehai Chen, | We propose two methods: 1) lattice positional encoding and 2) lattice-aware self-attention. |
299 | Multi-Source Cross-Lingual Model Transfer: Learning What to Share | Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang, Claire Cardie, | In this work, we focus on the multilingual transfer setting where training data in multiple source languages is leveraged to further boost target language performance. |
300 | Unsupervised Multilingual Word Embedding with Limited Resources using Neural Language Models | Takashi Wada, Tomoharu Iwata, Yuji Matsumoto, | To overcome this problem, we propose a new unsupervised multilingual embedding method that does not rely on such assumption and performs well under resource-poor scenarios, namely when only a small amount of monolingual data (i.e., 50k sentences) are available, or when the domains of monolingual data are different across languages. |
301 | Choosing Transfer Languages for Cross-Lingual Learning | Yu-Hsiang Lin, Chian-Yu Chen, Jean Lee, Zirui Li, Yuyan Zhang, Mengzhou Xia, Shruti Rijhwani, Junxian He, Zhisong Zhang, Xuezhe Ma, Antonios Anastasopoulos, Patrick Littell, Graham Neubig, | In this paper, we consider this task of automatically selecting optimal transfer languages as a ranking problem, and build models that consider the aforementioned features to perform this prediction. |
302 | CogNet: A Large-Scale Cognate Database | Khuyagbaatar Batsuren, Gabor Bella, Fausto Giunchiglia, | This paper introduces CogNet, a new, large-scale lexical database that provides cognates -words of common origin and meaning- across languages. |
303 | Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B | Jiaming Luo, Yuan Cao, Regina Barzilay, | In this paper we propose a novel neural approach for automatic decipherment of lost languages. |
304 | Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network | Kun Xu, Liwei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, Dong Yu, | In this paper, we introduce the topic entity graph, a local sub-graph of an entity, to represent entities with their contextual information in KG. |
305 | Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention | Xiangyu Duan, Mingming Yin, Min Zhang, Boxing Chen, Weihua Luo, | We propose to solve this zero-shot problem by using resource-rich monolingual ASSUM system to teach zero-shot cross-lingual ASSUM system on both summary word generation and attention. |
306 | Improving Low-Resource Cross-lingual Document Retrieval by Reranking with Deep Bilingual Representations | Rui Zhang, Caitlin Westerfield, Sungrok Shim, Garrett Bingham, Alexander Fabbri, William Hu, Neha Verma, Dragomir Radev, | In this paper, we propose to boost low-resource cross-lingual document retrieval performance with deep bilingual query-document representations. |
307 | Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization | Mozhi Zhang, Keyulu Xu, Ken-ichi Kawarabayashi, Stefanie Jegelka, Jordan Boyd-Graber, | For non-isomorphic pairs, our method (Iterative Normalization) transforms monolingual embeddings to make orthogonal alignment easier by simultaneously enforcing that (1) individual word vectors are unit length, and (2) each language’s average vector is zero. |
308 | MAAM: A Morphology-Aware Alignment Model for Unsupervised Bilingual Lexicon Induction | Pengcheng Yang, Fuli Luo, Peng Chen, Tianyu Liu, Xu Sun, | To tackle this challenge, we propose a morphology-aware alignment model for the UBLI task. |
309 | Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings | Mikel Artetxe, Holger Schwenk, | In this paper, we propose a new method for this task based on multilingual sentence embeddings. |
310 | JW300: A Wide-Coverage Parallel Corpus for Low-Resource Languages | Željko Agić, Ivan Vulić, | In this paper, we present the resource and showcase its utility in experiments with cross-lingual word embedding induction and multi-source part-of-speech projection. |
311 | Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections | Junxian He, Zhisong Zhang, Taylor Berg-Kirkpatrick, Graham Neubig, | In this paper, we focus on methods for cross-lingual transfer to distant languages and propose to learn a generative model with a structured prior that utilizes labeled source data and unlabeled target data jointly. |
312 | Unsupervised Joint Training of Bilingual Word Embeddings | Benjamin Marie, Atsushi Fujita, | In this work, we propose a new approach that trains unsupervised BWE jointly on synthetic parallel data generated through unsupervised machine translation. |
313 | Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings | Matthew Le, Stephen Roller, Laetitia Papaxanthos, Douwe Kiela, Maximilian Nickel, | For this purpose, we propose a new method combining hyperbolic embeddings and Hearst patterns. |
314 | Is Word Segmentation Necessary for Deep Learning of Chinese Representations? | Xiaoya Li, Yuxian Meng, Xiaofei Sun, Qinghong Han, Arianna Yuan, Jiwei Li, | In this paper, we ask the fundamental question of whether Chinese word segmentation (CWS) is necessary for deep learning-based Chinese Natural Language Processing. |
315 | Towards Understanding Linear Word Analogies | Kawin Ethayarajh, David Duvenaud, Graeme Hirst, | We provide novel justification for the addition of SGNS word vectors by showing that it automatically down-weights the more frequent word, as weighting schemes do ad hoc. |
316 | On the Compositionality Prediction of Noun Phrases using Poincaré Embeddings | Abhik Jana, Dima Puzyrev, Alexander Panchenko, Pawan Goyal, Chris Biemann, Animesh Mukherjee, | We introduce a novel technique to blend hierarchical information with distributional information for predicting compositionality. |
317 | Robust Representation Learning of Biomedical Names | Minh C. Phan, Aixin Sun, Yi Tay, | This paper proposes a new framework for learning robust representations of biomedical names and terms. |
318 | Relational Word Embeddings | Jose Camacho-Collados, Luis Espinosa Anke, Steven Schockaert, | As an alternative, in this paper we propose to encode relational knowledge in a separate word embedding, which is aimed to be complementary to a given standard word embedding. |
319 | Unraveling Antonym’s Word Vectors through a Siamese-like Network | Mathias Etcheverry, Dina Wonsever, | We present an approach to unravel antonymy and synonymy from word vectors based on a siamese network inspired approach. |
320 | Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks | Shikhar Vashishth, Manik Bhandari, Prateek Yadav, Piyush Rai, Chiranjib Bhattacharyya, Partha Talukdar, | In this paper, we overcome this problem by proposing SynGCN, a flexible Graph Convolution based method for learning word embeddings. |
321 | Word and Document Embedding with vMF-Mixture Priors on Context Word Vectors | Shoaib Jameel, Steven Schockaert, | Our hypothesis in this paper is that embedding models can be improved by explicitly imposing a cluster structure on the set of context word vectors. |
322 | Delta Embedding Learning | Xiao Zhang, Ji Wu, Dejing Dou, | We propose a novel learning technique called Delta Embedding Learning, which can be applied to general NLP tasks to improve performance by optimized tuning of the word embeddings. |
323 | Annotation and Automatic Classification of Aspectual Categories | Markus Egg, Helena Prepens, Will Roberts, | We present the first annotated resource for the aspectual classification of German verb tokens in their clausal context. |
324 | Putting Words in Context: LSTM Language Models and Lexical Ambiguity | Laura Aina, Kristina Gulordava, Gemma Boleda, | We investigate how an LSTM language model deals with lexical ambiguity in English, designing a method to probe its hidden representations for lexical and contextual information about words. |
325 | Making Fast Graph-based Algorithms with Graph Metric Embeddings | Andrey Kutuzov, Mohammad Dorgham, Oleksiy Oliynyk, Chris Biemann, Alexander Panchenko, | We introduce a simple yet efficient and effective approach for learning graph embeddings. |
326 | Embedding Imputation with Grounded Language Information | Ziyi Yang, Chenguang Zhu, Vin Sachidananda, Eric Darve, | In this paper, we propose an approach for embedding imputation which uses grounded information in the form of a knowledge graph. |
327 | The Effectiveness of Simple Hybrid Systems for Hypernym Discovery | William Held, Nizar Habash, | This paper evaluates the contribution of both paradigms to hybrid success by evaluating the benefits of hybrid treatment of baseline models from each paradigm. |
328 | BERT-based Lexical Substitution | Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou, | To address these issues, we propose an end-to-end BERT-based lexical substitution approach which can propose and validate substitute candidates without using any annotated data or manually curated resources. |
329 | Exploring Numeracy in Word Embeddings | Aakanksha Naik, Abhilasha Ravichander, Carolyn Rose, Eduard Hovy, | In this work, we show that existing embedding models are inadequate at constructing representations that capture salient aspects of mathematical meaning for numbers, which is important for language understanding. |
330 | HighRES: Highlight-based Reference-less Evaluation of Summarization | Hardy Hardy, Shashi Narayan, Andreas Vlachos, | To address this issue, we propose a novel approach for manual evaluation, Highlight-based Reference-less Evaluation of Summarization (HighRES), in which summaries are assessed by multiple annotators against the source document via manually highlighted salient content in the latter. |
331 | EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing | Yue Dong, Zichao Li, Mehdi Rezagholizadeh, Jackie Chi Kit Cheung, | We present the first sentence simplification model that learns explicit edit operations (ADD, DELETE, and KEEP) via a neural programmer-interpreter approach. |
332 | Decomposable Neural Paraphrase Generation | Zichao Li, Xin Jiang, Lifeng Shang, Qun Liu, | This paper presents Decomposable Neural Paraphrase Generator (DNPG), a Transformer-based model that can learn and generate paraphrases of a sentence at different levels of granularity in a disentangled way. |
333 | Transforming Complex Sentences into a Semantic Hierarchy | Christina Niklaus, Matthias Cetto, André Freitas, Siegfried Handschuh, | We present an approach for recursively splitting and rephrasing complex English sentences into a novel semantic hierarchy of simplified sentences, with each of them presenting a more regular structure that may facilitate a wide variety of artificial intelligence tasks, such as machine translation (MT) or information extraction (IE). |
334 | Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference | Tom McCoy, Ellie Pavlick, Tal Linzen, | A machine learning system can score well on a given test set by relying on heuristics that are effective for frequent example types but break down in more challenging cases. We study this issue within natural language inference (NLI), the task of determining whether one sentence entails another. |
335 | Zero-Shot Entity Linking by Reading Entity Descriptions | Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Jacob Devlin, Honglak Lee, | We present the zero-shot entity linking task, where mentions must be linked to unseen entities without in-domain labeled data. |
336 | Dual Adversarial Neural Transfer for Low-Resource Named Entity Recognition | Joey Tianyi Zhou, Hao Zhang, Di Jin, Hongyuan Zhu, Meng Fang, Rick Siow Mong Goh, Kenneth Kwok, | We propose a new neural transfer method termed Dual Adversarial Transfer Network (DATNet) for addressing low-resource Named Entity Recognition (NER). |
337 | Scalable Syntax-Aware Language Models Using Knowledge Distillation | Adhiguna Kuncoro, Chris Dyer, Laura Rimell, Stephen Clark, Phil Blunsom, | To answer this question, we introduce an efficient knowledge distillation (KD) technique that transfers knowledge from a syntactic language model trained on a small corpus to an LSTM language model, hence enabling the LSTM to develop a more structurally sensitive representation of the larger training data it learns from. |
338 | An Imitation Learning Approach to Unsupervised Parsing | Bowen Li, Lili Mou, Frank Keller, | In our work, we propose an imitation learning approach to unsupervised parsing, where we transfer the syntactic knowledge induced by PRPN to a Tree-LSTM model with discrete parsing actions. |
339 | Women’s Syntactic Resilience and Men’s Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing | Aparna Garimella, Carmen Banea, Dirk Hovy, Rada Mihalcea, | To address this, we annotate the Wall Street Journal part of the Penn Treebank with the gender information of the articles’ authors, and build taggers and parsers trained on this data that show performance differences in text written by men and women. |
340 | Multilingual Constituency Parsing with Self-Attention and Pre-Training | Nikita Kitaev, Steven Cao, Dan Klein, | We show that constituency parsing benefits from unsupervised pre-training across a variety of languages and a range of pre-training conditions. |
341 | A Multilingual BPE Embedding Space for Universal Sentiment Lexicon Induction | Mengjie Zhao, Hinrich Schütze, | We present a new method for sentiment lexicon induction that is designed to be applicable to the entire range of typological diversity of the world’s languages. |
342 | Tree Communication Models for Sentiment Analysis | Yuan Zhang, Yue Zhang, | In this paper, we propose a tree communication model using graph convolutional neural network and graph recurrent neural network, which allows rich information exchange between phrases in a constituent tree. |
343 | Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text | Bidisha Samanta, Niloy Ganguly, Soumen Chakrabarti, | We present an effective technique for synthesizing labeled code-switched text from labeled monolingual text, which is relatively readily available. |
344 | Exploring Sequence-to-Sequence Learning in Aspect Term Extraction | Dehong Ma, Sujian Li, Fangzhao Wu, Xing Xie, Houfeng Wang, | To tackle these problems, we first explore to formalize ATE as a sequence-to-sequence (Seq2Seq) learning task where the source sequence and target sequence are composed of words and labels respectively. At the same time, to make Seq2Seq learning suit to ATE where labels correspond to words one by one, we design the gated unit networks to incorporate corresponding word representation into the decoder, and position-aware attention to pay more attention to the adjacent words of a target word. |
345 | Aspect Sentiment Classification Towards Question-Answering with Reinforced Bidirectional Attention Network | Jingjing Wang, Changlong Sun, Shoushan Li, Xiaozhong Liu, Luo Si, Min Zhang, Guodong Zhou, | This paper extends the research to interactive reviews and proposes a new research task, namely Aspect Sentiment Classification towards Question-Answering (ASC-QA), for real-world applications. |
346 | ELI5: Long Form Question Answering | Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli, | We introduce the first large-scale corpus for long form question answering, a task requiring elaborate and in-depth answers to open-ended questions. |
347 | Textbook Question Answering with Multi-modal Context Graph Understanding and Self-supervised Open-set Comprehension | Daesik Kim, Seonhoon Kim, Nojun Kwak, | In this work, we introduce a novel algorithm for solving the textbook question answering (TQA) task which describes more realistic QA problems compared to other recent tasks. |
348 | Generating Question Relevant Captions to Aid Visual Question Answering | Jialin Wu, Zeyuan Hu, Raymond Mooney, | We present a novel approach to better VQA performance that exploits this connection by jointly generating captions that are targeted to help answer a specific visual question. |
349 | Multi-grained Attention with Object-level Grounding for Visual Question Answering | Pingping Huang, Jianhui Huang, Yuqing Guo, Min Qiao, Yong Zhu, | To address this problem, this paper proposes a multi-grained attention method. |
350 | Psycholinguistics Meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering | Claudio Greco, Barbara Plank, Raquel Fernández, Raffaella Bernardi, | We study the issue of catastrophic forgetting in the context of neural multimodal approaches to Visual Question Answering (VQA). |
351 | Improving Visual Question Answering by Referring to Generated Paragraph Captions | Hyounghun Kim, Mohit Bansal, | Hence, we propose a combined Visual and Textual Question Answering (VTQA) model which takes as input a paragraph caption as well as the corresponding image, and answers the given question based on both inputs. |
352 | Shared-Private Bilingual Word Embeddings for Neural Machine Translation | Xuebo Liu, Derek F. Wong, Yang Liu, Lidia S. Chao, Tong Xiao, Jingbo Zhu, | In this paper, we propose shared-private bilingual word embeddings, which give a closer relationship between the source and target embeddings, and which also reduce the number of model parameters. |
353 | Literary Event Detection | Matthew Sims, Jong Ho Park, David Bamman, | In this work we present a new dataset of literary events-events that are depicted as taking place within the imagined space of a novel. |
354 | Assessing the Ability of Self-Attention Networks to Learn Word Order | Baosong Yang, Longyue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu, | To this end, we propose a novel word reordering detection task to quantify how well the word order information is learned by SAN and RNN. |
355 | Energy and Policy Considerations for Deep Learning in NLP | Emma Strubell, Ananya Ganesh, Andrew McCallum, | In this paper we bring this issue to the attention of NLP researchers by quantifying the approximate financial and environmental costs of training a variety of recently successful neural network models for NLP. |
356 | What Does BERT Learn about the Structure of Language? | Ganesh Jawahar, Benoît Sagot, Djamé Seddah, | In this work, we provide novel support for this claim by performing a series of experiments to unpack the elements of English language structure learned by BERT. |
357 | A Just and Comprehensive Strategy for Using NLP to Address Online Abuse | David Jurgens, Libby Hemphill, Eshwar Chandrasekharan, | In this position paper, we argue that the community needs to make three substantive changes: (1) expanding our scope of problems to tackle both more subtle and more serious forms of abuse, (2) developing proactive technologies that counter or inhibit abuse before it harms, and (3) reframing our effort within a framework of justice to promote healthy communities. |
358 | Learning from Dialogue after Deployment: Feed Yourself, Chatbot! | Braden Hancock, Antoine Bordes, Pierre-Emmanuel Mazare, Jason Weston, | In this work, we propose the self-feeding chatbot, a dialogue agent with the ability to extract new training examples from the conversations it participates in. |
359 | Generating Responses with a Specific Emotion in Dialog | Zhenqiao Song, Xiaoqing Zheng, Lu Liu, Mu Xu, Xuanjing Huang, | We propose an emotional dialogue system (EmoDS) that can generate the meaningful responses with a coherent structure for a post, and meanwhile express the desired emotion explicitly or implicitly within a unified framework. |
360 | Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention | Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang, | To alleviate such scalability issue, we exploit the structure of dialog acts to build a multi-layer hierarchical graph, where each act is represented as a root-to-leaf route on the graph. |
361 | Incremental Learning from Scratch for Task-Oriented Dialogue Systems | Weikang Wang, Jiajun Zhang, Qian Li, Mei-Yuh Hwang, Chengqing Zong, Zhifei Li, | To address this problem, we propose a novel incremental learning framework to design task-oriented dialogue systems, or for short Incremental Dialogue System (IDS), without pre-defining the exhaustive list of user needs. To evaluate our method, we propose a new dataset which simulates unanticipated user needs in the deployment stage. |
362 | ReCoSa: Detecting the Relevant Contexts with Self-Attention for Multi-turn Dialogue Generation | Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng, | In this paper, we propose a new model, named ReCoSa, to tackle this problem. |
363 | Dialogue Natural Language Inference | Sean Welleck, Jason Weston, Arthur Szlam, Kyunghyun Cho, | In this paper, we frame the consistency of dialogue agents as natural language inference (NLI) and create a new natural language inference dataset called Dialogue NLI. |
364 | Budgeted Policy Learning for Task-Oriented Dialogue Systems | Zhirui Zhang, Xiujun Li, Jianfeng Gao, Enhong Chen, | This paper presents a new approach that extends Deep Dyna-Q (DDQ) by incorporating a Budget-Conscious Scheduling (BCS) to best utilize a fixed, small amount of user interactions (budget) for learning task-oriented dialogue agents. |
365 | Comparison of Diverse Decoding Methods from Conditional Language Models | Daphne Ippolito, Reno Kriz, Joao Sedoc, Maria Kustikova, Chris Callison-Burch, | In this work, we perform an extensive survey of decoding-time strategies for generating diverse outputs from a conditional language model. |
366 | Retrieval-Enhanced Adversarial Training for Neural Response Generation | Qingfu Zhu, Lei Cui, Wei-Nan Zhang, Furu Wei, Ting Liu, | In this paper, we propose a Retrieval-Enhanced Adversarial Training (REAT) method for neural response generation. |
367 | Vocabulary Pyramid Network: Multi-Pass Encoding and Decoding with Multi-Level Vocabularies for Response Generation | Cao Liu, Shizhu He, Kang Liu, Jun Zhao, | To tackle the above two problems, we present a Vocabulary Pyramid Network (VPN) which is able to incorporate multi-pass encoding and decoding with multi-level vocabularies into response generation. |
368 | On-device Structured and Context Partitioned Projection Networks | Sujith Ravi, Zornitsa Kozareva, | To address this challenge, we propose an on-device neural network SGNN++ which dynamically learns compact projection vectors from raw text using structured and context-dependent partition projections. |
369 | Proactive Human-Machine Conversation with Explicit Conversation Goal | Wenquan Wu, Zhen Guo, Xiangyang Zhou, Hua Wu, Xiyuan Zhang, Rongzhong Lian, Haifeng Wang, | In this paper, we take a radical step towards building a human-like conversational agent: endowing it with the ability of proactively leading the conversation (introducing a new topic or maintaining the current topic). |
370 | Learning a Matching Model with Co-teaching for Multi-turn Response Selection in Retrieval-based Dialogue Systems | Jiazhan Feng, Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan, | To learn a robust matching model from noisy training data, we propose a general co-teaching framework with three specific teaching strategies that cover both teaching with loss functions and teaching with data curriculum. |
371 | Learning to Abstract for Memory-augmented Conversational Response Generation | Zhiliang Tian, Wei Bi, Xiaopeng Li, Nevin L. Zhang, | In this work, we propose a memory-augmented generative model, which learns to abstract from the training corpus and saves the useful information to the memory to assist the response generation. |
372 | Are Training Samples Correlated? Learning to Generate Dialogue Responses with Multiple References | Lisong Qiu, Juntao Li, Wei Bi, Dongyan Zhao, Rui Yan, | In this paper, we propose to utilize the multiple references by considering the correlation of different valid responses and modeling the 1-to-n mapping with a novel two-step generation architecture. |
373 | Pretraining Methods for Dialog Context Representation Learning | Shikib Mehri, Evgeniia Razumovskaia, Tiancheng Zhao, Maxine Eskenazi, | Two novel methods of pretraining dialog context encoders are proposed, and a total of four methods are examined. |
374 | A Large-Scale Corpus for Conversation Disentanglement | Jonathan K. Kummerfeld, Sai R. Gouravajhala, Joseph J. Peper, Vignesh Athreya, Chulaka Gunasekara, Jatin Ganhotra, Siva Sankalp Patel, Lazaros C Polymenakos, Walter Lasecki, | We created a new dataset of 77,563 messages manually annotated with reply-structure graphs that both disentangle conversations and define internal conversation structure. |
375 | Self-Supervised Dialogue Learning | Jiawei Wu, Xin Wang, William Yang Wang, | Therefore, in this paper, we introduce a self-supervised learning task, inconsistent order detection, to explicitly capture the flow of conversation in dialogues. |
376 | Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection | Maria Corkery, Yevgen Matusevych, Sharon Goldwater, | Recently, however, Kirov and Cotterell (2018) showed that modern encoder-decoder (ED) models overcome many of these flaws. They also presented evidence that ED models demonstrate humanlike performance in a nonce-word task. Here, we look more closely at the behaviour of their model in this task. |
377 | A Spreading Activation Framework for Tracking Conceptual Complexity of Texts | Ioana Hulpuș, Sanja Štajner, Heiner Stuckenschmidt, | We propose an unsupervised approach for assessing conceptual complexity of texts, based on spreading activation. |
378 | End-to-End Sequential Metaphor Identification Inspired by Linguistic Theories | Rui Mao, Chenghua Lin, Frank Guerin, | We experiment with two DNN models which are inspired by two human metaphor identification procedures. By testing on three public datasets, we find that our models achieve state-of-the-art performance in end-to-end metaphor identification. |
379 | Diachronic Sense Modeling with Deep Contextualized Word Embeddings: An Ecological View | Renfen Hu, Shen Li, Shichen Liang, | To address this issue, this paper proposes a sense representation and tracking framework based on deep contextualized embeddings, aiming at answering not only what and when, but also how the word meaning changes. |
380 | Miss Tools and Mr Fruit: Emergent Communication in Agents Learning about Object Affordances | Diane Bouchacourt, Marco Baroni, | We propose here a new task capturing crucial aspects of the human environment, such as natural object affordances, and of human conversation, such as full symmetry among the participants. |
381 | CNNs found to jump around more skillfully than RNNs: Compositional Generalization in Seq2seq Convolutional Networks | Roberto Dessì, Marco Baroni, | We test here a convolutional network (CNN) on these tasks, reporting hugely improved performance with respect to RNNs. |
382 | Uncovering Probabilistic Implications in Typological Knowledge Bases | Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein, | In this paper, we present a computational model which successfully identifies known universals, including Greenberg universals, but also uncovers new ones, worthy of further linguistic investigation. |
383 | Is Word Segmentation Child’s Play in All Languages? | Georgia R. Loukatou, Steven Moran, Damian Blasi, Sabine Stoll, Alejandrina Cristia, | We report on the stability in performance of 11 conceptually diverse algorithms on a selection of 8 typologically distinct languages. The results constitute evidence that some segmentation algorithms are cross-linguistically valid, and thus could be considered as potential strategies employed by all infants. |
384 | On the Distribution of Deep Clausal Embeddings: A Large Cross-linguistic Study | Damian Blasi, Ryan Cotterell, Lawrence Wolf-Sonkin, Sabine Stoll, Balthasar Bickel, Marco Baroni, | We introduce here a collection of large, dependency-parsed written corpora in 17 languages, that allow us, for the first time, to capture clausal embedding through dependency graphs and assess their distribution. |
385 | Attention-based Conditioning Methods for External Knowledge Integration | Katerina Margatina, Christos Baziotis, Alexandros Potamianos, | In this paper, we present a novel approach for incorporating external knowledge in Recurrent Neural Networks (RNNs). |
386 | The KnowRef Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution | Ali Emami, Paul Trichelair, Adam Trischler, Kaheer Suleman, Hannes Schulz, Jackie Chi Kit Cheung, | We introduce a new benchmark for coreference resolution and NLI, KnowRef, that targets common-sense understanding and world knowledge. We present a corpus of over 8,000 annotated text passages with ambiguous pronominal anaphora. |
387 | StRE: Self Attentive Edit Quality Prediction in Wikipedia | Soumya Sarkar, Bhanu Prakash Reddy, Sandipan Sikdar, Animesh Mukherjee, | In this paper we propose Self Attentive Revision Encoder (StRE) which leverages orthographic similarity of lexical units toward predicting the quality of new edits. |
388 | How Large Are Lions? Inducing Distributions over Quantitative Attributes | Yanai Elazar, Abhijit Mahabal, Deepak Ramachandran, Tania Bedrax-Weiss, Dan Roth, | We propose an unsupervised method for collecting quantitative information from large amounts of web data, and use it to create a new, very large resource consisting of distributions over physical quantities associated with objects, adjectives, and verbs which we call Distributions over Quantitative (DoQ). |
389 | Fine-Grained Sentence Functions for Short-Text Conversation | Wei Bi, Jun Gao, Xiaojiang Liu, Shuming Shi, | In this work, we collect a new Short-Text Conversation dataset with manually annotated SEntence FUNctions (STC-Sefun). |
390 | Give Me More Feedback II: Annotating Thesis Strength and Related Attributes in Student Essays | Zixuan Ke, Hrishikesh Inamdar, Hui Lin, Vincent Ng, | To facilitate advances in this area, we design a scoring rubric for scoring a core, yet unexplored dimension of persuasive essay quality, thesis strength, and annotate a corpus of essays with thesis strength scores. |
391 | Crowdsourcing and Validating Event-focused Emotion Corpora for German and English | Enrica Troiano, Sebastian Padó, Roman Klinger, | In this paper, we fill this gap for German by constructing deISEAR, a corpus designed in analogy to the well-established English ISEAR emotion dataset. |
392 | Pay Attention when you Pay the Bills. A Multilingual Corpus with Dependency-based and Semantic Annotation of Collocations. | Marcos Garcia, Marcos García Salido, Susana Sotelo, Estela Mosqueira, Margarita Alonso-Ramos, | This paper presents a new multilingual corpus with semantic annotation of collocations in English, Portuguese, and Spanish. |
393 | Does it Make Sense? And Why? A Pilot Study for Sense Making and Explanation | Cunxiang Wang, Shuailong Liang, Yue Zhang, Xiaonan Li, Tian Gao, | In this paper, we release a benchmark to directly test whether a system can differentiate natural language statements that make sense from those that do not make sense. |
394 | Large Dataset and Language Model Fun-Tuning for Humor Recognition | Vladislav Blinov, Valeria Bolotova-Baranova, Pavel Braslavski, | We collected a dataset of jokes and funny dialogues in Russian from various online resources and complemented them carefully with unfunny texts with similar lexical properties. |
395 | Towards Language Agnostic Universal Representations | Armen Aghajanyan, Xia Song, Saurabh Tiwary, | In this work, we present a method to decouple the language from the problem by learning language agnostic representations and therefore allowing training a model in one language and applying to a different one in a zero shot fashion. |
396 | Leveraging Meta Information in Short Text Aggregation | He Zhao, Lan Du, Guanfeng Liu, Wray Buntine, | To deal with the insufficiency, we propose a generative model that aggregates short texts into clusters by leveraging the associated meta information. |
397 | Exploiting Invertible Decoders for Unsupervised Sentence Representation Learning | Shuai Tang, Virginia R. de Sa, | In order to utilise the decoder after learning, we present two types of decoding functions whose inverse can be easily derived without expensive inverse calculation. |
398 | Self-Attentive, Multi-Context One-Class Classification for Unsupervised Anomaly Detection on Text | Lukas Ruff, Yury Zemlyanskiy, Robert Vandermeulen, Thomas Schnake, Marius Kloft, | In this paper we introduce a new anomaly detection method-Context Vector Data Description (CVDD)-which builds upon word embedding models to learn multiple sentence representations that capture multiple semantic contexts via the self-attention mechanism. |
399 | Hubless Nearest Neighbor Search for Bilingual Lexicon Induction | Jiaji Huang, Qiang Qiu, Kenneth Church, | This work proposes a new method, Hubless Nearest Neighbor (HNN), to mitigate hubness. |
400 | Distant Learning for Entity Linking with Automatic Noise Detection | Phong Le, Ivan Titov, | As the learning signal is weak and our surrogate labels are noisy, we introduce a noise detection component in our model: it lets the model detect and disregard examples which are likely to be noisy. |
401 | Learning How to Active Learn by Dreaming | Thuy-Trang Vu, Ming Liu, Dinh Phung, Gholamreza Haffari, | We introduce a new sample-efficient method that learns the AL policy directly on the target domain of interest by using wake and dream cycles. |
402 | Few-Shot Representation Learning for Out-Of-Vocabulary Words | Ziniu Hu, Ting Chen, Kai-Wei Chang, Yizhou Sun, | In this paper, we formulate the learning of OOV embedding as a few-shot regression problem by fitting a representation function to predict an oracle embedding vector (defined as embedding trained with abundant observations) based on limited contexts. |
403 | Neural Temporality Adaptation for Document Classification: Diachronic Word Embeddings and Domain Adaptation Models | Xiaolei Huang, Michael J. Paul, | This paper describes two complementary ways to adapt classifiers to shifts across time. |
404 | Learning Transferable Feature Representations Using Neural Networks | Himanshu Sharad Bhatt, Shourya Roy, Arun Rajkumar, Sriranjani Ramakrishnan, | We present a novel neural network architecture to simultaneously learn a two-part representation which is based on the principle of segregating source specific representation from the common representation. |
405 | Bayes Test of Precision, Recall, and F1 Measure for Comparison of Two Natural Language Processing Models | Ruibo Wang, Jihong Li, | In this study, we propose to use a block-regularized 3×2 CV (3×2 BCV) in model comparison because it could regularize the difference in certain frequency distributions over linguistic units between training and validation sets and yield stable estimators of P, R, and F1. |
406 | TIGS: An Inference Algorithm for Text Infilling with Gradient Search | Dayiheng Liu, Jie Fu, Pengfei Liu, Jiancheng Lv, | In this paper, we propose an iterative inference algorithm based on gradient search, which could be the first inference algorithm that can be broadly applied to any neural sequence generative models for text infilling tasks. |
407 | Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder | Ryan Benmalek, Madian Khabsa, Suma Desu, Claire Cardie, Michele Banko, | We introduce the Scratchpad Mechanism, a novel addition to the sequence-to-sequence (seq2seq) neural network architecture and demonstrate its effectiveness in improving the overall fluency of seq2seq models for natural language generation tasks. |
408 | Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection | Nafise Sadat Moosavi, Leo Born, Massimo Poesio, Michael Strube, | In this paper, we propose the MINA algorithm for automatically extracting minimum spans to benefit from minimum span evaluation in all corpora. |
409 | Revisiting Joint Modeling of Cross-document Entity and Event Coreference Resolution | Shany Barhom, Vered Shwartz, Alon Eirew, Michael Bugert, Nils Reimers, Ido Dagan, | We propose a neural architecture for cross-document coreference resolution. |
410 | A Unified Linear-Time Framework for Sentence-Level Discourse Parsing | Xiang Lin, Shafiq Joty, Prathyusha Jwalapuram, M Saiful Bari, | We propose an efficient neural framework for sentence-level discourse analysis in accordance with Rhetorical Structure Theory (RST). |
411 | Employing the Correspondence of Relations and Connectives to Identify Implicit Discourse Relations via Label Embeddings | Linh The Nguyen, Linh Van Ngo, Khoat Than, Thien Huu Nguyen, | In this work, we explore this property in a multi-task learning framework for IDRR in which the relations and the connectives are simultaneously predicted, and the mapping is leveraged to transfer knowledge between the two prediction tasks via the embeddings of relations and connectives. |
412 | Do You Know That Florence Is Packed with Visitors? Evaluating State-of-the-art Models of Speaker Commitment | Nanjiang Jiang, Marie-Catherine de Marneffe, | Here, we explore the hypothesis that linguistic deficits drive the error patterns of existing speaker commitment models by analyzing the linguistic correlates of model error on a challenging naturalistic dataset. |
413 | Multi-Relational Script Learning for Discourse Relations | I-Ta Lee, Dan Goldwasser, | In this paper, we suggest to view learning event embedding as a multi-relational problem, which allows us to capture different aspects of event pairs. |
414 | Open-Domain Why-Question Answering with Adversarial Learning to Encode Answer Texts | Jong-Hoon Oh, Kazuma Kadowaki, Julien Kloetzer, Ryu Iida, Kentaro Torisawa, | In this paper, we propose a method for why-question answering (why-QA) that uses an adversarial learning framework. |
415 | Learning to Ask Unanswerable Questions for Machine Reading Comprehension | Haichao Zhu, Li Dong, Furu Wei, Wenhui Wang, Bing Qin, Ting Liu, | In this work, we propose a data augmentation technique by automatically generating relevant unanswerable questions according to an answerable question paired with its corresponding paragraph that contains the answer. |
416 | Compositional Questions Do Not Necessitate Multi-hop Reasoning | Sewon Min, Eric Wallace, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi, Luke Zettlemoyer, | We introduce a single-hop BERT-based RC model that achieves 67 F1, comparable to state-of-the-art multi-hop models. |
417 | Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader | Wenhan Xiong, Mo Yu, Shiyu Chang, Xiaoxiao Guo, William Yang Wang, | We propose a new end-to-end question answering model, which learns to aggregate answer evidence from an incomplete knowledge base (KB) and a set of retrieved text snippets. Under the assumptions that structured data is easier to query and the acquired knowledge can help the understanding of unstructured text, our model first accumulates knowledge of KB entities from a question-related KB sub-graph; then reformulates the question in the latent space and reads the text with the accumulated entity knowledge at hand. |
418 | AdaNSP: Uncertainty-driven Adaptive Decoding in Neural Semantic Parsing | Xiang Zhang, Shizhu He, Kang Liu, Jun Zhao, | We instead propose an adaptive decoding method to avoid such intermediate representations. |
419 | The Language of Legal and Illegal Activity on the Darknet | Leshem Choshen, Dan Eldad, Daniel Hershcovich, Elior Sulem, Omri Abend, | This paper tackles this gap and performs an in-depth investigation of the characteristics of legal and illegal text in the Darknet, comparing it to a clear net website with similar content as a control condition. |
420 | Eliciting Knowledge from Experts: Automatic Transcript Parsing for Cognitive Task Analysis | Junyi Du, He Jiang, Jiaming Shen, Xiang Ren, | In this paper, we propose a weakly-supervised information extraction framework for automated CTA transcript parsing. |
421 | Course Concept Expansion in MOOCs with External Knowledge and Interactive Game | Jifan Yu, Chenyu Wang, Gan Luo, Lei Hou, Juanzi Li, Zhiyuan Liu, Jie Tang, | In this paper, we first build a novel boundary during searching for new concepts via external knowledge base and then utilize heterogeneous features to verify the high-quality results. In addition, to involve human efforts in our model, we design an interactive optimization mechanism based on a game. |
422 | Towards Near-imperceptible Steganographic Text | Falcon Dai, Zheng Cai, | We show that the imperceptibility of several existing linguistic steganographic systems (Fang et al., 2017; Yang et al., 2018) relies on implicit assumptions on statistical behaviors of fluent text. |
423 | Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network | Sunil Kumar Sahu, Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou, | We present a novel inter-sentence relation extraction model that builds a labelled edge graph convolutional neural network model on a document-level graph. |
424 | Neural Legal Judgment Prediction in English | Ilias Chalkidis, Ion Androutsopoulos, Nikolaos Aletras, | As a side-product, we propose a hierarchical version of BERT, which bypasses BERT’s length limitation. We release a new English legal judgment prediction dataset, containing cases from the European Court of Human Rights. |
425 | Robust Neural Machine Translation with Doubly Adversarial Inputs | Yong Cheng, Lu Jiang, Wolfgang Macherey, | We propose an approach to improving the robustness of NMT models, which consists of two parts: (1) attack the translation model with adversarial source examples; (2) defend the translation model with adversarial target inputs to improve its robustness against the adversarial source inputs. |
426 | Bridging the Gap between Training and Inference for Neural Machine Translation | Wen Zhang, Yang Feng, Fandong Meng, Di You, Qun Liu, | In this paper, we address these issues by sampling context words not only from the ground truth sequence but also from the predicted sequence by the model during training, where the predicted sequence is selected with a sentence-level optimum. |
427 | Beyond BLEU: Training Neural Machine Translation with Semantic Similarity | John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig, | In this paper, we introduce an alternative reward function for optimizing NMT systems that is based on recent work in semantic similarity. |
428 | AutoML Strategy Based on Grammatical Evolution: A Case Study about Knowledge Discovery from Text | Suilan Estevez-Velarde, Yoan Gutiérrez, Andrés Montoyo, Yudivián Almeida-Cruz, | This paper proposes a novel AutoML strategy based on probabilistic grammatical evolution, which is evaluated on the health domain by facing the knowledge discovery challenge in Spanish text documents. |
429 | Distilling Discrimination and Generalization Knowledge for Event Detection via Delta-Representation Learning | Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun, | To address this problem, this paper proposes a Delta-learning approach to distill discrimination and generalization knowledge by effectively decoupling, incrementally learning and adaptively fusing event representation. |
430 | Chinese Relation Extraction with Multi-Grained Information and External Linguistic Knowledge | Ziran Li, Ning Ding, Zhiyuan Liu, Haitao Zheng, Ying Shen, | To address the issues, we propose a multi-grained lattice framework (MG lattice) for Chinese relation extraction to take advantage of multi-grained language information and external linguistic knowledge. |
431 | A2N: Attending to Neighbors for Knowledge Graph Inference | Trapit Bansal, Da-Cheng Juan, Sujith Ravi, Andrew McCallum, | We thus propose a novel attention-based method to learn query-dependent representation of entities which adaptively combines the relevant graph neighborhood of an entity leading to more accurate KG completion. |
432 | Graph based Neural Networks for Event Factuality Prediction using Syntactic and Semantic Structures | Amir Pouran Ben Veyseh, Thien Huu Nguyen, Dejing Dou, | In this work, we introduce a novel graph-based neural network for EFP that can integrate the semantic and syntactic information more effectively. |
433 | Embedding Time Expressions for Deep Temporal Ordering Models | Tanya Goyal, Greg Durrett, | In this paper, we introduce a framework to infuse temporal awareness into such models by learning a pre-trained model to embed timexes. We generate synthetic data consisting of pairs of timexes, then train a character LSTM to learn embeddings and classify the timexes’ temporal relation. |
434 | Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data | Moonsu Han, Minki Kang, Hyunwoo Jung, Sung Ju Hwang, | To tackle this problem, we propose a novel end-to-end deep network model for reading comprehension, which we refer to as Episodic Memory Reader (EMR) that sequentially reads the input contexts into an external memory, while replacing memories that are less important for answering unseen questions. |
435 | Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets | Guanhua Zhang, Bing Bai, Jian Liang, Kun Bai, Shiyu Chang, Mo Yu, Conghui Zhu, Tiejun Zhao, | In this paper, we investigate the problem of selection bias on six NLSM datasets and find that four out of them are significantly biased. |
436 | Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index | Minjoon Seo, Jinhyuk Lee, Tom Kwiatkowski, Ankur Parikh, Ali Farhadi, Hannaneh Hajishirzi, | In this paper, we introduce query-agnostic indexable representations of document phrases that can drastically speed up open-domain QA. |
437 | Language Modeling with Shared Grammar | Yuyu Zhang, Le Song, | In this work, we propose neural variational language model (NVLM), which enables the sharing of grammar knowledge among different corpora. |
438 | Zero-Shot Semantic Parsing for Instructions | Ofer Givoli, Roi Reichart, | We introduce a new training algorithm that aims to train a semantic parser on examples from a set of source domains, so that it can effectively parse instructions from an unknown target domain. |
439 | Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling | Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, Samuel R. Bowman, | We conduct the first large-scale systematic study of candidate pretraining tasks, comparing 19 different tasks both as alternatives and complements to language modeling. |
440 | Complex Question Decomposition for Semantic Parsing | Haoyu Zhang, Jingjing Cai, Jianjun Xu, Ji Wang, | In this work, we focus on complex question semantic parsing and propose a novel Hierarchical Semantic Parsing (HSP) method, which utilizes the decompositionality of complex questions for semantic parsing. |
441 | Multi-Task Deep Neural Networks for Natural Language Understanding | Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, | In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks. |
442 | DisSent: Learning Sentence Representations from Explicit Discourse Relations | Allen Nie, Erin Bennett, Noah Goodman, | We show that with dependency parsing and rule-based rubrics, we can curate a high quality sentence relation task by leveraging explicit discourse relations. |
443 | SParC: Cross-Domain Semantic Parsing in Context | Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev, | We present SParC, a dataset for cross-domain Semantic Parsing in Context that consists of 4,298 coherent question sequences (12k+ individual questions annotated with SQL queries). |
444 | Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation | Jiaqi Guo, Zecheng Zhan, Yan Gao, Yan Xiao, Jian-Guang Lou, Ting Liu, Dongmei Zhang, | We present a neural approach called IRNet for complex and cross-domain Text-to-SQL. |
445 | EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition | Subhradeep Kayal, George Tsatsaronis, | In this work, we experiment with spectral methods of signal representation and summarization as mechanisms for constructing such word-sequence embeddings in an unsupervised fashion. |
446 | SemBleu: A Robust Metric for AMR Parsing Evaluation | Linfeng Song, Daniel Gildea, | We propose SEMBLEU, a robust metric that extends BLEU (Papineni et al., 2002) to AMRs. |
447 | Reranking for Neural Semantic Parsing | Pengcheng Yin, Graham Neubig, | This paper presents a simple approach to quickly iterate and improve the performance of an existing neural semantic parser by reranking an n-best list of predicted MRs, using features that are designed to fix observed problems with baseline models. |
448 | Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing | Ben Bogin, Jonathan Berant, Matt Gardner, | In this paper, we present an encoder-decoder semantic parser, where the structure of the DB schema is encoded with a graph neural network, and this representation is later used at both encoding and decoding time. |
449 | Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark | Nikita Nangia, Samuel R. Bowman, | Given the fast pace of progress, however, the headroom we observe is quite limited. |
450 | Compositional Semantic Parsing across Graphbanks | Matthias Lindemann, Jonas Groschwitz, Alexander Koller, | We present a compositional neural semantic parser which achieves, for the first time, competitive accuracies across a diverse range of graphbanks. |
451 | Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning | Tahira Naseem, Abhishek Shah, Hui Wan, Radu Florian, Salim Roukos, Miguel Ballesteros, | Our work involves enriching the Stack-LSTM transition-based AMR parser (Ballesteros and Al-Onaizan, 2017) by augmenting training with Policy Learning and rewarding the Smatch score of sampled graphs. |
452 | BERT Rediscovers the Classical NLP Pipeline | Ian Tenney, Dipanjan Das, Ellie Pavlick, | We focus on one such model, BERT, and aim to quantify where linguistic information is captured within the network. |
453 | Simple and Effective Paraphrastic Similarity from Parallel Translations | John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick, | We present a model and methodology for learning paraphrastic sentence embeddings directly from bitext, removing the time-consuming intermediate step of creating paraphrase corpora. |
454 | Second-Order Semantic Dependency Parsing with End-to-End Neural Networks | Xinyu Wang, Jingxian Huang, Kewei Tu, | In this paper, we propose a second-order semantic dependency parser, which takes into consideration not only individual dependency edges but also interactions between pairs of edges. |
455 | Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper) | Santiago Castro, Devamanyu Hazarika, Verónica Pérez-Rosas, Roger Zimmermann, Rada Mihalcea, Soujanya Poria, | In this paper, we argue that incorporating multimodal cues can improve the automatic classification of sarcasm. |
456 | Determining Relative Argument Specificity and Stance for Complex Argumentative Structures | Esin Durmus, Faisal Ladhak, Claire Cardie, | In this paper, we tackle these tasks in the context of complex arguments on a diverse set of topics. |
457 | Latent Variable Sentiment Grammar | Liwen Zhang, Kewei Tu, Yue Zhang, | To this end, we investigate two formalisms with deep sentiment representations that capture sentiment subtype expressions by latent variables and Gaussian mixture vectors, respectively. |
458 | An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese | Enkhbold Bataa, Joshua Wu, | In this work we focus on Japanese and show the potential use of transfer learning techniques in text classification. |
459 | Probing Neural Network Comprehension of Natural Language Arguments | Timothy Niven, Hung-Yu Kao, | We are surprised to find that BERT’s peak performance of 77% on the Argument Reasoning Comprehension Task reaches just three points below the average untrained human baseline. However, we show that this result is entirely accounted for by exploitation of spurious statistical cues in the dataset. |
460 | Recognising Agreement and Disagreement between Stances with Reason Comparing Networks | Chang Xu, Cecile Paris, Surya Nepal, Ross Sparks, | We propose a reason comparing network (RCN) to leverage reason information for stance comparison. |
461 | Toward Comprehensive Understanding of a Sentiment Based on Human Motives | Naoki Otani, Eduard Hovy, | Our work considers human motives as the driver for human sentiments and addresses the problem of motive detection as the first step. |
462 | Context-aware Embedding for Targeted Aspect-based Sentiment Analysis | Bin Liang, Jiachen Du, Ruifeng Xu, Binyang Li, Hejiao Huang, | To address this problem, we propose a novel method to refine the embeddings of targets and aspects. |
463 | Yes, we can! Mining Arguments in 50 Years of US Presidential Campaign Debates | Shohreh Haddadan, Elena Cabrio, Serena Villata, | As existing research lacks solid empirical investigation of the typology of argument components in political debates, we fill this gap by proposing an Argument Mining approach to political debates. |
464 | An Empirical Study of Span Representations in Argumentation Structure Parsing | Tatsuki Kuribayashi, Hiroki Ouchi, Naoya Inoue, Paul Reisert, Toshinori Miyoshi, Jun Suzuki, Kentaro Inui, | This study investigates (i) span representation originally developed for other NLP tasks and (ii) a simple task-dependent extension for ASP. |
465 | Simple and Effective Text Matching with Richer Alignment Features | Runqi Yang, Jianhai Zhang, Xing Gao, Feng Ji, Haiqing Chen, | In this paper, we present a fast and strong neural approach for general purpose text matching applications. |
466 | Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs | Deepak Nathani, Jatin Chauhan, Charu Sharma, Manohar Kaul, | To this effect, our paper proposes a novel attention-based feature embedding that captures both entity and relation features in any given entity’s neighborhood. |
467 | Neural Network Alignment for Sentential Paraphrases | Jessica Ouyang, Kathy McKeown, | We present a monolingual alignment system for long, sentence- or clause-level alignments, and demonstrate that systems designed for word- or short phrase-based alignment are ill-suited for these longer alignments. |
468 | Duality of Link Prediction and Entailment Graph Induction | Mohammad Javad Hosseini, Shay B. Cohen, Mark Johnson, Mark Steedman, | In this paper, we show that these two problems are actually complementary. |
469 | A Cross-Sentence Latent Variable Model for Semi-Supervised Text Sequence Matching | Jihun Choi, Taeuk Kim, Sang-goo Lee, | We present a latent variable model for predicting the relationship between a pair of text sequences. |
470 | COMET: Commonsense Transformers for Automatic Knowledge Graph Construction | Antoine Bosselut, Hannah Rashkin, Maarten Sap, Chaitanya Malaviya, Asli Celikyilmaz, Yejin Choi, | We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017). |
471 | Detecting Subevents using Discourse and Narrative Features | Mohammed Aldawsari, Mark Finlayson, | We present a supervised model for automatically identifying when one event is a subevent of another. |
472 | HellaSwag: Can a Machine Really Finish Your Sentence? | Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, Yejin Choi, | In this paper, we show that commonsense inference still proves difficult for even state-of-the-art models, by presenting HellaSwag, a new challenge dataset. |
473 | Unified Semantic Parsing with Weak Supervision | Priyanka Agrawal, Ayushi Dalmia, Parag Jain, Abhishek Bansal, Ashish Mittal, Karthik Sankaranarayanan, | To overcome this, we propose a novel framework to build a unified multi-domain enabled semantic parser trained only with weak supervision (denotations). |
474 | Every Child Should Have Parents: A Taxonomy Refinement Algorithm Based on Hyperbolic Term Embeddings | Rami Aly, Shantanu Acharya, Alexander Ossa, Arne Köhn, Chris Biemann, Alexander Panchenko, | We introduce the use of Poincaré embeddings to improve existing state-of-the-art approaches to domain-specific taxonomy induction from text as a signal for both relocating wrong hyponym terms within a (pre-induced) taxonomy as well as for attaching disconnected terms in a taxonomy. |
475 | Learning to Rank for Plausible Plausibility | Zhongyang Li, Tongfei Chen, Benjamin Van Durme, | We suggest this loss is intuitively wrong when applied to plausibility tasks, where the prompt by design is neither categorically entailed nor contradictory given the context. |
476 | Generalized Tuning of Distributional Word Vectors for Monolingual and Cross-Lingual Lexical Entailment | Goran Glavaš, Ivan Vulić, | In this work, we propose a simple and effective method for fine-tuning distributional word vectors for LE. |
477 | Attention Is (not) All You Need for Commonsense Reasoning | Tassilo Klein, Moin Nabi, | In this paper, we describe a simple re-implementation of BERT for commonsense reasoning. |
478 | A Surprisingly Robust Trick for the Winograd Schema Challenge | Vid Kocijan, Ana-Maria Cretu, Oana-Maria Camburu, Yordan Yordanov, Thomas Lukasiewicz, | In this paper, we show that the performance of three language models on WSC273 consistently and robustly improves when fine-tuned on a similar pronoun disambiguation problem dataset (denoted WSCR). |
479 | Coherent Comments Generation for Chinese Articles with a Graph-to-Sequence Model | Wei Li, Jingjing Xu, Yancheng He, ShengLi Yan, Yunfang Wu, Xu Sun, | In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph. |
480 | Interconnected Question Generation with Coreference Alignment and Conversation Flow Modeling | Yifan Gao, Piji Li, Irwin King, Michael R. Lyu, | We propose an end-to-end neural model with coreference alignment and conversation flow modeling. |
481 | Cross-Lingual Training for Automatic Question Generation | Vishwajeet Kumar, Nitish Joshi, Arijit Mukherjee, Ganesh Ramakrishnan, Preethi Jyothi, | We propose a cross-lingual QG model which uses the following training regime: (i) Unsupervised pretraining of language models in both primary and secondary languages and (ii) joint supervised training for QG in both languages. |
482 | A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer | Chen Wu, Xuancheng Ren, Fuli Luo, Xu Sun, | To address these challenges, we propose a hierarchical reinforced sequence operation method, named Point-Then-Operate (PTO), which consists of a high-level agent that proposes operation positions and a low-level agent that alters the sentence. |
483 | Handling Divergent Reference Texts when Evaluating Table-to-Text Generation | Bhuwan Dhingra, Manaal Faruqui, Ankur Parikh, Ming-Wei Chang, Dipanjan Das, William Cohen, | We propose a new metric, PARENT, which aligns n-grams from the reference and generated texts to the semi-structured data before computing their precision and recall. |
484 | Unsupervised Question Answering by Cloze Translation | Patrick Lewis, Ludovic Denoyer, Sebastian Riedel, | In this work, we explore to what extent high quality training data is actually required for Extractive QA, and investigate the possibility of unsupervised Extractive QA. |
485 | MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension | Alon Talmor, Jonathan Berant, | In this paper, we conduct such an investigation over ten RC datasets, training on one or more source RC datasets, and evaluating generalization, as well as transfer to a target RC dataset. |
486 | Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives | Yi Tay, Shuohang Wang, Anh Tuan Luu, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang, | We propose a curriculum learning (CL) based Pointer-Generator framework for reading/sampling over large documents, enabling diverse training of the neural model based on the notion of alternating contextual difficulty. |
487 | Explain Yourself! Leveraging Language Models for Commonsense Reasoning | Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, Richard Socher, | We collect human explanations for commonsense reasoning in the form of natural language sequences and highlighted annotations in a new dataset called Common Sense Explanations (CoS-E). We use CoS-E to train language models to automatically generate explanations that can be used during training and inference in a novel Commonsense Auto-Generated Explanation (CAGE) framework. |
488 | Interpretable Question Answering on Knowledge Bases and Text | Alona Sydorova, Nina Poerner, Benjamin Roth, | In this work, we address the interpretability of ML based question answering (QA) models on a combination of knowledge bases (KB) and text documents. |
489 | A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity | Yoshinari Fujinuma, Jordan Boyd-Graber, Michael J. Paul, | We measure this characteristic using modularity, a network measurement that measures the strength of clusters in a graph. |
490 | Multilingual and Cross-Lingual Graded Lexical Entailment | Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš, | In this paper, we present the first work on cross-lingual generalisation of the GR-LE relation. |
491 | What Kind of Language Is Hard to Language-Model? | Sebastian J. Mielke, Ryan Cotterell, Kyle Gorman, Brian Roark, Jason Eisner, | Methodologically, we introduce a new paired-sample multiplicative mixed-effects model to obtain language difficulty coefficients from at-least-pairwise parallel corpora. |
492 | Analyzing the Limitations of Cross-lingual Word Embedding Mappings | Aitor Ormazabal, Mikel Artetxe, Gorka Labaka, Aitor Soroa, Eneko Agirre, | We thus conclude that current mapping methods do have strong limitations, calling for further research to jointly learn cross-lingual embeddings with a weaker cross-lingual signal. |
493 | How Multilingual is Multilingual BERT? | Telmo Pires, Eva Schlinger, Dan Garrette, | In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific annotations in one language are used to fine-tune the model for evaluation in another language. |
494 | Bilingual Lexicon Induction through Unsupervised Machine Translation | Mikel Artetxe, Gorka Labaka, Eneko Agirre, | In this paper, we propose an alternative approach to this problem that builds on the recent work on unsupervised machine translation. |
495 | Automatically Identifying Complaints in Social Media | Daniel Preoţiuc-Pietro, Mihaela Gaman, Nikolaos Aletras, | In this paper, we introduce the first systematic analysis of complaints in computational linguistics. |
496 | TWEETQA: A Social Media Focused Question Answering Dataset | Wenhan Xiong, Jiawei Wu, Hong Wang, Vivek Kulkarni, Mo Yu, Shiyu Chang, Xiaoxiao Guo, William Yang Wang, | While previous datasets have concentrated on question answering (QA) for formal text like news and Wikipedia, we present the first large-scale dataset for QA over social media data. |
497 | Asking the Crowd: Question Analysis, Evaluation and Generation for Open Discussion on Online Forums | Zi Chai, Xinyu Xing, Xiaojun Wan, Bo Huang, | In this paper, we take the first step on teaching machines to ask open-answered questions from real-world news for open discussion (openQG). |
498 | Tree LSTMs with Convolution Units to Predict Stance and Rumor Veracity in Social Media Conversations | Sumeet Kumar, Kathleen Carley, | In this research, we propose a new way to represent social-media conversations as binarized constituency trees that allows comparing features in source-posts and their replies effectively. |
499 | HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization | Xingxing Zhang, Furu Wei, Ming Zhou, | Inspired by the recent work on pre-training transformer sentence encoders (Devlin et al., 2018), we propose HIBERT (as shorthand for HIerarchical Bidirectional Encoder Representations from Transformers) for document encoding and a method to pre-train it using unlabeled data. |
500 | Hierarchical Transformers for Multi-Document Summarization | Yang Liu, Mirella Lapata, | In this paper, we develop a neural summarization model which can effectively process multiple input documents and distill Transformer architecture with the ability to encode documents in a hierarchical manner. |
501 | Abstractive Text Summarization Based on Deep Learning and Semantic Content Generalization | Panagiotis Kouris, Georgios Alexandridis, Andreas Stafylopatis, | This work proposes a novel framework for enhancing abstractive text summarization based on the combination of deep learning techniques along with semantic data transformations. |
502 | Studying Summarization Evaluation Metrics in the Appropriate Scoring Range | Maxime Peyrard, | We show that, surprisingly, evaluation metrics which behave similarly on these datasets (average-scoring range) strongly disagree in the higher-scoring range in which current systems now operate. |
503 | Simple Unsupervised Summarization by Contextual Matching | Jiawei Zhou, Alexander Rush, | We propose an unsupervised method for sentence summarization using only language modeling. |
504 | Generating Summaries with Topic Templates and Structured Convolutional Decoders | Laura Perez-Beltrachini, Yang Liu, Mirella Lapata, | In this paper we propose a structured convolutional decoder that is guided by the content structure of target summaries. |
505 | Morphological Irregularity Correlates with Frequency | Shijie Wu, Ryan Cotterell, Timothy O’Donnell, | We present a study of morphological irregularity. |
506 | Like a Baby: Visually Situated Neural Language Acquisition | Alexander Ororbia, Ankur Mali, Matthew Kelly, David Reitter, | We examine the benefits of visual context in training neural language models to perform next-word prediction. |
507 | Relating Simple Sentence Representations in Deep Neural Networks and the Brain | Sharmistha Jat, Hao Tang, Partha Talukdar, Tom Mitchell, | We investigate these questions using sentences with simple syntax and semantics (e.g., The bone was eaten by the dog.) |
508 | Modeling Affirmative and Negated Action Processing in the Brain with Lexical and Compositional Semantic Models | Vesna Djokic, Jean Maillard, Luana Bulat, Ekaterina Shutova, | In this paper, we apply lexical and compositional semantic models to decode fMRI patterns associated with negated and affirmative sentences containing hand-action verbs. |
509 | Word-order Biases in Deep-agent Emergent Communication | Rahma Chaabouni, Eugene Kharitonov, Alessandro Lazaric, Emmanuel Dupoux, Marco Baroni, | We aim here to uncover which biases such models display with respect to “natural” word-order constraints. |
510 | NNE: A Dataset for Nested Named Entity Recognition in English Newswire | Nicky Ringland, Xiang Dai, Ben Hachey, Sarvnaz Karimi, Cecile Paris, James R. Curran, | We describe NNE, a fine-grained, nested named entity dataset over the full Wall Street Journal portion of the Penn Treebank (PTB). |
511 | Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks | Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun, | In this paper, we propose to resolve this problem by modeling and leveraging the head-driven phrase structures of entity mentions, i.e., although a mention can nest other mentions, they will not share the same head word. |
512 | Improving Textual Network Embedding with Global Attention via Optimal Transport | Liqun Chen, Guoyin Wang, Chenyang Tao, Dinghan Shen, Pengyu Cheng, Xinyuan Zhang, Wenlin Wang, Yizhe Zhang, Lawrence Carin, | This work focuses on learning context-aware network embeddings augmented with text data. |
513 | Identification of Tasks, Datasets, Evaluation Metrics, and Numeric Scores for Scientific Leaderboards Construction | Yufang Hou, Charles Jochim, Martin Gleize, Francesca Bonin, Debasis Ganguly, | In this paper we build two datasets and develop a framework (TDMS-IE) aimed at automatically extracting task, dataset, metric and score from NLP papers, towards the automatic construction of leaderboards. |
514 | Scaling up Open Tagging from Tens to Thousands: Comprehension Empowered Attribute Value Extraction from Product Title | Huimin Xu, Wenting Wang, Xin Mao, Xinyu Jiang, Man Lan, | In this work, we propose a novel approach to support value extraction scaling up to thousands of attributes without losing performance: (1) We propose to regard attribute as a query and adopt only one global set of BIO tags for any attributes to reduce the burden of attribute tag or model explosion; (2) We explicitly model the semantic representations for attribute and title, and develop an attention mechanism to capture the interactive semantic relations in-between to enforce our framework to be attribute comprehensive. |
515 | Incorporating Linguistic Constraints into Keyphrase Generation | Jing Zhao, Yuxiang Zhang, | In this paper, we propose the parallel Seq2Seq network with the coverage attention to alleviate the overlapping phrase problem. |
516 | A Unified Multi-task Adversarial Learning Framework for Pharmacovigilance Mining | Shweta Yadav, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya, | In this paper, we propose a neural network inspired multi- task learning framework that can simultaneously extract ADRs from various sources. |
517 | Quantity Tagger: A Latent-Variable Sequence Labeling Approach to Solving Addition-Subtraction Word Problems | Yanyan Zou, Wei Lu, | This work presents a novel approach, Quantity Tagger, that automatically discovers such hidden relations by tagging each quantity with a sign corresponding to one type of mathematical operation. |
518 | A Deep Reinforced Sequence-to-Set Model for Multi-Label Classification | Pengcheng Yang, Fuli Luo, Shuming Ma, Junyang Lin, Xu Sun, | To remedy this, we propose a simple but effective sequence-to-set model. |
519 | Joint Slot Filling and Intent Detection via Capsule Neural Networks | Chenwei Zhang, Yaliang Li, Nan Du, Wei Fan, Philip Yu, | To exploit the semantic hierarchy for effective modeling, we propose a capsule-based neural network model which accomplishes slot filling and intent detection via a dynamic routing-by-agreement schema. |
520 | Neural Aspect and Opinion Term Extraction with Mined Rules as Weak Supervision | Hongliang Dai, Yangqiu Song, | To alleviate this problem, we first propose an algorithm to automatically mine extraction rules from existing training examples based on dependency parsing results. The mined rules are then applied to label a large amount of auxiliary data. Finally, we study training procedures to train a neural model which can learn from both the data automatically labeled by the rules and a small amount of data accurately annotated by humans. |
521 | Cost-sensitive Regularization for Label Confusion-aware Event Detection | Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun, | To address this label confusion problem, this paper proposes cost-sensitive regularization, which can force the training procedure to concentrate more on optimizing confusing type pairs. |
522 | Exploring Pre-trained Language Models for Event Extraction and Generation | Sen Yang, Dawei Feng, Linbo Qiao, Zhigang Kan, Dongsheng Li, | To promote event extraction, we first propose an event extraction model to overcome the roles overlap problem by separating the argument prediction in terms of roles. Moreover, to address the problem of insufficient training data, we propose a method to automatically generate labeled data by editing prototypes and screen out generated samples by ranking the quality. |
523 | Improving Open Information Extraction via Iterative Rank-Aware Learning | Zhengbao Jiang, Pengcheng Yin, Graham Neubig, | We propose an additional binary classification loss to calibrate the likelihood to make it more globally comparable, and an iterative learning process, where extractions generated by the open IE model are incrementally included as training samples to help the model learn from trial and error. |
524 | Towards Improving Neural Named Entity Recognition with Gazetteers | Tianyu Liu, Jin-Ge Yao, Chin-Yew Lin, | In this work, we show that properly utilizing external gazetteers could benefit segmental neural NER models. |
525 | Span-Level Model for Relation Extraction | Kalpit Dixit, Yaser Al-Onaizan, | To address these concerns, we present a model which directly models all possible spans and performs joint entity mention detection and relation extraction. |
526 | Enhancing Unsupervised Generative Dependency Parser with Contextual Information | Wenjuan Han, Yong Jiang, Kewei Tu, | In this paper, we propose a novel probabilistic model called discriminative neural dependency model with valence (D-NDMV) that generates a sentence and its parse from a continuous latent representation, which encodes global contextual information of the generated sentence. |
527 | Neural Architectures for Nested NER through Linearization | Jana Straková, Milan Straka, Jan Hajič, | We propose two neural network architectures for nested named entity recognition (NER), a setting in which named entities may overlap and also be labeled with more than one label. |
528 | Online Infix Probability Computation for Probabilistic Finite Automata | Marco Cognetta, Yo-Sub Han, Soon Chan Kwon, | We develop an asymptotic improvement of that algorithm and solve the open problem of computing the infix probabilities of PFAs from streaming data, which is crucial when processing queries online and is the ultimate goal of the incremental approach. |
529 | How to Best Use Syntax in Semantic Role Labelling | Yufei Wang, Mark Johnson, Stephen Wan, Yifang Sun, Wei Wang, | We evaluate three different ways of encoding syntactic parses and three different ways of injecting them into a state-of-the-art neural ELMo-based SRL sequence labelling model. |
530 | PTB Graph Parsing with Tree Approximation | Yoshihide Kato, Shigeki Matsubara, | This paper proposes a method that approximates PTB graph-structured representations by trees. |
531 | Sequence Labeling Parsing by Learning across Representations | Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez, | We use parsing as sequence labeling as a common framework to learn across constituency and dependency syntactic abstractions. To do so, we cast the problem as multitask learning (MTL). |
532 | A Prism Module for Semantic Disentanglement in Name Entity Recognition | Kun Liu, Shen Li, Daqi Zheng, Zhengdong Lu, Sheng Gao, Si Li, | To solve this problem, we propose a prism module to disentangle the semantic aspects of words and reduce noise at the input layer of a model. |
533 | Label-Agnostic Sequence Labeling by Copying Nearest Neighbors | Sam Wiseman, Karl Stratos, | We show we can perform accurate sequence labeling by explicitly (and only) copying labels from retrieved neighbors. |
534 | Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset | Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau, | This work proposes a new benchmark for empathetic dialogue generation and EmpatheticDialogues, a novel dataset of 25k conversations grounded in emotional situations. |
535 | Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment | Siqi Bao, Huang He, Fan Wang, Rongzhong Lian, Hua Wu, | In this paper, a novel Generation-Evaluation framework is developed for multi-turn conversations with the objective of letting both participants know more about each other. |
536 | Training Neural Response Selection for Task-Oriented Dialogue Systems | Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su, | Inspired by the recent success of pretraining in language modelling, we propose an effective method for deploying response selection in task-oriented dialogue. |
537 | Collaborative Dialogue in Minecraft | Anjali Narayan-Chen, Prashant Jayannavar, Julia Hockenmaier, | We present the Minecraft Dialogue Corpus, a collection of 509 conversations and game logs. |
538 | Neural Response Generation with Meta-words | Can Xu, Wei Wu, Chongyang Tao, Huang Hu, Matt Schuerman, Ying Wang, | We present open domain dialogue generation with meta-words. |
539 | Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading | Lianhui Qin, Michel Galley, Chris Brockett, Xiaodong Liu, Xiang Gao, Bill Dolan, Yejin Choi, Jianfeng Gao, | We present a new end-to-end approach to contentful neural conversation that jointly models response generation and on-demand machine reading. To support further research on knowledge-grounded conversation, we introduce a new large-scale conversation dataset grounded in external web pages (2.8M turns, 7.4M sentences of grounding). |
540 | Ordinal and Attribute Aware Response Generation in a Multimodal Dialogue System | Hardik Chauhan, Mauajama Firdaus, Asif Ekbal, Pushpak Bhattacharyya, | In this paper, we propose a novel position and attribute aware attention mechanism to learn enhanced image representation conditioned on the user utterance. |
541 | Memory Consolidation for Contextual Spoken Language Understanding with Dialogue Logistic Inference | He Bai, Yu Zhou, Jiajun Zhang, Chengqing Zong, | In this paper, we propose a new dialogue logistic inference (DLI) task to consolidate the context memory jointly with SLU in the multi-task framework. |
542 | Personalizing Dialogue Agents via Meta-Learning | Andrea Madotto, Zhaojiang Lin, Chien-Sheng Wu, Pascale Fung, | In this paper, we propose to extend Model-Agnostic Meta-Learning (MAML) (Finn et al., 2017) to personalized dialogue learning without using any persona descriptions. |
543 | Reading Turn by Turn: Hierarchical Attention Architecture for Spoken Dialogue Comprehension | Zhengyuan Liu, Nancy Chen, | Therefore, in this work, we propose a hierarchical attention neural network architecture, combining turn-level and word-level attention mechanisms, to improve spoken dialogue comprehension performance. |
544 | A Novel Bi-directional Interrelated Model for Joint Intent Detection and Slot Filling | Haihong E, Peiqing Niu, Zhongfu Chen, Meina Song, | In this paper, we propose a novel bi-directional interrelated model for joint intent detection and slot filling. |
545 | Dual Supervised Learning for Natural Language Understanding and Generation | Shang-Yu Su, Chao-Wei Huang, Yun-Nung Chen, | This paper proposes a novel learning framework for natural language understanding and generation on top of dual supervised learning, providing a way to exploit the duality. |
546 | SUMBT: Slot-Utterance Matching for Universal and Scalable Belief Tracking | Hwaran Lee, Jinsik Lee, Tae-Yoon Kim, | In this paper, we propose a new approach to universal and scalable belief tracker, called slot-utterance matching belief tracker (SUMBT). |
547 | Robust Zero-Shot Cross-Domain Slot Filling with Example Values | Darsh Shah, Raghav Gupta, Amir Fayazi, Dilek Hakkani-Tur, | We propose utilizing both the slot description and a small number of examples of slot values, which may be easily available, to learn semantic representations of slots which are transferable across domains and robust to misaligned schemas. |
548 | Deep Unknown Intent Detection with Margin Loss | Ting-En Lin, Hua Xu, | In this paper, we present a two-stage method for detecting unknown intents. |
549 | Modeling Semantic Relationship in Multi-turn Conversations with Hierarchical Latent Variables | Lei Shen, Yang Feng, Haolan Zhan, | To address this problem, we propose a Conversational Semantic Relationship RNN (CSRR) model to construct the dependency explicitly. |
550 | Rationally Reappraising ATIS-based Dialogue Systems | Jingcheng Niu, Gerald Penn, | This paper presents a detailed account of these shortcomings, our proposed repairs, our rule-based grammar and the neural slot-filling architectures associated with ATIS. |
551 | Learning Latent Trees with Stochastic Perturbations and Differentiable Dynamic Programming | Caio Corro, Ivan Titov, | We treat projective dependency trees as latent variables in our probabilistic model and induce them in such a way as to be beneficial for a downstream task, without relying on any direct tree supervision. |
552 | Neural-based Chinese Idiom Recommendation for Enhancing Elegance in Essay Writing | Yuanchao Liu, Bo Pang, Bingquan Liu, | In this study, we address the problem of idiom recommendation by leveraging a neural machine translation framework, in which we suppose that idioms are written with one pseudo target language. |
553 | Better Exploiting Latent Variables in Text Modeling | Canasai Kruengkrai, | We show that sampling latent variables multiple times at a gradient step helps in improving a variational autoencoder and propose a simple and effective method to better exploit these latent variables through hidden state averaging. |
554 | Misleading Failures of Partial-input Baselines | Shi Feng, Eric Wallace, Jordan Boyd-Graber, | We first design artificial datasets to illustrate how the trivial patterns that are only visible in the full input can evade any partial-input baseline. Next, we identify such artifacts in the SNLI dataset: a hypothesis-only model augmented with trivial patterns in the premise can solve 15% of previously-thought "hard" examples. |
555 | Soft Contextual Data Augmentation for Neural Machine Translation | Fei Gao, Jinhua Zhu, Lijun Wu, Yingce Xia, Tao Qin, Xueqi Cheng, Wengang Zhou, Tie-Yan Liu, | In this paper, we present a novel data augmentation method for neural machine translation. Different from previous augmentation methods that randomly drop, swap or replace words with other words in a sentence, we softly augment a randomly chosen word in a sentence by its contextual mixture of multiple related words. |
556 | Reversing Gradients in Adversarial Domain Adaptation for Question Deduplication and Textual Entailment Tasks | Anush Kamath, Sparsh Gupta, Vitor Carvalho, | Here we investigate the use of gradient reversal on adversarial domain adaptation to explicitly learn both shared and unshared (domain specific) representations between two textual domains. |
557 | Towards Integration of Statistical Hypothesis Tests into Deep Neural Networks | Ahmad Aghaebrahimian, Mark Cieliebak, | We report our ongoing work about a new deep architecture working in tandem with a statistical test procedure for jointly training texts and their label descriptions for multi-label and multi-class classification tasks. |
558 | Depth Growing for Neural Machine Translation | Lijun Wu, Yiren Wang, Yingce Xia, Fei Tian, Fei Gao, Tao Qin, Jianhuang Lai, Tie-Yan Liu, | In this work, we propose an effective two-stage approach with three specially designed components to construct deeper NMT models, which result in significant improvements over the strong Transformer baselines on WMT14 English→German and English→French translation tasks. |
559 | Generating Fluent Adversarial Examples for Natural Languages | Huangzhao Zhang, Hao Zhou, Ning Miao, Lei Li, | In this paper, we propose MHA, which addresses both problems by performing Metropolis-Hastings sampling, whose proposal is designed with the guidance of gradients. |
560 | Towards Explainable NLP: A Generative Explanation Framework for Text Classification | Hui Liu, Qingyu Yin, William Yang Wang, | To solve this problem, we propose a novel generative explanation framework that learns to make classification decisions and generate fine-grained explanations at the same time. |
561 | Combating Adversarial Misspellings with Robust Word Recognition | Danish Pruthi, Bhuwan Dhingra, Zachary C. Lipton, | To combat adversarial spelling mistakes, we propose placing a word recognition model in front of the downstream classifier. |
562 | An Empirical Investigation of Structured Output Modeling for Graph-based Neural Dependency Parsing | Zhisong Zhang, Xuezhe Ma, Eduard Hovy, | In this paper, we investigate the aspect of structured output modeling for the state-of-the-art graph-based neural dependency parser (Dozat and Manning, 2017). |
563 | Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes | Jie Cao, Michael Tanana, Zac Imel, Eric Poitras, David Atkins, Vivek Srikumar, | In this paper, we study modeling behavioral codes used to assess a psychotherapy treatment style called Motivational Interviewing (MI), which is effective for addressing substance abuse and related problems. |
564 | Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems | Hung Le, Doyen Sahoo, Nancy Chen, Steven Hoi, | To overcome this, we propose Multimodal Transformer Networks (MTN) to encode videos and incorporate information from different modalities. |
565 | Target-Guided Open-Domain Conversation | Jianheng Tang, Tiancheng Zhao, Chenyan Xiong, Xiaodan Liang, Eric Xing, Zhiting Hu, | We propose a structured approach that introduces coarse-grained keywords to control the intended content of system responses. |
566 | Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good | Xuewei Wang, Weiyan Shi, Richard Kim, Yoojung Oh, Sijia Yang, Jingwen Zhang, Zhou Yu, | We designed an online persuasion task where one participant was asked to persuade the other to donate to a specific charity. We collected a large dataset with 1,017 dialogues and annotated emerging persuasion strategies from a subset. |
567 | Improving Neural Conversational Models with Entropy-Based Data Filtering | Richárd Csáky, Patrik Purgai, Gábor Recski, | While previous methods for improving the quality of open-domain response generation focused on either the underlying model or the training objective, we present a method of filtering dialog datasets by removing generic utterances from training data using a simple entropy-based approach that does not require human supervision. |
568 | Zero-shot Word Sense Disambiguation using Sense Definition Embeddings | Sawan Kumar, Sharmistha Jat, Karan Saxena, Partha Talukdar, | To overcome this challenge, we propose Extended WSD Incorporating Sense Embeddings (EWISE), a supervised model to perform WSD by predicting over a continuous sense embedding space as opposed to a discrete label space. |
569 | Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation | Daniel Loureiro, Alípio Jorge, | In this work, we show that contextual embeddings can be used to achieve unprecedented gains in Word Sense Disambiguation (WSD) tasks. |
570 | Word2Sense: Sparse Interpretable Word Embeddings | Abhishek Panigrahi, Harsha Vardhan Simhadri, Chiranjib Bhattacharyya, | We present an unsupervised method to generate Word2Sense word embeddings that are interpretable – each dimension of the embedding space corresponds to a fine-grained sense, and the non-negative value of the embedding along the j-th dimension represents the relevance of the j-th sense to the word. |
571 | Modeling Semantic Compositionality with Sememe Knowledge | Fanchao Qi, Junjie Huang, Chenghao Yang, Zhiyuan Liu, Xiao Chen, Qun Liu, Maosong Sun, | In this paper, we verify the effectiveness of sememes, the minimum semantic units of human languages, in modeling SC by a confirmatory experiment. |
572 | Predicting Humorousness and Metaphor Novelty with Gaussian Process Preference Learning | Edwin Simpson, Erik-Lân Do Dinh, Tristan Miller, Iryna Gurevych, | We introduce a Bayesian approach for predicting humorousness and metaphor novelty using Gaussian process preference learning (GPPL), which achieves a Spearman's ρ of 0.56 against gold using word embeddings and linguistic features. |
573 | Empirical Linguistic Study of Sentence Embeddings | Katarzyna Krasnowska-Kieraś, Alina Wróblewska, | The purpose of the research is to answer the question whether linguistic information is retained in vector representations of sentences. |
574 | Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings | Yadollah Yaghoobzadeh, Katharina Kann, T. J. Hazen, Eneko Agirre, Hinrich Schütze, | We present a large dataset based on manual Wikipedia annotations and word senses, where word senses from different words are related by semantic classes. |
575 | Deep Neural Model Inspection and Comparison via Functional Neuron Pathways | James Fiacco, Samridhi Choudhary, Carolyn Rose, | We introduce a general method for the interpretation and comparison of neural models. |
576 | Collocation Classification with Unsupervised Relation Vectors | Luis Espinosa Anke, Steven Schockaert, Leo Wanner, | In this paper, we explore to which extent the current distributional landscape based on word embeddings provides a suitable basis for classification of collocations, i.e., pairs of words between which idiosyncratic lexical relations hold. |
577 | Corpus-based Check-up for Thesaurus | Natalia Loukachevitch, | In this paper we discuss the usefulness of applying a checking procedure to existing thesauri. |
578 | Confusionset-guided Pointer Networks for Chinese Spelling Check | Dingmin Wang, Yi Tay, Li Zhong, | This paper proposes Confusionset-guided Pointer Networks for Chinese Spell Check (CSC) task. |
579 | Generalized Data Augmentation for Low-Resource Translation | Mengzhou Xia, Xiang Kong, Antonios Anastasopoulos, Graham Neubig, | In this paper, we propose a general framework of data augmentation for low-resource machine translation not only using target-side monolingual data, but also by pivoting through a related high-resource language. |
580 | Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned | Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov, | In this work we evaluate the contribution made by individual attention heads to the overall performance of the model and analyze the roles played by them in the encoder. |
581 | Better OOV Translation with Bilingual Terminology Mining | Matthias Huck, Viktor Hangya, Alexander Fraser, | We improve the translation of OOVs in NMT using easy-to-obtain monolingual data. |
582 | Simultaneous Translation with Flexible Policy via Restricted Imitation Learning | Baigong Zheng, Renjie Zheng, Mingbo Ma, Liang Huang, | We propose a much simpler single model that adds a “delay” token to the target vocabulary, and design a restricted dynamic oracle to greatly simplify training. |
583 | Target Conditioned Sampling: Optimizing Data Selection for Multilingual Neural Machine Translation | Xinyi Wang, Graham Neubig, | In this paper, we seek to construct a sampling distribution over all multilingual data, so that it minimizes the training loss of the low-resource language. |
584 | Adversarial Learning of Privacy-Preserving Text Representations for De-Identification of Medical Records | Max Friedrich, Arne Köhn, Gregor Wiedemann, Chris Biemann, | We introduce a method to create privacy-preserving shareable representations of medical text (i.e. they contain no PHI) that does not require expensive manual pseudonymization. |
585 | Merge and Label: A Novel Neural Network Architecture for Nested NER | Joseph Fisher, Andreas Vlachos, | In this paper we introduce a novel neural network architecture that first merges tokens and/or entities into entities forming nested structures, and then labels each of them independently. |
586 | Low-resource Deep Entity Resolution with Transfer and Active Learning | Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa, | In this paper, we develop a deep learning-based method that targets low-resource settings for ER through a novel combination of transfer learning and active learning. |
587 | A Semi-Markov Structured Support Vector Machine Model for High-Precision Named Entity Recognition | Ravneet Arora, Chen-Tse Tsai, Ketevan Tsereteli, Prabhanjan Kambadur, Yi Yang, | In this paper, we propose a neural semi-Markov structured support vector machine model that controls the precision-recall trade-off by assigning weights to different types of errors in the loss-augmented inference during training. |
588 | Using Human Attention to Extract Keyphrase from Microblog Post | Yingyi Zhang, Chengzhi Zhang, | Thus, this paper aims to integrate human attention into keyphrase extraction models. |
589 | Model-Agnostic Meta-Learning for Relation Classification with Limited Supervision | Abiola Obamuyide, Andreas Vlachos, | In this paper we frame the task of supervised relation classification as an instance of meta-learning. |
590 | Variational Pretraining for Semi-supervised Text Classification | Suchin Gururangan, Tam Dang, Dallas Card, Noah A. Smith, | We introduce VAMPIRE, a lightweight pretraining framework for effective text classification when data and computing resources are limited. |
591 | Task Refinement Learning for Improved Accuracy and Stability of Unsupervised Domain Adaptation | Yftah Ziser, Roi Reichart, | In this paper we propose a Task Refinement Learning (TRL) approach, in order to solve these problems. |
592 | Optimal Transport-based Alignment of Learned Character Representations for String Similarity | Derek Tam, Nicholas Monath, Ari Kobren, Aaron Traylor, Rajarshi Das, Andrew McCallum, | In this work, we present STANCE, a learned model for computing the similarity of two strings. |
593 | The Referential Reader: A Recurrent Entity Network for Anaphora Resolution | Fei Liu, Luke Zettlemoyer, Jacob Eisenstein, | We present a new architecture for storing and accessing entity mentions during online text processing. |
594 | Interpolated Spectral NGram Language Models | Ariadna Quattoni, Xavier Carreras, | In this work we employ a technique for scaling up spectral learning, and use interpolated predictions that are optimized to maximize perplexity. |
595 | BAM! Born-Again Multi-Task Networks for Natural Language Understanding | Kevin Clark, Minh-Thang Luong, Urvashi Khandelwal, Christopher D. Manning, Quoc V. Le, | To help address this, we propose using knowledge distillation where single-task models teach a multi-task model. |
596 | Curate and Generate: A Corpus and Method for Joint Control of Semantics and Style in Neural NLG | Shereen Oraby, Vrindavan Harrison, Abteen Ebrahimi, Marilyn Walker, | We present YelpNLG, a corpus of 300,000 rich, parallel meaning representations and highly stylistically varied reference texts spanning different restaurant attributes, and describe a novel methodology that can be scalably reused to generate NLG datasets for other domains. |
597 | Automated Chess Commentator Powered by Neural Chess Engine | Hongyu Zang, Zhiwei Yu, Xiaojun Wan, | In this paper, we explore a new approach for automated chess commentary generation, which aims to generate chess commentary texts in different categories (e.g., description, comparison, planning, etc.). |
598 | Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling | Robert Logan, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh, | To address this, we introduce the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context. |
599 | Controllable Paraphrase Generation with a Syntactic Exemplar | Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel, | In this work, we propose a novel task, where the syntax of a generated sentence is controlled rather by a sentential exemplar. To evaluate quantitatively with standard metrics, we create a novel dataset with human annotations. |
600 | Towards Comprehensive Description Generation from Factual Attribute-value Tables | Tianyu Liu, Fuli Luo, Pengcheng Yang, Wei Wu, Baobao Chang, Zhifang Sui, | To relieve these problems, we first propose force attention (FA) method to encourage the generator to pay more attention to the uncovered attributes to avoid potential key attributes missing. Furthermore, we propose reinforcement learning for information richness to generate more informative as well as more loyal descriptions for tables. |
601 | Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation | Ning Dai, Jianze Liang, Xipeng Qiu, Xuanjing Huang, | In this paper, we propose the Style Transformer, which makes no assumption about the latent representation of source sentence and equips the power of attention mechanism in Transformer to achieve better style transfer and better content preservation. |
602 | Generating Sentences from Disentangled Syntactic and Semantic Spaces | Yu Bao, Hao Zhou, Shujian Huang, Lei Li, Lili Mou, Olga Vechtomova, Xin-yu Dai, Jiajun Chen, | In this paper, we propose to generate sentences from disentangled syntactic and semantic spaces. |
603 | Learning to Control the Fine-grained Sentiment for Story Ending Generation | Fuli Luo, Damai Dai, Pengcheng Yang, Tianyu Liu, Baobao Chang, Zhifang Sui, Xu Sun, | Therefore, we propose a generic and novel framework which consists of a sentiment analyzer and a sentimental generator, respectively addressing the two challenges. |
604 | Self-Attention Architectures for Answer-Agnostic Neural Question Generation | Thomas Scialom, Benjamin Piwowarski, Jacopo Staiano, | We explore how Transformers can be adapted to the task of Neural Question Generation without constraining the model to focus on a specific answer passage. |
605 | Unsupervised Paraphrasing without Translation | Aurko Roy, David Grangier, | This work proposes to learn paraphrasing models only from a monolingual corpus. |
606 | Storyboarding of Recipes: Grounded Contextual Generation | Khyathi Chandu, Eric Nyberg, Alan W Black, | We introduce a dataset for sequential procedural (how-to) text generation from images in the cooking domain. |
607 | Negative Lexically Constrained Decoding for Paraphrase Generation | Tomoyuki Kajiwara, | To solve this problem, we propose a neural model for paraphrase generation that first identifies words in the source sentence that should be paraphrased. |
608 | Large-Scale Transfer Learning for Natural Language Generation | Sergey Golovanov, Rauf Kurbanov, Sergey Nikolenko, Kyryl Truskovskyi, Alexander Tselousov, Thomas Wolf, | We focus in particular on open-domain dialog as a typical high-entropy generation task, presenting and comparing different architectures for adapting pretrained models with state-of-the-art results. |
609 | Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study | Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou, | In this paper, we present a preliminary empirical study on whether and how much automatic grammatical error correction can help improve seq2seq text generation. |
610 | Improving the Robustness of Question Answering Systems to Question Paraphrasing | Wee Chung Gan, Hwee Tou Ng, | Using a neural paraphrasing model trained to generate multiple paraphrased questions for a given source question and a set of paraphrase suggestions, we propose a data augmentation approach that requires no human intervention to re-train the models for improved robustness to question paraphrasing. |
611 | RankQA: Neural Question Answering with Answer Re-Ranking | Bernhard Kratzwald, Anna Eigenmann, Stefan Feuerriegel, | In contrast, this work proposes RankQA: RankQA extends the conventional two-stage process in neural QA with a third stage that performs an additional answer re-ranking. |
612 | Latent Retrieval for Weakly Supervised Open Domain Question Answering | Kenton Lee, Ming-Wei Chang, Kristina Toutanova, | We show for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs and without any IR system. |
613 | Multi-hop Reading Comprehension through Question Decomposition and Rescoring | Sewon Min, Victor Zhong, Luke Zettlemoyer, Hannaneh Hajishirzi, | We propose a system for multi-hop RC that decomposes a compositional question into simpler sub-questions that can be answered by off-the-shelf single-hop RC models. |
614 | Combining Knowledge Hunting and Neural Language Models to Solve the Winograd Schema Challenge | Ashok Prakash, Arpit Sharma, Arindam Mitra, Chitta Baral, | In this work, we build upon the language model based methods and augment them with a commonsense knowledge hunting (using automatic extraction from text) module and an explicit reasoning module. |
615 | Careful Selection of Knowledge to Solve Open Book Question Answering | Pratyay Banerjee, Kuntal Kumar Pal, Arindam Mitra, Chitta Baral, | In this paper we address QA with respect to the OpenBookQA dataset and combine state-of-the-art language models with abductive information retrieval (IR), information gain based re-ranking, passage selection and weighted scoring to achieve 72.0% accuracy, an 11.6% improvement over the current state of the art. |
616 | Learning Representation Mapping for Relation Detection in Knowledge Base Question Answering | Peng Wu, Shujian Huang, Rongxiang Weng, Zaixiang Zheng, Jianbing Zhang, Xiaohui Yan, Jiajun Chen, | In this paper, we propose a simple mapping method, named representation adapter, to learn the representation mapping for both seen and unseen relations based on previously learned relation embedding. |
617 | Dynamically Fused Graph Network for Multi-hop Reasoning | Lin Qiu, Yunxuan Xiao, Yanru Qu, Hao Zhou, Lei Li, Weinan Zhang, Yong Yu, | In this paper, we propose Dynamically Fused Graph Network (DFGN), a novel method to answer those questions requiring multiple scattered evidence and reasoning over them. |
618 | NLProlog: Reasoning with Weak Unification for Question Answering in Natural Language | Leon Weber, Pasquale Minervini, Jannes Münchmeyer, Ulf Leser, Tim Rocktäschel, | In this paper, we describe a model combining neural networks with logic programming in a novel manner for solving multi-hop reasoning tasks over natural language. |
619 | Modeling Intra-Relation in Math Word Problems with Different Functional Multi-Head Attentions | Jierui Li, Lei Wang, Jipeng Zhang, Yan Wang, Bing Tian Dai, Dongxiang Zhang, | To utilize the merits of deep learning models with simultaneous consideration of MWPs’ specific features, we propose a group attention mechanism to extract global features, quantity-related features, quantity-pair features and question-related features in MWPs respectively. |
620 | Synthetic QA Corpora Generation with Roundtrip Consistency | Chris Alberti, Daniel Andor, Emily Pitler, Jacob Devlin, Michael Collins, | We introduce a novel method of generating synthetic question answering corpora by combining models of question generation and answer extraction, and by filtering the results to ensure roundtrip consistency. |
621 | Are Red Roses Red? Evaluating Consistency of Question-Answering Models | Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh, | We propose a method to automatically extract such implications for instances from two QA datasets, VQA and SQuAD, which we then use to evaluate the consistency of models. |
622 | MC^2: Multi-perspective Convolutional Cube for Conversational Machine Reading Comprehension | Xuanyu Zhang, | To comprehend context profoundly and efficiently from different perspectives, we propose a novel neural network model, Multi-perspective Convolutional Cube (MC^2). |
623 | Reducing Word Omission Errors in Neural Machine Translation: A Contrastive Learning Approach | Zonghan Yang, Yong Cheng, Yang Liu, Maosong Sun, | In this work, we propose a contrastive learning approach to reducing word omission errors in NMT. |
624 | Exploiting Sentential Context for Neural Machine Translation | Xing Wang, Zhaopeng Tu, Longyue Wang, Shuming Shi, | In this work, we present novel approaches to exploit sentential context for neural machine translation (NMT). |
625 | Wetin dey with these comments? Modeling Sociolinguistic Factors Affecting Code-switching Behavior in Nigerian Online Discussions | Innocent Ndubuisi-Obi, Sayan Ghosh, David Jurgens, | We introduce a new corpus of 330K articles and accompanying 389K comments labeled for code switching behavior. |
626 | Accelerating Sparse Matrix Operations in Neural Networks on Graphics Processing Units | Arturo Argueta, David Chiang, | We present two new GPU algorithms: one at the input layer, for multiplying a matrix by a few-hot vector (generalizing the more common operation of multiplication by a one-hot vector) and one at the output layer, for a fused softmax and top-N selection (commonly used in beam search). |
627 | An Automated Framework for Fast Cognate Detection and Bayesian Phylogenetic Inference in Computational Historical Linguistics | Taraka Rama, Johann-Mattis List, | We present a fully automated workflow for phylogenetic reconstruction on large datasets, consisting of two novel methods, one for fast detection of cognates and one for fast Bayesian phylogenetic inference. |
628 | Sentence Centrality Revisited for Unsupervised Summarization | Hao Zheng, Mirella Lapata, | In this paper we develop an unsupervised approach arguing that it is unrealistic to expect large-scale and high-quality training data to be available or created for different types of summaries, domains, or languages. |
629 | Discourse Representation Parsing for Sentences and Documents | Jiangming Liu, Shay B. Cohen, Mirella Lapata, | We introduce a novel semantic parsing task based on Discourse Representation Theory (DRT; Kamp and Reyle 1993). |
630 | Inducing Document Structure for Aspect-based Summarization | Lea Frermann, Alexandre Klementiev, | We tackle the task of aspect-based summarization, where, given a document and a target aspect, our models generate a summary centered around the aspect. |
631 | Incorporating Priors with Feature Attribution on Text Classification | Frederick Liu, Besim Avci, | Our approach integrates feature attributions into the objective function to allow machine learning practitioners to incorporate priors in model building. |
632 | Matching Article Pairs with Graphical Decomposition and Convolutions | Bang Liu, Di Niu, Haojie Wei, Jinghong Lin, Yancheng He, Kunfeng Lai, Yu Xu, | To model article pairs, we propose the Concept Interaction Graph to represent an article as a graph of concepts. To facilitate the evaluation of long article matching, we have created two datasets, each consisting of about 30K pairs of breaking news articles covering diverse topics in the open domain. |
633 | Hierarchical Transfer Learning for Multi-label Text Classification | Siddhartha Banerjee, Cem Akkaya, Francisco Perez-Sorrosal, Kostas Tsioutsiouliklis, | We propose a novel transfer learning based strategy, HTrans, where binary classifiers at lower levels in the hierarchy are initialized using parameters of the parent classifier and fine-tuned on the child category classification task. |
634 | Bias Analysis and Mitigation in the Evaluation of Authorship Verification | Janek Bevendorff, Matthias Hagen, Benno Stein, Martin Potthast, | In this paper we review, theoretically and practically, the authorship verification task and conclude that the underlying experiment design cannot guarantee pushing forward the state of the art; in fact, it allows for top benchmarking with a surprisingly straightforward approach. We pinpoint these sources in the evaluation chain and present a refined authorship corpus as an effective countermeasure. |
635 | Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments | Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen, | In this paper, we attempt to answer the question of whether neural network models can learn numeracy, which is the ability to predict the magnitude of a numeral at some specific position in a text description. |
636 | Large-Scale Multi-Label Text Classification on EU Legislation | Ilias Chalkidis, Emmanouil Fergadiotis, Prodromos Malakasiotis, Ion Androutsopoulos, | We consider Large-Scale Multi-Label Text Classification (LMTC) in the legal domain. |
637 | Why Didn’t You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models | Varun Kumar, Alison Smith-Renner, Leah Findlater, Kevin Seppi, Jordan Boyd-Graber, | Users should have a sense of control in HLTM systems, so we propose a control metric to measure whether refinement operations’ results match users’ expectations. |
638 | Encouraging Paragraph Embeddings to Remember Sentence Identity Improves Classification | Tu Vu, Mohit Iyyer, | In this paper, we investigate a state-of-the-art paragraph embedding method proposed by Zhang et al. (2017) and discover that it cannot reliably tell whether a given sentence occurs in the input paragraph or not. |
639 | A Multi-Task Architecture on Relevance-based Neural Query Translation | Sheikh Muhammad Sarwar, Hamed Bonab, James Allan, | We describe a multi-task learning approach to train a Neural Machine Translation (NMT) model with a Relevance-based Auxiliary Task (RAT) for search query translation. |
640 | Topic Modeling with Wasserstein Autoencoders | Feng Nan, Ran Ding, Ramesh Nallapati, Bing Xiang, | We propose a novel neural topic model in the Wasserstein autoencoders (WAE) framework. |
641 | Dense Procedure Captioning in Narrated Instructional Videos | Botian Shi, Lei Ji, Yaobo Liang, Nan Duan, Peng Chen, Zhendong Niu, Ming Zhou, | Motivated by video dense captioning, we propose a model to generate procedure captions from narrated instructional videos, which are sequences of step-wise clips with descriptions. |
642 | Latent Variable Model for Multi-modal Translation | Iacer Calixto, Miguel Rios, Wilker Aziz, | In this work, we propose to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model. |
643 | Identifying Visible Actions in Lifestyle Vlogs | Oana Ignat, Laura Burdick, Jia Deng, Rada Mihalcea, | We construct a dataset with crowdsourced manual annotations of visible actions, and introduce a multimodal algorithm that leverages information derived from visual and linguistic clues to automatically infer which actions are visible in a video. |
644 | A Corpus for Reasoning about Natural Language Grounded in Photographs | Alane Suhr, Stephanie Zhou, Ally Zhang, Iris Zhang, Huajun Bai, Yoav Artzi, | We introduce a new dataset for joint reasoning about natural language and images, with a focus on semantic diversity, compositionality, and visual reasoning challenges. |
645 | Learning to Discover, Ground and Use Words with Segmental Neural Language Models | Kazuya Kawakami, Chris Dyer, Phil Blunsom, | We propose a segmental neural language model that combines the generalization power of neural networks with the ability to discover word-like units that are latent in unsegmented character sequences. |
646 | What Should I Ask? Using Conversationally Informative Rewards for Goal-oriented Visual Dialog. | Pushkar Shukla, Carlos Elmadjian, Richika Sharan, Vivek Kulkarni, Matthew Turk, William Yang Wang, | In this work, we focus on the task of goal-oriented visual dialogue, aiming to automatically generate a series of questions about an image with a single objective. |
647 | Symbolic Inductive Bias for Visually Grounded Learning of Spoken Language | Grzegorz Chrupała, | We propose to use multitask learning to exploit existing transcribed speech within the end-to-end setting. |
648 | Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog | Zhe Gan, Yu Cheng, Ahmed Kholy, Linjie Li, Jingjing Liu, Jianfeng Gao, | This paper presents a new model for visual dialog, Recurrent Dual Attention Network (ReDAN), using multi-step reasoning to answer a series of questions about an image. |
649 | Lattice Transformer for Speech Translation | Pei Zhang, Niyu Ge, Boxing Chen, Kai Fan, | The goal of this work is to extend the attention mechanism of the transformer to naturally consume the lattice in addition to the traditional sequential input. |
650 | Informative Image Captioning with External Sources of Information | Sanqiang Zhao, Piyush Sharma, Tomer Levinboim, Radu Soricut, | We introduce a multimodal, multi-encoder model based on Transformer that ingests both image features and multiple sources of entity labels. |
651 | CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication | Jin-Hwa Kim, Nikita Kitaev, Xinlei Chen, Marcus Rohrbach, Byoung-Tak Zhang, Yuandong Tian, Dhruv Batra, Devi Parikh, | In this work, we propose a goal-driven collaborative task that combines language, perception, and action. We collect the CoDraw dataset of ~10K dialogs consisting of ~138K messages exchanged between human players. |
652 | Bridging by Word: Image Grounded Vocabulary Construction for Visual Captioning | Zhihao Fan, Zhongyu Wei, Siyuan Wang, Xuanjing Huang, | To tackle this problem, we propose to construct an image-grounded vocabulary, based on which, captions are generated with limitation and guidance. |
653 | Distilling Translations with Visual Awareness | Julia Ive, Pranava Madhyastha, Lucia Specia, | We propose a translate-and-refine approach to this problem where images are only used by a second stage decoder. |
654 | VIFIDEL: Evaluating the Visual Fidelity of Image Descriptions | Pranava Madhyastha, Josiah Wang, Lucia Specia, | We propose a novel image-aware metric for this task: VIFIDEL. |
655 | Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation | Ronghang Hu, Daniel Fried, Anna Rohrbach, Dan Klein, Trevor Darrell, Kate Saenko, | To better use all the available modalities, we propose to decompose the grounding procedure into a set of expert models with access to different modalities (including object detections) and ensemble them at prediction time, improving the performance of state-of-the-art models on the VLN task. |
656 | Multimodal Transformer for Unaligned Multimodal Language Sequences | Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, Ruslan Salakhutdinov, | In this paper, we introduce the Multimodal Transformer (MulT) to generically address the above issues in an end-to-end manner without explicitly aligning the data. |
657 | Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-ray Reports | Baoyu Jing, Zeya Wang, Eric Xing, | In this work, we propose a novel framework which exploits the structure information between and within report sections for generating CXR imaging reports. |
658 | Visual Story Post-Editing | Ting-Yao Hsu, Chieh-Yang Huang, Yen-Chia Hsu, Ting-Hao Huang, | We introduce the first dataset for human edits of machine-generated visual stories and explore how these collected edits may be used for the visual story post-editing task. |
659 | Multimodal Abstractive Summarization for How2 Videos | Shruti Palaskar, Jindřich Libovický, Spandana Gella, Florian Metze, | In this paper, we study abstractive summarization for open-domain videos. |
660 | Learning to Relate from Captions and Bounding Boxes | Sarthak Garg, Joel Ruben Antony Moniz, Anshu Aviral, Priyatham Bollimpalli, | In this work, we propose a novel approach that predicts the relationships between various entities in an image in a weakly supervised manner by relying on image captions and object bounding box annotations as the sole source of supervision. |