Paper Digest: NAACL 2018 Highlights
The North American Chapter of the Association for Computational Linguistics (NAACL) hosts one of the top natural language processing conferences in the world. In 2018, the conference was held in New Orleans, Louisiana. There were 1,072 paper submissions, of which 205 were accepted as long papers and 125 were accepted as short papers.
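As a quick sanity check on these figures, the acceptance rates implied by the counts above can be computed directly (a minimal sketch; the numbers are taken from the paragraph above):

```python
# Acceptance statistics for NAACL 2018, using the counts quoted above.
submissions = 1072
long_accepted = 205
short_accepted = 125

total_accepted = long_accepted + short_accepted
overall_rate = 100 * total_accepted / submissions
long_rate = 100 * long_accepted / submissions

print(f"Accepted: {total_accepted} papers")
print(f"Overall acceptance rate: {overall_rate:.1f}%")
print(f"Long-paper acceptance rate: {long_rate:.1f}%")
```

This works out to 330 accepted papers, an overall acceptance rate of roughly 31%.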
To help the AI community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
We thank all authors for writing these interesting papers, and our readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to get new paper updates customized to your own interests on a daily basis.
Paper Digest Team
team@paperdigest.org
TABLE 1: NAACL 2018 Long Papers
No. | Title | Authors | Highlight |
---|---|---|---|
1 | Label-Aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition | Zhenghui Wang, Yanru Qu, Liheng Chen, Jian Shen, Weinan Zhang, Shaodian Zhang, Yimei Gao, Gen Gu, Ken Chen, Yong Yu | In this paper, we propose a label-aware double transfer learning framework (La-DTL) for cross-specialty NER, so that a medical NER system designed for one specialty could be conveniently applied to another one with minimal annotation efforts. |
2 | Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss | Peng Xu, Denilson Barbosa | Instead, we propose an end-to-end solution with a neural network model that uses a variant of cross-entropy loss function to handle out-of-context labels, and hierarchical loss normalization to cope with overly-specific ones. |
3 | Joint Bootstrapping Machines for High Confidence Relation Extraction | Pankaj Gupta, Benjamin Roth, Hinrich Schütze | We introduce BREX, a new bootstrapping method that protects against such contamination by highly effective confidence assessment. |
4 | A Deep Generative Model of Vowel Formant Typology | Ryan Cotterell, Jason Eisner | In our work, we tackle the problem of vowel system typology, i.e., we propose a generative probability model of which vowels a language contains. |
5 | Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages | Katharina Kann, Jesus Manuel Mager Hois, Ivan Vladimir Meza-Ruiz, Hinrich Schütze | We provide our morphological segmentation datasets for Mexicanero, Nahuatl, Wixarika and Yorem Nokki for future research. |
6 | Improving Character-Based Decoding Using Target-Side Morphological Information for Neural Machine Translation | Peyman Passban, Qun Liu, Andy Way | In this paper, we propose an extension to the state-of-the-art model of Chung et al. (2016), which works at the character level and boosts the decoder with target-side morphological information. |
7 | Parsing Speech: a Neural Approach to Integrating Lexical and Acoustic-Prosodic Information | Trang Tran, Shubham Toshniwal, Mohit Bansal, Kevin Gimpel, Karen Livescu, Mari Ostendorf | For automatically parsing spoken utterances, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and prosodic features. |
8 | Tied Multitask Learning for Neural Speech Translation | Antonios Anastasopoulos, David Chiang | We explore multitask models for neural translation of speech, augmenting them in order to reflect two intuitive notions. |
9 | Please Clap: Modeling Applause in Campaign Speeches | Jon Gillick, David Bamman | We introduce a new corpus of speeches from campaign events in the months leading up to the 2016 U.S. presidential election and develop new models for predicting moments of audience applause. |
10 | Attentive Interaction Model: Modeling Changes in View in Argumentation | Yohan Jo, Shivani Poddar, Byungsoo Jeon, Qinlan Shen, Carolyn Rosé, Graham Neubig | We present a neural architecture for modeling argumentative dialogue that explicitly models the interplay between an Opinion Holder’s (OH’s) reasoning and a challenger’s argument, with the goal of predicting if the argument successfully changes the OH’s view. |
11 | Automatic Focus Annotation: Bringing Formal Pragmatics Alive in Analyzing the Information Structure of Authentic Data | Ramon Ziai, Detmar Meurers | Building on the research that established detailed annotation guidelines for manual annotation of information structural concepts for written (Dipper et al., 2007; Ziai and Meurers, 2014) and spoken language data (Calhoun et al., 2010), this paper presents the first approach automating the analysis of focus in authentic written data. |
12 | Dear Sir or Madam, May I Introduce the GYAFC Dataset: Corpus, Benchmarks and Metrics for Formality Style Transfer | Sudha Rao, Joel Tetreault | In this work, we create the largest corpus for a particular stylistic transfer (formality) and show that techniques from the machine translation community can serve as strong baselines for future work. |
13 | Improving Implicit Discourse Relation Classification by Modeling Inter-dependencies of Discourse Units in a Paragraph | Zeyu Dai, Ruihong Huang | With the goal of improving implicit discourse relation classification, we introduce a paragraph-level neural network that models inter-dependencies between discourse units as well as discourse relation continuity and patterns, and predicts a sequence of discourse relations in a paragraph. |
14 | A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation | Juraj Juraska, Panagiotis Karagiannis, Kevin Bowden, Marilyn Walker | We describe an ensemble neural language generator, and present several novel methods for data representation and augmentation that yield improved results in our model. |
15 | A Melody-Conditioned Lyrics Language Model | Kento Watanabe, Yuichiroh Matsubayashi, Satoru Fukayama, Masataka Goto, Kentaro Inui, Tomoyasu Nakano | This paper presents a novel, data-driven language model that produces entire lyrics for a given input melody. |
16 | Discourse-Aware Neural Rewards for Coherent Text Generation | Antoine Bosselut, Asli Celikyilmaz, Xiaodong He, Jianfeng Gao, Po-Sen Huang, Yejin Choi | In this paper, we investigate the use of discourse-aware rewards with reinforcement learning to guide a model to generate long, coherent text. |
17 | Natural Answer Generation with Heterogeneous Memory | Yao Fu, Yansong Feng | In this work, we propose a novel attention mechanism to encourage the decoder to actively interact with the memory by taking its heterogeneity into account. |
18 | Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation | Shuming Ma, Xu Sun, Wei Li, Sujian Li, Wenjie Li, Xuancheng Ren | In this work, we introduce a novel model based on the encoder-decoder framework, called Word Embedding Attention Network (WEAN). |
19 | Simplification Using Paraphrases and Context-Based Lexical Substitution | Reno Kriz, Eleni Miltsakaki, Marianna Apidianaki, Chris Callison-Burch | We propose a complex word identification (CWI) model that exploits both lexical and contextual features, and a simplification mechanism which relies on a word-embedding lexical substitution model to replace the detected complex words with simpler paraphrases. |
20 | Zero-Shot Question Generation from Knowledge Graphs for Unseen Predicates and Entity Types | Hady Elsahar, Christophe Gravier, Frederique Laforest | We present a neural model for question generation from knowledge graphs triples in a “Zero-shot” setup, that is generating questions for predicate, subject types or object types that were not seen at training time. |
21 | Automated Essay Scoring in the Presence of Biased Ratings | Evelin Amorim, Marcia Cançado, Adriano Veloso | We present features to quantify rater bias based on their comments, and we found that rater bias plays an important role in automated essay scoring. To this end, we present a new annotated corpus containing essays and their respective scores. |
22 | Content-Based Citation Recommendation | Chandra Bhagavatula, Sergey Feldman, Russell Power, Waleed Ammar | We present a content-based method for recommending citations in an academic paper draft. |
23 | Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences | Daniel Khashabi, Snigdha Chaturvedi, Michael Roth, Shyam Upadhyay, Dan Roth | We present a reading comprehension challenge in which questions can only be answered by taking into account information from multiple sentences. |
24 | Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input | Youmna Farag, Helen Yannakoudakis, Ted Briscoe | We develop a neural model of local coherence that can effectively learn connectedness features between sentences, and propose a framework for integrating and jointly training the local coherence model with a state-of-the-art AES model. |
25 | QuickEdit: Editing Text & Translations by Crossing Words Out | David Grangier, Michael Auli | We propose a framework for computer-assisted text editing. |
26 | Tempo-Lexical Context Driven Word Embedding for Cross-Session Search Task Extraction | Procheta Sen, Debasis Ganguly, Gareth Jones | By contrast, in this work we seek to identify tasks that span across multiple sessions. |
27 | Zero-Shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens | Marek Rei, Anders Søgaard | Can attention- or gradient-based visualization techniques be used to infer token-level labels for binary sequence tagging problems, using networks trained only on sentence-level labels? |
28 | Variable Typing: Assigning Meaning to Variables in Mathematical Text | Yiannos Stathopoulos, Simon Baker, Marek Rei, Simone Teufel | We introduce variable typing, the task of assigning one mathematical type (multi-word technical terms referring to mathematical concepts) to each variable in a sentence of mathematical text. As part of this work, we also introduce a new annotated data set composed of 33,524 data points extracted from scientific documents published on arXiv. |
29 | Learning beyond Datasets: Knowledge Graph Augmented Neural Networks for Natural Language Processing | Annervaz K M, Somnath Basu Roy Chowdhury, Ambedkar Dukkipati | In this work, we propose to enhance learning models with world knowledge in the form of Knowledge Graph (KG) fact triples for Natural Language Processing (NLP) tasks. |
30 | Comparing Constraints for Taxonomic Organization | Anne Cocos, Marianna Apidianaki, Chris Callison-Burch | In this paper, we present a head-to-head comparison of six taxonomic organization algorithms that vary with respect to their structural and transitivity constraints, and treatment of synonymy. |
31 | Improving Lexical Choice in Neural Machine Translation | Toan Nguyen, David Chiang | We explore two solutions to the problem of mistranslating rare words in neural machine translation. |
32 | Universal Neural Machine Translation for Extremely Low Resource Languages | Jiatao Gu, Hany Hassan, Jacob Devlin, Victor O.K. Li | In this paper, we propose a new universal machine translation approach focusing on languages with a limited amount of parallel data. |
33 | Classical Structured Prediction Losses for Sequence to Sequence Learning | Sergey Edunov, Myle Ott, Michael Auli, David Grangier, Marc’Aurelio Ranzato | In this paper, we survey a range of classical objective functions that have been widely used to train linear models for structured prediction and apply them to neural sequence to sequence models. |
34 | Deep Dirichlet Multinomial Regression | Adrian Benton, Mark Dredze | We present deep Dirichlet Multinomial Regression (dDMR), a generative topic model that simultaneously learns document feature representations and topics. |
35 | Microblog Conversation Recommendation via Joint Modeling of Topics and Discourse | Xingshan Zeng, Jing Li, Lu Wang, Nicholas Beauchamp, Sarah Shugars, Kam-Fai Wong | Here we propose a new method for microblog conversation recommendation. |
36 | Before Name-Calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation | Ivan Habernal, Henning Wachsmuth, Iryna Gurevych, Benno Stein | As existing research lacks solid empirical investigation of the typology of ad hominem arguments as well as their potential causes, this paper fills this gap by (1) performing several large-scale annotation studies, (2) experimenting with various neural architectures and validating our working hypotheses, such as controversy or reasonableness, and (3) providing linguistic insights into triggers of ad hominem using explainable neural network architectures. |
37 | Scene Graph Parsing as Dependency Parsing | Yu-Siang Wang, Chenxi Liu, Xiaohui Zeng, Alan Yuille | In this paper, we study the problem of parsing structured knowledge graphs from textual descriptions. |
38 | Learning Visually Grounded Sentence Representations | Douwe Kiela, Alexis Conneau, Allan Jabri, Maximilian Nickel | We investigate grounded sentence representations, where we train a sentence encoder to predict the image features of a given caption (i.e., we try to “imagine” how a sentence would be depicted visually) and use the resultant features as sentence representations. |
39 | Comparatives, Quantifiers, Proportions: a Multi-Task Model for the Learning of Quantities from Vision | Sandro Pezzelle, Ionut-Teodor Sorodoc, Raffaella Bernardi | The present work investigates whether different quantification mechanisms (set comparison, vague quantification, and proportional estimation) can be jointly learned from visual scenes by a multi-task computational model. |
40 | Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets | Wei-Lun Chao, Hexiang Hu, Fei Sha | In this paper, we study a crucial component of this task: how can we design good datasets for the task? |
41 | Abstract Meaning Representation for Paraphrase Detection | Fuad Issa, Marco Damonte, Shay B. Cohen, Xiaohui Yan, Yi Chang | We show that naïve use of AMR in paraphrase detection is not necessarily useful, and turn to describe a technique based on latent semantic analysis in combination with AMR parsing that significantly advances state-of-the-art results in paraphrase detection for the Microsoft Research Paraphrase Corpus. |
42 | attr2vec: Jointly Learning Word and Contextual Attribute Embeddings with Factorization Machines | Fabio Petroni, Vassilis Plachouras, Timothy Nugent, Jochen L. Leidner | In this work, we introduce attr2vec, a novel framework for jointly learning embeddings for words and contextual attributes based on factorization machines. |
43 | Can Network Embedding of Distributional Thesaurus Be Combined with Word Vectors for Better Representation? | Abhik Jana, Pawan Goyal | Motivated by the recent surge of research in network embedding techniques (DeepWalk, LINE, node2vec, etc.), we turn a distributional thesaurus network into dense word vectors and investigate the usefulness of distributional thesaurus embedding in improving overall word representation. |
44 | Deep Neural Models of Semantic Shift | Alex Rosenfeld, Katrin Erk | In this paper, we propose a deep neural network diachronic distributional model. |
45 | Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection | Haw-Shiuan Chang, Ziyun Wang, Luke Vilnis, Andrew McCallum | This paper introduces distributional inclusion vector embedding (DIVE), a simple-to-implement unsupervised method of hypernym discovery via per-word non-negative vector embeddings which preserve the inclusion property of word contexts. |
46 | Mining Possessions: Existence, Type and Temporal Anchors | Dhivya Chinnappa, Eduardo Blanco | This paper presents a corpus and experiments to mine possession relations from text. |
47 | Neural Tensor Networks with Diagonal Slice Matrices | Takahiro Ishihara, Katsuhiko Hayashi, Hitoshi Manabe, Masashi Shimbo, Masaaki Nagata | We address these issues by applying eigendecomposition to each slice matrix of a tensor to reduce its number of parameters. |
48 | Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources | Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen | In this paper, we show that constraint-driven vector space specialisation can be extended to unseen words. |
49 | Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features | Matteo Pagliardini, Prakhar Gupta, Martin Jaggi | We present a simple but efficient unsupervised objective to train distributed representations of sentences. |
50 | Learning Domain Representation for Multi-Domain Sentiment Classification | Qi Liu, Yue Zhang, Jiangming Liu | We investigate this problem by learning domain-specific representations of input sentences using neural networks. |
51 | Learning Sentence Representations over Tree Structures for Target-Dependent Classification | Junwen Duan, Xiao Ding, Ting Liu | To address the above issues, we propose a reinforcement learning based approach, which automatically induces target-specific sentence representations over tree structures. |
52 | Relevant Emotion Ranking from Text Constrained with Emotion Relationships | Deyu Zhou, Yang Yang, Yulan He | A novel framework of relevant emotion ranking is proposed to tackle the problem. |
53 | Solving Data Sparsity for Aspect Based Sentiment Analysis Using Cross-Linguality and Multi-Linguality | Md Shad Akhtar, Palaash Sawant, Sukanta Sen, Asif Ekbal, Pushpak Bhattacharyya | In this work we propose to minimize the effect of data sparsity by leveraging bilingual word embeddings learned through a parallel corpus. |
54 | SRL4ORL: Improving Opinion Role Labeling Using Multi-Task Learning with Semantic Role Labeling | Ana Marasović, Anette Frank | With deeper analysis we determine what works and what might be done to make further improvements for ORL. |
55 | Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task | Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, Kenneth Heafield | We demonstrate parallels between neural GEC and low-resource neural MT and successfully adapt several methods from low-resource MT to neural GEC. We further establish guidelines for trustable results in neural GEC and propose a set of model-independent methods for neural GEC that can be easily applied in most GEC settings. |
56 | Robust Cross-Lingual Hypernymy Detection Using Dependency Context | Shyam Upadhyay, Yogarshi Vyas, Marine Carpuat, Dan Roth | We propose BiSparse-Dep, a family of unsupervised approaches for cross-lingual hypernymy detection, which learns sparse, bilingual word embeddings based on dependency contexts. |
57 | Noising and Denoising Natural Language: Diverse Backtranslation for Grammar Correction | Ziang Xie, Guillaume Genthial, Stanley Xie, Andrew Ng, Dan Jurafsky | In this paper, we consider synthesizing parallel data by noising a clean monolingual corpus. |
58 | Self-Training for Jointly Learning to Ask and Answer Questions | Mrinmaya Sachan, Eric Xing | To alleviate these issues, we propose a self-training method for jointly learning to ask as well as answer questions, leveraging unlabeled text along with labeled question answer pairs for learning. |
59 | The Web as a Knowledge-Base for Answering Complex Questions | Alon Talmor, Jonathan Berant | In this paper, we present a novel framework for answering broad and complex questions, assuming answering simple questions is possible using a search engine and a reading comprehension model. To illustrate the viability of our approach, we create a new dataset of complex questions, ComplexWebQuestions, and present a model that decomposes questions and interacts with the web to compute an answer. |
60 | A Meaning-Based Statistical English Math Word Problem Solver | Chao-Chun Liang, Yu-Shiang Wong, Yi-Chung Lin, Keh-Yih Su | We introduce MeSys, a meaning-based approach, for solving English math word problems (MWPs) via understanding and reasoning in this paper. |
61 | Fine-Grained Temporal Orientation and its Relationship with Psycho-Demographic Correlates | Sabyasachi Kamila, Mohammed Hasanuzzaman, Asif Ekbal, Pushpak Bhattacharyya, Andy Way | In this paper, we propose a very first study to demonstrate the association between the sentiment view of the temporal orientation of the users and their different psycho-demographic attributes by analyzing their tweets. |
62 | Querying Word Embeddings for Similarity and Relatedness | Fatemeh Torabi Asr, Robert Zinkov, Michael Jones | We demonstrate the usefulness of context embeddings in predicting asymmetric association between words from a recently published dataset of production norms (Jouravlev & McRae, 2016). |
63 | Semantic Structural Evaluation for Text Simplification | Elior Sulem, Omri Abend, Ari Rappoport | In this paper we propose the first measure to address structural aspects of text simplification, called SAMSA. |
64 | Entity Commonsense Representation for Neural Abstractive Summarization | Reinald Kim Amplayo, Seonjae Lim, Seung-won Hwang | To this end, we leverage an off-the-shelf entity linking system (ELS) to extract linked entities and propose Entity2Topic (E2T), a module easily attachable to a sequence-to-sequence model that transforms a list of entities into a vector representation of the topic of the summary. |
65 | Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies | Max Grusky, Mor Naaman, Yoav Artzi | We present NEWSROOM, a summarization dataset of 1.3 million articles and summaries written by authors and editors in newsrooms of 38 major news publications. |
66 | Polyglot Semantic Parsing in APIs | Kyle Richardson, Jonathan Berant, Jonas Kuhn | In this paper, we explore the idea of polyglot semantic translation, or learning semantic parsing models that are trained on multiple datasets and natural languages. |
67 | Neural Models of Factuality | Rachel Rudinger, Aaron Steven White, Benjamin Van Durme | We present two neural models for event factuality prediction, which yield significant performance gains over previous models on three event factuality datasets: FactBank, UW, and MEANTIME. |
68 | Accurate Text-Enhanced Knowledge Graph Representation Learning | Bo An, Bo Chen, Xianpei Han, Le Sun | To appropriately handle the semantic variety of entities/relations in distinct triples, we propose an accurate text-enhanced knowledge graph representation learning method, which can represent a relation/entity with different representations in different triples by exploiting additional textual information. |
69 | Acquisition of Phrase Correspondences Using Natural Deduction Proofs | Hitomi Yanaka, Koji Mineshima, Pascual Martínez-Gómez, Daisuke Bekki | To solve this problem, we propose a method for detecting paraphrases via natural deduction proofs of semantic relations between sentence pairs. |
70 | Automatic Stance Detection Using End-to-End Memory Networks | Mitra Mohtarami, Ramy Baly, James Glass, Preslav Nakov, Lluís Màrquez, Alessandro Moschitti | We present an effective end-to-end memory network model that jointly (i) predicts whether a given document can be considered as relevant evidence for a given claim, and (ii) extracts snippets of evidence that can be used to reason about the factuality of the target claim. |
71 | Collective Entity Disambiguation with Structured Gradient Tree Boosting | Yi Yang, Ozan Irsoy, Kazi Shefaet Rahman | We present a gradient-tree-boosting-based structured learning model for jointly disambiguating named entities in a document. |
72 | DeepAlignment: Unsupervised Ontology Matching with Refined Word Vectors | Prodromos Kolyvakis, Alexandros Kalousis, Dimitris Kiritsis | In this work, we present a novel entity alignment method which we dub DeepAlignment. |
73 | Efficient Sequence Learning with Group Recurrent Networks | Fei Gao, Lijun Wu, Li Zhao, Tao Qin, Xueqi Cheng, Tie-Yan Liu | In this paper, we propose an efficient architecture to improve the efficiency of such RNN model training, which adopts the group strategy for recurrent layers, while exploiting the representation rearrangement strategy between layers as well as time steps. |
74 | FEVER: a Large-scale Dataset for Fact Extraction and VERification | James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Arpit Mittal | In this paper we introduce a new publicly available dataset for verification against textual sources, FEVER: Fact Extraction and VERification. |
75 | Global Relation Embedding for Relation Extraction | Yu Su, Honglei Liu, Semih Yavuz, Izzeddin Gür, Huan Sun, Xifeng Yan | To combat the wrong labeling problem of distant supervision, we propose to embed textual relations with global statistics of relations, i.e., the co-occurrence statistics of textual and knowledge base relations collected from the entire corpus. |
76 | Implicit Argument Prediction with Event Knowledge | Pengxiang Cheng, Katrin Erk | We propose to train models for implicit argument prediction on a simple cloze task, for which data can be generated automatically at scale. |
77 | Improving Temporal Relation Extraction with a Globally Acquired Statistical Resource | Qiang Ning, Hao Wu, Haoruo Peng, Dan Roth | This paper develops such a resource – a probabilistic knowledge base acquired in the news domain – by extracting temporal relations between events from the New York Times (NYT) articles over a 20-year span (1987-2007). |
78 | Multimodal Named Entity Recognition for Short Social Media Posts | Seungwhan Moon, Leonardo Neves, Vitor Carvalho | We introduce a new task called Multimodal Named Entity Recognition (MNER) for noisy user-generated data such as tweets or Snapchat captions, which comprise short text with accompanying images. To this end, we create a new dataset for MNER called SnapCaptions (Snapchat image-caption pairs submitted to public and crowd-sourced stories with fully annotated named entities). |
79 | Nested Named Entity Recognition Revisited | Arzoo Katiyar, Claire Cardie | We propose a novel recurrent neural network-based approach to simultaneously handle nested named entity recognition and nested entity mention detection. |
80 | Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction | Patrick Verga, Emma Strubell, Andrew McCallum | In response, we propose a model which simultaneously predicts relationships between all mention pairs in a document. We also introduce a new dataset an order of magnitude larger than existing human-annotated biological information extraction datasets and more accurate than distantly supervised alternatives. |
81 | Supervised Open Information Extraction | Gabriel Stanovsky, Julian Michael, Luke Zettlemoyer, Ido Dagan | We present data and methods that enable a supervised learning approach to Open Information Extraction (Open IE). |
82 | Embedding Syntax and Semantics of Prepositions via Tensor Decomposition | Hongyu Gong, Suma Bhat, Pramod Viswanath | In this paper we use word-triple counts (one of the triples being a preposition) to capture a preposition’s interaction with its attachment and complement. |
83 | From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings | Johannes Bjerva, Isabelle Augenstein | We learn distributed language representations, which can be used to predict typological properties on a massively multilingual scale. |
84 | Monte Carlo Syntax Marginals for Exploring and Using Dependency Parses | Katherine Keith, Su Lin Blodgett, Brendan O’Connor | In this work, we propose a transition sampling algorithm to sample from the full joint distribution of parse trees defined by a transition-based parsing model, and demonstrate the use of the samples in probabilistic dependency analysis. |
85 | Neural Particle Smoothing for Sampling from Conditional Sequence Models | Chu-Cheng Lin, Jason Eisner | We introduce neural particle smoothing, a sequential Monte Carlo method for sampling annotations of an input string from a given probability model. |
86 | Neural Syntactic Generative Models with Exact Marginalization | Jan Buys, Phil Blunsom | We present neural syntactic generative models with exact marginalization that support both dependency parsing and language modeling. |
87 | Noise-Robust Morphological Disambiguation for Dialectal Arabic | Nasser Zalmout, Alexander Erdmann, Nizar Habash | We present a neural morphological tagging and disambiguation model for Egyptian Arabic, with various extensions to handle noisy and inconsistent content. |
88 | Parsing Tweets into Universal Dependencies | Yijia Liu, Yi Zhu, Wanxiang Che, Bing Qin, Nathan Schneider, Noah A. Smith | To overcome the annotation noise without sacrificing computational efficiency, we propose a new method to distill an ensemble of 20 transition-based parsers into a single one. |
89 | Robust Multilingual Part-of-Speech Tagging via Adversarial Training | Michihiro Yasunaga, Jungo Kasai, Dragomir Radev | In this paper, we propose and analyze a neural POS tagging model that exploits AT. |
90 | Universal Dependency Parsing for Hindi-English Code-Switching | Irshad Bhat, Riyaz A. Bhat, Manish Shrivastava, Dipti Sharma | In this paper, we investigate these indispensable processes and other problems associated with syntactic parsing of code-switching data and propose methods to mitigate their effects. |
91 | What’s Going On in Neural Constituency Parsers? An Analysis | David Gaddy, Mitchell Stern, Dan Klein | The goal of this work is to analyze the extent to which information provided directly by the model structure in classical systems is still being captured by neural methods. |
92 | Deep Generative Model for Joint Alignment and Word Representation | Miguel Rios, Wilker Aziz, Khalil Sima’an | This work exploits translation data as a source of semantically relevant learning signal for models of word representation. |
93 | Learning Word Embeddings for Low-Resource Languages by PU Learning | Chao Jiang, Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang | In this paper, we study how to effectively learn a word embedding model on a corpus with only a few million tokens. |
94 | Exploring the Role of Prior Beliefs for Argument Persuasion | Esin Durmus, Claire Cardie | To study the actual effect of language use vs. prior beliefs on persuasion, we provide a new dataset and propose a controlled setting that takes into consideration two reader-level factors: political and religious ideology. |
95 | Inducing a Lexicon of Abusive Words — a Feature-Based Approach | Michael Wiegand, Josef Ruppenhofer, Anna Schmidt, Clayton Greenberg | We propose novel features employing information from both corpora and lexical resources. |
96 | Author Commitment and Social Power: Automatic Belief Tagging to Infer the Social Context of Interactions | Vinodkumar Prabhakaran, Premkumar Ganeshkumar, Owen Rambow | In this paper, we employ advancements in extra-propositional semantics extraction within NLP to study how author commitment reflects the social context of interactions. |
97 | Comparing Automatic and Human Evaluation of Local Explanations for Text Classification | Dong Nguyen | We evaluate a variety of local explanation approaches using automatic measures based on word deletion. |
98 | Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time | Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Bernt Andrassy | We introduce a novel unsupervised neural dynamic topic model named as Recurrent Neural Network-Replicated Softmax Model (RNNRSM), where the discovered topics at each time influence the topic discovery in the subsequent time steps. |
99 | Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation | Shudong Hao, Jordan Boyd-Graber, Michael J. Paul | Because standard metrics fail to accurately measure topic quality when robust external resources are unavailable, we propose an adaptation model that improves the accuracy and reliability of these metrics in low-resource settings. |
100 | Explainable Prediction of Medical Codes from Clinical Text | James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, Jacob Eisenstein | We present an attentional convolutional network that predicts medical codes from clinical text. |
101 | A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference | Adina Williams, Nikita Nangia, Samuel Bowman | This paper introduces the Multi-Genre Natural Language Inference (MultiNLI) corpus, a dataset designed for use in the development and evaluation of machine learning models for sentence understanding. |
102 | Filling Missing Paths: Modeling Co-occurrences of Word Pairs and Dependency Paths for Recognizing Lexical Semantic Relations | Koki Washio, Tsuneaki Kato | In this paper, we propose novel methods with a neural model of P(path|w1,w2) to solve this problem. |
103 | Specialising Word Vectors for Lexical Entailment | Ivan Vulić, Nikola Mrkšić | We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. |
104 | Cross-Lingual Abstract Meaning Representation Parsing | Marco Damonte, Shay B. Cohen | Abstract Meaning Representation (AMR) research has mostly focused on English. |
105 | Sentences with Gapping: Parsing and Reconstructing Elided Predicates | Sebastian Schuster, Joakim Nivre, Christopher D. Manning | In this paper, we present two methods for parsing to a Universal Dependencies graph representation that explicitly encodes the elided material with additional nodes and edges. |
106 | A Structured Syntax-Semantics Interface for English-AMR Alignment | Ida Szubert, Adam Lopez, Nathan Schneider | To test it, we devise an expressive framework to align AMR graphs to dependency graphs, which we use to annotate 200 AMRs. |
107 | End-to-End Graph-Based TAG Parsing with Neural Networks | Jungo Kasai, Robert Frank, Pauli Xu, William Merrill, Owen Rambow | We present a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and character-level CNNs. |
108 | Colorless Green Recurrent Networks Dream Hierarchically | Kristina Gulordava, Piotr Bojanowski, Edouard Grave, Tal Linzen, Marco Baroni | We investigate to what extent RNNs learn to track abstract hierarchical syntactic structure. |
109 | Diverse Few-Shot Text Classification with Multiple Metrics | Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Yu Cheng, Gerald Tesauro, Haoyu Wang, Bowen Zhou | To alleviate the problem, we propose an adaptive metric learning approach that automatically determines the best weighted combination from a set of metrics obtained from meta-training tasks for a newly seen few-shot task. |
110 | Early Text Classification Using Multi-Resolution Concept Representations | Adrian Pastor López-Monroy, Fabio A. González, Manuel Montes, Hugo Jair Escalante, Thamar Solorio | This paper proposes a novel document representation to improve the early detection of risks in social media sources. |
111 | Multinomial Adversarial Networks for Multi-Domain Text Classification | Xilun Chen, Claire Cardie | In this work, we propose a multinomial adversarial network (MAN) to tackle this real-world problem of multi-domain text classification (MDTC) in which labeled data may exist for multiple domains, but in insufficient amounts to train effective classifiers for one or more of the domains. |
112 | Pivot Based Language Modeling for Improved Neural Domain Adaptation | Yftah Ziser, Roi Reichart | In this paper we present the Pivot Based Language Model (PBLM), a representation learning model that marries together pivot-based and NN modeling in a structure aware manner. |
113 | Reinforced Co-Training | Jiawei Wu, Lei Li, William Yang Wang | In this paper, we propose a novel method, Reinforced Co-Training, to select high-quality unlabeled samples to better co-train on. |
114 | Tensor Product Generation Networks for Deep NLP Modeling | Qiuyuan Huang, Paul Smolensky, Xiaodong He, Li Deng, Dapeng Wu | We present a new approach to the design of deep networks for natural language processing (NLP), based on the general technique of Tensor Product Representations (TPRs) for encoding and processing symbol structures in distributed neural networks. |
115 | The Context-Dependent Additive Recurrent Neural Net | Quan Hung Tran, Tuan Lai, Gholamreza Haffari, Ingrid Zukerman, Trung Bui, Hung Bui | In this paper, we propose a novel family of Recurrent Neural Network unit: the Context-dependent Additive Recurrent Neural Network (CARNN) that is designed specifically to address this type of problem. |
116 | Combining Character and Word Information in Neural Machine Translation Using a Multi-Level Attention | Huadong Chen, Shujian Huang, David Chiang, Xinyu Dai, Jiajun Chen | In this paper, we improve the model by incorporating multiple levels of granularity. |
117 | Dense Information Flow for Neural Machine Translation | Yanyao Shen, Xu Tan, Di He, Tao Qin, Tie-Yan Liu | Inspired by the success of the DenseNet model in computer vision problems, in this paper, we propose a densely connected NMT architecture (DenseNMT) that is able to train more efficiently for NMT. |
118 | Evaluating Discourse Phenomena in Neural Machine Translation | Rachel Bawden, Rico Sennrich, Alexandra Birch, Barry Haddow | In this article, we present hand-crafted, discourse test sets, designed to test the models’ ability to exploit previous source and target sentences. |
119 | Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation | Matt Post, David Vilar | We present an algorithm for lexically constrained decoding with a complexity of O(1) in the number of constraints. |
120 | Guiding Neural Machine Translation with Retrieved Translation Pieces | Jingyi Zhang, Masao Utiyama, Eiichro Sumita, Graham Neubig, Satoshi Nakamura | In this paper, we propose a simple, fast, and effective method for recalling previously seen translation examples and incorporating them into the NMT decoding process. |
121 | Handling Homographs in Neural Machine Translation | Frederick Liu, Han Lu, Graham Neubig | We then proceed to describe methods, inspired by the word sense disambiguation literature, that model the context of the input word with context-aware word embeddings that help to differentiate the word sense before feeding it into the encoder. |
122 | Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets | Zhen Yang, Wei Chen, Feng Wang, Bo Xu | This paper proposes an approach for applying GANs to NMT. |
123 | Neural Machine Translation for Bilingually Scarce Scenarios: a Deep Multi-Task Learning Approach | Poorya Zaremoodi, Gholamreza Haffari | In this paper, we use monolingual linguistic resources in the source side to address this challenging problem based on a multi-task learning approach. |
124 | Self-Attentive Residual Decoder for Neural Machine Translation | Lesly Miculicich Werlen, Nikolaos Pappas, Dhananjay Ram, Andrei Popescu-Belis | To address this limitation, we propose a target-side-attentive residual recurrent network for decoding, where attention over previous words contributes directly to the prediction of the next word. |
125 | Target Foresight Based Attention for Neural Machine Translation | Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, Max Meng | In this paper, we propose a new attention model enhanced by the implicit information of target foresight word oriented to both alignment and translation tasks. |
126 | Context Sensitive Neural Lemmatization with Lematus | Toms Bergmanis, Sharon Goldwater | We introduce Lematus, a lemmatizer based on a standard encoder-decoder architecture, which incorporates character-level sentence context. |
127 | Modeling Noisiness to Recognize Named Entities using Multitask Neural Networks on Social Media | Gustavo Aguilar, Adrian Pastor López-Monroy, Fabio González, Thamar Solorio | We present two systems that address the challenges of processing social media data using character-level phonetics and phonology, word embeddings, and Part-of-Speech tags as features. |
128 | Reusing Weights in Subword-Aware Neural Language Models | Zhenisbek Assylbekov, Rustem Takhanov | We propose several ways of reusing subword embeddings and other weights in subword-aware neural language models. |
129 | Simple Models for Word Formation in Slang | Vivek Kulkarni, William Yang Wang | We propose the first generative models for three types of extra-grammatical word formation phenomena abounding in slang: Blends, Clippings, and Reduplicatives. |
130 | Using Morphological Knowledge in Open-Vocabulary Neural Language Models | Austin Matthews, Graham Neubig, Chris Dyer | We introduce an open-vocabulary language model that incorporates more sophisticated linguistic knowledge by predicting words using a mixture of three generative processes: (1) by generating words as a sequence of characters, (2) by directly generating full word forms, and (3) by generating words as a sequence of morphemes that are combined using a hand-written morphological analyzer. |
131 | A Neural Layered Model for Nested Named Entity Recognition | Meizhi Ju, Makoto Miwa, Sophia Ananiadou | To address this issue, we propose a novel neural model to identify nested entities by dynamically stacking flat NER layers. |
132 | DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference | Reza Ghaeini, Sadid A. Hasan, Vivek Datla, Joey Liu, Kathy Lee, Ashequl Qadir, Yuan Ling, Aaditya Prakash, Xiaoli Fern, Oladimeji Farri | We present a novel deep learning architecture to address the natural language inference (NLI) task. |
133 | KBGAN: Adversarial Learning for Knowledge Graph Embeddings | Liwei Cai, William Yang Wang | We introduce KBGAN, an adversarial learning framework to improve the performances of a wide range of existing knowledge graph embedding models. |
134 | Multimodal Frame Identification with Multilingual Evaluation | Teresa Botschen, Iryna Gurevych, Jan-Christoph Klie, Hatem Mousselly-Sergieh, Stefan Roth | In this paper, we extend a state-of-the-art FrameId system in order to effectively leverage multimodal representations. |
135 | Learning Joint Semantic Parsers from Disjoint Data | Hao Peng, Sam Thomson, Swabha Swayamdipta, Noah A. Smith | We present a new approach to learning a semantic parser from multiple datasets, even when the target semantic formalisms are drastically different and the underlying corpora do not overlap. |
136 | Identifying Semantic Divergences in Parallel Text without Annotations | Yogarshi Vyas, Xing Niu, Marine Carpuat | Recognizing that even correct translations are not always semantically equivalent, we automatically detect meaning divergences in parallel sentence pairs with a deep neural model of bilingual semantic similarity which can be trained for any parallel corpus without any manual annotation. |
137 | Bootstrapping Generators from Noisy Data | Laura Perez-Beltrachini, Mirella Lapata | In this paper we aim to bootstrap generators from large scale datasets where the data (e.g., DBPedia facts) and related texts (e.g., Wikipedia abstracts) are loosely aligned. |
138 | SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation | Ye Zhang, Nan Ding, Radu Soricut | We describe a family of model architectures capable of capturing both generic language characteristics via shared model parameters, as well as particular style characteristics via private model parameters. |
139 | Generating Descriptions from Structured Data Using a Bifocal Attention Mechanism and Gated Orthogonalization | Preksha Nema, Shreyas Shetty, Parag Jain, Anirban Laha, Karthik Sankaranarayanan, Mitesh M. Khapra | In this work, we focus on the task of generating natural language descriptions from a structured table of facts containing fields (such as nationality, occupation, etc) and values (such as Indian, actor, director, etc). We also introduce two similar datasets for French and German. |
140 | CliCR: a Dataset of Clinical Case Reports for Machine Reading Comprehension | Simon Šuster, Walter Daelemans | We present a new dataset for machine comprehension in the medical domain. |
141 | Learning to Collaborate for Question Answering and Asking | Duyu Tang, Nan Duan, Zhao Yan, Zhirui Zhang, Yibo Sun, Shujie Liu, Yuanhua Lv, Ming Zhou | In this paper, we give a systematic study that seeks to leverage the connection to improve both QA and QG. |
142 | Learning to Rank Question-Answer Pairs Using Hierarchical Recurrent Encoder with Latent Topic Clustering | Seunghyun Yoon, Joongbo Shin, Kyomin Jung | In this paper, we propose a novel end-to-end neural architecture for ranking candidate answers, that adapts a hierarchical recurrent neural network and a latent topic clustering module. |
143 | Supervised and Unsupervised Transfer Learning for Question Answering | Yu-An Chung, Hung-Yi Lee, James Glass | In this paper, we conduct extensive experiments to investigate the transferability of knowledge learned from a source QA dataset to a target dataset using two QA models. |
144 | Tracking State Changes in Procedural Text: a Challenge Dataset and Models for Process Paragraph Comprehension | Bhavana Dalvi, Lifu Huang, Niket Tandon, Wen-tau Yih, Peter Clark | We present a new dataset and models for comprehending paragraphs about processes (e.g., photosynthesis), an important genre of text describing a dynamic world. We are releasing the ProPara dataset and our models to the community. |
145 | Combining Deep Learning and Topic Modeling for Review Understanding in Context-Aware Recommendation | Mingmin Jin, Xin Luo, Huiling Zhu, Hankz Hankui Zhuo | In this paper, we investigate the approach to effectively utilize review information for recommender systems. |
146 | Deconfounded Lexicon Induction for Interpretable Social Science | Reid Pryzant, Kelly Shen, Dan Jurafsky, Stefan Wagner | We introduce two deep learning algorithms for the task. |
147 | Detecting Denial-of-Service Attacks from Social Media Text: Applying NLP to Computer Security | Nathanael Chambers, Ben Fry, James McMasters | We describe two learning frameworks for this task: a feed-forward neural network and a partially labeled LDA model. |
148 | The Importance of Calibration for Estimating Proportions from Annotations | Dallas Card, Noah A. Smith | The Importance of Calibration for Estimating Proportions from Annotations |
149 | A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications | Dongyeop Kang, Waleed Ammar, Bhavana Dalvi, Madeleine van Zuylen, Sebastian Kohlmeier, Eduard Hovy, Roy Schwartz | We describe the data collection process and report interesting observed phenomena in the peer reviews. We present the first public dataset of scientific peer reviews available for research purposes (PeerRead v1),1 providing an opportunity to study this important artifact. |
150 | Deep Communicating Agents for Abstractive Summarization | Asli Celikyilmaz, Antoine Bosselut, Xiaodong He, Yejin Choi | We present deep communicating agents in an encoder-decoder architecture to address the challenges of representing a long document for abstractive summarization. |
151 | Encoding Conversation Context for Neural Keyphrase Extraction from Microblog Posts | Yingyi Zhang, Jing Li, Yan Song, Chengzhi Zhang | In this paper, we present a neural keyphrase extraction framework for microblog posts that takes their conversation context into account, where four types of neural encoders, namely, averaged embedding, RNN, attention, and memory networks, are proposed to represent the conversation context. |
152 | Estimating Summary Quality with Pairwise Preferences | Markus Zopf | In this paper, we propose an alternative evaluation approach based on pairwise preferences of sentences. |
153 | Generating Topic-Oriented Summaries Using Neural Attention | Kundan Krishna, Balaji Vasan Srinivasan | In this paper, we propose an attention based RNN framework to generate multiple summaries of a single document tuned to different topics of interest. |
154 | Generative Bridging Network for Neural Sequence Prediction | Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou | In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network). |
155 | Higher-Order Syntactic Attention Network for Longer Sentence Compression | Hidetaka Kamigaito, Katsuhiko Hayashi, Tsutomu Hirao, Masaaki Nagata | To solve this problem, we propose a higher-order syntactic attention network (HiSAN) that can handle higher-order dependency features as an attention distribution on LSTM hidden states. |
156 | Neural Storyline Extraction Model for Storyline Generation from News Articles | Deyu Zhou, Linsen Guo, Yulan He | In this paper, we propose a novel neural network based approach to extract structured representations and evolution patterns of storylines without using annotated data. |
157 | Provable Fast Greedy Compressive Summarization with Any Monotone Submodular Function | Shinsaku Sakaue, Tsutomu Hirao, Masaaki Nishino, Masaaki Nagata | In this paper, we propose a fast greedy method for compressive summarization. |
158 | Ranking Sentences for Extractive Summarization with Reinforcement Learning | Shashi Narayan, Shay B. Cohen, Mirella Lapata | In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective. |
159 | Relational Summarization for Corpus Analysis | Abram Handler, Brendan O’Connor | This work introduces a new problem, relational summarization, in which the goal is to generate a natural language summary of the relationship between two lexical items in a corpus, without reference to a knowledge base. |
160 | What’s This Movie About? A Joint Neural Network Architecture for Movie Content Analysis | Philip John Gorinski, Mirella Lapata | We present a novel end-to-end model for overview generation, consisting of a multi-label encoder for identifying screenplay attributes, and an LSTM decoder to generate natural language sentences conditioned on the identified attributes. We create a dataset that consists of movie scripts, attribute-value pairs for the movies’ aspects, as well as overviews, which we extract from an online database. |
161 | Which Scores to Predict in Sentence Regression for Text Summarization? | Markus Zopf, Eneldo Loza Mencía, Johannes Fürnkranz | In this paper, we show in extensive experiments that following this intuition leads to suboptimal results and that learning to predict ROUGE precision scores leads to better results. |
162 | A Hierarchical Latent Structure for Variational Conversation Modeling | Yookoon Park, Jaemin Cho, Gunhee Kim | To solve the degeneration problem, we propose a novel model named Variational Hierarchical Conversation RNNs (VHCR), involving two key ideas of (1) using a hierarchical structure of latent variables, and (2) exploiting an utterance drop regularization. |
163 | Detecting Egregious Conversations between Customers and Virtual Agents | Tommy Sandbank, Michal Shmueli-Scheuer, Jonathan Herzig, David Konopnicki, John Richards, David Piorkowski | In this paper, we outline an approach to detecting such egregious conversations, using behavioral cues from the user, patterns in agent responses, and user-agent interaction. |
164 | Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking | Jyun-Yu Jiang, Francine Chen, Yan-Ying Chen, Wei Wang | In this paper, we propose to leverage representation learning for conversation disentanglement. |
165 | Variational Knowledge Graph Reasoning | Wenhu Chen, Wenhan Xiong, Xifeng Yan, William Yang Wang | In this paper, we tackle a practical query answering task involving predicting the relation of a given entity pair. |
166 | Inducing Temporal Relations from Time Anchor Annotation | Fei Cheng, Yusuke Miyao | In this paper, we propose a new approach to obtain temporal relations from absolute time value (a.k.a. time anchors), which is suitable for texts containing rich temporal information such as news articles. |
167 | ELDEN: Improved Entity Linking Using Densified Knowledge Graphs | Priya Radhakrishnan, Partha Talukdar, Vasudeva Varma | In this paper, we propose Entity Linking using Densified Knowledge Graphs (ELDEN). |
168 | Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions | Hai Ye, Xin Jiang, Zhunchen Luo, Wenhan Chao | In this paper, we propose to study the problem of court view generation from the fact description in a criminal case. |
169 | Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer | Juncen Li, Robin Jia, He He, Percy Liang | In this paper, we propose simpler methods motivated by the observation that text attributes are often marked by distinctive phrases (e.g., “too small”). |
170 | Adversarial Example Generation with Syntactically Controlled Paraphrase Networks | Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer | We propose syntactically controlled paraphrase networks (SCPNs) and use them to generate adversarial examples. |
171 | Sentiment Analysis: It’s Complicated! | Kian Kenyon-Dean, Eisha Ahmed, Scott Fujimoto, Jeremy Georges-Filteau, Christopher Glasz, Barleen Kaur, Auguste Lalande, Shruti Bhanderi, Robert Belfer, Nirmal Kanagasabai, Roman Sarrazingendron, Rohit Verma, Derek Ruths | We therefore propose the notion of a “complicated” class of sentiment to categorize such text, and argue that its inclusion in the short-text sentiment analysis framework will improve the quality of automated sentiment analysis systems as they are implemented in real-world settings. |
172 | Multi-Task Learning of Pairwise Sequence Classification Tasks over Disparate Label Spaces | Isabelle Augenstein, Sebastian Ruder, Anders Søgaard | We evaluate our approach on a variety of tasks with disparate label spaces. |
173 | Word Emotion Induction for Multiple Languages as a Deep Multi-Task Learning Problem | Sven Buechel, Udo Hahn | We here present a solution to get around this language data bottleneck by rephrasing word emotion induction as a multi-task learning problem. |
174 | Human Needs Categorization of Affective Events Using Labeled and Unlabeled Data | Haibo Ding, Ellen Riloff | Our work aims to categorize affective events based upon human need categories that often explain people’s motivations and desires: PHYSIOLOGICAL, HEALTH, LEISURE, SOCIAL, FINANCIAL, COGNITION, and FREEDOM. |
175 | The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants | Ivan Habernal, Henning Wachsmuth, Iryna Gurevych, Benno Stein | In this paper we develop a methodology for reconstructing warrants systematically. |
176 | Linguistic Cues to Deception and Perceived Deception in Interview Dialogues | Sarah Ita Levitan, Angel Maredia, Julia Hirschberg | We explore deception detection in interview dialogues. |
177 | Unified Pragmatic Models for Generating and Following Instructions | Daniel Fried, Jacob Andreas, Dan Klein | We extend these models to tasks with sequential structure. |
178 | Hierarchical Structured Model for Fine-to-Coarse Manifesto Text Analysis | Shivashankar Subramanian, Trevor Cohn, Timothy Baldwin | In this paper we propose a two-stage model for automatically performing both levels of analysis over manifestos. |
179 | Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness | Ivan Sanchez, Jeff Mitchell, Sebastian Riedel | Natural Language Inference is a challenging task that has received substantial attention, and state-of-the-art models now achieve impressive test set performance in the form of accuracy scores. |
180 | Assessing Language Proficiency from Eye Movements in Reading | Yevgeni Berzak, Boris Katz, Roger Levy | We present a novel approach for determining learners’ second language proficiency which utilizes behavioral traces of eye movements during reading. |
181 | Comparing Theories of Speaker Choice Using a Model of Classifier Production in Mandarin Chinese | Meilin Zhan, Roger Levy | In a corpus analysis of Mandarin Chinese, we show that the distribution of speaker choices supports the availability-based production account and not the Uniform Information Density account. |
182 | Spotting Spurious Data with Neural Networks | Hadi Amiri, Timothy Miller, Guergana Savova | In this paper, we present effective approaches inspired by queueing theory and psychology of learning to automatically identify spurious instances in datasets. |
183 | The Timing of Lexical Memory Retrievals in Language Production | Jeremy Cole, David Reitter | This paper explores the time course of lexical memory retrieval by modeling fluent language production. |
184 | Unsupervised Induction of Linguistic Categories with Records of Reading, Speaking, and Writing | Maria Barrett, Ana Valeria González-Garduño, Lea Frermann, Anders Søgaard | This paper shows that performance can be further improved by including data that is readily available or can be easily obtained for most languages, i.e., eye-tracking, speech, or keystroke logs (or any combination thereof). |
185 | Challenging Reading Comprehension on Daily Conversation: Passage Completion on Multiparty Dialog | Kaixin Ma, Tomasz Jurczyk, Jinho D. Choi | This paper presents a new corpus and a robust deep learning architecture for a task in reading comprehension, passage completion, on multiparty dialog. Since there is no dataset that challenges the task of passage completion in this genre, we create a corpus by selecting transcripts from a TV show that comprise 1,681 dialogs, generating passages for each dialog through crowdsourcing, and annotating mentions of characters in both the dialog and the passages. |
186 | Dialog Generation Using Multi-Turn Reasoning Neural Networks | Xianchao Wu, Ander Martínez, Momo Klyen | In this paper, we propose a generalizable dialog generation approach that adapts multi-turn reasoning, one recent advancement in the field of document comprehension, to generate responses (“answers”) by taking current conversation session context as a “document” and current query as a “question”. |
187 | Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems | Bing Liu, Gokhan Tür, Dilek Hakkani-Tür, Pararth Shah, Larry Heck | In this work, we present a hybrid learning method for training task-oriented dialogue systems through online user interactions. |
188 | LSDSCC: a Large Scale Domain-Specific Conversational Corpus for Response Generation with Diversity Oriented Evaluation Metrics | Zhen Xu, Nan Jiang, Bingquan Liu, Wenge Rong, Bowen Wu, Baoxun Wang, Zhuoran Wang, Xiaolong Wang | This paper proposes a Large Scale Domain-Specific Conversational Corpus (LSDSCC) composed of high-quality query-response pairs extracted from a domain-specific online forum, with thorough preprocessing and cleansing procedures. |
189 | EMR Coding with Semi-Parametric Multi-Head Matching Networks | Anthony Rios, Ramakanth Kavuluru | In this paper, we present a new neural network architecture that combines ideas from few-shot learning matching networks, multi-label loss functions, and convolutional neural networks for text classification to significantly outperform other state-of-the-art models. |
190 | Factors Influencing the Surprising Instability of Word Embeddings | Laura Wendlandt, Jonathan K. Kummerfeld, Rada Mihalcea | In this paper, we consider one aspect of embedding spaces, namely their stability. |
191 | Mining Evidences for Concept Stock Recommendation | Qi Liu, Yue Zhang | We investigate the task of mining relevant stocks given a topic of concern on emerging capital markets, for which there is lack of structural understanding. |
192 | Binarized LSTM Language Model | Xuan Liu, Di Cao, Kai Yu | In this paper, a novel binarized LSTM LM is proposed to address the problem. |
193 | Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos | Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann | In this paper, we address recognizing utterance-level emotions in dyadic conversational videos. |
194 | How Time Matters: Learning Time-Decay Attention for Contextual Spoken Language Understanding in Dialogues | Shang-Yu Su, Pei-Chieh Yuan, Yun-Nung Chen | The experiments on the benchmark Dialogue State Tracking Challenge (DSTC4) dataset show that the proposed time-decay attention mechanisms significantly improve the state-of-the-art model for contextual understanding performance. |
195 | Towards Understanding Text Factors in Oral Reading | Anastassia Loukina, Van Rynald T. Liceralde, Beata Beigman Klebanov | Using a case study, we show that variation in oral reading rate across passages for professional narrators is consistent across readers and much of it can be explained using features of the texts being read. |
196 | Generating Bilingual Pragmatic Color References | Will Monroe, Jennifer Hu, Andrew Jong, Christopher Potts | Using a newly-collected dataset of color reference games in Mandarin Chinese (which we release to the public), we confirm that a variety of constructions display the same sensitivity to contextual difficulty in Chinese and English. |
197 | Learning with Latent Language | Jacob Andreas, Dan Klein, Sergey Levine | This paper aims to show that using the space of natural language strings as a parameter space is an effective way to capture natural task structure. |
198 | Object Counts! Bringing Explicit Detections Back into Image Captioning | Josiah Wang, Pranava Swaroop Madhyastha, Lucia Specia | We provide an in-depth analysis of end-to-end image captioning by exploring a variety of cues that can be derived from such object detections. |
199 | Quantifying the Visual Concreteness of Words and Topics in Multimodal Datasets | Jack Hessel, David Mimno, Lillian Lee | We give an algorithm for automatically computing the visual concreteness of words and topics within multimodal datasets. |
200 | Speaker Naming in Movies | Mahmoud Azab, Mingzhe Wang, Max Smith, Noriyuki Kojima, Jia Deng, Rada Mihalcea | We propose a new model for speaker naming in movies that leverages visual, textual, and acoustic modalities in a unified optimization framework. To evaluate the performance of our model, we introduce a new dataset consisting of six episodes of the Big Bang Theory TV show and eighteen full movies covering different genres. |
201 | Stacking with Auxiliary Features for Visual Question Answering | Nazneen Fatema Rajani, Raymond Mooney | In this paper, we describe how we use these various categories of auxiliary features to improve performance for VQA. |
202 | Deep Contextualized Word Representations | Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer | We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). |
203 | Learning to Map Context-Dependent Sentences to Executable Formal Queries | Alane Suhr, Srinivasan Iyer, Yoav Artzi | We propose a context-dependent model to map utterances within an interaction to executable formal queries. |
204 | Neural Text Generation in Stories Using Entity Representations as Context | Elizabeth Clark, Yangfeng Ji, Noah A. Smith | We introduce an approach to neural text generation that explicitly represents entities mentioned in the text. |
205 | Recurrent Neural Networks as Weighted Language Recognizers | Yining Chen, Sorcha Gilroy, Andreas Maletti, Jonathan May, Kevin Knight | We investigate the computational complexity of various problems for simple recurrent neural networks (RNNs) as formal models for recognizing weighted languages. |
TABLE 2: NAACL 2018 Short Papers
Title | Authors | Highlight | |
---|---|---|---|
1 | Enhanced Word Representations for Bridging Anaphora Resolution | Yufang Hou | Most current models of word representations (e.g., GloVe) have successfully captured fine-grained semantics. |
2 | Gender Bias in Coreference Resolution | Rachel Rudinger, Jason Naradowsky, Brian Leonard, Benjamin Van Durme | We present an empirical study of gender bias in coreference resolution systems. |
3 | Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods | Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang | In this paper, we introduce a new benchmark for co-reference resolution focused on gender bias, WinoBias. |
4 | Integrating Stance Detection and Fact Checking in a Unified Corpus | Ramy Baly, Mitra Mohtarami, James Glass, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov | In this paper, we support the interdependencies between these tasks as annotations in the same corpus. |
5 | Is Something Better than Nothing? Automatically Predicting Stance-based Arguments Using Deep Learning and Small Labelled Dataset | Pavithra Rajendran, Danushka Bollegala, Simon Parsons | In this paper, we investigate the use of weakly supervised and semi-supervised methods for automatically annotating data, and thus providing large annotated datasets. |
6 | Multi-Task Learning for Argumentation Mining in Low-Resource Settings | Claudia Schulz, Steffen Eger, Johannes Daxenberger, Tobias Kahse, Iryna Gurevych | We investigate whether and where multi-task learning (MTL) can improve performance on NLP problems related to argumentation mining (AM), in particular argument component identification. |
7 | Neural Models for Reasoning over Multiple Mentions Using Coreference | Bhuwan Dhingra, Qiao Jin, Zhilin Yang, William Cohen, Ruslan Salakhutdinov | We present a recurrent layer which is instead biased towards coreferent dependencies. |
8 | Automatic Dialogue Generation with Expressed Emotions | Chenyang Huang, Osmar Zaïane, Amine Trabelsi, Nouha Dziri | In this research, we address the problem of forcing the dialogue generation to express emotion. |
9 | Guiding Generation for Abstractive Text Summarization Based on Key Information Guide Network | Chenliang Li, Weiran Xu, Si Li, Sheng Gao | We propose a guiding generation model that combines the extractive method and the abstractive method. |
10 | Natural Language Generation by Hierarchical Decoding with Linguistic Patterns | Shang-Yu Su, Kai-Ling Lo, Yi-Ting Yeh, Yun-Nung Chen | This paper introduces a hierarchical decoding NLG model based on linguistic patterns in different levels, and shows that the proposed method outperforms the traditional one with a smaller model size. |
11 | Neural Poetry Translation | Marjan Ghazvininejad, Yejin Choi, Kevin Knight | We present the first neural poetry translation system. |
12 | RankME: Reliable Human Ratings for Natural Language Generation | Jekaterina Novikova, Ondřej Dušek, Verena Rieser | We present a novel rank-based magnitude estimation method (RankME), which combines the use of continuous scales and relative assessments. |
13 | Sentence Simplification with Memory-Augmented Neural Networks | Tu Vu, Baotian Hu, Tsendsuren Munkhdalai, Hong Yu | In this paper, we adapt an architecture with augmented memory capacities called Neural Semantic Encoders (Munkhdalai and Yu, 2017) for sentence simplification. |
14 | A Corpus of Non-Native Written English Annotated for Metaphor | Beata Beigman Klebanov, Chee Wee (Ben) Leong, Michael Flor | We present a corpus of 240 argumentative essays written by non-native speakers of English annotated for metaphor. |
15 | A Simple and Effective Approach to the Story Cloze Test | Siddarth Srinivasan, Richa Arora, Mark Riedl | Following this approach, we present a simpler fully-neural approach to the Story Cloze Test using skip-thought embeddings of the stories in a feed-forward network that achieves close to state-of-the-art performance on this task without any feature engineering. |
16 | An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols | Chaitanya Kulkarni, Wei Xu, Alan Ritter, Raghu Machiraju | We describe an effort to annotate a corpus of natural language instructions consisting of 622 wet lab protocols to facilitate automatic or semi-automatic conversion of protocols into a machine-readable format and benefit biological research. |
17 | Annotation Artifacts in Natural Language Inference Data | Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel Bowman, Noah A. Smith | We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. |
18 | Humor Recognition Using Deep Learning | Peng-Yu Chen, Von-Wun Soo | In this paper, we construct and collect four datasets with distinct joke types in both English and Chinese and conduct learning experiments on humor recognition. |
19 | Leveraging Intra-User and Inter-User Representation Learning for Automated Hate Speech Detection | Jing Qian, Mai ElSherief, Elizabeth Belding, William Yang Wang | In this paper, we radically improve automated hate speech detection by presenting a novel model that leverages intra-user and inter-user representation learning for robust hate speech detection on Twitter. |
20 | Reference-less Measure of Faithfulness for Grammatical Error Correction | Leshem Choshen, Omri Abend | We propose USim, a semantic measure for Grammatical Error Correction (GEC) that measures the semantic faithfulness of the output to the source, thereby complementing existing reference-less measures (RLMs) for measuring the output’s grammaticality. |
21 | Training Structured Prediction Energy Networks with Indirect Supervision | Amirmohammad Rooshenas, Aishwarya Kamath, Andrew McCallum | This paper introduces rank-based training of structured prediction energy networks (SPENs). |
22 | Sí O No, Què Penses? Catalonian Independence and Linguistic Identity on Social Media | Ian Stewart, Yuval Pinter, Jacob Eisenstein | This study examines the use of Catalan, a language local to the semi-autonomous region of Catalonia in Spain, on Twitter in discourse related to the 2017 independence referendum. |
23 | A Transition-Based Algorithm for Unrestricted AMR Parsing | David Vilares, Carlos Gómez-Rodríguez | We explore this idea and introduce a greedy left-to-right non-projective transition-based parser. |
24 | Analogies in Complex Verb Meaning Shifts: the Effect of Affect in Semantic Similarity Models | Maximilian Köper, Sabine Schulte im Walde | We present a computational model to detect and distinguish analogies in meaning shifts between German base and complex verbs. |
25 | Character-Based Neural Networks for Sentence Pair Modeling | Wuwei Lan, Wei Xu | In this paper, we study how effective subword-level (character and character n-gram) representations are in sentence pair modeling. |
26 | Determining Event Durations: Models and Error Analysis | Alakananda Vempala, Eduardo Blanco, Alexis Palmer | This paper presents models to predict event durations. |
27 | Diachronic Usage Relatedness (DURel): A Framework for the Annotation of Lexical Semantic Change | Dominik Schlechtweg, Sabine Schulte im Walde, Stefanie Eckmann | We propose a framework that extends synchronic polysemy annotation to diachronic changes in lexical meaning, to counteract the lack of resources for evaluating computational models of lexical semantic change. |
28 | Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings | Yan Song, Shuming Shi, Jing Li, Haisong Zhang | In this paper, we present directional skip-gram (DSG), a simple but effective enhancement of the skip-gram model by explicitly distinguishing left and right context in word prediction. |
29 | Discriminating between Lexico-Semantic Relations with the Specialization Tensor Model | Goran Glavaš, Ivan Vulić | We present a simple and effective feed-forward neural architecture for discriminating between lexico-semantic relations (synonymy, antonymy, hypernymy, and meronymy). |
30 | Evaluating bilingual word embeddings on the long tail | Fabienne Braune, Viktor Hangya, Tobias Eder, Alexander Fraser | We show that state-of-the-art approaches fail on this task and present simple new techniques to improve bilingual word embeddings for mining rare words. We release new gold standard datasets and code to stimulate research on this task. |
31 | Frustratingly Easy Meta-Embedding — Computing Meta-Embeddings by Averaging Source Word Embeddings | Joshua Coates, Danushka Bollegala | In this paper, we show that the arithmetic mean of two distinct word embedding sets yields a performant meta-embedding that is comparable or better than more complex meta-embedding learning methods. |
32 | Introducing Two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness | Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu | We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity. |
33 | Lexical Substitution for Evaluating Compositional Distributional Models | Maja Buljan, Sebastian Padó, Jan Šnajder | Compositional Distributional Semantic Models (CDSMs) model the meaning of phrases and sentences in vector space. We create a LexSub dataset for CDSM evaluation from a corpus with manual “all-words” LexSub annotation. |
34 | Mittens: an Extension of GloVe for Learning Domain-Specialized Representations | Nicholas Dingwall, Christopher Potts | We present a simple extension of the GloVe representation learning model that begins with general-purpose representations and updates them based on data from a specialized domain. |
35 | Olive Oil is Made of Olives, Baby Oil is Made for Babies: Interpreting Noun Compounds Using Paraphrases in a Neural Model | Vered Shwartz, Chris Waterson | We explore a neural paraphrasing approach that demonstrates superior performance when such memorization is not possible. |
36 | Semantic Pleonasm Detection | Omid Kashefi, Andrew T. Lucas, Rebecca Hwa | To aid the development of systems that detect pleonasms in text, we introduce an annotated corpus of semantic pleonasms. |
37 | Similarity Measures for the Detection of Clinical Conditions with Verbal Fluency Tasks | Felipe Paula, Rodrigo Wilkens, Marco Idiart, Aline Villavicencio | In this work, we investigate three similarity measures for automatically identifying switches in semantic chains: semantic similarity from a manually constructed resource, and word association strength and semantic relatedness, both calculated from corpora. |
38 | Sluice Resolution without Hand-Crafted Features over Brittle Syntax Trees | Ola Rønning, Daniel Hardt, Anders Søgaard | Syntactic information is arguably important for sluice resolution, but we show that multi-task learning with partial parsing as auxiliary tasks effectively closes the gap and buys us an additional 9% error reduction over previous work. |
39 | The Word Analogy Testing Caveat | Natalie Schluter | We show that even supposing there were such word analogy regularities that should be detected in the word embeddings obtained via unsupervised means, standard word analogy test implementation practices provide distorted or contrived results. |
40 | Transition-Based Chinese AMR Parsing | Chuan Wang, Bin Li, Nianwen Xue | This paper presents the first AMR parser built on the Chinese AMR bank. |
41 | Knowledge-Enriched Two-Layered Attention Network for Sentiment Analysis | Abhishek Kumar, Daisuke Kawahara, Sadao Kurohashi | We propose a novel two-layered attention network based on Bidirectional Long Short-Term Memory for sentiment analysis. |
42 | Letting Emotions Flow: Success Prediction by Modeling the Flow of Emotions in Books | Suraj Maharjan, Sudipta Kar, Manuel Montes, Fabio A. González, Thamar Solorio | In this paper, we model the flow of emotions over a book using recurrent neural networks and quantify its usefulness in predicting success in books. |
43 | Modeling Inter-Aspect Dependencies for Aspect-Based Sentiment Analysis | Devamanyu Hazarika, Soujanya Poria, Prateek Vij, Gangeshwar Krishnamurthy, Erik Cambria, Roger Zimmermann | In this paper, we incorporate this pattern by simultaneous classification of all aspects in a sentence along with temporal dependency processing of their corresponding sentence representations using recurrent networks. |
44 | Multi-Task Learning Framework for Mining Crowd Intelligence towards Clinical Treatment | Shweta Yadav, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya, Amit Sheth | In this paper, we present a study where medical users’ opinions on health-related issues are analyzed to capture the medical sentiment at a blog level. |
45 | Recurrent Entity Networks with Delayed Memory Update for Targeted Aspect-Based Sentiment Analysis | Fei Liu, Trevor Cohn, Timothy Baldwin | Motivated by recent advances in memory-augmented models for machine reading, we propose a novel architecture, utilising external “memory chains” with a delayed memory update mechanism to track entities. |
46 | Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation | Roman Grundkiewicz, Marcin Junczys-Dowmunt | We combine two of the most popular approaches to automated Grammatical Error Correction (GEC): GEC based on Statistical Machine Translation (SMT) and GEC based on Neural Machine Translation (NMT). |
47 | Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks | Salman Mohammed, Peng Shi, Jimmy Lin | We examine the problem of question answering over knowledge graphs, focusing on simple questions that can be answered by the lookup of a single fact. |
48 | Looking for Structure in Lexical and Acoustic-Prosodic Entrainment Behaviors | Andreas Weise, Rivka Levitan | We present a negative result of our search, finding no meaningful correlations, clusters, or principal components in various entrainment measures, and discuss practical and theoretical implications. |
49 | Modeling Semantic Plausibility by Injecting World Knowledge | Su Wang, Greg Durrett, Katrin Erk | This paper introduces the task of semantic plausibility: recognizing plausible but possibly novel events. We present a new crowdsourced dataset of semantic plausibility judgments of single events such as man swallow paintball. |
50 | A Bi-Model Based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling | Yu Wang, Yilin Shen, Hongxia Jin | In this paper, new Bi-model based RNN semantic frame parsing network structures are designed to perform the intent detection and slot filling tasks jointly, by considering their cross-impact to each other using two correlated bidirectional LSTMs (BLSTM). |
51 | A Comparison of Two Paraphrase Models for Taxonomy Augmentation | Vassilis Plachouras, Fabio Petroni, Timothy Nugent, Jochen L. Leidner | In this paper, we explore automatic taxonomy augmentation with paraphrases. |
52 | A Laypeople Study on Terminology Identification across Domains and Task Definitions | Anna Hätty, Sabine Schulte im Walde | This paper introduces a new dataset of term annotation. |
53 | A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network | Dai Quoc Nguyen, Tu Dinh Nguyen, Dat Quoc Nguyen, Dinh Phung | In this paper, we propose a novel embedding model, named ConvKB, for knowledge base completion. |
54 | Cross-language Article Linking Using Cross-Encyclopedia Entity Embedding | Chun-Kai Wu, Richard Tzong-Han Tsai | In this paper, we address these problems by proposing cross-encyclopedia entity embedding. |
55 | Identifying the Most Dominant Event in a News Article by Mining Event Coreference Relations | Prafulla Kumar Choubey, Kaushik Raju, Ruihong Huang | Identifying the most dominant and central event of a document, which governs and connects other foreground and background events in the document, is useful for many applications, such as text summarization, storyline generation and text segmentation. |
56 | Improve Neural Entity Recognition via Multi-Task Data Selection and Constrained Decoding | Huasha Zhao, Yi Yang, Qiong Zhang, Luo Si | In this paper, we propose an entity recognition system that improves this neural architecture with two novel techniques. |
57 | Keep Your Bearings: Lightly-Supervised Information Extraction with Ladder Networks That Avoids Semantic Drift | Ajay Nagesh, Mihai Surdeanu | We propose a novel approach to semi-supervised learning for information extraction that uses ladder networks (Rasmus et al., 2015). |
58 | Semi-Supervised Event Extraction with Paraphrase Clusters | James Ferguson, Colin Lockard, Daniel Weld, Hannaneh Hajishirzi | We present a method for self-training event extraction systems by bootstrapping additional training data. |
59 | Structure Regularized Neural Network for Entity Relation Classification for Chinese Literature Text | Ji Wen, Xu Sun, Xuancheng Ren, Qi Su | In this paper, we propose the task of relation classification for Chinese literature text. |
60 | Syntactic Patterns Improve Information Extraction for Medical Search | Roma Patel, Yinfei Yang, Iain Marshall, Ani Nenkova, Byron Wallace | In this paper we demonstrate how features encoding syntactic patterns improve the performance of state-of-the-art sequence tagging models (both neural and linear) for information extraction of these medically relevant categories. |
61 | Syntactically Aware Neural Architectures for Definition Extraction | Luis Espinosa-Anke, Steven Schockaert | In this paper we present a set of neural architectures combining Convolutional and Recurrent Neural Networks, which are further enriched by incorporating linguistic information via syntactic dependencies. |
62 | A Dynamic Oracle for Linear-Time 2-Planar Dependency Parsing | Daniel Fernández-González, Carlos Gómez-Rodríguez | We propose an efficient dynamic oracle for training the 2-Planar transition-based parser, a linear-time parser with over 99% coverage on non-projective syntactic corpora. |
63 | Are Automatic Methods for Cognate Detection Good Enough for Phylogenetic Reconstruction in Historical Linguistics? | Taraka Rama, Johann-Mattis List, Johannes Wahle, Gerhard Jäger | We evaluate the performance of state-of-the-art algorithms for automatic cognate detection by comparing how useful automatically inferred cognates are for the task of phylogenetic inference compared to classical manually annotated cognate sets. |
64 | Automatically Selecting the Best Dependency Annotation Design with Dynamic Oracles | Guillaume Wisniewski, Ophélie Lacroix, François Yvon | This work introduces a new strategy to compare the numerous conventions that have been proposed over the years for expressing dependency structures and discover the one for which a parser will achieve the highest parsing performance. |
65 | Consistent CCG Parsing over Multiple Sentences for Improved Logical Reasoning | Masashi Yoshikawa, Koji Mineshima, Hiroshi Noji, Daisuke Bekki | In this work, we present a simple method to extend an existing CCG parser to parse a set of sentences consistently, which is achieved with an inter-sentence modeling with Markov Random Fields (MRF). |
66 | Exploiting Dynamic Oracles to Train Projective Dependency Parsers on Non-Projective Trees | Lauriane Aufrant, Guillaume Wisniewski, François Yvon | In this work, we propose a simple modification of dynamic oracles, which enables the use of non-projective data when training projective parsers. |
67 | Improving Coverage and Runtime Complexity for Exact Inference in Non-Projective Transition-Based Dependency Parsers | Tianze Shi, Carlos Gómez-Rodríguez, Lillian Lee | We generalize Cohen, Gómez-Rodríguez, and Satta’s (2011) parser to a family of non-projective transition-based dependency parsers allowing polynomial-time exact inference. |
68 | Towards a Variability Measure for Multiword Expressions | Caroline Pasquer, Agata Savary, Jean-Yves Antoine, Carlos Ramisch | Since variability of MWEs is a matter of scale rather than a binary property, we propose a 2-dimensional language-independent measure of variability dedicated to verbal MWEs based on syntactic and discontinuity-related clues. |
69 | Defoiling Foiled Image Captions | Pranava Swaroop Madhyastha, Josiah Wang, Lucia Specia | In this paper, we demonstrate that it is possible to solve this task using simple, interpretable yet powerful representations based on explicit object information over multilayer perceptron models. |
70 | Pragmatically Informative Image Captioning with Character-Level Inference | Reuben Cohn-Gordon, Noah Goodman, Christopher Potts | We instead solve this problem by implementing a version of RSA which operates at the level of characters (“a”, “b”, “c”, …) during the unrolling of the caption. |
71 | Object Ordering with Bidirectional Matchings for Visual Reasoning | Hao Tan, Mohit Bansal | Our model achieves strong improvements (of 4-6% absolute) over the state-of-the-art on both the structured representation and raw image versions of the dataset. |
72 | Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations | Sosuke Kobayashi | We propose a novel data augmentation for labeled sentences called contextual augmentation. |
73 | Cross-Lingual Learning-to-Rank with Shared Representations | Shota Sasaki, Shuo Sun, Shigehiko Schamoni, Kevin Duh, Kentaro Inui | We introduce a large-scale dataset derived from Wikipedia to support CLIR research in 25 languages. |
74 | Self-Attention with Relative Position Representations | Peter Shaw, Jakob Uszkoreit, Ashish Vaswani | In this work we present an alternative approach, extending the self-attention mechanism to efficiently consider representations of the relative positions, or distances between sequence elements. |
75 | Text Segmentation as a Supervised Learning Task | Omri Koshorek, Adir Cohen, Noam Mor, Michael Rotman, Jonathan Berant | In this work, we formulate text segmentation as a supervised learning problem, and present a large new dataset for text segmentation that is automatically extracted and labeled from Wikipedia. |
76 | What’s in a Domain? Learning Domain-Robust Text Representations using Adversarial Training | Yitong Li, Timothy Baldwin, Trevor Cohn | We propose a novel method to optimise both in- and out-of-domain accuracy based on joint learning of a structured neural model with domain-specific and domain-general components, coupled with adversarial training for domain. |
77 | Automated Paraphrase Lattice Creation for HyTER Machine Translation Evaluation | Marianna Apidianaki, Guillaume Wisniewski, Anne Cocos, Chris Callison-Burch | We propose a variant of a well-known machine translation (MT) evaluation metric, HyTER (Dreyer and Marcu, 2012), which exploits reference translations enriched with meaning equivalent expressions. |
78 | Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks | Diego Marcheggiani, Joost Bastings, Ivan Titov | In this work, we are the first to incorporate information about predicate-argument structure of source sentences (namely, semantic-role representations) into neural machine translation. |
79 | Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation | Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Stephan Vogel | We propose a tunable agent which decides the best segmentation strategy for a user-defined BLEU loss and Average Proportion (AP) constraint. |
80 | Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models | David Vilar | In this paper we explore the use of Learning Hidden Unit Contribution for the task of neural machine translation. |
81 | Neural Machine Translation Decoding with Terminology Constraints | Eva Hasler, Adrià de Gispert, Gonzalo Iglesias, Bill Byrne | We describe our approach to constrained neural decoding based on finite-state machines and multi-stack decoding which supports target-side constraints as well as constraints with corresponding aligned input text spans. |
82 | On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference | Adam Poliak, Yonatan Belinkov, James Glass, Benjamin Van Durme | We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena. |
83 | Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation | Nima Pourdamghani, Marjan Ghazvininejad, Kevin Knight | We present a method for improving word alignments using word similarities. |
84 | When and Why Are Pre-Trained Word Embeddings Useful for Neural Machine Translation? | Ye Qi, Devendra Sachan, Matthieu Felix, Sarguna Padmanabhan, Graham Neubig | In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. |
85 | Are All Languages Equally Hard to Language-Model? | Ryan Cotterell, Sebastian J. Mielke, Jason Eisner, Brian Roark | In this work, we develop an evaluation framework for fair cross-linguistic comparison of language models, using translated text so that all models are asked to predict approximately the same information. |
86 | The Computational Complexity of Distinctive Feature Minimization in Phonology | Hubie Chen, Mans Hulden | We analyze the complexity of the problem of determining whether a set of phonemes forms a natural class and, if so, that of finding the minimal feature specification for the class. |
87 | Unsupervised Disambiguation of Syncretism in Inflected Lexicons | Ryan Cotterell, Christo Kirov, Sebastian J. Mielke, Jason Eisner | We present such an approach, which employs a neural network to smoothly model a prior distribution over feature bundles (even rare ones). |
88 | Contextualized Word Representations for Reading Comprehension | Shimi Salant, Jonathan Berant | We take a standard neural architecture for this task, and show that by providing rich contextualized word representations from a large pre-trained language model as well as allowing the model to choose between context-dependent and context-independent word representations, we can obtain dramatic improvements and reach performance comparable to state-of-the-art on the competitive SQuAD dataset. |
89 | Crowdsourcing Question-Answer Meaning Representations | Julian Michael, Gabriel Stanovsky, Luheng He, Ido Dagan, Luke Zettlemoyer | We introduce Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs. |
90 | Leveraging Context Information for Natural Question Generation | Linfeng Song, Zhiguo Wang, Wael Hamza, Yue Zhang, Daniel Gildea | We propose a model that matches the answer with the passage before generating the question. |
91 | Robust Machine Comprehension Models via Adversarial Training | Yicheng Wang, Mohit Bansal | We propose a novel alternative adversary-generation algorithm, AddSentDiverse, that significantly increases the variance within the adversarial training data by providing effective examples that punish the model for making certain superficial assumptions. |
92 | Simple and Effective Semi-Supervised Question Answering | Bhuwan Dhingra, Danish Danish, Dheeraj Rajagopal | In this work, we envision a system where the end user specifies a set of base documents and only a few labelled examples. We are also releasing a set of 3.2M cloze-style questions for practitioners to use while building QA systems. |
93 | TypeSQL: Knowledge-Based Type-Aware Neural Text-to-SQL Generation | Tao Yu, Zifan Li, Zilin Zhang, Rui Zhang, Dragomir Radev | In this paper, we present a novel approach TypeSQL which formats the problem as a slot filling task in a more reasonable way. |
94 | Community Member Retrieval on Social Media Using Textual Information | Aaron Jaech, Shobhit Hathi, Mari Ostendorf | The solution introduces an unsupervised proxy task for learning user embeddings: user re-identification. |
95 | Cross-Domain Review Helpfulness Prediction Based on Convolutional Neural Networks with Auxiliary Domain Discriminators | Cen Chen, Yinfei Yang, Jun Zhou, Xiaolong Li, Forrest Sheng Bao | Therefore, we propose a convolutional neural network (CNN) based model which leverages both word-level and character-based representations. |
96 | Predicting Foreign Language Usage from English-Only Social Media Posts | Svitlana Volkova, Stephen Ranshous, Lawrence Phillips | This paper presents a large-scale analysis of 6 million tweets produced by 27 thousand multilingual users speaking 12 other languages besides English. |
97 | A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents | Arman Cohan, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Seokhwan Kim, Walter Chang, Nazli Goharian | We propose the first model for abstractive summarization of single, longer-form documents (e.g., research papers). |
98 | A Mixed Hierarchical Attention Based Encoder-Decoder Approach for Standard Table Summarization | Parag Jain, Anirban Laha, Karthik Sankaranarayanan, Preksha Nema, Mitesh M. Khapra, Shreyas Shetty | In this work, we consider summarizing structured data occurring in the form of tables as they are prevalent across a wide variety of domains. |
99 | Effective Crowdsourcing for a New Type of Summarization Task | Youxuan Jiang, Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Walter Lasecki | We propose targeted summarization as an umbrella category for summarization tasks that intentionally consider only parts of the input data. |
100 | Key2Vec: Automatic Ranked Keyphrase Extraction from Scientific Articles using Phrase Embeddings | Debanjan Mahata, John Kuriakose, Rajiv Ratn Shah, Roger Zimmermann | In this paper, we present an unsupervised technique (Key2Vec) that leverages phrase embeddings for ranking keyphrases extracted from scientific articles. |
101 | Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata | Lucie-Aimée Kaffee, Hady Elsahar, Pavlos Vougiouklis, Christophe Gravier, Frédérique Laforest, Jonathon Hare, Elena Simperl | In this work, we investigate the generation of open domain Wikipedia summaries in underserved languages using structured data from Wikidata. |
102 | Multi-Reward Reinforced Summarization with Saliency and Entailment | Ramakanth Pasunuru, Mohit Bansal | In this work, we address these three important aspects of a good summary via a reinforcement learning approach with two novel reward functions: ROUGESal and Entail, on top of a coverage-based baseline. |
103 | Objective Function Learning to Match Human Judgements for Optimization-Based Summarization | Maxime Peyrard, Iryna Gurevych | In this work, we learn a summary-level scoring function θ including human judgments as supervision and automatically generated data as regularization. |
104 | Pruning Basic Elements for Better Automatic Evaluation of Summaries | Ukyo Honda, Tsutomu Hirao, Masaaki Nagata | We propose a simple but highly effective automatic evaluation measure of summarization, pruned Basic Elements (pBE). |
105 | Unsupervised Keyphrase Extraction with Multipartite Graphs | Florian Boudin | We propose an unsupervised keyphrase extraction model that encodes topical information within a multipartite graph structure. |
106 | Where Have I Heard This Story Before? Identifying Narrative Similarity in Movie Remakes | Snigdha Chaturvedi, Shashank Srivastava, Dan Roth | We present a new task and dataset for story understanding: identifying instances of similar narratives from a collection of narrative texts. We present an initial approach for this problem, which finds correspondences between narratives in terms of plot events, and resemblances between characters and their social relationships. |
107 | Multimodal Emoji Prediction | Francesco Barbieri, Miguel Ballesteros, Francesco Ronzano, Horacio Saggion | In this paper we extend recent advances in emoji prediction by putting forward a multimodal approach that is able to predict emojis in Instagram posts. |
108 | Higher-Order Coreference Resolution with Coarse-to-Fine Inference | Kenton Lee, Luheng He, Luke Zettlemoyer | To alleviate the computational cost of this iterative process, we introduce a coarse-to-fine approach that incorporates a less accurate but more efficient bilinear factor, enabling more aggressive pruning without hurting accuracy. |
109 | Non-Projective Dependency Parsing with Non-Local Transitions | Daniel Fernández-González, Carlos Gómez-Rodríguez | We present a novel transition system, based on the Covington non-projective parser, introducing non-local transitions that can directly create arcs involving nodes to the left of the current focus positions. |
110 | Detecting Linguistic Characteristics of Alzheimer’s Dementia by Interpreting Neural Models | Sweta Karlekar, Tong Niu, Mohit Bansal | In this work, we use NLP techniques to classify and analyze the linguistic characteristics of AD patients using the DementiaBank dataset. |
111 | Deep Dungeons and Dragons: Learning Character-Action Interactions from Role-Playing Game Transcripts | Annie Louis, Charles Sutton | We propose role-playing games as a testbed for this problem, and introduce a large corpus of game transcripts collected from online discussion forums. |
112 | Feudal Reinforcement Learning for Dialogue Management in Large Domains | Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Stefan Ultes, Lina M. Rojas-Barahona, Bo-Hsiang Tseng, Milica Gašić | We propose a novel Dialogue Management architecture, based on Feudal RL, which decomposes the decision into two steps: a first step where a master policy selects a subset of primitive actions, and a second step where a primitive action is chosen from the selected subset. |
113 | Evaluating Historical Text Normalization Systems: How Well Do They Generalize? | Alexander Robertson, Sharon Goldwater | We highlight several issues in the evaluation of historical text normalization systems that make it hard to tell how well these systems would actually work in practice, i.e., for new datasets or languages; in comparison to more naïve systems; or as a preprocessing step for downstream NLP tools. |
114 | Gated Multi-Task Network for Text Classification | Liqiang Xiao, Honglun Zhang, Wenqing Chen | In this paper, we introduce gate mechanism into multi-task CNN and propose a new Gated Sharing Unit, which can filter the feature flows between tasks and greatly reduce the interference. |
115 | Natural Language to Structured Query Generation via Meta-Learning | Po-Sen Huang, Chenglong Wang, Rishabh Singh, Wen-tau Yih, Xiaodong He | In this work, we explore a different learning protocol that treats each example as a unique pseudo-task, by reducing the original learning problem to a few-shot meta-learning scenario with the help of a domain-dependent relevance function. |
116 | Smaller Text Classifiers with Discriminative Cluster Embeddings | Mingda Chen, Kevin Gimpel | We propose variations that selectively assign additional parameters to words, which further improves accuracy while still remaining parameter-efficient. |
117 | Role-specific Language Models for Processing Recorded Neuropsychological Exams | Tuka Al Hanai, Rhoda Au, James Glass | This paper demonstrates a method to determine the cognitive health (impaired or not) of 92 subjects, from audio that was diarized using an automatic speech recognition system trained on TED talks and on the structured language used by testers and subjects. |
118 | Slot-Gated Modeling for Joint Slot Filling and Intent Prediction | Chih-Wen Goo, Guang Gao, Yun-Kai Hsu, Chih-Li Huo, Tsung-Chieh Chen, Keng-Wei Hsu, Yun-Nung Chen | Considering that slot and intent have a strong relationship, this paper proposes a slot gate that focuses on learning the relationship between intent and slot attention vectors in order to obtain better semantic frame results by global optimization. |
119 | An Evaluation of Image-Based Verb Prediction Models against Human Eye-Tracking Data | Spandana Gella, Frank Keller | Recent research in language and vision has developed models for predicting and disambiguating verbs from images. |
120 | Learning to Color from Language | Varun Manjunatha, Mohit Iyyer, Jordan Boyd-Graber, Larry Davis | We present two different architectures for language-conditioned colorization, both of which produce more accurate and plausible colorizations than a language-agnostic version. |
121 | Punny Captions: Witty Wordplay in Image Descriptions | Arjun Chandrasekaran, Devi Parikh, Mohit Bansal | In this work, we attempt to build computational models that can produce witty descriptions for a given image. |
122 | The Emergence of Semantics in Neural Network Representations of Visual Information | Dhanush Dharmaretnam, Alona Fyshe | Here we employ techniques previously used to detect semantic representations in the human brain to detect semantic representations in CNNs. |
123 | Visual Referring Expression Recognition: What Do Systems Actually Learn? | Volkan Cirik, Louis-Philippe Morency, Taylor Berg-Kirkpatrick | We present an empirical analysis of state-of-the-art systems for referring expression recognition – the task of identifying the object in an image referred to by a natural language expression – with the goal of gaining insight into how these systems reason about language and vision. |
124 | Visually Guided Spatial Relation Extraction from Text | Taher Rahgooy, Umar Manzoor, Parisa Kordjamshidi | We use various recent vision and language datasets and techniques to train inter-modality alignment models, visual relationship classifiers and propose a novel global inference model to integrate these components into our structured output prediction model for spatial role and relation extraction. |
125 | Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning | Xin Wang, Yuan-Fang Wang, William Yang Wang | In this paper, we propose a novel hierarchically aligned cross-modal attention (HACA) framework to learn and selectively fuse both global and local temporal dynamics of different modalities. |