Paper Digest: NAACL 2018 Highlights
The North American Chapter of the Association for Computational Linguistics (NAACL) hosts one of the top natural language processing conferences in the world. In 2018, the conference was held in New Orleans, Louisiana. There were 1,072 paper submissions, of which 205 were accepted as long papers and 125 were accepted as short papers.
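As a quick sanity check on these figures, the acceptance rates implied by the counts above can be computed directly (a minimal sketch; the numbers are taken from the paragraph above):

```python
# Acceptance statistics for NAACL 2018, using the counts quoted above.
submissions = 1072
long_accepted = 205
short_accepted = 125

total_accepted = long_accepted + short_accepted
overall_rate = 100 * total_accepted / submissions
long_rate = 100 * long_accepted / submissions

print(f"Accepted: {total_accepted} papers")
print(f"Overall acceptance rate: {overall_rate:.1f}%")
print(f"Long-paper acceptance rate: {long_rate:.1f}%")
```

This works out to 330 accepted papers, an overall acceptance rate of roughly 31%.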
To help the AI community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
We thank all authors for writing these interesting papers, and our readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to get new paper updates customized to your own interests on a daily basis.
Paper Digest Team
team@paperdigest.org
TABLE 1: NAACL 2018 Long Papers
No. | Title | Authors | Highlight |
---|---|---|---|
1 | Label-Aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition | Zhenghui Wang, Yanru Qu, Liheng Chen, Jian Shen, Weinan Zhang, Shaodian Zhang, Yimei Gao, Gen Gu, Ken Chen, Yong Yu | In this paper, we propose a label-aware double transfer learning framework (La-DTL) for cross-specialty NER, so that a medical NER system designed for one specialty could be conveniently applied to another one with minimal annotation efforts. |
2 | Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss | Peng Xu, Denilson Barbosa | Instead, we propose an end-to-end solution with a neural network model that uses a variant of cross-entropy loss function to handle out-of-context labels, and hierarchical loss normalization to cope with overly-specific ones. |
3 | Joint Bootstrapping Machines for High Confidence Relation Extraction | Pankaj Gupta, Benjamin Roth, Hinrich Schütze | We introduce BREX, a new bootstrapping method that protects against such contamination by highly effective confidence assessment. |
4 | A Deep Generative Model of Vowel Formant Typology | Ryan Cotterell, Jason Eisner | In our work, we tackle the problem of vowel system typology, i.e., we propose a generative probability model of which vowels a language contains. |
5 | Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages | Katharina Kann, Jesus Manuel Mager Hois, Ivan Vladimir Meza-Ruiz, Hinrich Schütze | We provide our morphological segmentation datasets for Mexicanero, Nahuatl, Wixarika and Yorem Nokki for future research. |
6 | Improving Character-Based Decoding Using Target-Side Morphological Information for Neural Machine Translation | Peyman Passban, Qun Liu, Andy Way | In this paper, we propose an extension to the state-of-the-art model of Chung et al. (2016), which works at the character level and boosts the decoder with target-side morphological information. |
7 | Parsing Speech: a Neural Approach to Integrating Lexical and Acoustic-Prosodic Information | Trang Tran, Shubham Toshniwal, Mohit Bansal, Kevin Gimpel, Karen Livescu, Mari Ostendorf | For automatically parsing spoken utterances, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and prosodic features. |
8 | Tied Multitask Learning for Neural Speech Translation | Antonios Anastasopoulos, David Chiang | We explore multitask models for neural translation of speech, augmenting them in order to reflect two intuitive notions. |
9 | Please Clap: Modeling Applause in Campaign Speeches | Jon Gillick, David Bamman | We introduce a new corpus of speeches from campaign events in the months leading up to the 2016 U.S. presidential election and develop new models for predicting moments of audience applause. |
10 | Attentive Interaction Model: Modeling Changes in View in Argumentation | Yohan Jo, Shivani Poddar, Byungsoo Jeon, Qinlan Shen, Carolyn Rosé, Graham Neubig | We present a neural architecture for modeling argumentative dialogue that explicitly models the interplay between an Opinion Holder’s (OH’s) reasoning and a challenger’s argument, with the goal of predicting if the argument successfully changes the OH’s view. |
11 | Automatic Focus Annotation: Bringing Formal Pragmatics Alive in Analyzing the Information Structure of Authentic Data | Ramon Ziai, Detmar Meurers | Building on the research that established detailed annotation guidelines for manual annotation of information structural concepts for written (Dipper et al., 2007; Ziai and Meurers, 2014) and spoken language data (Calhoun et al., 2010), this paper presents the first approach automating the analysis of focus in authentic written data. |
12 | Dear Sir or Madam, May I Introduce the GYAFC Dataset: Corpus, Benchmarks and Metrics for Formality Style Transfer | Sudha Rao, Joel Tetreault | In this work, we create the largest corpus for a particular stylistic transfer (formality) and show that techniques from the machine translation community can serve as strong baselines for future work. |
13 | Improving Implicit Discourse Relation Classification by Modeling Inter-dependencies of Discourse Units in a Paragraph | Zeyu Dai, Ruihong Huang | With the goal of improving implicit discourse relation classification, we introduce a paragraph-level neural network that models inter-dependencies between discourse units as well as discourse relation continuity and patterns, and predicts a sequence of discourse relations in a paragraph. |
14 | A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation | Juraj Juraska, Panagiotis Karagiannis, Kevin Bowden, Marilyn Walker | We describe an ensemble neural language generator, and present several novel methods for data representation and augmentation that yield improved results in our model. |
15 | A Melody-Conditioned Lyrics Language Model | Kento Watanabe, Yuichiroh Matsubayashi, Satoru Fukayama, Masataka Goto, Kentaro Inui, Tomoyasu Nakano | This paper presents a novel, data-driven language model that produces entire lyrics for a given input melody. |
16 | Discourse-Aware Neural Rewards for Coherent Text Generation | Antoine Bosselut, Asli Celikyilmaz, Xiaodong He, Jianfeng Gao, Po-Sen Huang, Yejin Choi | In this paper, we investigate the use of discourse-aware rewards with reinforcement learning to guide a model to generate long, coherent text. |
17 | Natural Answer Generation with Heterogeneous Memory | Yao Fu, Yansong Feng | In this work, we propose a novel attention mechanism to encourage the decoder to actively interact with the memory by taking its heterogeneity into account. |
18 | Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation | Shuming Ma, Xu Sun, Wei Li, Sujian Li, Wenjie Li, Xuancheng Ren | In this work, we introduce a novel model based on the encoder-decoder framework, called Word Embedding Attention Network (WEAN). |
19 | Simplification Using Paraphrases and Context-Based Lexical Substitution | Reno Kriz, Eleni Miltsakaki, Marianna Apidianaki, Chris Callison-Burch | We propose a complex word identification (CWI) model that exploits both lexical and contextual features, and a simplification mechanism which relies on a word-embedding lexical substitution model to replace the detected complex words with simpler paraphrases. |
20 | Zero-Shot Question Generation from Knowledge Graphs for Unseen Predicates and Entity Types | Hady Elsahar, Christophe Gravier, Frederique Laforest | We present a neural model for question generation from knowledge graphs triples in a “Zero-shot” setup, that is generating questions for predicate, subject types or object types that were not seen at training time. |
21 | Automated Essay Scoring in the Presence of Biased Ratings | Evelin Amorim, Marcia Cançado, Adriano Veloso | We present features to quantify rater bias based on their comments, and we found that rater bias plays an important role in automated essay scoring. To this end, we present a new annotated corpus containing essays and their respective scores. |
22 | Content-Based Citation Recommendation | Chandra Bhagavatula, Sergey Feldman, Russell Power, Waleed Ammar | We present a content-based method for recommending citations in an academic paper draft. |
23 | Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences | Daniel Khashabi, Snigdha Chaturvedi, Michael Roth, Shyam Upadhyay, Dan Roth | We present a reading comprehension challenge in which questions can only be answered by taking into account information from multiple sentences. |
24 | Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input | Youmna Farag, Helen Yannakoudakis, Ted Briscoe | We develop a neural model of local coherence that can effectively learn connectedness features between sentences, and propose a framework for integrating and jointly training the local coherence model with a state-of-the-art AES model. |
25 | QuickEdit: Editing Text & Translations by Crossing Words Out | David Grangier, Michael Auli | We propose a framework for computer-assisted text editing. |
26 | Tempo-Lexical Context Driven Word Embedding for Cross-Session Search Task Extraction | Procheta Sen, Debasis Ganguly, Gareth Jones | By contrast, in this work we seek to identify tasks that span across multiple sessions. |
27 | Zero-Shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens | Marek Rei, Anders Søgaard | Can attention- or gradient-based visualization techniques be used to infer token-level labels for binary sequence tagging problems, using networks trained only on sentence-level labels? |
28 | Variable Typing: Assigning Meaning to Variables in Mathematical Text | Yiannos Stathopoulos, Simon Baker, Marek Rei, Simone Teufel | We introduce variable typing, the task of assigning one mathematical type (multi-word technical terms referring to mathematical concepts) to each variable in a sentence of mathematical text. As part of this work, we also introduce a new annotated data set composed of 33,524 data points extracted from scientific documents published on arXiv. |
29 | Learning beyond Datasets: Knowledge Graph Augmented Neural Networks for Natural Language Processing | Annervaz K M, Somnath Basu Roy Chowdhury, Ambedkar Dukkipati | In this work, we propose to enhance learning models with world knowledge in the form of Knowledge Graph (KG) fact triples for Natural Language Processing (NLP) tasks. |
30 | Comparing Constraints for Taxonomic Organization | Anne Cocos, Marianna Apidianaki, Chris Callison-Burch | In this paper, we present a head-to-head comparison of six taxonomic organization algorithms that vary with respect to their structural and transitivity constraints, and treatment of synonymy. |
31 | Improving Lexical Choice in Neural Machine Translation | Toan Nguyen, David Chiang | We explore two solutions to the problem of mistranslating rare words in neural machine translation. |
32 | Universal Neural Machine Translation for Extremely Low Resource Languages | Jiatao Gu, Hany Hassan, Jacob Devlin, Victor O.K. Li | In this paper, we propose a new universal machine translation approach focusing on languages with a limited amount of parallel data. |
33 | Classical Structured Prediction Losses for Sequence to Sequence Learning | Sergey Edunov, Myle Ott, Michael Auli, David Grangier, Marc’Aurelio Ranzato | In this paper, we survey a range of classical objective functions that have been widely used to train linear models for structured prediction and apply them to neural sequence to sequence models. |
34 | Deep Dirichlet Multinomial Regression | Adrian Benton, Mark Dredze | We present deep Dirichlet Multinomial Regression (dDMR), a generative topic model that simultaneously learns document feature representations and topics. |
35 | Microblog Conversation Recommendation via Joint Modeling of Topics and Discourse | Xingshan Zeng, Jing Li, Lu Wang, Nicholas Beauchamp, Sarah Shugars, Kam-Fai Wong | Here we propose a new method for microblog conversation recommendation. |
36 | Before Name-Calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation | Ivan Habernal, Henning Wachsmuth, Iryna Gurevych, Benno Stein | As existing research lacks solid empirical investigation of the typology of ad hominem arguments as well as their potential causes, this paper fills this gap by (1) performing several large-scale annotation studies, (2) experimenting with various neural architectures and validating our working hypotheses, such as controversy or reasonableness, and (3) providing linguistic insights into triggers of ad hominem using explainable neural network architectures. |
37 | Scene Graph Parsing as Dependency Parsing | Yu-Siang Wang, Chenxi Liu, Xiaohui Zeng, Alan Yuille | In this paper, we study the problem of parsing structured knowledge graphs from textual descriptions. |
38 | Learning Visually Grounded Sentence Representations | Douwe Kiela, Alexis Conneau, Allan Jabri, Maximilian Nickel | We investigate grounded sentence representations, where we train a sentence encoder to predict the image features of a given caption (i.e., we try to “imagine” how a sentence would be depicted visually) and use the resultant features as sentence representations. |
39 | Comparatives, Quantifiers, Proportions: a Multi-Task Model for the Learning of Quantities from Vision | Sandro Pezzelle, Ionut-Teodor Sorodoc, Raffaella Bernardi | The present work investigates whether different quantification mechanisms (set comparison, vague quantification, and proportional estimation) can be jointly learned from visual scenes by a multi-task computational model. |
40 | Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets | Wei-Lun Chao, Hexiang Hu, Fei Sha | In this paper, we study a crucial component of this task: how can we design good datasets for the task? |
41 | Abstract Meaning Representation for Paraphrase Detection | Fuad Issa, Marco Damonte, Shay B. Cohen, Xiaohui Yan, Yi Chang | We show that naïve use of AMR in paraphrase detection is not necessarily useful, and turn to describe a technique based on latent semantic analysis in combination with AMR parsing that significantly advances state-of-the-art results in paraphrase detection for the Microsoft Research Paraphrase Corpus. |
42 | attr2vec: Jointly Learning Word and Contextual Attribute Embeddings with Factorization Machines | Fabio Petroni, Vassilis Plachouras, Timothy Nugent, Jochen L. Leidner | In this work, we introduce attr2vec, a novel framework for jointly learning embeddings for words and contextual attributes based on factorization machines. |
43 | Can Network Embedding of Distributional Thesaurus Be Combined with Word Vectors for Better Representation? | Abhik Jana, Pawan Goyal | Motivated by the recent surge of research in network embedding techniques (DeepWalk, LINE, node2vec, etc.), we turn a distributional thesaurus network into dense word vectors and investigate the usefulness of distributional thesaurus embedding in improving overall word representation. |
44 | Deep Neural Models of Semantic Shift | Alex Rosenfeld, Katrin Erk | In this paper, we propose a deep neural network diachronic distributional model. |
45 | Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection | Haw-Shiuan Chang, Ziyun Wang, Luke Vilnis, Andrew McCallum | This paper introduces distributional inclusion vector embedding (DIVE), a simple-to-implement unsupervised method of hypernym discovery via per-word non-negative vector embeddings which preserve the inclusion property of word contexts. |
46 | Mining Possessions: Existence, Type and Temporal Anchors | Dhivya Chinnappa, Eduardo Blanco | This paper presents a corpus and experiments to mine possession relations from text. |
47 | Neural Tensor Networks with Diagonal Slice Matrices | Takahiro Ishihara, Katsuhiko Hayashi, Hitoshi Manabe, Masashi Shimbo, Masaaki Nagata | We address these issues by applying eigendecomposition to each slice matrix of a tensor to reduce its number of parameters. |
48 | Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources | Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen | In this paper, we show that constraint-driven vector space specialisation can be extended to unseen words. |
49 | Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features | Matteo Pagliardini, Prakhar Gupta, Martin Jaggi | We present a simple but efficient unsupervised objective to train distributed representations of sentences. |
50 | Learning Domain Representation for Multi-Domain Sentiment Classification | Qi Liu, Yue Zhang, Jiangming Liu | We investigate this problem by learning domain-specific representations of input sentences using neural networks. |
51 | Learning Sentence Representations over Tree Structures for Target-Dependent Classification | Junwen Duan, Xiao Ding, Ting Liu | To address the above issues, we propose a reinforcement learning based approach, which automatically induces target-specific sentence representations over tree structures. |
52 | Relevant Emotion Ranking from Text Constrained with Emotion Relationships | Deyu Zhou, Yang Yang, Yulan He | A novel framework of relevant emotion ranking is proposed to tackle the problem. |
53 | Solving Data Sparsity for Aspect Based Sentiment Analysis Using Cross-Linguality and Multi-Linguality | Md Shad Akhtar, Palaash Sawant, Sukanta Sen, Asif Ekbal, Pushpak Bhattacharyya | In this work we propose to minimize the effect of data sparsity by leveraging bilingual word embeddings learned through a parallel corpus. |
54 | SRL4ORL: Improving Opinion Role Labeling Using Multi-Task Learning with Semantic Role Labeling | Ana Marasović, Anette Frank | With deeper analysis we determine what works and what might be done to make further improvements for ORL. |
55 | Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task | Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, Kenneth Heafield | We demonstrate parallels between neural GEC and low-resource neural MT and successfully adapt several methods from low-resource MT to neural GEC. We further establish guidelines for trustable results in neural GEC and propose a set of model-independent methods for neural GEC that can be easily applied in most GEC settings. |
56 | Robust Cross-Lingual Hypernymy Detection Using Dependency Context | Shyam Upadhyay, Yogarshi Vyas, Marine Carpuat, Dan Roth | We propose BiSparse-Dep, a family of unsupervised approaches for cross-lingual hypernymy detection, which learns sparse, bilingual word embeddings based on dependency contexts. |
57 | Noising and Denoising Natural Language: Diverse Backtranslation for Grammar Correction | Ziang Xie, Guillaume Genthial, Stanley Xie, Andrew Ng, Dan Jurafsky | In this paper, we consider synthesizing parallel data by noising a clean monolingual corpus. |
58 | Self-Training for Jointly Learning to Ask and Answer Questions | Mrinmaya Sachan, Eric Xing | To alleviate these issues, we propose a self-training method for jointly learning to ask as well as answer questions, leveraging unlabeled text along with labeled question answer pairs for learning. |
59 | The Web as a Knowledge-Base for Answering Complex Questions | Alon Talmor, Jonathan Berant | In this paper, we present a novel framework for answering broad and complex questions, assuming answering simple questions is possible using a search engine and a reading comprehension model. To illustrate the viability of our approach, we create a new dataset of complex questions, ComplexWebQuestions, and present a model that decomposes questions and interacts with the web to compute an answer. |
60 | A Meaning-Based Statistical English Math Word Problem Solver | Chao-Chun Liang, Yu-Shiang Wong, Yi-Chung Lin, Keh-Yih Su | We introduce MeSys, a meaning-based approach, for solving English math word problems (MWPs) via understanding and reasoning in this paper. |
61 | Fine-Grained Temporal Orientation and its Relationship with Psycho-Demographic Correlates | Sabyasachi Kamila, Mohammed Hasanuzzaman, Asif Ekbal, Pushpak Bhattacharyya, Andy Way | In this paper, we propose a very first study to demonstrate the association between the sentiment view of the temporal orientation of the users and their different psycho-demographic attributes by analyzing their tweets. |
62 | Querying Word Embeddings for Similarity and Relatedness | Fatemeh Torabi Asr, Robert Zinkov, Michael Jones | We demonstrate the usefulness of context embeddings in predicting asymmetric association between words from a recently published dataset of production norms (Jouravlev & McRae, 2016). |
63 | Semantic Structural Evaluation for Text Simplification | Elior Sulem, Omri Abend, Ari Rappoport | In this paper we propose the first measure to address structural aspects of text simplification, called SAMSA. |
64 | Entity Commonsense Representation for Neural Abstractive Summarization | Reinald Kim Amplayo, Seonjae Lim, Seung-won Hwang | To this end, we leverage an off-the-shelf entity linking system (ELS) to extract linked entities and propose Entity2Topic (E2T), a module easily attachable to a sequence-to-sequence model that transforms a list of entities into a vector representation of the topic of the summary. |
65 | Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies | Max Grusky, Mor Naaman, Yoav Artzi | We present NEWSROOM, a summarization dataset of 1.3 million articles and summaries written by authors and editors in newsrooms of 38 major news publications. |
66 | Polyglot Semantic Parsing in APIs | Kyle Richardson, Jonathan Berant, Jonas Kuhn | In this paper, we explore the idea of polyglot semantic translation, or learning semantic parsing models that are trained on multiple datasets and natural languages. |
67 | Neural Models of Factuality | Rachel Rudinger, Aaron Steven White, Benjamin Van Durme | We present two neural models for event factuality prediction, which yield significant performance gains over previous models on three event factuality datasets: FactBank, UW, and MEANTIME. |
68 | Accurate Text-Enhanced Knowledge Graph Representation Learning | Bo An, Bo Chen, Xianpei Han, Le Sun | To appropriately handle the semantic variety of entities/relations in distinct triples, we propose an accurate text-enhanced knowledge graph representation learning method, which can represent a relation/entity with different representations in different triples by exploiting additional textual information. |
69 | Acquisition of Phrase Correspondences Using Natural Deduction Proofs | Hitomi Yanaka, Koji Mineshima, Pascual Martínez-Gómez, Daisuke Bekki | To solve this problem, we propose a method for detecting paraphrases via natural deduction proofs of semantic relations between sentence pairs. |
70 | Automatic Stance Detection Using End-to-End Memory Networks | Mitra Mohtarami, Ramy Baly, James Glass, Preslav Nakov, Lluís Màrquez, Alessandro Moschitti | We present an effective end-to-end memory network model that jointly (i) predicts whether a given document can be considered as relevant evidence for a given claim, and (ii) extracts snippets of evidence that can be used to reason about the factuality of the target claim. |
71 | Collective Entity Disambiguation with Structured Gradient Tree Boosting | Yi Yang, Ozan Irsoy, Kazi Shefaet Rahman | We present a gradient-tree-boosting-based structured learning model for jointly disambiguating named entities in a document. |
72 | DeepAlignment: Unsupervised Ontology Matching with Refined Word Vectors | Prodromos Kolyvakis, Alexandros Kalousis, Dimitris Kiritsis | In this work, we present a novel entity alignment method which we dub DeepAlignment. |
73 | Efficient Sequence Learning with Group Recurrent Networks | Fei Gao, Lijun Wu, Li Zhao, Tao Qin, Xueqi Cheng, Tie-Yan Liu | In this paper, we propose an efficient architecture to improve the efficiency of such RNN model training, which adopts the group strategy for recurrent layers, while exploiting the representation rearrangement strategy between layers as well as time steps. |
74 | FEVER: a Large-scale Dataset for Fact Extraction and VERification | James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Arpit Mittal | In this paper we introduce a new publicly available dataset for verification against textual sources, FEVER: Fact Extraction and VERification. |
75 | Global Relation Embedding for Relation Extraction | Yu Su, Honglei Liu, Semih Yavuz, Izzeddin Gür, Huan Sun, Xifeng Yan | To combat the wrong labeling problem of distant supervision, we propose to embed textual relations with global statistics of relations, i.e., the co-occurrence statistics of textual and knowledge base relations collected from the entire corpus. |
76 | Implicit Argument Prediction with Event Knowledge | Pengxiang Cheng, Katrin Erk | We propose to train models for implicit argument prediction on a simple cloze task, for which data can be generated automatically at scale. |
77 | Improving Temporal Relation Extraction with a Globally Acquired Statistical Resource | Qiang Ning, Hao Wu, Haoruo Peng, Dan Roth | This paper develops such a resource – a probabilistic knowledge base acquired in the news domain – by extracting temporal relations between events from the New York Times (NYT) articles over a 20-year span (1987-2007). |
78 | Multimodal Named Entity Recognition for Short Social Media Posts | Seungwhan Moon, Leonardo Neves, Vitor Carvalho | We introduce a new task called Multimodal Named Entity Recognition (MNER) for noisy user-generated data such as tweets or Snapchat captions, which comprise short text with accompanying images. To this end, we create a new dataset for MNER called SnapCaptions (Snapchat image-caption pairs submitted to public and crowd-sourced stories with fully annotated named entities). |
79 | Nested Named Entity Recognition Revisited | Arzoo Katiyar, Claire Cardie | We propose a novel recurrent neural network-based approach to simultaneously handle nested named entity recognition and nested entity mention detection. |
80 | Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction | Patrick Verga, Emma Strubell, Andrew McCallum | In response, we propose a model which simultaneously predicts relationships between all mention pairs in a document. We also introduce a new dataset an order of magnitude larger than existing human-annotated biological information extraction datasets and more accurate than distantly supervised alternatives. |
81 | Supervised Open Information Extraction | Gabriel Stanovsky, Julian Michael, Luke Zettlemoyer, Ido Dagan | We present data and methods that enable a supervised learning approach to Open Information Extraction (Open IE). |
82 | Embedding Syntax and Semantics of Prepositions via Tensor Decomposition | Hongyu Gong, Suma Bhat, Pramod Viswanath | In this paper we use word-triple counts (one of the triples being a preposition) to capture a preposition’s interaction with its attachment and complement. |
83 | From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings | Johannes Bjerva, Isabelle Augenstein | We learn distributed language representations, which can be used to predict typological properties on a massively multilingual scale. |
84 | Monte Carlo Syntax Marginals for Exploring and Using Dependency Parses | Katherine Keith, Su Lin Blodgett, Brendan O’Connor | In this work, we propose a transition sampling algorithm to sample from the full joint distribution of parse trees defined by a transition-based parsing model, and demonstrate the use of the samples in probabilistic dependency analysis. |
85 | Neural Particle Smoothing for Sampling from Conditional Sequence Models | Chu-Cheng Lin, Jason Eisner | We introduce neural particle smoothing, a sequential Monte Carlo method for sampling annotations of an input string from a given probability model. |
86 | Neural Syntactic Generative Models with Exact Marginalization | Jan Buys, Phil Blunsom | We present neural syntactic generative models with exact marginalization that support both dependency parsing and language modeling. |
87 | Noise-Robust Morphological Disambiguation for Dialectal Arabic | Nasser Zalmout, Alexander Erdmann, Nizar Habash | We present a neural morphological tagging and disambiguation model for Egyptian Arabic, with various extensions to handle noisy and inconsistent content. |
88 | Parsing Tweets into Universal Dependencies | Yijia Liu, Yi Zhu, Wanxiang Che, Bing Qin, Nathan Schneider, Noah A. Smith | To overcome the annotation noise without sacrificing computational efficiency, we propose a new method to distill an ensemble of 20 transition-based parsers into a single one. |
89 | Robust Multilingual Part-of-Speech Tagging via Adversarial Training | Michihiro Yasunaga, Jungo Kasai, Dragomir Radev | In this paper, we propose and analyze a neural POS tagging model that exploits AT. |
90 | Universal Dependency Parsing for Hindi-English Code-Switching | Irshad Bhat, Riyaz A. Bhat, Manish Shrivastava, Dipti Sharma | In this paper, we investigate these indispensable processes and other problems associated with syntactic parsing of code-switching data and propose methods to mitigate their effects. |
91 | What’s Going On in Neural Constituency Parsers? An Analysis | David Gaddy, Mitchell Stern, Dan Klein | The goal of this work is to analyze the extent to which information provided directly by the model structure in classical systems is still being captured by neural methods. |
92 | Deep Generative Model for Joint Alignment and Word Representation | Miguel Rios, Wilker Aziz, Khalil Sima’an | This work exploits translation data as a source of semantically relevant learning signal for models of word representation. |
93 | Learning Word Embeddings for Low-Resource Languages by PU Learning | Chao Jiang, Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang | In this paper, we study how to effectively learn a word embedding model on a corpus with only a few million tokens. |
94 | Exploring the Role of Prior Beliefs for Argument Persuasion | Esin Durmus, Claire Cardie | To study the actual effect of language use vs. prior beliefs on persuasion, we provide a new dataset and propose a controlled setting that takes into consideration two reader-level factors: political and religious ideology. |
95 | Inducing a Lexicon of Abusive Words — a Feature-Based Approach | Michael Wiegand, Josef Ruppenhofer, Anna Schmidt, Clayton Greenberg | We propose novel features employing information from both corpora and lexical resources. |
96 | Author Commitment and Social Power: Automatic Belief Tagging to Infer the Social Context of Interactions | Vinodkumar Prabhakaran, Premkumar Ganeshkumar, Owen Rambow | In this paper, we employ advancements in extra-propositional semantics extraction within NLP to study how author commitment reflects the social context of interactions. |
97 | Comparing Automatic and Human Evaluation of Local Explanations for Text Classification | Dong Nguyen | We evaluate a variety of local explanation approaches using automatic measures based on word deletion. |
98 | Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time | Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Bernt Andrassy | We introduce a novel unsupervised neural dynamic topic model named as Recurrent Neural Network-Replicated Softmax Model (RNNRSM), where the discovered topics at each time influence the topic discovery in the subsequent time steps. |
99 | Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation | Shudong Hao, Jordan Boyd-Graber, Michael J. Paul | Because standard metrics fail to accurately measure topic quality when robust external resources are unavailable, we propose an adaptation model that improves the accuracy and reliability of these metrics in low-resource settings. |
100 | Explainable Prediction of Medical Codes from Clinical Text | James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, Jacob Eisenstein | We present an attentional convolutional network that predicts medical codes from clinical text. |
101 | A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference | Adina Williams, Nikita Nangia, Samuel Bowman | This paper introduces the Multi-Genre Natural Language Inference (MultiNLI) corpus, a dataset designed for use in the development and evaluation of machine learning models for sentence understanding. |
102 | Filling Missing Paths: Modeling Co-occurrences of Word Pairs and Dependency Paths for Recognizing Lexical Semantic Relations | Koki Washio, Tsuneaki Kato | In this paper, we propose novel methods with a neural model of P(path|w1,w2) to solve this problem. |
103 | Specialising Word Vectors for Lexical Entailment | Ivan Vulić, Nikola Mrkšić | We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. |
104 | Cross-Lingual Abstract Meaning Representation Parsing | Marco Damonte, Shay B. Cohen | Abstract Meaning Representation (AMR) research has mostly focused on English. |
105 | Sentences with Gapping: Parsing and Reconstructing Elided Predicates | Sebastian Schuster, Joakim Nivre, Christopher D. Manning | In this paper, we present two methods for parsing to a Universal Dependencies graph representation that explicitly encodes the elided material with additional nodes and edges. |
106 | A Structured Syntax-Semantics Interface for English-AMR Alignment | Ida Szubert, Adam Lopez, Nathan Schneider | To test it, we devise an expressive framework to align AMR graphs to dependency graphs, which we use to annotate 200 AMRs. |
107 | End-to-End Graph-Based TAG Parsing with Neural Networks | Jungo Kasai, Robert Frank, Pauli Xu, William Merrill, Owen Rambow | We present a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and character-level CNNs. |
108 | Colorless Green Recurrent Networks Dream Hierarchically | Kristina Gulordava, Piotr Bojanowski, Edouard Grave, Tal Linzen, Marco Baroni | We investigate to what extent RNNs learn to track abstract hierarchical syntactic structure. |
109 | Diverse Few-Shot Text Classification with Multiple Metrics | Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Yu Cheng, Gerald Tesauro, Haoyu Wang, Bowen Zhou | To alleviate the problem, we propose an adaptive metric learning approach that automatically determines the best weighted combination from a set of metrics obtained from meta-training tasks for a newly seen few-shot task. |
110 | Early Text Classification Using Multi-Resolution Concept Representations | Adrian Pastor López-Monroy, Fabio A. González, Manuel Montes, Hugo Jair Escalante, Thamar Solorio | This paper proposes a novel document representation to improve the early detection of risks in social media sources. |
111 | Multinomial Adversarial Networks for Multi-Domain Text Classification | Xilun Chen, Claire Cardie | In this work, we propose a multinomial adversarial network (MAN) to tackle this real-world problem of multi-domain text classification (MDTC) in which labeled data may exist for multiple domains, but in insufficient amounts to train effective classifiers for one or more of the domains. |
112 | Pivot Based Language Modeling for Improved Neural Domain Adaptation | Yftah Ziser, Roi Reichart | In this paper we present the Pivot Based Language Model (PBLM), a representation learning model that marries together pivot-based and NN modeling in a structure aware manner. |
113 | Reinforced Co-Training | Jiawei Wu, Lei Li, William Yang Wang | In this paper, we propose a novel method, Reinforced Co-Training, to select high-quality unlabeled samples to better co-train on. |
114 | Tensor Product Generation Networks for Deep NLP Modeling | Qiuyuan Huang, Paul Smolensky, Xiaodong He, Li Deng, Dapeng Wu | We present a new approach to the design of deep networks for natural language processing (NLP), based on the general technique of Tensor Product Representations (TPRs) for encoding and processing symbol structures in distributed neural networks. |
115 | The Context-Dependent Additive Recurrent Neural Net | Quan Hung Tran, Tuan Lai, Gholamreza Haffari, Ingrid Zukerman, Trung Bui, Hung Bui | In this paper, we propose a novel family of Recurrent Neural Network unit: the Context-dependent Additive Recurrent Neural Network (CARNN) that is designed specifically to address this type of problem. |
116 | Combining Character and Word Information in Neural Machine Translation Using a Multi-Level Attention | Huadong Chen, Shujian Huang, David Chiang, Xinyu Dai, Jiajun Chen | In this paper, we improve the model by incorporating multiple levels of granularity. |
117 | Dense Information Flow for Neural Machine Translation | Yanyao Shen, Xu Tan, Di He, Tao Qin, Tie-Yan Liu | Inspired by the success of the DenseNet model in computer vision problems, in this paper, we propose a densely connected NMT architecture (DenseNMT) that is able to train more efficiently for NMT. |
118 | Evaluating Discourse Phenomena in Neural Machine Translation | Rachel Bawden, Rico Sennrich, Alexandra Birch, Barry Haddow | In this article, we present hand-crafted, discourse test sets, designed to test the models’ ability to exploit previous source and target sentences. |
119 | Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation | Matt Post, David Vilar | We present an algorithm for lexically constrained decoding with a complexity of O(1) in the number of constraints. |
120 | Guiding Neural Machine Translation with Retrieved Translation Pieces | Jingyi Zhang, Masao Utiyama, Eiichro Sumita, Graham Neubig, Satoshi Nakamura | In this paper, we propose a simple, fast, and effective method for recalling previously seen translation examples and incorporating them into the NMT decoding process. |
121 | Handling Homographs in Neural Machine Translation | Frederick Liu, Han Lu, Graham Neubig | We then proceed to describe methods, inspired by the word sense disambiguation literature, that model the context of the input word with context-aware word embeddings that help to differentiate the word sense before feeding it into the encoder. |
122 | Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets | Zhen Yang, Wei Chen, Feng Wang, Bo Xu | This paper proposes an approach for applying GANs to NMT. |
123 | Neural Machine Translation for Bilingually Scarce Scenarios: a Deep Multi-Task Learning Approach | Poorya Zaremoodi, Gholamreza Haffari | In this paper, we use monolingual linguistic resources in the source side to address this challenging problem based on a multi-task learning approach. |
124 | Self-Attentive Residual Decoder for Neural Machine Translation | Lesly Miculicich Werlen, Nikolaos Pappas, Dhananjay Ram, Andrei Popescu-Belis | To address this limitation, we propose a target-side-attentive residual recurrent network for decoding, where attention over previous words contributes directly to the prediction of the next word. |
125 | Target Foresight Based Attention for Neural Machine Translation | Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, Max Meng | In this paper, we propose a new attention model enhanced by the implicit information of target foresight word oriented to both alignment and translation tasks. |
126 | Context Sensitive Neural Lemmatization with Lematus | Toms Bergmanis, Sharon Goldwater | We introduce Lematus, a lemmatizer based on a standard encoder-decoder architecture, which incorporates character-level sentence context. |
127 | Modeling Noisiness to Recognize Named Entities using Multitask Neural Networks on Social Media | Gustavo Aguilar, Adrian Pastor López-Monroy, Fabio González, Thamar Solorio | We present two systems that address the challenges of processing social media data using character-level phonetics and phonology, word embeddings, and Part-of-Speech tags as features. |
128 | Reusing Weights in Subword-Aware Neural Language Models | Zhenisbek Assylbekov, Rustem Takhanov | We propose several ways of reusing subword embeddings and other weights in subword-aware neural language models. |
129 | Simple Models for Word Formation in Slang | Vivek Kulkarni, William Yang Wang | We propose the first generative models for three types of extra-grammatical word formation phenomena abounding in slang: Blends, Clippings, and Reduplicatives. |
130 | Using Morphological Knowledge in Open-Vocabulary Neural Language Models | Austin Matthews, Graham Neubig, Chris Dyer | We introduce an open-vocabulary language model that incorporates more sophisticated linguistic knowledge by predicting words using a mixture of three generative processes: (1) by generating words as a sequence of characters, (2) by directly generating full word forms, and (3) by generating words as a sequence of morphemes that are combined using a hand-written morphological analyzer. |
131 | A Neural Layered Model for Nested Named Entity Recognition | Meizhi Ju, Makoto Miwa, Sophia Ananiadou | To address this issue, we propose a novel neural model to identify nested entities by dynamically stacking flat NER layers. |
132 | DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference | Reza Ghaeini, Sadid A. Hasan, Vivek Datla, Joey Liu, Kathy Lee, Ashequl Qadir, Yuan Ling, Aaditya Prakash, Xiaoli Fern, Oladimeji Farri | We present a novel deep learning architecture to address the natural language inference (NLI) task. |
133 | KBGAN: Adversarial Learning for Knowledge Graph Embeddings | Liwei Cai, William Yang Wang | We introduce KBGAN, an adversarial learning framework to improve the performances of a wide range of existing knowledge graph embedding models. |
134 | Multimodal Frame Identification with Multilingual Evaluation | Teresa Botschen, Iryna Gurevych, Jan-Christoph Klie, Hatem Mousselly-Sergieh, Stefan Roth | In this paper, we extend a state-of-the-art FrameId system in order to effectively leverage multimodal representations. |
135 | Learning Joint Semantic Parsers from Disjoint Data | Hao Peng, Sam Thomson, Swabha Swayamdipta, Noah A. Smith | We present a new approach to learning a semantic parser from multiple datasets, even when the target semantic formalisms are drastically different and the underlying corpora do not overlap. |
136 | Identifying Semantic Divergences in Parallel Text without Annotations | Yogarshi Vyas, Xing Niu, Marine Carpuat | Recognizing that even correct translations are not always semantically equivalent, we automatically detect meaning divergences in parallel sentence pairs with a deep neural model of bilingual semantic similarity which can be trained for any parallel corpus without any manual annotation. |
137 | Bootstrapping Generators from Noisy Data | Laura Perez-Beltrachini, Mirella Lapata | In this paper we aim to bootstrap generators from large scale datasets where the data (e.g., DBPedia facts) and related texts (e.g., Wikipedia abstracts) are loosely aligned. |
138 | SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation | Ye Zhang, Nan Ding, Radu Soricut | We describe a family of model architectures capable of capturing both generic language characteristics via shared model parameters, as well as particular style characteristics via private model parameters. |
139 | Generating Descriptions from Structured Data Using a Bifocal Attention Mechanism and Gated Orthogonalization | Preksha Nema, Shreyas Shetty, Parag Jain, Anirban Laha, Karthik Sankaranarayanan, Mitesh M. Khapra | In this work, we focus on the task of generating natural language descriptions from a structured table of facts containing fields (such as nationality, occupation, etc) and values (such as Indian, actor, director, etc). We also introduce two similar datasets for French and German. |
140 | CliCR: a Dataset of Clinical Case Reports for Machine Reading Comprehension | Simon Šuster, Walter Daelemans | We present a new dataset for machine comprehension in the medical domain. |
141 | Learning to Collaborate for Question Answering and Asking | Duyu Tang, Nan Duan, Zhao Yan, Zhirui Zhang, Yibo Sun, Shujie Liu, Yuanhua Lv, Ming Zhou | In this paper, we give a systematic study that seeks to leverage the connection to improve both QA and QG. |
142 | Learning to Rank Question-Answer Pairs Using Hierarchical Recurrent Encoder with Latent Topic Clustering | Seunghyun Yoon, Joongbo Shin, Kyomin Jung | In this paper, we propose a novel end-to-end neural architecture for ranking candidate answers, that adapts a hierarchical recurrent neural network and a latent topic clustering module. |
143 | Supervised and Unsupervised Transfer Learning for Question Answering | Yu-An Chung, Hung-Yi Lee, James Glass | In this paper, we conduct extensive experiments to investigate the transferability of knowledge learned from a source QA dataset to a target dataset using two QA models. |
144 | Tracking State Changes in Procedural Text: a Challenge Dataset and Models for Process Paragraph Comprehension | Bhavana Dalvi, Lifu Huang, Niket Tandon, Wen-tau Yih, Peter Clark | We present a new dataset and models for comprehending paragraphs about processes (e.g., photosynthesis), an important genre of text describing a dynamic world. We are releasing the ProPara dataset and our models to the community. |
145 | Combining Deep Learning and Topic Modeling for Review Understanding in Context-Aware Recommendation | Mingmin Jin, Xin Luo, Huiling Zhu, Hankz Hankui Zhuo | In this paper, we investigate the approach to effectively utilize review information for recommender systems. |
146 | Deconfounded Lexicon Induction for Interpretable Social Science | Reid Pryzant, Kelly Shen, Dan Jurafsky, Stefan Wagner | We introduce two deep learning algorithms for the task. |
147 | Detecting Denial-of-Service Attacks from Social Media Text: Applying NLP to Computer Security | Nathanael Chambers, Ben Fry, James McMasters | We describe two learning frameworks for this task: a feed-forward neural network and a partially labeled LDA model. |
148 | The Importance of Calibration for Estimating Proportions from Annotations | Dallas Card, Noah A. Smith | The Importance of Calibration for Estimating Proportions from Annotations |
149 | A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications | Dongyeop Kang, Waleed Ammar, Bhavana Dalvi, Madeleine van Zuylen, Sebastian Kohlmeier, Eduard Hovy, Roy Schwartz | We describe the data collection process and report interesting observed phenomena in the peer reviews. We present the first public dataset of scientific peer reviews available for research purposes (PeerRead v1),1 providing an opportunity to study this important artifact. |
150 | Deep Communicating Agents for Abstractive Summarization | Asli Celikyilmaz, Antoine Bosselut, Xiaodong He, Yejin Choi | We present deep communicating agents in an encoder-decoder architecture to address the challenges of representing a long document for abstractive summarization. |
151 | Encoding Conversation Context for Neural Keyphrase Extraction from Microblog Posts | Yingyi Zhang, Jing Li, Yan Song, Chengzhi Zhang | In this paper, we present a neural keyphrase extraction framework for microblog posts that takes their conversation context into account, where four types of neural encoders, namely, averaged embedding, RNN, attention, and memory networks, are proposed to represent the conversation context. |
152 | Estimating Summary Quality with Pairwise Preferences | Markus Zopf | In this paper, we propose an alternative evaluation approach based on pairwise preferences of sentences. |
153 | Generating Topic-Oriented Summaries Using Neural Attention | Kundan Krishna, Balaji Vasan Srinivasan | In this paper, we propose an attention based RNN framework to generate multiple summaries of a single document tuned to different topics of interest. |
154 | Generative Bridging Network for Neural Sequence Prediction | Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou | In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network). |
155 | Higher-Order Syntactic Attention Network for Longer Sentence Compression | Hidetaka Kamigaito, Katsuhiko Hayashi, Tsutomu Hirao, Masaaki Nagata | To solve this problem, we propose a higher-order syntactic attention network (HiSAN) that can handle higher-order dependency features as an attention distribution on LSTM hidden states. |
156 | Neural Storyline Extraction Model for Storyline Generation from News Articles | Deyu Zhou, Linsen Guo, Yulan He | In this paper, we propose a novel neural network based approach to extract structured representations and evolution patterns of storylines without using annotated data. |
157 | Provable Fast Greedy Compressive Summarization with Any Monotone Submodular Function | Shinsaku Sakaue, Tsutomu Hirao, Masaaki Nishino, Masaaki Nagata | In this paper, we propose a fast greedy method for compressive summarization. |
158 | Ranking Sentences for Extractive Summarization with Reinforcement Learning | Shashi Narayan, Shay B. Cohen, Mirella Lapata | In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective. |
159 | Relational Summarization for Corpus Analysis | Abram Handler, Brendan O’Connor | This work introduces a new problem, relational summarization, in which the goal is to generate a natural language summary of the relationship between two lexical items in a corpus, without reference to a knowledge base. |
160 | What’s This Movie About? A Joint Neural Network Architecture for Movie Content Analysis | Philip John Gorinski, Mirella Lapata | We present a novel end-to-end model for overview generation, consisting of a multi-label encoder for identifying screenplay attributes, and an LSTM decoder to generate natural language sentences conditioned on the identified attributes. We create a dataset that consists of movie scripts, attribute-value pairs for the movies’ aspects, as well as overviews, which we extract from an online database. |
161 | Which Scores to Predict in Sentence Regression for Text Summarization? | Markus Zopf, Eneldo Loza Mencía, Johannes Fürnkranz | In this paper, we show in extensive experiments that following this intuition leads to suboptimal results and that learning to predict ROUGE precision scores leads to better results. |
162 | A Hierarchical Latent Structure for Variational Conversation Modeling | Yookoon Park, Jaemin Cho, Gunhee Kim | To solve the degeneration problem, we propose a novel model named Variational Hierarchical Conversation RNNs (VHCR), involving two key ideas of (1) using a hierarchical structure of latent variables, and (2) exploiting an utterance drop regularization. |
163 | Detecting Egregious Conversations between Customers and Virtual Agents | Tommy Sandbank, Michal Shmueli-Scheuer, Jonathan Herzig, David Konopnicki, John Richards, David Piorkowski | In this paper, we outline an approach to detecting such egregious conversations, using behavioral cues from the user, patterns in agent responses, and user-agent interaction. |
164 | Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking | Jyun-Yu Jiang, Francine Chen, Yan-Ying Chen, Wei Wang | In this paper, we propose to leverage representation learning for conversation disentanglement. |
165 | Variational Knowledge Graph Reasoning | Wenhu Chen, Wenhan Xiong, Xifeng Yan, William Yang Wang | In this paper, we tackle a practical query answering task involving predicting the relation of a given entity pair. |
166 | Inducing Temporal Relations from Time Anchor Annotation | Fei Cheng, Yusuke Miyao | In this paper, we propose a new approach to obtain temporal relations from absolute time value (a.k.a. time anchors), which is suitable for texts containing rich temporal information such as news articles. |
167 | ELDEN: Improved Entity Linking Using Densified Knowledge Graphs | Priya Radhakrishnan, Partha Talukdar, Vasudeva Varma | In this paper, we propose Entity Linking using Densified Knowledge Graphs (ELDEN). |
168 | Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions | Hai Ye, Xin Jiang, Zhunchen Luo, Wenhan Chao | In this paper, we propose to study the problem of court view generation from the fact description in a criminal case. |
169 | Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer | Juncen Li, Robin Jia, He He, Percy Liang | In this paper, we propose simpler methods motivated by the observation that text attributes are often marked by distinctive phrases (e.g., “too small”). |
170 | Adversarial Example Generation with Syntactically Controlled Paraphrase Networks | Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer | We propose syntactically controlled paraphrase networks (SCPNs) and use them to generate adversarial examples. |
171 | Sentiment Analysis: It’s Complicated! | Kian Kenyon-Dean, Eisha Ahmed, Scott Fujimoto, Jeremy Georges-Filteau, Christopher Glasz, Barleen Kaur, Auguste Lalande, Shruti Bhanderi, Robert Belfer, Nirmal Kanagasabai, Roman Sarrazingendron, Rohit Verma, Derek Ruths | We therefore propose the notion of a “complicated” class of sentiment to categorize such text, and argue that its inclusion in the short-text sentiment analysis framework will improve the quality of automated sentiment analysis systems as they are implemented in real-world settings. |
172 | Multi-Task Learning of Pairwise Sequence Classification Tasks over Disparate Label Spaces | Isabelle Augenstein, Sebastian Ruder, Anders Søgaard | We evaluate our approach on a variety of tasks with disparate label spaces. |
173 | Word Emotion Induction for Multiple Languages as a Deep Multi-Task Learning Problem | Sven Buechel, Udo Hahn | We here present a solution to get around this language data bottleneck by rephrasing word emotion induction as a multi-task learning problem. |
174 | Human Needs Categorization of Affective Events Using Labeled and Unlabeled Data | Haibo Ding, Ellen Riloff | Our work aims to categorize affective events based upon human need categories that often explain people’s motivations and desires: PHYSIOLOGICAL, HEALTH, LEISURE, SOCIAL, FINANCIAL, COGNITION, and FREEDOM. |
175 | The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants | Ivan Habernal, Henning Wachsmuth, Iryna Gurevych, Benno Stein | In this paper we develop a methodology for reconstructing warrants systematically. |
176 | Linguistic Cues to Deception and Perceived Deception in Interview Dialogues | Sarah Ita Levitan, Angel Maredia, Julia Hirschberg | We explore deception detection in interview dialogues. |
177 | Unified Pragmatic Models for Generating and Following Instructions | Daniel Fried, Jacob Andreas, Dan Klein | We extend these models to tasks with sequential structure. |
178 | Hierarchical Structured Model for Fine-to-Coarse Manifesto Text Analysis | Shivashankar Subramanian, Trevor Cohn, Timothy Baldwin | In this paper we propose a two-stage model for automatically performing both levels of analysis over manifestos. |
179 | Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness | Ivan Sanchez, Jeff Mitchell, Sebastian Riedel | Natural Language Inference is a challenging task that has received substantial attention, and state-of-the-art models now achieve impressive test set performance in the form of accuracy scores. |
180 | Assessing Language Proficiency from Eye Movements in Reading | Yevgeni Berzak, Boris Katz, Roger Levy | We present a novel approach for determining learners’ second language proficiency which utilizes behavioral traces of eye movements during reading. |
181 | Comparing Theories of Speaker Choice Using a Model of Classifier Production in Mandarin Chinese | Meilin Zhan, Roger Levy | In a corpus analysis of Mandarin Chinese, we show that the distribution of speaker choices supports the availability-based production account and not the Uniform Information Density account. |
182 | Spotting Spurious Data with Neural Networks | Hadi Amiri, Timothy Miller, Guergana Savova | In this paper, we present effective approaches inspired by queueing theory and psychology of learning to automatically identify spurious instances in datasets. |
183 | The Timing of Lexical Memory Retrievals in Language Production | Jeremy Cole, David Reitter | This paper explores the time course of lexical memory retrieval by modeling fluent language production. |
184 | Unsupervised Induction of Linguistic Categories with Records of Reading, Speaking, and Writing | Maria Barrett, Ana Valeria González-Garduño, Lea Frermann, Anders Søgaard | This paper shows that performance can be further improved by including data that is readily available or can be easily obtained for most languages, i.e., eye-tracking, speech, or keystroke logs (or any combination thereof). |
185 | Challenging Reading Comprehension on Daily Conversation: Passage Completion on Multiparty Dialog | Kaixin Ma, Tomasz Jurczyk, Jinho D. Choi | This paper presents a new corpus and a robust deep learning architecture for a task in reading comprehension, passage completion, on multiparty dialog. Since there is no dataset that challenges the task of passage completion in this genre, we create a corpus by selecting transcripts from a TV show that comprise 1,681 dialogs, generating passages for each dialog through crowdsourcing, and annotating mentions of characters in both the dialog and the passages. |
186 | Dialog Generation Using Multi-Turn Reasoning Neural Networks | Xianchao Wu, Ander Martínez, Momo Klyen | In this paper, we propose a generalizable dialog generation approach that adapts multi-turn reasoning, one recent advancement in the field of document comprehension, to generate responses (“answers”) by taking current conversation session context as a “document” and current query as a “question”. |
187 | Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems | Bing Liu, Gokhan Tür, Dilek Hakkani-Tür, Pararth Shah, Larry Heck | In this work, we present a hybrid learning method for training task-oriented dialogue systems through online user interactions. |
188 | LSDSCC: a Large Scale Domain-Specific Conversational Corpus for Response Generation with Diversity Oriented Evaluation Metrics | Zhen Xu, Nan Jiang, Bingquan Liu, Wenge Rong, Bowen Wu, Baoxun Wang, Zhuoran Wang, Xiaolong Wang | This paper proposes a Large Scale Domain-Specific Conversational Corpus (LSDSCC) composed of high-quality query-response pairs extracted from a domain-specific online forum, with thorough preprocessing and cleansing procedures. |
189 | EMR Coding with Semi-Parametric Multi-Head Matching Networks | Anthony Rios, Ramakanth Kavuluru | In this paper, we present a new neural network architecture that combines ideas from few-shot learning matching networks, multi-label loss functions, and convolutional neural networks for text classification to significantly outperform other state-of-the-art models. |
190 | Factors Influencing the Surprising Instability of Word Embeddings | Laura Wendlandt, Jonathan K. Kummerfeld, Rada Mihalcea | In this paper, we consider one aspect of embedding spaces, namely their stability. |
191 | Mining Evidences for Concept Stock Recommendation | Qi Liu, Yue Zhang | We investigate the task of mining relevant stocks given a topic of concern on emerging capital markets, for which there is lack of structural understanding. |
192 | Binarized LSTM Language Model | Xuan Liu, Di Cao, Kai Yu | In this paper, a novel binarized LSTM LM is proposed to address the problem. |
193 | Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos | Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann | In this paper, we address recognizing utterance-level emotions in dyadic conversational videos. |
194 | How Time Matters: Learning Time-Decay Attention for Contextual Spoken Language Understanding in Dialogues | Shang-Yu Su, Pei-Chieh Yuan, Yun-Nung Chen | The experiments on the benchmark Dialogue State Tracking Challenge (DSTC4) dataset show that the proposed time-decay attention mechanisms significantly improve the state-of-the-art model for contextual understanding performance. |
195 | Towards Understanding Text Factors in Oral Reading | Anastassia Loukina, Van Rynald T. Liceralde, Beata Beigman Klebanov | Using a case study, we show that variation in oral reading rate across passages for professional narrators is consistent across readers and much of it can be explained using features of the texts being read. |
196 | Generating Bilingual Pragmatic Color References | Will Monroe, Jennifer Hu, Andrew Jong, Christopher Potts | Using a newly-collected dataset of color reference games in Mandarin Chinese (which we release to the public), we confirm that a variety of constructions display the same sensitivity to contextual difficulty in Chinese and English. |
197 | Learning with Latent Language | Jacob Andreas, Dan Klein, Sergey Levine | This paper aims to show that using the space of natural language strings as a parameter space is an effective way to capture natural task structure. |
198 | Object Counts! Bringing Explicit Detections Back into Image Captioning | Josiah Wang, Pranava Swaroop Madhyastha, Lucia Specia | We provide an in-depth analysis of end-to-end image captioning by exploring a variety of cues that can be derived from such object detections. |
199 | Quantifying the Visual Concreteness of Words and Topics in Multimodal Datasets | Jack Hessel, David Mimno, Lillian Lee | We give an algorithm for automatically computing the visual concreteness of words and topics within multimodal datasets. |
200 | Speaker Naming in Movies | Mahmoud Azab, Mingzhe Wang, Max Smith, Noriyuki Kojima, Jia Deng, Rada Mihalcea | We propose a new model for speaker naming in movies that leverages visual, textual, and acoustic modalities in a unified optimization framework. To evaluate the performance of our model, we introduce a new dataset consisting of six episodes of the Big Bang Theory TV show and eighteen full movies covering different genres. |
201 | Stacking with Auxiliary Features for Visual Question Answering | Nazneen Fatema Rajani, Raymond Mooney | In this paper, we describe how we use these various categories of auxiliary features to improve performance for VQA. |
202 | Deep Contextualized Word Representations | Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer | We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). |
203 | Learning to Map Context-Dependent Sentences to Executable Formal Queries | Alane Suhr, Srinivasan Iyer, Yoav Artzi | We propose a context-dependent model to map utterances within an interaction to executable formal queries. |
204 | Neural Text Generation in Stories Using Entity Representations as Context | Elizabeth Clark, Yangfeng Ji, Noah A. Smith | We introduce an approach to neural text generation that explicitly represents entities mentioned in the text. |
205 | Recurrent Neural Networks as Weighted Language Recognizers | Yining Chen, Sorcha Gilroy, Andreas Maletti, Jonathan May, Kevin Knight | We investigate the computational complexity of various problems for simple recurrent neural networks (RNNs) as formal models for recognizing weighted languages. |
TABLE 2: NAACL 2018 Short Papers
Title | Authors | Highlight | |
---|---|---|---|
1 | Enhanced Word Representations for Bridging Anaphora Resolution | Yufang Hou | Most current models of word representations (e.g., GloVe) have successfully captured fine-grained semantics. |
2 | Gender Bias in Coreference Resolution | Rachel Rudinger, Jason Naradowsky, Brian Leonard, Benjamin Van Durme | We present an empirical study of gender bias in coreference resolution systems. |
3 | Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods | Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang | In this paper, we introduce a new benchmark for co-reference resolution focused on gender bias, WinoBias. |
4 | Integrating Stance Detection and Fact Checking in a Unified Corpus | Ramy Baly, Mitra Mohtarami, James Glass, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov | In this paper, we support the interdependencies between these tasks as annotations in the same corpus. |
5 | Is Something Better than Nothing? Automatically Predicting Stance-based Arguments Using Deep Learning and Small Labelled Dataset | Pavithra Rajendran, Danushka Bollegala, Simon Parsons | In this paper, we investigate the use of weakly supervised and semi-supervised methods for automatically annotating data, and thus providing large annotated datasets. |
6 | Multi-Task Learning for Argumentation Mining in Low-Resource Settings | Claudia Schulz, Steffen Eger, Johannes Daxenberger, Tobias Kahse, Iryna Gurevych | We investigate whether and where multi-task learning (MTL) can improve performance on NLP problems related to argumentation mining (AM), in particular argument component identification. |
7 | Neural Models for Reasoning over Multiple Mentions Using Coreference | Bhuwan Dhingra, Qiao Jin, Zhilin Yang, William Cohen, Ruslan Salakhutdinov | We present a recurrent layer which is instead biased towards coreferent dependencies. |
8 | Automatic Dialogue Generation with Expressed Emotions | Chenyang Huang, Osmar Zaïane, Amine Trabelsi, Nouha Dziri | In this research, we address the problem of forcing the dialogue generation to express emotion. |
9 | Guiding Generation for Abstractive Text Summarization Based on Key Information Guide Network | Chenliang Li, Weiran Xu, Si Li, Sheng Gao | We propose a guiding generation model that combines the extractive method and the abstractive method. |
10 | Natural Language Generation by Hierarchical Decoding with Linguistic Patterns | Shang-Yu Su, Kai-Ling Lo, Yi-Ting Yeh, Yun-Nung Chen | This paper introduces a hierarchical decoding NLG model based on linguistic patterns in different levels, and shows that the proposed method outperforms the traditional one with a smaller model size. |
11 | Neural Poetry Translation | Marjan Ghazvininejad, Yejin Choi, Kevin Knight | We present the first neural poetry translation system. |
12 | RankME: Reliable Human Ratings for Natural Language Generation | Jekaterina Novikova, Ondřej Dušek, Verena Rieser | We present a novel rank-based magnitude estimation method (RankME), which combines the use of continuous scales and relative assessments. |
13 | Sentence Simplification with Memory-Augmented Neural Networks | Tu Vu, Baotian Hu, Tsendsuren Munkhdalai, Hong Yu | In this paper, we adapt an architecture with augmented memory capacities called Neural Semantic Encoders (Munkhdalai and Yu, 2017) for sentence simplification. |
14 | A Corpus of Non-Native Written English Annotated for Metaphor | Beata Beigman Klebanov, Chee Wee (Ben) Leong, Michael Flor | We present a corpus of 240 argumentative essays written by non-native speakers of English annotated for metaphor. |
15 | A Simple and Effective Approach to the Story Cloze Test | Siddarth Srinivasan, Richa Arora, Mark Riedl | Following this approach, we present a simpler fully-neural approach to the Story Cloze Test using skip-thought embeddings of the stories in a feed-forward network that achieves close to state-of-the-art performance on this task without any feature engineering. |
16 | An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols | Chaitanya Kulkarni, Wei Xu, Alan Ritter, Raghu Machiraju | We describe an effort to annotate a corpus of natural language instructions consisting of 622 wet lab protocols to facilitate automatic or semi-automatic conversion of protocols into a machine-readable format and benefit biological research. |
17 | Annotation Artifacts in Natural Language Inference Data | Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel Bowman, Noah A. Smith | We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. |
18 | Humor Recognition Using Deep Learning | Peng-Yu Chen, Von-Wun Soo | In this paper, we construct and collect four datasets with distinct joke types in both English and Chinese and conduct learning experiments on humor recognition. |
19 | Leveraging Intra-User and Inter-User Representation Learning for Automated Hate Speech Detection | Jing Qian, Mai ElSherief, Elizabeth Belding, William Yang Wang | In this paper, we radically improve automated hate speech detection by presenting a novel model that leverages intra-user and inter-user representation learning for robust hate speech detection on Twitter. |
20 | Reference-less Measure of Faithfulness for Grammatical Error Correction | Leshem Choshen, Omri Abend | We propose USim, a semantic measure for Grammatical Error Correction (GEC) that measures the semantic faithfulness of the output to the source, thereby complementing existing reference-less measures (RLMs) for measuring the output’s grammaticality. |
21 | Training Structured Prediction Energy Networks with Indirect Supervision | Amirmohammad Rooshenas, Aishwarya Kamath, Andrew McCallum | This paper introduces rank-based training of structured prediction energy networks (SPENs). |
22 | Sí O No, Què Penses? Catalonian Independence and Linguistic Identity on Social Media | Ian Stewart, Yuval Pinter, Jacob Eisenstein | This study examines the use of Catalan, a language local to the semi-autonomous region of Catalonia in Spain, on Twitter in discourse related to the 2017 independence referendum. |
23 | A Transition-Based Algorithm for Unrestricted AMR Parsing | David Vilares, Carlos Gómez-Rodríguez | We explore this idea and introduce a greedy left-to-right non-projective transition-based parser. |
24 | Analogies in Complex Verb Meaning Shifts: the Effect of Affect in Semantic Similarity Models | Maximilian Köper, Sabine Schulte im Walde | We present a computational model to detect and distinguish analogies in meaning shifts between German base and complex verbs. |
25 | Character-Based Neural Networks for Sentence Pair Modeling | Wuwei Lan, Wei Xu | In this paper, we study how effective subword-level (character and character n-gram) representations are in sentence pair modeling. |
26 | Determining Event Durations: Models and Error Analysis | Alakananda Vempala, Eduardo Blanco, Alexis Palmer | This paper presents models to predict event durations. |
27 | Diachronic Usage Relatedness (DURel): A Framework for the Annotation of Lexical Semantic Change | Dominik Schlechtweg, Sabine Schulte im Walde, Stefanie Eckmann | We propose a framework that extends synchronic polysemy annotation to diachronic changes in lexical meaning, to counteract the lack of resources for evaluating computational models of lexical semantic change. |
28 | Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings | Yan Song, Shuming Shi, Jing Li, Haisong Zhang | In this paper, we present directional skip-gram (DSG), a simple but effective enhancement of the skip-gram model by explicitly distinguishing left and right context in word prediction. |
29 | Discriminating between Lexico-Semantic Relations with the Specialization Tensor Model | Goran Glavaš, Ivan Vulić | We present a simple and effective feed-forward neural architecture for discriminating between lexico-semantic relations (synonymy, antonymy, hypernymy, and meronymy). |
30 | Evaluating bilingual word embeddings on the long tail | Fabienne Braune, Viktor Hangya, Tobias Eder, Alexander Fraser | We show that state-of-the-art approaches fail on this task and present simple new techniques to improve bilingual word embeddings for mining rare words. We release new gold standard datasets and code to stimulate research on this task. |
31 | Frustratingly Easy Meta-Embedding — Computing Meta-Embeddings by Averaging Source Word Embeddings | Joshua Coates, Danushka Bollegala | In this paper, we show that the arithmetic mean of two distinct word embedding sets yields a performant meta-embedding that is comparable or better than more complex meta-embedding learning methods. |
32 | Introducing Two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness | Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu | We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity. |
33 | Lexical Substitution for Evaluating Compositional Distributional Models | Maja Buljan, Sebastian Padó, Jan Šnajder | Compositional Distributional Semantic Models (CDSMs) model the meaning of phrases and sentences in vector space. We create a LexSub dataset for CDSM evaluation from a corpus with manual “all-words” LexSub annotation. |
34 | Mittens: an Extension of GloVe for Learning Domain-Specialized Representations | Nicholas Dingwall, Christopher Potts | We present a simple extension of the GloVe representation learning model that begins with general-purpose representations and updates them based on data from a specialized domain. |
35 | Olive Oil is Made of Olives, Baby Oil is Made for Babies: Interpreting Noun Compounds Using Paraphrases in a Neural Model | Vered Shwartz, Chris Waterson | We explore a neural paraphrasing approach that demonstrates superior performance when such memorization is not possible. |
36 | Semantic Pleonasm Detection | Omid Kashefi, Andrew T. Lucas, Rebecca Hwa | To aid the development of systems that detect pleonasms in text, we introduce an annotated corpus of semantic pleonasms. |
37 | Similarity Measures for the Detection of Clinical Conditions with Verbal Fluency Tasks | Felipe Paula, Rodrigo Wilkens, Marco Idiart, Aline Villavicencio | In this work, we investigate three similarity measures for automatically identifying switches in semantic chains: semantic similarity from a manually constructed resource, and word association strength and semantic relatedness, both calculated from corpora. |
38 | Sluice Resolution without Hand-Crafted Features over Brittle Syntax Trees | Ola Rønning, Daniel Hardt, Anders Søgaard | Syntactic information is arguably important for sluice resolution, but we show that multi-task learning with partial parsing as auxiliary tasks effectively closes the gap and buys us an additional 9% error reduction over previous work. |
39 | The Word Analogy Testing Caveat | Natalie Schluter | We show that even supposing there were such word analogy regularities that should be detected in the word embeddings obtained via unsupervised means, standard word analogy test implementation practices provide distorted or contrived results. |
40 | Transition-Based Chinese AMR Parsing | Chuan Wang, Bin Li, Nianwen Xue | This paper presents the first AMR parser built on the Chinese AMR bank. |
41 | Knowledge-Enriched Two-Layered Attention Network for Sentiment Analysis | Abhishek Kumar, Daisuke Kawahara, Sadao Kurohashi | We propose a novel two-layered attention network based on Bidirectional Long Short-Term Memory for sentiment analysis. |
42 | Letting Emotions Flow: Success Prediction by Modeling the Flow of Emotions in Books | Suraj Maharjan, Sudipta Kar, Manuel Montes, Fabio A. González, Thamar Solorio | In this paper, we model the flow of emotions over a book using recurrent neural networks and quantify its usefulness in predicting success in books. |
43 | Modeling Inter-Aspect Dependencies for Aspect-Based Sentiment Analysis | Devamanyu Hazarika, Soujanya Poria, Prateek Vij, Gangeshwar Krishnamurthy, Erik Cambria, Roger Zimmermann | In this paper, we incorporate this pattern by simultaneous classification of all aspects in a sentence along with temporal dependency processing of their corresponding sentence representations using recurrent networks. |
44 | Multi-Task Learning Framework for Mining Crowd Intelligence towards Clinical Treatment | Shweta Yadav, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya, Amit Sheth | In this paper, we present a study where medical users’ opinions on health-related issues are analyzed to capture the medical sentiment at a blog level. |
45 | Recurrent Entity Networks with Delayed Memory Update for Targeted Aspect-Based Sentiment Analysis | Fei Liu, Trevor Cohn, Timothy Baldwin | Motivated by recent advances in memory-augmented models for machine reading, we propose a novel architecture, utilising external “memory chains” with a delayed memory update mechanism to track entities. |
46 | Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation | Roman Grundkiewicz, Marcin Junczys-Dowmunt | We combine two of the most popular approaches to automated Grammatical Error Correction (GEC): GEC based on Statistical Machine Translation (SMT) and GEC based on Neural Machine Translation (NMT). |
47 | Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks | Salman Mohammed, Peng Shi, Jimmy Lin | We examine the problem of question answering over knowledge graphs, focusing on simple questions that can be answered by the lookup of a single fact. |
48 | Looking for Structure in Lexical and Acoustic-Prosodic Entrainment Behaviors | Andreas Weise, Rivka Levitan | We present a negative result of our search, finding no meaningful correlations, clusters, or principal components in various entrainment measures, and discuss practical and theoretical implications. |
49 | Modeling Semantic Plausibility by Injecting World Knowledge | Su Wang, Greg Durrett, Katrin Erk | This paper introduces the task of semantic plausibility: recognizing plausible but possibly novel events. We present a new crowdsourced dataset of semantic plausibility judgments of single events such as man swallow paintball. |
50 | A Bi-Model Based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling | Yu Wang, Yilin Shen, Hongxia Jin | In this paper, new Bi-model based RNN semantic frame parsing network structures are designed to perform the intent detection and slot filling tasks jointly, by considering their cross-impact to each other using two correlated bidirectional LSTMs (BLSTM). |
51 | A Comparison of Two Paraphrase Models for Taxonomy Augmentation | Vassilis Plachouras, Fabio Petroni, Timothy Nugent, Jochen L. Leidner | In this paper, we explore automatic taxonomy augmentation with paraphrases. |
52 | A Laypeople Study on Terminology Identification across Domains and Task Definitions | Anna Hätty, Sabine Schulte im Walde | This paper introduces a new dataset of term annotation. |
53 | A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network | Dai Quoc Nguyen, Tu Dinh Nguyen, Dat Quoc Nguyen, Dinh Phung | In this paper, we propose a novel embedding model, named ConvKB, for knowledge base completion. |
54 | Cross-language Article Linking Using Cross-Encyclopedia Entity Embedding | Chun-Kai Wu, Richard Tzong-Han Tsai | In this paper, we address these problems by proposing cross-encyclopedia entity embedding. |
55 | Identifying the Most Dominant Event in a News Article by Mining Event Coreference Relations | Prafulla Kumar Choubey, Kaushik Raju, Ruihong Huang | Identifying the most dominant and central event of a document, which governs and connects other foreground and background events in the document, is useful for many applications, such as text summarization, storyline generation and text segmentation. |
56 | Improve Neural Entity Recognition via Multi-Task Data Selection and Constrained Decoding | Huasha Zhao, Yi Yang, Qiong Zhang, Luo Si | In this paper, we propose an entity recognition system that improves this neural architecture with two novel techniques. |
57 | Keep Your Bearings: Lightly-Supervised Information Extraction with Ladder Networks That Avoids Semantic Drift | Ajay Nagesh, Mihai Surdeanu | We propose a novel approach to semi-supervised learning for information extraction that uses ladder networks (Rasmus et al., 2015). |
58 | Semi-Supervised Event Extraction with Paraphrase Clusters | James Ferguson, Colin Lockard, Daniel Weld, Hannaneh Hajishirzi | We present a method for self-training event extraction systems by bootstrapping additional training data. |
59 | Structure Regularized Neural Network for Entity Relation Classification for Chinese Literature Text | Ji Wen, Xu Sun, Xuancheng Ren, Qi Su | In this paper, we propose the task of relation classification for Chinese literature text. |
60 | Syntactic Patterns Improve Information Extraction for Medical Search | Roma Patel, Yinfei Yang, Iain Marshall, Ani Nenkova, Byron Wallace | In this paper we demonstrate how features encoding syntactic patterns improve the performance of state-of-the-art sequence tagging models (both neural and linear) for information extraction of these medically relevant categories. |
61 | Syntactically Aware Neural Architectures for Definition Extraction | Luis Espinosa-Anke, Steven Schockaert | In this paper we present a set of neural architectures combining Convolutional and Recurrent Neural Networks, which are further enriched by incorporating linguistic information via syntactic dependencies. |
62 | A Dynamic Oracle for Linear-Time 2-Planar Dependency Parsing | Daniel Fernández-González, Carlos Gómez-Rodríguez | We propose an efficient dynamic oracle for training the 2-Planar transition-based parser, a linear-time parser with over 99% coverage on non-projective syntactic corpora. |
63 | Are Automatic Methods for Cognate Detection Good Enough for Phylogenetic Reconstruction in Historical Linguistics? | Taraka Rama, Johann-Mattis List, Johannes Wahle, Gerhard Jäger | We evaluate the performance of state-of-the-art algorithms for automatic cognate detection by comparing how useful automatically inferred cognates are for the task of phylogenetic inference compared to classical manually annotated cognate sets. |
64 | Automatically Selecting the Best Dependency Annotation Design with Dynamic Oracles | Guillaume Wisniewski, Ophélie Lacroix, François Yvon | This work introduces a new strategy to compare the numerous conventions that have been proposed over the years for expressing dependency structures and discover the one for which a parser will achieve the highest parsing performance. |
65 | Consistent CCG Parsing over Multiple Sentences for Improved Logical Reasoning | Masashi Yoshikawa, Koji Mineshima, Hiroshi Noji, Daisuke Bekki | In this work, we present a simple method to extend an existing CCG parser to parse a set of sentences consistently, which is achieved with an inter-sentence modeling with Markov Random Fields (MRF). |
66 | Exploiting Dynamic Oracles to Train Projective Dependency Parsers on Non-Projective Trees | Lauriane Aufrant, Guillaume Wisniewski, François Yvon | In this work, we propose a simple modification of dynamic oracles, which enables the use of non-projective data when training projective parsers. |
67 | Improving Coverage and Runtime Complexity for Exact Inference in Non-Projective Transition-Based Dependency Parsers | Tianze Shi, Carlos Gómez-Rodríguez, Lillian Lee | We generalize Cohen, Gómez-Rodríguez, and Satta’s (2011) parser to a family of non-projective transition-based dependency parsers allowing polynomial-time exact inference. |
68 | Towards a Variability Measure for Multiword Expressions | Caroline Pasquer, Agata Savary, Jean-Yves Antoine, Carlos Ramisch | Since variability of MWEs is a matter of scale rather than a binary property, we propose a 2-dimensional language-independent measure of variability dedicated to verbal MWEs based on syntactic and discontinuity-related clues. |
69 | Defoiling Foiled Image Captions | Pranava Swaroop Madhyastha, Josiah Wang, Lucia Specia | In this paper, we demonstrate that it is possible to solve this task using simple, interpretable yet powerful representations based on explicit object information over multilayer perceptron models. |
70 | Pragmatically Informative Image Captioning with Character-Level Inference | Reuben Cohn-Gordon, Noah Goodman, Christopher Potts | We instead solve this problem by implementing a version of RSA which operates at the level of characters (“a”, “b”, “c”, …) during the unrolling of the caption. |
71 | Object Ordering with Bidirectional Matchings for Visual Reasoning | Hao Tan, Mohit Bansal | Our model achieves strong improvements (of 4-6% absolute) over the state-of-the-art on both the structured representation and raw image versions of the dataset. |
72 | Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations | Sosuke Kobayashi | We propose a novel data augmentation for labeled sentences called contextual augmentation. |
73 | Cross-Lingual Learning-to-Rank with Shared Representations | Shota Sasaki, Shuo Sun, Shigehiko Schamoni, Kevin Duh, Kentaro Inui | We introduce a large-scale dataset derived from Wikipedia to support CLIR research in 25 languages. |
74 | Self-Attention with Relative Position Representations | Peter Shaw, Jakob Uszkoreit, Ashish Vaswani | In this work we present an alternative approach, extending the self-attention mechanism to efficiently consider representations of the relative positions, or distances between sequence elements. |
75 | Text Segmentation as a Supervised Learning Task | Omri Koshorek, Adir Cohen, Noam Mor, Michael Rotman, Jonathan Berant | In this work, we formulate text segmentation as a supervised learning problem, and present a large new dataset for text segmentation that is automatically extracted and labeled from Wikipedia. |
76 | What’s in a Domain? Learning Domain-Robust Text Representations using Adversarial Training | Yitong Li, Timothy Baldwin, Trevor Cohn | We propose a novel method to optimise both in- and out-of-domain accuracy based on joint learning of a structured neural model with domain-specific and domain-general components, coupled with adversarial training for domain. |
77 | Automated Paraphrase Lattice Creation for HyTER Machine Translation Evaluation | Marianna Apidianaki, Guillaume Wisniewski, Anne Cocos, Chris Callison-Burch | We propose a variant of a well-known machine translation (MT) evaluation metric, HyTER (Dreyer and Marcu, 2012), which exploits reference translations enriched with meaning equivalent expressions. |
78 | Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks | Diego Marcheggiani, Joost Bastings, Ivan Titov | In this work, we are the first to incorporate information about predicate-argument structure of source sentences (namely, semantic-role representations) into neural machine translation. |
79 | Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation | Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Stephan Vogel | We propose a tunable agent which decides the best segmentation strategy for a user-defined BLEU loss and Average Proportion (AP) constraint. |
80 | Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models | David Vilar | In this paper we explore the use of Learning Hidden Unit Contribution for the task of neural machine translation. |
81 | Neural Machine Translation Decoding with Terminology Constraints | Eva Hasler, Adrià de Gispert, Gonzalo Iglesias, Bill Byrne | We describe our approach to constrained neural decoding based on finite-state machines and multi-stack decoding which supports target-side constraints as well as constraints with corresponding aligned input text spans. |
82 | On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference | Adam Poliak, Yonatan Belinkov, James Glass, Benjamin Van Durme | We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena. |
83 | Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation | Nima Pourdamghani, Marjan Ghazvininejad, Kevin Knight | We present a method for improving word alignments using word similarities. |
84 | When and Why Are Pre-Trained Word Embeddings Useful for Neural Machine Translation? | Ye Qi, Devendra Sachan, Matthieu Felix, Sarguna Padmanabhan, Graham Neubig | In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. |
85 | Are All Languages Equally Hard to Language-Model? | Ryan Cotterell, Sebastian J. Mielke, Jason Eisner, Brian Roark | In this work, we develop an evaluation framework for fair cross-linguistic comparison of language models, using translated text so that all models are asked to predict approximately the same information. |
86 | The Computational Complexity of Distinctive Feature Minimization in Phonology | Hubie Chen, Mans Hulden | We analyze the complexity of the problem of determining whether a set of phonemes forms a natural class and, if so, that of finding the minimal feature specification for the class. |
87 | Unsupervised Disambiguation of Syncretism in Inflected Lexicons | Ryan Cotterell, Christo Kirov, Sebastian J. Mielke, Jason Eisner | We present such an approach, which employs a neural network to smoothly model a prior distribution over feature bundles (even rare ones). |
88 | Contextualized Word Representations for Reading Comprehension | Shimi Salant, Jonathan Berant | We take a standard neural architecture for this task, and show that by providing rich contextualized word representations from a large pre-trained language model as well as allowing the model to choose between context-dependent and context-independent word representations, we can obtain dramatic improvements and reach performance comparable to state-of-the-art on the competitive SQuAD dataset. |
89 | Crowdsourcing Question-Answer Meaning Representations | Julian Michael, Gabriel Stanovsky, Luheng He, Ido Dagan, Luke Zettlemoyer | We introduce Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs. |
90 | Leveraging Context Information for Natural Question Generation | Linfeng Song, Zhiguo Wang, Wael Hamza, Yue Zhang, Daniel Gildea | We propose a model that matches the answer with the passage before generating the question. |
91 | Robust Machine Comprehension Models via Adversarial Training | Yicheng Wang, Mohit Bansal | We propose a novel alternative adversary-generation algorithm, AddSentDiverse, that significantly increases the variance within the adversarial training data by providing effective examples that punish the model for making certain superficial assumptions. |
92 | Simple and Effective Semi-Supervised Question Answering | Bhuwan Dhingra, Danish Danish, Dheeraj Rajagopal | In this work, we envision a system where the end user specifies a set of base documents and only a few labelled examples. We are also releasing a set of 3.2M cloze-style questions for practitioners to use while building QA systems. |
93 | TypeSQL: Knowledge-Based Type-Aware Neural Text-to-SQL Generation | Tao Yu, Zifan Li, Zilin Zhang, Rui Zhang, Dragomir Radev | In this paper, we present a novel approach TypeSQL which formats the problem as a slot filling task in a more reasonable way. |
94 | Community Member Retrieval on Social Media Using Textual Information | Aaron Jaech, Shobhit Hathi, Mari Ostendorf | The solution introduces an unsupervised proxy task for learning user embeddings: user re-identification. |
95 | Cross-Domain Review Helpfulness Prediction Based on Convolutional Neural Networks with Auxiliary Domain Discriminators | Cen Chen, Yinfei Yang, Jun Zhou, Xiaolong Li, Forrest Sheng Bao | Therefore, we propose a convolutional neural network (CNN) based model which leverages both word-level and character-based representations. |
96 | Predicting Foreign Language Usage from English-Only Social Media Posts | Svitlana Volkova, Stephen Ranshous, Lawrence Phillips | This paper presents a large-scale analysis of 6 million tweets produced by 27 thousand multilingual users speaking 12 other languages besides English. |
97 | A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents | Arman Cohan, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Seokhwan Kim, Walter Chang, Nazli Goharian | We propose the first model for abstractive summarization of single, longer-form documents (e.g., research papers). |
98 | A Mixed Hierarchical Attention Based Encoder-Decoder Approach for Standard Table Summarization | Parag Jain, Anirban Laha, Karthik Sankaranarayanan, Preksha Nema, Mitesh M. Khapra, Shreyas Shetty | In this work, we consider summarizing structured data occurring in the form of tables as they are prevalent across a wide variety of domains. |
99 | Effective Crowdsourcing for a New Type of Summarization Task | Youxuan Jiang, Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Walter Lasecki | We propose targeted summarization as an umbrella category for summarization tasks that intentionally consider only parts of the input data. |
100 | Key2Vec: Automatic Ranked Keyphrase Extraction from Scientific Articles using Phrase Embeddings | Debanjan Mahata, John Kuriakose, Rajiv Ratn Shah, Roger Zimmermann | In this paper, we present an unsupervised technique (Key2Vec) that leverages phrase embeddings for ranking keyphrases extracted from scientific articles. |
101 | Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata | Lucie-Aimée Kaffee, Hady Elsahar, Pavlos Vougiouklis, Christophe Gravier, Frédérique Laforest, Jonathon Hare, Elena Simperl | In this work, we investigate the generation of open domain Wikipedia summaries in underserved languages using structured data from Wikidata. |
102 | Multi-Reward Reinforced Summarization with Saliency and Entailment | Ramakanth Pasunuru, Mohit Bansal | In this work, we address these three important aspects of a good summary via a reinforcement learning approach with two novel reward functions: ROUGESal and Entail, on top of a coverage-based baseline. |
103 | Objective Function Learning to Match Human Judgements for Optimization-Based Summarization | Maxime Peyrard, Iryna Gurevych | In this work, we learn a summary-level scoring function θ including human judgments as supervision and automatically generated data as regularization. |
104 | Pruning Basic Elements for Better Automatic Evaluation of Summaries | Ukyo Honda, Tsutomu Hirao, Masaaki Nagata | We propose a simple but highly effective automatic evaluation measure of summarization, pruned Basic Elements (pBE). |
105 | Unsupervised Keyphrase Extraction with Multipartite Graphs | Florian Boudin | We propose an unsupervised keyphrase extraction model that encodes topical information within a multipartite graph structure. |
106 | Where Have I Heard This Story Before? Identifying Narrative Similarity in Movie Remakes | Snigdha Chaturvedi, Shashank Srivastava, Dan Roth | We present a new task and dataset for story understanding: identifying instances of similar narratives from a collection of narrative texts. We present an initial approach for this problem, which finds correspondences between narratives in terms of plot events, and resemblances between characters and their social relationships. |
107 | Multimodal Emoji Prediction | Francesco Barbieri, Miguel Ballesteros, Francesco Ronzano, Horacio Saggion | In this paper we extend recent advances in emoji prediction by putting forward a multimodal approach that is able to predict emojis in Instagram posts. |
108 | Higher-Order Coreference Resolution with Coarse-to-Fine Inference | Kenton Lee, Luheng He, Luke Zettlemoyer | To alleviate the computational cost of this iterative process, we introduce a coarse-to-fine approach that incorporates a less accurate but more efficient bilinear factor, enabling more aggressive pruning without hurting accuracy. |
109 | Non-Projective Dependency Parsing with Non-Local Transitions | Daniel Fernández-González, Carlos Gómez-Rodríguez | We present a novel transition system, based on the Covington non-projective parser, introducing non-local transitions that can directly create arcs involving nodes to the left of the current focus positions. |
110 | Detecting Linguistic Characteristics of Alzheimer’s Dementia by Interpreting Neural Models | Sweta Karlekar, Tong Niu, Mohit Bansal | In this work, we use NLP techniques to classify and analyze the linguistic characteristics of AD patients using the DementiaBank dataset. |
111 | Deep Dungeons and Dragons: Learning Character-Action Interactions from Role-Playing Game Transcripts | Annie Louis, Charles Sutton | We propose role-playing games as a testbed for this problem, and introduce a large corpus of game transcripts collected from online discussion forums. |
112 | Feudal Reinforcement Learning for Dialogue Management in Large Domains | Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Stefan Ultes, Lina M. Rojas-Barahona, Bo-Hsiang Tseng, Milica Gašić | We propose a novel Dialogue Management architecture, based on Feudal RL, which decomposes the decision into two steps: a first step where a master policy selects a subset of primitive actions, and a second step where a primitive action is chosen from the selected subset. |
113 | Evaluating Historical Text Normalization Systems: How Well Do They Generalize? | Alexander Robertson, Sharon Goldwater | We highlight several issues in the evaluation of historical text normalization systems that make it hard to tell how well these systems would actually work in practice, i.e., for new datasets or languages; in comparison to more naïve systems; or as a preprocessing step for downstream NLP tools. |
114 | Gated Multi-Task Network for Text Classification | Liqiang Xiao, Honglun Zhang, Wenqing Chen | In this paper, we introduce gate mechanism into multi-task CNN and propose a new Gated Sharing Unit, which can filter the feature flows between tasks and greatly reduce the interference. |
115 | Natural Language to Structured Query Generation via Meta-Learning | Po-Sen Huang, Chenglong Wang, Rishabh Singh, Wen-tau Yih, Xiaodong He | In this work, we explore a different learning protocol that treats each example as a unique pseudo-task, by reducing the original learning problem to a few-shot meta-learning scenario with the help of a domain-dependent relevance function. |
116 | Smaller Text Classifiers with Discriminative Cluster Embeddings | Mingda Chen, Kevin Gimpel | We propose variations that selectively assign additional parameters to words, which further improves accuracy while still remaining parameter-efficient. |
117 | Role-specific Language Models for Processing Recorded Neuropsychological Exams | Tuka Al Hanai, Rhoda Au, James Glass | This paper demonstrates a method to determine the cognitive health (impaired or not) of 92 subjects, from audio that was diarized using an automatic speech recognition system trained on TED talks and on the structured language used by testers and subjects. |
118 | Slot-Gated Modeling for Joint Slot Filling and Intent Prediction | Chih-Wen Goo, Guang Gao, Yun-Kai Hsu, Chih-Li Huo, Tsung-Chieh Chen, Keng-Wei Hsu, Yun-Nung Chen | Considering that slot and intent have a strong relationship, this paper proposes a slot gate that focuses on learning the relationship between intent and slot attention vectors in order to obtain better semantic frame results by global optimization. |
119 | An Evaluation of Image-Based Verb Prediction Models against Human Eye-Tracking Data | Spandana Gella, Frank Keller | Recent research in language and vision has developed models for predicting and disambiguating verbs from images. |
120 | Learning to Color from Language | Varun Manjunatha, Mohit Iyyer, Jordan Boyd-Graber, Larry Davis | We present two different architectures for language-conditioned colorization, both of which produce more accurate and plausible colorizations than a language-agnostic version. |
121 | Punny Captions: Witty Wordplay in Image Descriptions | Arjun Chandrasekaran, Devi Parikh, Mohit Bansal | In this work, we attempt to build computational models that can produce witty descriptions for a given image. |
122 | The Emergence of Semantics in Neural Network Representations of Visual Information | Dhanush Dharmaretnam, Alona Fyshe | Here we employ techniques previously used to detect semantic representations in the human brain to detect semantic representations in CNNs. |
123 | Visual Referring Expression Recognition: What Do Systems Actually Learn? | Volkan Cirik, Louis-Philippe Morency, Taylor Berg-Kirkpatrick | We present an empirical analysis of state-of-the-art systems for referring expression recognition – the task of identifying the object in an image referred to by a natural language expression – with the goal of gaining insight into how these systems reason about language and vision. |
124 | Visually Guided Spatial Relation Extraction from Text | Taher Rahgooy, Umar Manzoor, Parisa Kordjamshidi | We use various recent vision and language datasets and techniques to train inter-modality alignment models, visual relationship classifiers and propose a novel global inference model to integrate these components into our structured output prediction model for spatial role and relation extraction. |
125 | Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning | Xin Wang, Yuan-Fang Wang, William Yang Wang | In this paper, we propose a novel hierarchically aligned cross-modal attention (HACA) framework to learn and selectively fuse both global and local temporal dynamics of different modalities. |