Paper Digest: ACL 2018 Highlights
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2018 it was held in Melbourne, Australia. There were 2,571 paper submissions, of which 256 were accepted as long papers and 125 as short papers.
To help the AI community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to get new paper updates customized to your own interests on a daily basis.
Paper Digest Team
team@paperdigest.org
TABLE 1: ACL 2018 Long Papers
# | Title | Authors | Highlight
---|---|---|---
1 | Probabilistic FastText for Multi-Sense Word Embeddings | Ben Athiwaratkun, Andrew Wilson, Anima Anandkumar | We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information. |
2 | A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors | Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora | This paper introduces a la carte embedding, a simple and general alternative to the usual word2vec-based approaches for building such representations that is based upon recent theoretical results for GloVe-like embeddings. We introduce a new dataset showing how the a la carte method requires fewer examples of words in context to learn high-quality embeddings and we obtain state-of-the-art results on a nonce task and some unsupervised document classification tasks. |
3 | Unsupervised Learning of Distributional Relation Vectors | Shoaib Jameel, Zied Bouraoui, Steven Schockaert | In this paper, we introduce a novel method which directly learns relation vectors from co-occurrence statistics. |
4 | Explicit Retrofitting of Distributional Word Vectors | Goran Glavaš, Ivan Vulić | In this work, in contrast, we transform external lexico-semantic relations into training examples which we use to learn an explicit retrofitting model (ER). |
5 | Unsupervised Neural Machine Translation with Weight Sharing | Zhen Yang, Wei Chen, Feng Wang, Bo Xu | To address this issue, we introduce an extension by utilizing two independent encoders but sharing some partial weights which are responsible for extracting high-level representations of the input sentences. |
6 | Triangular Architecture for Rare Language Translation | Shuo Ren, Wenhu Chen, Shujie Liu, Mu Li, Ming Zhou, Shuai Ma | By introducing another rich language Y, we propose a novel triangular training architecture (TA-NMT) to leverage bilingual data (Y,Z) (may be small) and (X,Y) (can be rich) to improve the translation performance of low-resource pairs. |
7 | Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates | Taku Kudo | The question addressed in this paper is whether it is possible to harness the segmentation ambiguity as a noise to improve the robustness of NMT. |
8 | The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation | Mia Xu Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George Foster, Llion Jones, Mike Schuster, Noam Shazeer, Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Zhifeng Chen, Yonghui Wu, Macduff Hughes | In this paper, we tease apart the new architectures and their accompanying techniques in two ways. |
9 | Ultra-Fine Entity Typing | Eunsol Choi, Omer Levy, Yejin Choi, Luke Zettlemoyer | We introduce a new entity typing task: given a sentence with an entity mention, the goal is to predict a set of free-form phrases (e.g. skyscraper, songwriter, or criminal) that describe appropriate types for the target entity. |
10 | Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking | Shikhar Murty, Patrick Verga, Luke Vilnis, Irena Radovanovic, Andrew McCallum | This paper presents new methods using real and complex bilinear mappings for integrating hierarchical information, yielding substantial improvement over flat predictions in entity linking and fine-grained entity typing, and achieving new state-of-the-art results for end-to-end models on the benchmark FIGER dataset. We also present two new human-annotated datasets containing wide and deep hierarchies which we will release to the community to encourage further research in this direction: \textit{MedMentions}, a collection of PubMed abstracts in which 246k mentions have been mapped to the massive UMLS ontology; and \textit{TypeNet}, which aligns Freebase types with the WordNet hierarchy to obtain nearly 2k entity types. |
11 | Improving Knowledge Graph Embedding Using Simple Constraints | Boyang Ding, Quan Wang, Bin Wang, Li Guo | This paper, by contrast, investigates the potential of using very simple constraints to improve KG embedding. |
12 | Towards Understanding the Geometry of Knowledge Graph Embeddings | Chandrahas, Aditya Sharma, Partha Talukdar | Despite this popularity and effectiveness of KG embeddings in various tasks (e.g., link prediction), geometric understanding of such embeddings (i.e., arrangement of entity and relation vectors in vector space) is unexplored – we fill this gap in the paper. |
13 | A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss | Wan-Ting Hsu, Chieh-Kai Lin, Ming-Ying Lee, Kerui Min, Jing Tang, Min Sun | We propose a unified model combining the strength of extractive and abstractive summarization. |
14 | Extractive Summarization with SWAP-NET: Sentences and Words from Alternating Pointer Networks | Aishwarya Jadhav, Vaibhav Rajan | We present a new neural sequence-to-sequence model for extractive summarization called SWAP-NET (Sentences and Words from Alternating Pointer Networks). |
15 | Retrieve, Rerank and Rewrite: Soft Template Based Neural Summarization | Ziqiang Cao, Wenjie Li, Sujian Li, Furu Wei | Inspired by the traditional template-based summarization approaches, this paper proposes to use existing summaries as soft templates to guide the seq2seq model. |
16 | Simple and Effective Text Simplification Using Semantic and Neural Methods | Elior Sulem, Omri Abend, Ari Rappoport | Here we present a simple and efficient splitting algorithm based on an automatic semantic parser. |
17 | Obtaining Reliable Human Ratings of Valence, Arousal, and Dominance for 20,000 English Words | Saif Mohammad | We present the NRC VAD Lexicon, which has human ratings of valence, arousal, and dominance for more than 20,000 English words. |
18 | Comprehensive Supersense Disambiguation of English Prepositions and Possessives | Nathan Schneider, Jena D. Hwang, Vivek Srikumar, Jakob Prange, Austin Blodgett, Sarah R. Moeller, Aviram Stern, Adi Bitan, Omri Abend | We introduce a new annotation scheme, corpus, and task for the disambiguation of prepositions and possessives in English. |
19 | A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature | Benjamin Nye, Junyi Jessy Li, Roma Patel, Yinfei Yang, Iain Marshall, Ani Nenkova, Byron Wallace | We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. |
20 | Efficient Online Scalar Annotation with Bounded Support | Keisuke Sakaguchi, Benjamin Van Durme | We describe a novel method for efficiently eliciting scalar annotations for dataset construction and system quality estimation by human judgments. |
21 | Neural Argument Generation Augmented with Externally Retrieved Evidence | Xinyu Hua, Lu Wang | In this work, we study a novel task on automatically generating arguments of a different stance for a given statement. |
22 | A Stylometric Inquiry into Hyperpartisan and Fake News | Martin Potthast, Johannes Kiesel, Kevin Reinartz, Janek Bevendorff, Benno Stein | We report on a comparative style analysis of hyperpartisan (extremely one-sided) news and fake news. |
23 | Retrieval of the Best Counterargument without Prior Topic Knowledge | Henning Wachsmuth, Shahbaz Syed, Benno Stein | To operationalize our hypothesis, we simultaneously model the similarity and dissimilarity of pairs of arguments, based on the words and embeddings of the arguments’ premises and conclusions. |
24 | LinkNBed: Multi-Graph Representation Learning with Entity Linkage | Rakshit Trivedi, Bunyamin Sisman, Xin Luna Dong, Christos Faloutsos, Jun Ma, Hongyuan Zha | To this end, we propose LinkNBed, a deep relational learning framework that learns entity and relationship representations across multiple graphs. |
25 | Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures | Luke Vilnis, Xiang Li, Shikhar Murty, Andrew McCallum | In this work we show that a broad class of models that assign probability measures to OE can never capture negative correlation, which motivates our construction of a novel box lattice and accompanying probability measure to capture anti-correlation and even disjoint concepts, while still providing the benefits of probabilistic modeling, such as the ability to perform rich joint and conditional queries over arbitrary sets of concepts, and both learning from and predicting calibrated uncertainty. |
26 | Graph-to-Sequence Learning using Gated Graph Neural Networks | Daniel Beck, Gholamreza Haffari, Trevor Cohn | In this work we propose a new model that encodes the full structural information contained in the graph. |
27 | Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context | Urvashi Khandelwal, He He, Peng Qi, Dan Jurafsky | In this paper, we investigate the role of context in an LSTM LM, through ablation studies. |
28 | Bridging CNNs, RNNs, and Weighted Finite-State Machines | Roy Schwartz, Sam Thomson, Noah A. Smith | In this paper we present SoPa, a new model that aims to bridge these two approaches. |
29 | Zero-shot Learning of Classifiers from Natural Language Quantification | Shashank Srivastava, Igor Labutov, Tom Mitchell | We present a framework through which a set of explanations of a concept can be used to learn a classifier without access to any labeled examples. |
30 | Sentence-State LSTM for Text Representation | Yue Zhang, Qi Liu, Linfeng Song | We investigate an alternative LSTM structure for encoding text, which consists of a parallel state for each word. |
31 | Universal Language Model Fine-tuning for Text Classification | Jeremy Howard, Sebastian Ruder | We propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a language model. |
32 | Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement | Nina Poerner, Hinrich Schütze, Benjamin Roth | We show empirically that LIMSSE, LRP and DeepLIFT are the most effective explanation methods and recommend them for explaining DNNs in NLP. |
33 | Improving Text-to-SQL Evaluation Methodology | Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Li Zhang, Karthik Ramanathan, Sesh Sadasivam, Rui Zhang, Dragomir Radev | We identify limitations of and propose improvements to current evaluations of text-to-SQL systems. |
34 | Semantic Parsing with Syntax- and Table-Aware SQL Generation | Yibo Sun, Duyu Tang, Nan Duan, Jianshu Ji, Guihong Cao, Xiaocheng Feng, Bing Qin, Ting Liu, Ming Zhou | We present a generative model to map natural language questions into SQL queries. |
35 | Multitask Parsing Across Semantic Representations | Daniel Hershcovich, Omri Abend, Ari Rappoport | In this paper we tackle the challenging task of improving semantic parsing performance, taking UCCA parsing as a test case, and AMR, SDP and Universal Dependencies (UD) parsing as auxiliary tasks. |
36 | Character-Level Models versus Morphology in Semantic Role Labeling | Gözde Gül Şahin, Mark Steedman | In this work, we train various types of SRL models that use word, character and morphology level information and analyze how the performance of characters compares to that of words and morphology for several languages. |
37 | AMR Parsing as Graph Prediction with Latent Alignment | Chunchuan Lyu, Ivan Titov | We introduce a neural parser which treats alignments as latent variables within a joint probabilistic model of concepts, relations and alignments. |
38 | Accurate SHRG-Based Semantic Parsing | Yufei Chen, Weiwei Sun, Xiaojun Wan | We demonstrate that an SHRG-based parser can produce semantic graphs much more accurately than previously shown, by relating synchronous production rules to the syntacto-semantic composition process. |
39 | Using Intermediate Representations to Solve Math Word Problems | Danqing Huang, Jin-Ge Yao, Chin-Yew Lin, Qingyu Zhou, Jian Yin | In this work we present an intermediate meaning representation scheme that tries to reduce this gap. |
40 | Discourse Representation Structure Parsing | Jiangming Liu, Shay B. Cohen, Mirella Lapata | We introduce an open-domain neural semantic parser which generates formal meaning representations in the style of Discourse Representation Theory (DRT; Kamp and Reyle 1993). |
41 | Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms | Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Chunyuan Li, Ricardo Henao, Lawrence Carin | In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. |
42 | ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations | John Wieting, Kevin Gimpel | We describe ParaNMT-50M, a dataset of more than 50 million English-English sentential paraphrase pairs. |
43 | Event2Mind: Commonsense Inference on Events, Intents, and Reactions | Hannah Rashkin, Maarten Sap, Emily Allaway, Noah A. Smith, Yejin Choi | We investigate a new commonsense inference task: given an event described in a short free-form text (“X drinks coffee in the morning”), a system reasons about the likely intents (“X wants to stay awake”) and reactions (“X feels alert”) of the event’s participants. To support this study, we construct a new crowdsourced corpus of 25,000 event phrases covering a diverse range of everyday events and situations. |
44 | Neural Adversarial Training for Semi-supervised Japanese Predicate-argument Structure Analysis | Shuhei Kurita, Daisuke Kawahara, Sadao Kurohashi | In this paper, we propose a novel Japanese PAS analysis model based on semi-supervised adversarial training with a raw corpus. |
45 | Improving Event Coreference Resolution by Modeling Correlations between Event Coreference Chains and Document Topic Structures | Prafulla Kumar Choubey, Ruihong Huang | This paper proposes a novel approach for event coreference resolution that models correlations between event coreference chains and document topical structures through an Integer Linear Programming formulation. |
46 | DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction | Pengda Qin, Weiran Xu, William Yang Wang | In this paper, we introduce an adversarial learning framework, which we named DSGAN, to learn a sentence-level true-positive generator. |
47 | Extracting Relational Facts by an End-to-End Neural Model with Copy Mechanism | Xiangrong Zeng, Daojian Zeng, Shizhu He, Kang Liu, Jun Zhao | In this paper, we propose an end-to-end model based on sequence-to-sequence learning with copy mechanism, which can jointly extract relational facts from sentences of any of these classes. |
48 | Self-regulation: Employing a Generative Adversarial Network to Improve Event Detection | Yu Hong, Wenxuan Zhou, Jingli Zhang, Guodong Zhou, Qiaoming Zhu | In this paper, we propose a self-regulated learning approach by utilizing a generative adversarial network to generate spurious features. |
49 | Context-Aware Neural Model for Temporal Information Extraction | Yuanliang Meng, Anna Rumshisky | We propose a context-aware neural network model for temporal information extraction. |
50 | Temporal Event Knowledge Acquisition via Identifying Narratives | Wenlin Yao, Ruihong Huang | Inspired by the double temporality characteristic of narrative texts, we propose a novel approach for acquiring rich temporal “before/after” event knowledge across sentences in narrative stories. |
51 | Textual Deconvolution Saliency (TDS) : a deep tool box for linguistic analysis | Laurent Vanni, Melanie Ducoffe, Carlos Aguilar, Frederic Precioso, Damon Mayaffre | In this paper, we propose a new strategy, called Text Deconvolution Saliency (TDS), to visualize linguistic information detected by a CNN for text classification. |
52 | Coherence Modeling of Asynchronous Conversations: A Neural Entity Grid Approach | Shafiq Joty, Muhammad Tasnim Mohiuddin, Dat Tien Nguyen | We propose a novel coherence model for written asynchronous conversations (e.g., forums, emails), and show its applications in coherence assessment and thread reconstruction tasks. |
53 | Deep Reinforcement Learning for Chinese Zero Pronoun Resolution | Qingyu Yin, Yu Zhang, Wei-Nan Zhang, Ting Liu, William Yang Wang | In this paper, we show how to integrate these goals, applying deep reinforcement learning to deal with the task. |
54 | Entity-Centric Joint Modeling of Japanese Coreference Resolution and Predicate Argument Structure Analysis | Tomohide Shibata, Sadao Kurohashi | This paper presents an entity-centric joint model for Japanese coreference resolution and predicate argument structure analysis. |
55 | Constraining MGbank: Agreement, L-Selection and Supertagging in Minimalist Grammars | John Torr | This paper reports on two strategies that have been implemented for improving the efficiency and precision of wide-coverage Minimalist Grammar (MG) parsing. |
56 | Not that much power: Linguistic alignment is influenced more by low-level linguistic features rather than social power | Yang Xu, Jeremy Cole, David Reitter | This work characterizes the effect of power on alignment with logistic regression models in two datasets, finding that the effect vanishes or is reversed after controlling for low-level features such as utterance length. |
57 | TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation | Alexander Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Weitai Ting, Robert Tung, Caitlin Westerfield, Dragomir Radev | To address this situation, we introduce TutorialBank, a new, publicly available dataset which aims to facilitate NLP education and research. We are releasing the dataset and present several avenues for further research. |
58 | Give Me More Feedback: Annotating Argument Persuasiveness and Related Attributes in Student Essays | Winston Carlile, Nishant Gurrapadi, Zixuan Ke, Vincent Ng | We present the first corpus of essays that are simultaneously annotated with argument components, argument persuasiveness scores, and attributes of argument components that impact an argument’s persuasiveness. |
59 | Inherent Biases in Reference-based Evaluation for Grammatical Error Correction | Leshem Choshen, Omri Abend | Concretely, we show that LCB incentivizes GEC systems to avoid correcting even when they can generate a valid correction. |
60 | The price of debiasing automatic metrics in natural language evaluation | Arun Chaganty, Stephen Mussmann, Percy Liang | In this paper, we use control variates to combine automatic metrics with human evaluation to obtain an unbiased estimator with lower cost than human evaluation alone. |
61 | Neural Document Summarization by Jointly Learning to Score and Select Sentences | Qingyu Zhou, Nan Yang, Furu Wei, Shaohan Huang, Ming Zhou, Tiejun Zhao | In this paper, we present a novel end-to-end neural network framework for extractive document summarization by jointly learning to score and select sentences. |
62 | Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization | Guokan Shang, Wensi Ding, Zekun Zhang, Antoine Tixier, Polykarpos Meladianos, Michalis Vazirgiannis, Jean-Pierre Lorré | We introduce a novel graph-based framework for abstractive meeting speech summarization that is fully unsupervised and does not rely on any annotations. |
63 | Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting | Yen-Chun Chen, Mohit Bansal | Inspired by how humans summarize long documents, we propose an accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively (i.e., compresses and paraphrases) to generate a concise overall summary. |
64 | Soft Layer-Specific Multi-Task Summarization with Entailment and Question Generation | Han Guo, Ramakanth Pasunuru, Mohit Bansal | We improve these important aspects of abstractive summarization via multi-task learning with the auxiliary tasks of question generation and entailment generation, where the former teaches the summarization model how to look for salient questioning-worthy details, and the latter teaches the model how to rewrite a summary which is a directed-logical subset of the input document. |
65 | Modeling and Prediction of Online Product Review Helpfulness: A Survey | Gerardo Ocampo Diaz, Vincent Ng | This paper provides an overview of the most relevant work in helpfulness prediction and understanding in the past decade, discusses the insights gained from said work, and provides guidelines for future research. |
66 | Mining Cross-Cultural Differences and Similarities in Social Media | Bill Yuchen Lin, Frank F. Xu, Kenny Zhu, Seung-won Hwang | In this paper, we study the problem of computing such cross-cultural differences and similarities. |
67 | Classification of Moral Foundations in Microblog Political Discourse | Kristen Johnson, Dan Goldwasser | The contributions of this work include a dataset annotated for the moral foundations, annotation guidelines, and probabilistic graphical models which show the usefulness of jointly modeling abstract political slogans, as opposed to the unigrams of previous works, with policy frames for the prediction of the morality underlying political tweets. |
68 | Coarse-to-Fine Decoding for Neural Semantic Parsing | Li Dong, Mirella Lapata | In this work, we propose a structure-aware neural architecture which decomposes the semantic parsing process into two stages. |
69 | Confidence Modeling for Neural Semantic Parsing | Li Dong, Chris Quirk, Mirella Lapata | In this work we focus on confidence modeling for neural semantic parsers which are built upon sequence-to-sequence models. |
70 | StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing | Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig | We introduce StructVAE, a variational auto-encoding model for semi-supervised semantic parsing, which learns both from limited amounts of parallel data, and readily-available unlabeled NL utterances. |
71 | Sequence-to-Action: End-to-End Semantic Graph Generation for Semantic Parsing | Bo Chen, Le Sun, Xianpei Han | This paper proposes a neural semantic parsing approach – Sequence-to-Action, which models semantic parsing as an end-to-end semantic graph generation process. |
72 | On the Limitations of Unsupervised Bilingual Dictionary Induction | Anders Søgaard, Sebastian Ruder, Ivan Vulić | Unsupervised machine translation – i.e., not assuming any cross-lingual supervision signal, whether a dictionary, translations, or comparable corpora – seems impossible, but nevertheless, Lample et al. (2017) recently proposed a fully unsupervised machine translation (MT) model. |
73 | A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings | Mikel Artetxe, Gorka Labaka, Eneko Agirre | This work proposes an alternative approach based on a fully unsupervised initialization that explicitly exploits the structural similarity of the embeddings, and a robust self-learning algorithm that iteratively improves this solution. |
74 | A Multi-lingual Multi-task Architecture for Low-resource Sequence Labeling | Ying Lin, Shengqi Yang, Veselin Stoyanov, Heng Ji | We propose a multi-lingual multi-task architecture to develop supervised models with a minimal amount of labeled data for sequence labeling. |
75 | Two Methods for Domain Adaptation of Bilingual Tasks: Delightfully Simple and Broadly Applicable | Viktor Hangya, Fabienne Braune, Alexander Fraser, Hinrich Schütze | We make two contributions. |
76 | Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge | Todor Mihaylov, Anette Frank | We introduce a neural reading comprehension model that integrates external commonsense knowledge, encoded as a key-value memory, in a cloze-style setting. |
77 | Multi-Relational Question Answering from Narratives: Machine Reading and Reasoning in Simulated Worlds | Igor Labutov, Bishan Yang, Anusha Prakash, Amos Azaria | In this work, we look towards a practical use-case of QA over user-instructed knowledge that uniquely combines elements of both structured QA over knowledge bases, and unstructured QA over narrative, introducing the task of multi-relational QA over personal narrative. |
78 | Simple and Effective Multi-Paragraph Reading Comprehension | Christopher Clark, Matt Gardner | We introduce a method of adapting neural paragraph-level question answering models to the case where entire documents are given as input. |
79 | Semantically Equivalent Adversarial Rules for Debugging NLP models | Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin | To automatically detect this behavior for individual instances, we present semantically equivalent adversaries (SEAs) – semantic-preserving perturbations that induce changes in the model’s predictions. |
80 | Style Transfer Through Back-Translation | Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, Alan W Black | This paper introduces a new method for automatic style transfer. |
81 | Generating Fine-Grained Open Vocabulary Entity Type Descriptions | Rajarshi Bhowmik, Gerard de Melo | In this paper, we introduce a dynamic memory-based network that generates a short open vocabulary description of an entity by jointly leveraging induced fact embeddings as well as the dynamic context of the generated sequence of words. |
82 | Hierarchical Neural Story Generation | Angela Fan, Mike Lewis, Yann Dauphin | We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. |
83 | No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling | Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang | Therefore, we propose an Adversarial REward Learning (AREL) framework to learn an implicit reward function from human demonstrations, and then optimize policy search with the learned reward function. |
84 | Bridging Languages through Images with Deep Partial Canonical Correlation Analysis | Guy Rotman, Ivan Vulić, Roi Reichart | In particular, we propose a novel model based on Partial Canonical Correlation Analysis (PCCA). |
85 | Illustrative Language Understanding: Large-Scale Visual Grounding with Image Search | Jamie Kiros, William Chan, Geoffrey Hinton | We introduce Picturebook, a large-scale lookup operation to ground language via ‘snapshots’ of our physical world accessed through image search. |
86 | What Action Causes This? Towards Naive Physical Action-Effect Prediction | Qiaozi Gao, Shaohua Yang, Joyce Chai, Lucy Vanderwende | Towards this goal, this paper introduces a new task on naive physical action-effect prediction, which addresses the relations between concrete actions (expressed in the form of verb-noun pairs) and their effects on the state of the physical world as depicted by images. We collected a dataset for this task and developed an approach that harnesses web image data through distant supervision to facilitate learning for action-effect prediction. |
87 | Transformation Networks for Target-Oriented Sentiment Classification | Xin Li, Lidong Bing, Wai Lam, Bei Shi | After re-examining the drawbacks of attention mechanism and the obstacles that block CNN to perform well in this classification task, we propose a new model that achieves new state-of-the-art results on a few benchmarks. |
88 | Target-Sensitive Memory Networks for Aspect Sentiment Classification | Shuai Wang, Sahisnu Mazumder, Bing Liu, Mianwei Zhou, Yi Chang | To tackle this problem, we propose the target-sensitive memory networks (TMNs). |
89 | Identifying Transferable Information Across Domains for Cross-domain Sentiment Classification | Raksha Sharma, Pushpak Bhattacharyya, Sandipan Dandapat, Himanshu Sharad Bhatt | We present a novel approach based on χ2 test and cosine-similarity between context vector of words to identify polarity preserving significant words across domains. |
90 | Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach | Jingjing Xu, Xu Sun, Qi Zeng, Xiaodong Zhang, Xuancheng Ren, Houfeng Wang, Wenjie Li | To solve this problem, we propose a cycled reinforcement learning method that enables training on unpaired data by collaboration between a neutralization module and an emotionalization module. |
91 | Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference | Boyuan Pan, Yazheng Yang, Zhou Zhao, Yueting Zhuang, Deng Cai, Xiaofei He | We observe that people usually use some discourse markers such as “so” or “but” to represent the logical relationship between two sentences. |
92 | Working Memory Networks: Augmenting Memory Networks with a Relational Reasoning Module | Juan Pavez, Héctor Allende, Héctor Allende-Cid | To solve these issues, we introduce the Working Memory Network, a MemNN architecture with a novel working memory storage and reasoning module. |
93 | Reasoning with Sarcasm by Reading In-Between | Yi Tay, Anh Tuan Luu, Siu Cheung Hui, Jian Su | More specifically, we propose an attention-based neural model that looks in-between instead of across, enabling it to explicitly model contrast and incongruity. |
94 | Adversarial Contrastive Estimation | Avishek Joey Bose, Huan Ling, Yanshuai Cao | In this work, we view contrastive learning as an abstraction of all such methods and augment the negative sampler into a mixture distribution containing an adversarially learned sampler. |
95 | Adaptive Scaling for Sparse Detection in Information Extraction | Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun | In this paper, we propose \textit{adaptive scaling}, an algorithm which can handle the positive sparsity problem and directly optimize over F-measure via dynamic cost-sensitive learning. |
96 | Strong Baselines for Neural Semi-Supervised Learning under Domain Shift | Sebastian Ruder, Barbara Plank | In this paper, we re-evaluate classic general-purpose bootstrapping approaches in the context of neural networks under domain shifts vs. recent neural approaches and propose a novel multi-task tri-training method that reduces the time and space complexity of classic tri-training. |
97 | Fluency Boost Learning and Inference for Neural Grammatical Error Correction | Tao Ge, Furu Wei, Ming Zhou | Most of the neural sequence-to-sequence (seq2seq) models for grammatical error correction (GEC) have two limitations: (1) a seq2seq model may not be well generalized with only limited error-corrected data; (2) a seq2seq model may fail to completely correct a sentence with multiple errors through normal seq2seq inference. |
98 | A Neural Architecture for Automated ICD Coding | Pengtao Xie, Eric Xing | In this paper, we build a neural architecture for automated coding. |
99 | Domain Adaptation with Adversarial Training and Graph Embeddings | Firoj Alam, Shafiq Joty, Muhammad Imran | In this paper, we study the problem of classifying social media posts during a crisis event (e.g., Earthquake). |
100 | TDNN: A Two-stage Deep Neural Network for Prompt-independent Automated Essay Scoring | Cancan Jin, Ben He, Kai Hui, Le Sun | In particular, in the first stage, using the rated essays for non-target prompts as the training data, a shallow model is learned to select essays with an extreme quality for the target prompt, serving as pseudo training data; in the second stage, an end-to-end hybrid deep model is proposed to learn a prompt-dependent rating model consuming the pseudo training data from the first step. |
101 | Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation | Tiancheng Zhao, Kyusong Lee, Maxine Eskenazi | We present an unsupervised discrete sentence representation learning method that can integrate with any existing encoder-decoder dialog models for interpretable response generation. |
102 | Learning to Control the Specificity in Neural Response Generation | Ruqing Zhang, Jiafeng Guo, Yixing Fan, Yanyan Lan, Jun Xu, Xueqi Cheng | To address this problem, we propose a novel controlled response generation mechanism to handle different utterance-response relationships in terms of specificity. |
103 | Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network | Xiangyang Zhou, Lu Li, Daxiang Dong, Yi Liu, Ying Chen, Wayne Xin Zhao, Dianhai Yu, Hua Wu | In this paper, we investigate matching a response with its multi-turn context using dependency information based entirely on attention. |
104 | MojiTalk: Generating Emotional Responses at Scale | Xianda Zhou, William Yang Wang | In this paper, we take a more radical approach: we exploit the idea of leveraging Twitter data that are naturally labeled with emojis. We collect a large corpus of Twitter conversations that include emojis in the response and assume the emojis convey the underlying emotions of the sentence. |
105 | Taylor’s law for Human Linguistic Sequences | Tatsuru Kobayashi, Kumiko Tanaka-Ishii | This article describes a new way to quantify Taylor’s law in natural language and conducts Taylor analysis of over 1100 texts across 14 languages. |
106 | A Framework for Representing Language Acquisition in a Population Setting | Jordan Kodner, Christopher Cerezo Falco | We compare the strengths and weaknesses of existing approaches and propose a new analytic framework which combines previous network models’ ability to capture realistic social structure with more practical and elegant computational properties. |
107 | Prefix Lexicalization of Synchronous CFGs using Synchronous TAG | Logan Born, Anoop Sarkar | We show that an epsilon-free, chain-free synchronous context-free grammar (SCFG) can be converted into a weakly equivalent synchronous tree-adjoining grammar (STAG) which is prefix lexicalized. |
108 | Straight to the Tree: Constituency Parsing with Neural Syntactic Distance | Yikang Shen, Zhouhan Lin, Athul Paul Jacob, Alessandro Sordoni, Aaron Courville, Yoshua Bengio | In this work, we propose a novel constituency parsing scheme. |
109 | Gaussian Mixture Latent Vector Grammars | Yanpeng Zhao, Liwen Zhang, Kewei Tu | We introduce Latent Vector Grammars (LVeGs), a new framework that extends latent variable grammars such that each nonterminal symbol is associated with a continuous vector space representing the set of (infinitely many) subtypes of the nonterminal. |
110 | Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples | Vidur Joshi, Matthew Peters, Mark Hopkins | For more syntactically distant domains, we provide a simple way to adapt a parser using only dozens of partial annotations. |
111 | Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations | Vered Shwartz, Ido Dagan | We propose a neural model that generalizes better by representing paraphrases in a continuous space, generalizing for both unseen noun-compounds and rare paraphrases. |
112 | Searching for the X-Factor: Exploring Corpus Subjectivity for Word Embeddings | Maksim Tkachenko, Chong Cher Chia, Hady Lauw | Through systematic comparative analyses, we establish this to be the case indeed. |
113 | Word Embedding and WordNet Based Metaphor Identification and Interpretation | Rui Mao, Chenghua Lin, Frank Guerin | In this paper, we propose an unsupervised learning method that identifies and interprets metaphors at word-level without any preprocessing, outperforming strong baselines in the metaphor identification task. |
114 | Incorporating Latent Meanings of Morphological Compositions to Enhance Word Embeddings | Yang Xu, Jiawei Liu, Wei Yang, Liusheng Huang | In this paper, we explore to employ the latent meanings of morphological compositions of words to train and enhance word embeddings. |
115 | A Stochastic Decoder for Neural Machine Translation | Philip Schulz, Wilker Aziz, Trevor Cohn | To this end, we present a deep generative model of machine translation which incorporates a chain of latent variables, in order to account for local lexical and syntactic variation in parallel corpora. |
116 | Forest-Based Neural Machine Translation | Chunpeng Ma, Akihiro Tamura, Masao Utiyama, Tiejun Zhao, Eiichiro Sumita | This paper proposes a forest-based NMT method that translates a linearized packed forest under a simple sequence-to-sequence framework (i.e., a forest-to-sequence NMT model). |
117 | Context-Aware Neural Machine Translation Learns Anaphora Resolution | Elena Voita, Pavel Serdyukov, Rico Sennrich, Ivan Titov | We introduce a context-aware neural machine translation model designed in such way that the flow of information from the extended context to the translation model can be controlled and analyzed. |
118 | Document Context Neural Machine Translation with Memory Networks | Sameen Maruf, Gholamreza Haffari | We present a document-level neural machine translation model which takes both source and target document context into account using memory networks. |
119 | Which Melbourne? Augmenting Geocoding with Maps | Milan Gritta, Mohammad Taher Pilehvar, Nigel Collier | We introduce a geocoder (location mention disambiguator) that achieves state-of-the-art (SOTA) results on three diverse datasets by exploiting the implicit lexical clues. We also introduce an open-source dataset for geoparsing of news events covering global disease outbreaks and epidemics to help future evaluation in geoparsing. |
120 | Learning Prototypical Goal Activities for Locations | Tianyu Jiang, Ellen Riloff | Our research aims to learn goal-acts for specific locations using a text corpus and semi-supervised learning. |
121 | Guess Me if You Can: Acronym Disambiguation for Enterprises | Yang Li, Bo Zhao, Ariel Fuxman, Fangbo Tao | In this work we propose an end-to-end framework to tackle all these challenges. |
122 | A Multi-Axis Annotation Scheme for Event Temporal Relations | Qiang Ning, Hao Wu, Dan Roth | This paper proposes a new multi-axis modeling to better capture the temporal structure of events. |
123 | Exemplar Encoder-Decoder for Neural Conversation Generation | Gaurav Pandey, Danish Contractor, Vineet Kumar, Sachindra Joshi | In this paper we present the Exemplar Encoder-Decoder network (EED), a novel conversation model that learns to utilize *similar* examples from training data to generate responses. |
124 | DialSQL: Dialogue Based Structured Query Generation | Izzeddin Gur, Semih Yavuz, Yu Su, Xifeng Yan | Rather than solely relying on algorithmic innovations, in this work, we introduce DialSQL, a dialogue-based structured query generation framework that leverages human intelligence to boost the performance of existing algorithms via user interaction. |
125 | Conversations Gone Awry: Detecting Early Signs of Conversational Failure | Justine Zhang, Jonathan Chang, Cristian Danescu-Niculescu-Mizil, Lucas Dixon, Yiqing Hua, Dario Taraborelli, Nithum Thain | In this work, we introduce the task of predicting from the very start of a conversation whether it will get out of hand. |
126 | Are BLEU and Meaning Representation in Opposition? | Ondřej Cífka, Ondřej Bojar | We propose several variations of the attentive NMT architecture bringing this meeting point back. |
127 | Automatic Metric Validation for Grammatical Error Correction | Leshem Choshen, Omri Abend | We propose MAEGE, an automatic methodology for GEC metric validation, that overcomes many of the difficulties in the existing methodology. |
128 | The Hitchhiker’s Guide to Testing Statistical Significance in Natural Language Processing | Rotem Dror, Gili Baumer, Segev Shlomov, Roi Reichart | Based on this discussion we propose a simple practical protocol for statistical significance test selection in NLP setups and accompany this protocol with a brief survey of the most relevant tests. |
129 | Distilling Knowledge for Search-based Structured Prediction | Yijia Liu, Wanxiang Che, Huaipeng Zhao, Bing Qin, Ting Liu | In this paper, we distill an ensemble of multiple models trained with different initialization into a single model. |
130 | Stack-Pointer Networks for Dependency Parsing | Xuezhe Ma, Zecong Hu, Jingzhou Liu, Nanyun Peng, Graham Neubig, Eduard Hovy | We introduce a novel architecture for dependency parsing: stack-pointer networks (StackPtr). |
131 | Twitter Universal Dependency Parsing for African-American and Mainstream American English | Su Lin Blodgett, Johnny Wei, Brendan O’Connor | We describe our standards for handling Twitter- and AAE-specific features and evaluate a variety of cross-domain strategies for improving parsing with no, or very little, in-domain labeled data, including a new data synthesis approach. |
132 | LSTMs Can Learn Syntax-Sensitive Dependencies Well, But Modeling Structure Makes Them Better | Adhiguna Kuncoro, Chris Dyer, John Hale, Dani Yogatama, Stephen Clark, Phil Blunsom | Using the same diagnostic, we show that, in fact, LSTMs do succeed in learning such dependencies, provided they have enough capacity. |
133 | Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures | Wenqiang Lei, Xisen Jin, Min-Yen Kan, Zhaochun Ren, Xiangnan He, Dawei Yin | We propose a novel, holistic, extendable framework based on a single sequence-to-sequence (seq2seq) model which can be optimized with supervised or reinforcement learning. |
134 | An End-to-end Approach for Handling Unknown Slot Values in Dialogue State Tracking | Puyang Xu, Qi Hu | We describe in this paper an E2E architecture based on the pointer network (PtrNet) that can effectively extract unknown slot values while still obtaining state-of-the-art accuracy on the standard DSTC2 benchmark. |
135 | Global-Locally Self-Attentive Encoder for Dialogue State Tracking | Victor Zhong, Caiming Xiong, Richard Socher | In this paper, we propose the Global-Locally Self-Attentive Dialogue State Tracker (GLAD), which learns representations of the user utterance and previous system actions with global-local modules. |
136 | Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems | Andrea Madotto, Chien-Sheng Wu, Pascale Fung | In this paper, we propose a novel yet simple end-to-end differentiable model called memory-to-sequence (Mem2Seq) to address this issue. |
137 | Tailored Sequence to Sequence Models to Different Conversation Scenarios | Hainan Zhang, Yanyan Lan, Jiafeng Guo, Jun Xu, Xueqi Cheng | In this paper, we propose two tailored optimization criteria for Seq2Seq to different conversation scenarios, i.e., the maximum generated likelihood for specific-requirement scenario, and the conditional value-at-risk for diverse-requirement scenario. |
138 | Knowledge Diffusion for Neural Dialogue Generation | Shuman Liu, Hongshen Chen, Zhaochun Ren, Yang Feng, Qun Liu, Dawei Yin | In this paper, we propose a neural knowledge diffusion (NKD) model to introduce knowledge into dialogue generation. |
139 | Generating Informative Responses with Controlled Sentence Function | Pei Ke, Jian Guan, Minlie Huang, Xiaoyan Zhu | In this paper, we present a model to generate informative responses with controlled sentence function. |
140 | Sentiment Adaptive End-to-End Dialog Systems | Weiyan Shi, Zhou Yu | Therefore, we propose to include user sentiment obtained through multimodal information (acoustic, dialogic and textual), in the end-to-end learning framework to make systems more user-adaptive and effective. |
141 | Embedding Learning Through Multilingual Concept Induction | Philipp Dufter, Mengjie Zhao, Martin Schmitt, Alexander Fraser, Hinrich Schütze | We present a new method for estimating vector space representations of words: embedding learning by concept induction. |
142 | Isomorphic Transfer of Syntactic Structures in Cross-Lingual NLP | Edoardo Maria Ponti, Roi Reichart, Anna Korhonen, Ivan Vulić | In this paper, we measure cross-lingual syntactic variation, or anisomorphism, in the UD treebank collection, considering both morphological and structural properties. |
143 | Language Modeling for Code-Mixing: The Role of Linguistic Theory based Synthetic Data | Adithya Pratapa, Gayatri Bhat, Monojit Choudhury, Sunayana Sitaram, Sandipan Dandapat, Kalika Bali | We present a computational technique for creation of grammatically valid artificial CM data based on the Equivalence Constraint Theory. |
144 | Chinese NER Using Lattice LSTM | Yue Zhang, Jie Yang | We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. |
145 | Nugget Proposal Networks for Chinese Event Detection | Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun | In this paper, we propose Nugget Proposal Networks (NPNs), which can solve the word-trigger mismatch problem by directly proposing entire trigger nuggets centered at each character regardless of word boundaries. |
146 | Higher-order Relation Schema Induction using Tensor Factorization with Back-off and Aggregation | Madhav Nimishakavi, Manish Gupta, Partha Talukdar | In this paper, we propose Tensor Factorization with Back-off and Aggregation (TFBA), a novel framework for the HRSI problem. |
147 | Discovering Implicit Knowledge with Unary Relations | Michael Glass, Alfio Gliozzo | In this paper we propose a new methodology to identify relations between two entities, consisting of detecting a very large number of unary relations, and using them to infer missing entities. |
148 | Improving Entity Linking by Modeling Latent Relations between Mentions | Phong Le, Ivan Titov | Unlike previous approaches, which relied on supervised systems or heuristics to predict these relations, we treat relations as latent variables in our neural entity-linking model. |
149 | Dating Documents using Graph Convolution Networks | Shikhar Vashishth, Shib Sankar Dasgupta, Swayambhu Nath Ray, Partha Talukdar | In this paper, we propose NeuralDater, a Graph Convolutional Network (GCN) based document dating approach which jointly exploits the syntactic and temporal graph structures of a document in a principled way. |
150 | A Graph-to-Sequence Model for AMR-to-Text Generation | Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea | We introduce a neural graph-to-sequence model, using a novel LSTM structure for directly encoding graph-level semantics. |
151 | GTR-LSTM: A Triple Encoder for Sentence Generation from RDF Data | Bayu Distiawan Trisedya, Jianzhong Qi, Rui Zhang, Wei Wang | To preserve as much information from RDF triples as possible, we propose a novel graph-based triple encoder. |
152 | Learning to Write with Cooperative Discriminators | Ari Holtzman, Jan Buys, Maxwell Forbes, Antoine Bosselut, David Golub, Yejin Choi | We propose a unified learning framework that collectively addresses all the above issues by composing a committee of discriminators that can guide a base RNN generator towards more globally coherent generations. |
153 | A Neural Approach to Pun Generation | Zhiwei Yu, Jiwei Tan, Xiaojun Wan | In this paper, we propose neural network models for homographic pun generation, and they can generate puns without requiring any pun data for training. |
154 | Learning to Generate Move-by-Move Commentary for Chess Games from Large-Scale Social Forum Data | Harsh Jhamtani, Varun Gangal, Eduard Hovy, Graham Neubig, Taylor Berg-Kirkpatrick | We introduce a new large-scale chess commentary dataset and propose methods to generate commentary for individual moves in a chess game. |
155 | From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction | Zihang Dai, Qizhe Xie, Eduard Hovy | In this work, we study the credit assignment problem in reward augmented maximum likelihood (RAML) learning, and establish a theoretical equivalence between the token-level counterpart of RAML and the entropy regularized reinforcement learning. |
156 | DuoRC: Towards Complex Language Understanding with Paraphrased Reading Comprehension | Amrita Saha, Rahul Aralikatte, Mitesh M. Khapra, Karthik Sankaranarayanan | We propose DuoRC, a novel dataset for Reading Comprehension (RC) that motivates several new challenges for neural approaches in language understanding beyond those offered by existing RC datasets. |
157 | Stochastic Answer Networks for Machine Reading Comprehension | Xiaodong Liu, Yelong Shen, Kevin Duh, Jianfeng Gao | We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension. |
158 | Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering | Wei Wang, Ming Yan, Chen Wu | This paper describes a novel hierarchical attention network for reading comprehension style question answering, which aims to answer questions for a given narrative paragraph. |
159 | Joint Training of Candidate Extraction and Answer Selection for Reading Comprehension | Zhen Wang, Jiachen Liu, Xinyan Xiao, Yajuan Lyu, Tian Wu | In this paper, we formulate reading comprehension as an extract-then-select two-stage procedure. |
160 | Efficient and Robust Question Answering from Minimal Context over Documents | Sewon Min, Victor Zhong, Richard Socher, Caiming Xiong | In this paper, we study the minimal context required to answer the question, and find that most questions in existing datasets can be answered with a small set of sentences. |
161 | Denoising Distantly Supervised Open-Domain Question Answering | Yankai Lin, Haozhe Ji, Zhiyuan Liu, Maosong Sun | To address these issues, we propose a novel DS-QA model which employs a paragraph selector to filter out those noisy paragraphs and a paragraph reader to extract the correct answer from those denoised paragraphs. |
162 | Question Condensing Networks for Answer Selection in Community Question Answering | Wei Wu, Xu Sun, Houfeng Wang | In this paper, we propose the Question Condensing Networks (QCN) to make use of the subject-body relationship of community questions. |
163 | Towards Robust Neural Machine Translation | Yong Cheng, Zhaopeng Tu, Fandong Meng, Junjie Zhai, Yang Liu | In this paper, we propose to improve the robustness of NMT models with adversarial stability training. |
164 | Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings | Shaohui Kuang, Junhui Li, António Branco, Weihua Luo, Deyi Xiong | In this paper, we seek to somewhat shorten the distance between source and target words in that procedure, and thus strengthen their association, by means of a method we term bridging source and target word embeddings. |
165 | Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning | Julia Kreutzer, Joshua Uyheng, Stefan Riezler | We present a study on reinforcement learning (RL) from human bandit feedback for sequence-to-sequence learning, exemplified by the task of bandit neural machine translation (NMT). |
166 | Accelerating Neural Transformer via an Average Attention Network | Biao Zhang, Deyi Xiong, Jinsong Su | To alleviate this issue, we propose an average attention network as an alternative to the self-attention network in the decoder of the neural Transformer. |
167 | How Much Attention Do You Need? A Granular Analysis of Neural Machine Translation Architectures | Tobias Domhan | In this work we take a fine-grained look at the different architectures for NMT. |
168 | Weakly Supervised Semantic Parsing with Abstract Examples | Omer Goldman, Veronica Latcinnik, Ehud Nave, Amir Globerson, Jonathan Berant | In this work we propose that in closed worlds with clear semantic types, one can substantially alleviate these problems by utilizing an abstract representation, where tokens in both the language utterance and program are lifted to an abstract form. |
169 | Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback | Carolin Lawrence, Stefan Riezler | We show how to apply this learning framework to neural semantic parsing. |
170 | AMR dependency parsing with a typed semantic algebra | Jonas Groschwitz, Matthias Lindemann, Meaghan Fowlie, Mark Johnson, Alexander Koller | We present a semantic parser for Abstract Meaning Representations which learns to parse strings into tree representations of the compositional structure of an AMR graph. |
171 | Sequence-to-sequence Models for Cache Transition Systems | Xiaochang Peng, Linfeng Song, Daniel Gildea, Giorgio Satta | In this paper, we present a sequence-to-sequence based approach for mapping natural language sentences to AMR semantic graphs. |
172 | Batch IS NOT Heavy: Learning Word Representations From All Samples | Xin Xin, Fajie Yuan, Xiangnan He, Joemon M. Jose | In this work, we propose AllVec that uses batch gradient learning to generate word representations from all training samples. |
173 | Backpropagating through Structured Argmax using a SPIGOT | Hao Peng, Sam Thomson, Noah A. Smith | We introduce structured projection of intermediate gradients (SPIGOT), a new method for backpropagating through neural networks that include hard-decision structured predictions (e.g., parsing) in intermediate layers. |
174 | Learning How to Actively Learn: A Deep Imitation Learning Approach | Ming Liu, Wray Buntine, Gholamreza Haffari | We introduce a method that learns an AL “policy” using “imitation learning” (IL). |
175 | Training Classifiers with Natural Language Explanations | Braden Hancock, Paroma Varma, Stephanie Wang, Martin Bringmann, Percy Liang, Christopher Ré | In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. |
176 | Did the Model Understand the Question? | Pramod Kaushik Mudrakarta, Ankur Taly, Mukund Sundararajan, Kedar Dhamdhere | We analyze state-of-the-art deep learning models for three tasks: question answering on (1) images, (2) tables, and (3) passages of text. |
177 | Harvesting Paragraph-level Question-Answer Pairs from Wikipedia | Xinya Du, Claire Cardie | We propose a neural network approach that incorporates coreference knowledge via a novel gating mechanism. We apply our system (composed of an answer span extraction system and the passage-level QG system) to the 10,000 top ranking Wikipedia articles and create a corpus of over one million question-answer pairs. |
178 | Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification | Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, Haifeng Wang | To address this problem, we propose an end-to-end neural model that enables those answer candidates from different passages to verify each other based on their content representations. |
179 | Language Generation via DAG Transduction | Yajie Ye, Weiwei Sun, Xiaojun Wan | In this paper, we propose a novel DAG transducer to perform graph-to-program transformation. |
180 | A Distributional and Orthographic Aggregation Model for English Derivational Morphology | Daniel Deutsch, John Hewitt, Dan Roth | In this work, we tackle the task of derived word generation. |
181 | Deep-speare: A joint neural model of poetic language, meter and rhyme | Jey Han Lau, Trevor Cohn, Timothy Baldwin, Julian Brooke, Adam Hammond | In this paper, we propose a joint architecture that captures language, rhyme and meter for sonnet modelling. |
182 | NeuralREG: An end-to-end approach to referring expression generation | Thiago Castro Ferreira, Diego Moussallem, Ákos Kádár, Sander Wubben, Emiel Krahmer | In this paper, we present a new approach (NeuralREG), relying on deep neural networks, which makes decisions about form and content in one go without explicit feature extraction. |
183 | Stock Movement Prediction from Tweets and Historical Prices | Yumo Xu, Shay B. Cohen | We treat these three complexities and present a novel deep generative model jointly exploiting text and price signals for this task. |
184 | Rumor Detection on Twitter with Tree-structured Recursive Neural Networks | Jing Ma, Wei Gao, Kam-Fai Wong | In this work, we try to learn discriminative features from tweet content by following their non-sequential propagation structure and generate more powerful representations for identifying different types of rumors. |
185 | Visual Attention Model for Name Tagging in Multimodal Social Media | Di Lu, Leonardo Neves, Vitor Carvalho, Ning Zhang, Heng Ji | In this paper, we explore the task of name tagging in multimodal social media posts. |
186 | Multimodal Named Entity Disambiguation for Noisy Social Media Posts | Seungwhan Moon, Leonardo Neves, Vitor Carvalho | We introduce the new Multimodal Named Entity Disambiguation (MNED) task for multimodal social media posts such as Snapchat or Instagram captions, which are composed of short captions with accompanying images. To this end, we build a new dataset called SnapCaptionsKB, a collection of Snapchat image captions submitted to public and crowd-sourced stories, with named entity mentions fully annotated and linked to entities in an external knowledge base. |
187 | Semi-supervised User Geolocation via Graph Convolutional Networks | Afshin Rahimi, Trevor Cohn, Timothy Baldwin | In this paper, we propose GCN, a multiview geolocation model based on Graph Convolutional Networks, that uses both text and network context. |
188 | Document Modeling with External Attention for Sentence Extraction | Shashi Narayan, Ronald Cardenas, Nikos Papasarantopoulos, Shay B. Cohen, Mirella Lapata, Jiangsheng Yu, Yi Chang | We propose to use external information to improve document modeling for problems that can be framed as sentence extraction. |
189 | Neural Models for Documents with Metadata | Dallas Card, Chenhao Tan, Noah A. Smith | In this paper, we build on recent advances in variational inference methods and propose a general neural framework, based on topic models, to enable flexible incorporation of metadata and allow for rapid exploration of alternative models. |
190 | NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing | Dinghan Shen, Qinliang Su, Paidamoyo Chapfuwa, Wenlin Wang, Guoyin Wang, Ricardo Henao, Lawrence Carin | In this paper, we present an *end-to-end* Neural Architecture for Semantic Hashing (NASH), where the binary hashing codes are treated as *Bernoulli* latent variables. |
191 | Large-Scale QA-SRL Parsing | Nicholas FitzGerald, Julian Michael, Luheng He, Luke Zettlemoyer | We present a new large-scale corpus of Question-Answer driven Semantic Role Labeling (QA-SRL) annotations, and the first high-quality QA-SRL parser. |
192 | Syntax for Semantic Role Labeling, To Be, Or Not To Be | Shexia He, Zuchao Li, Hai Zhao, Hongxiao Bai | We propose an enhanced argument labeling model, accompanied by an extended k-order argument pruning algorithm, for effectively exploiting syntactic information. |
193 | Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation | Alane Suhr, Yoav Artzi | We propose a learning approach for mapping context-dependent sequential instructions to actions. |
194 | Marrying Up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding | Bingfeng Luo, Yansong Feng, Zheng Wang, Songfang Huang, Rui Yan, Dongyan Zhao | In this paper, we ask the question: “Can we combine a neural network (NN) with regular expressions (RE) to improve supervised learning for NLP?” |
195 | Token-level and sequence-level loss smoothing for RNN language models | Maha Elbayad, Laurent Besacier, Jakob Verbeek | We extend this approach to token-level loss smoothing, and propose improvements to the sequence-level smoothing approach. |
196 | Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers | Georgios Spithourakis, Sebastian Riedel | In this paper, we explore different strategies for modelling numerals with language models, such as memorisation and digit-by-digit composition, and propose a novel neural architecture that uses a continuous probability density function to model numerals from an open vocabulary. |
197 | To Attend or not to Attend: A Case Study on Syntactic Structures for Semantic Relatedness | Amulya Gupta, Zhu Zhang | Our models are evaluated on two semantic relatedness tasks: semantic relatedness scoring for sentence pairs (SemEval 2012, Task 6 and SemEval 2014, Task 1) and paraphrase detection for question pairs (Quora, 2017). |
198 | What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties | Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni | We introduce here 10 probing tasks designed to capture simple linguistic features of sentences, and we use them to study embeddings generated by three different encoders trained in eight distinct ways, uncovering intriguing properties of both encoders and training methods. |
199 | Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning | Pengda Qin, Weiran Xu, William Yang Wang | To do this, our paper describes a radical solution: we explore a deep reinforcement learning strategy to generate the false-positive indicator, where we automatically recognize false positives for each relation type without any supervised information. |
200 | Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder | Ryo Takahashi, Ran Tian, Kentaro Inui | In this paper we investigate a dimension reduction technique by training relations jointly with an autoencoder, which is expected to better capture compositional constraints. |
201 | Zero-Shot Transfer Learning for Event Extraction | Lifu Huang, Heng Ji, Kyunghyun Cho, Ido Dagan, Sebastian Riedel, Clare Voss | Most previous supervised event extraction methods have relied on features derived from manual annotations, and thus cannot be applied to new event types without extra annotation effort. |
202 | Recursive Neural Structural Correspondence Network for Cross-domain Aspect and Opinion Co-Extraction | Wenya Wang, Sinno Jialin Pan | In this paper, we develop a novel recursive neural network that can effectively reduce domain shift at the word level through syntactic relations. |
203 | Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning | Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Kam-Fai Wong | To address these issues, we present Deep Dyna-Q, which to our knowledge is the first deep RL framework that integrates planning for task-completion dialogue policy learning. |
204 | Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders | Yansen Wang, Chenyi Liu, Minlie Huang, Liqiang Nie | We observe that a good question is a natural composition of interrogatives, topic words, and ordinary words. |
205 | Personalizing Dialogue Agents: I have a dog, do you have pets too? | Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, Jason Weston | In this work we present the task of making chit-chat more engaging by conditioning on profile information. |
206 | Efficient Large-Scale Neural Domain Classification with Personalized Attention | Young-Bum Kim, Dongchan Kim, Anjishnu Kumar, Ruhi Sarikaya | In this paper, we explore the task of mapping spoken language utterances to one of thousands of natural language understanding domains in intelligent personal digital assistants (IPDAs). |
207 | Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment | Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, Ivan Marsic | Addressing such issues, we introduce a hierarchical multimodal architecture with attention and word-level fusion to classify utterance-level sentiment and emotion from text and audio data. |
208 | Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph | AmirAli Bagher Zadeh, Paul Pu Liang, Soujanya Poria, Erik Cambria, Louis-Philippe Morency | In this paper we introduce CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI), the largest dataset of sentiment analysis and emotion recognition to date. |
209 | Efficient Low-rank Multimodal Fusion With Modality-Specific Factors | Zhun Liu, Ying Shen, Varun Bharadhwaj Lakshminarasimhan, Paul Pu Liang, AmirAli Bagher Zadeh, Louis-Philippe Morency | In this paper, we propose the Low-rank Multimodal Fusion method, which performs multimodal fusion using low-rank tensors to improve efficiency. |
210 | Discourse Coherence: Concurrent Explicit and Implicit Relations | Hannah Rohde, Alexander Johnson, Nathan Schneider, Bonnie Webber | Our prior work suggests that multiple discourse relations can be simultaneously operative between two segments for reasons not predicted by the literature. |
211 | A Spatial Model for Extracting and Visualizing Latent Discourse Structure in Text | Shashank Srivastava, Nebojsa Jojic | We present a generative probabilistic model of documents as sequences of sentences, and show that inference in it can lead to extraction of long-range latent discourse structure from a collection of documents. |
212 | Joint Reasoning for Temporal and Causal Relations | Qiang Ning, Zhili Feng, Hao Wu, Dan Roth | This paper presents a joint inference framework for them using constrained conditional models (CCMs). |
213 | Modeling Naive Psychology of Characters in Simple Commonsense Stories | Hannah Rashkin, Antoine Bosselut, Maarten Sap, Kevin Knight, Yejin Choi | To facilitate research addressing this challenge, we introduce a new annotation framework to explain naive psychology of story characters as fully-specified chains of mental states with respect to motivations and emotional reactions. Our work presents a new large-scale dataset with rich low-level annotations and establishes baseline performance on several new tasks, suggesting avenues for future research. |
214 | A Deep Relevance Model for Zero-Shot Document Filtering | Chenliang Li, Wei Zhou, Feng Ji, Yu Duan, Haiqing Chen | In this paper, we propose a novel deep relevance model for zero-shot document filtering, named DAZER. |
215 | Disconnected Recurrent Neural Networks for Text Categorization | Baoxin Wang | In this paper, we present a novel model named disconnected recurrent neural network (DRNN), which incorporates position-invariance into RNN. |
216 | Joint Embedding of Words and Labels for Text Classification | Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin | We introduce an attention framework that measures the compatibility of embeddings between text sequences and labels. |
217 | Neural Sparse Topical Coding | Min Peng, Qianqian Xie, Yanchun Zhang, Hua Wang, Xiuzhen Zhang, Jimin Huang, Gang Tian | We propose Neural Sparse Topical Coding (NSTC), a novel neural model based on the sparsity-enhanced topic model Sparse Topical Coding (STC). |
218 | Document Similarity for Texts of Varying Lengths via Hidden Topics | Hongyu Gong, Tarek Sakakini, Suma Bhat, JinJun Xiong | In this paper, we present a document matching approach to bridge this gap, by comparing the texts in a common space of hidden topics. |
219 | Eyes are the Windows to the Soul: Predicting the Rating of Text Quality Using Gaze Behaviour | Sandeep Mathias, Diptesh Kanojia, Kevin Patel, Samarth Agrawal, Abhijit Mishra, Pushpak Bhattacharyya | In this paper, we show that gaze behaviour does indeed help in effectively predicting the rating of text quality. |
220 | Multi-Input Attention for Unsupervised OCR Correction | Rui Dong, David Smith | We propose a novel approach to OCR post-correction that exploits repeated texts in large corpora both as a source of noisy target outputs for unsupervised training and as a source of evidence when decoding. |
221 | Building Language Models for Text with Named Entities | Md Rizwan Parvez, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang | In this paper, we propose a novel and effective approach to building a language model which can learn the entity names by leveraging their entity type information. We also introduce two benchmark datasets based on recipes and Java programming codes, on which we evaluate the proposed model. |
222 | hyperdoc2vec: Distributed Representations of Hypertext Documents | Jialong Han, Yan Song, Wayne Xin Zhao, Shuming Shi, Haisong Zhang | In this paper, we propose a general embedding approach for hyper-documents, namely, hyperdoc2vec, along with four criteria characterizing necessary information that hyper-document embedding models should preserve. |
223 | Entity-Duet Neural Ranking: Understanding the Role of Knowledge Graph Semantics in Neural Information Retrieval | Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu | This paper presents the Entity-Duet Neural Ranking Model (EDRM), which introduces knowledge graphs to neural search systems. |
224 | Neural Natural Language Inference Models Enhanced with External Knowledge | Qian Chen, Xiaodan Zhu, Zhen-Hua Ling, Diana Inkpen, Si Wei | In this paper, we enrich the state-of-the-art neural natural language inference models with external knowledge. |
225 | AdvEntuRe: Adversarial Training for Textual Entailment with Knowledge-Guided Examples | Dongyeop Kang, Tushar Khot, Ashish Sabharwal, Eduard Hovy | We consider the problem of learning textual entailment models with limited supervision (5K-10K training examples), and present two complementary approaches for it. |
226 | Subword-level Word Vector Representations for Korean | Sungjoon Park, Jeongmin Byun, Sion Baek, Yongseok Cho, Alice Oh | In this paper, we look at improving distributed word representations for Korean using knowledge about the unique linguistic structure of Korean. To evaluate the vectors, we develop Korean test sets for word similarity and analogy and make them publicly available. |
227 | Incorporating Chinese Characters of Words for Lexical Sememe Prediction | Huiming Jin, Hao Zhu, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Fen Lin, Leyu Lin | To address this issue for Chinese, we propose a novel framework to take advantage of both internal character information and external context information of words. |
228 | SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment | Jisun An, Haewoon Kwak, Yong-Yeol Ahn | Here, we propose SemAxis, a simple yet powerful framework to characterize word semantics using many semantic axes in word-vector spaces beyond sentiment. |
229 | End-to-End Reinforcement Learning for Automatic Taxonomy Induction | Yuning Mao, Xiang Ren, Jiaming Shen, Xiaotao Gu, Jiawei Han | We present a novel end-to-end reinforcement learning approach to automatic taxonomy induction from a set of terms. |
230 | Incorporating Glosses into Neural Word Sense Disambiguation | Fuli Luo, Tianyu Liu, Qiaolin Xia, Baobao Chang, Zhifang Sui | In this paper, we integrate the context and glosses of the target word into a unified framework in order to make full use of both labeled data and lexical knowledge. |
231 | Bilingual Sentiment Embeddings: Joint Projection of Sentiment Across Languages | Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde | We introduce Bilingual Sentiment Embeddings (BLSE), which jointly represent sentiment information in a source and target language. |
232 | Learning Domain-Sensitive and Sentiment-Aware Word Embeddings | Bei Shi, Zihao Fu, Lidong Bing, Wai Lam | We propose a new method for learning domain-sensitive and sentiment-aware embeddings that simultaneously capture the information of sentiment semantics and domain sensitivity of individual words. |
233 | Cross-Domain Sentiment Classification with Target Domain Specific Information | Minlong Peng, Qi Zhang, Yu-gang Jiang, Xuanjing Huang | In this work, we propose a method to simultaneously extract domain-specific and domain-invariant representations and train a classifier on each representation. We also introduce a small amount of labeled target-domain data for learning domain-specific information. |
234 | Aspect Based Sentiment Analysis with Gated Convolutional Networks | Wei Xue, Tao Li | We propose a model based on convolutional neural networks and gating mechanisms, which is more accurate and efficient. |
235 | A Helping Hand: Transfer Learning for Deep Sentiment Analysis | Xin Dong, Gerard de Melo | In this work, we present an approach to feed generic cues into the training process of such networks, leading to better generalization abilities given limited training data. |
236 | Cold-Start Aware User and Product Attention for Sentiment Classification | Reinald Kim Amplayo, Jihyeok Kim, Sua Sung, Seung-won Hwang | In this paper, we present Hybrid Contextualized Sentiment Classifier (HCSC), which contains two modules: (1) a fast word encoder that returns word vectors embedded with short and long range dependency features; and (2) Cold-Start Aware Attention (CSAA), an attention mechanism that considers the existence of cold-start problem when attentively pooling the encoded word vectors. |
237 | Modeling Deliberative Argumentation Strategies on Wikipedia | Khalid Al-Khatib, Henning Wachsmuth, Kevin Lang, Jakob Herpel, Matthias Hagen, Benno Stein | In this paper, we present a model for deliberative discussions and we illustrate its operationalization. On this basis, we automatically generate a corpus with about 200,000 turns, labeled for the 13 categories. |
238 | Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning | Piyush Sharma, Nan Ding, Sebastian Goodman, Radu Soricut | We present a new dataset of image caption annotations, Conceptual Captions, which contains an order of magnitude more images than the MS-COCO dataset (Lin et al., 2014) and represents a wider variety of both images and image caption styles. |
239 | Learning Translations via Images with a Massively Multilingual Image Dataset | John Hewitt, Daphne Ippolito, Brendan Callahan, Reno Kriz, Derry Tanti Wijaya, Chris Callison-Burch | To improve image-based translation, we introduce a novel method of predicting word concreteness from images, which improves on a previous state-of-the-art unsupervised technique. To facilitate research on the task, we introduce a large-scale multilingual corpus of images, each labeled with the word it represents. |
240 | On the Automatic Generation of Medical Imaging Reports | Baoyu Jing, Pengtao Xie, Eric Xing | To cope with these challenges, we (1) build a multi-task learning framework which jointly performs the prediction of tags and the generation of paragraphs, (2) propose a co-attention mechanism to localize regions containing abnormalities and generate narrations for them, (3) develop a hierarchical LSTM model to generate long paragraphs. |
241 | Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning | Hongge Chen, Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Cho-Jui Hsieh | To study the robustness of language grounding to adversarial perturbations in machine vision and perception, we propose Show-and-Fool, a novel algorithm for crafting adversarial examples in neural image captioning. |
242 | Think Visually: Question Answering through Virtual Imagery | Ankit Goyal, Jian Wang, Jia Deng | In this paper, we study the problem of geometric reasoning (a form of visual reasoning) in the context of question-answering. Further, we propose two synthetic benchmarks, FloorPlanQA and ShapeIntersection, to evaluate the geometric reasoning capability of QA systems. |
243 | Interactive Language Acquisition with One-shot Visual Concept Learning through a Conversational Game | Haichao Zhang, Haonan Yu, Wei Xu | We highlight the perspective that conversational interaction serves as a natural interface both for language learning and for novel knowledge acquisition and propose a joint imitation and reinforcement approach for grounded language learning through an interactive conversational game. |
244 | A Purely End-to-End System for Multi-speaker Speech Recognition | Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey | In this paper, we propose a new sequence-to-sequence framework to directly decode multiple label sequences from a single speech sequence by unifying source separation and speech recognition functions in an end-to-end manner. |
245 | A Structured Variational Autoencoder for Contextual Morphological Inflection | Lawrence Wolf-Sonkin, Jason Naradowsky, Sebastian J. Mielke, Ryan Cotterell | To this end, we introduce a novel generative latent-variable model for the semi-supervised learning of inflection generation. |
246 | Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings | Bernd Bohnet, Ryan McDonald, Gonçalo Simões, Daniel Andor, Emily Pitler, Joshua Maynez | In this paper, we investigate models that use recurrent neural networks with sentence-level context for initial character and word-based representations. |
247 | Neural Factor Graph Models for Cross-lingual Morphological Tagging | Chaitanya Malaviya, Matthew R. Gormley, Graham Neubig | In this paper we propose a method for cross-lingual morphological tagging that aims to improve information sharing between languages by relaxing this assumption. |
248 | Global Transition-based Non-projective Dependency Parsing | Carlos Gómez-Rodríguez, Tianze Shi, Lillian Lee | In this paper, we extend their approach to support non-projectivity by providing the first practical implementation of the MH₄ algorithm, an $O(n^4)$ mildly nonprojective dynamic-programming parser with very high coverage on non-projective treebanks. |
249 | Constituency Parsing with a Self-Attentive Encoder | Nikita Kitaev, Dan Klein | We demonstrate that replacing an LSTM encoder with a self-attentive architecture can lead to improvements to a state-of-the-art discriminative constituency parser. |
250 | Pre- and In-Parsing Models for Neural Empty Category Detection | Yufei Chen, Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan | Motivated by the positive impact of empty category on syntactic parsing, we study neural models for pre- and in-parsing detection of empty category, which has not previously been investigated. |
251 | Composing Finite State Transducers on GPUs | Arturo Argueta, David Chiang | We show that our approach obtains speedups of up to 6 times over our serial implementation and 4.5 times over OpenFST. |
252 | Supervised Treebank Conversion: Data and Approaches | Xinzhou Jiang, Zhenghua Li, Bo Zhang, Min Zhang, Sheng Li, Luo Si | In this work, we propose, for the first time, the task of supervised treebank conversion. First, we manually construct a bi-tree aligned dataset containing over ten thousand sentences. |
253 | Object-oriented Neural Programming (OONP) for Document Understanding | Zhengdong Lu, Xianggen Liu, Haotian Cui, Yukun Yan, Daqi Zheng | We propose Object-oriented Neural Programming (OONP), a framework for semantically parsing documents in specific domains. |
254 | Finding syntax in human encephalography with beam search | John Hale, Chris Dyer, Adhiguna Kuncoro, Jonathan Brennan | Finding syntax in human encephalography with beam search |
255 | Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information | Sudha Rao, Hal Daumé III | In this work, we build a neural network model for the task of ranking clarification questions. We create a dataset of clarification questions consisting of 77K posts paired with a clarification question (and answer) from three domains of StackExchange: askubuntu, unix and superuser. |
256 | Let’s do it “again”: A First Computational Approach to Detecting Adverbial Presupposition Triggers | Andre Cianflone, Yulan Feng, Jad Kabbara, Jackie Chi Kit Cheung | We introduce the novel task of predicting adverbial presupposition triggers, which is useful for natural language generation tasks such as summarization and dialogue systems. We introduce two new corpora, derived from the Penn Treebank and the Annotated English Gigaword dataset and investigate the use of a novel attention mechanism tailored to this task. |
TABLE 2: ACL 2018 Short Papers
Title | Authors | Highlight | |
---|---|---|---|
1 | Continuous Learning in a Hierarchical Multiscale Neural Network | Thomas Wolf, Julien Chaumond, Clement Delangue | We propose a hierarchical multi-scale language model in which short time-scale dependencies are encoded in the hidden state of a lower-level recurrent neural network while longer time-scale dependencies are encoded in the dynamic of the lower-level network by having a meta-learner update the weights of the lower-level neural network in an online meta-learning fashion. |
2 | Restricted Recurrent Neural Tensor Networks: Exploiting Word Frequency and Compositionality | Alexandre Salle, Aline Villavicencio | In this paper, we introduce restricted recurrent neural tensor networks (r-RNTN) which reserve distinct hidden layer weights for frequent vocabulary words while sharing a single set of weights for infrequent words. |
3 | Deep RNNs Encode Soft Hierarchical Syntax | Terra Blevins, Omer Levy, Luke Zettlemoyer | We present a set of experiments to demonstrate that deep recurrent neural networks (RNNs) learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision. |
4 | Word Error Rate Estimation for Speech Recognition: e-WER | Ahmed Ali, Steve Renals | In this paper, we propose a novel approach to estimate WER, or e-WER, which does not require a gold-standard transcription of the test set. |
5 | Towards Robust and Privacy-preserving Text Representations | Yitong Li, Timothy Baldwin, Trevor Cohn | In this paper, we propose an approach to explicitly obscure important author characteristics at training time, such that representations learned are invariant to these attributes. |
6 | HotFlip: White-Box Adversarial Examples for Text Classification | Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou | We propose an efficient method to generate white-box adversarial examples to trick a character-level neural classifier. |
7 | Domain Adapted Word Embeddings for Improved Sentiment Classification | Prathusha K Sarma, Yingyu Liang, Bill Sethares | This paper proposes a method to combine the breadth of generic embeddings with the specificity of domain specific embeddings. |
8 | Active learning for deep semantic parsing | Long Duong, Hadi Afshar, Dominique Estival, Glen Pink, Philip Cohen, Mark Johnson | We propose several active learning strategies for overnight data collection and show that different example selection strategies per domain perform best. |
9 | Learning Thematic Similarity Metric from Article Sections Using Triplet Networks | Liat Ein Dor, Yosi Mass, Alon Halfon, Elad Venezian, Ilya Shnayderman, Ranit Aharonov, Noam Slonim | In this paper we suggest to leverage the partition of articles into sections, in order to learn thematic similarity metric between sentences. To test the performance of the learned embeddings, we create and release a sentence clustering benchmark. |
10 | Unsupervised Semantic Frame Induction using Triclustering | Dmitry Ustalov, Alexander Panchenko, Andrey Kutuzov, Chris Biemann, Simone Paolo Ponzetto | We cast the frame induction problem as a triclustering problem that is a generalization of clustering for triadic data. |
11 | Identification of Alias Links among Participants in Narratives | Sangameshwar Patil, Sachin Pawar, Swapnil Hingmire, Girish Palshikar, Vasudeva Varma, Pushpak Bhattacharyya | In this paper, we propose an approach based on linguistic knowledge for identification of aliases mentioned using proper nouns, pronouns or noun phrases with common noun headword. |
12 | Named Entity Recognition With Parallel Recurrent Neural Networks | Andrej Žukov-Gregorič, Yoram Bachrach, Sam Coope | We present a new architecture for named entity recognition. |
13 | Type-Sensitive Knowledge Base Inference Without Explicit Type Supervision | Prachi Jain, Pankaj Kumar, Mausam, Soumen Chakrabarti | State-of-the-art knowledge base completion (KBC) models predict a score for every known or unknown fact via a latent factorization over entity and relation embeddings. |
14 | A Walk-based Model on Entity Graphs for Relation Extraction | Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou | We present a novel graph-based neural network model for relation extraction. |
15 | Ranking-Based Automatic Seed Selection and Noise Reduction for Weakly Supervised Relation Extraction | Van-Thuy Phi, Joan Santoso, Masashi Shimbo, Yuji Matsumoto | This paper addresses the tasks of automatic seed selection for bootstrapping relation extraction, and noise reduction for distantly supervised relation extraction. |
16 | Automatic Extraction of Commonsense LocatedNear Knowledge | Frank F. Xu, Bill Yuchen Lin, Kenny Zhu | In this paper, we study how to automatically extract such relationships through a sentence-level relation classifier and by aggregating the scores of entity pairs from a large corpus. We also release two benchmark datasets for evaluation and future research. |
17 | Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering | Rui Zhang, Cícero Nogueira dos Santos, Michihiro Yasunaga, Bing Xiang, Dragomir Radev | In this paper, we propose to improve the end-to-end coreference resolution system by (1) using a biaffine attention model to get antecedent scores for each possible mention, and (2) jointly optimizing the mention detection accuracy and mention clustering accuracy given the mention cluster labels. |
18 | Fully Statistical Neural Belief Tracking | Nikola Mrkšić, Ivan Vulić | This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST). |
19 | Some of Them Can be Guessed! Exploring the Effect of Linguistic Context in Predicting Quantifiers | Sandro Pezzelle, Shane Steinert-Threlkeld, Raffaella Bernardi, Jakub Szymanik | We study the role of linguistic context in predicting quantifiers ('few', 'all'). We collect crowdsourced data from human participants and test various models in a local (single-sentence) and a global context (multi-sentence) condition. |
20 | A Named Entity Recognition Shootout for German | Martin Riedl, Sebastian Padó | We ask how to practically build a model for German named entity recognition (NER) that performs at the state of the art for both contemporary and historical texts, i.e., a big-data and a small-data scenario. |
21 | A dataset for identifying actionable feedback in collaborative software development | Benjamin S. Meyers, Nuthan Munaiah, Emily Prud’hommeaux, Andrew Meneely, Josephine Wolff, Cecilia Ovesdotter Alm, Pradeep Murukannaiah | To understand the factors that contribute to this outcome, we analyze a novel dataset of more than one million code reviews for the Google Chromium project, from which we extract linguistic features of feedback that elicited responsive actions from coworkers. |
22 | SNAG: Spoken Narratives and Gaze Dataset | Preethi Vaidyanathan, Emily T. Prud’hommeaux, Jeff B. Pelz, Cecilia O. Alm | In this paper, we describe a new multimodal dataset that consists of gaze measurements and spoken descriptions collected in parallel during an image inspection task. |
23 | Analogical Reasoning on Chinese Morphological and Semantic Relations | Shen Li, Zhe Zhao, Renfen Hu, Wensi Li, Tao Liu, Xiaoyong Du | This paper proposes an analogical reasoning task on Chinese. |
24 | Construction of a Chinese Corpus for the Analysis of the Emotionality of Metaphorical Expressions | Dongyu Zhang, Hongfei Lin, Liang Yang, Shaowu Zhang, Bo Xu | We present an annotation scheme that contains annotations of linguistic metaphors, emotional categories (joy, anger, sadness, fear, love, disgust and surprise), and intensity. We therefore construct a significant new corpus on metaphor, with 5,605 manually annotated sentences in Chinese. |
25 | Automatic Article Commenting: the Task and Dataset | Lianhui Qin, Lemao Liu, Wei Bi, Yan Wang, Xiaojiang Liu, Zhiting Hu, Hai Zhao, Shuming Shi | This paper proposes the new task of automatic article commenting, and introduces a large-scale Chinese dataset with millions of real comments and a human-annotated subset characterizing the comments’ varying quality. |
26 | Improved Evaluation Framework for Complex Plagiarism Detection | Anton Belyy, Marina Dubova, Dmitry Nekrasov | In this paper, we study the performance of plagdet, the main measure for plagiarism detection, on manually paraphrased datasets (such as PAN Summary). |
27 | Global Encoding for Abstractive Summarization | Junyang Lin, Xu Sun, Shuming Ma, Qi Su | To tackle the problem, we propose a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context. |
28 | A Language Model based Evaluator for Sentence Compression | Yang Zhao, Zhiyuan Luo, Akiko Aizawa | We herein present a language-model-based evaluator for deletion-based sentence compression and view this task as a series of deletion-and-evaluation operations using the evaluator. |
29 | Identifying and Understanding User Reactions to Deceptive and Trusted Social News Sources | Maria Glenski, Tim Weninger, Svitlana Volkova | In the present work we seek to better understand how users react to trusted and deceptive news sources across two popular, and very different, social media platforms. |
30 | Content-based Popularity Prediction of Online Petitions Using a Deep Regression Model | Shivashankar Subramanian, Timothy Baldwin, Trevor Cohn | In this work, we model this task using CNN regression with an auxiliary ordinal regression objective. |
31 | Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer | Cicero Nogueira dos Santos, Igor Melnyk, Inkit Padhi | We introduce a new approach to tackle the problem of offensive language in online social media. |
32 | Diachronic degradation of language models: Insights from social media | Kokil Jaidka, Niyati Chhaya, Lyle Ungar | This study investigates the diachronic accuracy of pre-trained language models for downstream tasks in machine learning and user profiling. |
33 | Task-oriented Dialogue System for Automatic Diagnosis | Zhongyu Wei, Qianlong Liu, Baolin Peng, Huaixiao Tou, Ting Chen, Xuanjing Huang, Kam-fai Wong, Xiangying Dai | In this paper, we make a move to build a dialogue system for automatic diagnosis. We first build a dataset collected from an online medical forum by extracting symptoms from both patients’ self-reports and conversational data between patients and doctors. |
34 | Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce | Minghui Qiu, Liu Yang, Feng Ji, Wei Zhou, Jun Huang, Haiqing Chen, Bruce Croft, Wei Lin | To alleviate these problems, we study transfer learning for multi-turn information seeking conversations in this paper. |
35 | A Multi-task Approach to Learning Multilingual Representations | Karan Singla, Dogan Can, Shrikanth Narayanan | We present a novel multi-task modeling approach to learning multilingual distributed representations of text. |
36 | Characterizing Departures from Linearity in Word Translation | Ndapa Nakashole, Raphael Flauger | We investigate the behavior of maps learned by machine translation methods. |
37 | Filtering and Mining Parallel Data in a Joint Multilingual Space | Holger Schwenk | We learn a joint multilingual sentence embedding and use the distance between sentences in different languages to filter noisy parallel data and to mine for parallel data in large news collections. |
38 | Hybrid semi-Markov CRF for Neural Sequence Labeling | Zhixiu Ye, Zhen-Hua Ling | In this paper, we improve the existing SCRF methods by employing word-level and segment-level information simultaneously. |
39 | A Study of the Importance of External Knowledge in the Named Entity Recognition Task | Dominic Seyler, Tatiana Dembelova, Luciano Del Corro, Johannes Hoffart, Gerhard Weikum | In this work, we discuss the importance of external knowledge for performing Named Entity Recognition (NER). |
40 | Improving Topic Quality by Promoting Named Entities in Topic Modeling | Katsiaryna Krasnashchok, Salim Jouili | In this paper we use named entities as domain-specific terms for news-centric content and present a new weighting model for Latent Dirichlet Allocation. |
41 | Obligation and Prohibition Extraction Using Hierarchical RNNs | Ilias Chalkidis, Ion Androutsopoulos, Achilleas Michos | We consider the task of detecting contractual obligations and prohibitions. |
42 | Paper Abstract Writing through Editing Mechanism | Qingyun Wang, Zhihao Zhou, Lifu Huang, Spencer Whitehead, Boliang Zhang, Heng Ji, Kevin Knight | We present a paper abstract writing system based on an attentive neural sequence-to-sequence model that can take a title as input and automatically generate an abstract. |
43 | Conditional Generators of Words Definitions | Artyom Gadetsky, Ilya Yakubovskiy, Dmitry Vetrov | In this work, we study the problem of word ambiguities in definition modeling and propose a possible solution by employing latent variable modeling and soft attention mechanisms. |
44 | CNN for Text-Based Multiple Choice Question Answering | Akshay Chaturvedi, Onkar Pandit, Utpal Garain | In this paper, we propose a Convolutional Neural Network (CNN) model for text-based multiple choice question answering where questions are based on a particular article. |
45 | Narrative Modeling with Memory Chains and Semantic Supervision | Fei Liu, Trevor Cohn, Timothy Baldwin | Inspired by previous studies on ROC Story Cloze Test, we propose a novel method, tracking various semantic aspects with external neural memory chains while encouraging each to focus on a particular semantic aspect. |
46 | Injecting Relational Structural Representation in Neural Networks for Question Similarity | Antonio Uva, Daniele Bonadiman, Alessandro Moschitti | In this paper, we propose to inject structural representations in NNs by (i) learning a model with Tree Kernels (TKs) on relatively few pairs of questions (few thousands) as gold standard (GS) training data is typically scarce, (ii) predicting labels on a very large corpus of question pairs, and (iii) pre-training NNs on such large corpus. |
47 | A Simple and Effective Approach to Coverage-Aware Neural Machine Translation | Yanyang Li, Tong Xiao, Yinqiao Li, Qiang Wang, Changming Xu, Jingbo Zhu | We offer a simple and effective method to seek a better balance between model confidence and length preference for Neural Machine Translation (NMT). |
48 | Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation | Rui Wang, Masao Utiyama, Eiichiro Sumita | Here, we propose an efficient method to dynamically sample the sentences in order to accelerate the NMT training. |
49 | Compositional Representation of Morphologically-Rich Input for Neural Machine Translation | Duygu Ataman, Marcello Federico | As a solution, various studies proposed segmenting words into sub-word units and performing translation at the sub-lexical level. |
50 | Extreme Adaptation for Personalized Neural Machine Translation | Paul Michel, Graham Neubig | In this paper, we propose a simple and parameter-efficient adaptation technique that only requires adapting the bias of the output softmax to each particular user of the MT system, either directly or through a factored approximation. |
51 | Multi-representation ensembles and delayed SGD updates improve syntax-based NMT | Danielle Saunders, Felix Stahlberg, Adrià de Gispert, Bill Byrne | We formulate beam search over such ensembles using WFSTs, and describe a delayed SGD update training procedure that is especially effective for long representations like linearized syntax. |
52 | Learning from Chunk-based Feedback in Neural Machine Translation | Pavel Petrushkov, Shahram Khadivi, Evgeny Matusov | We propose a simple and effective way of utilizing such feedback in NMT training. |
53 | Bag-of-Words as Target for Neural Machine Translation | Shuming Ma, Xu Sun, Yizhong Wang, Junyang Lin | In this paper, we propose an approach that uses both the sentences and the bag-of-words as targets in the training stage, in order to encourage the model to generate the potentially correct sentences that do not appear in the training set. |
54 | Improving Beam Search by Removing Monotonic Constraint for Neural Machine Translation | Raphael Shu, Hideki Nakayama | Despite its simplicity, we show that the proposed decoding algorithm enhances the quality of selected hypotheses and improves the translations even for high-performance models on an English-Japanese translation task. |
55 | Leveraging distributed representations and lexico-syntactic fixedness for token-level prediction of the idiomaticity of English verb-noun combinations | Milton King, Paul Cook | In this paper we propose and evaluate models for classifying VNC usages as idiomatic or literal, based on a variety of approaches to forming distributed representations. |
56 | Using pseudo-senses for improving the extraction of synonyms from word embeddings | Olivier Ferret | In this article, we propose Pseudofit, a new method for specializing word embeddings according to semantic similarity without any external knowledge. |
57 | Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora | Stephen Roller, Douwe Kiela, Maximilian Nickel | In this paper, we study the performance of both approaches on several hypernymy tasks and find that simple pattern-based methods consistently outperform distributional methods on common benchmark datasets. |
58 | Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling | Luheng He, Kenton Lee, Omer Levy, Luke Zettlemoyer | We propose an end-to-end approach for jointly predicting all predicates, argument spans, and the relations between them. |
59 | Sparse and Constrained Attention for Neural Machine Translation | Chaitanya Malaviya, Pedro Ferreira, André F. T. Martins | We experiment with various sparse and constrained attention transformations and propose a new one, constrained sparsemax, shown to be differentiable and sparse. |
60 | Neural Hidden Markov Model for Machine Translation | Weiyue Wang, Derui Zhu, Tamer Alkhouli, Zixuan Gan, Hermann Ney | We study a neural hidden Markov model (HMM) consisting of neural network-based alignment and lexicon models, which are trained jointly using the forward-backward algorithm. |
61 | Bleaching Text: Abstract Features for Cross-lingual Gender Prediction | Rob van der Goot, Nikola Ljubešić, Ian Matroos, Malvina Nissim, Barbara Plank | We propose an alternative: bleaching text, i.e., transforming lexical strings into more abstract features. |
62 | Orthographic Features for Bilingual Lexicon Induction | Parker Riley, Daniel Gildea | This work extends embedding-based methods to incorporate these features, resulting in significant accuracy gains for related languages. |
63 | Neural Cross-Lingual Coreference Resolution And Its Application To Entity Linking | Gourab Kundu, Avi Sil, Radu Florian, Wael Hamza | We propose an entity-centric neural crosslingual coreference model that builds on multi-lingual embeddings and language independent features. |
64 | Judicious Selection of Training Data in Assisting Language for Multilingual Neural NER | Rudra Murthy, Anoop Kunchukuttan, Pushpak Bhattacharyya | To alleviate this problem, we propose a metric based on symmetric KL divergence to filter out the highly divergent training instances in the assisting language. |
65 | Neural Open Information Extraction | Lei Cui, Furu Wei, Ming Zhou | In this paper, we propose a neural Open IE approach with an encoder-decoder framework. |
66 | Document Embedding Enhanced Event Detection with Hierarchical and Supervised Attention | Yue Zhao, Xiaolong Jin, Yuanzhuo Wang, Xueqi Cheng | In this paper, we propose a novel Document Embedding Enhanced Bi-RNN model, called DEEB-RNN, to detect events in sentences. |
67 | Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots | Yu Wu, Wei Wu, Zhoujun Li, Ming Zhou | We propose a method that can leverage unlabeled data to learn a matching model for response selection in retrieval-based chatbots. |
68 | Improving Slot Filling in Spoken Language Understanding with Joint Pointer and Attention | Lin Zhao, Zhe Feng | We present a generative neural network model for slot filling based on a sequence-to-sequence (Seq2Seq) model together with a pointer network, in the situation where only sentence-level slot annotations are available in the spoken dialogue data. |
69 | Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing | Osman Ramadan, Paweł Budzianowski, Milica Gašić | In this paper, a novel approach is introduced that fully utilizes semantic similarity between dialogue utterances and the ontology terms, allowing the information to be shared across domains. |
70 | Modeling discourse cohesion for discourse parsing via memory network | Yanyan Jia, Yuan Ye, Yansong Feng, Yuxuan Lai, Rui Yan, Dongyan Zhao | In this paper, we propose a new transition-based discourse parser that makes use of memory networks to take discourse cohesion into account. |
71 | SciDTB: Discourse Dependency TreeBank for Scientific Abstracts | An Yang, Sujian Li | In this paper, we present SciDTB, a domain-specific discourse treebank annotated on scientific articles. |
72 | Predicting accuracy on large datasets from smaller pilot data | Mark Johnson, Peter Anderson, Mark Dras, Mark Steedman | We introduce a new performance extrapolation task to evaluate how well different extrapolations predict accuracy on larger training sets. |
73 | The Influence of Context on Sentence Acceptability Judgements | Jean-Philippe Bernardy, Shalom Lappin, Jey Han Lau | We investigate the influence that document context exerts on human acceptability judgements for English sentences, via two sets of experiments. |
74 | Do Neural Network Cross-Modal Mappings Really Bridge Modalities? | Guillem Collell, Marie-Francine Moens | Here, we propose a new similarity measure and two ad hoc experiments to shed light on this issue. |
75 | Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing | Daniel Fried, Dan Klein | We explore using a policy gradient method as a parser-agnostic alternative. |
76 | Linear-time Constituency Parsing with RNNs and Dynamic Programming | Juneki Hong, Liang Huang | We propose a linear-time constituency parser with RNNs and dynamic programming using graph-structured stack and beam search, which runs in time $O(n b^2)$ where $b$ is the beam size. |
77 | Simpler but More Accurate Semantic Dependency Parsing | Timothy Dozat, Christopher D. Manning | We extend the LSTM-based syntactic parser of Dozat and Manning (2017) to train on and generate these graph structures. |
78 | Simplified Abugidas | Chenchen Ding, Masao Utiyama, Eiichiro Sumita | An abugida is a writing system where the consonant letters represent syllables with a default vowel and other vowels are denoted by diacritics. |
79 | Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network | Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma | In this paper, in order to assist professionals in evaluating academic papers, we propose a novel task: automatic academic paper rating (AAPR), which automatically determines whether to accept academic papers. We build a new dataset for this task and propose a novel modularized hierarchical convolutional neural network to achieve automatic academic paper rating. |
80 | Automated essay scoring with string kernels and word embeddings | Mădălina Cozma, Andrei Butnaru, Radu Tudor Ionescu | In this work, we present an approach based on combining string kernels and word embeddings for automatic essay scoring. |
81 | Party Matters: Enhancing Legislative Embeddings with Author Attributes for Vote Prediction | Anastassia Kornilova, Daniel Argyle, Vladimir Eidelman | In this paper, we show that text alone is insufficient for modeling voting outcomes in new contexts, as session changes lead to changes in the underlying data generation process. |
82 | Dynamic and Static Topic Model for Analyzing Time-Series Document Collections | Rem Hida, Naoya Takeishi, Takehisa Yairi, Koichi Hori | To this end, we propose a dynamic and static topic model, which simultaneously considers the dynamic structures of the temporal topic evolution and the static structures of the topic hierarchy at each time. |
83 | PhraseCTM: Correlated Topic Modeling on Phrases within Markov Random Fields | Weijing Huang | We propose a novel topic model PhraseCTM and a two-stage method to find out the correlated topics at phrase level. |
84 | A Document Descriptor using Covariance of Word Vectors | Marwan Torki | In this paper, we address the problem of finding a novel document descriptor based on the covariance matrix of the word vectors of a document. |
85 | Learning with Structured Representations for Negation Scope Extraction | Hao Li, Wei Lu | We design approaches based on conditional random fields (CRF), semi-Markov CRF, as well as latent-variable CRF models to capture such information. |
86 | End-Task Oriented Textual Entailment via Deep Explorations of Inter-Sentence Interactions | Wenpeng Yin, Dan Roth, Hinrich Schütze | We propose DEISTE (deep explorations of inter-sentence interactions for textual entailment) for this entailment task. |
87 | Sense-Aware Neural Models for Pun Location in Texts | Yitao Cai, Yin Li, Xiaojun Wan | In this paper, we focus on the task of pun location, which aims to identify the pun word in a given short text. |
88 | A Rank-Based Similarity Metric for Word Embeddings | Enrico Santus, Hongmin Wang, Emmanuele Chersoni, Yue Zhang | In this paper, we report experiments with a rank-based metric for WE, which performs comparably to vector cosine in similarity estimation and outperforms it in the recently-introduced and challenging task of outlier detection, thus suggesting that rank-based measures can improve clustering quality. |
89 | Addressing Noise in Multidialectal Word Embeddings | Alexander Erdmann, Nasser Zalmout, Nizar Habash | We make three contributions to address this noise. |
90 | GNEG: Graph-Based Negative Sampling for word2vec | Zheng Zhang, Pierre Zweigenbaum | To this end, we pre-compute word co-occurrence statistics from the corpus and apply network algorithms, such as random walks, to them. |
91 | Unsupervised Learning of Style-sensitive Word Vectors | Reina Akama, Kento Watanabe, Sho Yokoi, Sosuke Kobayashi, Kentaro Inui | This paper presents the first study aimed at capturing stylistic similarity between words in an unsupervised manner. |
92 | Exploiting Document Knowledge for Aspect-level Sentiment Classification | Ruidan He, Wee Sun Lee, Hwee Tou Ng, Daniel Dahlmeier | In this paper, we explore two approaches that transfer knowledge from document-level data, which is much less expensive to obtain, to improve the performance of aspect-level sentiment classification. |
93 | Modeling Sentiment Association in Discourse for Humor Recognition | Lizhen Liu, Donghai Zhang, Wei Song | This paper proposes to model sentiment association between discourse units to indicate how the punchline breaks the expectation of the setup. |
94 | Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction | Hu Xu, Bing Liu, Lei Shu, Philip S. Yu | Unlike other highly sophisticated supervised deep learning models, this paper proposes a novel and yet simple CNN model employing two types of pre-trained embeddings for aspect extraction: general-purpose embeddings and domain-specific embeddings. |
95 | Will it Blend? Blending Weak and Strong Labeled Data in a Neural Network for Argumentation Mining | Eyal Shnarch, Carlos Alzate, Lena Dankin, Martin Gleize, Yufang Hou, Leshem Choshen, Ranit Aharonov, Noam Slonim | We propose a methodology to blend high quality but scarce strong labeled data with noisy but abundant weak labeled data during the training of neural networks. In addition, we provide a manually annotated data set for the task of topic-dependent evidence detection. |
96 | Investigating Audio, Video, and Text Fusion Methods for End-to-End Automatic Personality Prediction | Onno Kampman, Elham J. Barezi, Dario Bertero, Pascale Fung | We propose a tri-modal architecture to predict Big Five personality trait scores from video clips with different channels for audio, text, and video data. |
97 | An Empirical Study of Building a Strong Baseline for Constituency Parsing | Jun Suzuki, Sho Takase, Hidetaka Kamigaito, Makoto Morishita, Masaaki Nagata | We incorporate several techniques that were mainly developed in natural language generation tasks, e.g., machine translation and summarization, and demonstrate that the sequence-to-sequence model achieves the current top-notch parsers’ performance (almost) without requiring any explicit task-specific knowledge or architecture of constituent parsing. |
98 | Parser Training with Heterogeneous Treebanks | Sara Stymne, Miryam de Lhoneux, Aaron Smith, Joakim Nivre | We go on to propose a new method based on treebank embeddings. |
99 | Generalized chart constraints for efficient PCFG and TAG parsing | Stefan Grünewald, Sophie Henning, Alexander Koller | We generalize chart constraints to more expressive grammar formalisms and describe a neural tagger which predicts chart constraints at very high precision. |
100 | Exploring Semantic Properties of Sentence Embeddings | Xunjie Zhu, Tingfeng Li, Gerard de Melo | In this paper, we assess to what extent prominent sentence embedding methods exhibit select semantic properties. |
101 | Scoring Lexical Entailment with a Supervised Directional Similarity Network | Marek Rei, Daniela Gerz, Ivan Vulić | We present the Supervised Directional Similarity Network, a novel neural architecture for learning task-specific transformation functions on top of general-purpose word embeddings. |
102 | Extracting Commonsense Properties from Embeddings with Limited Human Guidance | Yiben Yang, Larry Birnbaum, Ji-Ping Wang, Doug Downey | We propose and assess methods for extracting one type of commonsense knowledge, object-property comparisons, from pre-trained embeddings. |
103 | Breaking NLI Systems with Sentences that Require Simple Lexical Inferences | Max Glockner, Vered Shwartz, Yoav Goldberg | We create a new NLI test set that shows the deficiency of state-of-the-art models in inferences that require lexical and world knowledge. |
104 | Adaptive Knowledge Sharing in Multi-Task Learning: Improving Low-Resource Neural Machine Translation | Poorya Zaremoodi, Wray Buntine, Gholamreza Haffari | We address this issue by extending the recurrent units with multiple “blocks” along with a trainable “routing network”. |
105 | Automatic Estimation of Simultaneous Interpreter Performance | Craig Stewart, Nikolai Vogler, Junjie Hu, Jordan Boyd-Graber, Graham Neubig | We propose the task of predicting simultaneous interpreter performance by building on existing methodology for quality estimation (QE) of machine translation output. |
106 | Polyglot Semantic Role Labeling | Phoebe Mulcaire, Swabha Swayamdipta, Noah A. Smith | We experiment with a new approach where we combine resources from different languages in the CoNLL 2009 shared task to build a single polyglot semantic dependency parser. |
107 | Learning Cross-lingual Distributed Logical Representations for Semantic Parsing | Yanyan Zou, Wei Lu | In this work, we present a study to show how learning distributed representations of the logical forms from data annotated in different languages can be used for improving the performance of a monolingual semantic parser. |
108 | Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information | Masaki Asada, Makoto Miwa, Yutaka Sasaki | We propose a novel neural method to extract drug-drug interactions (DDIs) from texts using external drug molecular structure information. |
109 | diaNED: Time-Aware Named Entity Disambiguation for Diachronic Corpora | Prabal Agarwal, Jannik Strötgen, Luciano del Corro, Johannes Hoffart, Gerhard Weikum | This paper presents the first time-aware method for NED that resolves ambiguities even when mention contexts give only few cues. |
110 | Examining Temporality in Document Classification | Xiaolei Huang, Michael J. Paul | This study investigates how document classifiers trained on documents from certain time intervals perform on documents from other time intervals, considering both seasonal intervals (intervals that repeat across years, e.g., winter) and non-seasonal intervals (e.g., specific years). |
111 | Personalized Language Model for Query Auto-Completion | Aaron Jaech, Mari Ostendorf | We show how an adaptable language model can be used to generate personalized completions and how the model can use online updating to make predictions for users not seen during training. |
112 | Personalized Review Generation By Expanding Phrases and Attending on Aspect-Aware Representations | Jianmo Ni, Julian McAuley | In this paper, we focus on the problem of building assistive systems that can help users to write reviews. |
113 | Learning Simplifications for Specific Target Audiences | Carolina Scarton, Lucia Specia | We explore these two features of TS to build models tailored for specific grade levels. |
114 | Split and Rephrase: Better Evaluation and Stronger Baselines | Roee Aharoni, Yoav Goldberg | To aid this, we present a new train-development-test data split and neural models augmented with a copy-mechanism, outperforming the best reported baseline by 8.68 BLEU and fostering further progress on the task. |
115 | Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization | Shuming Ma, Xu Sun, Junyang Lin, Houfeng Wang | In this work, we supervise the learning of the representation of the source content with that of the summary. |
116 | Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum | Omer Levy, Kenton Lee, Nicholas FitzGerald, Luke Zettlemoyer | We present an alternative view to explain the success of LSTMs: the gates themselves are versatile recurrent models that provide more representational power than previously appreciated. |
117 | On the Practical Computational Power of Finite Precision RNNs for Language Recognition | Gail Weiss, Yoav Goldberg, Eran Yahav | We consider the case of RNNs with finite precision whose computation time is linear in the input length. |
118 | A Co-Matching Model for Multi-choice Reading Comprehension | Shuohang Wang, Mo Yu, Jing Jiang, Shiyu Chang | This paper proposes a new co-matching approach to this problem, which jointly models whether a passage can match both a question and a candidate answer. |
119 | Tackling the Story Ending Biases in The Story Cloze Test | Rishi Sharma, James Allen, Omid Bakhshandeh, Nasrin Mostafazadeh | In order to shed some light on this issue, we have performed various data analysis and analyzed a variety of top performing models presented for this task. |
120 | A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification | Zeyang Lei, Yujiu Yang, Min Yang, Yi Liu | In this paper, we propose a Multi-sentiment-resource Enhanced Attention Network (MEAN) to alleviate the problem by integrating three kinds of sentiment linguistic knowledge (e.g., sentiment lexicon, negation words, intensity words) into the deep neural network via attention mechanisms. |
121 | Pretraining Sentiment Classifiers with Unlabeled Dialog Data | Toru Shimizu, Nobuyuki Shimizu, Hayato Kobayashi | In this paper, we take the concept a step further by using a conditional language model, instead of a language model. |
122 | Disambiguating False-Alarm Hashtag Usages in Tweets for Irony Detection | Hen-Hsen Huang, Chiao-Chen Chen, Hsin-Hsi Chen | We analyze the ambiguity of hashtag usages and propose a novel neural network-based model, which incorporates linguistic information from different aspects, to disambiguate the usage of three hashtags that are widely used to collect the training data for irony detection. |
123 | Cross-Target Stance Classification with Self-Attention Networks | Chang Xu, Cécile Paris, Surya Nepal, Ross Sparks | In this work, we explore the potential for generalizing classifiers between different targets, and propose a neural model that can apply what has been learned from a source target to a destination target. |
124 | Know What You Don’t Know: Unanswerable Questions for SQuAD | Pranav Rajpurkar, Robin Jia, Percy Liang | To address these weaknesses, we present SQuADRUn, a new dataset that combines the existing Stanford Question Answering Dataset (SQuAD) with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. |
125 | ‘Lighter’ Can Still Be Dark: Modeling Comparative Color Descriptions | Olivia Winn, Smaranda Muresan | We propose a novel paradigm of grounding comparative adjectives within the realm of color descriptions. |