Paper Digest: EMNLP 2017 Highlights

November 6, 2017 (updated October 6, 2019)

The Conference on Empirical Methods in Natural Language Processing (EMNLP) is one of the top natural language processing conferences in the world. In 2017, it was held in Copenhagen, Denmark. There were 836 long-paper submissions, of which 216 were accepted, and 582 short-paper submissions, of which 107 were accepted.

To help the AI community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights to quickly get the main idea of each paper.

We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to get daily updates on new papers, customized to your own interests.

Paper Digest Team
team@paperdigest.org

TABLE 1: EMNLP 2017 Papers

No.	Title	Authors	Highlight
1	Monolingual Phrase Alignment on Parse Forests	Yuki Arase, Junichi Tsujii	We propose an efficient method to conduct phrase alignment on parse forests for paraphrase detection.
2	Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set	Tianze Shi, Liang Huang, Lillian Lee	Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set
3	Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs	Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan	We propose a new Maximum Subgraph algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2 graphs.
4	Position-aware Attention and Supervised Data Improve Slot Filling	Yuhao Zhang, Victor Zhong, Danqi Chen, Gabor Angeli, Christopher D. Manning	This paper simultaneously addresses two issues that have held back prior work.
5	Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach	Liyuan Liu, Xiang Ren, Qi Zhu, Shi Zhi, Huan Gui, Heng Ji, Jiawei Han	To overcome this drawback, we propose a novel framework, REHession, to conduct relation extractor learning using annotations from heterogeneous information source, e.g., knowledge base and domain heuristics.
6	Integrating Order Information and Event Relation for Script Event Prediction	Zhongqing Wang, Yue Zhang, Ching-Yun Chang	We propose a neural model that leverages the advantages of both methods, by using LSTM hidden states as features for event pair modelling.
7	Entity Linking for Queries by Searching Wikipedia Sentences	Chuanqi Tan, Furu Wei, Pengjie Ren, Weifeng Lv, Ming Zhou	We present a simple yet effective approach for linking entities in queries.
8	Train-O-Matic: Large-Scale Supervised Word Sense Disambiguation in Multiple Languages without Manual Training Data	Tommaso Pasini, Roberto Navigli	We present Train-O-Matic, a language-independent method for generating millions of sense-annotated training instances for virtually all meanings of words in a language’s vocabulary.
9	Universal Semantic Parsing	Siva Reddy, Oscar Täckström, Slav Petrov, Mark Steedman, Mirella Lapata	In this work, we introduce UDepLambda, a semantic interface for UD, which maps natural language to logical forms in an almost language-independent fashion and can process dependency graphs.
10	Mimicking Word Embeddings using Subword RNNs	Yuval Pinter, Robert Guthrie, Jacob Eisenstein	In this paper, we present MIMICK, an approach to generating OOV word embeddings compositionally, by learning a function from spellings to distributional embeddings.
11	Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages	Ehsaneddin Asgari, Hinrich Schütze	We present SuperPivot, an analysis method for low-resource languages that occur in a superparallel corpus, i.e., in a corpus that contains an order of magnitude more languages than parallel corpora currently in use.
12	Neural Machine Translation with Source-Side Latent Graph Parsing	Kazuma Hashimoto, Yoshimasa Tsuruoka	This paper presents a novel neural machine translation model which jointly learns translation and source-side latent graph representations of sentences.
13	Neural Machine Translation with Word Predictions	Rongxiang Weng, Shujian Huang, Zaixiang Zheng, Xinyu Dai, Jiajun Chen	In this paper, we propose to use word predictions as a mechanism for direct supervision.
14	Towards Decoding as Continuous Optimisation in Neural Machine Translation	Cong Duy Vu Hoang, Gholamreza Haffari, Trevor Cohn	We propose a novel decoding approach for neural machine translation (NMT) based on continuous optimisation.
15	Where is Misty? Interpreting Spatial Descriptors by Modeling Regions in Space	Nikita Kitaev, Dan Klein	We present a model for locating regions in space based on natural language descriptions. To evaluate our model, we construct and release a new dataset consisting of Minecraft scenes with crowdsourced natural language descriptions.
16	Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks	Afshin Rahimi, Timothy Baldwin, Trevor Cohn	We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology.
17	Obj2Text: Generating Visually Descriptive Language from Object Layouts	Xuwang Yin, Vicente Ordonez	We explore in this paper OBJ2TEXT, a sequence-to-sequence model that encodes a set of objects and their locations as an input sequence using an LSTM network, and decodes this representation using an LSTM language model.
18	End-to-end Neural Coreference Resolution	Kenton Lee, Luheng He, Mike Lewis, Luke Zettlemoyer	We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector.
19	Neural Net Models of Open-domain Discourse Coherence	Jiwei Li, Dan Jurafsky	In this paper, we describe domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences.
20	Affinity-Preserving Random Walk for Multi-Document Summarization	Kexiang Wang, Tianyu Liu, Zhifang Sui, Baobao Chang	This paper introduces affinity-preserving random walk to the summarization task, which preserves the affinity relations of sentences by an absorbing random walk model.
21	A Mention-Ranking Model for Abstract Anaphora Resolution	Ana Marasović, Leo Born, Juri Opitz, Anette Frank	We propose a mention-ranking model that learns how abstract anaphors relate to their antecedents with an LSTM-Siamese Net.
22	Hierarchical Embeddings for Hypernymy Detection and Directionality	Kim Anh Nguyen, Maximilian Köper, Sabine Schulte im Walde, Ngoc Thang Vu	We present a novel neural model HyperVec to learn hierarchical embeddings for hypernymy detection and directionality.
23	Ngram2vec: Learning Improved Word Representations from Ngram Co-occurrence Statistics	Zhe Zhao, Tao Liu, Shen Li, Bofang Li, Xiaoyong Du	In this paper, we introduce ngrams into four representation methods: SGNS, GloVe, PPMI matrix, and its SVD factorization.
24	Dict2vec : Learning Word Embeddings using Lexical Dictionaries	Julien Tissier, Christophe Gravier, Amaury Habrard	In this paper, we propose a new approach, Dict2vec, based on one of the largest yet refined datasource for describing words – natural language dictionaries.
25	Learning Chinese Word Representations From Glyphs Of Characters	Tzu-Ray Su, Hung-Yi Lee	In this paper, we propose new methods to learn Chinese word representations. Another contribution in this paper is that we created several evaluation datasets in traditional Chinese and made them public.
26	Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext	John Wieting, Jonathan Mallinson, Kevin Gimpel	We consider the problem of learning general-purpose, paraphrastic sentence embeddings in the setting of Wieting et al. (2016b).
27	Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components	Jinxing Yu, Xun Jian, Hao Xin, Yangqiu Song	In this work, we propose an approach to jointly embed Chinese words as well as their characters and fine-grained subcharacter components.
28	Exploiting Morphological Regularities in Distributional Word Representations	Arihant Gupta, Syed Sarfaraz Akhtar, Avijit Vajpayee, Arjit Srivastava, Madan Gopal Jhanwar, Manish Shrivastava	We present an unsupervised, language agnostic approach for exploiting morphological regularities present in high dimensional vector spaces.
29	Exploiting Word Internal Structures for Generic Chinese Sentence Representation	Shaonan Wang, Jiajun Zhang, Chengqing Zong	We introduce a novel mixed character-word architecture to improve Chinese sentence representations, by utilizing rich semantic information of word internal structures.
30	High-risk learning: acquiring new word vectors from tiny data	Aurélie Herbelot, Marco Baroni	In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space.
31	Word Embeddings based on Fixed-Size Ordinally Forgetting Encoding	Joseph Sanu, Mingbin Xu, Hui Jiang, Quan Liu	In this paper, we propose to learn word embeddings based on the recent fixed-size ordinally forgetting encoding (FOFE) method, which can almost uniquely encode any variable-length sequence into a fixed-size representation.
32	VecShare: A Framework for Sharing Word Representation Vectors	Jared Fernandez, Zhaocheng Yu, Doug Downey	We present a framework, called VecShare, that makes it easy to share and retrieve word embeddings on the Web.
33	Word Re-Embedding via Manifold Dimensionality Retention	Souleiman Hasan, Edward Curry	In this paper, we re-embed pre-trained word embeddings with a stage of manifold learning which retains dimensionality.
34	MUSE: Modularizing Unsupervised Sense Embeddings	Guang-He Lee, Yun-Nung Chen	We leverage reinforcement learning to enable joint training on the proposed modules, and introduce various exploration techniques on sense selection for better robustness.
35	Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging	Nils Reimers, Iryna Gurevych	In this paper we show that reporting a single performance score is insufficient to compare non-deterministic approaches.
36	Learning What’s Easy: Fully Differentiable Neural Easy-First Taggers	André F. T. Martins, Julia Kreutzer	We introduce a novel neural easy-first decoder that learns to solve sequence tagging tasks in a flexible order.
37	Incremental Skip-gram Model with Negative Sampling	Nobuhiro Kaji, Hayato Kobayashi	To address this problem, we present a simple incremental extension of SGNS and provide a thorough theoretical analysis to demonstrate its validity.
38	Learning to select data for transfer learning with Bayesian Optimization	Sebastian Ruder, Barbara Plank	Inspired by work on curriculum learning, we propose to learn data selection measures using Bayesian Optimization and evaluate them across models, domains and tasks.
39	Unsupervised Pretraining for Sequence to Sequence Learning	Prajit Ramachandran, Peter Liu, Quoc Le	This work presents a general unsupervised learning method to improve the accuracy of sequence to sequence (seq2seq) models.
40	Efficient Attention using a Fixed-Size Memory Representation	Denny Britz, Melody Guan, Minh-Thang Luong	In this work, we propose an alternative attention mechanism based on a fixed size memory representation that is more efficient.
41	Rotated Word Vector Representations and their Interpretability	Sungjoon Park, JinYeong Bak, Alice Oh	We apply several rotation algorithms to the vector representation of words to improve the interpretability.
42	A causal framework for explaining the predictions of black-box sequence-to-sequence models	David Alvarez-Melis, Tommi Jaakkola	We focus the general approach on sequence-to-sequence problems, adopting a variational autoencoder to yield meaningful input perturbations.
43	Piecewise Latent Variables for Neural Variational Text Processing	Iulian Vlad Serban, Alexander G. Ororbia, Joelle Pineau, Aaron Courville	To overcome this restriction, we propose the simple, but highly flexible, piecewise constant distribution.
44	Learning the Structure of Variable-Order CRFs: a finite-state perspective	Thomas Lavergne, François Yvon	Using an effective finite-state representation of variable-length dependencies, we propose new ways to perform feature selection at large scale and report experimental results where we outperform strong baselines on a tagging task.
45	Sparse Communication for Distributed Gradient Descent	Alham Fikri Aji, Kenneth Heafield	We make distributed stochastic gradient descent faster by exchanging sparse updates instead of dense updates.
46	Why ADAGRAD Fails for Online Topic Modeling	You Lu, Jeffrey Lund, Jordan Boyd-Graber	We show that this is because ADAGRAD uses accumulation of previous gradients as the learning rates’ denominators.
47	Recurrent Attention Network on Memory for Aspect Sentiment Analysis	Peng Chen, Zhongqian Sun, Lidong Bing, Wei Yang	We propose a novel framework based on neural networks to identify the sentiment of opinion targets in a comment/review.
48	A Cognition Based Attention Model for Sentiment Analysis	Yunfei Long, Qin Lu, Rong Xiang, Minglei Li, Chu-Ren Huang	In this work, we propose a novel attention model trained by cognition grounded eye-tracking data.
49	Author-aware Aspect Topic Sentiment Model to Retrieve Supporting Opinions from Reviews	Lahari Poddar, Wynne Hsu, Mong Li Lee	We study the problem of searching for supporting opinions in the context of reviews.
50	Magnets for Sarcasm: Making Sarcasm Detection Timely, Contextual and Very Personal	Aniruddha Ghosh, Tony Veale	Using a neural architecture, we show significant gains in detection accuracy when knowledge of the speaker’s mood at the time of production can be inferred.
51	Identifying Humor in Reviews using Background Text Sources	Alex Morales, Chengxiang Zhai	We propose a generative language model, based on the theory of incongruity, to model humorous text, which allows us to leverage background text sources, such as Wikipedia entry descriptions, and enables construction of multiple features for identifying humorous reviews.
52	Sentiment Lexicon Construction with Representation Learning Based on Hierarchical Sentiment Supervision	Leyi Wang, Rui Xia	In this paper, we develop a neural architecture to train a sentiment-aware word embedding by integrating the sentiment supervision at both document and word levels, to enhance the quality of word embedding as well as the sentiment lexicon.
53	Towards a Universal Sentiment Classifier in Multiple languages	Kui Xu, Xiaojun Wan	In this paper we aim to build a universal sentiment classifier with a single classification model in multiple different languages.
54	Capturing User and Product Information for Document Level Sentiment Analysis with Deep Memory Network	Zi-Yi Dou	To address the issue, we propose a deep memory network for document-level sentiment classification which could capture the user and product information at the same time.
55	Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters	Min Yang, Jincheng Mei, Heng Ji, Wei Zhao, Zhou Zhao, Xiaojun Chen	We propose a location-based dynamic sentiment-topic model (LDST) which can jointly model topic, sentiment, time and Geolocation information. We will release the data and source code after this work is published.
56	Refining Word Embeddings for Sentiment Analysis	Liang-Chih Yu, Jin Wang, K. Robert Lai, Xuejie Zhang	Therefore, this study proposes a word vector refinement model that can be applied to any pre-trained word vectors (e.g., Word2vec and GloVe).
57	A Multilayer Perceptron based Ensemble Technique for Fine-grained Financial Sentiment Analysis	Md Shad Akhtar, Abhishek Kumar, Deepanway Ghosal, Asif Ekbal, Pushpak Bhattacharyya	In this paper, we propose a novel method for combining deep learning and classical feature based models using a Multi-Layer Perceptron (MLP) network for financial sentiment analysis.
58	Sentiment Intensity Ranking among Adjectives Using Sentiment Bearing Word Embeddings	Raksha Sharma, Arpan Somani, Lakshya Kumar, Pushpak Bhattacharyya	In this paper, we propose a semi-supervised technique that uses sentiment bearing word embeddings to produce a continuous ranking among adjectives that share common semantics.
59	Sentiment Lexicon Expansion Based on Neural PU Learning, Double Dictionary Lookup, and Polarity Association	Yasheng Wang, Yang Zhang, Bing Liu	In a recent sentiment analysis application, we used a large Chinese sentiment lexicon and found that it missed a large number of sentiment words in social media. This paper first poses the problem as a PU learning problem, which is a new formulation.
60	DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning	Wenhan Xiong, Thien Hoang, William Yang Wang	More specifically, we describe a novel reinforcement learning framework for learning multi-hop relational paths: we use a policy-based agent with continuous states based on knowledge graph embeddings, which reasons in a KG vector-space by sampling the most promising relation to extend its path.
61	Task-Oriented Query Reformulation with Reinforcement Learning	Rodrigo Nogueira, Kyunghyun Cho	In this work, we introduce a query reformulation system based on a neural network that rewrites a query to maximize the number of relevant documents returned.
62	Sentence Simplification with Deep Reinforcement Learning	Xingxing Zhang, Mirella Lapata	We address the simplification problem with an encoder-decoder model coupled with a deep reinforcement learning framework.
63	Learning how to Active Learn: A Deep Reinforcement Learning Approach	Meng Fang, Yuan Li, Trevor Cohn	To address these shortcomings, we introduce a novel formulation by reframing the active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes the role of the active learning heuristic.
64	Split and Rephrase	Shashi Narayan, Claire Gardent, Shay B. Cohen, Anastasia Shimorina	We propose a new sentence simplification task (Split-and-Rephrase) where the aim is to split a complex sentence into a meaning preserving sequence of shorter sentences.
65	Neural Response Generation via GAN with an Approximate Embedding Layer	Zhen Xu, Bingquan Liu, Baoxun Wang, Chengjie Sun, Xiaolong Wang, Zhuoran Wang, Chao Qi	This paper presents a Generative Adversarial Network (GAN) to model single-turn short-text conversations, which trains a sequence-to-sequence (Seq2Seq) network for response generation simultaneously with a discriminative classifier that measures the differences between human-produced responses and machine-generated ones.
66	A Hybrid Convolutional Variational Autoencoder for Text Generation	Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth	In this paper we explore the effect of architectural choices on learning a variational autoencoder (VAE) for text generation.
67	Filling the Blanks (hint: plural noun) for Mad Libs Humor	Nabil Hossain, John Krumm, Lucy Vanderwende, Eric Horvitz, Henry Kautz	We develop an algorithm called Libitum that helps humans generate humor in a Mad Lib, which is a popular fill-in-the-blank game.
68	Measuring Thematic Fit with Distributional Feature Overlap	Enrico Santus, Emmanuele Chersoni, Alessandro Lenci, Philippe Blache	In this paper, we introduce a new distributional method for modeling predicate-argument thematic fit judgments.
69	SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations	Dheeraj Mekala, Vivek Gupta, Bhargavi Paranjape, Harish Karnick	We present a feature vector formation technique for documents – Sparse Composite Document Vector (SCDV) – which overcomes several shortcomings of the current distributional paragraph vector representations that are widely used for text representation.
70	Supervised Learning of Universal Sentence Representations from Natural Language Inference Data	Alexis Conneau, Douwe Kiela, Holger Schwenk, Loïc Barrault, Antoine Bordes	In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.
71	Determining Semantic Textual Similarity using Natural Deduction Proofs	Hitomi Yanaka, Koji Mineshima, Pascual Martínez-Gómez, Daisuke Bekki	We propose a method for determining semantic textual similarity by combining shallow features with features extracted from natural deduction proofs of bidirectional entailment relations between sentence pairs.
72	Multi-Grained Chinese Word Segmentation	Chen Gong, Zhenghua Li, Min Zhang, Xinzhou Jiang	This work proposes and addresses multi-grained word segmentation (MWS). We build a large-scale pseudo MWS dataset for model training and tuning by leveraging the annotation heterogeneity of three single-grained word segmentation (SWS) datasets.
73	Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic	Nasser Zalmout, Nizar Habash	This paper presents a model for Arabic morphological disambiguation based on Recurrent Neural Networks (RNN).
74	Paradigm Completion for Derivational Morphology	Ryan Cotterell, Ekaterina Vylomova, Huda Khayrallah, Christo Kirov, David Yarowsky	We overview the theoretical motivation for a paradigmatic treatment of derivational morphology, and introduce the task of derivational paradigm completion as a parallel to inflectional paradigm completion.
75	A Sub-Character Architecture for Korean Language Processing	Karl Stratos	We introduce a novel sub-character architecture that exploits a unique compositional structure of the Korean language.
76	Do LSTMs really work so well for PoS tagging? — A replication study	Tobias Horsmann, Torsten Zesch	A recent study by Plank et al. (2016) found that LSTM-based PoS taggers considerably improve over the current state-of-the-art when evaluated on the corpora of the Universal Dependencies project that use a coarse-grained tagset.
77	The Labeled Segmentation of Printed Books	Lara McConnaughey, Jennifer Dai, David Bamman	We introduce the task of book structure labeling: segmenting and assigning a fixed category (such as Table of Contents, Preface, Index) to the document structure of printed books.
78	Cross-lingual Character-Level Neural Morphological Tagging	Ryan Cotterell, Georg Heigold	In the work presented here, we explore a transfer learning scheme, whereby we train character-level recurrent neural taggers to predict morphological taggings for high-resource languages and low-resource languages together.
79	Word-Context Character Embeddings for Chinese Word Segmentation	Hao Zhou, Zhenting Yu, Yue Zhang, Shujian Huang, Xinyu Dai, Jiajun Chen	We investigate training character embeddings on a word-based context in a similar way, showing that the simple method improves state-of-the-art neural word segmentation models significantly, beating tri-training baselines for leveraging auto-segmented data.
80	Segmentation-Free Word Embedding for Unsegmented Languages	Takamasa Oshikiri	In this paper, we propose a new pipeline of word embedding for unsegmented languages, called segmentation-free word embedding, which does not require word segmentation as a preprocessing step.
81	From Textbooks to Knowledge: A Case Study in Harvesting Axiomatic Knowledge from Textbooks to Solve Geometry Problems	Mrinmaya Sachan, Kumar Dubey, Eric Xing	As a case study, we present an approach for harvesting structured axiomatic knowledge from math textbooks.
82	RACE: Large-scale ReAding Comprehension Dataset From Examinations	Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, Eduard Hovy	We present RACE, a new dataset for benchmark evaluation of methods in the reading comprehension task.
83	Beyond Sentential Semantic Parsing: Tackling the Math SAT with a Cascade of Tree Transducers	Mark Hopkins, Cristian Petrescu-Prahova, Roie Levin, Ronan Le Bras, Alvaro Herrasti, Vidur Joshi	We present an approach for answering questions that span multiple sentences and exhibit sophisticated cross-sentence anaphoric phenomena, evaluating on a rich source of such questions – the math portion of the Scholastic Aptitude Test (SAT).
84	Learning Fine-Grained Expressions to Solve Math Word Problems	Danqing Huang, Shuming Shi, Chin-Yew Lin, Jian Yin	This paper presents a novel template-based method to solve math word problems.
85	Structural Embedding of Syntactic Trees for Machine Comprehension	Rui Liu, Junjie Hu, Wei Wei, Zi Yang, Eric Nyberg	In this paper, we propose structural embedding of syntactic trees (SEST), an algorithm framework to utilize structured information and encode them into vector representations that can boost the performance of algorithms for the machine comprehension.
86	World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions	Teng Long, Emmanuel Bengio, Ryan Lowe, Jackie Chi Kit Cheung, Doina Precup	In this paper, we introduce a task and several models to drive progress towards this goal.
87	Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension	David Golub, Po-Sen Huang, Xiaodong He, Li Deng	We develop a technique for transfer learning in machine comprehension (MC) using a novel two-stage synthesis network.
88	Deep Neural Solver for Math Word Problems	Yan Wang, Xiaojiang Liu, Shuming Shi	This paper presents a deep neural solver to automatically solve math word problems.
89	Latent Space Embedding for Retrieval in Question-Answer Archives	Deepak P, Dinesh Garg, Shirish Shevade	In this paper, we devise a community question answering (CQA) retrieval technique, LASER-QA, that embeds question-answer pairs within a unified latent space preserving the local neighborhood structure of question and answer spaces.
90	Question Generation for Question Answering	Nan Duan, Duyu Tang, Peng Chen, Ming Zhou	This paper presents how to generate questions from given passages using neural networks, where large scale QA pairs are automatically crawled and processed from Community-QA website, and used as training data.
91	Learning to Paraphrase for Question Answering	Li Dong, Jonathan Mallinson, Siva Reddy, Mirella Lapata	In this paper we turn to paraphrases as a means of capturing this knowledge and present a general framework which learns felicitous paraphrases for various QA tasks.
92	Temporal Information Extraction for Question Answering Using Syntactic Dependencies in an LSTM-based Architecture	Yuanliang Meng, Anna Rumshisky, Alexey Romanov	In this paper, we propose to use a set of simple, uniform in architecture LSTM-based models to recover different kinds of temporal relations from text.
93	Ranking Kernels for Structures and Embeddings: A Hybrid Preference and Classification Model	Kateryna Tymoshenko, Daniele Bonadiman, Alessandro Moschitti	In this work, we propose a new hybrid approach combining preference ranking applied to tree kernels (TKs) and pointwise ranking applied to convolutional neural networks (CNNs).
94	Recovering Question Answering Errors via Query Revision	Semih Yavuz, Izzeddin Gur, Yu Su, Xifeng Yan	In this work, we propose to crosscheck the corresponding KB relations behind the predicted answers and identify potential inconsistencies.
95	An empirical study on the effectiveness of images in Multimodal Neural Machine Translation	Jean-Benoit Delbrouck, Stéphane Dupont	In this paper, we compare several attention mechanisms on the multi-modal translation task (English, image → German) and evaluate the ability of the model to make use of images to improve translation.
96	Sound-Word2Vec: Learning Word Representations Grounded in Sounds	Ashwin Vijayakumar, Ramakrishna Vedantam, Devi Parikh	In this work, we treat sound as a first-class citizen, studying downstream textual tasks which require aural grounding.
97	The Promise of Premise: Harnessing Question Premises in Visual Question Answering	Aroma Mahendru, Viraj Prabhu, Akrit Mohapatra, Dhruv Batra, Stefan Lee	In this paper, we make a simple observation that questions about images often contain premises – objects and relationships implied by the question – and that reasoning about premises can help Visual Question Answering (VQA) models respond more intelligently to irrelevant or previously unseen questions.
98	Guided Open Vocabulary Image Captioning with Constrained Beam Search	Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould	We address this problem using a flexible approach that enables existing deep captioning architectures to take advantage of image taggers at test time, without re-training.
99	Zero-Shot Activity Recognition with Verb Attribute Induction	Rowan Zellers, Yejin Choi	In this paper, we investigate large-scale zero-shot activity recognition by modeling the visual and linguistic attributes of action verbs.
100	Deriving continuous grounded meaning representations from referentially structured multimodal contexts	Sina Zarrieß, David Schlangen	We propose a new task for evaluating grounded meaning representations (detection of potentially co-referential phrases) and show that it requires precise denotational representations of attribute meanings, which our method provides.
101	Hierarchically-Attentive RNN for Album Summarization and Storytelling	Licheng Yu, Mohit Bansal, Tamara Berg	We address the problem of end-to-end visual storytelling.
102	Video Highlight Prediction Using Audience Chat Reactions	Cheng-Yang Fu, Joon Lee, Mohit Bansal, Alexander Berg	We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures.
103	Reinforced Video Captioning with Entailment Rewards	Ramakanth Pasunuru, Mohit Bansal	Sequence-to-sequence models have shown promising improvements on the temporal task of video captioning, but they optimize word-level cross-entropy loss during training.
104	Evaluating Hierarchies of Verb Argument Structure with Hierarchical Clustering	Jesse Mu, Joshua K. Hartshorne, Timothy O’Donnell	We discuss limitations of a simple hierarchical representation and suggest similar approaches for identifying the representations underpinning verb argument structure.
105	Incorporating Global Visual Features into Attention-based Neural Machine Translation.	Iacer Calixto, Qun Liu	We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder.
106	Mapping Instructions and Visual Observations to Actions with Reinforcement Learning	Dipendra Misra, John Langford, Yoav Artzi	We propose to directly map raw visual observations and text input to actions for instruction execution.
107	An analysis of eye-movements during reading for the detection of mild cognitive impairment	Kathleen C. Fraser, Kristina Lundholm Fors, Dimitrios Kokkinakis, Arto Nordlund	We present a machine learning analysis of eye-tracking data for the detection of mild cognitive impairment, a decline in cognitive abilities that is associated with an increased risk of developing dementia.
108	A Structured Learning Approach to Temporal Relation Extraction	Qiang Ning, Zhili Feng, Dan Roth	This paper suggests that it is important to take these dependencies into account while learning to identify these relations and proposes a structured learning approach to address this challenge.
109	Importance sampling for unbiased on-demand evaluation of knowledge base population	Arun Chaganty, Ashwin Paranjape, Percy Liang, Christopher D. Manning	Our first contribution is a new importance-sampling based evaluation which corrects for this bias by annotating a new system’s predictions on-demand via crowdsourcing.
110	PACRR: A Position-Aware Neural IR Model for Relevance Matching	Kai Hui, Andrew Yates, Klaus Berberich, Gerard de Melo	In this work, we propose a novel neural IR model named PACRR aiming at better modeling position-dependent interactions between a query and a document.
111	Globally Normalized Reader	Jonathan Raiman, John Miller	This method improves the performance of all models considered in this work and is of independent interest for a variety of NLP tasks.
112	Speech segmentation with a neural encoder model of working memory	Micha Elsner, Cory Shain	We present the first unsupervised LSTM speech segmenter as a cognitive model of the acquisition of words from unsegmented input.
113	Speaking, Seeing, Understanding: Correlating semantic models with conceptual representation in the brain	Luana Bulat, Stephen Clark, Ekaterina Shutova	In this paper, we present a systematic evaluation and comparison of a range of widely-used, state-of-the-art semantic models in their ability to predict patterns of conceptual representation in the human brain.
114	Multi-modal Summarization for Asynchronous Collection of Text, Image, Audio and Video	Haoran Li, Junnan Zhu, Cong Ma, Jiajun Zhang, Chengqing Zong	In this work, we propose an extractive Multi-modal Summarization (MMS) method which can automatically generate a textual summary given a set of documents, images, audios and videos related to a specific topic. We further introduce an MMS corpus in English and Chinese.
115	Tensor Fusion Network for Multimodal Sentiment Analysis	Amir Zadeh, Minghai Chen, Soujanya Poria, Erik Cambria, Louis-Philippe Morency	In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics.
116	ConStance: Modeling Annotation Contexts to Improve Stance Classification	Kenneth Joseph, Lisa Friedland, William Hobbs, David Lazer, Oren Tsur	To characterize and reduce these biases, we develop ConStance, a general model for reasoning about annotations across information conditions.
117	Deeper Attention to Abusive User Content Moderation	John Pavlopoulos, Prodromos Malakasiotis, Ion Androutsopoulos	Experimenting with a new dataset of 1.6M user comments from a news portal and an existing dataset of 115K Wikipedia talk page comments, we show that an RNN operating on word embeddings outperforms the previous state of the art in moderation, which used logistic regression or an MLP classifier with character or word n-grams.
118	Outta Control: Laws of Semantic Change and Inherent Biases in Word Representation Models	Haim Dubossarsky, Daphna Weinshall, Eitan Grossman	This article evaluates three proposed laws of semantic change.
119	Human Centered NLP with User-Factor Adaptation	Veronica Lynn, Youngseo Son, Vivek Kulkarni, Niranjan Balasubramanian, H. Andrew Schwartz	We introduce a continuous adaptation technique, suited for real-valued user factors that are common in social science and bringing us closer to personalized NLP, adapting to each user uniquely.
120	Neural Sequence Learning Models for Word Sense Disambiguation	Alessandro Raganato, Claudio Delli Bovi, Roberto Navigli	To bridge this gap we adopt a different perspective and rely on sequence learning to frame the disambiguation problem: we propose and study in depth a series of end-to-end neural architectures directly tailored to the task, from bidirectional Long Short-Term Memory to encoder-decoder models.
121	Learning Word Relatedness over Time	Guy D. Rosin, Eytan Adar, Kira Radinsky	In this work, we introduce a temporal relationship model that is extracted from longitudinal data collections.
122	Inter-Weighted Alignment Network for Sentence Pair Modeling	Gehui Shen, Yunlun Yang, Zhi-Hong Deng	In this paper, we propose a model to measure the similarity of a sentence pair focusing on the interaction information.
123	A Short Survey on Taxonomy Learning from Text Corpora: Issues, Resources and Recent Advances	Chengyu Wang, Xiaofeng He, Aoying Zhou	In this paper, we overview recent advances on taxonomy construction from free texts, reorganizing relevant subtasks into a complete framework.
124	Idiom-Aware Compositional Distributed Semantics	Pengfei Liu, Kaiyu Qian, Xipeng Qiu, Xuanjing Huang	In this paper, we propose an idiom-aware distributed semantic model to build representation of sentences on the basis of understanding their contained idioms. To better evaluate our models, we also construct an idiom-enriched sentiment classification dataset with considerable scale and abundant peculiarities of idioms.
125	Macro Grammars and Holistic Triggering for Efficient Semantic Parsing	Yuchen Zhang, Panupong Pasupat, Percy Liang	We propose a new online learning algorithm that searches faster as training progresses.
126	A Continuously Growing Dataset of Sentential Paraphrases	Wuwei Lan, Siyu Qiu, Hua He, Wei Xu	In this paper, we present a new method to collect large-scale sentential paraphrases from Twitter by linking tweets through shared URLs. We present the largest human-labeled paraphrase corpus to date of 51,524 sentence pairs and the first cross-domain benchmarking for automatic paraphrase identification.
127	Cross-domain Semantic Parsing via Paraphrasing	Yu Su, Xifeng Yan	We discover two problems, small micro variance and large macro variance, of pre-trained word embeddings that hinder their direct use in neural networks, and propose standardization techniques as a remedy.
128	A Joint Sequential and Relational Model for Frame-Semantic Parsing	Bishan Yang, Tom Mitchell	We introduce a new method for frame-semantic parsing that significantly improves the prior state of the art.
129	Getting the Most out of AMR Parsing	Chuan Wang, Nianwen Xue	This paper proposes to tackle the AMR parsing bottleneck by improving two components of an AMR parser: concept identification and alignment.
130	AMR Parsing using Stack-LSTMs	Miguel Ballesteros, Yaser Al-Onaizan	We present a transition-based AMR parser that directly generates AMR parses from plain text.
131	An End-to-End Deep Framework for Answer Triggering with a Novel Group-Level Objective	Jie Zhao, Yu Su, Ziyu Guan, Huan Sun	In contrast to existing pipeline methods which first consider individual candidate answers separately and then make a prediction based on a threshold, we propose an end-to-end deep neural network framework, which is trained by a novel group-level objective function that directly optimizes the answer triggering performance.
132	Predicting Word Association Strengths	Andrew Cattle, Xiaojuan Ma	We find Word2Vec (Mikolov et al., 2013) and GloVe (Pennington et al., 2014) cosine similarities, as well as vector offsets, to be the highest performing features.
133	Learning Contextually Informed Representations for Linear-Time Discourse Parsing	Yang Liu, Mirella Lapata	In this work, we propose a linear-time parser with a novel way of representing discourse constituents based on neural networks which takes into account global contextual information and is able to capture long-distance dependencies.
134	Multi-task Attention-based Neural Networks for Implicit Discourse Relationship Representation and Identification	Man Lan, Jianxiang Wang, Yuanbin Wu, Zheng-Yu Niu, Haifeng Wang	We present a novel multi-task attention based neural network model to address implicit discourse relationship representation and identification through two types of representation learning, an attention based neural network for learning discourse relationship representation with two arguments and a multi-task framework for learning knowledge from annotated and unannotated corpora.
135	Chinese Zero Pronoun Resolution with Deep Memory Network	Qingyu Yin, Yu Zhang, Weinan Zhang, Ting Liu	In this paper, we address this issue by building a deep memory network that is capable of encoding zero pronouns into vector representations with information obtained from their contexts and potential antecedents.
136	How much progress have we made on RST discourse parsing? A replication study of recent results on the RST-DT	Mathieu Morey, Philippe Muller, Nicholas Asher	This article evaluates purported progress over the past years in RST discourse parsing.
137	What is it? Disambiguating the different readings of the pronoun ‘it’	Sharid Loáiciga, Liane Guillou, Christian Hardmeier	In this paper, we address the problem of predicting one of three functions for the English pronoun ‘it’: anaphoric, event reference or pleonastic.
138	Revisiting Selectional Preferences for Coreference Resolution	Benjamin Heinzerling, Nafise Sadat Moosavi, Michael Strube	We propose a dependency-based embedding model of selectional preferences which allows fine-grained compatibility judgments with high coverage.
139	Learning to Rank Semantic Coherence for Topic Segmentation	Liang Wang, Sujian Li, Yajuan Lv, Houfeng Wang	In this paper, we present an intuitive and simple idea to automatically create a “quasi” training dataset, which includes a large amount of text pairs from the same or different documents with different semantic coherence.
140	GRASP: Rich Patterns for Argumentation Mining	Eyal Shnarch, Ran Levy, Vikas Raykar, Noam Slonim	We report highly promising experimental results in several challenging text analysis tasks within the field of Argumentation Mining.
141	Patterns of Argumentation Strategies across Topics	Khalid Al-Khatib, Henning Wachsmuth, Matthias Hagen, Benno Stein	This paper presents an analysis of argumentation strategies in news editorials within and across topics.
142	Using Argument-based Features to Predict and Analyse Review Helpfulness	Haijing Liu, Yang Gao, Pin Lv, Mengxue Li, Shiqiang Geng, Minglan Li, Hao Wang	We study the helpful product reviews identification problem in this paper.
143	Here’s My Point: Joint Pointer Architecture for Argument Mining	Peter Potash, Alexey Romanov, Anna Rumshisky	This work presents the first neural network-based approach to link extraction in argument mining.
144	Identifying attack and support argumentative relations using deep learning	Oana Cocarascu, Francesca Toni	We propose a deep learning architecture to capture argumentative relations of attack and support from one piece of text to another, of the kind that naturally occur in a debate.
145	Neural Lattice-to-Sequence Models for Uncertain Inputs	Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel	In this work, we extend the TreeLSTM (Tai et al., 2015) into a LatticeLSTM that is able to consume word lattices, and can be used as encoder in an attentional encoder-decoder model.
146	Memory-augmented Neural Machine Translation	Yang Feng, Shiyue Zhang, Andi Zhang, Dong Wang, Andrew Abel	This paper presents a novel memory-augmented NMT (M-NMT) architecture, which stores knowledge about how words (usually infrequently encountered ones) should be translated in a memory and then utilizes them to assist the neural model.
147	Dynamic Data Selection for Neural Machine Translation	Marlies van der Wees, Arianna Bisazza, Christof Monz	With the recent increase in popularity of neural machine translation (NMT), we explore in this paper to what extent and how NMT can also benefit from data selection.
148	Neural Machine Translation Leveraging Phrase-based Models in a Hybrid Search	Leonard Dahlmann, Evgeny Matusov, Pavel Petrushkov, Shahram Khadivi	In this paper, we introduce a hybrid search for attention-based neural machine translation (NMT).
149	Translating Phrases in Neural Machine Translation	Xing Wang, Zhaopeng Tu, Deyi Xiong, Min Zhang	In this work, we propose a method to translate phrases in NMT by integrating a phrase memory storing target phrases from a phrase-based statistical machine translation (SMT) system into the encoder-decoder architecture of NMT.
150	Towards Bidirectional Hierarchical Representations for Attention-based Neural Machine Translation	Baosong Yang, Derek F. Wong, Tong Xiao, Lidia S. Chao, Jingbo Zhu	This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder.
151	Massive Exploration of Neural Machine Translation Architectures	Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le	In this work, we present a large-scale analysis of the sensitivity of NMT architectures to common hyperparameters.
152	Learning Translations via Matrix Completion	Derry Tanti Wijaya, Brendan Callahan, John Hewitt, Jie Gao, Xiao Ling, Marianna Apidianaki, Chris Callison-Burch	We model this task as a matrix completion problem, and present an effective and extendable framework for completing the matrix.
153	Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback	Khanh Nguyen, Hal Daumé III, Jordan Boyd-Graber	We describe a reinforcement learning algorithm that improves neural machine translation systems from simulated human feedback.
154	Towards Compact and Fast Neural Machine Translation Using a Combined Method	Xiaowei Zhang, Wei Chen, Feng Wang, Shuang Xu, Bo Xu	This paper presents a four stage pipeline to compress model and speed up the decoding for NMT.
155	Instance Weighting for Neural Machine Translation Domain Adaptation	Rui Wang, Masao Utiyama, Lemao Liu, Kehai Chen, Eiichiro Sumita	In this paper, two instance weighting technologies, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation.
156	Regularization techniques for fine-tuning in neural machine translation	Antonio Valerio Miceli Barone, Barry Haddow, Ulrich Germann, Rico Sennrich	We investigate techniques for supervised domain adaptation for neural machine translation where an existing model trained on a large out-of-domain dataset is adapted to a small in-domain dataset.
157	Source-Side Left-to-Right or Target-Side Left-to-Right? An Empirical Comparison of Two Phrase-Based Decoding Algorithms	Yin-Wen Chang, Michael Collins	This paper describes an empirical study of the phrase-based decoding algorithm proposed by Chang and Collins (2017).
158	Using Target-side Monolingual Data for Neural Machine Translation through Multi-task Learning	Tobias Domhan, Felix Hieber	We propose to modify the decoder in a neural sequence-to-sequence model to enable multi-task learning for two strongly related tasks: target-side language modeling and translation.
159	Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling	Diego Marcheggiani, Ivan Titov	We propose a version of graph convolutional networks (GCNs), a recent class of neural networks operating on graphs, suited to model syntactic dependency graphs.
160	Neural Semantic Parsing with Type Constraints for Semi-Structured Tables	Jayant Krishnamurthy, Pradeep Dasigi, Matt Gardner	We present a new semantic parsing model for answering compositional questions on semi-structured Wikipedia tables.
161	Joint Concept Learning and Semantic Parsing from Natural Language Explanations	Shashank Srivastava, Igor Labutov, Tom Mitchell	We present a joint model for (1) language interpretation (semantic parsing) and (2) concept learning (classification) that does not require labeling statements with logical forms.
162	Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection	Marek Rei, Luana Bulat, Douwe Kiela, Ekaterina Shutova	In this paper, we present the first deep learning architecture designed to capture metaphorical composition.
163	Identifying civilians killed by police with distantly supervised entity-event extraction	Katherine Keith, Abram Handler, Michael Pinkham, Cara Magliozzi, Joshua McDuffie, Brendan O’Connor	We present a newly collected police fatality corpus, which we release publicly, and present a model to solve this problem that uses EM-based distant supervision with logistic regression and convolutional neural network classifiers.
164	Asking too much? The rhetorical role of questions in political discourse	Justine Zhang, Arthur Spirling, Cristian Danescu-Niculescu-Mizil	In this work we introduce an unsupervised methodology for extracting surface motifs that recur in questions, and for grouping them according to their latent rhetorical role.
165	Detecting Perspectives in Political Debates	David Vilares, Yulan He	We propose a Bayesian modelling approach where topics (or propositions) and their associated perspectives (or viewpoints) are modeled as latent variables.
166	“i have a feeling trump will win………………”: Forecasting Winners and Losers from User Predictions on Twitter	Sandesh Swamy, Alan Ritter, Marie-Catherine de Marneffe	To answer this question, we build a corpus of tweets annotated for veridicality on which we train a log-linear classifier that detects positive veridicality with high precision.
167	A Question Answering Approach for Emotion Cause Extraction	Lin Gui, Jiannan Hu, Yulan He, Ruifeng Xu, Qin Lu, Jiachen Du	Inspired by recent advances in using deep memory networks for question answering (QA), we propose a new approach which considers emotion cause identification as a reading comprehension task in QA.
168	Story Comprehension for Predicting What Happens Next	Snigdha Chaturvedi, Haoruo Peng, Dan Roth	In this paper, we present a story comprehension model that explores three distinct semantic aspects: (i) the sequence of events described in the story, (ii) its emotional trajectory, and (iii) its plot consistency.
169	Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm	Bjarke Felbo, Alan Mislove, Anders Søgaard, Iyad Rahwan, Sune Lehmann	Our paper shows that by extending the distant supervision to a more diverse set of noisy labels, the models can learn richer representations.
170	Opinion Recommendation Using A Neural Model	Zhongqing Wang, Yue Zhang	We present opinion recommendation, a novel task of jointly generating a review with a rating score that a certain user would give to a certain product which is unreviewed by the user, given existing reviews to the product by other users, and the reviews that the user has given to other products.
171	CRF Autoencoder for Unsupervised Dependency Parsing	Jiong Cai, Yong Jiang, Kewei Tu	In this paper, we develop an unsupervised dependency parsing model based on the CRF autoencoder.
172	Efficient Discontinuous Phrase-Structure Parsing via the Generalized Maximum Spanning Arborescence	Caio Corro, Joseph Le Roux, Mathieu Lacroix	We present a new method for the joint task of tagging and non-projective dependency parsing.
173	Incremental Graph-based Neural Dependency Parsing	Xiaoqing Zheng	Very recently, some studies on neural dependency parsers have shown advantage over the traditional ones on a wide variety of languages.
174	Neural Discontinuous Constituency Parsing	Miloš Stanojević, Raquel G. Alhama	In this paper, we propose a solution to this problem by replacing the structured perceptron model with a recursive neural model that computes a global representation of the configuration, therefore allowing even the most remote parts of the configuration to influence the parsing decisions.
175	Stack-based Multi-layer Attention for Transition-based Dependency Parsing	Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen	In this paper, we propose a stack-based multi-layer attention model for seq2seq learning to better leverage structural linguistics information.
176	Dependency Grammar Induction with Neural Lexicalization and Big Training Data	Wenjuan Han, Yong Jiang, Kewei Tu	We study the impact of big models (in terms of the degree of lexicalization) and big data (in terms of the training corpus size) on dependency grammar induction.
177	Combining Generative and Discriminative Approaches to Unsupervised Dependency Parsing via Dual Decomposition	Yong Jiang, Wenjuan Han, Kewei Tu	In this paper, we propose a new learning strategy that learns a generative model and a discriminative model jointly based on the dual decomposition method.
178	Effective Inference for Generative Neural Parsing	Mitchell Stern, Daniel Fried, Dan Klein	We describe an alternative to the conventional action-level beam search used for discriminative neural models that enables us to decode directly in these generative models.
179	Semi-supervised Structured Prediction with Neural CRF Autoencoder	Xiao Zhang, Yong Jiang, Hao Peng, Kewei Tu, Dan Goldwasser	In this paper we propose an end-to-end neural CRF autoencoder (NCRF-AE) model for semi-supervised learning of sequential structured prediction problems.
180	TAG Parsing with Neural Networks and Vector Representations of Supertags	Jungo Kasai, Bob Frank, Tom McCoy, Owen Rambow, Alexis Nasr	We present supertagging-based models for Tree Adjoining Grammar parsing that use neural network architectures and dense vector representation of supertags (elementary trees) to achieve state-of-the-art performance in unlabeled and labeled attachment scores.
181	Global Normalization of Convolutional Neural Networks for Joint Entity and Relation Classification	Heike Adel, Hinrich Schütze	We introduce globally normalized convolutional neural networks for joint entity classification and relation extraction.
182	End-to-End Neural Relation Extraction with Global Optimization	Meishan Zhang, Yue Zhang, Guohong Fu	We build a globally optimized neural model for end-to-end relation extraction, proposing novel LSTM features in order to better learn context representations.
183	KGEval: Accuracy Estimation of Automatically Constructed Knowledge Graphs	Prakhar Ojha, Partha Talukdar	This important problem has largely been ignored in prior research – we fill this gap and propose KGEval.
184	Sparsity and Noise: Where Knowledge Graph Embeddings Fall Short	Jay Pujara, Eriq Augustine, Lise Getoor	In this paper, we consider the problem of applying embedding techniques to KGs extracted from text, which are often incomplete and contain errors.
185	Dual Tensor Model for Detecting Asymmetric Lexico-Semantic Relations	Goran Glavaš, Simone Paolo Ponzetto	In this work, we propose the Dual Tensor model, a neural architecture with which we explicitly model the asymmetry and capture the translation between unspecialized and specialized word embeddings via a pair of tensors.
186	Incorporating Relation Paths in Neural Relation Extraction	Wenyuan Zeng, Yankai Lin, Zhiyuan Liu, Maosong Sun	To address this issue, we build inference chains between two target entities via intermediate entities, and propose a path-based neural relation extraction model to encode the relational semantics from both direct sentences and inference chains.
187	Adversarial Training for Relation Extraction	Yi Wu, David Bamman, Stuart Russell	We apply adversarial training in relation extraction within the multi-instance multi-label learning framework.
188	Context-Aware Representations for Knowledge Base Relation Extraction	Daniil Sorokin, Iryna Gurevych	We demonstrate that for sentence-level relation extraction it is beneficial to consider other relations in the sentential context while predicting the target relation.
189	A Soft-label Method for Noise-tolerant Distantly Supervised Relation Extraction	Tianyu Liu, Kexiang Wang, Baobao Chang, Zhifang Sui	To this end, we introduce an entity-pair level denoise method which exploits semantic information from correctly labeled entity pairs to correct wrong labels dynamically during training.
190	A Sequential Model for Classifying Temporal Relations between Intra-Sentence Events	Prafulla Kumar Choubey, Ruihong Huang	We present a sequential model for temporal relation classification between intra-sentence events.
191	Deep Residual Learning for Weakly-Supervised Relation Extraction	Yi Yao Huang, William Yang Wang	In this paper, we design a novel convolutional neural network (CNN) with residual learning, and investigate its impacts on the task of distantly supervised noisy relation extraction.
192	Noise-Clustered Distant Supervision for Relation Extraction: A Nonparametric Bayesian Perspective	Qing Zhang, Houfeng Wang	To address the challenge, this work presents a novel nonparametric Bayesian formulation for the task.
193	Exploring Vector Spaces for Semantic Relations	Kata Gábor, Haïfa Zargayouna, Isabelle Tellier, Davide Buscaldi, Thierry Charnois	In this paper, we explore the potential of pre-trained word embeddings to identify generic types of semantic relations in an unsupervised experiment.
194	Temporal dynamics of semantic relations in word embeddings: an application to predicting armed conflict participants	Andrey Kutuzov, Erik Velldal, Lilja Øvrelid	This paper deals with using word embedding models to trace the temporal dynamics of semantic relations between pairs of words.
195	Dynamic Entity Representations in Neural Language Models	Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, Noah A. Smith	We present a new type of language model, EntityNLM, that can explicitly model entities, dynamically update their representations, and contextually generate their mentions.
196	Towards Quantum Language Models	Ivano Basile, Fabio Tamburini	This paper presents a new approach for building Language Models using the Quantum Probability Theory, a Quantum Language Model (QLM).
197	Reference-Aware Language Models	Zichao Yang, Phil Blunsom, Chris Dyer, Wang Ling	We propose a general class of language models that treat reference as discrete stochastic latent variables.
198	A Simple Language Model based on PMI Matrix Approximations	Oren Melamud, Ido Dagan, Jacob Goldberger	Specifically, we show that with minor modifications to word2vec’s algorithm, we get principled language models that are closely related to the well-established Noise Contrastive Estimation (NCE) based language models.
199	Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones	Zhenisbek Assylbekov, Rustem Takhanov, Bagdat Myrzakhmetov, Jonathan N. Washington	Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones
200	Inducing Semantic Micro-Clusters from Deep Multi-View Representations of Novels	Lea Frermann, György Szarvas	Here, we propose a principled and scalable framework leveraging expert-provided semantic tags (e.g., mystery, pirates) to evaluate plot representations in an extrinsic fashion, assessing their ability to produce locally coherent groupings of novels (micro-clusters) in model space.
201	Initializing Convolutional Filters with Semantic Features for Text Classification	Shen Li, Zhe Zhao, Tao Liu, Renfen Hu, Xiaoyong Du	This paper presents a novel weight initialization method to improve the CNNs for text classification.
202	Shortest-Path Graph Kernels for Document Similarity	Giannis Nikolentzos, Polykarpos Meladianos, François Rousseau, Yannis Stavrakas, Michalis Vazirgiannis	In this paper, we present a novel document similarity measure based on the definition of a graph kernel between pairs of documents.
203	Adapting Topic Models using Lexical Associations with Tree Priors	Weiwei Yang, Jordan Boyd-Graber, Philip Resnik	Models work best when they are optimized taking into account the evaluation criteria that people care about.
204	Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data	Natalie Parde, Rodney Nielsen	We present an aggregation approach that learns a regression model from crowdsourced annotations to predict aggregated labels for instances that have no expert adjudications.
205	CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles	Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, Anbang Xu	In this paper, we postulate that while producing SRL annotation does require expert involvement in general, a large subset of SRL labeling tasks is in fact appropriate for the crowd.
206	A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks	Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher	We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks.
207	Earth Mover’s Distance Minimization for Unsupervised Bilingual Lexicon Induction	Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun	In this paper, we attempt to establish the cross-lingual connection without relying on any cross-lingual supervision.
208	Unfolding and Shrinking Neural Machine Translation Ensembles	Felix Stahlberg, Bill Byrne	This work aims to reduce the runtime to be on par with a single system without compromising the translation quality.
209	Graph Convolutional Encoders for Syntax-aware Neural Machine Translation	Joost Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, Khalil Sima’an	We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation.
210	Trainable Greedy Decoding for Neural Machine Translation	Jiatao Gu, Kyunghyun Cho, Victor O.K. Li	In this paper, we solely focus on the problem of decoding given a trained neural machine translation model.
211	Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features	Fan Yang, Arjun Mukherjee, Eduard Dragut	Despite the embedded genre in the article, not everyone can recognize the satirical cues, and some readers therefore believe the news to be true.
212	Fine Grained Citation Span for References in Wikipedia	Besnik Fetahu, Katja Markert, Avishek Anand	We propose a sequence classification approach where for a paragraph and a citation, we determine the citation span at a fine-grained level.
213	Identifying Semantic Edit Intentions from Revisions in Wikipedia	Diyi Yang, Aaron Halfaker, Robert Kraut, Eduard Hovy	In this work, we develop in collaboration with Wikipedia editors a 13-category taxonomy of the semantic intention behind edits in Wikipedia articles.
214	Accurate Supervised and Semi-Supervised Machine Reading for Long Documents	Daniel Hewlett, Llion Jones, Alexandre Lacoste, Izzeddin Gur	We introduce a hierarchical architecture for machine reading capable of extracting precise information from long documents.
215	Adversarial Examples for Evaluating Reading Comprehension Systems	Robin Jia, Percy Liang	To reward systems with real language understanding abilities, we propose an adversarial evaluation scheme for the Stanford Question Answering Dataset (SQuAD).
216	Reasoning with Heterogeneous Knowledge for Commonsense Machine Comprehension	Hongyu Lin, Le Sun, Xianpei Han	In this paper, we propose a multi-knowledge reasoning method, which can exploit heterogeneous knowledge for commonsense machine comprehension.
217	Document-Level Multi-Aspect Sentiment Classification as Machine Comprehension	Yichun Yin, Yangqiu Song, Ming Zhang	In this paper, we model the task as a machine comprehension problem where pseudo question-answer pairs are constructed by a small number of aspect-related keywords and aspect ratings. We will release our code and data to support replicability of the method.
218	What is the Essence of a Claim? Cross-Domain Claim Identification	Johannes Daxenberger, Steffen Eger, Ivan Habernal, Christian Stab, Iryna Gurevych	We perform a qualitative analysis across six different datasets and show that these appear to conceptualize claims quite differently.
219	Identifying Where to Focus in Reading Comprehension for Neural Question Generation	Xinya Du, Claire Cardie	We propose a hierarchical neural sentence-level sequence tagging model for this task, which existing approaches to question generation have ignored.
220	Break it Down for Me: A Study in Automated Lyric Annotation	Lucas Sterckx, Jason Naradowsky, Bill Byrne, Thomas Demeester, Chris Develder	We introduce the task of automated lyric annotation (ALA).
221	Cascaded Attention based Unsupervised Information Distillation for Compressive Summarization	Piji Li, Wai Lam, Lidong Bing, Weiwei Guo, Hang Li	We propose a cascaded attention based unsupervised model to estimate the salience information from the text for compressive multi-document summarization.
222	Deep Recurrent Generative Decoder for Abstractive Text Summarization	Piji Li, Wai Lam, Lidong Bing, Zihao Wang	We propose a new framework for abstractive text summarization based on a sequence-to-sequence oriented encoder-decoder model equipped with a deep recurrent generative decoder (DRGN).
223	Extractive Summarization Using Multi-Task Learning with Document Classification	Masaru Isonuma, Toru Fujino, Junichiro Mori, Yutaka Matsuo, Ichiro Sakata	In this paper, we propose a general framework for summarization that extracts sentences from a document using externally related information.
224	Towards Automatic Construction of News Overview Articles by News Synthesis	Jianmin Zhang, Xiaojun Wan	In this paper we investigate a new task of automatically constructing an overview article from a given set of news articles about a news event.
225	Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank	Kai Zhao, Liang Huang	In this paper we propose the first end-to-end discourse parser that jointly parses in both syntax and discourse levels, as well as the first syntacto-discourse treebank by integrating the Penn Treebank and the RST Treebank.
226	Event Coreference Resolution by Iteratively Unfolding Inter-dependencies among Events	Prafulla Kumar Choubey, Ruihong Huang	We introduce a novel iterative approach for event coreference resolution that gradually builds event clusters by exploiting inter-dependencies among event mentions within the same chain as well as across event chains.
227	When to Finish? Optimal Beam Search for Neural Text Generation (modulo beam size)	Liang Huang, Kai Zhao, Mingbo Ma	We propose a provably optimal beam search algorithm that will always return the optimal-score complete hypothesis (modulo beam size), and finish as soon as the optimality is established.
228	Steering Output Style and Topic in Neural Response Generation	Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg	We propose simple and flexible training and decoding methods for influencing output style and topic in neural encoder-decoder based language generation.
229	Preserving Distributional Information in Dialogue Act Classification	Quan Hung Tran, Ingrid Zukerman, Gholamreza Haffari	This paper introduces a novel training/decoding strategy for sequence labeling.
230	Adversarial Learning for Neural Dialogue Generation	Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, Dan Jurafsky	We cast the task as a reinforcement learning problem where we jointly train two systems: a generative model to produce response sequences, and a discriminator, analogous to the human evaluator in the Turing test, to distinguish between the human-generated dialogues and the machine-generated ones.
231	Using Context Information for Dialog Act Classification in DNN Framework	Yang Liu, Kun Han, Zhao Tan, Yun Lei	This paper proposes several ways of using context information for DA classification, all in the deep learning framework.
232	Modeling Dialogue Acts with Content Word Filtering and Speaker Preferences	Yohan Jo, Michael Yoder, Hyeju Jang, Carolyn Rosé	We present an unsupervised model of dialogue act sequences in conversation.
233	Towards Implicit Content-Introducing for Generative Short-Text Conversation Systems	Lili Yao, Yaoyuan Zhang, Yansong Feng, Dongyan Zhao, Rui Yan	In this paper, we aim to generate a more meaningful and informative reply when answering a given question.
234	Affordable On-line Dialogue Policy Learning	Cheng Chang, Runzhe Yang, Lu Chen, Xiang Zhou, Kai Yu	For solving the unsustainable learning problem, we proposed a complete companion teaching framework incorporating the guidance from the human teacher.
235	Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models	Yuanlong Shao, Stephan Gouws, Denny Britz, Anna Goldie, Brian Strope, Ray Kurzweil	In this work, we focus on the single turn setting.
236	Bootstrapping incremental dialogue systems from minimal data: the generalisation power of dialogue grammars	Arash Eshghi, Igor Shalyminov, Oliver Lemon	We investigate an end-to-end method for automatically inducing task-based dialogue systems from small amounts of unannotated dialogue data.
237	Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning	Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong	This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales.
238	Why We Need New Evaluation Metrics for NLG	Jekaterina Novikova, Ondřej Dušek, Amanda Cercas Curry, Verena Rieser	In this paper, we motivate the need for novel, system- and data-independent automatic evaluation methods: We investigate a wide range of metrics, including state-of-the-art word-based and novel grammar-based ones, and demonstrate that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG.
239	Challenges in Data-to-Document Generation	Sam Wiseman, Stuart Shieber, Alexander Rush	In this work, we suggest a slightly more difficult data-to-text generation task, and investigate how effective current approaches are on this task. In particular, we introduce a new, large-scale corpus of data records paired with descriptive documents, propose a series of extractive evaluation methods for analyzing performance, and obtain baseline results using current neural generation methods.
240	All that is English may be Hindi: Enhancing language identification through automatic ranking of the likeliness of word borrowing in social media	Jasabanta Patro, Bidisha Samanta, Saurabh Singh, Abhipsa Basu, Prithwish Mukherjee, Monojit Choudhury, Animesh Mukherjee	In this paper, we present a set of computational methods to identify the likeliness of a word being borrowed, based on the signals from social media.
241	Multi-View Unsupervised User Feature Embedding for Social Media-based Substance Use Prediction	Tao Ding, Warren K. Bickel, Shimei Pan	In this paper, we demonstrate how the state-of-the-art machine learning and text mining techniques can be used to build effective social media-based substance use detection systems.
242	Demographic-aware word associations	Aparna Garimella, Carmen Banea, Rada Mihalcea	To capture these variations, we introduce the task of demographic-aware word associations. We build a new gold standard dataset consisting of word association responses for approximately 300 stimulus words, collected from more than 800 respondents of different gender (male/female) and from different locations (India/United States), and show that there are significant variations in the word associations made by these groups.
243	A Factored Neural Network Model for Characterizing Online Discussions in Vector Space	Hao Cheng, Hao Fang, Mari Ostendorf	We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums.
244	Dimensions of Interpersonal Relationships: Corpus and Experiments	Farzana Rashid, Eduardo Blanco	This paper presents a corpus and experiments to determine dimensions of interpersonal relationships. We create a corpus by retrieving pairs of people, and then annotating dimensions for their relationships.
245	Argument Mining on Twitter: Arguments, Facts and Sources	Mihai Dusmanu, Elena Cabrio, Serena Villata	In this paper, we apply supervised classification to identify arguments on Twitter, and we present two new tasks for argument mining, namely facts recognition and source identification.
246	Distinguishing Japanese Non-standard Usages from Standard Ones	Tatsuya Aoki, Ryohei Sasano, Hiroya Takamura, Manabu Okumura	In this study, we attempt to distinguish non-standard usages on social media from standard ones in an unsupervised manner.
247	Connotation Frames of Power and Agency in Modern Films	Maarten Sap, Marcella Cindy Prasettio, Ari Holtzman, Hannah Rashkin, Yejin Choi	We introduce connotation frames of power and agency, a pragmatic formalism organized using frame semantic representations, to model how different levels of power and agency are implicitly projected on actors through their actions.
248	Controlling Human Perception of Basic User Traits	Daniel Preoţiuc-Pietro, Sharath Chandra Guntuku, Lyle Ungar	In this pilot study, we measure the extent to which human perception of basic user trait information – gender and age – is controllable through text.
249	Topic Signatures in Political Campaign Speeches	Clément Gautrais, Peggy Cellier, René Quiniou, Alexandre Termier	In this paper, we present a method combining standard topic modeling with signature mining for analyzing topic recurrence in speeches of Clinton and Trump during the 2016 American presidential campaign.
250	Assessing Objective Recommendation Quality through Political Forecasting	H. Andrew Schwartz, Masoud Rouhizadeh, Michael Bishop, Philip Tetlock, Barbara Mellers, Lyle Ungar	We explore recommendation quality assessment with respect to both subjective (i.e. users’ ratings) and objective (i.e., did it influence?
251	Never Abandon Minorities: Exhaustive Extraction of Bursty Phrases on Microblogs Using Set Cover Problem	Masumi Shirakawa, Takahiro Hara, Takuya Maekawa	We propose a language-independent data-driven method to exhaustively extract bursty phrases of arbitrary forms (e.g., phrases other than simple noun phrases) from microblogs.
252	Maximum Margin Reward Networks for Learning from Explicit and Implicit Supervision	Haoruo Peng, Ming-Wei Chang, Wen-tau Yih	In this work, we propose Maximum Margin Reward Networks, a neural network-based framework that aims to learn from both explicit (full structures) and implicit supervision signals (delayed feedback on the correctness of the predicted structure).
253	The Impact of Modeling Overall Argumentation with Tree Kernels	Henning Wachsmuth, Giovanni Da San Martino, Dora Kiesel, Benno Stein	Several approaches have been proposed to model either the explicit sequential structure of an argumentative text or its implicit hierarchical structure.
254	Learning Generic Sentence Representations Using Convolutional Neural Networks	Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin	We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes.
255	Repeat before Forgetting: Spaced Repetition for Efficient and Effective Training of Neural Networks	Hadi Amiri, Timothy Miller, Guergana Savova	We present a novel approach for training artificial neural networks.
256	Part-of-Speech Tagging for Twitter with Adversarial Neural Networks	Tao Gui, Qi Zhang, Haoran Huang, Minlong Peng, Xuanjing Huang	In this work, we study the problem of part-of-speech tagging for Tweets.
257	Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings	Bofang Li, Tao Liu, Zhe Zhao, Buzhou Tang, Aleksandr Drozd, Anna Rogers, Xiaoyong Du	We provide a systematic investigation of 4 different syntactic context types and context representations for learning word embeddings.
258	Does syntax help discourse segmentation? Not so much	Chloé Braud, Ophélie Lacroix, Anders Søgaard	Our results show that dependency information is less useful than expected, but we provide a fully scalable, robust model that only relies on part-of-speech information, and show that it performs well across languages in the absence of any gold-standard annotation.
259	Deal or No Deal? End-to-End Learning of Negotiation Dialogues	Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, Dhruv Batra	For the first time, we show it is possible to train end-to-end models for negotiation, which must learn both linguistic and reasoning skills with no annotated dialogue states.
260	Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning	Lu Chen, Xiang Zhou, Cheng Chang, Runzhe Yang, Kai Yu	We employ a companion learning framework to integrate the two approaches for on-line dialogue policy learning, in which a pre-defined rule-based policy acts as a “teacher” and guides a data-driven RL system by giving example actions as well as additional rewards.
261	Towards Debate Automation: a Recurrent Model for Predicting Debate Winners	Peter Potash, Anna Rumshisky	In this paper we introduce a practical first step towards the creation of an automated debate agent: a state-of-the-art recurrent predictive model for predicting debate winners.
262	Further Investigation into Reference Bias in Monolingual Evaluation of Machine Translation	Qingsong Ma, Yvette Graham, Timothy Baldwin, Qun Liu	Further Investigation into Reference Bias in Monolingual Evaluation of Machine Translation
263	A Challenge Set Approach to Evaluating Machine Translation	Pierre Isabelle, Colin Cherry, George Foster	To exemplify this approach, we present an English-French challenge set, and use it to analyze phrase-based and neural systems.
264	Knowledge Distillation for Bilingual Dictionary Induction	Ndapandula Nakashole, Raphael Flauger	In this paper, we propose a bridging approach, where our main contribution is a knowledge distillation training objective.
265	Machine Translation, it’s a question of style, innit? The case of English tag questions	Rachel Bawden	In this paper, we address the problem of generating English tag questions (TQs) (e.g. it is, isn’t it?)
266	Deciphering Related Languages	Nima Pourdamghani, Kevin Knight	We present a method for translating texts between close language pairs.
267	Identifying Cognate Sets Across Dictionaries of Related Languages	Adam St Arnaud, David Beck, Grzegorz Kondrak	We present a system for identifying cognate sets across dictionaries of related languages.
268	Learning Language Representations for Typology Prediction	Chaitanya Malaviya, Graham Neubig, Patrick Littell	Exploiting the existence of parallel texts in more than a thousand languages, we build a massive many-to-one NMT system from 1017 languages into English, and use this to predict information missing from typological databases.
269	Cheap Translation for Cross-Lingual Named Entity Recognition	Stephen Mayhew, Chen-Tse Tsai, Dan Roth	We propose a simple method for cross-lingual named entity recognition (NER) that works well in settings with very minimal resources.
270	Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation	Ivan Vulić, Nikola Mrkšić, Anna Korhonen	In this work, we propose a novel cross-lingual transfer method for inducing VerbNets for multiple languages.
271	Classification of telicity using cross-linguistic annotation projection	Annemarie Friedrich, Damyana Gateva	We create a new data set of English texts manually annotated with telicity.
272	Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation	Carolin Lawrence, Artem Sokolov, Stefan Riezler	We show that counterfactual learning from deterministic bandit logs is possible nevertheless by smoothing out deterministic components in learning.
273	Learning Fine-grained Relations from Chinese User Generated Categories	Chengyu Wang, Yan Fan, Xiaofeng He, Aoying Zhou	In this paper, we present a weakly supervised learning framework to harvest relations from Chinese UGCs.
274	Improving Slot Filling Performance with Attentive Neural Networks on Dependency Structures	Lifu Huang, Avirup Sil, Heng Ji, Radu Florian	In this paper we propose an effective DNN architecture for SF with the following new strategies: (1).
275	Identifying Products in Online Cybercrime Marketplaces: A Dataset for Fine-grained Domain Adaptation	Greg Durrett, Jonathan K. Kummerfeld, Taylor Berg-Kirkpatrick, Rebecca Portnoff, Sadia Afroz, Damon McCoy, Kirill Levchenko, Vern Paxson	In this work, we study the task of identifying products being bought and sold in online cybercrime forums, which exhibits particularly challenging cross-domain effects. We release a dataset of 1,938 annotated posts from across the four forums.
276	Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators	Aldrian Obaja Muis, Wei Lu	In this paper, we propose a new model that is capable of recognizing overlapping mentions.
277	Deep Joint Entity Disambiguation with Local Neural Attention	Octavian-Eugen Ganea, Thomas Hofmann	We propose a novel deep learning model for joint document-level entity disambiguation, which leverages learned neural representations.
278	MinIE: Minimizing Facts in Open Information Extraction	Kiril Gashteovski, Rainer Gemulla, Luciano del Corro	In this paper, we propose MinIE, an OIE system that aims to provide useful, compact extractions with high precision and recall.
279	Scientific Information Extraction with Semi-supervised Neural Tagging	Yi Luan, Mari Ostendorf, Hannaneh Hajishirzi	We cast the problem as sequence tagging and introduce semi-supervised methods to a neural tagging model, which builds on recent advances in named entity recognition.
280	NITE: A Neural Inductive Teaching Framework for Domain Specific NER	Siliang Tang, Ning Zhang, Jinjiang Zhang, Fei Wu, Yueting Zhuang	In this paper, we proposed a novel Neural Inductive TEaching framework (NITE) to transfer knowledge from existing domain-specific NER models into an arbitrary deep neural network in a teacher-student training manner.
281	Speeding up Reinforcement Learning-based Information Extraction Training using Asynchronous Methods	Aditya Sharma, Zarana Parekh, Partha Talukdar	We leverage recent advances in parallel RL training using asynchronous methods and propose RLIE-A3C.
282	Leveraging Linguistic Structures for Named Entity Recognition with Bidirectional Recursive Neural Networks	Peng-Hsuan Li, Ruo-Ping Dong, Yu-Siang Wang, Ju-Chieh Chou, Wei-Yun Ma	In this paper, we utilize the linguistic structures of texts to improve named entity recognition by BRNN-CNN, a special bidirectional recursive network attached with a convolutional network.
283	Fast and Accurate Entity Recognition with Iterated Dilated Convolutions	Emma Strubell, Patrick Verga, David Belanger, Andrew McCallum	We describe a distinct combination of network structure, parameter sharing and training procedures that enable dramatic 14-20x test-time speedups while retaining accuracy comparable to the Bi-LSTM-CRF.
284	Entity Linking via Joint Encoding of Types, Descriptions, and Context	Nitish Gupta, Sameer Singh, Dan Roth	In this work we present a neural, modular entity linking system that learns a unified dense representation for each entity using multiple sources of information, such as its description, contexts around its mentions, and its fine-grained types.
285	An Insight Extraction System on BioMedical Literature with Deep Neural Networks	Hua He, Kris Ganjam, Navendu Jain, Jessica Lundin, Ryen White, Jimmy Lin	As new scientific findings appear across a large collection of biomedical publications, our aim is to tap into this literature to automate biomedical knowledge extraction and identify important insights from them.
286	Word Etymology as Native Language Interference	Vivi Nastase, Carlo Strapparava	We present experiments that show the influence of native language on lexical choice when producing text in another language – in this particular case English.
287	A Simpler and More Generalizable Story Detector using Verb and Character Features	Joshua Eisenberg, Mark Finlayson	We present a new state-of-the-art detector that achieves a maximum performance of 0.75 F1 (a 14% improvement), with significantly greater generalizability than previous work.
288	Multi-modular domain-tailored OCR post-correction	Sarah Schulz, Jonas Kuhn	Since we consider the accessibility of the resulting tool as a crucial part of Digital Humanities collaborations, we describe the workflow we suggest for efficient text recognition and subsequent automatic and manual post-correction.
289	Learning to Predict Charges for Criminal Cases with Legal Basis	Bingfeng Luo, Yansong Feng, Jianbo Xu, Xiang Zhang, Dongyan Zhao	We argue that relevant law articles play an important role in this task, and therefore propose an attention-based neural network method to jointly model the charge prediction task and the relevant article extraction task in a unified framework.
290	Quantifying the Effects of Text Duplication on Semantic Models	Alexandra Schofield, Laure Thompson, David Mimno	By artificially creating different forms of duplicate text we confirm several hypotheses about how repeated text impacts models.
291	Identifying Semantically Deviating Outlier Documents	Honglei Zhuang, Chi Wang, Fangbo Tao, Lance Kaplan, Jiawei Han	In this paper, we study the problem of mining semantically deviating document outliers in a given corpus.
292	Detecting and Explaining Causes From Text For a Time Series Event	Dongyeop Kang, Varun Gangal, Ang Lu, Zheng Chen, Eduard Hovy	To detect causal features from text, we propose a novel method based on the Granger causality of time series between features extracted from text such as N-grams, topics, sentiments, and their composition.
293	A Novel Cascade Model for Learning Latent Similarity from Heterogeneous Sequential Data of MOOC	Zhuoxuan Jiang, Shanshan Feng, Gao Cong, Chunyan Miao, Xiaoming Li	In this paper, we propose a novel cascade model, which can capture both the latent semantics and latent similarity by modeling MOOC data.
294	Identifying the Provision of Choices in Privacy Policy Text	Kanthashree Mysore Sathyendra, Shomir Wilson, Florian Schaub, Sebastian Zimmeck, Norman Sadeh	In particular, we present a two-stage architecture of classification models to identify opt-out choices in privacy policy text, labelling common varieties of choices with a mean F1 score of 0.735.
295	An Empirical Analysis of Edit Importance between Document Versions	Tanya Goyal, Sachin Kelkar, Manas Agarwal, Jeenu Grover	In this paper, we present a novel approach to infer significance of various textual edits to documents.
296	Transition-Based Disfluency Detection using LSTMs	Shaolei Wang, Wanxiang Che, Yue Zhang, Meishan Zhang, Ting Liu	In this paper, we model the problem of disfluency detection using a transition-based framework, which incrementally constructs and labels the disfluency chunk of input sentences using a new transition system without syntax information.
297	Neural Sequence-Labelling Models for Grammatical Error Correction	Helen Yannakoudakis, Marek Rei, Øistein E. Andersen, Zheng Yuan	We propose an approach to N-best list reranking using neural sequence-labelling models.
298	Adapting Sequence Models for Sentence Correction	Allen Schmaltz, Yoon Kim, Alexander Rush, Stuart Shieber	In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode subword information via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches.
299	A Study of Style in Machine Translation: Controlling the Formality of Machine Translation Output	Xing Niu, Marianna Martindale, Marine Carpuat	We propose to use lexical formality models to control the formality level of machine translation output.
300	Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU	Jacob Devlin	In this work we focus on efficient decoding, with a goal of achieving accuracy close to the state-of-the-art in neural machine translation (NMT), while achieving CPU decoding speed/throughput close to that of a phrasal decoder.
301	Exploiting Cross-Sentence Context for Neural Machine Translation	Longyue Wang, Zhaopeng Tu, Andy Way, Qun Liu	In this paper, we propose a cross-sentence context-aware approach and investigate the influence of historical contextual information on the performance of neural machine translation (NMT).
302	Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources	Joo-Kyung Kim, Young-Bum Kim, Ruhi Sarikaya, Eric Fosler-Lussier	In this paper, we introduce a cross-lingual transfer learning model for POS tagging without ancillary resources such as parallel corpora.
303	Image Pivoting for Learning Multilingual Multimodal Representations	Spandana Gella, Rico Sennrich, Frank Keller, Mirella Lapata	In this paper we propose a model to learn multimodal multilingual representations for matching images and sentences in different languages, with the aim of advancing multilingual versions of image search and image understanding.
304	Neural Machine Translation with Source Dependency Representation	Kehai Chen, Rui Wang, Masao Utiyama, Lemao Liu, Akihiro Tamura, Eiichiro Sumita, Tiejun Zhao	In this paper, we propose a novel NMT with source dependency representation to improve translation performance of NMT, especially for long sentences.
305	Visual Denotations for Recognizing Textual Entailment	Dan Han, Pascual Martínez-Gómez, Koji Mineshima	We propose to map phrases to their visual denotations and compare their meaning in terms of their images.
306	Sequence Effects in Crowdsourced Annotations	Nitika Mathur, Timothy Baldwin, Trevor Cohn	In this paper, we explore sequence effects where annotations of an item are affected by the preceding items.
307	No Need to Pay Attention: Simple Recurrent Neural Networks Work!	Ferhan Ture, Oliver Jojic	We present a preliminary analysis of the performance of our model on real queries from Comcast’s X1 entertainment platform with millions of users every day.
308	The strange geometry of skip-gram with negative sampling	David Mimno, Laure Thompson	We show that this geometric concentration depends on the ratio of positive to negative examples, and that it is neither theoretically nor empirically inherent in related embedding algorithms.
309	Natural Language Processing with Small Feed-Forward Networks	Jan A. Botha, Emily Pitler, Ji Ma, Anton Bakalov, Alex Salcianu, David Weiss, Ryan McDonald, Slav Petrov	We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models.
310	Deep Multi-Task Learning for Aspect Term Extraction with Memory Interaction	Xin Li, Wai Lam	We propose a novel LSTM-based deep multi-task learning framework for aspect term extraction from user review sentences.
311	Analogs of Linguistic Structure in Deep Representations	Jacob Andreas, Dan Klein	We investigate the compositional structure of message vectors computed by a deep network trained on a communication game.
312	A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings	Wei Yang, Wei Lu, Vincent Zheng	In this paper, we present a simple yet effective method for learning word embeddings based on text from different domains.
313	Learning what to read: Focused machine reading	Enrique Noriega-Atala, Marco A. Valenzuela-Escárcega, Clayton Morrison, Mihai Surdeanu	In this work, we introduce a focused reading approach to guide the machine reading of biomedical literature towards what literature should be read to answer a biomedical query as efficiently as possible.
314	DOC: Deep Open Classification of Text Documents	Lei Shu, Hu Xu, Bing Liu	This paper proposes a novel deep learning based approach.
315	Charmanteau: Character Embedding Models For Portmanteau Creation	Varun Gangal, Harsh Jhamtani, Graham Neubig, Eduard Hovy, Eric Nyberg	We propose a noisy-channel-style model, which allows for the incorporation of unsupervised word lists, improving performance over a standard source-to-target model.
316	Using Automated Metaphor Identification to Aid in Detection and Prediction of First-Episode Schizophrenia	E. Darío Gutiérrez, Guillermo Cecchi, Cheryl Corcoran, Philip Corlett	Using metaphor-identification and sentiment-analysis algorithms to automatically generate features, we create a classifier that, with high accuracy, can predict which patients will develop (or currently suffer from) schizophrenia.
317	Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking	Hannah Rashkin, Eunsol Choi, Jin Yea Jang, Svitlana Volkova, Yejin Choi	We present an analytic study on the language of news media in the context of political fact-checking and fake news detection.
318	Topic-Based Agreement and Disagreement in US Electoral Manifestos	Stefano Menini, Federico Nanni, Simone Paolo Ponzetto, Sara Tonelli	We present a topic-based analysis of agreement and disagreement in political manifestos, which relies on a new method for topic detection based on key concept clustering.
319	Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora	Hainan Xu, Philipp Koehn	We propose a novel type of bag-of-words translation feature, and train logistic regression models to classify good data and synthetic noisy data in the proposed feature space.
320	Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps	Tobias Falke, Iryna Gurevych	To close this gap, we present a newly created corpus of concept maps that summarize heterogeneous collections of web documents on educational topics. We release the corpus along with a baseline system and proposed evaluation protocol to enable further research on this variant of summarization.
321	Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog	Satwik Kottur, José Moura, Stefan Lee, Dhruv Batra	In this paper, using a Task & Talk reference game between two agents as a testbed, we present a sequence of ‘negative’ results culminating in a ‘positive’ one – showing that while most agent-invented languages are effective (i.e. achieve near-perfect task rewards), they are decidedly not interpretable or compositional.
322	Depression and Self-Harm Risk Assessment in Online Forums	Andrew Yates, Arman Cohan, Nazli Goharian	In this work, we present a framework for supporting and studying users in both types of communities. We introduce a large-scale general forum dataset consisting of users with self-reported depression diagnoses matched with control users.
323	Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints	Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang	In this work, we study data and models associated with multilabel object classification and visual semantic role labeling.