Paper Digest: EMNLP 2017 Highlights
The Conference on Empirical Methods in Natural Language Processing (EMNLP) is one of the top natural language processing conferences in the world. In 2017, it is to be held in Copenhagen, Denmark. There were 836 long paper submissions, of which 216 were accepted, and 582 short paper submissions, of which 107 were accepted.
To help the AI community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights and summaries to quickly get the main idea of each paper.
We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to get new paper updates customized to your own interests on a daily basis.
Paper Digest Team
team@paperdigest.org
TABLE 1: EMNLP 2017 Papers
No. | Title | Authors | Highlight |
---|---|---|---|
1 | Monolingual Phrase Alignment on Parse Forests | Yuki Arase, Junichi Tsujii | We propose an efficient method to conduct phrase alignment on parse forests for paraphrase detection. |
2 | Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set | Tianze Shi, Liang Huang, Lillian Lee | Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set |
3 | Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs | Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan | We propose a new Maximum Subgraph algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2 graphs. |
4 | Position-aware Attention and Supervised Data Improve Slot Filling | Yuhao Zhang, Victor Zhong, Danqi Chen, Gabor Angeli, Christopher D. Manning | This paper simultaneously addresses two issues that have held back prior work. |
5 | Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach | Liyuan Liu, Xiang Ren, Qi Zhu, Shi Zhi, Huan Gui, Heng Ji, Jiawei Han | To overcome this drawback, we propose a novel framework, REHession, to conduct relation extractor learning using annotations from heterogeneous information source, e.g., knowledge base and domain heuristics. |
6 | Integrating Order Information and Event Relation for Script Event Prediction | Zhongqing Wang, Yue Zhang, Ching-Yun Chang | We propose a neural model that leverages the advantages of both methods, by using LSTM hidden states as features for event pair modelling. |
7 | Entity Linking for Queries by Searching Wikipedia Sentences | Chuanqi Tan, Furu Wei, Pengjie Ren, Weifeng Lv, Ming Zhou | We present a simple yet effective approach for linking entities in queries. |
8 | Train-O-Matic: Large-Scale Supervised Word Sense Disambiguation in Multiple Languages without Manual Training Data | Tommaso Pasini, Roberto Navigli | We present Train-O-Matic, a language-independent method for generating millions of sense-annotated training instances for virtually all meanings of words in a language’s vocabulary. |
9 | Universal Semantic Parsing | Siva Reddy, Oscar Täckström, Slav Petrov, Mark Steedman, Mirella Lapata | In this work, we introduce UDepLambda, a semantic interface for UD, which maps natural language to logical forms in an almost language-independent fashion and can process dependency graphs. |
10 | Mimicking Word Embeddings using Subword RNNs | Yuval Pinter, Robert Guthrie, Jacob Eisenstein | In this paper, we present MIMICK, an approach to generating OOV word embeddings compositionally, by learning a function from spellings to distributional embeddings. |
11 | Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages | Ehsaneddin Asgari, Hinrich Schütze | We present SuperPivot, an analysis method for low-resource languages that occur in a superparallel corpus, i.e., in a corpus that contains an order of magnitude more languages than parallel corpora currently in use. |
12 | Neural Machine Translation with Source-Side Latent Graph Parsing | Kazuma Hashimoto, Yoshimasa Tsuruoka | This paper presents a novel neural machine translation model which jointly learns translation and source-side latent graph representations of sentences. |
13 | Neural Machine Translation with Word Predictions | Rongxiang Weng, Shujian Huang, Zaixiang Zheng, Xinyu Dai, Jiajun Chen | In this paper, we propose to use word predictions as a mechanism for direct supervision. |
14 | Towards Decoding as Continuous Optimisation in Neural Machine Translation | Cong Duy Vu Hoang, Gholamreza Haffari, Trevor Cohn | We propose a novel decoding approach for neural machine translation (NMT) based on continuous optimisation. |
15 | Where is Misty? Interpreting Spatial Descriptors by Modeling Regions in Space | Nikita Kitaev, Dan Klein | We present a model for locating regions in space based on natural language descriptions. To evaluate our model, we construct and release a new dataset consisting of Minecraft scenes with crowdsourced natural language descriptions. |
16 | Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks | Afshin Rahimi, Timothy Baldwin, Trevor Cohn | We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. |
17 | Obj2Text: Generating Visually Descriptive Language from Object Layouts | Xuwang Yin, Vicente Ordonez | We explore in this paper OBJ2TEXT, a sequence-to-sequence model that encodes a set of objects and their locations as an input sequence using an LSTM network, and decodes this representation using an LSTM language model. |
18 | End-to-end Neural Coreference Resolution | Kenton Lee, Luheng He, Mike Lewis, Luke Zettlemoyer | We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector. |
19 | Neural Net Models of Open-domain Discourse Coherence | Jiwei Li, Dan Jurafsky | In this paper, we describe domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences. |
20 | Affinity-Preserving Random Walk for Multi-Document Summarization | Kexiang Wang, Tianyu Liu, Zhifang Sui, Baobao Chang | This paper introduces affinity-preserving random walk to the summarization task, which preserves the affinity relations of sentences by an absorbing random walk model. |
21 | A Mention-Ranking Model for Abstract Anaphora Resolution | Ana Marasović, Leo Born, Juri Opitz, Anette Frank | We propose a mention-ranking model that learns how abstract anaphors relate to their antecedents with an LSTM-Siamese Net. |
22 | Hierarchical Embeddings for Hypernymy Detection and Directionality | Kim Anh Nguyen, Maximilian Köper, Sabine Schulte im Walde, Ngoc Thang Vu | We present a novel neural model HyperVec to learn hierarchical embeddings for hypernymy detection and directionality. |
23 | Ngram2vec: Learning Improved Word Representations from Ngram Co-occurrence Statistics | Zhe Zhao, Tao Liu, Shen Li, Bofang Li, Xiaoyong Du | In this paper, we introduce ngrams into four representation methods: SGNS, GloVe, PPMI matrix, and its SVD factorization. |
24 | Dict2vec : Learning Word Embeddings using Lexical Dictionaries | Julien Tissier, Christophe Gravier, Amaury Habrard | In this paper, we propose a new approach, Dict2vec, based on one of the largest yet refined datasource for describing words – natural language dictionaries. |
25 | Learning Chinese Word Representations From Glyphs Of Characters | Tzu-Ray Su, Hung-Yi Lee | In this paper, we propose new methods to learn Chinese word representations. Another contribution in this paper is that we created several evaluation datasets in traditional Chinese and made them public. |
26 | Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext | John Wieting, Jonathan Mallinson, Kevin Gimpel | We consider the problem of learning general-purpose, paraphrastic sentence embeddings in the setting of Wieting et al. (2016b). |
27 | Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components | Jinxing Yu, Xun Jian, Hao Xin, Yangqiu Song | In this work, we propose an approach to jointly embed Chinese words as well as their characters and fine-grained subcharacter components. |
28 | Exploiting Morphological Regularities in Distributional Word Representations | Arihant Gupta, Syed Sarfaraz Akhtar, Avijit Vajpayee, Arjit Srivastava, Madan Gopal Jhanwar, Manish Shrivastava | We present an unsupervised, language agnostic approach for exploiting morphological regularities present in high dimensional vector spaces. |
29 | Exploiting Word Internal Structures for Generic Chinese Sentence Representation | Shaonan Wang, Jiajun Zhang, Chengqing Zong | We introduce a novel mixed character-word architecture to improve Chinese sentence representations, by utilizing rich semantic information of word internal structures. |
30 | High-risk learning: acquiring new word vectors from tiny data | Aurélie Herbelot, Marco Baroni | In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space. |
31 | Word Embeddings based on Fixed-Size Ordinally Forgetting Encoding | Joseph Sanu, Mingbin Xu, Hui Jiang, Quan Liu | In this paper, we propose to learn word embeddings based on the recent fixed-size ordinally forgetting encoding (FOFE) method, which can almost uniquely encode any variable-length sequence into a fixed-size representation. |
32 | VecShare: A Framework for Sharing Word Representation Vectors | Jared Fernandez, Zhaocheng Yu, Doug Downey | We present a framework, called VecShare, that makes it easy to share and retrieve word embeddings on the Web. |
33 | Word Re-Embedding via Manifold Dimensionality Retention | Souleiman Hasan, Edward Curry | In this paper, we re-embed pre-trained word embeddings with a stage of manifold learning which retains dimensionality. |
34 | MUSE: Modularizing Unsupervised Sense Embeddings | Guang-He Lee, Yun-Nung Chen | We leverage reinforcement learning to enable joint training on the proposed modules, and introduce various exploration techniques on sense selection for better robustness. |
35 | Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging | Nils Reimers, Iryna Gurevych | In this paper we show that reporting a single performance score is insufficient to compare non-deterministic approaches. |
36 | Learning What’s Easy: Fully Differentiable Neural Easy-First Taggers | André F. T. Martins, Julia Kreutzer | We introduce a novel neural easy-first decoder that learns to solve sequence tagging tasks in a flexible order. |
37 | Incremental Skip-gram Model with Negative Sampling | Nobuhiro Kaji, Hayato Kobayashi | To address this problem, we present a simple incremental extension of SGNS and provide a thorough theoretical analysis to demonstrate its validity. |
38 | Learning to select data for transfer learning with Bayesian Optimization | Sebastian Ruder, Barbara Plank | Inspired by work on curriculum learning, we propose to learn data selection measures using Bayesian Optimization and evaluate them across models, domains and tasks. |
39 | Unsupervised Pretraining for Sequence to Sequence Learning | Prajit Ramachandran, Peter Liu, Quoc Le | This work presents a general unsupervised learning method to improve the accuracy of sequence to sequence (seq2seq) models. |
40 | Efficient Attention using a Fixed-Size Memory Representation | Denny Britz, Melody Guan, Minh-Thang Luong | In this work, we propose an alternative attention mechanism based on a fixed size memory representation that is more efficient. |
41 | Rotated Word Vector Representations and their Interpretability | Sungjoon Park, JinYeong Bak, Alice Oh | We apply several rotation algorithms to the vector representation of words to improve the interpretability. |
42 | A causal framework for explaining the predictions of black-box sequence-to-sequence models | David Alvarez-Melis, Tommi Jaakkola | We focus the general approach on sequence-to-sequence problems, adopting a variational autoencoder to yield meaningful input perturbations. |
43 | Piecewise Latent Variables for Neural Variational Text Processing | Iulian Vlad Serban, Alexander G. Ororbia, Joelle Pineau, Aaron Courville | To overcome this restriction, we propose the simple, but highly flexible, piecewise constant distribution. |
44 | Learning the Structure of Variable-Order CRFs: a finite-state perspective | Thomas Lavergne, François Yvon | Using an effective finite-state representation of variable-length dependencies, we propose new ways to perform feature selection at large scale and report experimental results where we outperform strong baselines on a tagging task. |
45 | Sparse Communication for Distributed Gradient Descent | Alham Fikri Aji, Kenneth Heafield | We make distributed stochastic gradient descent faster by exchanging sparse updates instead of dense updates. |
46 | Why ADAGRAD Fails for Online Topic Modeling | You Lu, Jeffrey Lund, Jordan Boyd-Graber | We show that this is because ADAGRAD uses accumulation of previous gradients as the learning rates’ denominators. |
47 | Recurrent Attention Network on Memory for Aspect Sentiment Analysis | Peng Chen, Zhongqian Sun, Lidong Bing, Wei Yang | We propose a novel framework based on neural networks to identify the sentiment of opinion targets in a comment/review. |
48 | A Cognition Based Attention Model for Sentiment Analysis | Yunfei Long, Qin Lu, Rong Xiang, Minglei Li, Chu-Ren Huang | In this work, we propose a novel attention model trained by cognition grounded eye-tracking data. |
49 | Author-aware Aspect Topic Sentiment Model to Retrieve Supporting Opinions from Reviews | Lahari Poddar, Wynne Hsu, Mong Li Lee | We study the problem of searching for supporting opinions in the context of reviews. |
50 | Magnets for Sarcasm: Making Sarcasm Detection Timely, Contextual and Very Personal | Aniruddha Ghosh, Tony Veale | Using a neural architecture, we show significant gains in detection accuracy when knowledge of the speaker’s mood at the time of production can be inferred. |
51 | Identifying Humor in Reviews using Background Text Sources | Alex Morales, Chengxiang Zhai | We propose a generative language model, based on the theory of incongruity, to model humorous text, which allows us to leverage background text sources, such as Wikipedia entry descriptions, and enables construction of multiple features for identifying humorous reviews. |
52 | Sentiment Lexicon Construction with Representation Learning Based on Hierarchical Sentiment Supervision | Leyi Wang, Rui Xia | In this paper, we develop a neural architecture to train a sentiment-aware word embedding by integrating the sentiment supervision at both document and word levels, to enhance the quality of word embedding as well as the sentiment lexicon. |
53 | Towards a Universal Sentiment Classifier in Multiple languages | Kui Xu, Xiaojun Wan | In this paper we aim to build a universal sentiment classifier with a single classification model in multiple different languages. |
54 | Capturing User and Product Information for Document Level Sentiment Analysis with Deep Memory Network | Zi-Yi Dou | To address the issue, we propose a deep memory network for document-level sentiment classification which could capture the user and product information at the same time. |
55 | Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters | Min Yang, Jincheng Mei, Heng Ji, Wei Zhao, Zhou Zhao, Xiaojun Chen | We propose a location-based dynamic sentiment-topic model (LDST) which can jointly model topic, sentiment, time and geolocation information. We will release the data and source code after this work is published. |
56 | Refining Word Embeddings for Sentiment Analysis | Liang-Chih Yu, Jin Wang, K. Robert Lai, Xuejie Zhang | Therefore, this study proposes a word vector refinement model that can be applied to any pre-trained word vectors (e.g., Word2vec and GloVe). |
57 | A Multilayer Perceptron based Ensemble Technique for Fine-grained Financial Sentiment Analysis | Md Shad Akhtar, Abhishek Kumar, Deepanway Ghosal, Asif Ekbal, Pushpak Bhattacharyya | In this paper, we propose a novel method for combining deep learning and classical feature based models using a Multi-Layer Perceptron (MLP) network for financial sentiment analysis. |
58 | Sentiment Intensity Ranking among Adjectives Using Sentiment Bearing Word Embeddings | Raksha Sharma, Arpan Somani, Lakshya Kumar, Pushpak Bhattacharyya | In this paper, we propose a semi-supervised technique that uses sentiment bearing word embeddings to produce a continuous ranking among adjectives that share common semantics. |
59 | Sentiment Lexicon Expansion Based on Neural PU Learning, Double Dictionary Lookup, and Polarity Association | Yasheng Wang, Yang Zhang, Bing Liu | In a recent sentiment analysis application, we used a large Chinese sentiment lexicon and found that it missed a large number of sentiment words in social media. This paper first poses the problem as a PU learning problem, which is a new formulation. |
60 | DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning | Wenhan Xiong, Thien Hoang, William Yang Wang | More specifically, we describe a novel reinforcement learning framework for learning multi-hop relational paths: we use a policy-based agent with continuous states based on knowledge graph embeddings, which reasons in a KG vector-space by sampling the most promising relation to extend its path. |
61 | Task-Oriented Query Reformulation with Reinforcement Learning | Rodrigo Nogueira, Kyunghyun Cho | In this work, we introduce a query reformulation system based on a neural network that rewrites a query to maximize the number of relevant documents returned. |
62 | Sentence Simplification with Deep Reinforcement Learning | Xingxing Zhang, Mirella Lapata | We address the simplification problem with an encoder-decoder model coupled with a deep reinforcement learning framework. |
63 | Learning how to Active Learn: A Deep Reinforcement Learning Approach | Meng Fang, Yuan Li, Trevor Cohn | To address these shortcomings, we introduce a novel formulation by reframing the active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes the role of the active learning heuristic. |
64 | Split and Rephrase | Shashi Narayan, Claire Gardent, Shay B. Cohen, Anastasia Shimorina | We propose a new sentence simplification task (Split-and-Rephrase) where the aim is to split a complex sentence into a meaning preserving sequence of shorter sentences. |
65 | Neural Response Generation via GAN with an Approximate Embedding Layer | Zhen Xu, Bingquan Liu, Baoxun Wang, Chengjie Sun, Xiaolong Wang, Zhuoran Wang, Chao Qi | This paper presents a Generative Adversarial Network (GAN) to model single-turn short-text conversations, which trains a sequence-to-sequence (Seq2Seq) network for response generation simultaneously with a discriminative classifier that measures the differences between human-produced responses and machine-generated ones. |
66 | A Hybrid Convolutional Variational Autoencoder for Text Generation | Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth | In this paper we explore the effect of architectural choices on learning a variational autoencoder (VAE) for text generation. |
67 | Filling the Blanks (hint: plural noun) for Mad Libs Humor | Nabil Hossain, John Krumm, Lucy Vanderwende, Eric Horvitz, Henry Kautz | We develop an algorithm called Libitum that helps humans generate humor in a Mad Lib, which is a popular fill-in-the-blank game. |
68 | Measuring Thematic Fit with Distributional Feature Overlap | Enrico Santus, Emmanuele Chersoni, Alessandro Lenci, Philippe Blache | In this paper, we introduce a new distributional method for modeling predicate-argument thematic fit judgments. |
69 | SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations | Dheeraj Mekala, Vivek Gupta, Bhargavi Paranjape, Harish Karnick | We present a feature vector formation technique for documents – Sparse Composite Document Vector (SCDV) – which overcomes several shortcomings of the current distributional paragraph vector representations that are widely used for text representation. |
70 | Supervised Learning of Universal Sentence Representations from Natural Language Inference Data | Alexis Conneau, Douwe Kiela, Holger Schwenk, Loïc Barrault, Antoine Bordes | In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks. |
71 | Determining Semantic Textual Similarity using Natural Deduction Proofs | Hitomi Yanaka, Koji Mineshima, Pascual Martínez-Gómez, Daisuke Bekki | We propose a method for determining semantic textual similarity by combining shallow features with features extracted from natural deduction proofs of bidirectional entailment relations between sentence pairs. |
72 | Multi-Grained Chinese Word Segmentation | Chen Gong, Zhenghua Li, Min Zhang, Xinzhou Jiang | This work proposes and addresses multi-grained WS (MWS). We build a large-scale pseudo MWS dataset for model training and tuning by leveraging the annotation heterogeneity of three SWS datasets. |
73 | Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic | Nasser Zalmout, Nizar Habash | This paper presents a model for Arabic morphological disambiguation based on Recurrent Neural Networks (RNN). |
74 | Paradigm Completion for Derivational Morphology | Ryan Cotterell, Ekaterina Vylomova, Huda Khayrallah, Christo Kirov, David Yarowsky | We overview the theoretical motivation for a paradigmatic treatment of derivational morphology, and introduce the task of derivational paradigm completion as a parallel to inflectional paradigm completion. |
75 | A Sub-Character Architecture for Korean Language Processing | Karl Stratos | We introduce a novel sub-character architecture that exploits a unique compositional structure of the Korean language. |
76 | Do LSTMs really work so well for PoS tagging? — A replication study | Tobias Horsmann, Torsten Zesch | A recent study by Plank et al. (2016) found that LSTM-based PoS taggers considerably improve over the current state-of-the-art when evaluated on the corpora of the Universal Dependencies project that use a coarse-grained tagset. |
77 | The Labeled Segmentation of Printed Books | Lara McConnaughey, Jennifer Dai, David Bamman | We introduce the task of book structure labeling: segmenting and assigning a fixed category (such as Table of Contents, Preface, Index) to the document structure of printed books. |
78 | Cross-lingual Character-Level Neural Morphological Tagging | Ryan Cotterell, Georg Heigold | In the work presented here, we explore a transfer learning scheme, whereby we train character-level recurrent neural taggers to predict morphological taggings for high-resource languages and low-resource languages together. |
79 | Word-Context Character Embeddings for Chinese Word Segmentation | Hao Zhou, Zhenting Yu, Yue Zhang, Shujian Huang, Xinyu Dai, Jiajun Chen | We investigate training character embeddings on a word-based context in a similar way, showing that the simple method improves state-of-the-art neural word segmentation models significantly, beating tri-training baselines for leveraging auto-segmented data. |
80 | Segmentation-Free Word Embedding for Unsegmented Languages | Takamasa Oshikiri | In this paper, we propose a new pipeline of word embedding for unsegmented languages, called segmentation-free word embedding, which does not require word segmentation as a preprocessing step. |
81 | From Textbooks to Knowledge: A Case Study in Harvesting Axiomatic Knowledge from Textbooks to Solve Geometry Problems | Mrinmaya Sachan, Kumar Dubey, Eric Xing | As a case study, we present an approach for harvesting structured axiomatic knowledge from math textbooks. |
82 | RACE: Large-scale ReAding Comprehension Dataset From Examinations | Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, Eduard Hovy | We present RACE, a new dataset for benchmark evaluation of methods in the reading comprehension task. |
83 | Beyond Sentential Semantic Parsing: Tackling the Math SAT with a Cascade of Tree Transducers | Mark Hopkins, Cristian Petrescu-Prahova, Roie Levin, Ronan Le Bras, Alvaro Herrasti, Vidur Joshi | We present an approach for answering questions that span multiple sentences and exhibit sophisticated cross-sentence anaphoric phenomena, evaluating on a rich source of such questions – the math portion of the Scholastic Aptitude Test (SAT). |
84 | Learning Fine-Grained Expressions to Solve Math Word Problems | Danqing Huang, Shuming Shi, Chin-Yew Lin, Jian Yin | This paper presents a novel template-based method to solve math word problems. |
85 | Structural Embedding of Syntactic Trees for Machine Comprehension | Rui Liu, Junjie Hu, Wei Wei, Zi Yang, Eric Nyberg | In this paper, we propose structural embedding of syntactic trees (SEST), an algorithm framework to utilize structured information and encode them into vector representations that can boost the performance of algorithms for the machine comprehension. |
86 | World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions | Teng Long, Emmanuel Bengio, Ryan Lowe, Jackie Chi Kit Cheung, Doina Precup | In this paper, we introduce a task and several models to drive progress towards this goal. |
87 | Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension | David Golub, Po-Sen Huang, Xiaodong He, Li Deng | We develop a technique for transfer learning in machine comprehension (MC) using a novel two-stage synthesis network. |
88 | Deep Neural Solver for Math Word Problems | Yan Wang, Xiaojiang Liu, Shuming Shi | This paper presents a deep neural solver to automatically solve math word problems. |
89 | Latent Space Embedding for Retrieval in Question-Answer Archives | Deepak P, Dinesh Garg, Shirish Shevade | In this paper, we devise a CQA retrieval technique, LASER-QA, that embeds question-answer pairs within a unified latent space preserving the local neighborhood structure of question and answer spaces. |
90 | Question Generation for Question Answering | Nan Duan, Duyu Tang, Peng Chen, Ming Zhou | This paper presents how to generate questions from given passages using neural networks, where large scale QA pairs are automatically crawled and processed from Community-QA website, and used as training data. |
91 | Learning to Paraphrase for Question Answering | Li Dong, Jonathan Mallinson, Siva Reddy, Mirella Lapata | In this paper we turn to paraphrases as a means of capturing this knowledge and present a general framework which learns felicitous paraphrases for various QA tasks. |
92 | Temporal Information Extraction for Question Answering Using Syntactic Dependencies in an LSTM-based Architecture | Yuanliang Meng, Anna Rumshisky, Alexey Romanov | In this paper, we propose to use a set of simple, uniform in architecture LSTM-based models to recover different kinds of temporal relations from text. |
93 | Ranking Kernels for Structures and Embeddings: A Hybrid Preference and Classification Model | Kateryna Tymoshenko, Daniele Bonadiman, Alessandro Moschitti | In this work, we propose a new hybrid approach combining preference ranking applied to TKs and pointwise ranking applied to CNNs. |
94 | Recovering Question Answering Errors via Query Revision | Semih Yavuz, Izzeddin Gur, Yu Su, Xifeng Yan | In this work, we propose to crosscheck the corresponding KB relations behind the predicted answers and identify potential inconsistencies. |
95 | An empirical study on the effectiveness of images in Multimodal Neural Machine Translation | Jean-Benoit Delbrouck, Stéphane Dupont | In this paper, we compare several attention mechanisms on the multi-modal translation task (English, image → German) and evaluate the ability of the model to make use of images to improve translation. |
96 | Sound-Word2Vec: Learning Word Representations Grounded in Sounds | Ashwin Vijayakumar, Ramakrishna Vedantam, Devi Parikh | In this work, we treat sound as a first-class citizen, studying downstream textual tasks which require aural grounding. |
97 | The Promise of Premise: Harnessing Question Premises in Visual Question Answering | Aroma Mahendru, Viraj Prabhu, Akrit Mohapatra, Dhruv Batra, Stefan Lee | In this paper, we make a simple observation that questions about images often contain premises – objects and relationships implied by the question – and that reasoning about premises can help Visual Question Answering (VQA) models respond more intelligently to irrelevant or previously unseen questions. |
98 | Guided Open Vocabulary Image Captioning with Constrained Beam Search | Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould | We address this problem using a flexible approach that enables existing deep captioning architectures to take advantage of image taggers at test time, without re-training. |
99 | Zero-Shot Activity Recognition with Verb Attribute Induction | Rowan Zellers, Yejin Choi | In this paper, we investigate large-scale zero-shot activity recognition by modeling the visual and linguistic attributes of action verbs. |
100 | Deriving continous grounded meaning representations from referentially structured multimodal contexts | Sina Zarrieß, David Schlangen | We propose a new task for evaluating grounded meaning representations – detection of potentially co-referential phrases – and show that it requires precise denotational representations of attribute meanings, which our method provides. |
101 | Hierarchically-Attentive RNN for Album Summarization and Storytelling | Licheng Yu, Mohit Bansal, Tamara Berg | We address the problem of end-to-end visual storytelling. |
102 | Video Highlight Prediction Using Audience Chat Reactions | Cheng-Yang Fu, Joon Lee, Mohit Bansal, Alexander Berg | We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures. |
103 | Reinforced Video Captioning with Entailment Rewards | Ramakanth Pasunuru, Mohit Bansal | Sequence-to-sequence models have shown promising improvements on the temporal task of video captioning, but they optimize word-level cross-entropy loss during training. |
104 | Evaluating Hierarchies of Verb Argument Structure with Hierarchical Clustering | Jesse Mu, Joshua K. Hartshorne, Timothy O’Donnell | We discuss limitations of a simple hierarchical representation and suggest similar approaches for identifying the representations underpinning verb argument structure. |
105 | Incorporating Global Visual Features into Attention-based Neural Machine Translation. | Iacer Calixto, Qun Liu | We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder. |
106 | Mapping Instructions and Visual Observations to Actions with Reinforcement Learning | Dipendra Misra, John Langford, Yoav Artzi | We propose to directly map raw visual observations and text input to actions for instruction execution. |
107 | An analysis of eye-movements during reading for the detection of mild cognitive impairment | Kathleen C. Fraser, Kristina Lundholm Fors, Dimitrios Kokkinakis, Arto Nordlund | We present a machine learning analysis of eye-tracking data for the detection of mild cognitive impairment, a decline in cognitive abilities that is associated with an increased risk of developing dementia. |
108 | A Structured Learning Approach to Temporal Relation Extraction | Qiang Ning, Zhili Feng, Dan Roth | This paper suggests that it is important to take these dependencies into account while learning to identify these relations and proposes a structured learning approach to address this challenge. |
109 | Importance sampling for unbiased on-demand evaluation of knowledge base population | Arun Chaganty, Ashwin Paranjape, Percy Liang, Christopher D. Manning | Our first contribution is a new importance-sampling based evaluation which corrects for this bias by annotating a new system’s predictions on-demand via crowdsourcing. |
110 | PACRR: A Position-Aware Neural IR Model for Relevance Matching | Kai Hui, Andrew Yates, Klaus Berberich, Gerard de Melo | In this work, we propose a novel neural IR model named PACRR aiming at better modeling position-dependent interactions between a query and a document. |
111 | Globally Normalized Reader | Jonathan Raiman, John Miller | This method improves the performance of all models considered in this work and is of independent interest for a variety of NLP tasks. |
112 | Speech segmentation with a neural encoder model of working memory | Micha Elsner, Cory Shain | We present the first unsupervised LSTM speech segmenter as a cognitive model of the acquisition of words from unsegmented input. |
113 | Speaking, Seeing, Understanding: Correlating semantic models with conceptual representation in the brain | Luana Bulat, Stephen Clark, Ekaterina Shutova | In this paper, we present a systematic evaluation and comparison of a range of widely-used, state-of-the-art semantic models in their ability to predict patterns of conceptual representation in the human brain. |
114 | Multi-modal Summarization for Asynchronous Collection of Text, Image, Audio and Video | Haoran Li, Junnan Zhu, Cong Ma, Jiajun Zhang, Chengqing Zong | In this work, we propose an extractive Multi-modal Summarization (MMS) method which can automatically generate a textual summary given a set of documents, images, audios and videos related to a specific topic. We further introduce an MMS corpus in English and Chinese. |
115 | Tensor Fusion Network for Multimodal Sentiment Analysis | Amir Zadeh, Minghai Chen, Soujanya Poria, Erik Cambria, Louis-Philippe Morency | In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. |
116 | ConStance: Modeling Annotation Contexts to Improve Stance Classification | Kenneth Joseph, Lisa Friedland, William Hobbs, David Lazer, Oren Tsur | To characterize and reduce these biases, we develop ConStance, a general model for reasoning about annotations across information conditions. |
117 | Deeper Attention to Abusive User Content Moderation | John Pavlopoulos, Prodromos Malakasiotis, Ion Androutsopoulos | Experimenting with a new dataset of 1.6M user comments from a news portal and an existing dataset of 115K Wikipedia talk page comments, we show that an RNN operating on word embeddings outperforms the previous state of the art in moderation, which used logistic regression or an MLP classifier with character or word n-grams. |
118 | Outta Control: Laws of Semantic Change and Inherent Biases in Word Representation Models | Haim Dubossarsky, Daphna Weinshall, Eitan Grossman | This article evaluates three proposed laws of semantic change. |
119 | Human Centered NLP with User-Factor Adaptation | Veronica Lynn, Youngseo Son, Vivek Kulkarni, Niranjan Balasubramanian, H. Andrew Schwartz | We introduce a continuous adaptation technique, suited for real-valued user factors that are common in social science and bringing us closer to personalized NLP, adapting to each user uniquely. |
120 | Neural Sequence Learning Models for Word Sense Disambiguation | Alessandro Raganato, Claudio Delli Bovi, Roberto Navigli | To bridge this gap we adopt a different perspective and rely on sequence learning to frame the disambiguation problem: we propose and study in depth a series of end-to-end neural architectures directly tailored to the task, from bidirectional Long Short-Term Memory to encoder-decoder models. |
121 | Learning Word Relatedness over Time | Guy D. Rosin, Eytan Adar, Kira Radinsky | In this work, we introduce a temporal relationship model that is extracted from longitudinal data collections. |
122 | Inter-Weighted Alignment Network for Sentence Pair Modeling | Gehui Shen, Yunlun Yang, Zhi-Hong Deng | In this paper, we propose a model to measure the similarity of a sentence pair focusing on the interaction information. |
123 | A Short Survey on Taxonomy Learning from Text Corpora: Issues, Resources and Recent Advances | Chengyu Wang, Xiaofeng He, Aoying Zhou | In this paper, we overview recent advances on taxonomy construction from free texts, reorganizing relevant subtasks into a complete framework. |
124 | Idiom-Aware Compositional Distributed Semantics | Pengfei Liu, Kaiyu Qian, Xipeng Qiu, Xuanjing Huang | In this paper, we propose an idiom-aware distributed semantic model to build representation of sentences on the basis of understanding their contained idioms. To better evaluate our models, we also construct an idiom-enriched sentiment classification dataset with considerable scale and abundant peculiarities of idioms. |
125 | Macro Grammars and Holistic Triggering for Efficient Semantic Parsing | Yuchen Zhang, Panupong Pasupat, Percy Liang | We propose a new online learning algorithm that searches faster as training progresses. |
126 | A Continuously Growing Dataset of Sentential Paraphrases | Wuwei Lan, Siyu Qiu, Hua He, Wei Xu | In this paper, we present a new method to collect large-scale sentential paraphrases from Twitter by linking tweets through shared URLs. We present the largest human-labeled paraphrase corpus to date of 51,524 sentence pairs and the first cross-domain benchmarking for automatic paraphrase identification. |
127 | Cross-domain Semantic Parsing via Paraphrasing | Yu Su, Xifeng Yan | We discover two problems, small micro variance and large macro variance, of pre-trained word embeddings that hinder their direct use in neural networks, and propose standardization techniques as a remedy. |
128 | A Joint Sequential and Relational Model for Frame-Semantic Parsing | Bishan Yang, Tom Mitchell | We introduce a new method for frame-semantic parsing that significantly improves the prior state of the art. |
129 | Getting the Most out of AMR Parsing | Chuan Wang, Nianwen Xue | This paper proposes to tackle the AMR parsing bottleneck by improving two components of an AMR parser: concept identification and alignment. |
130 | AMR Parsing using Stack-LSTMs | Miguel Ballesteros, Yaser Al-Onaizan | We present a transition-based AMR parser that directly generates AMR parses from plain text. |
131 | An End-to-End Deep Framework for Answer Triggering with a Novel Group-Level Objective | Jie Zhao, Yu Su, Ziyu Guan, Huan Sun | In contrast to existing pipeline methods which first consider individual candidate answers separately and then make a prediction based on a threshold, we propose an end-to-end deep neural network framework, which is trained by a novel group-level objective function that directly optimizes the answer triggering performance. |
132 | Predicting Word Association Strengths | Andrew Cattle, Xiaojuan Ma | We find Word2Vec (Mikolov et al., 2013) and GloVe (Pennington et al., 2014) cosine similarities, as well as vector offsets, to be the highest performing features. |
133 | Learning Contextually Informed Representations for Linear-Time Discourse Parsing | Yang Liu, Mirella Lapata | In this work, we propose a linear-time parser with a novel way of representing discourse constituents based on neural networks which takes into account global contextual information and is able to capture long-distance dependencies. |
134 | Multi-task Attention-based Neural Networks for Implicit Discourse Relationship Representation and Identification | Man Lan, Jianxiang Wang, Yuanbin Wu, Zheng-Yu Niu, Haifeng Wang | We present a novel multi-task attention based neural network model to address implicit discourse relationship representation and identification through two types of representation learning, an attention based neural network for learning discourse relationship representation with two arguments and a multi-task framework for learning knowledge from annotated and unannotated corpora. |
135 | Chinese Zero Pronoun Resolution with Deep Memory Network | Qingyu Yin, Yu Zhang, Weinan Zhang, Ting Liu | In this paper, we address this issue by building a deep memory network that is capable of encoding zero pronouns into vector representations with information obtained from their contexts and potential antecedents. |
136 | How much progress have we made on RST discourse parsing? A replication study of recent results on the RST-DT | Mathieu Morey, Philippe Muller, Nicholas Asher | This article evaluates purported progress over the past years in RST discourse parsing. |
137 | What is it? Disambiguating the different readings of the pronoun ‘it’ | Sharid Loáiciga, Liane Guillou, Christian Hardmeier | In this paper, we address the problem of predicting one of three functions for the English pronoun ‘it’: anaphoric, event reference or pleonastic. |
138 | Revisiting Selectional Preferences for Coreference Resolution | Benjamin Heinzerling, Nafise Sadat Moosavi, Michael Strube | We propose a dependency-based embedding model of selectional preferences which allows fine-grained compatibility judgments with high coverage. |
139 | Learning to Rank Semantic Coherence for Topic Segmentation | Liang Wang, Sujian Li, Yajuan Lv, Houfeng Wang | In this paper, we present an intuitive and simple idea to automatically create a “quasi” training dataset, which includes a large amount of text pairs from the same or different documents with different semantic coherence. |
140 | GRASP: Rich Patterns for Argumentation Mining | Eyal Shnarch, Ran Levy, Vikas Raykar, Noam Slonim | We report highly promising experimental results in several challenging text analysis tasks within the field of Argumentation Mining. |
141 | Patterns of Argumentation Strategies across Topics | Khalid Al-Khatib, Henning Wachsmuth, Matthias Hagen, Benno Stein | This paper presents an analysis of argumentation strategies in news editorials within and across topics. |
142 | Using Argument-based Features to Predict and Analyse Review Helpfulness | Haijing Liu, Yang Gao, Pin Lv, Mengxue Li, Shiqiang Geng, Minglan Li, Hao Wang | We study the helpful product reviews identification problem in this paper. |
143 | Here’s My Point: Joint Pointer Architecture for Argument Mining | Peter Potash, Alexey Romanov, Anna Rumshisky | This work presents the first neural network-based approach to link extraction in argument mining. |
144 | Identifying attack and support argumentative relations using deep learning | Oana Cocarascu, Francesca Toni | We propose a deep learning architecture to capture argumentative relations of attack and support from one piece of text to another, of the kind that naturally occur in a debate. |
145 | Neural Lattice-to-Sequence Models for Uncertain Inputs | Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel | In this work, we extend the TreeLSTM (Tai et al., 2015) into a LatticeLSTM that is able to consume word lattices, and can be used as encoder in an attentional encoder-decoder model. |
146 | Memory-augmented Neural Machine Translation | Yang Feng, Shiyue Zhang, Andi Zhang, Dong Wang, Andrew Abel | This paper presents a novel memory-augmented NMT (M-NMT) architecture, which stores knowledge about how words (usually infrequently encountered ones) should be translated in a memory and then utilizes them to assist the neural model. |
147 | Dynamic Data Selection for Neural Machine Translation | Marlies van der Wees, Arianna Bisazza, Christof Monz | With the recent increase in popularity of neural machine translation (NMT), we explore in this paper to what extent and how NMT can also benefit from data selection. |
148 | Neural Machine Translation Leveraging Phrase-based Models in a Hybrid Search | Leonard Dahlmann, Evgeny Matusov, Pavel Petrushkov, Shahram Khadivi | In this paper, we introduce a hybrid search for attention-based neural machine translation (NMT). |
149 | Translating Phrases in Neural Machine Translation | Xing Wang, Zhaopeng Tu, Deyi Xiong, Min Zhang | In this work, we propose a method to translate phrases in NMT by integrating a phrase memory storing target phrases from a phrase-based statistical machine translation (SMT) system into the encoder-decoder architecture of NMT. |
150 | Towards Bidirectional Hierarchical Representations for Attention-based Neural Machine Translation | Baosong Yang, Derek F. Wong, Tong Xiao, Lidia S. Chao, Jingbo Zhu | This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder. |
151 | Massive Exploration of Neural Machine Translation Architectures | Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le | In this work, we present a large-scale analysis of the sensitivity of NMT architectures to common hyperparameters. |
152 | Learning Translations via Matrix Completion | Derry Tanti Wijaya, Brendan Callahan, John Hewitt, Jie Gao, Xiao Ling, Marianna Apidianaki, Chris Callison-Burch | We model this task as a matrix completion problem, and present an effective and extendable framework for completing the matrix. |
153 | Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback | Khanh Nguyen, Hal Daumé III, Jordan Boyd-Graber | We describe a reinforcement learning algorithm that improves neural machine translation systems from simulated human feedback. |
154 | Towards Compact and Fast Neural Machine Translation Using a Combined Method | Xiaowei Zhang, Wei Chen, Feng Wang, Shuang Xu, Bo Xu | This paper presents a four-stage pipeline to compress the model and speed up decoding for NMT. |
155 | Instance Weighting for Neural Machine Translation Domain Adaptation | Rui Wang, Masao Utiyama, Lemao Liu, Kehai Chen, Eiichiro Sumita | In this paper, two instance weighting technologies, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation. |
156 | Regularization techniques for fine-tuning in neural machine translation | Antonio Valerio Miceli Barone, Barry Haddow, Ulrich Germann, Rico Sennrich | We investigate techniques for supervised domain adaptation for neural machine translation where an existing model trained on a large out-of-domain dataset is adapted to a small in-domain dataset. |
157 | Source-Side Left-to-Right or Target-Side Left-to-Right? An Empirical Comparison of Two Phrase-Based Decoding Algorithms | Yin-Wen Chang, Michael Collins | This paper describes an empirical study of the phrase-based decoding algorithm proposed by Chang and Collins (2017). |
158 | Using Target-side Monolingual Data for Neural Machine Translation through Multi-task Learning | Tobias Domhan, Felix Hieber | We propose to modify the decoder in a neural sequence-to-sequence model to enable multi-task learning for two strongly related tasks: target-side language modeling and translation. |
159 | Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling | Diego Marcheggiani, Ivan Titov | We propose a version of graph convolutional networks (GCNs), a recent class of neural networks operating on graphs, suited to model syntactic dependency graphs. |
160 | Neural Semantic Parsing with Type Constraints for Semi-Structured Tables | Jayant Krishnamurthy, Pradeep Dasigi, Matt Gardner | We present a new semantic parsing model for answering compositional questions on semi-structured Wikipedia tables. |
161 | Joint Concept Learning and Semantic Parsing from Natural Language Explanations | Shashank Srivastava, Igor Labutov, Tom Mitchell | We present a joint model for (1) language interpretation (semantic parsing) and (2) concept learning (classification) that does not require labeling statements with logical forms. |
162 | Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection | Marek Rei, Luana Bulat, Douwe Kiela, Ekaterina Shutova | In this paper, we present the first deep learning architecture designed to capture metaphorical composition. |
163 | Identifying civilians killed by police with distantly supervised entity-event extraction | Katherine Keith, Abram Handler, Michael Pinkham, Cara Magliozzi, Joshua McDuffie, Brendan O’Connor | We present a newly collected police fatality corpus, which we release publicly, and present a model to solve this problem that uses EM-based distant supervision with logistic regression and convolutional neural network classifiers. |
164 | Asking too much? The rhetorical role of questions in political discourse | Justine Zhang, Arthur Spirling, Cristian Danescu-Niculescu-Mizil | In this work we introduce an unsupervised methodology for extracting surface motifs that recur in questions, and for grouping them according to their latent rhetorical role. |
165 | Detecting Perspectives in Political Debates | David Vilares, Yulan He | We propose a Bayesian modelling approach where topics (or propositions) and their associated perspectives (or viewpoints) are modeled as latent variables. |
166 | “i have a feeling trump will win………………”: Forecasting Winners and Losers from User Predictions on Twitter | Sandesh Swamy, Alan Ritter, Marie-Catherine de Marneffe | To answer this question, we build a corpus of tweets annotated for veridicality on which we train a log-linear classifier that detects positive veridicality with high precision. |
167 | A Question Answering Approach for Emotion Cause Extraction | Lin Gui, Jiannan Hu, Yulan He, Ruifeng Xu, Qin Lu, Jiachen Du | Inspired by recent advances in using deep memory networks for question answering (QA), we propose a new approach which considers emotion cause identification as a reading comprehension task in QA. |
168 | Story Comprehension for Predicting What Happens Next | Snigdha Chaturvedi, Haoruo Peng, Dan Roth | In this paper, we present a story comprehension model that explores three distinct semantic aspects: (i) the sequence of events described in the story, (ii) its emotional trajectory, and (iii) its plot consistency. |
169 | Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm | Bjarke Felbo, Alan Mislove, Anders Søgaard, Iyad Rahwan, Sune Lehmann | Our paper shows that by extending the distant supervision to a more diverse set of noisy labels, the models can learn richer representations. |
170 | Opinion Recommendation Using A Neural Model | Zhongqing Wang, Yue Zhang | We present opinion recommendation, a novel task of jointly generating a review with a rating score that a certain user would give to a certain product which is unreviewed by the user, given existing reviews to the product by other users, and the reviews that the user has given to other products. |
171 | CRF Autoencoder for Unsupervised Dependency Parsing | Jiong Cai, Yong Jiang, Kewei Tu | In this paper, we develop an unsupervised dependency parsing model based on the CRF autoencoder. |
172 | Efficient Discontinuous Phrase-Structure Parsing via the Generalized Maximum Spanning Arborescence | Caio Corro, Joseph Le Roux, Mathieu Lacroix | We present a new method for the joint task of tagging and non-projective dependency parsing. |
173 | Incremental Graph-based Neural Dependency Parsing | Xiaoqing Zheng | Very recently, some studies on neural dependency parsers have shown advantage over the traditional ones on a wide variety of languages. |
174 | Neural Discontinuous Constituency Parsing | Miloš Stanojević, Raquel G. Alhama | In this paper, we propose a solution to this problem by replacing the structured perceptron model with a recursive neural model that computes a global representation of the configuration, therefore allowing even the most remote parts of the configuration to influence the parsing decisions. |
175 | Stack-based Multi-layer Attention for Transition-based Dependency Parsing | Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen | In this paper, we propose a stack-based multi-layer attention model for seq2seq learning to better leverage structural linguistics information. |
176 | Dependency Grammar Induction with Neural Lexicalization and Big Training Data | Wenjuan Han, Yong Jiang, Kewei Tu | We study the impact of big models (in terms of the degree of lexicalization) and big data (in terms of the training corpus size) on dependency grammar induction. |
177 | Combining Generative and Discriminative Approaches to Unsupervised Dependency Parsing via Dual Decomposition | Yong Jiang, Wenjuan Han, Kewei Tu | In this paper, we propose a new learning strategy that learns a generative model and a discriminative model jointly based on the dual decomposition method. |
178 | Effective Inference for Generative Neural Parsing | Mitchell Stern, Daniel Fried, Dan Klein | We describe an alternative to the conventional action-level beam search used for discriminative neural models that enables us to decode directly in these generative models. |
179 | Semi-supervised Structured Prediction with Neural CRF Autoencoder | Xiao Zhang, Yong Jiang, Hao Peng, Kewei Tu, Dan Goldwasser | In this paper we propose an end-to-end neural CRF autoencoder (NCRF-AE) model for semi-supervised learning of sequential structured prediction problems. |
180 | TAG Parsing with Neural Networks and Vector Representations of Supertags | Jungo Kasai, Bob Frank, Tom McCoy, Owen Rambow, Alexis Nasr | We present supertagging-based models for Tree Adjoining Grammar parsing that use neural network architectures and dense vector representation of supertags (elementary trees) to achieve state-of-the-art performance in unlabeled and labeled attachment scores. |
181 | Global Normalization of Convolutional Neural Networks for Joint Entity and Relation Classification | Heike Adel, Hinrich Schütze | We introduce globally normalized convolutional neural networks for joint entity classification and relation extraction. |
182 | End-to-End Neural Relation Extraction with Global Optimization | Meishan Zhang, Yue Zhang, Guohong Fu | We build a globally optimized neural model for end-to-end relation extraction, proposing novel LSTM features in order to better learn context representations. |
183 | KGEval: Accuracy Estimation of Automatically Constructed Knowledge Graphs | Prakhar Ojha, Partha Talukdar | This important problem has largely been ignored in prior research – we fill this gap and propose KGEval. |
184 | Sparsity and Noise: Where Knowledge Graph Embeddings Fall Short | Jay Pujara, Eriq Augustine, Lise Getoor | In this paper, we consider the problem of applying embedding techniques to KGs extracted from text, which are often incomplete and contain errors. |
185 | Dual Tensor Model for Detecting Asymmetric Lexico-Semantic Relations | Goran Glavaš, Simone Paolo Ponzetto | In this work, we propose the Dual Tensor model, a neural architecture with which we explicitly model the asymmetry and capture the translation between unspecialized and specialized word embeddings via a pair of tensors. |
186 | Incorporating Relation Paths in Neural Relation Extraction | Wenyuan Zeng, Yankai Lin, Zhiyuan Liu, Maosong Sun | To address this issue, we build inference chains between two target entities via intermediate entities, and propose a path-based neural relation extraction model to encode the relational semantics from both direct sentences and inference chains. |
187 | Adversarial Training for Relation Extraction | Yi Wu, David Bamman, Stuart Russell | We apply adversarial training in relation extraction within the multi-instance multi-label learning framework. |
188 | Context-Aware Representations for Knowledge Base Relation Extraction | Daniil Sorokin, Iryna Gurevych | We demonstrate that for sentence-level relation extraction it is beneficial to consider other relations in the sentential context while predicting the target relation. |
189 | A Soft-label Method for Noise-tolerant Distantly Supervised Relation Extraction | Tianyu Liu, Kexiang Wang, Baobao Chang, Zhifang Sui | To this end, we introduce an entity-pair level denoise method which exploits semantic information from correctly labeled entity pairs to correct wrong labels dynamically during training. |
190 | A Sequential Model for Classifying Temporal Relations between Intra-Sentence Events | Prafulla Kumar Choubey, Ruihong Huang | We present a sequential model for temporal relation classification between intra-sentence events. |
191 | Deep Residual Learning for Weakly-Supervised Relation Extraction | Yi Yao Huang, William Yang Wang | In this paper, we design a novel convolutional neural network (CNN) with residual learning, and investigate its impacts on the task of distantly supervised noisy relation extraction. |
192 | Noise-Clustered Distant Supervision for Relation Extraction: A Nonparametric Bayesian Perspective | Qing Zhang, Houfeng Wang | To address the challenge, this work presents a novel nonparametric Bayesian formulation for the task. |
193 | Exploring Vector Spaces for Semantic Relations | Kata Gábor, Haïfa Zargayouna, Isabelle Tellier, Davide Buscaldi, Thierry Charnois | In this paper, we explore the potential of pre-trained word embeddings to identify generic types of semantic relations in an unsupervised experiment. |
194 | Temporal dynamics of semantic relations in word embeddings: an application to predicting armed conflict participants | Andrey Kutuzov, Erik Velldal, Lilja Øvrelid | This paper deals with using word embedding models to trace the temporal dynamics of semantic relations between pairs of words. |
195 | Dynamic Entity Representations in Neural Language Models | Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, Noah A. Smith | We present a new type of language model, EntityNLM, that can explicitly model entities, dynamically update their representations, and contextually generate their mentions. |
196 | Towards Quantum Language Models | Ivano Basile, Fabio Tamburini | This paper presents a new approach for building Language Models using the Quantum Probability Theory, a Quantum Language Model (QLM). |
197 | Reference-Aware Language Models | Zichao Yang, Phil Blunsom, Chris Dyer, Wang Ling | We propose a general class of language models that treat reference as discrete stochastic latent variables. |
198 | A Simple Language Model based on PMI Matrix Approximations | Oren Melamud, Ido Dagan, Jacob Goldberger | Specifically, we show that with minor modifications to word2vec’s algorithm, we get principled language models that are closely related to the well-established Noise Contrastive Estimation (NCE) based language models. |
199 | Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones | Zhenisbek Assylbekov, Rustem Takhanov, Bagdat Myrzakhmetov, Jonathan N. Washington | Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones |
200 | Inducing Semantic Micro-Clusters from Deep Multi-View Representations of Novels | Lea Frermann, György Szarvas | Here, we propose a principled and scalable framework leveraging expert-provided semantic tags (e.g., mystery, pirates) to evaluate plot representations in an extrinsic fashion, assessing their ability to produce locally coherent groupings of novels (micro-clusters) in model space. |
201 | Initializing Convolutional Filters with Semantic Features for Text Classification | Shen Li, Zhe Zhao, Tao Liu, Renfen Hu, Xiaoyong Du | This paper presents a novel weight initialization method to improve the CNNs for text classification. |
202 | Shortest-Path Graph Kernels for Document Similarity | Giannis Nikolentzos, Polykarpos Meladianos, François Rousseau, Yannis Stavrakas, Michalis Vazirgiannis | In this paper, we present a novel document similarity measure based on the definition of a graph kernel between pairs of documents. |
203 | Adapting Topic Models using Lexical Associations with Tree Priors | Weiwei Yang, Jordan Boyd-Graber, Philip Resnik | Models work best when they are optimized taking into account the evaluation criteria that people care about. |
204 | Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data | Natalie Parde, Rodney Nielsen | We present an aggregation approach that learns a regression model from crowdsourced annotations to predict aggregated labels for instances that have no expert adjudications. |
205 | CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles | Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, Anbang Xu | In this paper, we postulate that while producing SRL annotation does require expert involvement in general, a large subset of SRL labeling tasks is in fact appropriate for the crowd. |
206 | A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks | Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher | We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. |
207 | Earth Mover’s Distance Minimization for Unsupervised Bilingual Lexicon Induction | Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun | In this paper, we attempt to establish the cross-lingual connection without relying on any cross-lingual supervision. |
208 | Unfolding and Shrinking Neural Machine Translation Ensembles | Felix Stahlberg, Bill Byrne | This work aims to reduce the runtime to be on par with a single system without compromising the translation quality. |
209 | Graph Convolutional Encoders for Syntax-aware Neural Machine Translation | Joost Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, Khalil Sima’an | We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation. |
210 | Trainable Greedy Decoding for Neural Machine Translation | Jiatao Gu, Kyunghyun Cho, Victor O.K. Li | In this paper, we solely focus on the problem of decoding given a trained neural machine translation model. |
211 | Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features | Fan Yang, Arjun Mukherjee, Eduard Dragut | Despite the embedded genre in the article, not everyone can recognize the satirical cues, and some readers therefore believe the news to be true. |
212 | Fine Grained Citation Span for References in Wikipedia | Besnik Fetahu, Katja Markert, Avishek Anand | We propose a sequence classification approach where for a paragraph and a citation, we determine the citation span at a fine-grained level. |
213 | Identifying Semantic Edit Intentions from Revisions in Wikipedia | Diyi Yang, Aaron Halfaker, Robert Kraut, Eduard Hovy | In this work, we develop in collaboration with Wikipedia editors a 13-category taxonomy of the semantic intention behind edits in Wikipedia articles. |
214 | Accurate Supervised and Semi-Supervised Machine Reading for Long Documents | Daniel Hewlett, Llion Jones, Alexandre Lacoste, Izzeddin Gur | We introduce a hierarchical architecture for machine reading capable of extracting precise information from long documents. |
215 | Adversarial Examples for Evaluating Reading Comprehension Systems | Robin Jia, Percy Liang | To reward systems with real language understanding abilities, we propose an adversarial evaluation scheme for the Stanford Question Answering Dataset (SQuAD). |
216 | Reasoning with Heterogeneous Knowledge for Commonsense Machine Comprehension | Hongyu Lin, Le Sun, Xianpei Han | In this paper, we propose a multi-knowledge reasoning method, which can exploit heterogeneous knowledge for commonsense machine comprehension. |
217 | Document-Level Multi-Aspect Sentiment Classification as Machine Comprehension | Yichun Yin, Yangqiu Song, Ming Zhang | In this paper, we model the task as a machine comprehension problem where pseudo question-answer pairs are constructed by a small number of aspect-related keywords and aspect ratings. We will release our code and data to support replicability of the method. |
218 | What is the Essence of a Claim? Cross-Domain Claim Identification | Johannes Daxenberger, Steffen Eger, Ivan Habernal, Christian Stab, Iryna Gurevych | We perform a qualitative analysis across six different datasets and show that these appear to conceptualize claims quite differently. |
219 | Identifying Where to Focus in Reading Comprehension for Neural Question Generation | Xinya Du, Claire Cardie | We propose a hierarchical neural sentence-level sequence tagging model for this task, which existing approaches to question generation have ignored. |
220 | Break it Down for Me: A Study in Automated Lyric Annotation | Lucas Sterckx, Jason Naradowsky, Bill Byrne, Thomas Demeester, Chris Develder | We introduce the task of automated lyric annotation (ALA). |
221 | Cascaded Attention based Unsupervised Information Distillation for Compressive Summarization | Piji Li, Wai Lam, Lidong Bing, Weiwei Guo, Hang Li | Inspired by this observation, we propose a cascaded attention based unsupervised model to estimate the salience information from the text for compressive multi-document summarization. |
222 | Deep Recurrent Generative Decoder for Abstractive Text Summarization | Piji Li, Wai Lam, Lidong Bing, Zihao Wang | We propose a new framework for abstractive text summarization based on a sequence-to-sequence oriented encoder-decoder model equipped with a deep recurrent generative decoder (DRGN). |
223 | Extractive Summarization Using Multi-Task Learning with Document Classification | Masaru Isonuma, Toru Fujino, Junichiro Mori, Yutaka Matsuo, Ichiro Sakata | In this paper, we propose a general framework for summarization that extracts sentences from a document using externally related information. |
224 | Towards Automatic Construction of News Overview Articles by News Synthesis | Jianmin Zhang, Xiaojun Wan | In this paper we investigate a new task of automatically constructing an overview article from a given set of news articles about a news event. |
225 | Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank | Kai Zhao, Liang Huang | In this paper we propose the first end-to-end discourse parser that jointly parses in both syntax and discourse levels, as well as the first syntacto-discourse treebank by integrating the Penn Treebank and the RST Treebank. |
226 | Event Coreference Resolution by Iteratively Unfolding Inter-dependencies among Events | Prafulla Kumar Choubey, Ruihong Huang | We introduce a novel iterative approach for event coreference resolution that gradually builds event clusters by exploiting inter-dependencies among event mentions within the same chain as well as across event chains. |
227 | When to Finish? Optimal Beam Search for Neural Text Generation (modulo beam size) | Liang Huang, Kai Zhao, Mingbo Ma | We propose a provably optimal beam search algorithm that will always return the optimal-score complete hypothesis (modulo beam size), and finish as soon as the optimality is established. |
228 | Steering Output Style and Topic in Neural Response Generation | Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg | We propose simple and flexible training and decoding methods for influencing output style and topic in neural encoder-decoder based language generation. |
229 | Preserving Distributional Information in Dialogue Act Classification | Quan Hung Tran, Ingrid Zukerman, Gholamreza Haffari | This paper introduces a novel training/decoding strategy for sequence labeling. |
230 | Adversarial Learning for Neural Dialogue Generation | Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, Dan Jurafsky | We cast the task as a reinforcement learning problem where we jointly train two systems: a generative model to produce response sequences, and a discriminator, analogous to the human evaluator in the Turing test, to distinguish between the human-generated dialogues and the machine-generated ones. |
231 | Using Context Information for Dialog Act Classification in DNN Framework | Yang Liu, Kun Han, Zhao Tan, Yun Lei | This paper proposes several ways of using context information for DA classification, all in the deep learning framework. |
232 | Modeling Dialogue Acts with Content Word Filtering and Speaker Preferences | Yohan Jo, Michael Yoder, Hyeju Jang, Carolyn Rosé | We present an unsupervised model of dialogue act sequences in conversation. |
233 | Towards Implicit Content-Introducing for Generative Short-Text Conversation Systems | Lili Yao, Yaoyuan Zhang, Yansong Feng, Dongyan Zhao, Rui Yan | In this paper, we aim to generate a more meaningful and informative reply when answering a given question. |
234 | Affordable On-line Dialogue Policy Learning | Cheng Chang, Runzhe Yang, Lu Chen, Xiang Zhou, Kai Yu | For solving the unsustainable learning problem, we proposed a complete companion teaching framework incorporating the guidance from the human teacher. |
235 | Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models | Yuanlong Shao, Stephan Gouws, Denny Britz, Anna Goldie, Brian Strope, Ray Kurzweil | In this work, we focus on the single turn setting. |
236 | Bootstrapping incremental dialogue systems from minimal data: the generalisation power of dialogue grammars | Arash Eshghi, Igor Shalyminov, Oliver Lemon | We investigate an end-to-end method for automatically inducing task-based dialogue systems from small amounts of unannotated dialogue data. |
237 | Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning | Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong | This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales. |
238 | Why We Need New Evaluation Metrics for NLG | Jekaterina Novikova, Ondřej Dušek, Amanda Cercas Curry, Verena Rieser | In this paper, we motivate the need for novel, system- and data-independent automatic evaluation methods: We investigate a wide range of metrics, including state-of-the-art word-based and novel grammar-based ones, and demonstrate that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG. |
239 | Challenges in Data-to-Document Generation | Sam Wiseman, Stuart Shieber, Alexander Rush | In this work, we suggest a slightly more difficult data-to-text generation task, and investigate how effective current approaches are on this task. In particular, we introduce a new, large-scale corpus of data records paired with descriptive documents, propose a series of extractive evaluation methods for analyzing performance, and obtain baseline results using current neural generation methods. |
240 | All that is English may be Hindi: Enhancing language identification through automatic ranking of the likeliness of word borrowing in social media | Jasabanta Patro, Bidisha Samanta, Saurabh Singh, Abhipsa Basu, Prithwish Mukherjee, Monojit Choudhury, Animesh Mukherjee | In this paper, we present a set of computational methods to identify the likeliness of a word being borrowed, based on the signals from social media. |
241 | Multi-View Unsupervised User Feature Embedding for Social Media-based Substance Use Prediction | Tao Ding, Warren K. Bickel, Shimei Pan | In this paper, we demonstrate how the state-of-the-art machine learning and text mining techniques can be used to build effective social media-based substance use detection systems. |
242 | Demographic-aware word associations | Aparna Garimella, Carmen Banea, Rada Mihalcea | To capture these variations, we introduce the task of demographic-aware word associations. We build a new gold standard dataset consisting of word association responses for approximately 300 stimulus words, collected from more than 800 respondents of different gender (male/female) and from different locations (India/United States), and show that there are significant variations in the word associations made by these groups. |
243 | A Factored Neural Network Model for Characterizing Online Discussions in Vector Space | Hao Cheng, Hao Fang, Mari Ostendorf | We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums. |
244 | Dimensions of Interpersonal Relationships: Corpus and Experiments | Farzana Rashid, Eduardo Blanco | This paper presents a corpus and experiments to determine dimensions of interpersonal relationships. We create a corpus by retrieving pairs of people, and then annotating dimensions for their relationships. |
245 | Argument Mining on Twitter: Arguments, Facts and Sources | Mihai Dusmanu, Elena Cabrio, Serena Villata | In this paper, we apply supervised classification to identify arguments on Twitter, and we present two new tasks for argument mining, namely facts recognition and source identification. |
246 | Distinguishing Japanese Non-standard Usages from Standard Ones | Tatsuya Aoki, Ryohei Sasano, Hiroya Takamura, Manabu Okumura | In this study, we attempt to distinguish non-standard usages on social media from standard ones in an unsupervised manner. |
247 | Connotation Frames of Power and Agency in Modern Films | Maarten Sap, Marcella Cindy Prasettio, Ari Holtzman, Hannah Rashkin, Yejin Choi | We introduce connotation frames of power and agency, a pragmatic formalism organized using frame semantic representations, to model how different levels of power and agency are implicitly projected on actors through their actions. |
248 | Controlling Human Perception of Basic User Traits | Daniel Preoţiuc-Pietro, Sharath Chandra Guntuku, Lyle Ungar | In this pilot study, we measure the extent to which human perception of basic user trait information – gender and age – is controllable through text. |
249 | Topic Signatures in Political Campaign Speeches | Clément Gautrais, Peggy Cellier, René Quiniou, Alexandre Termier | In this paper, we present a method combining standard topic modeling with signature mining for analyzing topic recurrence in speeches of Clinton and Trump during the 2016 American presidential campaign. |
250 | Assessing Objective Recommendation Quality through Political Forecasting | H. Andrew Schwartz, Masoud Rouhizadeh, Michael Bishop, Philip Tetlock, Barbara Mellers, Lyle Ungar | We explore recommendation quality assessment with respect to both subjective (i.e. users’ ratings) and objective (i.e., did it influence? |
251 | Never Abandon Minorities: Exhaustive Extraction of Bursty Phrases on Microblogs Using Set Cover Problem | Masumi Shirakawa, Takahiro Hara, Takuya Maekawa | We propose a language-independent data-driven method to exhaustively extract bursty phrases of arbitrary forms (e.g., phrases other than simple noun phrases) from microblogs. |
252 | Maximum Margin Reward Networks for Learning from Explicit and Implicit Supervision | Haoruo Peng, Ming-Wei Chang, Wen-tau Yih | In this work, we propose Maximum Margin Reward Networks, a neural network-based framework that aims to learn from both explicit (full structures) and implicit supervision signals (delayed feedback on the correctness of the predicted structure). |
253 | The Impact of Modeling Overall Argumentation with Tree Kernels | Henning Wachsmuth, Giovanni Da San Martino, Dora Kiesel, Benno Stein | Several approaches have been proposed to model either the explicit sequential structure of an argumentative text or its implicit hierarchical structure. |
254 | Learning Generic Sentence Representations Using Convolutional Neural Networks | Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin | We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. |
255 | Repeat before Forgetting: Spaced Repetition for Efficient and Effective Training of Neural Networks | Hadi Amiri, Timothy Miller, Guergana Savova | We present a novel approach for training artificial neural networks. |
256 | Part-of-Speech Tagging for Twitter with Adversarial Neural Networks | Tao Gui, Qi Zhang, Haoran Huang, Minlong Peng, Xuanjing Huang | In this work, we study the problem of part-of-speech tagging for Tweets. |
257 | Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings | Bofang Li, Tao Liu, Zhe Zhao, Buzhou Tang, Aleksandr Drozd, Anna Rogers, Xiaoyong Du | We provide a systematic investigation of 4 different syntactic context types and context representations for learning word embeddings. |
258 | Does syntax help discourse segmentation? Not so much | Chloé Braud, Ophélie Lacroix, Anders Søgaard | Our results show that dependency information is less useful than expected, but we provide a fully scalable, robust model that only relies on part-of-speech information, and show that it performs well across languages in the absence of any gold-standard annotation. |
259 | Deal or No Deal? End-to-End Learning of Negotiation Dialogues | Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, Dhruv Batra | For the first time, we show it is possible to train end-to-end models for negotiation, which must learn both linguistic and reasoning skills with no annotated dialogue states. |
260 | Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning | Lu Chen, Xiang Zhou, Cheng Chang, Runzhe Yang, Kai Yu | We employ a companion learning framework to integrate the two approaches for on-line dialogue policy learning, in which a pre-defined rule-based policy acts as a “teacher” and guides a data-driven RL system by giving example actions as well as additional rewards. |
261 | Towards Debate Automation: a Recurrent Model for Predicting Debate Winners | Peter Potash, Anna Rumshisky | In this paper we introduce a practical first step towards the creation of an automated debate agent: a state-of-the-art recurrent predictive model for predicting debate winners. |
262 | Further Investigation into Reference Bias in Monolingual Evaluation of Machine Translation | Qingsong Ma, Yvette Graham, Timothy Baldwin, Qun Liu | Further Investigation into Reference Bias in Monolingual Evaluation of Machine Translation |
263 | A Challenge Set Approach to Evaluating Machine Translation | Pierre Isabelle, Colin Cherry, George Foster | To exemplify this approach, we present an English-French challenge set, and use it to analyze phrase-based and neural systems. |
264 | Knowledge Distillation for Bilingual Dictionary Induction | Ndapandula Nakashole, Raphael Flauger | In this paper, we propose a bridging approach, where our main contribution is a knowledge distillation training objective. |
265 | Machine Translation, it’s a question of style, innit? The case of English tag questions | Rachel Bawden | In this paper, we address the problem of generating English tag questions (TQs) (e.g. it is, isn’t it?) |
266 | Deciphering Related Languages | Nima Pourdamghani, Kevin Knight | We present a method for translating texts between close language pairs. |
267 | Identifying Cognate Sets Across Dictionaries of Related Languages | Adam St Arnaud, David Beck, Grzegorz Kondrak | We present a system for identifying cognate sets across dictionaries of related languages. |
268 | Learning Language Representations for Typology Prediction | Chaitanya Malaviya, Graham Neubig, Patrick Littell | Exploiting the existence of parallel texts in more than a thousand languages, we build a massive many-to-one NMT system from 1017 languages into English, and use this to predict information missing from typological databases. |
269 | Cheap Translation for Cross-Lingual Named Entity Recognition | Stephen Mayhew, Chen-Tse Tsai, Dan Roth | We propose a simple method for cross-lingual named entity recognition (NER) that works well in settings with very minimal resources. |
270 | Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation | Ivan Vulić, Nikola Mrkšić, Anna Korhonen | In this work, we propose a novel cross-lingual transfer method for inducing VerbNets for multiple languages. |
271 | Classification of telicity using cross-linguistic annotation projection | Annemarie Friedrich, Damyana Gateva | Our contributions are as follows. We also create a new data set of English texts manually annotated with telicity. |
272 | Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation | Carolin Lawrence, Artem Sokolov, Stefan Riezler | We show that counterfactual learning from deterministic bandit logs is possible nevertheless by smoothing out deterministic components in learning. |
273 | Learning Fine-grained Relations from Chinese User Generated Categories | Chengyu Wang, Yan Fan, Xiaofeng He, Aoying Zhou | In this paper, we present a weakly supervised learning framework to harvest relations from Chinese UGCs. |
274 | Improving Slot Filling Performance with Attentive Neural Networks on Dependency Structures | Lifu Huang, Avirup Sil, Heng Ji, Radu Florian | In this paper we propose an effective DNN architecture for SF with the following new strategies: (1). |
275 | Identifying Products in Online Cybercrime Marketplaces: A Dataset for Fine-grained Domain Adaptation | Greg Durrett, Jonathan K. Kummerfeld, Taylor Berg-Kirkpatrick, Rebecca Portnoff, Sadia Afroz, Damon McCoy, Kirill Levchenko, Vern Paxson | In this work, we study the task of identifying products being bought and sold in online cybercrime forums, which exhibits particularly challenging cross-domain effects. We release a dataset of 1,938 annotated posts from across the four forums. |
276 | Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators | Aldrian Obaja Muis, Wei Lu | In this paper, we propose a new model that is capable of recognizing overlapping mentions. |
277 | Deep Joint Entity Disambiguation with Local Neural Attention | Octavian-Eugen Ganea, Thomas Hofmann | We propose a novel deep learning model for joint document-level entity disambiguation, which leverages learned neural representations. |
278 | MinIE: Minimizing Facts in Open Information Extraction | Kiril Gashteovski, Rainer Gemulla, Luciano del Corro | In this paper, we propose MinIE, an OIE system that aims to provide useful, compact extractions with high precision and recall. |
279 | Scientific Information Extraction with Semi-supervised Neural Tagging | Yi Luan, Mari Ostendorf, Hannaneh Hajishirzi | We cast the problem as sequence tagging and introduce semi-supervised methods to a neural tagging model, which builds on recent advances in named entity recognition. |
280 | NITE: A Neural Inductive Teaching Framework for Domain Specific NER | Siliang Tang, Ning Zhang, Jinjiang Zhang, Fei Wu, Yueting Zhuang | In this paper, we proposed a novel Neural Inductive TEaching framework (NITE) to transfer knowledge from existing domain-specific NER models into an arbitrary deep neural network in a teacher-student training manner. |
281 | Speeding up Reinforcement Learning-based Information Extraction Training using Asynchronous Methods | Aditya Sharma, Zarana Parekh, Partha Talukdar | We leverage recent advances in parallel RL training using asynchronous methods and propose RLIE-A3C. |
282 | Leveraging Linguistic Structures for Named Entity Recognition with Bidirectional Recursive Neural Networks | Peng-Hsuan Li, Ruo-Ping Dong, Yu-Siang Wang, Ju-Chieh Chou, Wei-Yun Ma | In this paper, we utilize the linguistic structures of texts to improve named entity recognition by BRNN-CNN, a special bidirectional recursive network attached with a convolutional network. |
283 | Fast and Accurate Entity Recognition with Iterated Dilated Convolutions | Emma Strubell, Patrick Verga, David Belanger, Andrew McCallum | We describe a distinct combination of network structure, parameter sharing and training procedures that enable dramatic 14-20x test-time speedups while retaining accuracy comparable to the Bi-LSTM-CRF. |
284 | Entity Linking via Joint Encoding of Types, Descriptions, and Context | Nitish Gupta, Sameer Singh, Dan Roth | In this work we present a neural, modular entity linking system that learns a unified dense representation for each entity using multiple sources of information, such as its description, contexts around its mentions, and its fine-grained types. |
285 | An Insight Extraction System on BioMedical Literature with Deep Neural Networks | Hua He, Kris Ganjam, Navendu Jain, Jessica Lundin, Ryen White, Jimmy Lin | As new scientific findings appear across a large collection of biomedical publications, our aim is to tap into this literature to automate biomedical knowledge extraction and identify important insights from them. |
286 | Word Etymology as Native Language Interference | Vivi Nastase, Carlo Strapparava | We present experiments that show the influence of native language on lexical choice when producing text in another language – in this particular case English. |
287 | A Simpler and More Generalizable Story Detector using Verb and Character Features | Joshua Eisenberg, Mark Finlayson | We present a new state-of-the-art detector that achieves a maximum performance of 0.75 F1 (a 14% improvement), with significantly greater generalizability than previous work. |
288 | Multi-modular domain-tailored OCR post-correction | Sarah Schulz, Jonas Kuhn | Since we consider the accessibility of the resulting tool as a crucial part of Digital Humanities collaborations, we describe the workflow we suggest for efficient text recognition and subsequent automatic and manual post-correction. |
289 | Learning to Predict Charges for Criminal Cases with Legal Basis | Bingfeng Luo, Yansong Feng, Jianbo Xu, Xiang Zhang, Dongyan Zhao | We argue that relevant law articles play an important role in this task, and therefore propose an attention-based neural network method to jointly model the charge prediction task and the relevant article extraction task in a unified framework. |
290 | Quantifying the Effects of Text Duplication on Semantic Models | Alexandra Schofield, Laure Thompson, David Mimno | By artificially creating different forms of duplicate text we confirm several hypotheses about how repeated text impacts models. |
291 | Identifying Semantically Deviating Outlier Documents | Honglei Zhuang, Chi Wang, Fangbo Tao, Lance Kaplan, Jiawei Han | In this paper, we study the problem of mining semantically deviating document outliers in a given corpus. |
292 | Detecting and Explaining Causes From Text For a Time Series Event | Dongyeop Kang, Varun Gangal, Ang Lu, Zheng Chen, Eduard Hovy | To detect causal features from text, we propose a novel method based on the Granger causality of time series between features extracted from text such as N-grams, topics, sentiments, and their composition. |
293 | A Novel Cascade Model for Learning Latent Similarity from Heterogeneous Sequential Data of MOOC | Zhuoxuan Jiang, Shanshan Feng, Gao Cong, Chunyan Miao, Xiaoming Li | In this paper, we propose a novel cascade model, which can capture both the latent semantics and latent similarity by modeling MOOC data. |
294 | Identifying the Provision of Choices in Privacy Policy Text | Kanthashree Mysore Sathyendra, Shomir Wilson, Florian Schaub, Sebastian Zimmeck, Norman Sadeh | In particular, we present a two-stage architecture of classification models to identify opt-out choices in privacy policy text, labelling common varieties of choices with a mean F1 score of 0.735. |
295 | An Empirical Analysis of Edit Importance between Document Versions | Tanya Goyal, Sachin Kelkar, Manas Agarwal, Jeenu Grover | In this paper, we present a novel approach to infer significance of various textual edits to documents. |
296 | Transition-Based Disfluency Detection using LSTMs | Shaolei Wang, Wanxiang Che, Yue Zhang, Meishan Zhang, Ting Liu | In this paper, we model the problem of disfluency detection using a transition-based framework, which incrementally constructs and labels the disfluency chunk of input sentences using a new transition system without syntax information. |
297 | Neural Sequence-Labelling Models for Grammatical Error Correction | Helen Yannakoudakis, Marek Rei, Øistein E. Andersen, Zheng Yuan | We propose an approach to N-best list reranking using neural sequence-labelling models. |
298 | Adapting Sequence Models for Sentence Correction | Allen Schmaltz, Yoon Kim, Alexander Rush, Stuart Shieber | In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode subword information via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches. |
299 | A Study of Style in Machine Translation: Controlling the Formality of Machine Translation Output | Xing Niu, Marianna Martindale, Marine Carpuat | We propose to use lexical formality models to control the formality level of machine translation output. |
300 | Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU | Jacob Devlin | In this work we focus on efficient decoding, with a goal of achieving accuracy close to the state-of-the-art in neural machine translation (NMT), while achieving CPU decoding speed/throughput close to that of a phrasal decoder. |
301 | Exploiting Cross-Sentence Context for Neural Machine Translation | Longyue Wang, Zhaopeng Tu, Andy Way, Qun Liu | In this paper, we propose a cross-sentence context-aware approach and investigate the influence of historical contextual information on the performance of neural machine translation (NMT). |
302 | Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources | Joo-Kyung Kim, Young-Bum Kim, Ruhi Sarikaya, Eric Fosler-Lussier | In this paper, we introduce a cross-lingual transfer learning model for POS tagging without ancillary resources such as parallel corpora. |
303 | Image Pivoting for Learning Multilingual Multimodal Representations | Spandana Gella, Rico Sennrich, Frank Keller, Mirella Lapata | In this paper we propose a model to learn multimodal multilingual representations for matching images and sentences in different languages, with the aim of advancing multilingual versions of image search and image understanding. |
304 | Neural Machine Translation with Source Dependency Representation | Kehai Chen, Rui Wang, Masao Utiyama, Lemao Liu, Akihiro Tamura, Eiichiro Sumita, Tiejun Zhao | In this paper, we propose a novel NMT with source dependency representation to improve translation performance of NMT, especially for long sentences. |
305 | Visual Denotations for Recognizing Textual Entailment | Dan Han, Pascual Martínez-Gómez, Koji Mineshima | We propose to map phrases to their visual denotations and compare their meaning in terms of their images. |
306 | Sequence Effects in Crowdsourced Annotations | Nitika Mathur, Timothy Baldwin, Trevor Cohn | In this paper, we explore sequence effects where annotations of an item are affected by the preceding items. |
307 | No Need to Pay Attention: Simple Recurrent Neural Networks Work! | Ferhan Ture, Oliver Jojic | In fact, we present a preliminary analysis of the performance of our model on real queries from Comcast’s X1 entertainment platform with millions of users every day. |
308 | The strange geometry of skip-gram with negative sampling | David Mimno, Laure Thompson | We show that this geometric concentration depends on the ratio of positive to negative examples, and that it is neither theoretically nor empirically inherent in related embedding algorithms. |
309 | Natural Language Processing with Small Feed-Forward Networks | Jan A. Botha, Emily Pitler, Ji Ma, Anton Bakalov, Alex Salcianu, David Weiss, Ryan McDonald, Slav Petrov | We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models. |
310 | Deep Multi-Task Learning for Aspect Term Extraction with Memory Interaction | Xin Li, Wai Lam | We propose a novel LSTM-based deep multi-task learning framework for aspect term extraction from user review sentences. |
311 | Analogs of Linguistic Structure in Deep Representations | Jacob Andreas, Dan Klein | We investigate the compositional structure of message vectors computed by a deep network trained on a communication game. |
312 | A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings | Wei Yang, Wei Lu, Vincent Zheng | In this paper, we present a simple yet effective method for learning word embeddings based on text from different domains. |
313 | Learning what to read: Focused machine reading | Enrique Noriega-Atala, Marco A. Valenzuela-Escárcega, Clayton Morrison, Mihai Surdeanu | In this work, we introduce a focused reading approach to guide the machine reading of biomedical literature towards what literature should be read to answer a biomedical query as efficiently as possible. |
314 | DOC: Deep Open Classification of Text Documents | Lei Shu, Hu Xu, Bing Liu | This paper proposes a novel deep learning based approach. |
315 | Charmanteau: Character Embedding Models For Portmanteau Creation | Varun Gangal, Harsh Jhamtani, Graham Neubig, Eduard Hovy, Eric Nyberg | We propose a noisy-channel-style model, which allows for the incorporation of unsupervised word lists, improving performance over a standard source-to-target model. |
316 | Using Automated Metaphor Identification to Aid in Detection and Prediction of First-Episode Schizophrenia | E. Darío Gutiérrez, Guillermo Cecchi, Cheryl Corcoran, Philip Corlett | Using metaphor-identification and sentiment-analysis algorithms to automatically generate features, we create a classifier that, with high accuracy, can predict which patients will develop (or currently suffer from) schizophrenia. |
317 | Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking | Hannah Rashkin, Eunsol Choi, Jin Yea Jang, Svitlana Volkova, Yejin Choi | We present an analytic study on the language of news media in the context of political fact-checking and fake news detection. |
318 | Topic-Based Agreement and Disagreement in US Electoral Manifestos | Stefano Menini, Federico Nanni, Simone Paolo Ponzetto, Sara Tonelli | We present a topic-based analysis of agreement and disagreement in political manifestos, which relies on a new method for topic detection based on key concept clustering. |
319 | Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora | Hainan Xu, Philipp Koehn | We propose a novel type of bag-of-words translation feature, and train logistic regression models to classify good data and synthetic noisy data in the proposed feature space. |
320 | Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps | Tobias Falke, Iryna Gurevych | To close this gap, we present a newly created corpus of concept maps that summarize heterogeneous collections of web documents on educational topics. We release the corpus along with a baseline system and proposed evaluation protocol to enable further research on this variant of summarization. |
321 | Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog | Satwik Kottur, José Moura, Stefan Lee, Dhruv Batra | In this paper, using a Task & Talk reference game between two agents as a testbed, we present a sequence of ‘negative’ results culminating in a ‘positive’ one – showing that while most agent-invented languages are effective (i.e. achieve near-perfect task rewards), they are decidedly not interpretable or compositional. |
322 | Depression and Self-Harm Risk Assessment in Online Forums | Andrew Yates, Arman Cohan, Nazli Goharian | In this work, we present a framework for supporting and studying users in both types of communities. We introduce a large-scale general forum dataset consisting of users with self-reported depression diagnoses matched with control users. |
323 | Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints | Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang | In this work, we study data and models associated with multilabel object classification and visual semantic role labeling. |