Paper Digest: ACL 2017 Highlights
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2017, it was held in Vancouver, Canada. There were 1,318 paper submissions, of which 195 were accepted as long papers and 107 as short papers.
To help the AI community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly grasp the main idea of each paper.
We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to receive daily updates on new papers, customized to your own interests.
Paper Digest Team
team@paperdigest.org
TABLE 1: ACL 2017 Long Papers
# | Title | Authors | Highlight
---|---|---|---
1 | Adversarial Multi-task Learning for Text Classification | Pengfei Liu, Xipeng Qiu, Xuanjing Huang | In this paper, we propose an adversarial multi-task learning framework, preventing the shared and private latent feature spaces from interfering with each other. |
2 | Neural End-to-End Learning for Computational Argumentation Mining | Steffen Eger, Johannes Daxenberger, Iryna Gurevych | We investigate neural techniques for end-to-end computational argumentation mining (AM). |
3 | Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision | Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao | In this work, we introduce a Neural Symbolic Machine, which contains (a) a neural “programmer”, i.e., a sequence-to-sequence model that maps language utterances to programs and utilizes a key-variable memory to handle compositionality (b) a symbolic “computer”, i.e., a Lisp interpreter that performs program execution, and helps find good programs by pruning the search space. |
4 | Neural Relation Extraction with Multi-lingual Attention | Yankai Lin, Zhiyuan Liu, Maosong Sun | To address this issue, we introduce a multi-lingual neural relation extraction framework, which employs mono-lingual attention to utilize the information within mono-lingual texts and further proposes cross-lingual attention to consider the information consistency and complementarity among cross-lingual texts. |
5 | Learning Structured Natural Language Representations for Semantic Parsing | Jianpeng Cheng, Siva Reddy, Vijay Saraswat, Mirella Lapata | We introduce a neural semantic parser which is interpretable and scalable. |
6 | Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules | Ivan Vulić, Nikola Mrkšić, Roi Reichart, Diarmuid Ó Séaghdha, Steve Young, Anna Korhonen | In this work, we propose a novel morph-fitting procedure which moves past the use of curated semantic lexicons for improving distributional vector spaces. |
7 | Skip-Gram − Zipf + Uniform = Vector Additivity | Alex Gittens, Dimitris Achlioptas, Michael W. Mahoney | When these assumptions do not hold, this work describes the correct non-linear composition operator. |
8 | The State of the Art in Semantic Representation | Omri Abend, Ari Rappoport | Yet, little has been done to assess the achievements and the shortcomings of these new contenders, compare them with syntactic schemes, and clarify the general goals of research on semantic representation. |
9 | Joint Learning for Event Coreference Resolution | Jing Lu, Vincent Ng | To address this problem, we propose a model for jointly learning event coreference, trigger detection, and event anaphoricity. |
10 | Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution | Ting Liu, Yiming Cui, Qingyu Yin, Wei-Nan Zhang, Shijin Wang, Guoping Hu | To alleviate the problem above, in this paper, we propose a simple but novel approach to automatically generate large-scale pseudo training data for zero pronoun resolution. |
11 | Discourse Mode Identification in Essays | Wei Song, Dong Wang, Ruiji Fu, Lizhen Liu, Ting Liu, Guoping Hu | We annotate a corpus to study the characteristics of discourse modes and describe a neural sequence labeling model for identification. |
12 | A Convolutional Encoder Model for Neural Machine Translation | Jonas Gehring, Michael Auli, David Grangier, Yann Dauphin | We present a faster and simpler architecture based on a succession of convolutional layers. |
13 | Deep Neural Machine Translation with Linear Associative Unit | Mingxuan Wang, Zhengdong Lu, Jie Zhou, Qun Liu | To address this problem we propose novel linear associative units (LAU) to reduce the gradient propagation path inside the recurrent unit. |
14 | Neural AMR: Sequence-to-Sequence Models for Parsing and Generation | Ioannis Konstas, Srinivasan Iyer, Mark Yatskar, Yejin Choi, Luke Zettlemoyer | We present a novel training procedure that can lift this limitation using millions of unlabeled sentences and careful preprocessing of the AMR graphs. |
15 | Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems | Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom | To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. |
16 | Automatically Generating Rhythmic Verse with Neural Networks | Jack Hopkins, Douwe Kiela | We propose two novel methodologies for the automatic generation of rhythmic poetry in a variety of forms. |
17 | Creating Training Corpora for NLG Micro-Planners | Claire Gardent, Anastasia Shimorina, Shashi Narayan, Laura Perez-Beltrachini | In this paper, we present a novel framework for semi-automatically creating linguistically challenging micro-planning data-to-text corpora from existing Knowledge Bases. |
18 | Gated Self-Matching Networks for Reading Comprehension and Question Answering | Wenhui Wang, Nan Yang, Furu Wei, Baobao Chang, Ming Zhou | In this paper, we present the gated self-matching networks for reading comprehension style question answering, which aims to answer questions from a given passage. |
19 | Generating Natural Answers by Incorporating Copying and Retrieving Mechanisms in Sequence-to-Sequence Learning | Shizhu He, Cao Liu, Kang Liu, Jun Zhao | In this paper, we propose an end-to-end question answering system called COREQA in sequence-to-sequence learning, which incorporates copying and retrieving mechanisms to generate natural answers within an encoder-decoder framework. |
20 | Coarse-to-Fine Question Answering for Long Documents | Eunsol Choi, Daniel Hewlett, Jakob Uszkoreit, Illia Polosukhin, Alexandre Lacoste, Jonathan Berant | We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models. |
21 | An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge | Yanchao Hao, Yuanzhe Zhang, Kang Liu, Shizhu He, Zhanyi Liu, Hua Wu, Jun Zhao | Hence, we present an end-to-end neural network model to represent the questions and their corresponding scores dynamically according to the various candidate answer aspects via cross-attention mechanism. |
22 | Translating Neuralese | Jacob Andreas, Anca Dragan, Dan Klein | Several approaches have recently been proposed for learning decentralized deep multiagent policies that coordinate via a differentiable communication channel. |
23 | Obtaining referential word meanings from visual and distributional information: Experiments on object naming | Sina Zarrieß, David Schlangen | We present a model that learns individual predictors for object names that link visual and distributional aspects of word meaning during training. |
24 | FOIL it! Find One mismatch between Image and Language caption | Ravi Shekhar, Sandro Pezzelle, Yauhen Klimovich, Aurélie Herbelot, Moin Nabi, Enver Sangineto, Raffaella Bernardi | In this paper, we aim to understand whether current language and vision (LaVi) models truly grasp the interaction between the two modalities. |
25 | Verb Physics: Relative Physical Knowledge of Actions and Objects | Maxwell Forbes, Yejin Choi | In this paper, we present an approach to infer relative physical knowledge of actions and objects along five dimensions (e.g., size, weight, and strength) from unstructured natural language text. |
26 | A* CCG Parsing with a Supertag and Dependency Factored Model | Masashi Yoshikawa, Hiroshi Noji, Yuji Matsumoto | We propose a new A* CCG parsing model in which the probability of a tree is decomposed into factors of CCG categories and its syntactic dependencies both defined on bi-directional LSTMs. |
27 | A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing | Daniel Fernández-González, Carlos Gómez-Rodríguez | In this paper, we propose a novel, fully non-monotonic transition system based on the non-projective Covington algorithm. |
28 | Aggregating and Predicting Sequence Labels from Crowd Annotations | An Thanh Nguyen, Byron Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease | To predict sequences in unannotated text, we propose a neural approach using Long Short Term Memory. |
29 | Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction | Chunting Zhou, Graham Neubig | In this paper we propose multi-space variational encoder-decoders, a new model for labeled sequence transduction with semi-supervised learning. |
30 | Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling | Zhe Gan, Chunyuan Li, Changyou Chen, Yunchen Pu, Qinliang Su, Lawrence Carin | This paper leverages recent advances in stochastic gradient Markov Chain Monte Carlo (also appropriate for large training sets) to learn weight uncertainty in RNNs. |
31 | Learning attention for historical text normalization by learning to pronounce | Marcel Bollmann, Joachim Bingel, Anders Søgaard | We analyze the induced models across 44 different texts from Early New High German. |
32 | Deep Learning in Semantic Kernel Spaces | Danilo Croce, Simone Filice, Giuseppe Castellucci, Roberto Basili | In this paper, we show that expressive kernels and deep neural networks can be combined in a common framework in order to (i) explicitly model structured information and (ii) learn non-linear decision functions. |
33 | Topically Driven Neural Language Model | Jey Han Lau, Timothy Baldwin, Trevor Cohn | We present a neural language model that incorporates document context in the form of a topic model-like architecture, thus providing a succinct representation of the broader document context outside of the current sentence. |
34 | Handling Cold-Start Problem in Review Spam Detection by Jointly Embedding Texts and Behaviors | Xuepeng Wang, Kang Liu, Jun Zhao | This paper proposes a novel neural network model to detect review spam under the cold-start problem, by learning to represent new reviewers’ reviews with jointly embedded textual and behavioral information. |
35 | Learning Cognitive Features from Gaze Data for Sentiment and Sarcasm Classification using Convolutional Neural Network | Abhijit Mishra, Kuntal Dey, Pushpak Bhattacharyya | We introduce a framework to automatically extract cognitive features from the eye-movement/gaze data of human readers reading the text and use them as features along with textual features for the tasks of sentiment polarity and sarcasm detection. |
36 | An Unsupervised Neural Attention Model for Aspect Extraction | Ruidan He, Wee Sun Lee, Hwee Tou Ng, Daniel Dahlmeier | In this paper, we present a novel neural approach with the aim of discovering coherent aspects. |
37 | Other Topics You May Also Agree or Disagree: Modeling Inter-Topic Preferences using Tweets and Matrix Factorization | Akira Sasaki, Kazuaki Hanawa, Naoaki Okazaki, Kentaro Inui | We present in this paper our approach for modeling inter-topic preferences of Twitter users: for example, “those who agree with the Trans-Pacific Partnership (TPP) also agree with free trade”. |
38 | Automatically Labeled Data Generation for Large Scale Event Extraction | Yubo Chen, Shulin Liu, Xiang Zhang, Kang Liu, Jun Zhao | To solve the data labeling problem, we propose to automatically label training data for event extraction via world knowledge and linguistic knowledge, which can detect key arguments and trigger words for each event type and employ them to label events in texts automatically. |
39 | Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules | Xiaoshi Zhong, Aixin Sun, Erik Cambria | Based on the findings, we propose a type-based approach, named SynTime, to recognize time expressions. |
40 | Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix | Bingfeng Luo, Yansong Feng, Zheng Wang, Zhanxing Zhu, Songfang Huang, Rui Yan, Dongyan Zhao | In this paper, we take a deep look at the application of distant supervision in relation extraction. |
41 | A Syntactic Neural Model for General-Purpose Code Generation | Pengcheng Yin, Graham Neubig | Informed by previous work in semantic parsing, in this paper we propose a novel neural architecture powered by a grammar model to explicitly capture the target syntax as prior knowledge. |
42 | Learning bilingual word embeddings with (almost) no bilingual data | Mikel Artetxe, Gorka Labaka, Eneko Agirre | Our method exploits the structural similarity of embedding spaces, and works with as little bilingual evidence as a 25 word dictionary or even an automatically generated list of numerals, obtaining results comparable to those of systems that use richer resources. |
43 | Abstract Meaning Representation Parsing using LSTM Recurrent Neural Networks | William Foland, James H. Martin | We present a system which parses sentences into Abstract Meaning Representations, improving state-of-the-art results for this task by more than 5%. |
44 | Deep Semantic Role Labeling: What Works and What’s Next | Luheng He, Kenton Lee, Mike Lewis, Luke Zettlemoyer | We introduce a new deep learning model for semantic role labeling (SRL) that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations. |
45 | Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access | Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng | In this paper, we address this limitation by replacing symbolic queries with an induced “soft” posterior distribution over the KB that indicates which entities the user is interested in. |
46 | Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots | Yu Wu, Wei Wu, Chen Xing, Ming Zhou, Zhoujun Li | We propose a sequential matching network (SMN) to address both problems. |
47 | Learning Word-Like Units from Joint Audio-Visual Analysis | David Harwath, James Glass | Given a collection of images and spoken audio captions, we present a method for discovering word-like acoustic units in the continuous speech signal and grounding them to semantically relevant image regions. |
48 | Joint CTC/attention decoding for end-to-end speech recognition | Takaaki Hori, Shinji Watanabe, John Hershey | This paper proposes a joint decoding algorithm for end-to-end ASR with a hybrid CTC/attention architecture, which effectively utilizes both advantages in decoding. |
49 | Found in Translation: Reconstructing Phylogenetic Language Trees from Translations | Ella Rabinovich, Noam Ordan, Shuly Wintner | We show that traces of the source language remain in the translation product to the extent that it is possible to uncover the history of the source language by looking only at the translation. |
50 | Predicting Native Language from Gaze | Yevgeni Berzak, Chie Nakamura, Suzanne Flynn, Boris Katz | We present a novel methodology for studying this question: analysis of eye-movement patterns in second language reading of free-form text. |
51 | MORSE: Semantic-ally Drive-n MORpheme SEgment-er | Tarek Sakakini, Suma Bhat, Pramod Viswanath | We present in this paper a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset that was created on the basis of compositionality. |
52 | Deep Pyramid Convolutional Neural Networks for Text Categorization | Rie Johnson, Tong Zhang | This paper proposes a low-complexity word-level deep convolutional neural network (CNN) architecture for text categorization that can efficiently represent long-range associations in text. |
53 | Improved Neural Relation Detection for Knowledge Base Question Answering | Mo Yu, Wenpeng Yin, Kazi Saidul Hasan, Cicero dos Santos, Bing Xiang, Bowen Zhou | In this paper, we propose a hierarchical recurrent neural network enhanced by residual learning which detects KB relations given an input question. |
54 | Deep Keyphrase Generation | Rui Meng, Sanqiang Zhao, Shuguang Han, Daqing He, Peter Brusilovsky, Yu Chi | We propose a generative model for keyphrase prediction with an encoder-decoder framework, which can effectively overcome the above drawbacks. |
55 | Attention-over-Attention Neural Networks for Reading Comprehension | Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, Guoping Hu | In this paper, we present a simple but novel model called the attention-over-attention reader for better solving the cloze-style reading comprehension task. |
56 | Alignment at Work: Using Language to Distinguish the Internalization and Self-Regulation Components of Cultural Fit in Organizations | Gabriel Doyle, Amir Goldberg, Sameer Srivastava, Michael Frank | Recent research draws on computational linguistics to measure cultural fit but overlooks asymmetries in cultural adaptation. |
57 | Representations of language in a model of visually grounded speech signal | Grzegorz Chrupała, Lieke Gelderloos, Afra Alishahi | We present a visually grounded model of speech perception which projects spoken utterances and images to a joint semantic space. |
58 | Spectral Analysis of Information Density in Dialogue Predicts Collaborative Task Performance | Yang Xu, David Reitter | We propose a perspective on dialogue that focuses on relative information contributions of conversation partners as a key to successful communication. |
59 | Affect-LM: A Neural Language Model for Customizable Affective Text Generation | Sayan Ghosh, Mathieu Chollet, Eugene Laksana, Louis-Philippe Morency, Stefan Scherer | In this paper, we propose an extension to an LSTM (Long Short-Term Memory) language model for generation of conversational text, conditioned on affect categories. |
60 | Domain Attention with an Ensemble of Experts | Young-Bum Kim, Karl Stratos, Dongchan Kim | We describe a solution based on attending an ensemble of domain experts. |
61 | Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders | Tiancheng Zhao, Ran Zhao, Maxine Eskenazi | Unlike past work that has focused on diversifying the output of the decoder from word-level to alleviate this problem, we present a novel framework based on conditional variational autoencoders that capture the discourse-level diversity in the encoder. |
62 | Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning | Jason D. Williams, Kavosh Asadi, Geoffrey Zweig | We introduce Hybrid Code Networks (HCNs), which combine an RNN with domain-specific knowledge encoded as software and system action templates. |
63 | Generating Contrastive Referring Expressions | Martín Villalba, Christoph Teichmann, Alexander Koller | We present an algorithm for generating corrective REs that use contrastive focus (“no, the BLUE button”) to emphasize the information the hearer most likely misunderstood. |
64 | Modeling Source Syntax for Neural Machine Translation | Junhui Li, Deyi Xiong, Zhaopeng Tu, Muhua Zhu, Min Zhang, Guodong Zhou | On the basis, we propose three different sorts of encoders to incorporate source syntax into NMT: 1) Parallel RNN encoder that learns word and label annotation vectors parallelly; 2) Hierarchical RNN encoder that learns word and label annotation vectors in a two-level hierarchy; and 3) Mixed RNN encoder that stitchingly learns word and label annotation vectors over sequences where words and labels are mixed. |
65 | Sequence-to-Dependency Neural Machine Translation | Shuangzhi Wu, Dongdong Zhang, Nan Yang, Mu Li, Ming Zhou | Inspired by the success of using syntactic knowledge of target language for improving statistical machine translation, in this paper we propose a novel Sequence-to-Dependency Neural Machine Translation (SD-NMT) method, in which the target word sequence and its corresponding dependency structure are jointly constructed and modeled, and this structure is used as context to facilitate word generations. |
66 | Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning | Jing Ma, Wei Gao, Kam-Fai Wong | In this paper, we attempt to address the problem of identifying rumors, i.e., fake information, out of microblog posts based on their propagation structure. |
67 | EmoNet: Fine-Grained Emotion Detection with Gated Recurrent Neural Networks | Muhammad Abdul-Mageed, Lyle Ungar | In this work, we build a very large dataset for fine-grained emotions and develop deep learning models on it. |
68 | Beyond Binary Labels: Political Ideology Prediction of Twitter Users | Daniel Preoţiuc-Pietro, Ye Liu, Daniel Hopkins, Lyle Ungar | Using a novel data set with political ideology labels self-reported through surveys, our goal is two-fold: a) to characterize the groups of politically engaged users through language use on Twitter; b) to build a fine-grained model that predicts political ideology of unseen users. |
69 | Leveraging Behavioral and Social Information for Weakly Supervised Collective Classification of Political Discourse on Twitter | Kristen Johnson, Di Jin, Dan Goldwasser | We present a collection of weakly supervised models which harness collective classification to predict the frames used in political discourse on the microblogging platform, Twitter. |
70 | A Nested Attention Neural Hybrid Model for Grammatical Error Correction | Jianshu Ji, Qinlong Wang, Kristina Toutanova, Yongen Gong, Steven Truong, Jianfeng Gao | A Nested Attention Neural Hybrid Model for Grammatical Error Correction |
71 | TextFlow: A Text Similarity Measure based on Continuous Sequences | Yassine Mrabet, Halil Kilicoglu, Dina Demner-Fushman | In this paper we present a novel text similarity measure inspired from a common representation in DNA sequence alignment algorithms. |
72 | Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts | Chenhao Tan, Dallas Card, Noah A. Smith | Because ideas are naturally embedded in texts, we propose the first framework to systematically characterize the relations between ideas based on their occurrence in a corpus of documents, independent of how these ideas are represented. |
73 | Polish evaluation dataset for compositional distributional semantics models | Alina Wróblewska, Katarzyna Krasnowska-Kieraś | The paper presents a procedure of building an evaluation dataset. |
74 | Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction | Christopher Bryant, Mariano Felice, Ted Briscoe | To overcome this problem, we introduce ERRANT, a grammatical ERRor ANnotation Toolkit designed to automatically extract edits from parallel original and corrected sentences and classify them according to a new, dataset-agnostic, rule-based framework. |
75 | Evaluation Metrics for Machine Reading Comprehension: Prerequisite Skills and Readability | Saku Sugawara, Yusuke Kido, Hikaru Yokono, Akiko Aizawa | In this study, two classes of metrics were adopted for evaluating RC datasets: prerequisite skills and readability. |
76 | A Minimal Span-Based Neural Constituency Parser | Mitchell Stern, Jacob Andreas, Dan Klein | In this work, we present a minimal neural model for constituency parsing based on independent scoring of labels and spans. |
77 | Semantic Dependency Parsing via Book Embedding | Weiwei Sun, Junjie Cao, Xiaojun Wan | To build a semantic graph for a given sentence, we design new Maximum Subgraph algorithms to generate noncrossing graphs on each page, and a Lagrangian Relaxation-based algorithm to combine pages into a book. |
78 | Neural Word Segmentation with Rich Pretraining | Jie Yang, Yue Zhang, Fei Dong | We investigate the effectiveness of a range of external training sources for neural word segmentation by building a modular segmentation model, pretraining the most important submodule using rich external sources. |
79 | Neural Machine Translation via Binary Code Prediction | Yusuke Oda, Philip Arthur, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura | In this paper, we propose a new method for calculating the output layer in neural machine translation systems. |
80 | What do Neural Machine Translation Models Learn about Morphology? | Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass | In this work, we analyze the representations learned by neural MT models at various levels of granularity and empirically evaluate the quality of the representations for learning morphology through extrinsic part-of-speech and morphological tagging tasks. |
81 | Context-Dependent Sentiment Analysis in User-Generated Videos | Soujanya Poria, Erik Cambria, Devamanyu Hazarika, Navonil Majumder, Amir Zadeh, Louis-Philippe Morency | In this paper, we propose a LSTM-based model that enables utterances to capture contextual information from their surroundings in the same video, thus aiding the classification process. |
82 | A Multidimensional Lexicon for Interpersonal Stancetaking | Umashanthi Pavalanathan, Jim Fitzpatrick, Scott Kiesling, Jacob Eisenstein | A Multidimensional Lexicon for Interpersonal Stancetaking |
83 | Tandem Anchoring: a Multiword Anchor Approach for Interactive Topic Modeling | Jeffrey Lund, Connor Cook, Kevin Seppi, Jordan Boyd-Graber | We propose combinations of words as anchors, going beyond existing single word anchor algorithms-an approach we call “Tandem Anchors”. |
84 | Apples to Apples: Learning Semantics of Common Entities Through a Novel Comprehension Task | Omid Bakhshandeh, James Allen | In order to enable learning about common entities, we introduce a novel machine comprehension task, GuessTwo: given a short paragraph comparing different aspects of two real-world semantically-similar entities, a system should guess what those entities are. For benchmarking further progress in the task, we have collected a set of paragraphs as the test set, on which humans can accomplish the task with an accuracy of 94.2% on open-ended prediction. |
85 | Going out on a limb: Joint Extraction of Entity Mentions and Relations without Dependency Trees | Arzoo Katiyar, Claire Cardie | We present a novel attention-based recurrent neural network for joint extraction of entity mentions and relations. |
86 | Naturalizing a Programming Language via Interactive Learning | Sida I. Wang, Samuel Ginn, Percy Liang, Christopher D. Manning | Our goal is to create a convenient natural language interface for performing well-specified but complex actions such as analyzing data, manipulating text, and querying databases. |
87 | Semantic Word Clusters Using Signed Spectral Clustering | João Sedoc, Jean Gallier, Dean Foster, Lyle Ungar | We present a new signed spectral normalized graph cut algorithm, signed clustering, that overlays existing thesauri upon distributionally derived vector representations of words, so that antonym relationships between word pairs are represented by negative weights. |
88 | An Interpretable Knowledge Transfer Model for Knowledge Base Completion | Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy | We propose a novel embedding model, ITransF, to perform knowledge base completion. |
89 | Learning a Neural Semantic Parser from User Feedback | Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, Luke Zettlemoyer | We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal intervention. |
90 | Joint Modeling of Content and Discourse Relations in Dialogues | Kechen Qin, Lu Wang, Joseph Kim | We present a joint modeling approach to identify salient discussion points in spoken meetings as well as to label the discourse relations between speaker turns. |
91 | Argument Mining with Structured SVMs and RNNs | Vlad Niculae, Joonsuk Park, Claire Cardie | We propose a novel factor graph model for argument mining, designed for settings in which the argumentative relations in a document do not necessarily form a tree structure. |
92 | Neural Discourse Structure for Text Categorization | Yangfeng Ji, Noah A. Smith | We show that discourse structure, as defined by Rhetorical Structure Theory and provided by an existing discourse parser, benefits text categorization. |
93 | Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification | Lianhui Qin, Zhisong Zhang, Hai Zhao, Zhiting Hu, Eric Xing | We propose a feature imitation framework in which an implicit relation network is driven to learn from another neural network with access to connectives, and thus encouraged to extract similarly salient features for accurate classification. |
94 | Don’t understand a measure? Learn it: Structured Prediction for Coreference Resolution optimizing its measures | Iryna Haponchyk, Alessandro Moschitti | In this paper, we trade off exact computation for enabling the use and study of more complex loss functions for coreference resolution. |
95 | Bayesian Modeling of Lexical Resources for Low-Resource Settings | Nicholas Andrews, Mark Dredze, Benjamin Van Durme, Jason Eisner | In this paper, we investigate a more robust approach: we stipulate that the lexicon is the result of an assumed generative process. |
96 | Semi-Supervised QA with Generative Domain-Adaptive Nets | Zhilin Yang, Junjie Hu, Ruslan Salakhutdinov, William Cohen | We propose a novel training framework, the Generative Domain-Adaptive Nets. |
97 | From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood | Kelvin Guu, Panupong Pasupat, Evan Liu, Percy Liang | Our goal is to learn a semantic parser that maps natural language utterances into executable programs when only indirect supervision is available: examples are labeled with the correct execution result, but not the program itself. |
98 | Diversity driven attention model for query-based abstractive summarization | Preksha Nema, Mitesh M. Khapra, Anirban Laha, Balaraman Ravindran | In this work we propose a model for the query-based summarization task based on the encode-attend-decode paradigm with two key additions (i) a query attention model (in addition to document attention model) which learns to focus on different portions of the query at different time steps (instead of using a static representation for the query) and (ii) a new diversity based attention model which aims to alleviate the problem of repeating phrases in the summary. |
99 | Get To The Point: Summarization with Pointer-Generator Networks | Abigail See, Peter J. Liu, Christopher D. Manning | In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. |
100 | Supervised Learning of Automatic Pyramid for Optimization-Based Multi-Document Summarization | Maxime Peyrard, Judith Eckle-Kohler | We present a new supervised framework that learns to estimate automatic Pyramid scores and uses them for optimization-based extractive multi-document summarization. |
101 | Selective Encoding for Abstractive Sentence Summarization | Qingyu Zhou, Nan Yang, Furu Wei, Ming Zhou | We propose a selective encoding model to extend the sequence-to-sequence framework for abstractive sentence summarization. |
102 | PositionRank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents | Corina Florescu, Cornelia Caragea | In this paper, we propose PositionRank, an unsupervised model for keyphrase extraction from scholarly documents that incorporates information from all positions of a word’s occurrences into a biased PageRank. |
103 | Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses | Ryan Lowe, Michael Noseworthy, Iulian Vlad Serban, Nicolas Angelard-Gontier, Yoshua Bengio, Joelle Pineau | In response to this challenge, we formulate automatic dialogue evaluation as a learning problem. We present an evaluation model (ADEM) that learns to predict human-like scores for input responses, using a new dataset of human response scores. |
104 | A Transition-Based Directed Acyclic Graph Parser for UCCA | Daniel Hershcovich, Omri Abend, Ari Rappoport | We present the first parser for UCCA, a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. |
105 | Abstract Syntax Networks for Code Generation and Semantic Parsing | Maxim Rabinovich, Mitchell Stern, Dan Klein | We introduce abstract syntax networks, a modeling framework for these problems. |
106 | Visualizing and Understanding Neural Machine Translation | Yanzhuo Ding, Yang Liu, Huanbo Luan, Maosong Sun | In this work, we propose to use layer-wise relevance propagation (LRP) to compute the contribution of each contextual word to arbitrary hidden states in the attention-based encoder-decoder framework. |
107 | Detecting annotation noise in automatically labelled data | Ines Rehbein, Josef Ruppenhofer | We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. |
108 | Abstractive Document Summarization with a Graph-Based Attentional Neural Model | Jiwei Tan, Xiaojun Wan, Jianguo Xiao | Abstractive summarization is the ultimate goal of document summarization research, but it has previously been less investigated due to the immaturity of text generation techniques. |
109 | Probabilistic Typology: Deep Generative Models of Vowel Inventories | Ryan Cotterell, Jason Eisner | In this paper we present the first probabilistic treatment of a basic question in phonological typology: What makes a natural vowel inventory? |
110 | Adversarial Multi-Criteria Learning for Chinese Word Segmentation | Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang | In this paper, we propose adversarial multi-criteria learning for CWS by integrating shared knowledge from multiple heterogeneous segmentation criteria. |
111 | Neural Joint Model for Transition-based Chinese Syntactic Analysis | Shuhei Kurita, Daisuke Kawahara, Sadao Kurohashi | We present neural network-based joint models for Chinese word segmentation, POS tagging and dependency parsing. |
112 | Robust Incremental Neural Semantic Graph Parsing | Jan Buys, Phil Blunsom | We propose a neural encoder-decoder transition-based parser which is the first full-coverage semantic graph parser for Minimal Recursion Semantics (MRS). |
113 | Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme | Suncong Zheng, Feng Wang, Hongyun Bao, Yuexing Hao, Peng Zhou, Bo Xu | What’s more, the end-to-end model proposed in this paper achieves the best results on the public dataset. |
114 | A Local Detection Approach for Named Entity Recognition and Mention Detection | Mingbin Xu, Hui Jiang, Sedtawut Watcharawittayakul | In this paper, we study a novel approach for named entity recognition (NER) and mention detection (MD) in natural language processing. |
115 | Vancouver Welcomes You! Minimalist Location Metonymy Resolution | Milan Gritta, Mohammad Taher Pilehvar, Nut Limsopatham, Nigel Collier | We show how a minimalist neural approach combined with a novel predicate window method can achieve competitive results on the SemEval 2007 task on Metonymy Resolution. |
116 | Unifying Text, Metadata, and User Network Representations with a Neural Network for Geolocation Prediction | Yasuhide Miura, Motoki Taniguchi, Tomoki Taniguchi, Tomoko Ohkuma | We propose a novel geolocation prediction model using a complex neural network. |
117 | Multi-Task Video Captioning with Video and Entailment Generation | Ramakanth Pasunuru, Mohit Bansal | For this, we present a many-to-many multi-task learning model that shares parameters across the encoders and decoders of the three tasks. |
118 | Enriching Complex Networks with Word Embeddings for Detecting Mild Cognitive Impairment from Speech Transcripts | Leandro Santos, Edilson Anselmo Corrêa Júnior, Osvaldo Oliveira Jr, Diego Amancio, Letícia Mansur, Sandra Aluísio | In this paper, we modeled transcripts into complex networks and enriched them with word embedding (CNE) to better represent short texts produced in neuropsychological assessments. |
119 | Adversarial Adaptation of Synthetic or Stale Data | Young-Bum Kim, Karl Stratos, Dongchan Kim | We propose a solution to this mismatch problem by framing it as domain adaptation, treating the flawed training dataset as a source domain and the evaluation dataset as a target domain. |
120 | Chat Detection in an Intelligent Assistant: Combining Task-oriented and Non-task-oriented Spoken Dialogue Systems | Satoshi Akasaki, Nobuhiro Kaji | To address the lack of benchmark datasets for this task, we construct a new dataset consisting of 15,160 utterances collected from the real log data of a commercial intelligent assistant (and will release the dataset to facilitate future research activity). |
121 | A Neural Local Coherence Model | Dat Tien Nguyen, Shafiq Joty | We propose a local coherence model based on a convolutional neural network that operates over the entity grid representation of a text. |
122 | Data-Driven Broad-Coverage Grammars for Opinionated Natural Language Generation (ONLG) | Tomer Cagan, Stefan L. Frank, Reut Tsarfaty | We present a data-driven architecture for ONLG that generates subjective responses triggered by users’ agendas, consisting of topics and sentiments, and based on wide-coverage automatically-acquired generative grammars. |
123 | Learning to Ask: Neural Question Generation for Reading Comprehension | Xinya Du, Junru Shao, Claire Cardie | We introduce an attention-based sequence learning model for the task and investigate the effect of encoding sentence- vs. paragraph-level information. |
124 | Joint Optimization of User-desired Content in Multi-document Summaries by Learning from User Feedback | Avinesh P.V.S, Christian M. Meyer | In this paper, we propose an extractive multi-document summarization (MDS) system using joint optimization and active learning for content selection grounded in user feedback. |
125 | Flexible and Creative Chinese Poetry Generation Using Neural Memory | Jiyuan Zhang, Yang Feng, Dong Wang, Yang Wang, Andrew Abel, Shiyue Zhang, Andi Zhang | This work proposes a memory augmented neural model for Chinese poem generation, where the neural model and the augmented memory work together to balance the requirements of linguistic accordance and aesthetic innovation, leading to innovative generations that are still rule-compliant. |
126 | Learning to Generate Market Comments from Stock Prices | Soichiro Murakami, Akihiko Watanabe, Akira Miyazawa, Keiichi Goshima, Toshihiko Yanase, Hiroya Takamura, Yusuke Miyao | This paper presents a novel encoder-decoder model for automatically generating market comments from stock prices. |
127 | Can Syntax Help? Improving an LSTM-based Sentence Compression Model for New Domains | Liangguo Wang, Jing Jiang, Hai Leong Chieu, Chen Hui Ong, Dandan Song, Lejian Liao | In this paper, we study how to improve the domain adaptability of a deletion-based Long Short-Term Memory (LSTM) neural network model for sentence compression. |
128 | Transductive Non-linear Learning for Chinese Hypernym Prediction | Chengyu Wang, Junchi Yan, Aoying Zhou, Xiaofeng He | Rather than extracting hypernyms from texts, in this paper, we present a transductive learning approach to establish mappings from entities to hypernyms in the embedding space directly. |
129 | A Constituent-Centric Neural Architecture for Reading Comprehension | Pengtao Xie, Eric Xing | In this paper, we study the RC problem on the Stanford Question Answering Dataset (SQuAD). |
130 | Cross-lingual Distillation for Text Classification | Ruochen Xu, Yiming Yang | This paper presents a novel approach to CLTC that builds on model distillation, which adapts and extends a framework originally proposed for model compression. |
131 | Understanding and Predicting Empathic Behavior in Counseling Therapy | Verónica Pérez-Rosas, Rada Mihalcea, Kenneth Resnicow, Satinder Singh, Lawrence An | In this paper, we explore several aspects pertaining to counseling interaction dynamics and their relation to counselor empathy during motivational interviewing encounters. |
132 | Leveraging Knowledge Bases in LSTMs for Improving Machine Reading | Bishan Yang, Tom Mitchell | We propose KBLSTM, a novel neural model that leverages continuous representations of KBs to enhance the learning of recurrent neural networks for machine reading. |
133 | Prerequisite Relation Learning for Concepts in MOOCs | Liangming Pan, Chengjiang Li, Juanzi Li, Jie Tang | We study the extent to which the prerequisite relation between knowledge concepts in Massive Open Online Courses (MOOCs) can be inferred automatically. |
134 | Unsupervised Text Segmentation Based on Native Language Characteristics | Shervin Malmasi, Mark Dras, Mark Johnson, Lan Du, Magdalena Wolska | We propose a Bayesian unsupervised text segmentation approach to the latter. |
135 | Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection | Jian Ni, Georgiana Dinu, Radu Florian | In this paper, we present two weakly supervised approaches for cross-lingual NER with no human annotation in a target language. |
136 | Context Sensitive Lemmatization Using Two Successive Bidirectional Gated Recurrent Networks | Abhisek Chakrabarty, Onkar Arun Pandit, Utpal Garain | We introduce a composite deep neural network architecture for supervised and language independent context sensitive lemmatization. To train the model on Bengali, we develop a gold lemma annotated dataset (having 1,702 sentences with a total of 20,257 word tokens), which is an additional contribution of this work. |
137 | Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling | Kazuya Kawakami, Chris Dyer, Phil Blunsom | In this paper, we augment a hierarchical LSTM language model that generates sequences of word tokens character by character with a caching mechanism that learns to reuse previously generated words. To validate our model we construct a new open-vocabulary language modeling corpus (the Multilingual Wikipedia Corpus; MWC) from comparable Wikipedia articles in 7 typologically diverse languages and demonstrate the effectiveness of our model across this range of languages. |
138 | Bandit Structured Prediction for Neural Sequence-to-Sequence Learning | Julia Kreutzer, Artem Sokolov, Stefan Riezler | Bandit structured prediction describes a stochastic optimization framework where learning is performed from partial feedback. |
139 | Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization | Jiacheng Zhang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun | In this work, we propose to use posterior regularization to provide a general framework for integrating prior knowledge into neural machine translation. |
140 | Incorporating Word Reordering Knowledge into Attention-based Neural Machine Translation | Jinchao Zhang, Mingxuan Wang, Qun Liu, Jie Zhou | This paper proposes three distortion models to explicitly incorporate the word reordering knowledge into attention-based Neural Machine Translation (NMT) for further improving translation performance. |
141 | Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search | Chris Hokamp, Qun Liu | We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified lexical constraints. |
142 | Combating Human Trafficking with Multimodal Deep Models | Edmund Tong, Amir Zadeh, Cara Jones, Louis-Philippe Morency | In this paper, we take a major step in the automatic detection of advertisements suspected to pertain to human trafficking. We present a novel dataset called Trafficking-10k, with more than 10,000 advertisements annotated for this task. |
143 | MalwareTextDB: A Database for Annotated Malware Articles | Swee Kiat Lim, Aldrian Obaja Muis, Wei Lu, Chen Hui Ong | In this paper, we discuss the construction of a new database for annotated malware texts. |
144 | A Corpus of Annotated Revisions for Studying Argumentative Writing | Fan Zhang, Homa B. Hashemi, Rebecca Hwa, Diane Litman | This paper presents ArgRewrite, a corpus of between-draft revisions of argumentative essays. |
145 | Watset: Automatic Induction of Synsets from a Graph of Synonyms | Dmitry Ustalov, Alexander Panchenko, Chris Biemann | This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. |
146 | Neural Modeling of Multi-Predicate Interactions for Japanese Predicate Argument Structure Analysis | Hiroki Ouchi, Hiroyuki Shindo, Yuji Matsumoto | To remedy this problem, we introduce a model that uses grid-type recurrent neural networks. |
147 | TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension | Mandar Joshi, Eunsol Choi, Daniel Weld, Luke Zettlemoyer | We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. |
148 | Learning Semantic Correspondences in Technical Documentation | Kyle Richardson, Jonas Kuhn | Our approach exploits the parallel nature of such documentation, or the tight coupling between high-level text and the low-level representations we aim to learn. |
149 | Bridge Text and Knowledge by Learning Multi-Prototype Entity Mention Embedding | Yixin Cao, Lifu Huang, Heng Ji, Xu Chen, Juanzi Li | In this paper, to deal with the ambiguity of entity mentions, we propose a novel Multi-Prototype Mention Embedding model, which learns multiple sense embeddings for each mention by jointly modeling words from textual contexts and entities derived from a knowledge base. |
150 | Interactive Learning of Grounded Verb Semantics towards Human-Robot Communication | Lanbo She, Joyce Chai | To address this limitation, this paper presents a new interactive learning approach that allows robots to proactively engage in interaction with human partners by asking good questions to learn models for grounded verb semantics. |
151 | Multimodal Word Distributions | Ben Athiwaratkun, Andrew Wilson | To learn these distributions, we propose an energy-based max-margin objective. |
152 | Enhanced LSTM for Natural Language Inference | Qian Chen, Xiaodan Zhu, Zhen-Hua Ling, Si Wei, Hui Jiang, Diana Inkpen | In this paper, we present a new state-of-the-art result, achieving an accuracy of 88.6% on the Stanford Natural Language Inference Dataset. |
153 | Linguistic analysis of differences in portrayal of movie characters | Anil Ramakrishna, Victor R. Martínez, Nikolaos Malandrakis, Karan Singla, Shrikanth Narayanan | We examine differences in portrayal of characters in movies using psycholinguistic and graph theoretic measures computed directly from screenplays. |
154 | Linguistically Regularized LSTM for Sentiment Classification | Qiao Qian, Minlie Huang, Jinhao Lei, Xiaoyan Zhu | In this paper, we propose simple models trained with sentence-level annotation, but also attempt to model the linguistic role of sentiment lexicons, negation words, and intensity words. |
155 | Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation | Lotem Peled, Roi Reichart | In this paper we present the novel task of sarcasm interpretation, defined as the generation of a non-sarcastic utterance conveying the same message as the original sarcastic one. We introduce a novel dataset of 3,000 sarcastic tweets, each interpreted by five human judges. |
156 | Active Sentiment Domain Adaptation | Fangzhao Wu, Yongfeng Huang, Jun Yan | In this paper, we propose an active sentiment domain adaptation approach to handle this problem. |
157 | Volatility Prediction using Financial Disclosures Sentiments with Word Embedding-based IR Models | Navid Rekabsaz, Mihai Lupu, Artem Baklanov, Alexander Dür, Linda Andersson, Allan Hanbury | We therefore study different fusion methods to combine text and market data resources. |
158 | CANE: Context-Aware Network Embedding for Relation Modeling | Cunchao Tu, Han Liu, Zhiyuan Liu, Maosong Sun | In this paper, we assume that one vertex usually shows different aspects when interacting with different neighbor vertices, and should therefore have a distinct embedding for each. |
159 | Universal Dependencies Parsing for Colloquial Singaporean English | Hongmin Wang, Yue Zhang, GuangYong Leonard Chan, Jie Yang, Hai Leong Chieu | We make both our annotation and parser available for further research. |
160 | Generic Axiomatization of Families of Noncrossing Graphs in Dependency Parsing | Anssi Yli-Jyrä, Carlos Gómez-Rodríguez | We present a simple encoding for unlabeled noncrossing graphs and show how its latent counterpart helps us to represent several families of directed and undirected graphs used in syntactic and semantic parsing of natural language as context-free languages. |
161 | Semi-supervised sequence tagging with bidirectional language models | Matthew Peters, Waleed Ammar, Chandra Bhagavatula, Russell Power | In this paper, we demonstrate a general semi-supervised approach for adding pretrained context embeddings from bidirectional language models to NLP systems and apply it to sequence labeling tasks. |
162 | Learning Symmetric Collaborative Dialogue Agents with Dynamic Knowledge Graph Embeddings | He He, Anusha Balakrishnan, Mihail Eric, Percy Liang | To model both structured knowledge and unstructured language, we propose a neural model with dynamic knowledge graph embeddings that evolve as the dialogue progresses. We collected a dataset of 11K human-human dialogues, which exhibits interesting lexical, semantic, and strategic elements. |
163 | Neural Belief Tracker: Data-Driven Dialogue State Tracking | Nikola Mrkšić, Diarmuid Ó Séaghdha, Tsung-Hsien Wen, Blaise Thomson, Steve Young | We propose a novel Neural Belief Tracking (NBT) framework which overcomes these problems by building on recent advances in representation learning. |
164 | Exploiting Argument Information to Improve Event Detection via Supervised Attention Mechanisms | Shulin Liu, Yubo Chen, Kang Liu, Jun Zhao | In this work, we propose to exploit argument information explicitly for ED via supervised attention mechanisms. |
165 | Topical Coherence in LDA-based Models through Induced Segmentation | Hesam Amoualian, Wei Lu, Eric Gaussier, Georgios Balikas, Massih R. Amini, Marianne Clausel | This paper presents an LDA-based model that generates topically coherent segments within documents by jointly segmenting documents and assigning topics to their words. |
166 | Jointly Extracting Relations with Class Ties via Effective Deep Ranking | Hai Ye, Wenhan Chao, Zhunchen Luo, Zhoujun Li | In this work, to effectively leverage class ties, we propose to make joint relation extraction with a unified model that integrates convolutional neural network (CNN) with a general pairwise ranking framework, in which three novel ranking loss functions are introduced. |
167 | Search-based Neural Structured Learning for Sequential Question Answering | Mohit Iyyer, Wen-tau Yih, Ming-Wei Chang | To solve this sequential question answering task, we propose a novel dynamic neural semantic parsing framework trained using a weakly supervised reward-guided search. We collect a dataset of 6,066 question sequences that inquire about semi-structured tables from Wikipedia, with 17,553 question-answer pairs in total. |
168 | Gated-Attention Readers for Text Comprehension | Bhuwan Dhingra, Hanxiao Liu, Zhilin Yang, William Cohen, Ruslan Salakhutdinov | In this paper we study the problem of answering cloze-style questions over documents. |
169 | Determining Gains Acquired from Word Embedding Quantitatively Using Discrete Distribution Clustering | Jianbo Ye, Yanran Li, Zhaohui Wu, James Z. Wang, Wenjie Li, Jia Li | In this paper, we propose a new document clustering approach by combining any word embedding with a state-of-the-art algorithm for clustering empirical distributions. |
170 | Towards a Seamless Integration of Word Senses into Downstream NLP Applications | Mohammad Taher Pilehvar, Jose Camacho-Collados, Roberto Navigli, Nigel Collier | By incorporating a novel disambiguation algorithm into a state-of-the-art classification model, we create a pipeline to integrate sense-level information into downstream NLP applications. |
171 | Reading Wikipedia to Answer Open-Domain Questions | Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes | This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. |
172 | Learning to Skim Text | Adams Wei Yu, Hongrae Lee, Quoc Le | In this paper, we present an approach to reading text while skipping irrelevant information if needed. |
173 | An Algebra for Feature Extraction | Vivek Srikumar | In this paper, we formalize feature extraction from an algebraic perspective. |
174 | Chunk-based Decoder for Neural Machine Translation | Shonosuke Ishiwatari, Jingtao Yao, Shujie Liu, Mu Li, Ming Zhou, Naoki Yoshinaga, Masaru Kitsuregawa, Weijia Jia | In this paper, we propose chunk-based decoders for neural machine translation (NMT), each of which consists of a chunk-level decoder and a word-level decoder. |
175 | Doubly-Attentive Decoder for Multi-modal Neural Machine Translation | Iacer Calixto, Qun Liu, Nick Campbell | We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation. |
176 | A Teacher-Student Framework for Zero-Resource Neural Machine Translation | Yun Chen, Yang Liu, Yong Cheng, Victor O.K. Li | In this paper, we propose a method for zero-resource NMT by assuming that parallel sentences have close probabilities of generating a sentence in a third language. |
177 | Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder | Huadong Chen, Shujian Huang, David Chiang, Jiajun Chen | In this paper, we improve this model by explicitly incorporating source-side syntactic trees. |
178 | Cross-lingual Name Tagging and Linking for 282 Languages | Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, Heng Ji | The ambitious goal of this work is to develop a cross-lingual name tagging and linking framework for 282 languages that exist in Wikipedia. |
179 | Adversarial Training for Unsupervised Bilingual Lexicon Induction | Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun | In this work, we show that such cross-lingual connection can actually be established without any form of supervision. |
180 | Estimating Code-Switching on Twitter with a Novel Generalized Word-Level Language Detection Technique | Shruti Rijhwani, Royal Sequiera, Monojit Choudhury, Kalika Bali, Chandra Shekhar Maddila | We present a novel unsupervised word-level language detection technique for code-switched text for an arbitrarily large number of languages, which does not require any manually annotated training data. |
181 | Using Global Constraints and Reranking to Improve Cognates Detection | Michael Bloodgood, Benjamin Strauss | We propose methods for using global constraints by performing rescoring of the score matrices produced by state-of-the-art cognates detection systems. |
182 | One-Shot Neural Cross-Lingual Transfer for Paradigm Completion | Katharina Kann, Ryan Cotterell, Hinrich Schütze | We present a novel cross-lingual transfer method for paradigm completion, the task of mapping a lemma to its inflected forms, using a neural encoder-decoder model, the state of the art for the monolingual task. |
183 | Morphological Inflection Generation with Hard Monotonic Attention | Roee Aharoni, Yoav Goldberg | We present a neural model for morphological inflection generation which employs a hard attention mechanism, inspired by the nearly-monotonic alignment commonly found between the characters in a word and the characters in its inflection. |
184 | From Characters to Words to in Between: Do We Capture Morphology? | Clara Vania, Adam Lopez | On a language modeling task, we present experiments that systematically vary (1) the basic unit of representation, (2) the composition of these representations, and (3) the morphological typology of the language modeled. |
185 | Riemannian Optimization for Skip-Gram Negative Sampling | Alexander Fonarev, Oleksii Grinchuk, Gleb Gusev, Pavel Serdyukov, Ivan Oseledets | In this paper, we propose an algorithm that optimizes the SGNS objective using Riemannian optimization, and demonstrate its superiority over popular competitors, such as the original method to train SGNS and SVD over the SPPMI matrix. |
186 | Deep Multitask Learning for Semantic Dependency Parsing | Hao Peng, Sam Thomson, Noah A. Smith | We present a deep neural architecture that parses sentences into three semantic dependency graph formalisms. |
187 | Improved Word Representation Learning with Sememes | Yilin Niu, Ruobing Xie, Zhiyuan Liu, Maosong Sun | In this paper, we show that word sememe information can improve word representation learning (WRL), which maps words into a low-dimensional semantic space and serves as a fundamental step for many NLP tasks. |
188 | Learning Character-level Compositionality with Visual Features | Frederick Liu, Han Lu, Chieh Lo, Graham Neubig | In this paper, we model this effect by creating embeddings for characters based on their visual characteristics, creating an image for the character and running it through a convolutional neural network to produce a visual character embedding. |
189 | A Progressive Learning Approach to Chinese SRL Using Heterogeneous Data | Qiaolin Xia, Lei Sha, Baobao Chang, Zhifang Sui | In this paper, we focus mainly on the latter, that is, to improve Chinese SRL by using heterogeneous corpora together. We also release a new corpus, Chinese SemBank, for Chinese SRL. |
190 | Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings | John Wieting, Kevin Gimpel | We consider the problem of learning general-purpose, paraphrastic sentence embeddings, revisiting the setting of Wieting et al. (2016b). |
191 | Ontology-Aware Token Embeddings for Prepositional Phrase Attachment | Pradeep Dasigi, Waleed Ammar, Chris Dyer, Eduard Hovy | We use the new, context-sensitive embeddings in a model for predicting prepositional phrase (PP) attachments and jointly learn the concept embeddings and model parameters. |
192 | Identifying 1950s American Jazz Musicians: Fine-Grained IsA Extraction via Modifier Composition | Ellie Pavlick, Marius Paşca | We present a method for populating fine-grained classes (e.g., “1950s American jazz musicians”) with instances (e.g., Charles Mingus ). |
193 | Parsing to 1-Endpoint-Crossing, Pagenumber-2 Graphs | Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan | Our main contribution is an exact algorithm that obtains maximum subgraphs satisfying both restrictions simultaneously in time O(n⁵). |
194 | Semi-supervised Multitask Learning for Sequence Labeling | Marek Rei | We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. |
195 | Semantic Parsing of Pre-university Math Problems | Takuya Matsuzaki, Takumi Ito, Hidenao Iwane, Hirokazu Anai, Noriko H. Arai | We have been developing an end-to-end math problem solving system that accepts natural language input. |
TABLE 2: ACL 2017 Short Papers
Title | Authors | Highlight | |
---|---|---|---|
1 | Classifying Temporal Relations by Bidirectional LSTM over Dependency Paths | Fei Cheng, Yusuke Miyao | In this work, we borrow a state-of-the-art method in relation extraction by adopting bidirectional long short-term memory (Bi-LSTM) along dependency paths (DP). |
2 | AMR-to-text Generation with Synchronous Node Replacement Grammar | Linfeng Song, Xiaochang Peng, Yue Zhang, Zhiguo Wang, Daniel Gildea | This paper addresses the task of AMR-to-text generation by leveraging synchronous node replacement grammar. |
3 | Lexical Features in Coreference Resolution: To be Used With Caution | Nafise Sadat Moosavi, Michael Strube | In this paper we investigate a drawback of using many lexical features in state-of-the-art coreference resolvers. |
4 | Alternative Objective Functions for Training MT Evaluation Metrics | Miloš Stanojević, Khalil Sima’an | Subsequently we propose a model trained to optimize both objectives simultaneously and show that it is far more stable than, and on average outperforms, both models on both objectives. |
5 | A Principled Framework for Evaluating Summarizers: Comparing Models of Summary Quality against Human Judgments | Maxime Peyrard, Judith Eckle-Kohler | We present a new framework for evaluating extractive summarizers, which is based on a principled representation as an optimization problem. |
6 | Vector space models for evaluating semantic fluency in autism | Emily Prud’hommeaux, Jan van Santen, Douglas Gliner | In this paper, we explore automated approaches for scoring semantic fluency responses that leverage ontological resources and distributional semantic models to characterize the semantic fluency responses produced by young children with and without ASD. |
7 | Neural Architectures for Multilingual Semantic Parsing | Raymond Hendy Susanto, Wei Lu | In this paper, we address semantic parsing in a multilingual context. |
8 | Incorporating Uncertainty into Deep Learning for Spoken Language Assessment | Andrey Malinin, Anton Ragni, Kate Knill, Mark Gales | This paper proposes a novel method to yield uncertainty and compares it to GPs and DNNs with MCD. |
9 | Incorporating Dialectal Variability for Socially Equitable Language Identification | David Jurgens, Yulia Tsvetkov, Dan Jurafsky | We propose a new dataset and a character-based sequence-to-sequence model for LID designed to support dialectal and multilingual language varieties. |
10 | Evaluating Compound Splitters Extrinsically with Textual Entailment | Glorianna Jagfeld, Patrick Ziering, Lonneke van der Plas | We explore a novel way for the extrinsic evaluation of compound splitters, namely recognizing textual entailment. |
11 | An Analysis of Action Recognition Datasets for Language and Vision Tasks | Spandana Gella, Frank Keller | In this survey, we categorize the existing approaches based on how they conceptualize this problem and provide a detailed review of existing datasets, highlighting their diversity as well as advantages and disadvantages. |
12 | Learning to Parse and Translate Improves Neural Machine Translation | Akiko Eriguchi, Yoshimasa Tsuruoka, Kyunghyun Cho | In this paper, we propose a hybrid model, called NMT+RNNG, that learns to parse and translate by combining the recurrent neural network grammar into the attention-based neural machine translation model. |
13 | On the Distribution of Lexical Features at Multiple Levels of Analysis | Fatemeh Almodaresi, Lyle Ungar, Vivek Kulkarni, Mohsen Zakeri, Salvatore Giorgi, H. Andrew Schwartz | In this paper, we empirically characterize various lexical distributions at different levels of analysis, showing that, while most features are decidedly sparse and non-normal at the message-level (as with traditional NLP), they follow the central limit theorem to become much more Log-normal or even Normal at the user- and county-levels. |
14 | Exploring Neural Text Simplification Models | Sergiu Nisioi, Sanja Štajner, Simone Paolo Ponzetto, Liviu P. Dinu | We present the first attempt at using sequence to sequence neural networks to model text simplification (TS). |
15 | On the Challenges of Translating NLP Research into Commercial Products | Daniel Dahlmeier | This paper highlights challenges in industrial research related to translating research in natural language processing into commercial products. |
16 | Sentence Alignment Methods for Improving Text Simplification Systems | Sanja Štajner, Marc Franco-Salvador, Simone Paolo Ponzetto, Paolo Rosso, Heiner Stuckenschmidt | We provide several methods for sentence-alignment of texts with different complexity levels. |
17 | Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection | Youxuan Jiang, Jonathan K. Kummerfeld, Walter S. Lasecki | In this paper, we present the first systematic study of the key factors in crowdsourcing paraphrase collection. |
18 | Arc-swift: A Novel Transition System for Dependency Parsing | Peng Qi, Christopher D. Manning | This paper proposes a novel transition system, arc-swift, that enables direct attachments between tokens farther apart with a single transition. |
19 | A Generative Parser with a Discriminative Recognition Algorithm | Jianpeng Cheng, Adam Lopez, Mirella Lapata | We propose a framework for parsing and language modeling which marries a generative model with a discriminative recognition model in an encoder-decoder setting. |
20 | Hybrid Neural Network Alignment and Lexicon Model in Direct HMM for Statistical Machine Translation | Weiyue Wang, Tamer Alkhouli, Derui Zhu, Hermann Ney | This work proposes a direct HMM with neural network-based lexicon and alignment models, which are trained jointly using the Baum-Welch algorithm. |
21 | Towards String-To-Tree Neural Machine Translation | Roee Aharoni, Yoav Goldberg | We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees. |
22 | Learning Lexico-Functional Patterns for First-Person Affect | Lena Reed, Jiaqi Wu, Shereen Oraby, Pranav Anand, Marilyn Walker | We present a method to learn proxies for these functions from first-person narratives. We construct a novel fine-grained test set, and show that the patterns we learn improve our ability to predict first-person affective reactions to everyday events, from a Stanford sentiment baseline of .67F to .75F. |
23 | Lifelong Learning CRF for Supervised Aspect Extraction | Lei Shu, Hu Xu, Bing Liu | This paper makes a focused contribution to supervised aspect extraction. |
24 | Exploiting Domain Knowledge via Grouped Weight Sharing with Application to Text Categorization | Ye Zhang, Matthew Lease, Byron C. Wallace | We propose a general, novel method for exploiting such resources via weight sharing. |
25 | Improving Neural Parsing by Disentangling Model Combination and Reranking Effects | Daniel Fried, Mitchell Stern, Dan Klein | Recent work has proposed several generative neural models for constituency parsing that achieve state-of-the-art results. |
26 | Information-Theory Interpretation of the Skip-Gram Negative-Sampling Objective Function | Oren Melamud, Jacob Goldberger | In this paper we define a measure of dependency between two random variables, based on the Jensen-Shannon (JS) divergence between their joint distribution and the product of their marginal distributions. |
27 | Implicitly-Defined Neural Networks for Sequence Labeling | Michaeel Kazi, Brian Thompson | In this work, we propose a novel, implicitly-defined neural network architecture and describe a method to compute its components. |
28 | The Role of Prosody and Speech Register in Word Segmentation: A Computational Modelling Perspective | Bogdan Ludusan, Reiko Mazuka, Mathieu Bernard, Alejandrina Cristia, Emmanuel Dupoux | Since these two factors are thought to play an important role in early language acquisition, we aim to quantify their contribution for this task. |
29 | A Two-Stage Parsing Method for Text-Level Discourse Analysis | Yizhong Wang, Sujian Li, Houfeng Wang | Previous work introduced transition-based algorithms to form a unified architecture of parsing rhetorical structures (including span, nuclearity and relation), but did not achieve satisfactory performance. |
30 | Error-repair Dependency Parsing for Ungrammatical Texts | Keisuke Sakaguchi, Matt Post, Benjamin Van Durme | We propose a new dependency parsing scheme which jointly parses a sentence and repairs grammatical errors by extending the non-directional transition-based formalism of Goldberg and Elhadad (2010) with three additional actions: SUBSTITUTE, DELETE, INSERT. |
31 | Attention Strategies for Multi-Source Sequence-to-Sequence Learning | Jindřich Libovický, Jindřich Helcl | We propose two novel approaches to combine the outputs of attention mechanisms over each source sequence, flat and hierarchical. |
32 | Understanding and Detecting Supporting Arguments of Diverse Types | Xinyu Hua, Lu Wang | We investigate the problem of sentence-level supporting argument detection from relevant documents for user-specified claims. |
33 | A Neural Model for User Geolocation and Lexical Dialectology | Afshin Rahimi, Trevor Cohn, Timothy Baldwin | We propose a simple yet effective text-based user geolocation model based on a neural network with one hidden layer, which achieves state of the art performance over three Twitter benchmark geolocation datasets, in addition to producing word and phrase embeddings in the hidden layer that we show to be useful for detecting dialectal terms. |
34 | A Corpus of Natural Language for Visual Reasoning | Alane Suhr, Mike Lewis, James Yeh, Yoav Artzi | We present a new visual reasoning language dataset, containing 92,244 pairs of examples of natural statements grounded in synthetic images, with 3,962 unique sentences. We also describe our method of crowdsourcing linguistically-diverse data and present an analysis of the data.
35 | Neural Architecture for Temporal Relation Extraction: A Bi-LSTM Approach for Detecting Narrative Containers | Julien Tourille, Olivier Ferret, Aurélie Névéol, Xavier Tannier | We present a neural architecture for containment relation identification between medical events and/or temporal expressions. |
36 | How to Make Context More Useful? An Empirical Study on Context-Aware Neural Conversational Models | Zhiliang Tian, Rui Yan, Lili Mou, Yiping Song, Yansong Feng, Dongyan Zhao | In this paper, we conduct an empirical study to compare various models and investigate the effect of context information in dialog systems. |
37 | Cross-lingual and cross-domain discourse segmentation of entire documents | Chloé Braud, Ophélie Lacroix, Anders Søgaard | In this paper, we propose statistical discourse segmenters for five languages and three domains that do not rely on gold pre-annotations. |
38 | Detecting Good Arguments in a Non-Topic-Specific Way: An Oxymoron? | Beata Beigman Klebanov, Binod Gyawali, Yi Song | We investigate the extent to which it is possible to close the performance gap between topic-specific and across-topics models for identification of good arguments. |
39 | Argumentation Quality Assessment: Theory vs. Practice | Henning Wachsmuth, Nona Naderi, Ivan Habernal, Yufang Hou, Graeme Hirst, Iryna Gurevych, Benno Stein | We find that most observations on quality phrased spontaneously are in fact adequately represented by theory. |
40 | A Recurrent Neural Model with Attention for the Recognition of Chinese Implicit Discourse Relations | Samuel Rönnqvist, Niko Schenk, Christian Chiarcos | We introduce an attention-based Bi-LSTM for Chinese implicit discourse relations and demonstrate that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches. |
41 | Discourse Annotation of Non-native Spontaneous Spoken Responses Using the Rhetorical Structure Theory Framework | Xinhao Wang, James Bruno, Hillary Molloy, Keelan Evanini, Klaus Zechner | Considering that the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spoken language, we initiated a research effort to obtain RST annotations of a large number of non-native spoken responses from a standardized assessment of academic English proficiency. |
42 | Improving Implicit Discourse Relation Recognition with Discourse-specific Word Embeddings | Changxing Wu, Xiaodong Shi, Yidong Chen, Jinsong Su, Boli Wang | We introduce a simple and effective method to learn discourse-specific word embeddings (DSWE) for implicit discourse relation recognition. |
43 | Oracle Summaries of Compressive Summarization | Tsutomu Hirao, Masaaki Nishino, Masaaki Nagata | This paper derives an Integer Linear Programming (ILP) formulation to obtain an oracle summary of the compressive summarization paradigm in terms of ROUGE. |
44 | Japanese Sentence Compression with a Large Training Dataset | Shun Hasegawa, Yuta Kikuchi, Hiroya Takamura, Manabu Okumura | In English, high-quality sentence compression models that delete words have been trained on large, automatically created training datasets.
45 | A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes | Pablo Loyola, Edison Marrese-Taylor, Yutaka Matsuo | We propose a model to automatically describe changes introduced in the source code of a program using natural language. |
46 | English Event Detection With Translated Language Features | Sam Wei, Igor Korostil, Joel Nothman, Ben Hachey | We propose novel radical features from automatic translation for event extraction. |
47 | EviNets: Neural Networks for Combining Evidence Signals for Factoid Question Answering | Denis Savenkov, Eugene Agichtein | This paper proposes EviNets: a novel neural network architecture for factoid question answering. |
48 | Pocket Knowledge Base Population | Travis Wolfe, Mark Dredze, Benjamin Van Durme | We describe novel Open Information Extraction methods which leverage the PKB to find informative trigger words. |
49 | Answering Complex Questions Using Open Information Extraction | Tushar Khot, Ashish Sabharwal, Peter Clark | We overcome this limitation by presenting a method for reasoning with Open IE knowledge, allowing more complex questions to be handled. |
50 | Bootstrapping for Numerical Open IE | Swarnadeep Saha, Harinder Pal, Mausam | We design and release BONIE, the first open numerical relation extractor, for extracting Open IE tuples where one of the arguments is a number or a quantity-unit phrase. |
51 | Feature-Rich Networks for Knowledge Base Completion | Alexandros Komninos, Suresh Manandhar | We propose jointly modelling Knowledge Bases and aligned text with Feature-Rich Networks. |
52 | Fine-Grained Entity Typing with High-Multiplicity Assignments | Maxim Rabinovich, Dan Klein | In this paper, we consider the high-multiplicity regime inherent in data sources such as Wikipedia that have semi-open type systems. |
53 | Group Sparse CNNs for Question Classification with Answer Sets | Mingbo Ma, Liang Huang, Bing Xiang, Bowen Zhou | Group Sparse CNNs for Question Classification with Answer Sets |
54 | Multi-Task Learning of Keyphrase Boundary Classification | Isabelle Augenstein, Anders Søgaard | To overcome this, we explore several auxiliary tasks, including semantic super-sense tagging and identification of multi-word expressions, and cast the task as a multi-task learning problem with deep recurrent neural networks. |
55 | Cardinal Virtues: Extracting Relation Cardinalities from Text | Paramita Mirza, Simon Razniewski, Fariz Darari, Gerhard Weikum | We present a distant supervision method using conditional random fields. We introduce this novel problem of extracting cardinalities and discuss the specific challenges that set it apart from standard IE. |
56 | Integrating Deep Linguistic Features in Factuality Prediction over Unified Datasets | Gabriel Stanovsky, Judith Eckle-Kohler, Yevgeniy Puzikov, Ido Dagan, Iryna Gurevych | In this work we propose an intuitive method for mapping three previously annotated corpora onto a single factuality scale, thereby enabling models to be tested across these corpora. We make both the unified factuality corpus and our new model publicly available. |
57 | Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks | Rajarshi Das, Manzil Zaheer, Siva Reddy, Andrew McCallum | In this paper we extend universal schema to natural language question answering, employing Memory networks to attend to the large body of facts in the combination of text and KB. |
58 | Differentiable Scheduled Sampling for Credit Assignment | Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick | By incorporating this approximation into the scheduled sampling training procedure (a well-known technique for correcting exposure bias), we introduce a new training objective that is continuous and differentiable everywhere and can provide informative gradients near points where previous decoding decisions change their value.
59 | A Deep Network with Visual Text Composition Behavior | Hongyu Guo | We propose a deep network, which not only achieves competitive accuracy for text classification, but also exhibits compositional behavior. |
60 | Neural System Combination for Machine Translation | Long Zhou, Wenpeng Hu, Jiajun Zhang, Chengqing Zong | In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. |
61 | An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation | Chenhui Chu, Raj Dabre, Sadao Kurohashi | In this paper, we propose a novel domain adaptation method named “mixed fine tuning” for neural machine translation (NMT). |
62 | Efficient Extraction of Pseudo-Parallel Sentences from Raw Monolingual Data Using Word Embeddings | Benjamin Marie, Atsushi Fujita | We propose a new method for extracting pseudo-parallel sentences from a pair of large monolingual corpora, without relying on any document-level information. |
63 | Feature Hashing for Language and Dialect Identification | Shervin Malmasi, Mark Dras | We evaluate feature hashing for language identification (LID), a method not previously used for this task. |
64 | Detection of Chinese Word Usage Errors for Non-Native Chinese Learners with Bidirectional LSTM | Yow-Ting Shiue, Hen-Hsen Huang, Hsin-Hsi Chen | In this paper, we propose (bidirectional) LSTM sequence labeling models and explore various features to detect word usage errors in Chinese sentences. |
65 | Automatic Compositor Attribution in the First Folio of Shakespeare | Maria Ryskina, Hannah Alpert-Abrams, Dan Garrette, Taylor Berg-Kirkpatrick | In this paper, we introduce a novel unsupervised model that jointly describes the textual and visual features needed to distinguish compositors. |
66 | STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset | Yuya Yoshikawa, Yutaro Shigeto, Akikazu Takeuchi | In this paper, we particularly consider generating Japanese captions for images. To tackle this problem, we construct a large-scale Japanese image caption dataset based on images from MS-COCO, which is called STAIR Captions. |
67 | “Liar, Liar Pants on Fire”: A New Benchmark Dataset for Fake News Detection | William Yang Wang | In this paper, we present LIAR: a new, publicly available dataset for fake news detection. |
68 | English Multiword Expression-aware Dependency Parsing Including Named Entities | Akihiko Kato, Hiroyuki Shindo, Yuji Matsumoto | In this work, we construct a corpus that ensures consistency between dependency structures and MWEs, including named entities. |
69 | Improving Semantic Composition with Offset Inference | Thomas Kober, Julie Weeds, Jeremy Reffin, David Weir | We therefore introduce a novel form of distributional inference that exploits the rich type structure in APTs and infers missing data by the same mechanism that is used for semantic composition. |
70 | Learning Topic-Sensitive Word Representations | Marzieh Fadaee, Arianna Bisazza, Christof Monz | We present two approaches to learn multiple topic-sensitive representations per word by using the Hierarchical Dirichlet Process.
71 | Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings | Terrence Szymanski | This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. |
72 | Methodical Evaluation of Arabic Word Embeddings | Mohammed Elrazzaz, Shady Elbassuoni, Khaled Shaban, Chadi Helwe | In this study, we evaluate these various techniques when used to generate Arabic word embeddings. We first build a benchmark for the Arabic language that can be utilized to perform intrinsic evaluation of different word embeddings. |
73 | Multilingual Connotation Frames: A Case Study on Social Media for Targeted Sentiment Analysis and Forecast | Hannah Rashkin, Eric Bell, Yejin Choi, Svitlana Volkova | To study targeted public sentiments across many languages and geographic locations, we introduce multilingual connotation frames: an extension from English connotation frames of Rashkin et al. (2016) with 10 additional European languages, focusing on the implied sentiments among event participants engaged in a frame. |
74 | Best-Worst Scaling More Reliable than Rating Scales: A Case Study on Sentiment Intensity Annotation | Svetlana Kiritchenko, Saif Mohammad | Here for the first time, we set up an experiment that directly compares the rating scale method with BWS. |
75 | Demographic Inference on Twitter using Recursive Neural Networks | Sunghwan Mac Kim, Qiongkai Xu, Lizhen Qu, Stephen Wan, Cécile Paris | In this work, we employ recursive neural networks to break down these independence assumptions to obtain inference about demographic characteristics on Twitter. |
76 | Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning | Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy | In this paper, we present a demographic classifier for gender, age, political orientation and location on Twitter. We collected and curated a robust Twitter demographic dataset for this task. |
77 | A Network Framework for Noisy Label Aggregation in Social Media | Xueying Zhan, Yaowei Wang, Yanghui Rao, Haoran Xie, Qing Li, Fu Lee Wang, Tak-Lam Wong | To aggregate noisy labels at a small cost, we propose a network framework that calculates the matching degree between a document’s topics and the annotators’ meta-data.
78 | Parser Adaptation for Social Media by Integrating Normalization | Rob van der Goot, Gertjan van Noord | This work explores different approaches of using normalization for parser adaptation. |
79 | AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine | Minghui Qiu, Feng-Lin Li, Siyu Wang, Xing Gao, Yan Chen, Weipeng Zhao, Haiqing Chen, Jun Huang, Wei Chu | We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of Information Retrieval (IR) and Sequence to Sequence (Seq2Seq) based generation models. |
80 | A Conditional Variational Framework for Dialog Generation | Xiaoyu Shen, Hui Su, Yanran Li, Wenjie Li, Shuzi Niu, Yang Zhao, Akiko Aizawa, Guoping Long | In this paper, we propose a framework allowing conditional response generation based on specific attributes. |
81 | Question Answering through Transfer Learning from Large Fine-grained Supervision Data | Sewon Min, Minjoon Seo, Hannaneh Hajishirzi | We show that the task of question answering (QA) can significantly benefit from the transfer learning of models trained on a different large, fine-grained QA dataset. |
82 | Self-Crowdsourcing Training for Relation Extraction | Azad Abad, Moin Nabi, Alessandro Moschitti | In this paper we introduce a self-training strategy for crowdsourcing. |
83 | A Generative Attentional Neural Network Model for Dialogue Act Classification | Quan Hung Tran, Gholamreza Haffari, Ingrid Zukerman | We propose a novel generative neural network architecture for Dialogue Act classification. |
84 | Salience Rank: Efficient Keyphrase Extraction with Topic Modeling | Nedelina Teneva, Weiwei Cheng | In this paper, we propose a modification of TPR, called Salience Rank. |
85 | List-only Entity Linking | Ying Lin, Chin-Yew Lin, Heng Ji | In this work, we select most linkable mentions as seed mentions and disambiguate other mentions by comparing them with the seed mentions rather than directly with the entities. |
86 | Improving Native Language Identification by Using Spelling Errors | Lingzhen Chen, Carlo Strapparava, Vivi Nastase | In this paper, we explore spelling errors as a source of information for detecting the native language of a writer, a previously under-explored area. |
87 | Disfluency Detection using a Noisy Channel Model and a Deep Neural Language Model | Paria Jamshid Lou, Mark Johnson | This paper presents a model for disfluency detection in spontaneous speech transcripts called the LSTM Noisy Channel Model.
88 | On the Equivalence of Holographic and Complex Embeddings for Link Prediction | Katsuhiko Hayashi, Masashi Shimbo | We show the equivalence of two state-of-the-art models for link prediction/knowledge graph completion: Nickel et al.’s holographic embeddings and Trouillon et al.’s complex embeddings.
89 | Sentence Embedding for Neural Machine Translation Domain Adaptation | Rui Wang, Andrew Finch, Masao Utiyama, Eiichiro Sumita | In this paper, we exploit the NMT’s internal embedding of the source sentence and use the sentence embedding similarity to select the sentences which are close to in-domain data. |
90 | Data Augmentation for Low-Resource Neural Machine Translation | Marzieh Fadaee, Arianna Bisazza, Christof Monz | Inspired by work in computer vision, we propose a novel data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, synthetically created contexts. |
91 | Speeding Up Neural Machine Translation Decoding by Shrinking Run-time Vocabulary | Xing Shi, Kevin Knight | We speed up Neural Machine Translation (NMT) decoding by shrinking run-time target vocabulary. |
92 | Chunk-Based Bi-Scale Decoder for Neural Machine Translation | Hao Zhou, Zhaopeng Tu, Shujian Huang, Xiaohua Liu, Hang Li, Jiajun Chen | In this paper, we propose a new type of decoder for NMT, which splits the decoding state into two parts and updates them at two different time-scales.
93 | Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary | Meng Fang, Trevor Cohn | We propose a novel neural network model for joint training from both sources of data based on cross-lingual word embeddings, and show substantial empirical improvements over baseline techniques. |
94 | EuroSense: Automatic Harvesting of Multilingual Sense Annotations from Parallel Text | Claudio Delli Bovi, Jose Camacho-Collados, Alessandro Raganato, Roberto Navigli | In this paper we present EuroSense, a multilingual sense-annotated resource based on the joint disambiguation of the Europarl parallel corpus, with almost 123 million sense annotations for over 155 thousand distinct concepts and entities from a language-independent unified sense inventory. |
95 | Challenging Language-Dependent Segmentation for Arabic: An Application to Machine Translation and Part-of-Speech Tagging | Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Ahmed Abdelali, Yonatan Belinkov, Stephan Vogel | In our analysis, we show that a neural machine translation system is sensitive to the ratio of source and target tokens, and a ratio close to 1 or greater gives optimal performance.
96 | Fast and Accurate Neural Word Segmentation for Chinese | Deng Cai, Hai Zhao, Zhisong Zhang, Yuan Xin, Yongjian Wu, Feiyue Huang | In this paper, we propose a greedy neural word segmenter with balanced word and character embedding inputs to alleviate the existing drawbacks. |
97 | Pay Attention to the Ending: Strong Neural Baselines for the ROC Story Cloze Task | Zheng Cai, Lifu Tu, Kevin Gimpel | We develop a model that uses hierarchical recurrent networks with attention to encode the sentences in the story and score candidate endings.
98 | Neural Semantic Parsing over Multiple Knowledge-bases | Jonathan Herzig, Jonathan Berant | In this paper, we propose to exploit structural regularities in language in different domains, and train semantic parsers over multiple knowledge-bases (KBs), while sharing information across datasets. |
99 | Representing Sentences as Low-Rank Subspaces | Jiaqi Mu, Suma Bhat, Pramod Viswanath | We observe a simple geometry of sentences: the word representations of a given sentence (on average 10.23 words across all SemEval datasets, with a standard deviation of 4.84) roughly lie in a low-rank subspace (roughly, rank 4).
100 | Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization | Shuming Ma, Xu Sun, Jingjing Xu, Houfeng Wang, Wenjie Li, Qi Su | In this work, our goal is to improve semantic relevance between source texts and summaries for Chinese social media summarization. |
101 | Determining Whether and When People Participate in the Events They Tweet About | Krishna Chaitanya Sanagavarapu, Alakananda Vempala, Eduardo Blanco | This paper describes an approach to determine whether people participate in the events they tweet about. |
102 | Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter | Svitlana Volkova, Kyle Shaffer, Jin Yea Jang, Nathan Hodas | In this work we build predictive models to classify 130 thousand news posts as suspicious or verified, and predict four sub-types of suspicious news – satire, hoaxes, clickbait and propaganda. |
103 | Recognizing Counterfactual Thinking in Social Media Texts | Youngseo Son, Anneke Buffone, Joe Raso, Allegra Larche, Anthony Janocko, Kevin Zembroski, H Andrew Schwartz, Lyle Ungar | We create a counterfactual tweet dataset and explore approaches for detecting counterfactuals using rule-based and supervised statistical approaches. |
104 | Temporal Orientation of Tweets for Predicting Income of Users | Mohammed Hasanuzzaman, Sabyasachi Kamila, Mandeep Kaur, Sriparna Saha, Asif Ekbal | The current paper presents the first study where user cognitive structure is used to build a predictive model of income. |
105 | Character-Aware Neural Morphological Disambiguation | Alymzhan Toleu, Gulmira Tolegen, Aibek Makazhanov | Guided by the intuition that the correct analysis should be “most similar” to the context, we propose dense representations for morphological analyses and surface context and a simple yet effective way of combining the two to perform disambiguation. |
106 | Character Composition Model with Convolutional Neural Networks for Dependency Parsing on Morphologically Rich Languages | Xiang Yu, Ngoc Thang Vu | We present a transition-based dependency parser that uses a convolutional neural network to compose word representations from characters. |
107 | How (not) to train a dependency parser: The curious case of jackknifing part-of-speech taggers | Željko Agić, Natalie Schluter | On 26 languages, we reveal a preference that conflicts with, and surpasses, the ubiquitous ten-folding.