Paper Digest: ACL 2017 Highlights
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2017, it was held in Vancouver, Canada. There were 1,318 paper submissions, of which 195 were accepted as long papers and 107 as short papers.
To help the AI community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly grasp the main idea of each paper.
We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to receive daily updates on new papers, customized to your own interests.
Paper Digest Team
team@paperdigest.org
TABLE 1: ACL 2017 Long Papers
# | Title | Authors | Highlight
---|---|---|---
1 | Adversarial Multi-task Learning for Text Classification | Pengfei Liu, Xipeng Qiu, Xuanjing Huang | In this paper, we propose an adversarial multi-task learning framework, preventing the shared and private latent feature spaces from interfering with each other. |
2 | Neural End-to-End Learning for Computational Argumentation Mining | Steffen Eger, Johannes Daxenberger, Iryna Gurevych | We investigate neural techniques for end-to-end computational argumentation mining (AM). |
3 | Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision | Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao | In this work, we introduce a Neural Symbolic Machine, which contains (a) a neural “programmer”, i.e., a sequence-to-sequence model that maps language utterances to programs and utilizes a key-variable memory to handle compositionality (b) a symbolic “computer”, i.e., a Lisp interpreter that performs program execution, and helps find good programs by pruning the search space. |
4 | Neural Relation Extraction with Multi-lingual Attention | Yankai Lin, Zhiyuan Liu, Maosong Sun | To address this issue, we introduce a multi-lingual neural relation extraction framework, which employs mono-lingual attention to utilize the information within mono-lingual texts and further proposes cross-lingual attention to consider the information consistency and complementarity among cross-lingual texts. |
5 | Learning Structured Natural Language Representations for Semantic Parsing | Jianpeng Cheng, Siva Reddy, Vijay Saraswat, Mirella Lapata | We introduce a neural semantic parser which is interpretable and scalable. |
6 | Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules | Ivan Vulić, Nikola Mrkšić, Roi Reichart, Diarmuid Ó Séaghdha, Steve Young, Anna Korhonen | In this work, we propose a novel morph-fitting procedure which moves past the use of curated semantic lexicons for improving distributional vector spaces. |
7 | Skip-Gram − Zipf + Uniform = Vector Additivity | Alex Gittens, Dimitris Achlioptas, Michael W. Mahoney | When these assumptions do not hold, this work describes the correct non-linear composition operator. |
8 | The State of the Art in Semantic Representation | Omri Abend, Ari Rappoport | Yet, little has been done to assess the achievements and the shortcomings of these new contenders, compare them with syntactic schemes, and clarify the general goals of research on semantic representation. |
9 | Joint Learning for Event Coreference Resolution | Jing Lu, Vincent Ng | To address this problem, we propose a model for jointly learning event coreference, trigger detection, and event anaphoricity. |
10 | Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution | Ting Liu, Yiming Cui, Qingyu Yin, Wei-Nan Zhang, Shijin Wang, Guoping Hu | To alleviate the problem above, in this paper, we propose a simple but novel approach to automatically generate large-scale pseudo training data for zero pronoun resolution. |
11 | Discourse Mode Identification in Essays | Wei Song, Dong Wang, Ruiji Fu, Lizhen Liu, Ting Liu, Guoping Hu | We annotate a corpus to study the characteristics of discourse modes and describe a neural sequence labeling model for identification. |
12 | A Convolutional Encoder Model for Neural Machine Translation | Jonas Gehring, Michael Auli, David Grangier, Yann Dauphin | We present a faster and simpler architecture based on a succession of convolutional layers. |
13 | Deep Neural Machine Translation with Linear Associative Unit | Mingxuan Wang, Zhengdong Lu, Jie Zhou, Qun Liu | To address this problem we propose novel linear associative units (LAU) to reduce the gradient propagation path inside the recurrent unit. |
14 | Neural AMR: Sequence-to-Sequence Models for Parsing and Generation | Ioannis Konstas, Srinivasan Iyer, Mark Yatskar, Yejin Choi, Luke Zettlemoyer | We present a novel training procedure that can lift this limitation using millions of unlabeled sentences and careful preprocessing of the AMR graphs. |
15 | Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems | Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom | To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. |
16 | Automatically Generating Rhythmic Verse with Neural Networks | Jack Hopkins, Douwe Kiela | We propose two novel methodologies for the automatic generation of rhythmic poetry in a variety of forms. |
17 | Creating Training Corpora for NLG Micro-Planners | Claire Gardent, Anastasia Shimorina, Shashi Narayan, Laura Perez-Beltrachini | In this paper, we present a novel framework for semi-automatically creating linguistically challenging micro-planning data-to-text corpora from existing Knowledge Bases. |
18 | Gated Self-Matching Networks for Reading Comprehension and Question Answering | Wenhui Wang, Nan Yang, Furu Wei, Baobao Chang, Ming Zhou | In this paper, we present the gated self-matching networks for reading comprehension style question answering, which aims to answer questions from a given passage. |
19 | Generating Natural Answers by Incorporating Copying and Retrieving Mechanisms in Sequence-to-Sequence Learning | Shizhu He, Cao Liu, Kang Liu, Jun Zhao | In this paper, we propose an end-to-end question answering system called COREQA in sequence-to-sequence learning, which incorporates copying and retrieving mechanisms to generate natural answers within an encoder-decoder framework. |
20 | Coarse-to-Fine Question Answering for Long Documents | Eunsol Choi, Daniel Hewlett, Jakob Uszkoreit, Illia Polosukhin, Alexandre Lacoste, Jonathan Berant | We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models. |
21 | An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge | Yanchao Hao, Yuanzhe Zhang, Kang Liu, Shizhu He, Zhanyi Liu, Hua Wu, Jun Zhao | Hence, we present an end-to-end neural network model to represent the questions and their corresponding scores dynamically according to the various candidate answer aspects via cross-attention mechanism. |
22 | Translating Neuralese | Jacob Andreas, Anca Dragan, Dan Klein | Several approaches have recently been proposed for learning decentralized deep multiagent policies that coordinate via a differentiable communication channel. |
23 | Obtaining referential word meanings from visual and distributional information: Experiments on object naming | Sina Zarrieß, David Schlangen | We present a model that learns individual predictors for object names that link visual and distributional aspects of word meaning during training. |
24 | FOIL it! Find One mismatch between Image and Language caption | Ravi Shekhar, Sandro Pezzelle, Yauhen Klimovich, Aurélie Herbelot, Moin Nabi, Enver Sangineto, Raffaella Bernardi | In this paper, we aim to understand whether current language and vision (LaVi) models truly grasp the interaction between the two modalities. |
25 | Verb Physics: Relative Physical Knowledge of Actions and Objects | Maxwell Forbes, Yejin Choi | In this paper, we present an approach to infer relative physical knowledge of actions and objects along five dimensions (e.g., size, weight, and strength) from unstructured natural language text. |
26 | A* CCG Parsing with a Supertag and Dependency Factored Model | Masashi Yoshikawa, Hiroshi Noji, Yuji Matsumoto | We propose a new A* CCG parsing model in which the probability of a tree is decomposed into factors of CCG categories and its syntactic dependencies both defined on bi-directional LSTMs. |
27 | A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing | Daniel Fernández-González, Carlos Gómez-Rodríguez | In this paper, we propose a novel, fully non-monotonic transition system based on the non-projective Covington algorithm. |
28 | Aggregating and Predicting Sequence Labels from Crowd Annotations | An Thanh Nguyen, Byron Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease | To predict sequences in unannotated text, we propose a neural approach using Long Short Term Memory. |
29 | Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction | Chunting Zhou, Graham Neubig | In this paper we propose multi-space variational encoder-decoders, a new model for labeled sequence transduction with semi-supervised learning. |
30 | Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling | Zhe Gan, Chunyuan Li, Changyou Chen, Yunchen Pu, Qinliang Su, Lawrence Carin | This paper leverages recent advances in stochastic gradient Markov Chain Monte Carlo (also appropriate for large training sets) to learn weight uncertainty in RNNs. |
31 | Learning attention for historical text normalization by learning to pronounce | Marcel Bollmann, Joachim Bingel, Anders Søgaard | We analyze the induced models across 44 different texts from Early New High German. |
32 | Deep Learning in Semantic Kernel Spaces | Danilo Croce, Simone Filice, Giuseppe Castellucci, Roberto Basili | In this paper, we show that expressive kernels and deep neural networks can be combined in a common framework in order to (i) explicitly model structured information and (ii) learn non-linear decision functions. |
33 | Topically Driven Neural Language Model | Jey Han Lau, Timothy Baldwin, Trevor Cohn | We present a neural language model that incorporates document context in the form of a topic model-like architecture, thus providing a succinct representation of the broader document context outside of the current sentence. |
34 | Handling Cold-Start Problem in Review Spam Detection by Jointly Embedding Texts and Behaviors | Xuepeng Wang, Kang Liu, Jun Zhao | This paper proposes a novel neural network model to detect review spam under the cold-start problem, by learning to represent new reviewers’ reviews with jointly embedded textual and behavioral information. |
35 | Learning Cognitive Features from Gaze Data for Sentiment and Sarcasm Classification using Convolutional Neural Network | Abhijit Mishra, Kuntal Dey, Pushpak Bhattacharyya | We introduce a framework to automatically extract cognitive features from the eye-movement/gaze data of human readers reading the text and use them as features along with textual features for the tasks of sentiment polarity and sarcasm detection. |
36 | An Unsupervised Neural Attention Model for Aspect Extraction | Ruidan He, Wee Sun Lee, Hwee Tou Ng, Daniel Dahlmeier | In this paper, we present a novel neural approach with the aim of discovering coherent aspects. |
37 | Other Topics You May Also Agree or Disagree: Modeling Inter-Topic Preferences using Tweets and Matrix Factorization | Akira Sasaki, Kazuaki Hanawa, Naoaki Okazaki, Kentaro Inui | We present in this paper our approach for modeling inter-topic preferences of Twitter users: for example, “those who agree with the Trans-Pacific Partnership (TPP) also agree with free trade”. |
38 | Automatically Labeled Data Generation for Large Scale Event Extraction | Yubo Chen, Shulin Liu, Xiang Zhang, Kang Liu, Jun Zhao | To solve the data labeling problem, we propose to automatically label training data for event extraction via world knowledge and linguistic knowledge, which can detect key arguments and trigger words for each event type and employ them to label events in texts automatically. |
39 | Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules | Xiaoshi Zhong, Aixin Sun, Erik Cambria | Based on the findings, we propose a type-based approach, named SynTime, to recognize time expressions. |
40 | Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix | Bingfeng Luo, Yansong Feng, Zheng Wang, Zhanxing Zhu, Songfang Huang, Rui Yan, Dongyan Zhao | In this paper, we take a deep look at the application of distant supervision in relation extraction. |
41 | A Syntactic Neural Model for General-Purpose Code Generation | Pengcheng Yin, Graham Neubig | Informed by previous work in semantic parsing, in this paper we propose a novel neural architecture powered by a grammar model to explicitly capture the target syntax as prior knowledge. |
42 | Learning bilingual word embeddings with (almost) no bilingual data | Mikel Artetxe, Gorka Labaka, Eneko Agirre | Our method exploits the structural similarity of embedding spaces, and works with as little bilingual evidence as a 25 word dictionary or even an automatically generated list of numerals, obtaining results comparable to those of systems that use richer resources. |
43 | Abstract Meaning Representation Parsing using LSTM Recurrent Neural Networks | William Foland, James H. Martin | We present a system which parses sentences into Abstract Meaning Representations, improving state-of-the-art results for this task by more than 5%. |
44 | Deep Semantic Role Labeling: What Works and What’s Next | Luheng He, Kenton Lee, Mike Lewis, Luke Zettlemoyer | We introduce a new deep learning model for semantic role labeling (SRL) that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations. |
45 | Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access | Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng | In this paper, we address this limitation by replacing symbolic queries with an induced “soft” posterior distribution over the KB that indicates which entities the user is interested in. |
46 | Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots | Yu Wu, Wei Wu, Chen Xing, Ming Zhou, Zhoujun Li | We propose a sequential matching network (SMN) to address both problems. |
47 | Learning Word-Like Units from Joint Audio-Visual Analysis | David Harwath, James Glass | Given a collection of images and spoken audio captions, we present a method for discovering word-like acoustic units in the continuous speech signal and grounding them to semantically relevant image regions. |
48 | Joint CTC/attention decoding for end-to-end speech recognition | Takaaki Hori, Shinji Watanabe, John Hershey | This paper proposes a joint decoding algorithm for end-to-end ASR with a hybrid CTC/attention architecture, which effectively utilizes both advantages in decoding. |
49 | Found in Translation: Reconstructing Phylogenetic Language Trees from Translations | Ella Rabinovich, Noam Ordan, Shuly Wintner | We show that traces of the source language remain in the translation product to the extent that it is possible to uncover the history of the source language by looking only at the translation. |
50 | Predicting Native Language from Gaze | Yevgeni Berzak, Chie Nakamura, Suzanne Flynn, Boris Katz | We present a novel methodology for studying this question: analysis of eye-movement patterns in second language reading of free-form text. |
51 | MORSE: Semantic-ally Drive-n MORpheme SEgment-er | Tarek Sakakini, Suma Bhat, Pramod Viswanath | We present in this paper a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset that was created on the basis of compositionality. |
52 | Deep Pyramid Convolutional Neural Networks for Text Categorization | Rie Johnson, Tong Zhang | This paper proposes a low-complexity word-level deep convolutional neural network (CNN) architecture for text categorization that can efficiently represent long-range associations in text. |
53 | Improved Neural Relation Detection for Knowledge Base Question Answering | Mo Yu, Wenpeng Yin, Kazi Saidul Hasan, Cicero dos Santos, Bing Xiang, Bowen Zhou | In this paper, we propose a hierarchical recurrent neural network enhanced by residual learning which detects KB relations given an input question. |
54 | Deep Keyphrase Generation | Rui Meng, Sanqiang Zhao, Shuguang Han, Daqing He, Peter Brusilovsky, Yu Chi | We propose a generative model for keyphrase prediction with an encoder-decoder framework, which can effectively overcome the above drawbacks. |
55 | Attention-over-Attention Neural Networks for Reading Comprehension | Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, Guoping Hu | In this paper, we present a simple but novel model called the attention-over-attention reader for better solving the cloze-style reading comprehension task. |
56 | Alignment at Work: Using Language to Distinguish the Internalization and Self-Regulation Components of Cultural Fit in Organizations | Gabriel Doyle, Amir Goldberg, Sameer Srivastava, Michael Frank | Recent research draws on computational linguistics to measure cultural fit but overlooks asymmetries in cultural adaptation. |
57 | Representations of language in a model of visually grounded speech signal | Grzegorz Chrupała, Lieke Gelderloos, Afra Alishahi | We present a visually grounded model of speech perception which projects spoken utterances and images to a joint semantic space. |
58 | Spectral Analysis of Information Density in Dialogue Predicts Collaborative Task Performance | Yang Xu, David Reitter | We propose a perspective on dialogue that focuses on relative information contributions of conversation partners as a key to successful communication. |
59 | Affect-LM: A Neural Language Model for Customizable Affective Text Generation | Sayan Ghosh, Mathieu Chollet, Eugene Laksana, Louis-Philippe Morency, Stefan Scherer | In this paper, we propose an extension to an LSTM (Long Short-Term Memory) language model for generation of conversational text, conditioned on affect categories. |
60 | Domain Attention with an Ensemble of Experts | Young-Bum Kim, Karl Stratos, Dongchan Kim | We describe a solution based on attending an ensemble of domain experts. |
61 | Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders | Tiancheng Zhao, Ran Zhao, Maxine Eskenazi | Unlike past work that has focused on diversifying the output of the decoder from word-level to alleviate this problem, we present a novel framework based on conditional variational autoencoders that capture the discourse-level diversity in the encoder. |
62 | Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning | Jason D. Williams, Kavosh Asadi, Geoffrey Zweig | We introduce Hybrid Code Networks (HCNs), which combine an RNN with domain-specific knowledge encoded as software and system action templates. |
63 | Generating Contrastive Referring Expressions | Martín Villalba, Christoph Teichmann, Alexander Koller | We present an algorithm for generating corrective REs that use contrastive focus (“no, the BLUE button”) to emphasize the information the hearer most likely misunderstood. |
64 | Modeling Source Syntax for Neural Machine Translation | Junhui Li, Deyi Xiong, Zhaopeng Tu, Muhua Zhu, Min Zhang, Guodong Zhou | On the basis, we propose three different sorts of encoders to incorporate source syntax into NMT: 1) Parallel RNN encoder that learns word and label annotation vectors parallelly; 2) Hierarchical RNN encoder that learns word and label annotation vectors in a two-level hierarchy; and 3) Mixed RNN encoder that stitchingly learns word and label annotation vectors over sequences where words and labels are mixed. |
65 | Sequence-to-Dependency Neural Machine Translation | Shuangzhi Wu, Dongdong Zhang, Nan Yang, Mu Li, Ming Zhou | Inspired by the success of using syntactic knowledge of target language for improving statistical machine translation, in this paper we propose a novel Sequence-to-Dependency Neural Machine Translation (SD-NMT) method, in which the target word sequence and its corresponding dependency structure are jointly constructed and modeled, and this structure is used as context to facilitate word generations. |
66 | Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning | Jing Ma, Wei Gao, Kam-Fai Wong | In this paper, we attempt to address the problem of identifying rumors, i.e., fake information, out of microblog posts based on their propagation structure. |
67 | EmoNet: Fine-Grained Emotion Detection with Gated Recurrent Neural Networks | Muhammad Abdul-Mageed, Lyle Ungar | In this work, we build a very large dataset for fine-grained emotions and develop deep learning models on it. |
68 | Beyond Binary Labels: Political Ideology Prediction of Twitter Users | Daniel Preoţiuc-Pietro, Ye Liu, Daniel Hopkins, Lyle Ungar | Using a novel data set with political ideology labels self-reported through surveys, our goal is two-fold: a) to characterize the groups of politically engaged users through language use on Twitter; b) to build a fine-grained model that predicts political ideology of unseen users. |
69 | Leveraging Behavioral and Social Information for Weakly Supervised Collective Classification of Political Discourse on Twitter | Kristen Johnson, Di Jin, Dan Goldwasser | We present a collection of weakly supervised models which harness collective classification to predict the frames used in political discourse on the microblogging platform, Twitter. |
70 | A Nested Attention Neural Hybrid Model for Grammatical Error Correction | Jianshu Ji, Qinlong Wang, Kristina Toutanova, Yongen Gong, Steven Truong, Jianfeng Gao | A Nested Attention Neural Hybrid Model for Grammatical Error Correction |
71 | TextFlow: A Text Similarity Measure based on Continuous Sequences | Yassine Mrabet, Halil Kilicoglu, Dina Demner-Fushman | In this paper we present a novel text similarity measure inspired from a common representation in DNA sequence alignment algorithms. |
72 | Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts | Chenhao Tan, Dallas Card, Noah A. Smith | Because ideas are naturally embedded in texts, we propose the first framework to systematically characterize the relations between ideas based on their occurrence in a corpus of documents, independent of how these ideas are represented. |
73 | Polish evaluation dataset for compositional distributional semantics models | Alina Wróblewska, Katarzyna Krasnowska-Kieraś | The paper presents a procedure of building an evaluation dataset. |
74 | Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction | Christopher Bryant, Mariano Felice, Ted Briscoe | To overcome this problem, we introduce ERRANT, a grammatical ERRor ANnotation Toolkit designed to automatically extract edits from parallel original and corrected sentences and classify them according to a new, dataset-agnostic, rule-based framework. |
75 | Evaluation Metrics for Machine Reading Comprehension: Prerequisite Skills and Readability | Saku Sugawara, Yusuke Kido, Hikaru Yokono, Akiko Aizawa | In this study, two classes of metrics were adopted for evaluating RC datasets: prerequisite skills and readability. |
76 | A Minimal Span-Based Neural Constituency Parser | Mitchell Stern, Jacob Andreas, Dan Klein | In this work, we present a minimal neural model for constituency parsing based on independent scoring of labels and spans. |
77 | Semantic Dependency Parsing via Book Embedding | Weiwei Sun, Junjie Cao, Xiaojun Wan | To build a semantic graph for a given sentence, we design new Maximum Subgraph algorithms to generate noncrossing graphs on each page, and a Lagrangian Relaxation-based algorithm to combine pages into a book. |
78 | Neural Word Segmentation with Rich Pretraining | Jie Yang, Yue Zhang, Fei Dong | We investigate the effectiveness of a range of external training sources for neural word segmentation by building a modular segmentation model, pretraining the most important submodule using rich external sources. |
79 | Neural Machine Translation via Binary Code Prediction | Yusuke Oda, Philip Arthur, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura | In this paper, we propose a new method for calculating the output layer in neural machine translation systems. |
80 | What do Neural Machine Translation Models Learn about Morphology? | Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass | In this work, we analyze the representations learned by neural MT models at various levels of granularity and empirically evaluate the quality of the representations for learning morphology through extrinsic part-of-speech and morphological tagging tasks. |
81 | Context-Dependent Sentiment Analysis in User-Generated Videos | Soujanya Poria, Erik Cambria, Devamanyu Hazarika, Navonil Majumder, Amir Zadeh, Louis-Philippe Morency | In this paper, we propose a LSTM-based model that enables utterances to capture contextual information from their surroundings in the same video, thus aiding the classification process. |
82 | A Multidimensional Lexicon for Interpersonal Stancetaking | Umashanthi Pavalanathan, Jim Fitzpatrick, Scott Kiesling, Jacob Eisenstein | A Multidimensional Lexicon for Interpersonal Stancetaking |
83 | Tandem Anchoring: a Multiword Anchor Approach for Interactive Topic Modeling | Jeffrey Lund, Connor Cook, Kevin Seppi, Jordan Boyd-Graber | We propose combinations of words as anchors, going beyond existing single word anchor algorithms-an approach we call “Tandem Anchors”. |
84 | Apples to Apples: Learning Semantics of Common Entities Through a Novel Comprehension Task | Omid Bakhshandeh, James Allen | In order to enable learning about common entities, we introduce a novel machine comprehension task, GuessTwo: given a short paragraph comparing different aspects of two real-world semantically-similar entities, a system should guess what those entities are. For benchmarking further progress in the task, we have collected a set of paragraphs as the test set, on which humans can accomplish the task with an accuracy of 94.2% on open-ended prediction. |
85 | Going out on a limb: Joint Extraction of Entity Mentions and Relations without Dependency Trees | Arzoo Katiyar, Claire Cardie | We present a novel attention-based recurrent neural network for joint extraction of entity mentions and relations. |
86 | Naturalizing a Programming Language via Interactive Learning | Sida I. Wang, Samuel Ginn, Percy Liang, Christopher D. Manning | Our goal is to create a convenient natural language interface for performing well-specified but complex actions such as analyzing data, manipulating text, and querying databases. |
87 | Semantic Word Clusters Using Signed Spectral Clustering | João Sedoc, Jean Gallier, Dean Foster, Lyle Ungar | We present a new signed spectral normalized graph cut algorithm, signed clustering, that overlays existing thesauri upon distributionally derived vector representations of words, so that antonym relationships between word pairs are represented by negative weights. |
88 | An Interpretable Knowledge Transfer Model for Knowledge Base Completion | Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy | We propose a novel embedding model, ITransF, to perform knowledge base completion. |
89 | Learning a Neural Semantic Parser from User Feedback | Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, Luke Zettlemoyer | We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal intervention. |
90 | Joint Modeling of Content and Discourse Relations in Dialogues | Kechen Qin, Lu Wang, Joseph Kim | We present a joint modeling approach to identify salient discussion points in spoken meetings as well as to label the discourse relations between speaker turns. |
91 | Argument Mining with Structured SVMs and RNNs | Vlad Niculae, Joonsuk Park, Claire Cardie | We propose a novel factor graph model for argument mining, designed for settings in which the argumentative relations in a document do not necessarily form a tree structure. |
92 | Neural Discourse Structure for Text Categorization | Yangfeng Ji, Noah A. Smith | We show that discourse structure, as defined by Rhetorical Structure Theory and provided by an existing discourse parser, benefits text categorization. |
93 | Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification | Lianhui Qin, Zhisong Zhang, Hai Zhao, Zhiting Hu, Eric Xing | We propose a feature imitation framework in which an implicit relation network is driven to learn from another neural network with access to connectives, and thus encouraged to extract similarly salient features for accurate classification. |
94 | Don’t understand a measure? Learn it: Structured Prediction for Coreference Resolution optimizing its measures | Iryna Haponchyk, Alessandro Moschitti | In this paper, we trade off exact computation for enabling the use and study of more complex loss functions for coreference resolution. |
95 | Bayesian Modeling of Lexical Resources for Low-Resource Settings | Nicholas Andrews, Mark Dredze, Benjamin Van Durme, Jason Eisner | In this paper, we investigate a more robust approach: we stipulate that the lexicon is the result of an assumed generative process. |
96 | Semi-Supervised QA with Generative Domain-Adaptive Nets | Zhilin Yang, Junjie Hu, Ruslan Salakhutdinov, William Cohen | We propose a novel training framework, the Generative Domain-Adaptive Nets. |
97 | From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood | Kelvin Guu, Panupong Pasupat, Evan Liu, Percy Liang | Our goal is to learn a semantic parser that maps natural language utterances into executable programs when only indirect supervision is available: examples are labeled with the correct execution result, but not the program itself. |
98 | Diversity driven attention model for query-based abstractive summarization | Preksha Nema, Mitesh M. Khapra, Anirban Laha, Balaraman Ravindran | In this work we propose a model for the query-based summarization task based on the encode-attend-decode paradigm with two key additions (i) a query attention model (in addition to document attention model) which learns to focus on different portions of the query at different time steps (instead of using a static representation for the query) and (ii) a new diversity based attention model which aims to alleviate the problem of repeating phrases in the summary. |
99 | Get To The Point: Summarization with Pointer-Generator Networks | Abigail See, Peter J. Liu, Christopher D. Manning | In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. |
100 | Supervised Learning of Automatic Pyramid for Optimization-Based Multi-Document Summarization | Maxime Peyrard, Judith Eckle-Kohler | We present a new supervised framework that learns to estimate automatic Pyramid scores and uses them for optimization-based extractive multi-document summarization. |
101 | Selective Encoding for Abstractive Sentence Summarization | Qingyu Zhou, Nan Yang, Furu Wei, Ming Zhou | We propose a selective encoding model to extend the sequence-to-sequence framework for abstractive sentence summarization. |
102 | PositionRank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents | Corina Florescu, Cornelia Caragea | In this paper, we propose PositionRank, an unsupervised model for keyphrase extraction from scholarly documents that incorporates information from all positions of a word’s occurrences into a biased PageRank. |
103 | Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses | Ryan Lowe, Michael Noseworthy, Iulian Vlad Serban, Nicolas Angelard-Gontier, Yoshua Bengio, Joelle Pineau | In response to this challenge, we formulate automatic dialogue evaluation as a learning problem. We present an evaluation model (ADEM) that learns to predict human-like scores for input responses, using a new dataset of human response scores. |
104 | A Transition-Based Directed Acyclic Graph Parser for UCCA | Daniel Hershcovich, Omri Abend, Ari Rappoport | We present the first parser for UCCA, a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. |
105 | Abstract Syntax Networks for Code Generation and Semantic Parsing | Maxim Rabinovich, Mitchell Stern, Dan Klein | We introduce abstract syntax networks, a modeling framework for these problems. |
106 | Visualizing and Understanding Neural Machine Translation | Yanzhuo Ding, Yang Liu, Huanbo Luan, Maosong Sun | In this work, we propose to use layer-wise relevance propagation (LRP) to compute the contribution of each contextual word to arbitrary hidden states in the attention-based encoder-decoder framework. |
107 | Detecting annotation noise in automatically labelled data | Ines Rehbein, Josef Ruppenhofer | We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. |
108 | Abstractive Document Summarization with a Graph-Based Attentional Neural Model | Jiwei Tan, Xiaojun Wan, Jianguo Xiao | Abstractive summarization is the ultimate goal of document summarization research, but it has previously been less investigated due to the immaturity of text generation techniques. |
109 | Probabilistic Typology: Deep Generative Models of Vowel Inventories | Ryan Cotterell, Jason Eisner | In this paper we present the first probabilistic treatment of a basic question in phonological typology: What makes a natural vowel inventory? |
110 | Adversarial Multi-Criteria Learning for Chinese Word Segmentation | Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang | In this paper, we propose adversarial multi-criteria learning for CWS by integrating shared knowledge from multiple heterogeneous segmentation criteria. |
111 | Neural Joint Model for Transition-based Chinese Syntactic Analysis | Shuhei Kurita, Daisuke Kawahara, Sadao Kurohashi | We present neural network-based joint models for Chinese word segmentation, POS tagging and dependency parsing. |
112 | Robust Incremental Neural Semantic Graph Parsing | Jan Buys, Phil Blunsom | We propose a neural encoder-decoder transition-based parser which is the first full-coverage semantic graph parser for Minimal Recursion Semantics (MRS). |
113 | Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme | Suncong Zheng, Feng Wang, Hongyun Bao, Yuexing Hao, Peng Zhou, Bo Xu | What’s more, the end-to-end model proposed in this paper achieves the best results on the public dataset. |
114 | A Local Detection Approach for Named Entity Recognition and Mention Detection | Mingbin Xu, Hui Jiang, Sedtawut Watcharawittayakul | In this paper, we study a novel approach for named entity recognition (NER) and mention detection (MD) in natural language processing. |
115 | Vancouver Welcomes You! Minimalist Location Metonymy Resolution | Milan Gritta, Mohammad Taher Pilehvar, Nut Limsopatham, Nigel Collier | We show how a minimalist neural approach combined with a novel predicate window method can achieve competitive results on the SemEval 2007 task on Metonymy Resolution. |
116 | Unifying Text, Metadata, and User Network Representations with a Neural Network for Geolocation Prediction | Yasuhide Miura, Motoki Taniguchi, Tomoki Taniguchi, Tomoko Ohkuma | We propose a novel geolocation prediction model using a complex neural network. |
117 | Multi-Task Video Captioning with Video and Entailment Generation | Ramakanth Pasunuru, Mohit Bansal | For this, we present a many-to-many multi-task learning model that shares parameters across the encoders and decoders of the three tasks. |
118 | Enriching Complex Networks with Word Embeddings for Detecting Mild Cognitive Impairment from Speech Transcripts | Leandro Santos, Edilson Anselmo Corrêa Júnior, Osvaldo Oliveira Jr, Diego Amancio, Letícia Mansur, Sandra Aluísio | In this paper, we modeled transcripts into complex networks and enriched them with word embedding (CNE) to better represent short texts produced in neuropsychological assessments. |
119 | Adversarial Adaptation of Synthetic or Stale Data | Young-Bum Kim, Karl Stratos, Dongchan Kim | We propose a solution to this mismatch problem by framing it as domain adaptation, treating the flawed training dataset as a source domain and the evaluation dataset as a target domain. |
120 | Chat Detection in an Intelligent Assistant: Combining Task-oriented and Non-task-oriented Spoken Dialogue Systems | Satoshi Akasaki, Nobuhiro Kaji | To address the lack of benchmark datasets for this task, we construct a new dataset consisting of 15,160 utterances collected from the real log data of a commercial intelligent assistant (and will release the dataset to facilitate future research activity). |
121 | A Neural Local Coherence Model | Dat Tien Nguyen, Shafiq Joty | We propose a local coherence model based on a convolutional neural network that operates over the entity grid representation of a text. |
122 | Data-Driven Broad-Coverage Grammars for Opinionated Natural Language Generation (ONLG) | Tomer Cagan, Stefan L. Frank, Reut Tsarfaty | We present a data-driven architecture for ONLG that generates subjective responses triggered by users’ agendas, consisting of topics and sentiments, and based on wide-coverage automatically-acquired generative grammars. |
123 | Learning to Ask: Neural Question Generation for Reading Comprehension | Xinya Du, Junru Shao, Claire Cardie | We introduce an attention-based sequence learning model for the task and investigate the effect of encoding sentence- vs. paragraph-level information. |
124 | Joint Optimization of User-desired Content in Multi-document Summaries by Learning from User Feedback | Avinesh P.V.S, Christian M. Meyer | In this paper, we propose an extractive multi-document summarization (MDS) system using joint optimization and active learning for content selection grounded in user feedback. |
125 | Flexible and Creative Chinese Poetry Generation Using Neural Memory | Jiyuan Zhang, Yang Feng, Dong Wang, Yang Wang, Andrew Abel, Shiyue Zhang, Andi Zhang | This work proposes a memory augmented neural model for Chinese poem generation, where the neural model and the augmented memory work together to balance the requirements of linguistic accordance and aesthetic innovation, leading to innovative generations that are still rule-compliant. |
126 | Learning to Generate Market Comments from Stock Prices | Soichiro Murakami, Akihiko Watanabe, Akira Miyazawa, Keiichi Goshima, Toshihiko Yanase, Hiroya Takamura, Yusuke Miyao | This paper presents a novel encoder-decoder model for automatically generating market comments from stock prices. |
127 | Can Syntax Help? Improving an LSTM-based Sentence Compression Model for New Domains | Liangguo Wang, Jing Jiang, Hai Leong Chieu, Chen Hui Ong, Dandan Song, Lejian Liao | In this paper, we study how to improve the domain adaptability of a deletion-based Long Short-Term Memory (LSTM) neural network model for sentence compression. |
128 | Transductive Non-linear Learning for Chinese Hypernym Prediction | Chengyu Wang, Junchi Yan, Aoying Zhou, Xiaofeng He | Rather than extracting hypernyms from texts, in this paper, we present a transductive learning approach to establish mappings from entities to hypernyms in the embedding space directly. |
129 | A Constituent-Centric Neural Architecture for Reading Comprehension | Pengtao Xie, Eric Xing | In this paper, we study the RC problem on the Stanford Question Answering Dataset (SQuAD). |
130 | Cross-lingual Distillation for Text Classification | Ruochen Xu, Yiming Yang | This paper presents a novel approach to CLTC that builds on model distillation, which adapts and extends a framework originally proposed for model compression. |
131 | Understanding and Predicting Empathic Behavior in Counseling Therapy | Verónica Pérez-Rosas, Rada Mihalcea, Kenneth Resnicow, Satinder Singh, Lawrence An | In this paper, we explore several aspects pertaining to counseling interaction dynamics and their relation to counselor empathy during motivational interviewing encounters. |
132 | Leveraging Knowledge Bases in LSTMs for Improving Machine Reading | Bishan Yang, Tom Mitchell | We propose KBLSTM, a novel neural model that leverages continuous representations of KBs to enhance the learning of recurrent neural networks for machine reading. |
133 | Prerequisite Relation Learning for Concepts in MOOCs | Liangming Pan, Chengjiang Li, Juanzi Li, Jie Tang | We study the extent to which the prerequisite relation between knowledge concepts in Massive Open Online Courses (MOOCs) can be inferred automatically. |
134 | Unsupervised Text Segmentation Based on Native Language Characteristics | Shervin Malmasi, Mark Dras, Mark Johnson, Lan Du, Magdalena Wolska | We propose a Bayesian unsupervised text segmentation approach to the latter. |
135 | Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection | Jian Ni, Georgiana Dinu, Radu Florian | In this paper, we present two weakly supervised approaches for cross-lingual NER with no human annotation in a target language. |
136 | Context Sensitive Lemmatization Using Two Successive Bidirectional Gated Recurrent Networks | Abhisek Chakrabarty, Onkar Arun Pandit, Utpal Garain | We introduce a composite deep neural network architecture for supervised and language independent context sensitive lemmatization. To train the model on Bengali, we develop a gold lemma annotated dataset (having 1,702 sentences with a total of 20,257 word tokens), which is an additional contribution of this work. |
137 | Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling | Kazuya Kawakami, Chris Dyer, Phil Blunsom | In this paper, we augment a hierarchical LSTM language model that generates sequences of word tokens character by character with a caching mechanism that learns to reuse previously generated words. To validate our model we construct a new open-vocabulary language modeling corpus (the Multilingual Wikipedia Corpus; MWC) from comparable Wikipedia articles in 7 typologically diverse languages and demonstrate the effectiveness of our model across this range of languages. |
138 | Bandit Structured Prediction for Neural Sequence-to-Sequence Learning | Julia Kreutzer, Artem Sokolov, Stefan Riezler | Bandit structured prediction describes a stochastic optimization framework where learning is performed from partial feedback. |
139 | Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization | Jiacheng Zhang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun | In this work, we propose to use posterior regularization to provide a general framework for integrating prior knowledge into neural machine translation. |
140 | Incorporating Word Reordering Knowledge into Attention-based Neural Machine Translation | Jinchao Zhang, Mingxuan Wang, Qun Liu, Jie Zhou | This paper proposes three distortion models to explicitly incorporate the word reordering knowledge into attention-based Neural Machine Translation (NMT) for further improving translation performance. |
141 | Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search | Chris Hokamp, Qun Liu | We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified lexical constraints. |
142 | Combating Human Trafficking with Multimodal Deep Models | Edmund Tong, Amir Zadeh, Cara Jones, Louis-Philippe Morency | In this paper, we take a major step in the automatic detection of advertisements suspected to pertain to human trafficking. We present a novel dataset called Trafficking-10k, with more than 10,000 advertisements annotated for this task. |
143 | MalwareTextDB: A Database for Annotated Malware Articles | Swee Kiat Lim, Aldrian Obaja Muis, Wei Lu, Chen Hui Ong | In this paper, we discuss the construction of a new database for annotated malware texts. |
144 | A Corpus of Annotated Revisions for Studying Argumentative Writing | Fan Zhang, Homa B. Hashemi, Rebecca Hwa, Diane Litman | This paper presents ArgRewrite, a corpus of between-draft revisions of argumentative essays. |
145 | Watset: Automatic Induction of Synsets from a Graph of Synonyms | Dmitry Ustalov, Alexander Panchenko, Chris Biemann | This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. |
146 | Neural Modeling of Multi-Predicate Interactions for Japanese Predicate Argument Structure Analysis | Hiroki Ouchi, Hiroyuki Shindo, Yuji Matsumoto | To remedy this problem, we introduce a model that uses grid-type recurrent neural networks. |
147 | TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension | Mandar Joshi, Eunsol Choi, Daniel Weld, Luke Zettlemoyer | We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. |
148 | Learning Semantic Correspondences in Technical Documentation | Kyle Richardson, Jonas Kuhn | Our approach exploits the parallel nature of such documentation, or the tight coupling between high-level text and the low-level representations we aim to learn. |
149 | Bridge Text and Knowledge by Learning Multi-Prototype Entity Mention Embedding | Yixin Cao, Lifu Huang, Heng Ji, Xu Chen, Juanzi Li | In this paper, to deal with the ambiguity of entity mentions, we propose a novel Multi-Prototype Mention Embedding model, which learns multiple sense embeddings for each mention by jointly modeling words from textual contexts and entities derived from a knowledge base. |
150 | Interactive Learning of Grounded Verb Semantics towards Human-Robot Communication | Lanbo She, Joyce Chai | To address this limitation, this paper presents a new interactive learning approach that allows robots to proactively engage in interaction with human partners by asking good questions to learn models for grounded verb semantics. |
151 | Multimodal Word Distributions | Ben Athiwaratkun, Andrew Wilson | To learn these distributions, we propose an energy-based max-margin objective. |
152 | Enhanced LSTM for Natural Language Inference | Qian Chen, Xiaodan Zhu, Zhen-Hua Ling, Si Wei, Hui Jiang, Diana Inkpen | In this paper, we present a new state-of-the-art result, achieving an accuracy of 88.6% on the Stanford Natural Language Inference Dataset. |
153 | Linguistic analysis of differences in portrayal of movie characters | Anil Ramakrishna, Victor R. Martínez, Nikolaos Malandrakis, Karan Singla, Shrikanth Narayanan | We examine differences in portrayal of characters in movies using psycholinguistic and graph theoretic measures computed directly from screenplays. |
154 | Linguistically Regularized LSTM for Sentiment Classification | Qiao Qian, Minlie Huang, Jinhao Lei, Xiaoyan Zhu | In this paper, we propose simple models trained with sentence-level annotation, but also attempt to model the linguistic role of sentiment lexicons, negation words, and intensity words. |
155 | Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation | Lotem Peled, Roi Reichart | In this paper we present the novel task of sarcasm interpretation, defined as the generation of a non-sarcastic utterance conveying the same message as the original sarcastic one. We introduce a novel dataset of 3,000 sarcastic tweets, each interpreted by five human judges. |
156 | Active Sentiment Domain Adaptation | Fangzhao Wu, Yongfeng Huang, Jun Yan | In this paper, we propose an active sentiment domain adaptation approach to handle this problem. |
157 | Volatility Prediction using Financial Disclosures Sentiments with Word Embedding-based IR Models | Navid Rekabsaz, Mihai Lupu, Artem Baklanov, Alexander Dür, Linda Andersson, Allan Hanbury | We therefore study different fusion methods to combine text and market data resources. |
158 | CANE: Context-Aware Network Embedding for Relation Modeling | Cunchao Tu, Han Liu, Zhiyuan Liu, Maosong Sun | In this paper, we assume that one vertex usually shows different aspects when interacting with different neighbor vertices, and should therefore have a distinct embedding for each. |
159 | Universal Dependencies Parsing for Colloquial Singaporean English | Hongmin Wang, Yue Zhang, GuangYong Leonard Chan, Jie Yang, Hai Leong Chieu | We make both our annotation and parser available for further research. |
160 | Generic Axiomatization of Families of Noncrossing Graphs in Dependency Parsing | Anssi Yli-Jyrä, Carlos Gómez-Rodríguez | We present a simple encoding for unlabeled noncrossing graphs and show how its latent counterpart helps us to represent several families of directed and undirected graphs used in syntactic and semantic parsing of natural language as context-free languages. |
161 | Semi-supervised sequence tagging with bidirectional language models | Matthew Peters, Waleed Ammar, Chandra Bhagavatula, Russell Power | In this paper, we demonstrate a general semi-supervised approach for adding pretrained context embeddings from bidirectional language models to NLP systems and apply it to sequence labeling tasks. |
162 | Learning Symmetric Collaborative Dialogue Agents with Dynamic Knowledge Graph Embeddings | He He, Anusha Balakrishnan, Mihail Eric, Percy Liang | To model both structured knowledge and unstructured language, we propose a neural model with dynamic knowledge graph embeddings that evolve as the dialogue progresses. We collected a dataset of 11K human-human dialogues, which exhibits interesting lexical, semantic, and strategic elements. |
163 | Neural Belief Tracker: Data-Driven Dialogue State Tracking | Nikola Mrkšić, Diarmuid Ó Séaghdha, Tsung-Hsien Wen, Blaise Thomson, Steve Young | We propose a novel Neural Belief Tracking (NBT) framework which overcomes these problems by building on recent advances in representation learning. |
164 | Exploiting Argument Information to Improve Event Detection via Supervised Attention Mechanisms | Shulin Liu, Yubo Chen, Kang Liu, Jun Zhao | In this work, we propose to exploit argument information explicitly for ED via supervised attention mechanisms. |
165 | Topical Coherence in LDA-based Models through Induced Segmentation | Hesam Amoualian, Wei Lu, Eric Gaussier, Georgios Balikas, Massih R. Amini, Marianne Clausel | This paper presents an LDA-based model that generates topically coherent segments within documents by jointly segmenting documents and assigning topics to their words. |
166 | Jointly Extracting Relations with Class Ties via Effective Deep Ranking | Hai Ye, Wenhan Chao, Zhunchen Luo, Zhoujun Li | In this work, to effectively leverage class ties, we propose to make joint relation extraction with a unified model that integrates convolutional neural network (CNN) with a general pairwise ranking framework, in which three novel ranking loss functions are introduced. |
167 | Search-based Neural Structured Learning for Sequential Question Answering | Mohit Iyyer, Wen-tau Yih, Ming-Wei Chang | To solve this sequential question answering task, we propose a novel dynamic neural semantic parsing framework trained using a weakly supervised reward-guided search. We collect a dataset of 6,066 question sequences that inquire about semi-structured tables from Wikipedia, with 17,553 question-answer pairs in total. |
168 | Gated-Attention Readers for Text Comprehension | Bhuwan Dhingra, Hanxiao Liu, Zhilin Yang, William Cohen, Ruslan Salakhutdinov | In this paper we study the problem of answering cloze-style questions over documents. |
169 | Determining Gains Acquired from Word Embedding Quantitatively Using Discrete Distribution Clustering | Jianbo Ye, Yanran Li, Zhaohui Wu, James Z. Wang, Wenjie Li, Jia Li | In this paper, we propose a new document clustering approach by combining any word embedding with a state-of-the-art algorithm for clustering empirical distributions. |
170 | Towards a Seamless Integration of Word Senses into Downstream NLP Applications | Mohammad Taher Pilehvar, Jose Camacho-Collados, Roberto Navigli, Nigel Collier | By incorporating a novel disambiguation algorithm into a state-of-the-art classification model, we create a pipeline to integrate sense-level information into downstream NLP applications. |
171 | Reading Wikipedia to Answer Open-Domain Questions | Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes | This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. |
172 | Learning to Skim Text | Adams Wei Yu, Hongrae Lee, Quoc Le | In this paper, we present an approach to reading text while skipping irrelevant information if needed. |
173 | An Algebra for Feature Extraction | Vivek Srikumar | In this paper, we formalize feature extraction from an algebraic perspective. |
174 | Chunk-based Decoder for Neural Machine Translation | Shonosuke Ishiwatari, Jingtao Yao, Shujie Liu, Mu Li, Ming Zhou, Naoki Yoshinaga, Masaru Kitsuregawa, Weijia Jia | In this paper, we propose chunk-based decoders for neural machine translation (NMT), each of which consists of a chunk-level decoder and a word-level decoder. |
175 | Doubly-Attentive Decoder for Multi-modal Neural Machine Translation | Iacer Calixto, Qun Liu, Nick Campbell | We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation. |
176 | A Teacher-Student Framework for Zero-Resource Neural Machine Translation | Yun Chen, Yang Liu, Yong Cheng, Victor O.K. Li | In this paper, we propose a method for zero-resource NMT by assuming that parallel sentences have close probabilities of generating a sentence in a third language. |
177 | Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder | Huadong Chen, Shujian Huang, David Chiang, Jiajun Chen | In this paper, we improve this model by explicitly incorporating source-side syntactic trees. |
178 | Cross-lingual Name Tagging and Linking for 282 Languages | Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, Heng Ji | The ambitious goal of this work is to develop a cross-lingual name tagging and linking framework for 282 languages that exist in Wikipedia. |
179 | Adversarial Training for Unsupervised Bilingual Lexicon Induction | Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun | In this work, we show that such cross-lingual connection can actually be established without any form of supervision. |
180 | Estimating Code-Switching on Twitter with a Novel Generalized Word-Level Language Detection Technique | Shruti Rijhwani, Royal Sequiera, Monojit Choudhury, Kalika Bali, Chandra Shekhar Maddila | We present a novel unsupervised word-level language detection technique for code-switched text for an arbitrarily large number of languages, which does not require any manually annotated training data. |
181 | Using Global Constraints and Reranking to Improve Cognates Detection | Michael Bloodgood, Benjamin Strauss | We propose methods for using global constraints by performing rescoring of the score matrices produced by state-of-the-art cognates detection systems. |
182 | One-Shot Neural Cross-Lingual Transfer for Paradigm Completion | Katharina Kann, Ryan Cotterell, Hinrich Schütze | We present a novel cross-lingual transfer method for paradigm completion, the task of mapping a lemma to its inflected forms, using a neural encoder-decoder model, the state of the art for the monolingual task. |
183 | Morphological Inflection Generation with Hard Monotonic Attention | Roee Aharoni, Yoav Goldberg | We present a neural model for morphological inflection generation which employs a hard attention mechanism, inspired by the nearly-monotonic alignment commonly found between the characters in a word and the characters in its inflection. |
184 | From Characters to Words to in Between: Do We Capture Morphology? | Clara Vania, Adam Lopez | On a language modeling task, we present experiments that systematically vary (1) the basic unit of representation, (2) the composition of these representations, and (3) the morphological typology of the language modeled. |
185 | Riemannian Optimization for Skip-Gram Negative Sampling | Alexander Fonarev, Oleksii Grinchuk, Gleb Gusev, Pavel Serdyukov, Ivan Oseledets | In this paper, we propose an algorithm that optimizes the SGNS objective using Riemannian optimization, and demonstrate its superiority over popular competitors, such as the original method to train SGNS and SVD over the SPPMI matrix. |
186 | Deep Multitask Learning for Semantic Dependency Parsing | Hao Peng, Sam Thomson, Noah A. Smith | We present a deep neural architecture that parses sentences into three semantic dependency graph formalisms. |
187 | Improved Word Representation Learning with Sememes | Yilin Niu, Ruobing Xie, Zhiyuan Liu, Maosong Sun | In this paper, we show that word sememe information can improve word representation learning (WRL), which maps words into a low-dimensional semantic space and serves as a fundamental step for many NLP tasks. |
188 | Learning Character-level Compositionality with Visual Features | Frederick Liu, Han Lu, Chieh Lo, Graham Neubig | In this paper, we model this effect by creating embeddings for characters based on their visual characteristics, creating an image for the character and running it through a convolutional neural network to produce a visual character embedding. |
189 | A Progressive Learning Approach to Chinese SRL Using Heterogeneous Data | Qiaolin Xia, Lei Sha, Baobao Chang, Zhifang Sui | In this paper, we focus mainly on the latter, that is, to improve Chinese SRL by using heterogeneous corpora together. We also release a new corpus, Chinese SemBank, for Chinese SRL. |
190 | Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings | John Wieting, Kevin Gimpel | We consider the problem of learning general-purpose, paraphrastic sentence embeddings, revisiting the setting of Wieting et al. (2016b). |
191 | Ontology-Aware Token Embeddings for Prepositional Phrase Attachment | Pradeep Dasigi, Waleed Ammar, Chris Dyer, Eduard Hovy | We use the new, context-sensitive embeddings in a model for predicting prepositional phrase (PP) attachments and jointly learn the concept embeddings and model parameters. |
192 | Identifying 1950s American Jazz Musicians: Fine-Grained IsA Extraction via Modifier Composition | Ellie Pavlick, Marius Paşca | We present a method for populating fine-grained classes (e.g., “1950s American jazz musicians”) with instances (e.g., Charles Mingus ). |
193 | Parsing to 1-Endpoint-Crossing, Pagenumber-2 Graphs | Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan | Our main contribution is an exact algorithm that obtains maximum subgraphs satisfying both restrictions simultaneously in time O(n⁵). |
194 | Semi-supervised Multitask Learning for Sequence Labeling | Marek Rei | We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. |
195 | Semantic Parsing of Pre-university Math Problems | Takuya Matsuzaki, Takumi Ito, Hidenao Iwane, Hirokazu Anai, Noriko H. Arai | We have been developing an end-to-end math problem solving system that accepts natural language input. |
TABLE 2: ACL 2017 Short Papers
Title | Authors | Highlight | |
---|---|---|---|
1 | Classifying Temporal Relations by Bidirectional LSTM over Dependency Paths | Fei Cheng, Yusuke Miyao | In this work, we borrow a state-of-the-art method in relation extraction by adopting bidirectional long short-term memory (Bi-LSTM) along dependency paths (DP). |
2 | AMR-to-text Generation with Synchronous Node Replacement Grammar | Linfeng Song, Xiaochang Peng, Yue Zhang, Zhiguo Wang, Daniel Gildea | This paper addresses the task of AMR-to-text generation by leveraging synchronous node replacement grammar. |
3 | Lexical Features in Coreference Resolution: To be Used With Caution | Nafise Sadat Moosavi, Michael Strube | In this paper we investigate a drawback of using many lexical features in state-of-the-art coreference resolvers. |
4 | Alternative Objective Functions for Training MT Evaluation Metrics | Miloš Stanojević, Khalil Sima’an | Subsequently we propose a model trained to optimize both objectives simultaneously and show that it is far more stable than, and on average outperforms, both models on both objectives. |
5 | A Principled Framework for Evaluating Summarizers: Comparing Models of Summary Quality against Human Judgments | Maxime Peyrard, Judith Eckle-Kohler | We present a new framework for evaluating extractive summarizers, which is based on a principled representation as an optimization problem. |
6 | Vector space models for evaluating semantic fluency in autism | Emily Prud’hommeaux, Jan van Santen, Douglas Gliner | In this paper, we explore automated approaches for scoring semantic fluency responses that leverage ontological resources and distributional semantic models to characterize the semantic fluency responses produced by young children with and without ASD. |
7 | Neural Architectures for Multilingual Semantic Parsing | Raymond Hendy Susanto, Wei Lu | In this paper, we address semantic parsing in a multilingual context. |
8 | Incorporating Uncertainty into Deep Learning for Spoken Language Assessment | Andrey Malinin, Anton Ragni, Kate Knill, Mark Gales | This paper proposes a novel method to yield uncertainty and compares it to GPs and DNNs with MCD. |
9 | Incorporating Dialectal Variability for Socially Equitable Language Identification | David Jurgens, Yulia Tsvetkov, Dan Jurafsky | We propose a new dataset and a character-based sequence-to-sequence model for LID designed to support dialectal and multilingual language varieties. |
10 | Evaluating Compound Splitters Extrinsically with Textual Entailment | Glorianna Jagfeld, Patrick Ziering, Lonneke van der Plas | We explore a novel way for the extrinsic evaluation of compound splitters, namely recognizing textual entailment. |
11 | An Analysis of Action Recognition Datasets for Language and Vision Tasks | Spandana Gella, Frank Keller | In this survey, we categorize the existing approaches based on how they conceptualize this problem and provide a detailed review of existing datasets, highlighting their diversity as well as advantages and disadvantages. |
12 | Learning to Parse and Translate Improves Neural Machine Translation | Akiko Eriguchi, Yoshimasa Tsuruoka, Kyunghyun Cho | In this paper, we propose a hybrid model, called NMT+RNNG, that learns to parse and translate by combining the recurrent neural network grammar into the attention-based neural machine translation model. |
13 | On the Distribution of Lexical Features at Multiple Levels of Analysis | Fatemeh Almodaresi, Lyle Ungar, Vivek Kulkarni, Mohsen Zakeri, Salvatore Giorgi, H. Andrew Schwartz | In this paper, we empirically characterize various lexical distributions at different levels of analysis, showing that, while most features are decidedly sparse and non-normal at the message-level (as with traditional NLP), they follow the central limit theorem to become much more Log-normal or even Normal at the user- and county-levels. |
14 | Exploring Neural Text Simplification Models | Sergiu Nisioi, Sanja Štajner, Simone Paolo Ponzetto, Liviu P. Dinu | We present the first attempt at using sequence to sequence neural networks to model text simplification (TS). |
15 | On the Challenges of Translating NLP Research into Commercial Products | Daniel Dahlmeier | This paper highlights challenges in industrial research related to translating research in natural language processing into commercial products. |
16 | Sentence Alignment Methods for Improving Text Simplification Systems | Sanja Štajner, Marc Franco-Salvador, Simone Paolo Ponzetto, Paolo Rosso, Heiner Stuckenschmidt | We provide several methods for sentence-alignment of texts with different complexity levels. |
17 | Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection | Youxuan Jiang, Jonathan K. Kummerfeld, Walter S. Lasecki | In this paper, we present the first systematic study of the key factors in crowdsourcing paraphrase collection. |
18 | Arc-swift: A Novel Transition System for Dependency Parsing | Peng Qi, Christopher D. Manning | This paper proposes a novel transition system, arc-swift, that enables direct attachments between tokens farther apart with a single transition. |
19 | A Generative Parser with a Discriminative Recognition Algorithm | Jianpeng Cheng, Adam Lopez, Mirella Lapata | We propose a framework for parsing and language modeling which marries a generative model with a discriminative recognition model in an encoder-decoder setting. |
20 | Hybrid Neural Network Alignment and Lexicon Model in Direct HMM for Statistical Machine Translation | Weiyue Wang, Tamer Alkhouli, Derui Zhu, Hermann Ney | This work proposes a direct HMM with neural network-based lexicon and alignment models, which are trained jointly using the Baum-Welch algorithm. |
21 | Towards String-To-Tree Neural Machine Translation | Roee Aharoni, Yoav Goldberg | We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees. |
22 | Learning Lexico-Functional Patterns for First-Person Affect | Lena Reed, Jiaqi Wu, Shereen Oraby, Pranav Anand, Marilyn Walker | We present a method to learn proxies for these functions from first-person narratives. We construct a novel fine-grained test set, and show that the patterns we learn improve our ability to predict first-person affective reactions to everyday events, from a Stanford sentiment baseline of .67F to .75F. |
23 | Lifelong Learning CRF for Supervised Aspect Extraction | Lei Shu, Hu Xu, Bing Liu | This paper makes a focused contribution to supervised aspect extraction. |
24 | Exploiting Domain Knowledge via Grouped Weight Sharing with Application to Text Categorization | Ye Zhang, Matthew Lease, Byron C. Wallace | We propose a general, novel method for exploiting such resources via weight sharing. |
25 | Improving Neural Parsing by Disentangling Model Combination and Reranking Effects | Daniel Fried, Mitchell Stern, Dan Klein | Recent work has proposed several generative neural models for constituency parsing that achieve state-of-the-art results. |
26 | Information-Theory Interpretation of the Skip-Gram Negative-Sampling Objective Function | Oren Melamud, Jacob Goldberger | In this paper we define a measure of dependency between two random variables, based on the Jensen-Shannon (JS) divergence between their joint distribution and the product of their marginal distributions. |
27 | Implicitly-Defined Neural Networks for Sequence Labeling | Michaeel Kazi, Brian Thompson | In this work, we propose a novel, implicitly-defined neural network architecture and describe a method to compute its components. |
28 | The Role of Prosody and Speech Register in Word Segmentation: A Computational Modelling Perspective | Bogdan Ludusan, Reiko Mazuka, Mathieu Bernard, Alejandrina Cristia, Emmanuel Dupoux | Since these two factors are thought to play an important role in early language acquisition, we aim to quantify their contribution for this task. |
29 | A Two-Stage Parsing Method for Text-Level Discourse Analysis | Yizhong Wang, Sujian Li, Houfeng Wang | Previous work introduced transition-based algorithms to form a unified architecture of parsing rhetorical structures (including span, nuclearity and relation), but did not achieve satisfactory performance. |
30 | Error-repair Dependency Parsing for Ungrammatical Texts | Keisuke Sakaguchi, Matt Post, Benjamin Van Durme | We propose a new dependency parsing scheme which jointly parses a sentence and repairs grammatical errors by extending the non-directional transition-based formalism of Goldberg and Elhadad (2010) with three additional actions: SUBSTITUTE, DELETE, INSERT. |
31 | Attention Strategies for Multi-Source Sequence-to-Sequence Learning | Jindřich Libovický, Jindřich Helcl | We propose two novel approaches to combine the outputs of attention mechanisms over each source sequence, flat and hierarchical. |
32 | Understanding and Detecting Supporting Arguments of Diverse Types | Xinyu Hua, Lu Wang | We investigate the problem of sentence-level supporting argument detection from relevant documents for user-specified claims. |
33 | A Neural Model for User Geolocation and Lexical Dialectology | Afshin Rahimi, Trevor Cohn, Timothy Baldwin | We propose a simple yet effective text-based user geolocation model based on a neural network with one hidden layer, which achieves state of the art performance over three Twitter benchmark geolocation datasets, in addition to producing word and phrase embeddings in the hidden layer that we show to be useful for detecting dialectal terms. |
34 | A Corpus of Natural Language for Visual Reasoning | Alane Suhr, Mike Lewis, James Yeh, Yoav Artzi | We present a new visual reasoning language dataset, containing 92,244 pairs of examples of natural statements grounded in synthetic images, with 3,962 unique sentences. We also describe our method of crowdsourcing linguistically-diverse data and present an analysis of the data.
35 | Neural Architecture for Temporal Relation Extraction: A Bi-LSTM Approach for Detecting Narrative Containers | Julien Tourille, Olivier Ferret, Aurélie Névéol, Xavier Tannier | We present a neural architecture for containment relation identification between medical events and/or temporal expressions. |
36 | How to Make Context More Useful? An Empirical Study on Context-Aware Neural Conversational Models | Zhiliang Tian, Rui Yan, Lili Mou, Yiping Song, Yansong Feng, Dongyan Zhao | In this paper, we conduct an empirical study to compare various models and investigate the effect of context information in dialog systems. |
37 | Cross-lingual and cross-domain discourse segmentation of entire documents | Chloé Braud, Ophélie Lacroix, Anders Søgaard | In this paper, we propose statistical discourse segmenters for five languages and three domains that do not rely on gold pre-annotations. |
38 | Detecting Good Arguments in a Non-Topic-Specific Way: An Oxymoron? | Beata Beigman Klebanov, Binod Gyawali, Yi Song | We investigate the extent to which it is possible to close the performance gap between topic-specific and across-topics models for identification of good arguments. |
39 | Argumentation Quality Assessment: Theory vs. Practice | Henning Wachsmuth, Nona Naderi, Ivan Habernal, Yufang Hou, Graeme Hirst, Iryna Gurevych, Benno Stein | We find that most observations on quality phrased spontaneously are in fact adequately represented by theory. |
40 | A Recurrent Neural Model with Attention for the Recognition of Chinese Implicit Discourse Relations | Samuel Rönnqvist, Niko Schenk, Christian Chiarcos | We introduce an attention-based Bi-LSTM for Chinese implicit discourse relations and demonstrate that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches. |
41 | Discourse Annotation of Non-native Spontaneous Spoken Responses Using the Rhetorical Structure Theory Framework | Xinhao Wang, James Bruno, Hillary Molloy, Keelan Evanini, Klaus Zechner | Considering that the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spoken language, we initiated a research effort to obtain RST annotations of a large number of non-native spoken responses from a standardized assessment of academic English proficiency. |
42 | Improving Implicit Discourse Relation Recognition with Discourse-specific Word Embeddings | Changxing Wu, Xiaodong Shi, Yidong Chen, Jinsong Su, Boli Wang | We introduce a simple and effective method to learn discourse-specific word embeddings (DSWE) for implicit discourse relation recognition. |
43 | Oracle Summaries of Compressive Summarization | Tsutomu Hirao, Masaaki Nishino, Masaaki Nagata | This paper derives an Integer Linear Programming (ILP) formulation to obtain an oracle summary of the compressive summarization paradigm in terms of ROUGE. |
44 | Japanese Sentence Compression with a Large Training Dataset | Shun Hasegawa, Yuta Kikuchi, Hiroya Takamura, Manabu Okumura | In English, high-quality sentence compression models that delete words have been trained on large, automatically created training datasets.
45 | A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes | Pablo Loyola, Edison Marrese-Taylor, Yutaka Matsuo | We propose a model to automatically describe changes introduced in the source code of a program using natural language. |
46 | English Event Detection With Translated Language Features | Sam Wei, Igor Korostil, Joel Nothman, Ben Hachey | We propose novel radical features from automatic translation for event extraction. |
47 | EviNets: Neural Networks for Combining Evidence Signals for Factoid Question Answering | Denis Savenkov, Eugene Agichtein | This paper proposes EviNets: a novel neural network architecture for factoid question answering. |
48 | Pocket Knowledge Base Population | Travis Wolfe, Mark Dredze, Benjamin Van Durme | We describe novel Open Information Extraction methods which leverage the PKB to find informative trigger words. |
49 | Answering Complex Questions Using Open Information Extraction | Tushar Khot, Ashish Sabharwal, Peter Clark | We overcome this limitation by presenting a method for reasoning with Open IE knowledge, allowing more complex questions to be handled. |
50 | Bootstrapping for Numerical Open IE | Swarnadeep Saha, Harinder Pal, Mausam | We design and release BONIE, the first open numerical relation extractor, for extracting Open IE tuples where one of the arguments is a number or a quantity-unit phrase. |
51 | Feature-Rich Networks for Knowledge Base Completion | Alexandros Komninos, Suresh Manandhar | We propose jointly modelling Knowledge Bases and aligned text with Feature-Rich Networks. |
52 | Fine-Grained Entity Typing with High-Multiplicity Assignments | Maxim Rabinovich, Dan Klein | In this paper, we consider the high-multiplicity regime inherent in data sources such as Wikipedia that have semi-open type systems. |
53 | Group Sparse CNNs for Question Classification with Answer Sets | Mingbo Ma, Liang Huang, Bing Xiang, Bowen Zhou | Group Sparse CNNs for Question Classification with Answer Sets |
54 | Multi-Task Learning of Keyphrase Boundary Classification | Isabelle Augenstein, Anders Søgaard | To overcome this, we explore several auxiliary tasks, including semantic super-sense tagging and identification of multi-word expressions, and cast the task as a multi-task learning problem with deep recurrent neural networks. |
55 | Cardinal Virtues: Extracting Relation Cardinalities from Text | Paramita Mirza, Simon Razniewski, Fariz Darari, Gerhard Weikum | We present a distant supervision method using conditional random fields. We introduce this novel problem of extracting cardinalities and discuss the specific challenges that set it apart from standard IE. |
56 | Integrating Deep Linguistic Features in Factuality Prediction over Unified Datasets | Gabriel Stanovsky, Judith Eckle-Kohler, Yevgeniy Puzikov, Ido Dagan, Iryna Gurevych | In this work we propose an intuitive method for mapping three previously annotated corpora onto a single factuality scale, thereby enabling models to be tested across these corpora. We make both the unified factuality corpus and our new model publicly available. |
57 | Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks | Rajarshi Das, Manzil Zaheer, Siva Reddy, Andrew McCallum | In this paper we extend universal schema to natural language question answering, employing Memory networks to attend to the large body of facts in the combination of text and KB. |
58 | Differentiable Scheduled Sampling for Credit Assignment | Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick | By incorporating this approximation into the scheduled sampling training procedure (a well-known technique for correcting exposure bias), we introduce a new training objective that is continuous and differentiable everywhere and can provide informative gradients near points where previous decoding decisions change their value.
59 | A Deep Network with Visual Text Composition Behavior | Hongyu Guo | We propose a deep network, which not only achieves competitive accuracy for text classification, but also exhibits compositional behavior. |
60 | Neural System Combination for Machine Translation | Long Zhou, Wenpeng Hu, Jiajun Zhang, Chengqing Zong | In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. |
61 | An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation | Chenhui Chu, Raj Dabre, Sadao Kurohashi | In this paper, we propose a novel domain adaptation method named “mixed fine tuning” for neural machine translation (NMT). |
62 | Efficient Extraction of Pseudo-Parallel Sentences from Raw Monolingual Data Using Word Embeddings | Benjamin Marie, Atsushi Fujita | We propose a new method for extracting pseudo-parallel sentences from a pair of large monolingual corpora, without relying on any document-level information. |
63 | Feature Hashing for Language and Dialect Identification | Shervin Malmasi, Mark Dras | We evaluate feature hashing for language identification (LID), a method not previously used for this task. |
64 | Detection of Chinese Word Usage Errors for Non-Native Chinese Learners with Bidirectional LSTM | Yow-Ting Shiue, Hen-Hsen Huang, Hsin-Hsi Chen | In this paper, we propose (bidirectional) LSTM sequence labeling models and explore various features to detect word usage errors in Chinese sentences. |
65 | Automatic Compositor Attribution in the First Folio of Shakespeare | Maria Ryskina, Hannah Alpert-Abrams, Dan Garrette, Taylor Berg-Kirkpatrick | In this paper, we introduce a novel unsupervised model that jointly describes the textual and visual features needed to distinguish compositors. |
66 | STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset | Yuya Yoshikawa, Yutaro Shigeto, Akikazu Takeuchi | In this paper, we particularly consider generating Japanese captions for images. To tackle this problem, we construct a large-scale Japanese image caption dataset based on images from MS-COCO, which is called STAIR Captions. |
67 | “Liar, Liar Pants on Fire”: A New Benchmark Dataset for Fake News Detection | William Yang Wang | In this paper, we present LIAR: a new, publicly available dataset for fake news detection. |
68 | English Multiword Expression-aware Dependency Parsing Including Named Entities | Akihiko Kato, Hiroyuki Shindo, Yuji Matsumoto | In this work, we construct a corpus that ensures consistency between dependency structures and MWEs, including named entities. |
69 | Improving Semantic Composition with Offset Inference | Thomas Kober, Julie Weeds, Jeremy Reffin, David Weir | We therefore introduce a novel form of distributional inference that exploits the rich type structure in APTs and infers missing data by the same mechanism that is used for semantic composition. |
70 | Learning Topic-Sensitive Word Representations | Marzieh Fadaee, Arianna Bisazza, Christof Monz | We present two approaches to learn multiple topic-sensitive representations per word by using the Hierarchical Dirichlet Process.
71 | Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings | Terrence Szymanski | This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. |
72 | Methodical Evaluation of Arabic Word Embeddings | Mohammed Elrazzaz, Shady Elbassuoni, Khaled Shaban, Chadi Helwe | In this study, we evaluate these various techniques when used to generate Arabic word embeddings. We first build a benchmark for the Arabic language that can be utilized to perform intrinsic evaluation of different word embeddings. |
73 | Multilingual Connotation Frames: A Case Study on Social Media for Targeted Sentiment Analysis and Forecast | Hannah Rashkin, Eric Bell, Yejin Choi, Svitlana Volkova | To study targeted public sentiments across many languages and geographic locations, we introduce multilingual connotation frames: an extension from English connotation frames of Rashkin et al. (2016) with 10 additional European languages, focusing on the implied sentiments among event participants engaged in a frame. |
74 | Best-Worst Scaling More Reliable than Rating Scales: A Case Study on Sentiment Intensity Annotation | Svetlana Kiritchenko, Saif Mohammad | Here for the first time, we set up an experiment that directly compares the rating scale method with BWS. |
75 | Demographic Inference on Twitter using Recursive Neural Networks | Sunghwan Mac Kim, Qiongkai Xu, Lizhen Qu, Stephen Wan, Cécile Paris | In this work, we employ recursive neural networks to break down these independence assumptions to obtain inference about demographic characteristics on Twitter. |
76 | Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning | Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy | In this paper, we present a demographic classifier for gender, age, political orientation and location on Twitter. We collected and curated a robust Twitter demographic dataset for this task. |
77 | A Network Framework for Noisy Label Aggregation in Social Media | Xueying Zhan, Yaowei Wang, Yanghui Rao, Haoran Xie, Qing Li, Fu Lee Wang, Tak-Lam Wong | To aggregate noisy labels at a small cost, we propose a network framework that calculates the matching degree between a document’s topics and the annotators’ meta-data.
78 | Parser Adaptation for Social Media by Integrating Normalization | Rob van der Goot, Gertjan van Noord | This work explores different approaches of using normalization for parser adaptation. |
79 | AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine | Minghui Qiu, Feng-Lin Li, Siyu Wang, Xing Gao, Yan Chen, Weipeng Zhao, Haiqing Chen, Jun Huang, Wei Chu | We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of Information Retrieval (IR) and Sequence to Sequence (Seq2Seq) based generation models. |
80 | A Conditional Variational Framework for Dialog Generation | Xiaoyu Shen, Hui Su, Yanran Li, Wenjie Li, Shuzi Niu, Yang Zhao, Akiko Aizawa, Guoping Long | In this paper, we propose a framework allowing conditional response generation based on specific attributes. |
81 | Question Answering through Transfer Learning from Large Fine-grained Supervision Data | Sewon Min, Minjoon Seo, Hannaneh Hajishirzi | We show that the task of question answering (QA) can significantly benefit from the transfer learning of models trained on a different large, fine-grained QA dataset. |
82 | Self-Crowdsourcing Training for Relation Extraction | Azad Abad, Moin Nabi, Alessandro Moschitti | In this paper we introduce a self-training strategy for crowdsourcing. |
83 | A Generative Attentional Neural Network Model for Dialogue Act Classification | Quan Hung Tran, Gholamreza Haffari, Ingrid Zukerman | We propose a novel generative neural network architecture for Dialogue Act classification. |
84 | Salience Rank: Efficient Keyphrase Extraction with Topic Modeling | Nedelina Teneva, Weiwei Cheng | In this paper, we propose a modification of TPR, called Salience Rank. |
85 | List-only Entity Linking | Ying Lin, Chin-Yew Lin, Heng Ji | In this work, we select most linkable mentions as seed mentions and disambiguate other mentions by comparing them with the seed mentions rather than directly with the entities. |
86 | Improving Native Language Identification by Using Spelling Errors | Lingzhen Chen, Carlo Strapparava, Vivi Nastase | In this paper, we explore spelling errors as a source of information for detecting the native language of a writer, a previously under-explored area. |
87 | Disfluency Detection using a Noisy Channel Model and a Deep Neural Language Model | Paria Jamshid Lou, Mark Johnson | This paper presents a model for disfluency detection in spontaneous speech transcripts called the LSTM Noisy Channel Model.
88 | On the Equivalence of Holographic and Complex Embeddings for Link Prediction | Katsuhiko Hayashi, Masashi Shimbo | We show the equivalence of two state-of-the-art models for link prediction/knowledge graph completion: Nickel et al.’s holographic embeddings and Trouillon et al.’s complex embeddings.
89 | Sentence Embedding for Neural Machine Translation Domain Adaptation | Rui Wang, Andrew Finch, Masao Utiyama, Eiichiro Sumita | In this paper, we exploit the NMT’s internal embedding of the source sentence and use the sentence embedding similarity to select the sentences which are close to in-domain data. |
90 | Data Augmentation for Low-Resource Neural Machine Translation | Marzieh Fadaee, Arianna Bisazza, Christof Monz | Inspired by work in computer vision, we propose a novel data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, synthetically created contexts. |
91 | Speeding Up Neural Machine Translation Decoding by Shrinking Run-time Vocabulary | Xing Shi, Kevin Knight | We speed up Neural Machine Translation (NMT) decoding by shrinking run-time target vocabulary. |
92 | Chunk-Based Bi-Scale Decoder for Neural Machine Translation | Hao Zhou, Zhaopeng Tu, Shujian Huang, Xiaohua Liu, Hang Li, Jiajun Chen | In this paper, we propose a new type of decoder for NMT, which splits the decoding state into two parts and updates them at two different time-scales.
93 | Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary | Meng Fang, Trevor Cohn | We propose a novel neural network model for joint training from both sources of data based on cross-lingual word embeddings, and show substantial empirical improvements over baseline techniques. |
94 | EuroSense: Automatic Harvesting of Multilingual Sense Annotations from Parallel Text | Claudio Delli Bovi, Jose Camacho-Collados, Alessandro Raganato, Roberto Navigli | In this paper we present EuroSense, a multilingual sense-annotated resource based on the joint disambiguation of the Europarl parallel corpus, with almost 123 million sense annotations for over 155 thousand distinct concepts and entities from a language-independent unified sense inventory. |
95 | Challenging Language-Dependent Segmentation for Arabic: An Application to Machine Translation and Part-of-Speech Tagging | Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Ahmed Abdelali, Yonatan Belinkov, Stephan Vogel | In our analysis, we show that a neural machine translation system is sensitive to the ratio of source and target tokens, and a ratio close to 1 or greater gives optimal performance.
96 | Fast and Accurate Neural Word Segmentation for Chinese | Deng Cai, Hai Zhao, Zhisong Zhang, Yuan Xin, Yongjian Wu, Feiyue Huang | In this paper, we propose a greedy neural word segmenter with balanced word and character embedding inputs to alleviate the existing drawbacks. |
97 | Pay Attention to the Ending: Strong Neural Baselines for the ROC Story Cloze Task | Zheng Cai, Lifu Tu, Kevin Gimpel | We develop a model that uses hierarchical recurrent networks with attention to encode the sentences in the story and score candidate endings.
98 | Neural Semantic Parsing over Multiple Knowledge-bases | Jonathan Herzig, Jonathan Berant | In this paper, we propose to exploit structural regularities in language in different domains, and train semantic parsers over multiple knowledge-bases (KBs), while sharing information across datasets. |
99 | Representing Sentences as Low-Rank Subspaces | Jiaqi Mu, Suma Bhat, Pramod Viswanath | We observe a simple geometry of sentences: the word representations of a given sentence (on average 10.23 words across all SemEval datasets, with a standard deviation of 4.84) roughly lie in a low-rank subspace (roughly, rank 4).
100 | Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization | Shuming Ma, Xu Sun, Jingjing Xu, Houfeng Wang, Wenjie Li, Qi Su | In this work, our goal is to improve semantic relevance between source texts and summaries for Chinese social media summarization. |
101 | Determining Whether and When People Participate in the Events They Tweet About | Krishna Chaitanya Sanagavarapu, Alakananda Vempala, Eduardo Blanco | This paper describes an approach to determine whether people participate in the events they tweet about. |
102 | Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter | Svitlana Volkova, Kyle Shaffer, Jin Yea Jang, Nathan Hodas | In this work we build predictive models to classify 130 thousand news posts as suspicious or verified, and predict four sub-types of suspicious news – satire, hoaxes, clickbait and propaganda. |
103 | Recognizing Counterfactual Thinking in Social Media Texts | Youngseo Son, Anneke Buffone, Joe Raso, Allegra Larche, Anthony Janocko, Kevin Zembroski, H Andrew Schwartz, Lyle Ungar | We create a counterfactual tweet dataset and explore approaches for detecting counterfactuals using rule-based and supervised statistical approaches. |
104 | Temporal Orientation of Tweets for Predicting Income of Users | Mohammed Hasanuzzaman, Sabyasachi Kamila, Mandeep Kaur, Sriparna Saha, Asif Ekbal | The current paper presents the first study where user cognitive structure is used to build a predictive model of income. |
105 | Character-Aware Neural Morphological Disambiguation | Alymzhan Toleu, Gulmira Tolegen, Aibek Makazhanov | Guided by the intuition that the correct analysis should be “most similar” to the context, we propose dense representations for morphological analyses and surface context and a simple yet effective way of combining the two to perform disambiguation. |
106 | Character Composition Model with Convolutional Neural Networks for Dependency Parsing on Morphologically Rich Languages | Xiang Yu, Ngoc Thang Vu | We present a transition-based dependency parser that uses a convolutional neural network to compose word representations from characters. |
107 | How (not) to train a dependency parser: The curious case of jackknifing part-of-speech taggers | Željko Agić, Natalie Schluter | On 26 languages, we reveal a preference that conflicts with, and surpasses, the ubiquitous ten-folding.