Paper Digest: EMNLP 2019 Highlights
Download EMNLP-2019-Paper-Digests.pdf – highlights of all ~680 EMNLP-2019 papers.
The Conference on Empirical Methods in Natural Language Processing (EMNLP) is one of the top natural language processing conferences in the world. In 2019, it was held in Hong Kong, China. There were 1,813 long paper submissions, of which 465 were accepted, and 1,063 short paper submissions, of which 218 were accepted. A large number of these papers also published their code (code download link).
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly grasp the main idea of each paper.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to receive updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: EMNLP 2019 Long/Short Papers
# | Title | Authors | Highlight |
---|---|---|---|
1 | Attending to Future Tokens for Bidirectional Sequence Generation | Carolin Lawrence, Bhushan Kotnis, Mathias Niepert | We propose to make the sequence generation process bidirectional by employing special placeholder tokens. |
2 | Attention is not not Explanation | Sarah Wiegreffe, Yuval Pinter | We propose four alternative tests to determine when/whether attention can be used as explanation: a simple uniform-weights baseline; a variance calibration based on multiple random seed runs; a diagnostic framework using frozen weights from pretrained models; and an end-to-end adversarial attention training protocol. |
3 | Practical Obstacles to Deploying Active Learning | David Lowell, Zachary C. Lipton, Byron C. Wallace | In this paper, we show that while AL may provide benefits when used with specific models and for particular domains, the benefits of current approaches do not generalize reliably across models and tasks. |
4 | Transfer Learning Between Related Tasks Using Expected Label Proportions | Matan Ben Noach, Yoav Goldberg | We propose a novel application of the XR framework for transfer learning between related tasks, where knowing the labels of task A provides an estimation of the label proportion of task B. |
5 | Knowledge Enhanced Contextual Word Representations | Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, Noah A. Smith | We propose a general method to embed multiple knowledge bases (KBs) into large scale models, and thereby enhance their representations with structured, human-curated knowledge. |
6 | How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings | Kawin Ethayarajh | This suggests that upper layers of contextualizing models produce more context-specific representations, much like how upper layers of LSTMs produce more task-specific representations. |
7 | Room to Glo: A Systematic Comparison of Semantic Change Detection Approaches with Word Embeddings | Philippa Shoemark, Farhana Ferdousi Liza, Dong Nguyen, Scott Hale, Barbara McGillivray | We propose a new evaluation framework for semantic change detection and find that (i) using the whole time series is preferable over only comparing between the first and last time points; (ii) independently trained and aligned embeddings perform better than continuously trained embeddings for long time periods; and (iii) that the reference point for comparison matters. |
8 | Correlations between Word Vector Sets | Vitalii Zhelezniak, April Shen, Daniel Busbridge, Aleksandar Savkov, Nils Hammerla | Just as cosine similarity is used to compare individual word vectors, we introduce a novel application of centered kernel alignment (CKA) as a natural generalisation of squared cosine similarity for sets of word vectors. |
9 | Game Theory Meets Embeddings: a Unified Framework for Word Sense Disambiguation | Rocco Tripodi, Roberto Navigli | Game-theoretic models, thanks to their intrinsic ability to exploit contextual information, have been shown to be particularly suited for the Word Sense Disambiguation task. |
10 | Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog | Ryuichi Takanobu, Hanlin Zhu, Minlie Huang | To this end, we propose Guided Dialog Policy Learning, a novel algorithm based on Adversarial Inverse Reinforcement Learning for joint reward estimation and policy optimization in multi-domain task-oriented dialog. |
11 | Multi-hop Selector Network for Multi-turn Response Selection in Retrieval-based Chatbots | Chunyuan Yuan, Wei Zhou, Mingming Li, Shangwen Lv, Fuqing Zhu, Jizhong Han, Songlin Hu | In this paper, we analyze the side effect of using too many context utterances and propose a multi-hop selector network (MSN) to alleviate the problem. |
12 | MoEL: Mixture of Empathetic Listeners | Zhaojiang Lin, Andrea Madotto, Jamin Shin, Peng Xu, Pascale Fung | In this paper, we propose a novel end-to-end approach for modeling empathy in dialogue systems: Mixture of Empathetic Listeners (MoEL). |
13 | Entity-Consistent End-to-end Task-Oriented Dialogue System with KB Retriever | Libo Qin, Yijia Liu, Wanxiang Che, Haoyang Wen, Yangming Li, Ting Liu | In this paper, we propose a novel framework which queries the KB in two steps to improve the consistency of generated entities. |
14 | Building Task-Oriented Visual Dialog Systems Through Alternative Optimization Between Dialog Policy and Language Generation | Mingyang Zhou, Josh Arnold, Zhou Yu | This paper proposes a novel framework that alternately trains an RL policy for image guessing and a supervised seq2seq model to improve dialog generation quality. |
15 | DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation | Deepanway Ghosal, Navonil Majumder, Soujanya Poria, Niyati Chhaya, Alexander Gelbukh | In this paper, we present Dialogue Graph Convolutional Network (DialogueGCN), a graph neural network based approach to ERC. |
16 | Knowledge-Enriched Transformer for Emotion Detection in Textual Conversations | Peixiang Zhong, Di Wang, Chunyan Miao | In this paper, we address these challenges by proposing a Knowledge-Enriched Transformer (KET), where contextual utterances are interpreted using hierarchical self-attention and external commonsense knowledge is dynamically leveraged using a context-aware affective graph attention mechanism. |
17 | Interpretable Relevant Emotion Ranking with Event-Driven Attention | Yang Yang, Deyu Zhou, Yulan He, Meng Zhang | In this paper, we propose a novel interpretable relevant emotion ranking model in which event information is incorporated into a deep learning architecture using event-driven attention. |
18 | Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects | Jianmo Ni, Jiacheng Li, Julian McAuley | We seek to introduce new datasets and methods to address the recommendation justification task. |
19 | Using Customer Service Dialogues for Satisfaction Analysis with Context-Assisted Multiple Instance Learning | Kaisong Song, Lidong Bing, Wei Gao, Jun Lin, Lujun Zhao, Jiancheng Wang, Changlong Sun, Xiaozhong Liu, Qiong Zhang | In this paper, we conduct a pilot study on the task of service satisfaction analysis (SSA) based on multi-turn CS dialogues. We construct two CS dialogue datasets from a top E-commerce platform. |
20 | Leveraging Dependency Forest for Neural Medical Relation Extraction | Linfeng Song, Yue Zhang, Daniel Gildea, Mo Yu, Zhiguo Wang, Jinsong Su | We investigate a method to alleviate this problem by utilizing dependency forests. |
21 | Open Relation Extraction: Relational Knowledge Transfer from Supervised Data to Unsupervised Data | Ruidong Wu, Yuan Yao, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun | To address this issue, we propose Relational Siamese Networks (RSNs) to learn similarity metrics of relations from labeled data of pre-defined relations, and then transfer the relational knowledge to identify novel relations in unlabeled data. |
22 | Improving Relation Extraction with Knowledge-attention | Pengfei Li, Kezhi Mao, Xuefeng Yang, Qi Li | We propose a novel knowledge-attention encoder which incorporates prior knowledge from external lexical resources into deep neural networks for relation extraction task. |
23 | Jointly Learning Entity and Relation Representations for Entity Alignment | Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao | This paper presents a novel joint learning framework for entity alignment. |
24 | Tackling Long-Tailed Relations and Uncommon Entities in Knowledge Graph Completion | Zihao Wang, Kwunping Lai, Piji Li, Lidong Bing, Wai Lam | Therefore, we propose a meta-learning framework that aims at handling infrequent relations with few-shot learning and uncommon entities by using textual descriptions. |
25 | Low-Resource Name Tagging Learned with Weakly Labeled Data | Yixin Cao, Zikun Hu, Tat-seng Chua, Zhiyuan Liu, Heng Ji | In this paper, we propose a novel neural model for name tagging solely based on weakly labeled (WL) data, so that it can be applied in any low-resource settings. |
26 | Learning Dynamic Context Augmentation for Global Entity Linking | Xiyuan Yang, Xiaotao Gu, Sheng Lin, Siliang Tang, Yueting Zhuang, Fei Wu, Zhigang Chen, Guoping Hu, Xiang Ren | In this paper, we propose a simple yet effective solution, called Dynamic Context Augmentation (DCA), for collective EL, which requires only one pass through the mentions in a document. |
27 | Open Event Extraction from Online Text using a Generative Adversarial Network | Rui Wang, Deyu Zhou, Yulan He | To address these limitations, we propose an event extraction model based on Generative Adversarial Nets, called Adversarial-neural Event Model (AEM). |
28 | Learning to Bootstrap for Entity Set Expansion | Lingyong Yan, Xianpei Han, Le Sun, Ben He | To address the above two problems, we propose a novel bootstrapping method combining the Monte Carlo Tree Search (MCTS) algorithm with a deep similarity network, which can efficiently estimate delayed feedback for pattern evaluation and adaptively score entities given sparse supervision signals. |
29 | Multi-Input Multi-Output Sequence Labeling for Joint Extraction of Fact and Condition Tuples from Scientific Text | Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh Chawla, Meng Jiang | In this work, we propose a new sequence labeling framework (as well as a new tag schema) to jointly extract the fact and condition tuples from statement sentences. |
30 | Cross-lingual Structure Transfer for Relation and Event Extraction | Ananya Subburathinam, Di Lu, Heng Ji, Jonathan May, Shih-Fu Chang, Avirup Sil, Clare Voss | We investigate the suitability of cross-lingual structure transfer techniques for these tasks. |
31 | Uncover the Ground-Truth Relations in Distant Supervision: A Neural Expectation-Maximization Framework | Junfan Chen, Richong Zhang, Yongyi Mao, Hongyu Guo, Jie Xu | To cope with this challenge, we propose a novel label-denoising framework that combines a neural network with probabilistic modelling, which naturally takes into account the noisy labels during learning. |
32 | Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction | Shun Zheng, Wei Cao, Wei Xu, Jiang Bian | To address these challenges, we propose a novel end-to-end model, Doc2EDAG, which can generate an entity-based directed acyclic graph to fulfill the document-level EE (DEE) effectively. To demonstrate the effectiveness of Doc2EDAG, we build a large-scale real-world dataset consisting of Chinese financial announcements with the challenges mentioned above. |
33 | Event Detection with Trigger-Aware Lattice Neural Network | Ning Ding, Ziran Li, Zhiyuan Liu, Haitao Zheng, Zibo Lin | To address the two issues simultaneously, we propose the Trigger-aware Lattice Neural Network (TLNN). |
34 | A Boundary-aware Neural Model for Nested Named Entity Recognition | Changmeng Zheng, Yi Cai, Jingyun Xu, Ho-fung Leung, Guandong Xu | We propose a boundary-aware neural model for nested NER which leverages entity boundaries to predict entity categorical labels. |
35 | Learning the Extraction Order of Multiple Relational Facts in a Sentence with Reinforcement Learning | Xiangrong Zeng, Shizhu He, Daojian Zeng, Kang Liu, Shengping Liu, Jun Zhao | In this paper we argue that the extraction order is important in this task. |
36 | CaRe: Open Knowledge Graph Embeddings | Swapnil Gupta, Sreyash Kenkre, Partha Talukdar | We fill this gap in the paper and propose Canonicalization-infused Representations (CaRe) for OpenKGs. |
37 | Self-Attention Enhanced CNNs and Collaborative Curriculum Learning for Distantly Supervised Relation Extraction | Yuyun Huang, Jinhua Du | In this paper, we propose a novel model that employs a collaborative curriculum learning framework to reduce the effects of mislabelled data. |
38 | Neural Cross-Lingual Relation Extraction Based on Bilingual Word Embedding Mapping | Jian Ni, Radu Florian | In this paper, we propose a new approach for cross-lingual RE model transfer based on bilingual word embedding mapping. |
39 | Leveraging 2-hop Distant Supervision from Table Entity Pairs for Relation Extraction | Xiang Deng, Huan Sun | In this paper, we introduce a new strategy named 2-hop DS to enhance distantly supervised RE, based on the observation that there exist a large number of relational tables on the Web which contain entity pairs that share common relations. |
40 | EntEval: A Holistic Evaluation Benchmark for Entity Representations | Mingda Chen, Zewei Chu, Yang Chen, Karl Stratos, Kevin Gimpel | In this work, we propose EntEval: a test suite of diverse tasks that require nontrivial understanding of entities including entity typing, entity similarity, entity relation prediction, and entity disambiguation. |
41 | Joint Event and Temporal Relation Extraction with Shared Representations and Structured Prediction | Rujun Han, Qiang Ning, Nanyun Peng | We propose a joint event and temporal relation extraction model with shared representation learning and structured prediction. |
42 | Hierarchical Text Classification with Reinforced Label Assignment | Yuning Mao, Jingjing Tian, Jiawei Han, Xiang Ren | To solve the mismatch between training and inference as well as modeling label dependencies in a more principled way, we formulate HTC as a Markov decision process and propose to learn a Label Assignment Policy via deep reinforcement learning to determine where to place an object and when to stop the assignment process. |
43 | Investigating Capsule Network and Semantic Feature on Hyperplanes for Text Classification | Chunning Du, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao, Chun Wang, Bing Ma | Therefore, we propose to use capsule networks to construct the vectorized representation of semantics and utilize hyperplanes to decompose each capsule to acquire the specific senses. |
44 | Label-Specific Document Representation for Multi-Label Text Classification | Lin Xiao, Xin Huang, Boli Chen, Liping Jing | In this paper, we propose a Label-Specific Attention Network (LSAN) to learn a label-specific document representation. |
45 | Hierarchical Attention Prototypical Networks for Few-Shot Text Classification | Shengli Sun, Qingfeng Sun, Kevin Zhou, Tengchao Lv | In this work, we propose a hierarchical attention prototypical networks (HAPN) for few-shot text classification. |
46 | Many Faces of Feature Importance: Comparing Built-in and Post-hoc Feature Importance in Text Classification | Vivian Lai, Zheng Cai, Chenhao Tan | In this work, we systematically compare feature importance from built-in mechanisms in a model such as attention values and post-hoc methods that approximate model behavior such as LIME. |
47 | Enhancing Local Feature Extraction with Global Representation for Neural Text Classification | Guocheng Niu, Hengru Xu, Bolei He, Xinyan Xiao, Hua Wu, Sheng Gao | This paper proposes a novel Encoder1-Encoder2 architecture, where global information is incorporated into the procedure of local feature extraction from scratch. |
48 | Latent-Variable Generative Models for Data-Efficient Text Classification | Xiaoan Ding, Kevin Gimpel | In this paper, we improve generative text classifiers by introducing discrete latent variables into the generative story, and explore several graphical model configurations. |
49 | PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space | Omer Anjum, Hongyu Gong, Suma Bhat, Wen-Mei Hwu, JinJun Xiong | Our approach, the common topic model, jointly models the topics common to the submission and the reviewer’s profile while relying on abstract topic vectors. |
50 | Linking artificial and human neural representations of language | Jon Gauthier, Roger Levy | What information from an act of sentence understanding is robustly represented in the human brain? We investigate this question by comparing sentence encoding models on a brain decoding task, where the sentence that an experimental participant has seen must be predicted from the fMRI signal evoked by the sentence. |
51 | Neural Text Summarization: A Critical Evaluation | Wojciech Kryscinski, Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher | We critically evaluate key ingredients of the current research setup: datasets, evaluation metrics, and models, and highlight three primary shortcomings: 1) automatically collected datasets leave the task underconstrained and may contain noise detrimental to training and evaluation, 2) current evaluation protocol is weakly correlated with human judgment and does not account for important characteristics such as factual correctness, 3) models overfit to layout biases of current datasets and offer limited diversity in their outputs. |
52 | Neural data-to-text generation: A comparison between pipeline and end-to-end architectures | Thiago Castro Ferreira, Chris van der Lee, Emiel van Miltenburg, Emiel Krahmer | This study introduces a systematic comparison between neural pipeline and end-to-end data-to-text approaches for the generation of text from RDF triples. |
53 | MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance | Wei Zhao, Maxime Peyrard, Fei Liu, Yang Gao, Christian M. Meyer, Steffen Eger | In this paper we investigate strategies to encode system and reference texts to devise a metric that shows a high correlation with human judgment of text quality. |
54 | Select and Attend: Towards Controllable Content Selection in Text Generation | Xiaoyu Shen, Jun Suzuki, Kentaro Inui, Hui Su, Dietrich Klakow, Satoshi Sekine | This paper tackles this problem by decoupling content selection from the decoder. |
55 | Sentence-Level Content Planning and Style Specification for Neural Text Generation | Xinyu Hua, Lu Wang | To address these issues, we present an end-to-end trained two-step generation model, where a sentence-level content planner first decides on the keyphrases to cover as well as a desired language style, followed by a surface realization decoder that generates relevant and coherent text. |
56 | Translate and Label! An Encoder-Decoder Approach for Cross-lingual Semantic Role Labeling | Angel Daza, Anette Frank | We propose a Cross-lingual Encoder-Decoder model that simultaneously translates and generates sentences with Semantic Role Labeling annotations in a resource-poor target language. |
57 | Syntax-Enhanced Self-Attention-Based Semantic Role Labeling | Yue Zhang, Rui Wang, Luo Si | We present different approaches to encoding syntactic information derived from dependency trees of varying quality and representations; we propose a syntax-enhanced self-attention model and compare it with two other strong baseline methods; and we conduct experiments with newly published deep contextualized word representations as well. |
58 | VerbAtlas: a Novel Large-Scale Verbal Semantic Resource and Its Application to Semantic Role Labeling | Andrea Di Fabio, Simone Conia, Roberto Navigli | We present VerbAtlas, a new, hand-crafted lexical-semantic resource whose goal is to bring together all verbal synsets from WordNet into semantically-coherent frames. |
59 | Parameter-free Sentence Embedding via Orthogonal Basis | Ziyi Yang, Chenguang Zhu, Weizhu Chen | We propose a simple and robust non-parameterized approach for building sentence representations. |
60 | Evaluation Benchmarks and Learning Criteria for Discourse-Aware Sentence Representations | Mingda Chen, Zewei Chu, Kevin Gimpel | We benchmark sentence encoders pretrained with our proposed training objectives, as well as other popular pretrained sentence encoders on DiscoEval and other sentence evaluation tasks. |
61 | Extracting Possessions from Social Media: Images Complement Language | Dhivya Chinnappa, Srikala Murugan, Eduardo Blanco | This paper describes a new dataset and experiments to determine whether authors of tweets possess the objects they tweet about. |
62 | Learning to Speak and Act in a Fantasy Text Adventure Game | Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, Jason Weston | We introduce a large-scale crowdsourced text adventure game as a research platform for studying grounded dialogue. |
63 | Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning | Khanh Nguyen, Hal Daumé III | We develop “Help, Anna!” (HANNA), an interactive photo-realistic simulator in which an agent fulfills object-finding tasks by requesting and interpreting natural language-and-vision assistance. |
64 | Incorporating Visual Semantics into Sentence Representations within a Grounded Space | Patrick Bordes, Eloi Zablocki, Laure Soulier, Benjamin Piwowarski, Patrick Gallinari | To overcome this limitation, we propose to transfer visual information to textual representations by learning an intermediate representation space: the grounded space. |
65 | Neural Naturalist: Generating Fine-Grained Image Comparisons | Maxwell Forbes, Christine Kaeser-Chen, Piyush Sharma, Serge Belongie | We introduce the new Birds-to-Words dataset of 41k sentences describing fine-grained differences between photographs of birds. We propose a new model called Neural Naturalist that uses a joint image encoding and comparative module to generate comparative language, and evaluate the results with humans who must use the descriptions to distinguish real images. |
66 | Fine-Grained Evaluation for Entity Linking | Henry Rosales-Méndez, Aidan Hogan, Barbara Poblete | We propose a fuzzy recall metric to address the lack of consensus and conclude with fine-grained evaluation results comparing a selection of online EL systems. |
67 | Supervising Unsupervised Open Information Extraction Models | Arpita Roy, Youngja Park, Taesung Lee, Shimei Pan | We propose a novel supervised open information extraction (Open IE) framework that leverages an ensemble of unsupervised Open IE systems and a small amount of labeled data to improve system performance. |
68 | Neural Cross-Lingual Event Detection with Minimal Parallel Resources | Jian Liu, Yubo Chen, Kang Liu, Jun Zhao | In this paper, we propose a new method for cross-lingual ED, demonstrating a minimal dependency on parallel resources. |
69 | KnowledgeNet: A Benchmark Dataset for Knowledge Base Population | Filipe Mesquita, Matteo Cannaviccio, Jordan Schmidek, Paramita Mirza, Denilson Barbosa | KnowledgeNet is a benchmark dataset for the task of automatically populating a knowledge base (Wikidata) with facts expressed in natural language text on the web. |
70 | Effective Use of Transformer Networks for Entity Tracking | Aditya Gupta, Greg Durrett | In this paper, we explore the use of pre-trained transformer networks for entity tracking tasks in procedural text. |
71 | Explicit Cross-lingual Pre-training for Unsupervised Machine Translation | Shuo Ren, Yu Wu, Shujie Liu, Ming Zhou, Shuai Ma | In this paper, we propose a novel cross-lingual pre-training method for unsupervised machine translation by incorporating explicit cross-lingual training signals. |
72 | Latent Part-of-Speech Sequences for Neural Machine Translation | Xuewen Yang, Yingru Liu, Dongliang Xie, Xin Wang, Niranjan Balasubramanian | In this work, we introduce a new latent variable model, LaSyn, that captures the co-dependence between syntax and semantics, while allowing for effective and efficient inference over the latent space. |
73 | Improving Back-Translation with Uncertainty-based Confidence Estimation | Shuo Wang, Yang Liu, Chao Wang, Huanbo Luan, Maosong Sun | In this work, we propose to quantify the confidence of NMT model predictions based on model uncertainty. |
74 | Towards Linear Time Neural Machine Translation with Capsule Networks | Mingxuan Wang | To the best of our knowledge, this is the first work in which capsule networks have been empirically investigated for sequence-to-sequence problems. |
75 | Modeling Multi-mapping Relations for Precise Cross-lingual Entity Alignment | Xiaofei Shi, Yanghua Xiao | To solve this issue, we propose a new embedding-based framework. |
76 | Supervised and Nonlinear Alignment of Two Embedding Spaces for Dictionary Induction in Low Resourced Languages | Masud Moshtaghi | In this study, we first describe the general requirements for the success of these techniques and then present a noise tolerant piecewise linear technique to learn a non-linear mapping between two monolingual word embedding vector spaces. |
77 | Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT | Shijie Wu, Mark Dredze | This paper explores the broader cross-lingual potential of mBERT (multilingual) as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing. |
78 | Iterative Dual Domain Adaptation for Neural Machine Translation | Jiali Zeng, Yang Liu, Jinsong Su, Yubing Ge, Yaojie Lu, Yongjing Yin, Jiebo Luo | In this paper, we argue that such a strategy fails to fully extract the domain-shared translation knowledge, and repeatedly utilizing corpora of different domains can lead to better distillation of domain-shared translation knowledge. |
79 | Multi-agent Learning for Neural Machine Translation | Tianchi Bi, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang | In this paper, we extend the training framework to the multi-agent scenario by introducing diverse agents in an interactive updating process. |
80 | Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages | Yunsu Kim, Petre Petrov, Pavel Petrushkov, Shahram Khadivi, Hermann Ney | We propose three methods to increase the relation among source, pivot, and target languages in the pre-training: 1) step-wise training of a single model for different language pairs, 2) additional adapter component to smoothly connect pre-trained encoder and decoder, and 3) cross-lingual encoder training via autoencoding of the pivot language. |
81 | Context-Aware Monolingual Repair for Neural Machine Translation | Elena Voita, Rico Sennrich, Ivan Titov | We propose a monolingual DocRepair model to correct inconsistencies between sentence-level translations. |
82 | Multi-Granularity Self-Attention for Neural Machine Translation | Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu | In this work, we present multi-granularity self-attention (Mg-Sa): a neural network that combines multi-head self-attention and phrase modeling. |
83 | Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention | Biao Zhang, Ivan Titov, Rico Sennrich | We propose depth-scaled initialization (DS-Init), which decreases parameter variance at the initialization stage, and reduces output variance of residual connections so as to ease gradient back-propagation through normalization layers. |
84 | A Discriminative Neural Model for Cross-Lingual Word Alignment | Elias Stengel-Eskin, Tzu-ray Su, Matt Post, Benjamin Van Durme | We introduce a novel discriminative word alignment model, which we integrate into a Transformer-based machine translation model. |
85 | One Model to Learn Both: Zero Pronoun Prediction and Translation | Longyue Wang, Zhaopeng Tu, Xing Wang, Shuming Shi | In this paper, we propose a unified and discourse-aware ZP translation approach for neural MT models. |
86 | Dynamic Past and Future for Neural Machine Translation | Zaixiang Zheng, Shujian Huang, Zhaopeng Tu, Xin-Yu Dai, Jiajun Chen | In this paper, we propose to model the dynamic principles by explicitly separating source words into groups of translated and untranslated contents through parts-to-wholes assignment. |
87 | Revisit Automatic Error Detection for Wrong and Missing Translation — A Supervised Approach | Wenqiang Lei, Weiwen Xu, Ai Ti Aw, Yuanxin Xiang, Tat-Seng Chua | To have a closer study of these issues and accelerate model development, we propose automatically detecting adequacy errors in MT hypotheses for MT model evaluation. |
88 | Towards Understanding Neural Machine Translation with Word Importance | Shilin He, Zhaopeng Tu, Xing Wang, Longyue Wang, Michael Lyu, Shuming Shi | In this work, we propose to address this gap by focusing on understanding the input-output behavior of NMT models. |
89 | Multilingual Neural Machine Translation with Language Clustering | Xu Tan, Jiale Chen, Di He, Yingce Xia, Tao QIN, Tie-Yan Liu | In this work, we develop a framework that clusters languages into different groups and trains one multilingual model for each cluster. |
90 | Don’t Forget the Long Tail! A Comprehensive Analysis of Morphological Generalization in Bilingual Lexicon Induction | Paula Czarnowska, Sebastian Ruder, Edouard Grave, Ryan Cotterell, Ann Copestake | In this work, we investigate whether state-of-the-art bilingual lexicon inducers are capable of learning this kind of generalization. |
91 | Pushing the Limits of Low-Resource Morphological Inflection | Antonios Anastasopoulos, Graham Neubig | In response, we propose a battery of improvements that greatly improve performance under such low-resource conditions. |
92 | Cross-Lingual Dependency Parsing Using Code-Mixed TreeBank | Meishan Zhang, Yue Zhang, Guohong Fu | To address this problem, we investigate syntactic transfer by code mixing, translating only confident words in a source treebank. |
93 | Hierarchical Pointer Net Parsing | Linlin Liu, Xiang Lin, Shafiq Joty, Simeng Han, Lidong Bing | In this paper, we propose hierarchical pointer network parsers, and apply them to dependency and sentence-level discourse parsing tasks. |
94 | Semi-Supervised Semantic Role Labeling with Cross-View Training | Rui Cai, Mirella Lapata | We propose an end-to-end SRL model and demonstrate it can effectively leverage unlabeled data under the cross-view training modeling paradigm. |
95 | Low-Resource Sequence Labeling via Unsupervised Multilingual Contextualized Representations | Zuyi Bao, Rui Huang, Chen Li, Kenny Zhu | In this work, we propose a Multilingual Language Model with deep semantic Alignment (MLMA) to generate language-independent representations for cross-lingual sequence labeling. |
96 | A Lexicon-Based Graph Neural Network for Chinese NER | Tao Gui, Yicheng Zou, Qi Zhang, Minlong Peng, Jinlan Fu, Zhongyu Wei, Xuanjing Huang | In this work, we try to alleviate this problem by introducing a lexicon-based graph neural network with global semantics, in which lexicon knowledge is used to connect characters to capture the local composition, while a global relay node can capture global sentence semantics and long-range dependency. |
97 | CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding | Yijin Liu, Fandong Meng, Jinchao Zhang, Jie Zhou, Yufeng Chen, Jinan Xu | To address this issue, in this paper we propose a novel Collaborative Memory Network (CM-Net) based on the well-designed block, named CM-block. |
98 | Tree Transformer: Integrating Tree Structures into Self-Attention | Yaushian Wang, Hung-Yi Lee, Yun-Nung Chen | This paper proposes Tree Transformer, which adds an extra constraint to attention heads of the bidirectional Transformer encoder in order to encourage the attention heads to follow tree structures. |
99 | Semantic Role Labeling with Iterative Structure Refinement | Chunchuan Lyu, Shay B. Cohen, Ivan Titov | We model interactions between argument labeling decisions through iterative refinement. |
100 | Entity Projection via Machine Translation for Cross-Lingual NER | Alankar Jain, Bhargavi Paranjape, Zachary C. Lipton | We propose a system that improves over prior entity-projection methods by: (a) leveraging machine translation systems twice: first for translating sentences and subsequently for translating entities; (b) matching entities based on orthographic and phonetic similarity; and (c) identifying matches based on distributional statistics derived from the dataset. |
101 | A Bayesian Approach for Sequence Tagging with Crowds | Edwin D. Simpson, Iryna Gurevych | To address this, we propose a Bayesian method for aggregating sequence tags that reduces errors by modelling sequential dependencies between the annotations as well as the ground-truth labels. |
102 | A systematic comparison of methods for low-resource dependency parsing on genuinely low-resource languages | Clara Vania, Yova Kementchedjhieva, Anders Søgaard, Adam Lopez | We systematically compare a set of simple strategies for improving low-resource parsers: data augmentation, which has not been tested before; cross-lingual training; and transliteration. |
103 | Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing | Tao Meng, Nanyun Peng, Kai-Wei Chang | In this paper, we show that weak supervisions of linguistic knowledge for the target languages can improve a cross-lingual graph-based dependency parser substantially. |
104 | Look-up and Adapt: A One-shot Semantic Parser | Zhichu Lu, Forough Arabshahi, Igor Labutov, Tom Mitchell | In this paper, we propose a semantic parser that generalizes to out-of-domain examples by learning a general strategy for parsing an unseen utterance through adapting the logical forms of seen utterances, instead of learning to generate a logical form from scratch. |
105 | Similarity Based Auxiliary Classifier for Named Entity Recognition | Shiyuan Xiao, Yuanxin Ouyang, Wenge Rong, Jianxin Yang, Zhang Xiong | Inspired by previous work in which a multi-task strategy is used to solve segmentation problems, we design a similarity based auxiliary classifier (SAC), which can distinguish entity words from non-entity words. |
106 | Variable beam search for generative neural parsing and its relevance for the analysis of neuro-imaging signal | Benoit Crabbé, Murielle Fabre, Christophe Pallier | This paper describes a method of variable beam size inference for Recurrent Neural Network Grammars (RNNG) by drawing inspiration from sequential Monte-Carlo methods such as particle filtering. |
107 | Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets | Mor Geva, Yoav Goldberg, Jonathan Berant | In this paper, we perform a series of experiments showing these concerns are evident in three recent NLP datasets. |
108 | Robust Text Classifier on Test-Time Budgets | Md Rizwan Parvez, Tolga Bolukbasi, Kai-Wei Chang, Venkatesh Saligrama | To this end, we propose a data aggregation method to train the classifier, allowing it to achieve competitive performance on fractured sentences. |
109 | Commonsense Knowledge Mining from Pretrained Models | Joe Davison, Joshua Feldman, Alexander Rush | In this work, we develop a method for generating commonsense knowledge using a large, pre-trained bidirectional language model. |
110 | RNN Architecture Learning with Sparse Regularization | Jesse Dodge, Roy Schwartz, Hao Peng, Noah A. Smith | We present a structure learning method for learning sparse, parameter-efficient NLP models. |
111 | Analytical Methods for Interpretable Ultradense Word Embeddings | Philipp Dufter, Hinrich Schütze | In this work, we investigate three methods for making word spaces interpretable by rotation: Densifier (Rothe et al., 2016), linear SVMs and DensRay, a new method we propose. |
112 | Investigating Meta-Learning Algorithms for Low-Resource Natural Language Understanding Tasks | Zi-Yi Dou, Keyi Yu, Antonios Anastasopoulos | Inspired by the recent success of optimization-based meta-learning algorithms, in this paper, we explore the model-agnostic meta-learning algorithm (MAML) and its variants for low-resource NLU tasks. |
113 | Retrofitting Contextualized Word Embeddings with Paraphrases | Weijia Shi, Muhao Chen, Pei Zhou, Kai-Wei Chang | To address this issue, we propose a post-processing approach to retrofit the embedding with paraphrases. |
114 | Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling | Linqing Liu, Wei Yang, Jinfeng Rao, Raphael Tang, Jimmy Lin | However, such structure priors have not been well exploited in previous work for semantic modeling. To examine their effectiveness, we start with the Pairwise Word Interaction Model, one of the best models according to a recent reproducibility study, then introduce components for modeling context and structure using multi-layer BiLSTMs and TreeLSTMs. |
115 | Neural Linguistic Steganography | Zachary Ziegler, Yuntian Deng, Alexander Rush | We propose a steganography technique based on arithmetic coding with large-scale neural language models. |
116 | The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization | Simeng Sun, Ani Nenkova | Here we present a suite of experiments on using distributed representations for evaluating summarizers, both in reference-based and in reference-free setting. |
117 | Attention Optimization for Abstractive Document Summarization | Min Gui, Junfeng Tian, Rui Wang, Zhenglu Yang | We propose an attention refinement unit paired with a local variance loss to impose supervision on the attention model at each decoding step, and we also propose a global variance loss to optimize the attention distributions of all decoding steps from a global perspective. |
118 | Rewarding Coreference Resolvers for Being Consistent with World Knowledge | Rahul Aralikatte, Heather Lent, Ana Valeria Gonzalez, Daniel Herschcovich, Chen Qiu, Anders Sandholm, Michael Ringaard, Anders Søgaard | We show how to improve coreference resolvers by forwarding their input to a relation extraction system and rewarding the resolvers for producing triples that are found in knowledge bases. |
119 | An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction | Shun Kiyono, Jun Suzuki, Masato Mita, Tomoya Mizumoto, Kentaro Inui | In this study, these choices are investigated through extensive experiments, and state-of-the-art performance is achieved on the CoNLL-2014 test set (F0.5=65.0) and the official test set of the BEA-2019 shared task (F0.5=70.2) without making any modifications to the model architecture. |
120 | A Multilingual Topic Model for Learning Weighted Topic Links Across Corpora with Low Comparability | Weiwei Yang, Jordan Boyd-Graber, Philip Resnik | We introduce a new model that does not rely on this assumption, particularly useful in important low-resource language scenarios. |
121 | Measure Country-Level Socio-Economic Indicators with Streaming News: An Empirical Study | Bonan Min, Xiaoxi Zhao | In this paper, we propose Event-Centric Indicator Measure (ECIM), a novel approach to measure socio-economic indicators with events. |
122 | Towards Extracting Medical Family History from Natural Language Interactions: A New Dataset and Baselines | Mahmoud Azab, Stephane Dadian, Vivi Nastase, Larry An, Rada Mihalcea | We introduce a new dataset consisting of natural language interactions annotated with medical family histories, obtained during interactions with a genetic counselor and through crowdsourcing, following a questionnaire created by experts in the domain. |
123 | Multi-task Learning for Natural Language Generation in Task-Oriented Dialogue | Chenguang Zhu, Michael Zeng, Xuedong Huang | In this paper, we propose a novel multi-task learning framework, NLG-LM, for natural language generation. |
124 | Dirichlet Latent Variable Hierarchical Recurrent Encoder-Decoder in Dialogue Generation | Min Zeng, Yisen Wang, Yuan Luo | To address the issues, we propose to use the Dirichlet distribution with flexible structures to characterize the latent variables in place of the traditional Gaussian distribution, called Dirichlet Latent Variable Hierarchical Recurrent Encoder-Decoder model (Dir-VHRED). |
125 | Semi-Supervised Bootstrapping of Dialogue State Trackers for Task-Oriented Modelling | Bo-Hsiang Tseng, Marek Rei, Paweł Budzianowski, Richard Turner, Bill Byrne, Anna Korhonen | In this paper, we investigate semi-supervised learning methods that are able to reduce the amount of required intermediate labelling. |
126 | A Progressive Model to Enable Continual Learning for Semantic Slot Filling | Yilin Shen, Xiangyu Zeng, Hongxia Jin | In this paper, we introduce a novel progressive slot filling model, ProgModel. |
127 | CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots | Arshit Gupta, Peng Zhang, Garima Lalwani, Mona Diab | In this work, we propose a context-aware self-attentive NLU (CASA-NLU) model that uses multiple signals over a variable context window, such as previous intents, slots, dialog acts and utterances, in addition to the current user utterance. |
128 | Sampling Matters! An Empirical Study of Negative Sampling Strategies for Learning of Matching Models in Retrieval-based Dialogue Systems | Jia Li, Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan | We study how to sample negative examples to automatically construct a training set for effective model learning in retrieval-based dialogue systems. |
129 | Zero-shot Cross-lingual Dialogue Systems with Transferable Latent Variables | Zihan Liu, Jamin Shin, Yan Xu, Genta Indra Winata, Peng Xu, Andrea Madotto, Pascale Fung | Hence, we propose a zero-shot adaptation of task-oriented dialogue system to low-resource languages. |
130 | Modeling Multi-Action Policy for Task-Oriented Dialogues | Lei Shu, Hu Xu, Bing Liu, Piero Molino | In this paper, we compare the performance of several models on the task of predicting multiple acts for each turn. |
131 | An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction | Stefan Larson, Anish Mahendran, Joseph J. Peper, Christopher Clarke, Andrew Lee, Parker Hill, Jonathan K. Kummerfeld, Kevin Leach, Michael A. Laurenzano, Lingjia Tang, Jason Mars | We introduce a new dataset that includes queries that are out-of-scope—i.e., queries that do not fall into any of the system’s supported intents. |
132 | Automatically Learning Data Augmentation Policies for Dialogue Tasks | Tong Niu, Mohit Bansal | In our work, we adapt AutoAugment to automatically discover effective perturbation policies for natural language processing (NLP) tasks such as dialogue generation. |
133 | uniblock: Scoring and Filtering Corpus with Unicode Block Information | Yingbo Gao, Weiyue Wang, Hermann Ney | In this paper, we introduce a simple statistical method, uniblock, to overcome this problem. |
134 | Multilingual word translation using auxiliary languages | Hagai Taitelbaum, Gal Chechik, Jacob Goldberger | In this study we propose a multilingual translation procedure that uses all the learned mappings to translate a word from one language to another. |
135 | Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons | Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu | With the belief that modeling hierarchical structure captures an essential complementarity between SANs and RNNs, we propose to further enhance the strength of hybrid models with an advanced variant of RNNs — Ordered Neurons LSTM (ON-LSTM), which introduces a syntax-oriented inductive bias to perform tree-like composition. |
136 | Vecalign: Improved Sentence Alignment in Linear Time and Space | Brian Thompson, Philipp Koehn | We introduce Vecalign, a novel bilingual sentence alignment method which is linear in time and space with respect to the number of sentences being aligned and which requires only bilingual sentence embeddings. |
137 | Simpler and Faster Learning of Adaptive Policies for Simultaneous Translation | Baigong Zheng, Renjie Zheng, Mingbo Ma, Liang Huang | To combine the merits of both approaches, we propose a simple supervised-learning framework to learn an adaptive policy from oracle READ/WRITE sequences generated from parallel text. |
138 | Adversarial Learning with Contextual Embeddings for Zero-resource Cross-lingual Classification and NER | Phillip Keung, Yichao Lu, Vikas Bhardwaj | We improve upon multilingual BERT’s zero-resource cross-lingual performance via adversarial learning. |
139 | Recurrent Positional Embedding for Neural Machine Translation | Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita | To address this issue, this work proposes a recurrent positional embedding approach based on word vectors. |
140 | Machine Translation for Machines: the Sentiment Classification Use Case | Amirhossein Tebbifakhr, Luisa Bentivogli, Matteo Negri, Marco Turchi | We propose a neural machine translation (NMT) approach that, instead of pursuing adequacy and fluency (“human-oriented” quality criteria), aims to generate translations that are best suited as input to a natural language processing component designed for a specific downstream task (a “machine-oriented” criterion). |
141 | Investigating the Effectiveness of BPE: The Power of Shorter Sequences | Matthias Gallé | We link BPE to the broader family of dictionary-based compression algorithms and compare it with other members of this family. |
142 | HABLex: Human Annotated Bilingual Lexicons for Experiments in Machine Translation | Brian Thompson, Rebecca Knowles, Xuan Zhang, Huda Khayrallah, Kevin Duh, Philipp Koehn | In this work, we present the HABLex dataset, designed to test methods for bilingual lexicon integration into neural machine translation. |
143 | Handling Syntactic Divergence in Low-resource Machine Translation | Chunting Zhou, Xuezhe Ma, Junjie Hu, Graham Neubig | In this paper, we propose a simple yet effective solution, whereby target-language sentences are re-ordered to match the order of the source and used as an additional source of training-time supervision. |
144 | Speculative Beam Search for Simultaneous Translation | Renjie Zheng, Mingbo Ma, Baigong Zheng, Liang Huang | To address this challenge, we propose a new speculative beam search algorithm that hallucinates several steps into the future in order to reach a more accurate decision by implicitly benefiting from a target language model. |
145 | Self-Attention with Structural Position Representations | Xing Wang, Zhaopeng Tu, Longyue Wang, Shuming Shi | In this work, we propose to augment SANs with structural position representations to model the latent structure of the input sentence, which is complementary to the standard sequential positional representations. |
146 | Exploiting Multilingualism through Multistage Fine-Tuning for Low-Resource Neural Machine Translation | Raj Dabre, Atsushi Fujita, Chenhui Chu | This paper highlights the impressive utility of multi-parallel corpora for transfer learning in a one-to-many low-resource neural machine translation (NMT) setting. |
147 | Unsupervised Domain Adaptation for Neural Machine Translation with Domain-Aware Feature Embeddings | Zi-Yi Dou, Junjie Hu, Antonios Anastasopoulos, Graham Neubig | In this work, we propose an approach that adapts models with domain-aware feature embeddings, which are learned via an auxiliary language modeling task. |
148 | A Regularization-based Framework for Bilingual Grammar Induction | Yong Jiang, Wenjuan Han, Kewei Tu | We propose three regularization methods that encourage similarity between model parameters, dependency edge scores, and parse trees respectively. |
149 | Encoders Help You Disambiguate Word Senses in Neural Machine Translation | Gongbo Tang, Rico Sennrich, Joakim Nivre | In this paper, we explore the ability of NMT encoders and decoders to disambiguate word senses by evaluating hidden states and investigating the distributions of self-attention. |
150 | Korean Morphological Analysis with Tied Sequence-to-Sequence Multi-Task Model | Hyun-Je Song, Seong-Bae Park | This paper formulates Korean morphological analysis as a combination of the tasks and presents a tied sequence-to-sequence multi-task model for training the two tasks simultaneously without any explicit regularization. |
151 | Efficient Convolutional Neural Networks for Diacritic Restoration | Sawsan Alqahtani, Ajay Mishra, Mona Diab | As diacritic restoration benefits from both previous as well as subsequent timesteps, we further apply and evaluate a variant of TCN, Acausal TCN (A-TCN), which incorporates context from both directions (previous and future) rather than strictly incorporating previous context as in the case of TCN. |
152 | Improving Generative Visual Dialog by Answering Diverse Questions | Vishvak Murahari, Prithvijit Chattopadhyay, Dhruv Batra, Devi Parikh, Abhishek Das | To improve this, we devise a simple auxiliary objective that incentivizes Q-Bot to ask diverse questions, thus reducing repetitions and in turn enabling A-Bot to explore a larger state space during RL, i.e., be exposed to more visual concepts to talk about, and varied questions to answer. |
153 | Cross-lingual Transfer Learning with Data Selection for Large-Scale Spoken Language Understanding | Quynh Do, Judith Gaspers | In this paper, we address this question and propose a simple but effective language model based source-language data selection method for cross-lingual transfer learning in large-scale spoken language understanding. |
154 | Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations | Po-Yao Huang, Xiaojun Chang, Alexander Hauptmann | With the aim of promoting and understanding the multilingual version of image search, we leverage visual object detection and propose a model with diverse multi-head attention to learn grounded multilingual multimodal representations. |
155 | Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering | Soravit Changpinyo, Bo Pang, Piyush Sharma, Radu Soricut | In this paper, we examine the effect of decoupling box proposal and featurization for downstream tasks. |
156 | REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning | Ming Jiang, Junjie Hu, Qiuyuan Huang, Lei Zhang, Jana Diesner, Jianfeng Gao | In this study, we present a fine-grained evaluation method REO for automatically measuring the performance of image captioning systems. |
157 | WSLLN: Weakly Supervised Natural Language Localization Networks | Mingfei Gao, Larry Davis, Richard Socher, Caiming Xiong | We propose weakly supervised language localization networks (WSLLN) to detect events in long, untrimmed videos given language queries. |
158 | Grounding learning of modifier dynamics: An application to color naming | Xudong Han, Philip Schulz, Trevor Cohn | We present a model of color modifiers that, compared with previous additive models in RGB space, learns more complex transformations. |
159 | Robust Navigation with Language Pretraining and Stochastic Sampling | Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah A. Smith, Yejin Choi | In this paper, we report two simple but highly effective methods to address these challenges and lead to a new state-of-the-art performance. |
160 | Towards Making a Dependency Parser See | Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez | We explore whether it is possible to leverage eye-tracking data in an RNN dependency parser (for English) when such information is only available during training – i.e. no aggregated or token-level gaze features are used at inference time. |
161 | Unsupervised Labeled Parsing with Deep Inside-Outside Recursive Autoencoders | Andrew Drozdov, Patrick Verga, Yi-Pei Chen, Mohit Iyyer, Andrew McCallum | In this work, we show that we can effectively recover these types of labels using the learned phrase vectors from deep inside-outside recursive autoencoders (DIORA). |
162 | Dependency Parsing for Spoken Dialog Systems | Sam Davidson, Dian Yu, Zhou Yu | Therefore, we propose the Spoken Conversation Universal Dependencies (SCUD) annotation scheme that extends the Universal Dependencies (UD) (Nivre et al., 2016) guidelines to spoken human-machine dialogs. |
163 | Span-based Hierarchical Semantic Parsing for Task-Oriented Dialog | Panupong Pasupat, Sonal Gupta, Karishma Mandyam, Rushin Shah, Mike Lewis, Luke Zettlemoyer | We propose a semantic parser for parsing compositional utterances into Task Oriented Parse (TOP), a tree representation that has intents and slots as labels of nesting tree nodes. |
164 | Enhancing Context Modeling with a Query-Guided Capsule Network for Document-level Translation | Zhengxin Yang, Jinchao Zhang, Fandong Meng, Shuhao Gu, Yang Feng, Jie Zhou | To address this problem, we propose a query-guided capsule network to cluster context information into different perspectives with which the target translation may be concerned. |
165 | Simple, Scalable Adaptation for Neural Machine Translation | Ankur Bapna, Orhan Firat | We propose a simple yet efficient approach for adaptation in NMT. |
166 | Controlling Text Complexity in Neural Machine Translation | Sweta Agrawal, Marine Carpuat | This work introduces a machine translation task where the output is aimed at audiences of different levels of target language proficiency. |
167 | Investigating Multilingual NMT Representations at Scale | Sneha Kudugunta, Ankur Bapna, Isaac Caswell, Orhan Firat | In this work, we attempt to understand massively multilingual NMT representations (with 103 languages) using Singular Value Canonical Correlation Analysis (SVCCA), a representation similarity framework that allows us to compare representations across different languages, layers and models. |
168 | Hierarchical Modeling of Global Context for Document-Level Neural Machine Translation | Xin Tan, Longyin Zhang, Deyi Xiong, Guodong Zhou | In this paper, we propose a hierarchical model to learn the global context for document-level neural machine translation (NMT). |
169 | Cross-Lingual Machine Reading Comprehension | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu | In this paper, we propose Cross-Lingual Machine Reading Comprehension (CLMRC) task for the languages other than English. |
170 | A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning | Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li | In this paper, we introduce the Multi-Type Multi-Span Network (MTMSN), a neural reading comprehension model that combines a multi-type answer predictor designed to support various answer types (e.g., span, count, negation, and arithmetic expression) with a multi-span extraction method for dynamically producing one or multiple text spans. |
171 | Neural Duplicate Question Detection without Labeled Training Data | Andreas Rücklé, Nafise Sadat Moosavi, Iryna Gurevych | In this work, we propose two novel methods—weak supervision using the title and body of a question, and the automatic generation of duplicate questions—and show that both can achieve improved performances even though they do not require any labeled data. |
172 | Asking Clarification Questions in Knowledge-Based Question Answering | Jingjing Xu, Yuechen Wang, Duyu Tang, Nan Duan, Pengcheng Yang, Qi Zeng, Ming Zhou, Xu Sun | In this paper, we construct a new clarification dataset, CLAQUA, with nearly 40K open-domain examples. |
173 | Multi-View Domain Adapted Sentence Embeddings for Low-Resource Unsupervised Duplicate Question Detection | Nina Poerner, Hinrich Schütze | We address the problem of Duplicate Question Detection (DQD) in low-resource domain-specific Community Question Answering forums. |
174 | Multi-label Categorization of Accounts of Sexism using a Neural Framework | Pulkit Parikh, Harika Abburi, Pinkesh Badjatiya, Radhika Krishnan, Niyati Chhaya, Manish Gupta, Vasudeva Varma | We develop a neural solution for this multi-label classification that can combine sentence representations obtained using models such as BERT with distributional and linguistic word embeddings using a flexible, hierarchical architecture involving recurrent components and optional convolutional ones. |
175 | The Trumpiest Trump? Identifying a Subject’s Most Characteristic Tweets | Charuta Pethe, Steve Skiena | We quantify the extent to which a given short text is characteristic of a specific person, using a dataset of tweets from fifteen celebrities. Such analysis is useful for generating excerpts of high-volume Twitter profiles, and understanding how representativeness relates to tweet popularity. |
176 | Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts | Luke Breitfeller, Emily Ahn, David Jurgens, Yulia Tsvetkov | In this paper, we devise a general but nuanced, computationally operationalizable typology of microaggressions, based on a small subset of available data. |
177 | Reinforced Product Metadata Selection for Helpfulness Assessment of Customer Reviews | Miao Fan, Chao Feng, Mingming Sun, Ping Li | To address this problem, we propose a novel framework composed of two mutual-benefit modules. |
178 | Learning Invariant Representations of Social Media Users | Nicholas Andrews, Marcus Bishop | In this paper, we propose a novel procedure to learn a mapping from short episodes of user activity on social media to a vector space in which the distance between points captures the similarity of the corresponding users’ invariant features. |
179 | (Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Annotated Stylistic Language Dataset with Multiple Personas | Dongyeop Kang, Varun Gangal, Eduard Hovy | We release PASTEL, the parallel and annotated stylistic language dataset, that contains ~41K parallel sentences (8.3K parallel stories) annotated across different personas. |
180 | Movie Plot Analysis via Turning Point Identification | Pinelopi Papalampidi, Frank Keller, Mirella Lapata | We propose the task of turning point identification in movies as a means of analyzing their narrative structure. We introduce a dataset consisting of screenplays and plot synopses annotated with turning points and present an end-to-end neural network model that identifies turning points in plot synopses and projects them onto scenes in screenplays. |
181 | Latent Suicide Risk Detection on Microblog via Suicide-Oriented Word Embeddings and Layered Attention | Lei Cao, Huijun Zhang, Ling Feng, Zihan Wei, Xin Wang, Ningyun Li, Xiaohao He | Motivated by the hidden “tree holes” phenomenon on microblogs, where people at suicide risk tend to disclose their true inner feelings and thoughts in the microblog spaces of authors who have died by suicide, we explore the use of tree holes to enhance microblog-based suicide risk detection from two perspectives. A large-scale, well-labelled suicide dataset is also reported in the paper. |
182 | Deep Ordinal Regression for Pledge Specificity Prediction | Shivashankar Subramanian, Trevor Cohn, Timothy Baldwin | In this paper we collate a novel dataset of manifestos from eleven Australian federal election cycles, with over 12,000 sentences annotated with specificity (e.g., rhetorical vs detailed pledge) on a fine-grained scale. We propose deep ordinal regression approaches for specificity prediction, under both supervised and semi-supervised settings, and provide empirical results demonstrating the effectiveness of the proposed techniques over several baseline approaches. |
183 | Data-Efficient Goal-Oriented Conversation with Dialogue Knowledge Transfer Networks | Igor Shalyminov, Sungjin Lee, Arash Eshghi, Oliver Lemon | In this paper, we present the Dialogue Knowledge Transfer Network (DiKTNet), a state-of-the-art approach to goal-oriented dialogue generation which only uses a few example dialogues (i.e. few-shot learning), none of which has to be annotated. |
184 | Multi-Granularity Representations of Dialog | Shikib Mehri, Maxine Eskenazi | This paper introduces a novel training procedure which explicitly learns multiple representations of language at several levels of granularity. |
185 | Are You for Real? Detecting Identity Fraud via Dialogue Interactions | Weikang Wang, Jiajun Zhang, Qian Li, Chengqing Zong, Zhifei Li | In this paper, we focus on identity fraud detection in loan applications and propose to solve this problem with a novel interactive dialogue system which consists of two modules. |
186 | Hierarchy Response Learning for Neural Conversation Generation | Bo Zhang, Xiaoming Zhang | Unlike past work that has focused on diversifying the output at the word or discourse level with a flat model to alleviate this problem, we propose a hierarchical generation model to capture different levels of diversity using conditional variational autoencoders. |
187 | Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs | Zhibin Liu, Zheng-Yu Niu, Hua Wu, Haifeng Wang | To address this challenge, we propose a knowledge-aware chatting machine with three components: an augmented knowledge graph with both triples and texts, a knowledge selector, and a knowledge-aware response generator. |
188 | Adaptive Parameterization for Neural Dialogue Generation | Hengyi Cai, Hongshen Chen, Cheng Zhang, Yonghao Song, Xiaofang Zhao, Dawei Yin | In this work, we propose an Adaptive Neural Dialogue generation model, AdaND, which manages various conversations with conversation-specific parameterization. |
189 | Towards Knowledge-Based Recommender Dialog System | Qibin Chen, Junyang Lin, Yichang Zhang, Ming Ding, Yukuo Cen, Hongxia Yang, Jie Tang | In this paper, we propose a novel end-to-end framework called KBRD, which stands for Knowledge-Based Recommender Dialog System. |
190 | Structuring Latent Spaces for Stylized Response Generation | Xiang Gao, Yizhe Zhang, Sungjin Lee, Michel Galley, Chris Brockett, Jianfeng Gao, Bill Dolan | We propose StyleFusion, which bridges conversation modeling and non-parallel style transfer by sharing a structured latent space. |
191 | Improving Open-Domain Dialogue Systems via Multi-Turn Incomplete Utterance Restoration | Zhufeng Pan, Kun Bai, Yan Wang, Lianqiang Zhou, Xiaojiang Liu | To facilitate the study of incomplete utterance restoration for open-domain dialogue systems, a large-scale multi-turn dataset Restoration-200K is collected and manually labeled with the explicit relation between an utterance and its context. We also propose a “pick-and-combine” model to restore the incomplete utterance from its context. |
192 | Unsupervised Context Rewriting for Open Domain Conversation | Kun Zhou, Kai Zhang, Yu Wu, Shujie Liu, Jingsong Yu | This paper proposes an explicit context rewriting method, which rewrites the last utterance by considering context history. |
193 | Dually Interactive Matching Network for Personalized Response Selection in Retrieval-Based Chatbots | Jia-Chen Gu, Zhen-Hua Ling, Xiaodan Zhu, Quan Liu | This paper proposes a dually interactive matching network (DIM) for presenting the personalities of dialogue agents in retrieval-based chatbots. |
194 | DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs | Yi-Lin Tuan, Yun-Nung Chen, Hung-yi Lee | This paper proposes a new task of applying dynamic knowledge graphs in neural conversation models and presents a novel TV series conversation corpus (DyKgChat) for the task. |
195 | Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework | Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Shuming Shi | This paper presents a novel framework in which skeleton extraction is performed by an interpretable matching model and the subsequent skeleton-guided response generation is accomplished by a separately trained generator. |
196 | Scalable and Accurate Dialogue State Tracking via Hierarchical Sequence Generation | Liliang Ren, Jianmo Ni, Julian McAuley | In this paper, we investigate how to approach DST using a generation framework without the pre-defined ontology list. |
197 | Low-Resource Response Generation with Template Prior | Ze Yang, Wei Wu, Jian Yang, Can Xu, Zhoujun Li | Since paired data alone are not enough to train a neural generation model, we consider leveraging large-scale unpaired data, which are much easier to obtain, and propose response generation with both paired and unpaired data. |
198 | A Discrete CVAE for Response Generation on Short-Text Conversation | Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Guodong Zhou, Shuming Shi | In this paper, we introduce a discrete latent variable with an explicit semantic meaning to improve the CVAE on short-text conversation. |
199 | Who Is Speaking to Whom? Learning to Identify Utterance Addressee in Multi-Party Conversations | Ran Le, Wenpeng Hu, Mingyue Shang, Zhenjun You, Lidong Bing, Dongyan Zhao, Rui Yan | In this paper, we aim to tackle the challenge of identifying all the missing addressees in a conversation session. |
200 | A Semi-Supervised Stable Variational Network for Promoting Replier-Consistency in Dialogue Generation | Jinxin Chang, Ruifang He, Longbiao Wang, Xiangyu Zhao, Ting Yang, Ruifang Wang | However, the information sampled from the latent space usually becomes useless due to the KL divergence vanishing issue, and the highly abstractive global variables easily dilute the personal features of the replier, leading to non-replier-specific responses. Therefore, a novel Semi-Supervised Stable Variational Network (SSVN) is proposed to address these issues. |
201 | Modeling Personalization in Continuous Space for Response Generation via Augmented Wasserstein Autoencoders | Zhangming Chan, Juntao Li, Xiaopeng Yang, Xiuying Chen, Wenpeng Hu, Dongyan Zhao, Rui Yan | In this work, we improve the WAE for response generation. |
202 | Variational Hierarchical User-based Conversation Model | JinYeong Bak, Alice Oh | To overcome this limitation, we propose a new model with a stochastic variable designed to capture the speaker information and deliver it to the conversational context. To test whether our model generates more appropriate conversation responses, we build a new conversation corpus containing approximately 27,000 speakers and 770,000 conversations. |
203 | Recommendation as a Communication Game: Self-Supervised Bot-Play for Goal-oriented Dialogue | Dongyeop Kang, Anusha Balakrishnan, Pararth Shah, Paul Crook, Y-Lan Boureau, Jason Weston | In this work, we collect a goal-driven recommendation dialogue dataset (GoRecDial), which consists of 9,125 dialogue games and 81,260 conversation turns between pairs of human workers recommending movies to each other. We leverage the dataset to develop an end-to-end dialogue system that can simultaneously converse and recommend. |
204 | CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases | Tao Yu, Rui Zhang, Heyang Er, Suyi Li, Eric Xue, Bo Pang, Xi Victoria Lin, Yi Chern Tan, Tianze Shi, Zihan Li, Youxuan Jiang, Michihiro Yasunaga, Sungrok Shim, Tao Chen, Alexander Fabbri, Zifan Li, Luyao Chen, Yuwen Zhang, Shreya Dixit, Vincent Zhang, Caiming Xiong, Richard Socher, Walter Lasecki, Dragomir Radev | We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. |
205 | A Practical Dialogue-Act-Driven Conversation Model for Multi-Turn Response Selection | Harshit Kumar, Arvind Agarwal, Sachindra Joshi | This paper proposes an end-to-end multi-task model for conversation modeling, which is optimized for two tasks, dialogue act prediction and response selection, with the latter being the task of interest. |
206 | How to Build User Simulators to Train RL-based Dialog Systems | Weiyan Shi, Kun Qian, Xuewei Wang, Zhou Yu | We propose a method of standardizing user simulator building that can be used by the community to fairly compare dialog system quality using the same set of user simulators. |
207 | Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning | Tao Jin, Siyu Huang, Yingming Li, Zhongfei Zhang | Motivated by this, we propose a video captioning model with High-Order Cross-Modal Attention (HOCA) where the attention weights are calculated based on the high-order correlation tensor to capture the frame-level cross-modal interaction of different modalities sufficiently. |
208 | Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach | Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, In So Kweon | In this paper, we develop a novel data-efficient semi-supervised framework for training an image captioning model. To evaluate, we construct the scarcely-paired COCO dataset, a modified version of the MS COCO caption dataset. |
209 | Dual Attention Networks for Visual Reference Resolution in Visual Dialog | Gi-Cheon Kang, Jaeseo Lim, Byoung-Tak Zhang | In this paper, we propose Dual Attention Networks (DAN) for visual reference resolution in VisDial. |
210 | Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents | Jack Hessel, Lillian Lee, David Mimno | We present algorithms that discover image-sentence relationships without relying on explicit multimodal annotation in training. |
211 | UR-FUNNY: A Multimodal Language Dataset for Understanding Humor | Md Kamrul Hasan, Wasifur Rahman, AmirAli Bagher Zadeh, Jianyuan Zhong, Md Iftekhar Tanveer, Louis-Philippe Morency, Mohammed (Ehsan) Hoque | This paper presents a diverse multimodal dataset, called UR-FUNNY, to open the door to understanding multimodal language used in expressing humor. The dataset and accompanying studies present a framework for multimodal humor detection in the natural language processing community. |
212 | Partners in Crime: Multi-view Sequential Inference for Movie Understanding | Nikos Papasarantopoulos, Lea Frermann, Mirella Lapata, Shay B. Cohen | We describe an incremental neural architecture paired with a novel training objective for incremental inference. |
213 | Guiding the Flowing of Semantics: Interpretable Video Captioning via POS Tag | Xinyu Xiao, Lingfeng Wang, Bin Fan, Shiming Xiang, Chunhong Pan | To address these problems, we propose an Adaptive Semantic Guidance Network (ASGN), which instantiates the whole video semantics to different POS-aware semantics with the supervision of part-of-speech (POS) tags. |
214 | A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding | Libo Qin, Wanxiang Che, Yangming Li, Haoyang Wen, Ting Liu | In this paper, we propose a novel framework for SLU that better incorporates intent information, which in turn guides slot filling. |
215 | Talk2Car: Taking Control of Your Self-Driving Car | Thierry Deruyttere, Simon Vandenhende, Dusan Grujicic, Luc Van Gool, Marie-Francine Moens | Our work presents the Talk2Car dataset, the first object referral dataset that contains commands written in natural language for self-driving cars. |
216 | Fact-Checking Meets Fauxtography: Verifying Claims About Images | Dimitrina Zlatkova, Preslav Nakov, Ivan Koychev | In particular, we create a new dataset for this problem, and we explore a variety of features modeling the claim, the image, and the relationship between the claim and the image. |
217 | Video Dialog via Progressive Inference and Cross-Transformer | Weike Jin, Zhou Zhao, Mao Gu, Jun Xiao, Furu Wei, Yueting Zhuang | In this paper, we introduce a novel progressive inference mechanism for video dialog, which progressively updates query information based on dialog history and video content until the agent thinks the information is sufficient and unambiguous. |
218 | Executing Instructions in Situated Collaborative Interactions | Alane Suhr, Claudia Yan, Jack Schluger, Stanley Yu, Hadi Khader, Marwa Mouallem, Iris Zhang, Yoav Artzi | We introduce a learning approach focused on recovery from cascading errors between instructions, and modeling methods to explicitly reason about instructions with multiple goals. |
219 | Fusion of Detected Objects in Text for Visual Question Answering | Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter | To advance models of multimodal context, we introduce a simple yet powerful neural architecture for data that combines vision and natural language. |
220 | TIGEr: Text-to-Image Grounding for Image Caption Evaluation | Ming Jiang, Qiuyuan Huang, Lei Zhang, Xin Wang, Pengchuan Zhang, Zhe Gan, Jana Diesner, Jianfeng Gao | This paper presents a new metric called TIGEr for the automatic evaluation of image captioning systems. |
221 | Universal Adversarial Triggers for Attacking and Analyzing NLP | Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh | We propose a gradient-guided search over tokens which finds short trigger sequences (e.g., one word for classification and four words for language modeling) that successfully trigger the target prediction. |
222 | To Annotate or Not? Predicting Performance Drop under Domain Shift | Hady Elsahar, Matthias Gallé | In this paper, we study the problem of predicting the performance drop of modern NLP models under domain-shift, in the absence of any target domain labels. |
223 | Adaptively Sparse Transformers | Gonçalo M. Correia, Vlad Niculae, André F. T. Martins | In this work, we introduce the adaptively sparse Transformer, wherein attention heads have flexible, context-dependent sparsity patterns. |
224 | Show Your Work: Improved Reporting of Experimental Results | Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith | In this paper, we demonstrate that test-set performance scores alone are insufficient for drawing accurate conclusions about which model performs best. |
225 | A Deep Factorization of Style and Structure in Fonts | Akshay Srivatsan, Jonathan Barron, Dan Klein, Taylor Berg-Kirkpatrick | We propose a deep factorization model for typographic analysis that disentangles content from style. |
226 | Cross-lingual Semantic Specialization via Lexical Relation Induction | Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen | To bridge this gap, we propose a novel method that transfers specialization from a resource-rich source language (English) to virtually any target language. |
227 | Modelling the interplay of metaphor and emotion through multitask learning | Verna Dankers, Marek Rei, Martha Lewis, Ekaterina Shutova | In this paper, we investigate the relationship between metaphor and emotion within a computational framework, by proposing the first joint model of these phenomena. |
228 | How well do NLI models capture verb veridicality? | Alexis Ross, Ellie Pavlick | We investigate whether a state-of-the-art natural language inference model (BERT) learns to make correct inferences about veridicality in verb-complement constructions. We introduce an NLI dataset for veridicality evaluation consisting of 1,500 sentence pairs, covering 137 unique verbs. |
229 | Modeling Color Terminology Across Thousands of Languages | Arya D. McCarthy, Winston Wu, Aaron Mueller, William Watson, David Yarowsky | This paper employs a set of diverse measures on massively cross-linguistic data to operationalize and critique the Berlin and Kay color term hypotheses. |
230 | Negative Focus Detection via Contextual Attention Mechanism | Longxiang Shen, Bowei Zou, Yu Hong, Guodong Zhou, Qiaoming Zhu, AiTi Aw | In particular, we introduce a framework which consists of a Bidirectional Long Short-Term Memory (BiLSTM) neural network and a Conditional Random Fields (CRF) layer to effectively encode the order information and the long-range context dependency in a sentence. |
231 | A Unified Neural Coherence Model | Han Cheol Moon, Tasnim Mohiuddin, Shafiq Joty, Chi Xu | In this paper, we propose a unified coherence model that incorporates sentence grammar, inter-sentence coherence relations, and global coherence patterns into a common neural framework. |
232 | Topic-Guided Coherence Modeling for Sentence Ordering by Preserving Global and Local Information | Byungkook Oh, Seungmin Seo, Cheolheon Shin, Eunju Jo, Kyong-Ho Lee | We propose a novel topic-guided coherence modeling (TGCM) for sentence ordering. |
233 | Neural Generative Rhetorical Structure Parsing | Amandla Mabona, Laura Rimell, Stephen Clark, Andreas Vlachos | In this paper, we present the first generative model for RST parsing. |
234 | Weak Supervision for Learning Discourse Structure | Sonia Badene, Kate Thompson, Jean-Pierre Lorré, Nicholas Asher | This paper provides a detailed comparison of a data programming approach with (i) off-the-shelf, state-of-the-art deep learning architectures that optimize their representations (BERT) and (ii) handcrafted-feature approaches previously used in the discourse analysis literature. |
235 | Predicting Discourse Structure using Distant Supervision from Sentiment | Patrick Huber, Giuseppe Carenini | We propose a novel approach that uses distant supervision on an auxiliary task (sentiment classification), to generate abundant data for RST-style discourse structure prediction. |
236 | The Myth of Double-Blind Review Revisited: ACL vs. EMNLP | Cornelia Caragea, Ana Uban, Liviu P. Dinu | We study this question on the ACL and EMNLP paper collections and present an analysis on how well deep learning techniques can infer the authors of a paper. |
237 | Uncover Sexual Harassment Patterns from Personal Stories by Joint Key Element Extraction and Categorization | Yingchi Liu, Quanzhi Li, Marika Cifor, Xiaozhong Liu, Qiong Zhang, Luo Si | In this study, we manually annotated those stories with labels in the dimensions of location, time, and harassers’ characteristics, and marked the key elements related to these dimensions. |
238 | Identifying Predictive Causal Factors from News Streams | Ananth Balashankar, Sunandan Chakraborty, Samuel Fraiberger, Lakshminarayanan Subramanian | We propose a new framework to uncover the relationship between news events and real world phenomena. |
239 | Training Data Augmentation for Detecting Adverse Drug Reactions in User-Generated Content | Sepideh Mesbah, Jie Yang, Robert-Jan Sips, Manuel Valle Torre, Christoph Lofi, Alessandro Bozzon, Geert-Jan Houben | In this paper, we introduce a data augmentation approach that leverages variational autoencoders to learn high-quality data distributions from a large unlabeled dataset, and subsequently, to automatically generate a large labeled training set from a small set of labeled samples. |
240 | Deep Reinforcement Learning-based Text Anonymization against Private-Attribute Inference | Ahmadreza Mosallanezhad, Ghazaleh Beigi, Huan Liu | In this paper, we study the problem of textual data anonymization and propose a novel Reinforcement Learning-based Text Anonymizer, RLTA, which addresses the problem of private-attribute leakage while preserving the utility of textual data. |
241 | Tree-structured Decoding for Solving Math Word Problems | Qianying Liu, Wenyv Guan, Sujian Li, Daisuke Kawahara | To address this problem, we propose a tree-structured decoding method that generates the abstract syntax tree of the equation in a top-down manner. |
242 | PullNet: Open Domain Question Answering with Iterative Retrieval on Knowledge Bases and Text | Haitian Sun, Tania Bedrax-Weiss, William Cohen | We describe PullNet, an integrated framework for (1) learning what to retrieve and (2) reasoning with this heterogeneous information to find the best answer. |
243 | Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning | Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi | In this paper, we introduce Cosmos QA, a large-scale dataset of 35,600 problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. |
244 | Finding Generalizable Evidence by Learning to Convince Q&A Models | Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho | We propose a system that finds the strongest supporting evidence for a given answer to a question, using passage-based question-answering (QA) as a testbed. |
245 | Ranking and Sampling in Open-Domain Question Answering | Yanfu Xu, Zheng Lin, Yuanxin Liu, Rui Liu, Weiping Wang, Dan Meng | In this paper, we first introduce a ranking model leveraging the paragraph-question and the paragraph-paragraph relevance to compute a confidence score for each paragraph. Furthermore, based on the scores, we design a modified weighted sampling strategy for training to mitigate the influence of the noisy and distracting paragraphs. |
246 | A Non-commutative Bilinear Model for Answering Path Queries in Knowledge Graphs | Katsuhiko Hayashi, Masashi Shimbo | In this paper, we propose a new bilinear KGE model, called BlockHolE, based on block circulant matrices. |
247 | Generating Questions for Knowledge Bases via Incorporating Diversified Contexts and Answer-Aware Loss | Cao Liu, Kang Liu, Shizhu He, Zaiqing Nie, Jun Zhao | In this paper, we address the above two issues by incorporating diversified contexts and an answer-aware loss. |
248 | Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base | Tao Shen, Xiubo Geng, Tao Qin, Daya Guo, Duyu Tang, Nan Duan, Guodong Long, Daxin Jiang | To tackle these issues, we propose an innovative multi-task learning framework where a pointer-equipped semantic parsing model is designed to resolve coreference in conversations, and naturally empower joint learning with a novel type-aware entity detection model. |
249 | BiPaR: A Bilingual Parallel Dataset for Multilingual and Cross-lingual Reading Comprehension on Novels | Yimin Jing, Deyi Xiong, Zhen Yan | This paper presents BiPaR, a bilingual parallel novel-style machine reading comprehension (MRC) dataset, developed to support multilingual and cross-lingual reading comprehension. |
250 | Language Models as Knowledge Bases? | Fabio Petroni, Tim Rocktäschel, Sebastian Riedel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander Miller | We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. |
251 | NumNet: Machine Reading Comprehension with Numerical Reasoning | Qiu Ran, Yankai Lin, Peng Li, Jie Zhou, Zhiyuan Liu | To address this issue, we propose a numerical MRC model named NumNet, which utilizes a numerically-aware graph neural network to capture comparison information and perform numerical reasoning over numbers in the question and passage. |
252 | Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks | Haoyang Huang, Yaobo Liang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Ming Zhou | We present Unicoder, a universal language encoder that is insensitive to different languages. |
253 | Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering | Shiyue Zhang, Mohit Bansal | We propose two ways to generate synthetic QA pairs: generate new questions from existing articles or collect QA pairs from new articles. |
254 | Adversarial Domain Adaptation for Machine Reading Comprehension | Huazheng Wang, Zhe Gan, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Hongning Wang | In this paper, we focus on unsupervised domain adaptation for Machine Reading Comprehension (MRC), where the source domain has a large amount of labeled data, while only unlabeled passages are available in the target domain. |
255 | Incorporating External Knowledge into Machine Reading for Generative Question Answering | Bin Bi, Chen Wu, Ming Yan, Wei Wang, Jiangnan Xia, Chenliang Li | In this paper, we propose a new neural model, Knowledge-Enriched Answer Generator (KEAG), which is able to compose a natural answer by exploiting and aggregating evidence from all four information sources available: question, passage, vocabulary and knowledge. |
256 | Answering questions by learning to rank – Learning to rank by answering questions | George Sebastian Pirtoaca, Traian Rebedea, Stefan Ruseti | The contribution of this article is two-fold. First, it describes a method which can be used to semantically rank documents extracted from Wikipedia or similar natural language corpora. Second, we propose a model employing the semantic ranking that holds the first place in two of the most popular leaderboards for answering multiple-choice questions: ARC Easy and Challenge. |
257 | Discourse-Aware Semantic Self-Attention for Narrative Reading Comprehension | Todor Mihaylov, Anette Frank | In this work, we propose to use linguistic annotations as a basis for a Discourse-Aware Semantic Self-Attention encoder that we employ for reading comprehension on narrative texts. |
258 | Revealing the Importance of Semantic Retrieval for Machine Reading at Scale | Yixin Nie, Songhe Wang, Mohit Bansal | In this work, we give general guidelines on system design for MRS by proposing a simple yet effective pipeline system with special consideration on hierarchical semantic retrieval at both paragraph and sentence level, and their potential effects on the downstream task. |
259 | PubMedQA: A Dataset for Biomedical Research Question Answering | Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William Cohen, Xinghua Lu | We introduce PubMedQA, a novel biomedical question answering (QA) dataset collected from PubMed abstracts. |
260 | Quick and (not so) Dirty: Unsupervised Selection of Justification Sentences for Multi-hop Question Answering | Vikas Yadav, Steven Bethard, Mihai Surdeanu | We propose an unsupervised strategy for the selection of justification sentences for multi-hop question answering (QA) that (a) maximizes the relevance of the selected sentences, (b) minimizes the overlap between the selected facts, and (c) maximizes the coverage of both question and answer. |
261 | Answering Complex Open-domain Questions Through Iterative Query Generation | Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning | We present GoldEn (Gold Entity) Retriever, which iterates between reading context and retrieving more supporting documents to answer open-domain multi-hop questions. |
262 | NL2pSQL: Generating Pseudo-SQL Queries from Under-Specified Natural Language Questions | Fuxiang Chen, Seung-won Hwang, Jaegul Choo, Jung-Woo Ha, Sunghun Kim | Here we describe NL2pSQL, a new task of generating pseudo-SQL (pSQL) code from natural language questions over under-specified database issues. |
263 | Leveraging Frequent Query Substructures to Generate Formal Queries for Complex Question Answering | Jiwei Ding, Wei Hu, Qixin Xu, Yuzhong Qu | In this paper, we propose SubQG, a new query generation approach based on frequent query substructures, which helps rank the existing (but nonsignificant) query structures or build new query structures. |
264 | Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning | Heng Wang, Shuangyin Li, Rong Pan, Mingzhi Mao | In this paper, we present a deep reinforcement learning-based model named AttnPath, which incorporates an LSTM and a graph attention mechanism as the memory components. |
265 | Learning to Update Knowledge Graphs by Reading News | Jizhi Tang, Yansong Feng, Dongyan Zhao | In this paper, we propose a novel neural network method, GUpdater, to tackle these problems. |
266 | DIVINE: A Generative Adversarial Imitation Learning Framework for Knowledge Graph Reasoning | Ruiping Li, Xiang Cheng | To this end, in this paper, we present DIVINE, a novel plug-and-play framework based on generative adversarial imitation learning for enhancing existing RL-based methods. |
267 | Original Semantics-Oriented Attention and Deep Fusion Network for Sentence Matching | Mingtong Liu, Yujie Zhang, Jinan Xu, Yufeng Chen | In this paper, we present an original semantics-oriented attention and deep fusion network (OSOA-DFN) for sentence matching. |
268 | Representation Learning with Ordered Relation Paths for Knowledge Graph Completion | Yao Zhu, Hongzhi Liu, Zhonghai Wu, Yang Song, Tao Zhang | To solve these problems, we propose a novel KG completion method named OPTransE. |
269 | Collaborative Policy Learning for Open Knowledge Graph Reasoning | Cong Fu, Tong Chen, Meng Qu, Woojeong Jin, Xiang Ren | We propose a novel reinforcement learning framework to train two collaborative agents jointly, i.e., a multi-hop graph reasoner and a fact extractor. |
270 | Modeling Event Background for If-Then Commonsense Reasoning Using Context-aware Variational Autoencoder | Li Du, Xiao Ding, Ting Liu, Zhongyang Li | To address these issues, we propose a novel context-aware variational autoencoder that effectively learns event background information to guide If-Then reasoning. |
271 | Asynchronous Deep Interaction Network for Natural Language Inference | Di Liang, Fubao Zhang, Qi Zhang, Xuanjing Huang | In this paper, we propose an asynchronous deep interaction network (ADIN) to complete the task. |
272 | Keep Calm and Switch On! Preserving Sentiment and Fluency in Semantic Text Exchange | Steven Y. Feng, Aaron W. Li, Jesse Hoey | In this paper, we present a novel method for measurably adjusting the semantics of text while preserving its sentiment and fluency, a task we call semantic text exchange. |
273 | Query-focused Scenario Construction | Su Wang, Greg Durrett, Katrin Erk | The news coverage of events often contains not one but multiple incompatible accounts of what happened. We develop a query-based system that extracts compatible sets of events (scenarios) from such data, formulated as one-class clustering. |
274 | Semi-supervised Entity Alignment via Joint Knowledge Embedding Model and Cross-graph Model | Chengjiang Li, Yixin Cao, Lei Hou, Jiaxin Shi, Juanzi Li, Tat-Seng Chua | In this paper, we propose a semi-supervised entity alignment method by joint Knowledge Embedding model and Cross-Graph model (KECG). |
275 | Designing and Interpreting Probes with Control Tasks | John Hewitt, Percy Liang | In this paper, we propose control tasks, which associate word types with random outputs, to complement linguistic tasks. |
276 | Specializing Word Embeddings (for Parsing) by Information Bottleneck | Xiang Lisa Li, Jason Eisner | We propose a very fast variational information bottleneck (VIB) method to nonlinearly compress these embeddings, keeping only the information that helps a discriminative parser. |
277 | Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited | Artur Kulmizev, Miryam de Lhoneux, Johannes Gontrum, Elena Fano, Joakim Nivre | In this paper, we show that, even though some details of the picture have changed after the switch to neural networks and continuous representations, the basic trade-off between rich features and global optimization remains essentially the same. |
278 | Semantic graph parsing with recurrent neural network DAG grammars | Federico Fancellu, Sorcha Gilroy, Adam Lopez, Mirella Lapata | We present recurrent neural network DAG grammars, a graph-aware sequence model that generates only well-formed graphs while sidestepping many difficulties in graph prediction. |
279 | 75 Languages, 1 Model: Parsing Universal Dependencies Universally | Dan Kondratyuk, Milan Straka | We present UDify, a multilingual multi-task model capable of accurately predicting universal part-of-speech, morphological features, lemmas, and dependency trees simultaneously for all 124 Universal Dependencies treebanks across 75 languages. |
280 | Interactive Language Learning by Question Answering | Xingdi Yuan, Marc-Alexandre Côté, Jie Fu, Zhouhan Lin, Chris Pal, Yoshua Bengio, Adam Trischler | We propose and evaluate a set of baseline models for the QAit task that includes deep reinforcement learning agents. |
281 | What’s Missing: A Knowledge Gap Guided Approach for Multi-hop Question Answering | Tushar Khot, Ashish Sabharwal, Peter Clark | We propose jointly training a model to simultaneously fill this knowledge gap and compose it with the provided partial knowledge. |
282 | KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning | Bill Yuchen Lin, Xinyue Chen, Jamin Chen, Xiang Ren | In this paper, we propose a textual inference framework for answering commonsense questions, which effectively utilizes external, structured commonsense knowledge graphs to perform explainable inferences. |
283 | Learning with Limited Data for Multilingual Reading Comprehension | Kyungjae Lee, Sunghyun Park, Hojae Han, Jinyoung Yeo, Seung-won Hwang, Juho Lee | To address this challenge, we propose a weakly-supervised framework that quantifies such noises from automatically generated labels, to deemphasize or fix noisy data in training. |
284 | A Discrete Hard EM Approach for Weakly Supervised Question Answering | Sewon Min, Danqi Chen, Hannaneh Hajishirzi, Luke Zettlemoyer | In this paper, we show it is possible to convert such tasks into discrete latent variable learning problems with a precomputed, task-specific set of possible solutions (e.g. different mentions or equations) that contains one correct option. |
285 | Is the Red Square Big? MALeViC: Modeling Adjectives Leveraging Visual Contexts | Sandro Pezzelle, Raquel Fernández | This work aims at modeling how the meaning of gradable adjectives of size (‘big’, ‘small’) can be learned from visually-grounded contexts. In contrast with the standard computational approach that simplistically treats gradable adjectives as ‘fixed’ attributes, we pose the problem as relational: to be successful, a model has to consider the full visual context. |
286 | Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs | Alex Warstadt, Yu Cao, Ioana Grosu, Wei Peng, Hagen Blix, Yining Nie, Anna Alsop, Shikha Bordia, Haokun Liu, Alicia Parrish, Sheng-Fu Wang, Jason Phang, Anhad Mohananey, Phu Mon Htut, Paloma Jeretic, Samuel R. Bowman | We explore five experimental methods inspired by prior work evaluating pretrained sentence representation models. We use a single linguistic phenomenon, negative polarity item (NPI) licensing, as a case study for our experiments. |
287 | Representation of Constituents in Neural Language Models: Coordination Phrase as a Case Study | Aixiu An, Peng Qian, Ethan Wilcox, Roger Levy | Here we investigate neural models’ ability to represent constituent-level features, using coordinated noun phrases as a case study. |
288 | Towards Zero-shot Language Modeling | Edoardo Maria Ponti, Ivan Vulić, Ryan Cotterell, Roi Reichart, Anna Korhonen | Can we construct a neural language model which is inductively biased towards learning human language? Motivated by this question, we aim at constructing an informative prior for held-out languages on the task of character-level, open-vocabulary language modelling. |
289 | What Gets Echoed? Understanding the “Pointers” in Explanations of Persuasive Arguments | David Atkinson, Kumar Bhargav Srinivasan, Chenhao Tan | We propose a novel word-level prediction task to investigate how explanations selectively reuse, or echo, information from what is being explained (henceforth, explanandum). |
290 | Modeling Frames in Argumentation | Yamen Ajjour, Milad Alshomary, Henning Wachsmuth, Benno Stein | We present a fully unsupervised approach to this task, which first removes topical information and then identifies frames using clustering. For evaluation purposes, we provide a corpus with 12,326 debate-portal arguments, organized along the frames of the debates’ topics. |
291 | AMPERSAND: Argument Mining for PERSuAsive oNline Discussions | Tuhin Chakrabarty, Christopher Hidey, Smaranda Muresan, Kathy McKeown, Alyssa Hwang | We propose a computational model for argument mining in online persuasive discussion forums that brings together the micro-level (argument as product) and macro-level (argument as process) models of argumentation. |
292 | Evaluating adversarial attacks against multiple fact verification systems | James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Arpit Mittal | We introduce two novel scoring metrics, attack potency and system resilience, which take into account the correctness of the adversarial instances, an aspect often ignored in adversarial evaluations. |
293 | Nonsense!: Quality Control via Two-Step Reason Selection for Annotating Local Acceptability and Related Attributes in News Editorials | Wonsuk Yang, Seungwon Yoon, Ada Carpenter, Jong Park | In this study, we present a simple but powerful quality control method using two-step reason selection. |
294 | Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite | Prathyusha Jwalapuram, Shafiq Joty, Irina Temnikova, Preslav Nakov | With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation, covering multiple source languages and different pronoun errors drawn from real system translations, for English. |
295 | A Regularization Approach for Incorporating Event Knowledge and Coreference Relations into Neural Discourse Parsing | Zeyu Dai, Ruihong Huang | Realizing that external knowledge and linguistic constraints may not always apply in understanding a particular context, we propose a regularization approach that tightly integrates these constraints with contexts for deriving word representations. |
296 | Weakly Supervised Multilingual Causality Extraction from Wikipedia | Chikara Hashimoto | We present a method for extracting causality knowledge from Wikipedia, such as Protectionism → Trade war, where the cause and effect entities correspond to Wikipedia articles. |
297 | Attribute-aware Sequence Network for Review Summarization | Junjie Li, Xuepeng Wang, Dawei Yin, Chengqing Zong | Therefore, we propose an Attribute-aware Sequence Network (ASN) to take the aforementioned users’ characteristics into account, which includes three modules: an attribute encoder encodes the attribute preferences over the words; an attribute-aware review encoder adopts an attribute-based selective mechanism to select the important information of a review; and an attribute-aware summary decoder incorporates attribute embedding and attribute-specific word-using habits into word prediction. |
298 | Extractive Summarization of Long Documents by Combining Global and Local Context | Wen Xiao, Giuseppe Carenini | In this paper, we propose a novel neural single-document extractive summarization model for long documents, incorporating both the global context of the whole document and the local context within the current topic. |
299 | Enhancing Neural Data-To-Text Generation Models with External Background Knowledge | Shuang Chen, Jinpeng Wang, Xiaocheng Feng, Feng Jiang, Bing Qin, Chin-Yew Lin | In this paper, we enhance neural data-to-text models with external knowledge in a simple but effective way to improve the fidelity of generated text. |
300 | Reading Like HER: Human Reading Inspired Extractive Summarization | Ling Luo, Xiang Ao, Yan Song, Feiyang Pan, Min Yang, Qing He | In this work, we re-examine the problem of extractive text summarization for long documents. |
301 | Contrastive Attention Mechanism for Abstractive Sentence Summarization | Xiangyu Duan, Hongfei Yu, Mingming Yin, Min Zhang, Weihua Luo, Yue Zhang | We propose a contrastive attention mechanism to extend the sequence-to-sequence framework for abstractive sentence summarization task, which aims to generate a brief summary of a given source sentence. |
302 | NCLS: Neural Cross-Lingual Summarization | Junnan Zhu, Qian Wang, Yining Wang, Yu Zhou, Jiajun Zhang, Shaonan Wang, Chengqing Zong | To handle this, we present, for the first time, an end-to-end CLS framework, which we refer to as Neural Cross-Lingual Summarization (NCLS). |
303 | Clickbait? Sensational Headline Generation with Auto-tuned Reinforcement Learning | Peng Xu, Chien-Sheng Wu, Andrea Madotto, Pascale Fung | In this paper, we propose a model that generates sensational headlines without labeled data. |
304 | Concept Pointer Network for Abstractive Summarization | Wenbo Wang, Yang Gao, Heyan Huang, Yuxiang Zhou | Inspired by the popular pointer generator sequence-to-sequence model, this paper presents a concept pointer network for improving these aspects of abstractive summarization. |
305 | Surface Realisation Using Full Delexicalisation | Anastasia Shimorina, Claire Gardent | We propose a modular approach to surface realisation which models each of these components separately, and evaluate our approach on the 10 languages covered by the SR’18 Surface Realisation Shared Task shallow track. |
306 | IMaT: Unsupervised Text Attribute Transfer via Iterative Matching and Translation | Zhijing Jin, Di Jin, Jonas Mueller, Nicholas Matthews, Enrico Santus | In contrast, we propose a simpler approach, Iterative Matching and Translation (IMaT), which: (1) constructs a pseudo-parallel corpus by aligning a subset of semantically similar sentences from the source and the target corpora; (2) applies a standard sequence-to-sequence model to learn the attribute transfer; (3) iteratively improves the learned transfer function by refining imperfections in the alignment. |
307 | Better Rewards Yield Better Summaries: Learning to Summarise Without References | Florian Böhm, Yang Gao, Christian M. Meyer, Ori Shapira, Ido Dagan, Iryna Gurevych | To find a better reward function that can guide RL to generate human-appealing summaries, we learn a reward function from human ratings on 2,500 summaries. |
308 | Mixture Content Selection for Diverse Sequence Generation | Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi | We present a method to explicitly separate diversification from generation using a general plug-and-play module (called SELECTOR) that wraps around and guides an existing encoder-decoder model. |
309 | An End-to-End Generative Architecture for Paraphrase Generation | Qian Yang, Zhouyuan Huo, Dinghan Shen, Yong Cheng, Wenlin Wang, Guoyin Wang, Lawrence Carin | To overcome these challenges, we propose the first end-to-end conditional generative architecture for generating paraphrases via adversarial training, which does not depend on extra linguistic information. |
310 | Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time) | Heng Gong, Xiaocheng Feng, Bing Qin, Ting Liu | To address the aforementioned problems, we not only model each table cell by considering other records in the same row, but also enrich the table’s representation by modeling each cell in the context of other cells in the same column and of historical (time-dimension) data. |
311 | Subtopic-driven Multi-Document Summarization | Xin Zheng, Aixin Sun, Jing Li, Karthik Muthuswamy | In this paper, we propose a summarization model called STDS. |
312 | Referring Expression Generation Using Entity Profiles | Meng Cao, Jackie Chi Kit Cheung | In this study, we address this in two ways. First, we propose task setups in which we specifically test a REG system’s ability to generalize to entities not seen during training. Second, we propose a profile-based deep neural network model, ProfileREG, which encodes both the local context and an external profile of the entity to generate reference realizations. |
313 | Exploring Diverse Expressions for Paraphrase Generation | Lihua Qian, Lin Qiu, Weinan Zhang, Xin Jiang, Yong Yu | In this paper, we propose a novel approach with two discriminators and multiple generators to generate a variety of different paraphrases. |
314 | Enhancing AMR-to-Text Generation with Dual Graph Representations | Leonardo F. R. Ribeiro, Claire Gardent, Iryna Gurevych | To address this difficulty, we propose a novel graph-to-sequence model that encodes different but complementary perspectives of the structural information contained in the AMR graph. |
315 | Keeping Consistency of Sentence Generation and Document Classification with Multi-Task Learning | Toru Nishino, Shotaro Misawa, Ryuji Kano, Tomoki Taniguchi, Yasuhide Miura, Tomoko Ohkuma | The purpose of our study is to generate multiple outputs consistently. |
316 | Toward a Task of Feedback Comment Generation for Writing Learning | Ryo Nagata | In this paper, we introduce a novel task called feedback comment generation — a task of automatically generating feedback comments such as a hint or an explanatory note for writing learning for non-native learners of English. |
317 | Improving Question Generation With to the Point Context | Jingjing Li, Yifan Gao, Lidong Bing, Irwin King, Michael R. Lyu | To address this issue, we propose a method to jointly model the unstructured sentence and the structured answer-relevant relation (extracted from the sentence in advance) for question generation. |
318 | Deep Copycat Networks for Text-to-Text Generation | Julia Ive, Pranava Madhyastha, Lucia Specia | We introduce Copycat, a transformer-based pointer network for such tasks which obtains competitive results in abstractive text summarisation and generates more abstractive summaries. |
319 | Towards Controllable and Personalized Review Generation | Pan Li, Alexander Tuzhilin | In this paper, we propose a novel model RevGAN that automatically generates controllable and personalized user reviews based on the arbitrarily given sentimental and stylistic information. |
320 | Answers Unite! Unsupervised Metrics for Reinforced Summarization Models | Thomas Scialom, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano | We thus explore and propose alternative evaluation measures: the reported human-evaluation analysis shows that the proposed metrics, based on Question Answering, favorably compare to ROUGE — with the additional property of not requiring reference summaries. |
321 | Long and Diverse Text Generation with Planning-based Hierarchical Variational Model | Zhihong Shao, Minlie Huang, Jiangtao Wen, Wenfei Xu, Xiaoyan Zhu | To address these issues, we propose a Planning-based Hierarchical Variational Model (PHVM). |
322 | “Transforming” Delete, Retrieve, Generate Approach for Controlled Text Style Transfer | Akhilesh Sudhakar, Bhargav Upadhyay, Arjun Maheswaran | In this work we introduce the Generative Style Transformer (GST) – a new approach to rewriting sentences to a target style in the absence of parallel style corpora. |
323 | An Entity-Driven Framework for Abstractive Summarization | Eva Sharma, Luyang Huang, Zhe Hu, Lu Wang | In this paper, we introduce SENECA, a novel System for ENtity-drivEn Coherent Abstractive summarization framework that leverages entity information to generate informative and coherent abstracts. |
324 | Neural Extractive Text Summarization with Syntactic Compression | Jiacheng Xu, Greg Durrett | In this work, we present a neural model for single-document summarization based on joint extraction and syntactic compression. |
325 | Domain Adaptive Text Style Transfer | Dianqi Li, Yizhe Zhang, Zhe Gan, Yu Cheng, Chris Brockett, Bill Dolan, Ming-Ting Sun | In this paper, we examine domain adaptation for text style transfer to leverage massively available data from other domains. |
326 | Let’s Ask Again: Refine Network for Automatic Question Generation | Preksha Nema, Akash Kumar Mohankumar, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran | In this work, we focus on the task of Automatic Question Generation (AQG), where, given a passage and an answer, the task is to generate the corresponding question. |
327 | Earlier Isn’t Always Better: Sub-aspect Analysis on Corpus and System Biases in Summarization | Taehee Jung, Dongyeop Kang, Lucas Mentch, Eduard Hovy | Following in the spirit of the claim that summarization is a combination of sub-functions, we define three sub-aspects of summarization: position, importance, and diversity and conduct an extensive analysis of the biases of each sub-aspect with respect to the domain of nine different summarization corpora (e.g., news, academic papers, meeting minutes, movie script, books, posts). |
328 | Lost in Evaluation: Misleading Benchmarks for Bilingual Dictionary Induction | Yova Kementchedjhieva, Mareike Hartmann, Anders Søgaard | We study the composition and quality of the test sets for five diverse languages from this dataset, with concerning findings: (1) a quarter of the data consists of proper nouns, which can be hardly indicative of BDI performance, and (2) there are pervasive gaps in the gold-standard targets. |
329 | Towards Realistic Practices In Low-Resource Natural Language Processing: The Development Set | Katharina Kann, Kyunghyun Cho, Samuel R. Bowman | Here, we aim to answer the following questions: Does using a development set for early stopping in the low-resource setting influence results as compared to a more realistic alternative, where the number of training epochs is tuned on development languages? And does it lead to overestimation or underestimation of performance? |
330 | Synchronously Generating Two Languages with Interactive Decoding | Yining Wang, Jiajun Zhang, Long Zhou, Yuchen Liu, Chengqing Zong | In this paper, we introduce a novel interactive approach to translate a source language into two different languages simultaneously and interactively. |
331 | On NMT Search Errors and Model Errors: Cat Got Your Tongue? | Felix Stahlberg, Bill Byrne | We present an exact inference procedure for neural sequence models based on a combination of beam search and depth-first search. |
332 | “Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding | Ben Zhou, Daniel Khashabi, Qiang Ning, Dan Roth | This paper systematically studies this temporal commonsense problem. |
333 | QAInfomax: Learning Robust Question Answering System by Mutual Information Maximization | Yi-Ting Yeh, Yun-Nung Chen | To address this problem, we propose QAInfomax as a regularizer in reading comprehension systems by maximizing mutual information among passages, a question, and its answer. |
334 | Adapting Meta Knowledge Graph Information for Multi-Hop Reasoning over Few-Shot Relations | Xin Lv, Yuxian Gu, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu | In this paper, we propose a meta-based multi-hop reasoning method (Meta-KGR), which adopts meta-learning to learn effective meta parameters from high-frequency relations that could quickly adapt to few-shot relations. |
335 | How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG | Paul Trichelair, Ali Emami, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung | The question we ask in this paper is whether improved performance on these benchmarks represents genuine progress towards common-sense-enabled systems. |
336 | Pun-GAN: Generative Adversarial Network for Pun Generation | Fuli Luo, Shunyao Li, Pengcheng Yang, Lei Li, Baobao Chang, Zhifang Sui, Xu Sun | In this paper, we focus on the task of generating a pun sentence given a pair of word senses. |
337 | Multi-Task Learning with Language Modeling for Question Generation | Wenjie Zhou, Minghua Zhang, Yunfang Wu | Based on the attention-based pointer generator model, we propose to incorporate an auxiliary task of language modeling to help question generation in a hierarchical multi-task learning structure. |
338 | Autoregressive Text Generation Beyond Feedback Loops | Florian Schmidt, Stephan Mandt, Thomas Hofmann | In this paper, we combine a latent state space model with a CRF observation model. |
339 | The Woman Worked as a Babysitter: On Biases in Language Generation | Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng | We present a systematic study of biases in natural language generation (NLG) by analyzing text generated from prompts that contain mentions of different demographic groups. |
340 | On the Importance of Delexicalization for Fact Verification | Sandeep Suntwal, Mithun Paul, Rebecca Sharp, Mihai Surdeanu | Here, we investigate the importance that a model assigns to various aspects of data while learning and making predictions, specifically, in a recognizing textual entailment (RTE) task. |
341 | Towards Debiasing Fact Verification Models | Tal Schuster, Darsh Shah, Yun Jie Serene Yeo, Daniel Roberto Filizzola Ortiz, Enrico Santus, Regina Barzilay | In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely based on the claim, without considering any evidence. |
342 | Recognizing Conflict Opinions in Aspect-level Sentiment Classification with Dual Attention Networks | Xingwei Tan, Yi Cai, Changxi Zhu | In this paper, we propose a multi-label classification model with dual attention mechanism to address these problems. |
343 | Investigating Dynamic Routing in Tree-Structured LSTM for Sentiment Analysis | Jin Wang, Liang-Chih Yu, K. Robert Lai, Xuejie Zhang | To overcome the bias problem, this study proposes a capsule tree-LSTM model, introducing a dynamic routing algorithm as an aggregation layer to build sentence representation by assigning different weights to nodes according to their contributions to prediction. |
344 | A Label Informative Wide & Deep Classifier for Patents and Papers | Muyao Niu, Jie Cai | In this paper, we provide a simple and effective baseline for classifying both patents and papers to the well-established Cooperative Patent Classification (CPC). |
345 | Text Level Graph Neural Network for Text Classification | Lianzhe Huang, Dehong Ma, Sujian Li, Xiaodong Zhang, Houfeng Wang | To tackle the problems, we propose a new GNN based model that builds graphs for each input text with global parameter sharing, instead of a single graph for the whole corpus. |
346 | Semantic Relatedness Based Re-ranker for Text Spotting | Ahmed Sabir, Francesc Moreno, Lluís Padró | Our goal is to improve the performance of vision systems by leveraging semantic information. |
347 | Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings | Hwiyeol Jo, Ceyda Cinarel | We propose a novel and simple method for semi-supervised text classification. |
348 | Visual Detection with Context for Document Layout Analysis | Carlos Soto, Shinjae Yoo | We present 1) a work-in-progress method to visually segment key regions of scientific articles using an object detection technique augmented with contextual features, and 2) a novel dataset of region-labeled articles. |
349 | Evaluating Topic Quality with Posterior Variability | Linzi Xing, Michael J. Paul, Giuseppe Carenini | We derive a novel measure of LDA topic quality using the variability of the posterior distributions. |
350 | Neural Topic Model with Reinforcement Learning | Lin Gui, Jia Leng, Gabriele Pergola, Yu Zhou, Ruifeng Xu, Yulan He | In this paper, we borrow the idea of reinforcement learning and incorporate topic coherence measures as reward signals to guide the learning of a VAE-based topic model. |
351 | Modelling Stopping Criteria for Search Results using Poisson Processes | Alison Sneyd, Mark Stevenson | In this work, a novel method for determining a stopping criterion is proposed that models the rate at which relevant documents occur using a Poisson process. |
352 | Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval | Zeynep Akkalyoncu Yilmaz, Wei Yang, Haotian Zhang, Jimmy Lin | This paper applies BERT to ad hoc document retrieval on news articles, which requires addressing two challenges: relevance judgments in existing test collections are typically provided only at the document level, and documents often exceed the length that BERT was designed to handle. |
353 | The Challenges of Optimizing Machine Translation for Low Resource Cross-Language Information Retrieval | Constantine Lignos, Daniel Cohen, Yen-Chieh Lien, Pratik Mehta, W. Bruce Croft, Scott Miller | In this paper, we examine the relationship between the performance of MT systems and both neural and term frequency-based IR models to identify how CLIR performance can be best predicted from MT quality. |
354 | Rotate King to get Queen: Word Relationships as Orthogonal Transformations in Embedding Space | Kawin Ethayarajh | We document an alternative way in which downstream models might learn these relationships: orthogonal and linear transformations. |
355 | GlossBERT: BERT for Word Sense Disambiguation with Gloss Knowledge | Luyao Huang, Chi Sun, Xipeng Qiu, Xuanjing Huang | In this paper, we focus on how to better leverage gloss knowledge in a supervised neural WSD system. |
356 | Leveraging Adjective-Noun Phrasing Knowledge for Comparison Relation Prediction in Text-to-SQL | Haoyan Liu, Lei Fang, Qian Liu, Bei Chen, Jian-Guang Lou, Zhoujun Li | In this paper, we propose to leverage adjective-noun phrasing knowledge mined from the web to predict the comparison relations in text-to-SQL. |
357 | Bridging the Defined and the Defining: Exploiting Implicit Lexical Semantic Relations in Definition Modeling | Koki Washio, Satoshi Sekine, Tsuneaki Kato | In this paper, we propose definition modeling methods that use lexical semantic relations. |
358 | Don’t Just Scratch the Surface: Enhancing Word Representations for Korean with Hanja | Kang Min Yoo, Taeuk Kim, Sang-goo Lee | We propose a simple yet effective approach for improving Korean word representations using additional linguistic annotation (i.e., Hanja). |
359 | SyntagNet: Challenging Supervised Word Sense Disambiguation with Lexical-Semantic Combinations | Marco Maru, Federico Scozzafava, Federico Martelli, Roberto Navigli | This paper introduces SyntagNet, a novel resource consisting of manually disambiguated lexical-semantic combinations. |
360 | Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition | Genta Indra Winata, Zhaojiang Lin, Jamin Shin, Zihan Liu, Pascale Fung | Therefore, we propose Hierarchical Meta-Embeddings (HME) that learn to combine multiple monolingual word-level and subword-level embeddings to create language-agnostic lexical representations. |
361 | Fine-tune BERT with Sparse Self-Attention Mechanism | Baiyun Cui, Yingming Li, Ming Chen, Zhongfei Zhang | In this paper, we develop a novel Sparse Self-Attention Fine-tuning model (referred to as SSAF) which integrates sparsity into the self-attention mechanism to enhance the fine-tuning performance of BERT. |
362 | Feature-Dependent Confusion Matrices for Low-Resource NER Labeling with Noisy Labels | Lukas Lange, Michael A. Hedderich, Dietrich Klakow | We propose to cluster the training data using the input features and then compute different confusion matrices for each cluster. |
363 | A Multi-Pairwise Extension of Procrustes Analysis for Multilingual Word Translation | Hagai Taitelbaum, Gal Chechik, Jacob Goldberger | In this paper we present a novel approach to simultaneously representing multiple languages in a common space. |
364 | Out-of-Domain Detection for Low-Resource Text Classification Tasks | Ming Tan, Yang Yu, Haoyu Wang, Dakuo Wang, Saloni Potdar, Shiyu Chang, Mo Yu | In this work, we propose an OOD-resistant Prototypical Network to tackle this zero-shot OOD detection and few-shot ID classification task. |
365 | Harnessing Pre-Trained Neural Networks with Rules for Formality Style Transfer | Yunli Wang, Yu Wu, Lili Mou, Zhoujun Li, Wenhan Chao | We propose three fine-tuning methods in this paper and achieve a new state of the art on benchmark datasets. |
366 | Multiple Text Style Transfer by using Word-level Conditional Generative Adversarial Network with Two-Phase Training | Chih-Te Lai, Yi-Te Hong, Hong-You Chen, Chi-Jen Lu, Shou-De Lin | In this paper, we propose a new GAN model with a word-level conditional architecture and a two-phase training procedure. |
367 | Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition | Yufan Jiang, Chi Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu | In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing. |
368 | Using Pairwise Occurrence Information to Improve Knowledge Graph Completion on Large-Scale Datasets | Esma Balkir, Masha Naslidnyk, Dave Palfrey, Arpit Mittal | In this paper we use occurrences of entity-relation pairs in the dataset to construct a joint learning model and to increase the quality of sampled negatives during training. |
369 | Single Training Dimension Selection for Word Embedding with PCA | Yu Wang | In this paper, we present a fast and reliable method based on PCA to select the number of dimensions for word embeddings. |
370 | A Surprisingly Effective Fix for Deep Latent Variable Modeling of Text | Bohan Li, Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick, Yiming Yang | In this paper, we investigate a simple fix for posterior collapse which yields surprisingly effective results. |
371 | SciBERT: A Pretrained Language Model for Scientific Text | Iz Beltagy, Kyle Lo, Arman Cohan | We release SciBERT, a pretrained language model based on BERT (Devlin et al., 2018) to address the lack of high-quality, large-scale labeled scientific data. |
372 | Humor Detection: A Transformer Gets the Last Laugh | Orion Weller, Kevin Seppi | In this paper we extend that capability by proposing a new task: assessing whether or not a joke is humorous. |
373 | Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training | Alham Fikri Aji, Kenneth Heafield, Nikolay Bogoychev | We restore gradient quality by combining the compressed global gradient with the node’s locally computed uncompressed gradient. |
374 | Small and Practical BERT Models for Sequence Labeling | Henry Tsai, Jason Riesa, Melvin Johnson, Naveen Arivazhagan, Xin Li, Amelia Archer | We propose a practical scheme to train a single multilingual sequence labeling model that yields state-of-the-art results and is small and fast enough to run on a single CPU. |
375 | Data Augmentation with Atomic Templates for Spoken Language Understanding | Zijian Zhao, Su Zhu, Kai Yu | In this work, we propose a data augmentation method with atomic templates for SLU, which involves minimal human effort. |
376 | PaLM: A Hybrid Parser and Language Model | Hao Peng, Roy Schwartz, Noah A. Smith | We present PaLM, a hybrid parser and neural language model. |
377 | A Pilot Study for Chinese SQL Semantic Parsing | Qingkai Min, Yuefeng Shi, Yue Zhang | We compare character- and word-based encoders for a semantic parser, and different embedding schemes. |
378 | Global Reasoning over Database Structures for Text-to-SQL Parsing | Ben Bogin, Matt Gardner, Jonathan Berant | In this work, we propose a semantic parser that globally reasons about the structure of the output query to make a more contextually-informed selection of database constants. |
379 | Transductive Learning of Neural Language Models for Syntactic and Semantic Analysis | Hiroki Ouchi, Jun Suzuki, Kentaro Inui | Here we conduct an empirical study of transductive learning for neural models and demonstrate its utility in syntactic and semantic tasks. |
380 | Efficient Sentence Embedding using Discrete Cosine Transform | Nada Almarwani, Hanan Aldarmaki, Mona Diab | As an efficient alternative, we propose the use of discrete cosine transform (DCT) to compress word sequences in an order-preserving manner. |
381 | A Search-based Neural Model for Biomedical Nested and Overlapping Event Detection | Kurt Junshean Espinosa, Makoto Miwa, Sophia Ananiadou | We tackle the nested and overlapping event detection task and propose a novel search-based neural network (SBNN) structured prediction model that treats the task as a search problem on a relation graph of trigger-argument structures. |
382 | PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification | Yinfei Yang, Yuan Zhang, Chris Tar, Jason Baldridge | We remedy this gap with PAWS-X, a new dataset of 23,659 human translated PAWS evaluation pairs in six typologically distinct languages: French, Spanish, German, Chinese, Japanese, and Korean. |
383 | Pretrained Language Models for Sequential Sentence Classification | Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Dan Weld | In this work, we show that pretrained language models, BERT (Devlin et al., 2018) in particular, can be used for this task to capture contextual dependencies without the need for hierarchical encoding or a CRF. |
384 | Emergent Linguistic Phenomena in Multi-Agent Communication Games | Laura Harding Graesser, Kyunghyun Cho, Douwe Kiela | We describe a multi-agent communication framework for examining high-level linguistic phenomena at the community level. |
385 | TalkDown: A Corpus for Condescension Detection in Context | Zijian Wang, Christopher Potts | To address this, we present TalkDown, a new labeled dataset of condescending linguistic acts in context. |
386 | Summary Cloze: A New Task for Content Selection in Topic-Focused Summarization | Daniel Deutsch, Dan Roth | In this work, we propose a new method for studying content selection in topic-focused summarization called the summary cloze task. |
387 | Text Summarization with Pretrained Encoders | Yang Liu, Mirella Lapata | In this paper, we showcase how BERT can be usefully applied in text summarization and propose a general framework for both extractive and abstractive models. |
388 | How to Write Summaries with Patterns? Learning towards Abstractive Summarization through Prototype Editing | Shen Gao, Xiuying Chen, Piji Li, Zhangming Chan, Dongyan Zhao, Rui Yan | To tackle these challenges, we design a model named Prototype Editing based Summary Generator (PESG). |
389 | BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle | Peter West, Ari Holtzman, Jan Buys, Yejin Choi | In this paper, we propose a novel approach to unsupervised sentence summarization by mapping the Information Bottleneck principle to a conditional language modelling objective: given a sentence, our approach seeks a compressed sentence that can best predict the next sentence. |
390 | Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator | Xiaoyu Shen, Yang Zhao, Hui Su, Dietrich Klakow | In this paper, we address these problems by allowing the model to “edit” pointed tokens instead of always hard copying them. |
391 | Learning Semantic Parsers from Denotations with Latent Structured Alignments and Abstract Programs | Bailin Wang, Ivan Titov, Mirella Lapata | Our goal is to instill an inductive bias in the parser to help it distinguish between spurious and correct programs. |
392 | Broad-Coverage Semantic Parsing as Transduction | Sheng Zhang, Xutai Ma, Kevin Duh, Benjamin Van Durme | We unify different broad-coverage semantic parsing tasks into a transduction parsing paradigm, and propose an attention-based neural transducer that incrementally builds meaning representation via a sequence of semantic relations. |
393 | Core Semantic First: A Top-down Approach for AMR Parsing | Deng Cai, Wai Lam | We introduce a novel scheme for parsing a piece of text into its Abstract Meaning Representation (AMR): Graph Spanning based Parsing (GSP). |
394 | Don’t paraphrase, detect! Rapid and Effective Data Collection for Semantic Parsing | Jonathan Herzig, Jonathan Berant | In this paper, we thoroughly analyze two sources of mismatch in this process: the mismatch in logical form distribution and the mismatch in language distribution between the true and induced distributions. We quantify the effects of these mismatches, and propose a new data collection approach that mitigates them. |
395 | Improving Distantly-Supervised Relation Extraction with Joint Label Embedding | Linmei Hu, Luhao Zhang, Chuan Shi, Liqiang Nie, Weili Guan, Cheng Yang | In this paper, we propose a novel multi-layer attention-based model to improve relation extraction with joint label embedding. |
396 | Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network | Dianbo Sui, Yubo Chen, Kang Liu, Jun Zhao, Shengping Liu | We present a Collaborative Graph Network to solve these challenges. |
397 | Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction | Qinyuan Ye, Liyuan Liu, Maosen Zhang, Xiang Ren | In this paper, we study what limits the performance of DS-trained neural models, conduct thorough analyses, and identify a factor that can greatly influence performance: shifted label distribution. |
398 | Easy First Relation Extraction with Information Redundancy | Shuai Ma, Gang Wang, Yansong Feng, Jinpeng Huai | In this paper, we propose an easy-first approach for relation extraction that exploits the information redundancies embedded in the results produced by local sentence-level extractors, resolving conflicting decisions with domain and uniqueness constraints. |
399 | Dependency-Guided LSTM-CRF for Named Entity Recognition | Zhanming Jie, Wei Lu | In this work, we propose a simple yet effective dependency-guided LSTM-CRF model to encode the complete dependency trees and capture the above properties for the task of named entity recognition (NER). |
400 | Cross-Cultural Transfer Learning for Text Classification | Dor Ringel, Gal Lavee, Ido Guy, Kira Radinsky | In this work, we show that cross-cultural differences can be harnessed for natural language text classification. |
401 | Combining Unsupervised Pre-training and Annotator Rationales to Improve Low-shot Text Classification | Oren Melamud, Mihaela Bornea, Ken Barker | In this work, we combine these two approaches to improve low-shot text classification with two novel methods: a simple bag-of-words embedding approach; and a more complex context-aware method, based on the BERT model. |
402 | ProSeqo: Projection Sequence Networks for On-Device Text Classification | Zornitsa Kozareva, Sujith Ravi | We propose a novel on-device sequence model for text classification using recurrent projections. |
403 | Induction Networks for Few-Shot Text Classification | Ruiying Geng, Binhua Li, Yongbin Li, Xiaodan Zhu, Ping Jian, Jian Sun | In this paper, we propose a novel Induction Network to learn such a generalized class-wise representation, by innovatively leveraging the dynamic routing algorithm in meta-learning. |
404 | Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach | Wenpeng Yin, Jamaal Hay, Dan Roth | Our contributions include datasets that facilitate studying 0Shot-TC relative to conceptually different and diverse aspects: the “topic” aspect includes “sports” and “politics” as labels; the “emotion” aspect includes “joy” and “anger”; the “situation” aspect includes “medical assistance” and “water shortage”. |
405 | A Logic-Driven Framework for Consistency of Neural Models | Tao Li, Vivek Gupta, Maitrey Mehta, Vivek Srikumar | In this paper, we formalize such inconsistency as a generalization of prediction error. |
406 | Style Transfer for Texts: Retrain, Report Errors, Compare with Rewrites | Alexey Tikhonov, Viacheslav Shibaev, Aleksander Nagaev, Aigul Nugmanova, Ivan P. Yamshchikov | This paper shows that standard assessment methodology for style transfer has several significant problems. |
407 | Implicit Deep Latent Variable Models for Text Generation | Le Fang, Chunyuan Li, Jianfeng Gao, Wen Dong, Changyou Chen | In this paper, we advocate sample-based representations of variational distributions for natural language, leading to implicit latent features, which can provide flexible representation power compared with Gaussian-based posteriors. |
408 | Text Emotion Distribution Learning from Small Sample: A Meta-Learning Approach | Zhenjie Zhao, Xiaojuan Ma | In this paper, we propose a meta-learning approach to learn text emotion distributions from a small sample. |
409 | Judge the Judges: A Large-Scale Evaluation Study of Neural Language Models for Online Review Generation | Cristina Garbacea, Samuel Carton, Shiyan Yan, Qiaozhu Mei | We conduct a large-scale, systematic study to evaluate the existing evaluation methods for natural language generation in the context of generating online product reviews. |
410 | Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks | Nils Reimers, Iryna Gurevych | In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. |
411 | Learning Only from Relevant Keywords and Unlabeled Documents | Nontawat Charoenphakdee, Jongyeong Lee, Yiping Jin, Dittaya Wanvarie, Masashi Sugiyama | In this paper, we propose a theoretically guaranteed learning framework that is simple to implement and has flexible choices of models, e.g., linear models or neural networks. |
412 | Denoising based Sequence-to-Sequence Pre-training for Text Generation | Liang Wang, Wei Zhao, Ruoyu Jia, Sujian Li, Jingming Liu | This paper presents a new sequence-to-sequence (seq2seq) pre-training method PoDA (Pre-training of Denoising Autoencoders), which learns representations suitable for text generation tasks. |
413 | Dialog Intent Induction with Deep Multi-View Clustering | Hugh Perkins, Yi Yang | We introduce the dialog intent induction task and present a novel deep multi-view clustering approach to tackle the problem. |
414 | Nearly-Unsupervised Hashcode Representations for Biomedical Relation Extraction | Sahil Garg, Aram Galstyan, Greg Ver Steeg, Guillermo Cecchi | In this paper, we propose to optimize the hashcode representations in a nearly unsupervised manner, in which we only use data points, but not their class labels, for learning. |
415 | Auditing Deep Learning processes through Kernel-based Explanatory Models | Danilo Croce, Daniele Rossini, Roberto Basili | In this paper, we discuss the application of Layerwise Relevance Propagation over a linguistically motivated neural architecture, the Kernel-based Deep Architecture, in order to trace back connections between linguistic properties of input instances and system decisions. |
416 | Enhancing Variational Autoencoders with Mutual Information Neural Estimation for Text Generation | Dong Qian, William K. Cheung | In this paper, we propose to introduce a mutual information (MI) term between the input and its latent variable to regularize the objective of the VAE. |
417 | Sampling Bias in Deep Active Classification: An Empirical Study | Ameya Prabhu, Charles Dognin, Maneesh Singh | Based on the above, we propose a simple baseline for deep active text classification that outperforms the state of the art. |
418 | Don’t Take the Easy Way Out: Ensemble Based Methods for Avoiding Known Dataset Biases | Christopher Clark, Mark Yatskar, Luke Zettlemoyer | In this paper, we show that if we have prior knowledge of such biases, we can train a model to be more robust to domain shift. |
419 | Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation | Po-Sen Huang, Robert Stanforth, Johannes Welbl, Chris Dyer, Dani Yogatama, Sven Gowal, Krishnamurthy Dvijotham, Pushmeet Kohli | In this work, we approach the problem from the opposite direction: to formally verify a system’s robustness against a predefined class of adversarial attacks. |
420 | Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control | Mo Yu, Shiyu Chang, Yang Zhang, Tommi Jaakkola | We introduce an introspective model which explicitly predicts and incorporates the outcome into the selection process. |
421 | Experimenting with Power Divergences for Language Modeling | Matthieu Labeau, Shay B. Cohen | In this paper, we experiment with several families (alpha, beta and gamma) of power divergences, generalized from the KL divergence, for learning language models with an objective different than standard MLE. |
422 | Hierarchically-Refined Label Attention Network for Sequence Labeling | Leyang Cui, Yue Zhang | For better representing label sequences, we investigate a hierarchically-refined label attention network, which explicitly leverages label embeddings and captures potential long-term label dependency by giving each word incrementally refined label distributions with hierarchical attention. |
423 | Certified Robustness to Adversarial Word Substitutions | Robin Jia, Aditi Raghunathan, Kerem Göksel, Percy Liang | We train the first models that are provably robust to all word substitutions in this family. |
424 | Visualizing and Understanding the Effectiveness of BERT | Yaru Hao, Li Dong, Furu Wei, Ke Xu | In this paper, we propose to visualize loss landscapes and optimization trajectories of fine-tuning BERT on specific datasets. |
425 | Topics to Avoid: Demoting Latent Confounds in Text Classification | Sachin Kumar, Shuly Wintner, Noah A. Smith, Yulia Tsvetkov | We propose a method that represents the latent topical confounds and a model that “unlearns” confounding features by predicting both the label of the input text and the confound; the two predictors are trained adversarially, in an alternating fashion, to learn a text representation that predicts the correct label but is less prone to using information about the confound. |
426 | Learning to Ask for Conversational Machine Learning | Shashank Srivastava, Igor Labutov, Tom Mitchell | We present a reinforcement learning framework, where the learner’s actions correspond to question types and the reward for asking a question is based on how the teacher’s response changes performance of the resulting machine learning model on the learning task. |
427 | Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training | Hila Gonen, Yoav Goldberg | We tackle these three issues: we propose an ASR-motivated evaluation setup which is decoupled from an ASR system and the choice of vocabulary, and provide an evaluation dataset for English-Spanish code-switching. |
428 | Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Inputs | Angela Fan, Claire Gardent, Chloé Braud, Antoine Bordes | We propose constructing a local graph structured knowledge base for each query, which compresses the web search information and reduces redundancy. |
429 | Fine-grained Knowledge Fusion for Sequence Labeling Domain Adaptation | Huiyun Yang, Shujian Huang, Xin-Yu Dai, Jiajun Chen | To take the multi-level domain relevance discrepancy into account, in this paper, we propose a fine-grained knowledge fusion model with the domain relevance modeling scheme to control the balance between learning from the target domain data and learning from the source domain model. |
430 | Exploiting Monolingual Data at Scale for Neural Machine Translation | Lijun Wu, Yiren Wang, Yingce Xia, Tao Qin, Jianhuang Lai, Tie-Yan Liu | In this work, we study how to use both the source-side and target-side monolingual data for NMT, and propose an effective strategy leveraging both of them. |
431 | Meta Relational Learning for Few-Shot Link Prediction in Knowledge Graphs | Mingyang Chen, Wen Zhang, Wei Zhang, Qiang Chen, Huajun Chen | In this work, we propose a Meta Relational Learning (MetaR) framework to do the common but challenging few-shot link prediction in KGs, namely predicting new triples about a relation by only observing a few associative triples. |
432 | Distributionally Robust Language Modeling | Yonatan Oren, Shiori Sagawa, Tatsunori Hashimoto, Percy Liang | To remedy this without the knowledge of the test distribution, we propose an approach which trains a model that performs well over a wide range of potential test distributions. |
433 | Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling | Xiaochuang Han, Jacob Eisenstein | To address this scenario, we propose domain-adaptive fine-tuning, in which the contextualized embeddings are adapted by masked language modeling on text from the target domain. |
434 | Learning Latent Parameters without Human Response Patterns: Item Response Theory with Artificial Crowds | John P. Lalor, Hao Wu, Hong Yu | In this work we propose learning IRT models using RPs generated from artificial crowds of DNN models. |
435 | Parallel Iterative Edit Models for Local Sequence Transduction | Abhijeet Awasthi, Sunita Sarawagi, Rasna Goyal, Sabyasachi Ghosh, Vihari Piratla | We present a Parallel Iterative Edit (PIE) model for the problem of local sequence transduction arising in tasks like Grammatical error correction (GEC). |
436 | ARAML: A Stable Adversarial Training Framework for Text Generation | Pei Ke, Fei Huang, Minlie Huang, Xiaoyan Zhu | To tackle this problem, we propose a novel framework called Adversarial Reward Augmented Maximum Likelihood (ARAML). |
437 | FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | Xuezhe Ma, Chunting Zhou, Xian Li, Graham Neubig, Eduard Hovy | In this paper, we propose a simple, efficient, and effective model for non-autoregressive sequence generation using latent variable models. |
438 | Compositional Generalization for Primitive Substitutions | Yuanpeng Li, Liang Zhao, Jianyu Wang, Joel Hestness | In this paper, we conduct fundamental research for encoding compositionality in neural networks. |
439 | WikiCREM: A Large Unsupervised Corpus for Coreference Resolution | Vid Kocijan, Oana-Maria Camburu, Ana-Maria Cretu, Yordan Yordanov, Phil Blunsom, Thomas Lukasiewicz | In this work, we introduce WikiCREM (Wikipedia CoREferences Masked), a large-scale yet accurate dataset of pronoun disambiguation instances. |
440 | Identifying and Explaining Discriminative Attributes | Armins Stepanjans, André Freitas | This paper describes an explicit word vector representation model (WVM) to support the identification of discriminative attributes. |
441 | Patient Knowledge Distillation for BERT Model Compression | Siqi Sun, Yu Cheng, Zhe Gan, Jingjing Liu | In order to alleviate this resource hunger in large-scale model training, we propose a Patient Knowledge Distillation approach to compress an original large model (teacher) into an equally-effective lightweight shallow network (student). |
442 | Neural Gaussian Copula for Variational Autoencoder | Prince Zizhuang Wang, William Yang Wang | We propose Gaussian Copula Variational Autoencoder (VAE) to avert this problem. |
443 | Transformer Dissection: An Unified Understanding for Transformer’s Attention via the Lens of Kernel | Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov | In this paper, we present a new formulation of attention via the lens of the kernel. |
444 | Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification | Jiawei Wu, Wenhan Xiong, William Yang Wang | In this paper, we propose a meta-learning method to capture these complex label dependencies. |
445 | Revealing the Dark Secrets of BERT | Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky | In the current work, we focus on the interpretation of self-attention, which is one of the fundamental underlying components of BERT. |
446 | Machine Translation With Weakly Paired Documents | Lijun Wu, Jinhua Zhu, Di He, Fei Gao, Tao Qin, Jianhuang Lai, Tie-Yan Liu | Observing that weakly paired bilingual documents are much easier to collect than bilingual sentences, e.g., from Wikipedia, news websites or books, in this paper, we investigate training translation models with weakly paired bilingual documents. |
447 | Countering Language Drift via Visual Grounding | Jason Lee, Kyunghyun Cho, Douwe Kiela | We recast translation as a multi-agent communication game and examine auxiliary training constraints for their effectiveness in mitigating language drift. |
448 | The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives | Elena Voita, Rico Sennrich, Ivan Titov | In this work, we use canonical correlation analysis and mutual information estimators to study how information flows across Transformer layers and observe that the choice of the objective determines this process. |
449 | Do We Really Need Fully Unsupervised Cross-Lingual Embeddings? | Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen | In this paper, we question the ability of even the most robust unsupervised CLWE approaches to induce meaningful CLWEs in these more challenging settings. |
450 | Weakly-Supervised Concept-based Adversarial Learning for Cross-lingual Word Embeddings | Haozhou Wang, James Henderson, Paola Merlo | In this paper, we propose a weakly-supervised adversarial training method to overcome this limitation, based on the intuition that mapping across languages is better done at the concept level than at the word level. |
451 | Aligning Cross-Lingual Entities with Multi-Aspect Information | Hsiu-Wei Yang, Yanyan Zou, Peng Shi, Wei Lu, Jimmy Lin, Xu Sun | In this work, we investigate embedding-based approaches to encode entities from multilingual KGs into the same vector space, where equivalent entities are close to each other. |
452 | Contrastive Language Adaptation for Cross-Lingual Stance Detection | Mitra Mohtarami, James Glass, Preslav Nakov | In particular, we introduce a novel contrastive language adaptation approach applied to memory networks, which ensures accurate alignment of stances in the source and target languages, and can effectively deal with the challenge of limited labeled data in the target language. |
453 | Jointly Learning to Align and Translate with Transformer Models | Sarthak Garg, Stephan Peitz, Udhyakumar Nallasamy, Matthias Paulik | In this paper, we present an approach to train a Transformer model to produce both accurate translations and alignments. |
454 | Social IQa: Commonsense Reasoning about Social Interactions | Maarten Sap, Hannah Rashkin, Derek Chen, Ronan Le Bras, Yejin Choi | We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. |
455 | Self-Assembling Modular Networks for Interpretable Multi-Hop Reasoning | Yichen Jiang, Mohit Bansal | In this work, we present an interpretable, controller-based Self-Assembling Neural Modular Network (Hu et al., 2017, 2018) for multi-hop reasoning, where we design four novel modules (Find, Relocate, Compare, NoOp) to perform unique types of language reasoning. |
456 | Posing Fair Generalization Tasks for Natural Language Inference | Atticus Geiger, Ignacio Cases, Lauri Karttunen, Christopher Potts | In this paper, we define and motivate a formal notion of fairness in this sense. |
457 | Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text | Bhavana Dalvi, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark | We present our new model (XPAD) that biases effect predictions towards those that (1) explain more of the actions in the paragraph and (2) are more plausible with respect to background knowledge. |
458 | CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text | Koustuv Sinha, Shagun Sodhani, Jin Dong, Joelle Pineau, William L. Hamilton | In this work, we introduce a diagnostic benchmark suite, named CLUTRR, to clarify some key issues related to the robustness and systematicity of NLU systems. |
459 | Taskmaster-1: Toward a Realistic and Diverse Dialog Dataset | Bill Byrne, Karthik Krishnamoorthi, Chinnadhurai Sankar, Arvind Neelakantan, Ben Goodrich, Daniel Duckworth, Semih Yavuz, Amit Dubey, Kyu-Young Kim, Andy Cedilnik | To help satisfy this elementary requirement, we introduce the initial release of the Taskmaster-1 dataset which includes 13,215 task-based dialogs comprising six domains. |
460 | Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data | Denis Peskov, Nancy Clarke, Jason Krone, Brigi Fodor, Yi Zhang, Adel Youssef, Mona Diab | In this paper, we present strategies toward curating and annotating large-scale goal-oriented dialogue data. We introduce the MultiDoGO dataset to overcome these limitations. |
461 | Build it Break it Fix it for Dialogue Safety: Robustness from Adversarial Human Attack | Emily Dinan, Samuel Humeau, Bharath Chintagunta, Jason Weston | In this work, we develop a training scheme for a model to become robust to such human attacks by an iterative build it, break it, fix it scheme with humans and models in the loop. |
462 | GECOR: An End-to-End Generative Ellipsis and Co-reference Resolution Model for Task-Oriented Dialogue | Jun Quan, Deyi Xiong, Bonnie Webber, Changjian Hu | In this paper, we treat the resolution of ellipsis and co-reference in dialogue as a problem of generating omitted or referred expressions from the dialogue context. |
463 | Task-Oriented Conversation Generation Using Heterogeneous Memory Networks | Zehao Lin, Xinjing Huang, Feng Ji, Haiqing Chen, Yin Zhang | In this paper, we propose novel and versatile external memory networks, called Heterogeneous Memory Networks (HMNs), to simultaneously utilize user utterances, dialogue history and background knowledge tuples. |
464 | Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks | Chen Zhang, Qiuchi Li, Dawei Song | To tackle this problem, we propose to build a Graph Convolutional Network (GCN) over the dependency tree of a sentence to exploit syntactical information and word dependencies. |
465 | Coupling Global and Local Context for Unsupervised Aspect Extraction | Ming Liao, Jing Li, Haisong Zhang, Lingzhi Wang, Xixin Wu, Kam-Fai Wong | We propose a novel neural model, capable of coupling global and local representation to discover aspect words. |
466 | Transferable End-to-End Aspect-based Sentiment Analysis with Selective Adversarial Learning | Zheng Li, Xin Li, Ying Wei, Lidong Bing, Yu Zhang, Qiang Yang | To resolve it, we propose a novel Selective Adversarial Learning (SAL) method to align the inferred correlation vectors that automatically capture their latent relations. |
467 | CAN: Constrained Attention Networks for Multi-Aspect Sentiment Analysis | Mengting Hu, Shiwan Zhao, Li Zhang, Keke Cai, Zhong Su, Renhong Cheng, Xiaowei Shen | In this paper, we propose constrained attention networks (CAN), a simple yet effective solution, to regularize the attention for multi-aspect sentiment analysis, which alleviates the drawback of the attention mechanism. |
468 | Leveraging Just a Few Keywords for Fine-Grained Aspect Detection Through Weakly Supervised Co-Training | Giannis Karamanolakis, Daniel Hsu, Luis Gravano | In this work, we consider weakly supervised approaches for training aspect classifiers that only require the user to provide a small set of seed words (i.e., weakly positive indicators) for the aspects of interest. |
469 | Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts | Julia Kruk, Jonah Lubin, Karan Sikka, Xiao Lin, Dan Jurafsky, Ajay Divakaran | Here we introduce a multimodal dataset of 1,299 Instagram posts labeled for three orthogonal taxonomies: the authorial intent behind the image-caption pair, the contextual relationship between the literal meanings of the image and caption, and the semiotic relationship between the signified meanings of the image and caption. |
470 | Neural Conversation Recommendation with Online Interaction Modeling | Xingshan Zeng, Jing Li, Lu Wang, Kam-Fai Wong | In this paper, we present a novel framework to automatically recommend conversations to users based on their prior conversation behaviors. |
471 | Different Absorption from the Same Sharing: Sifted Multi-task Learning for Fake News Detection | Lianwei Wu, Yuan Rao, Haolin Jin, Ambreen Nazir, Ling Sun | In this paper, we design a sifted multi-task learning method with a selected sharing layer for fake news detection. |
472 | Text-based inference of moral sentiment change | Jing Yi Xie, Renato Ferreira Pinto Junior, Graeme Hirst, Yang Xu | We present a text-based framework for investigating moral sentiment change of the public via longitudinal corpora. |
473 | Detecting Causal Language Use in Science Findings | Bei Yu, Yingya Li, Jun Wang | In this study, we first annotated a corpus of over 3,000 PubMed research conclusion sentences, then developed a BERT-based prediction model that classifies conclusion sentences into “no relationship”, “correlational”, “conditional causal”, and “direct causal” categories, achieving an accuracy of 0.90 and a macro-F1 of 0.88. We then applied the prediction model to measure the causal language use in the research conclusions of about 38,000 observational studies in PubMed. |
474 | Multilingual and Multi-Aspect Hate Speech Analysis | Nedjma Ousidhoum, Zizheng Lin, Hongming Zhang, Yangqiu Song, Dit-Yan Yeung | In this paper, we present a new multilingual multi-aspect hate speech analysis dataset and use it to test the current state-of-the-art multilingual multitask learning approaches. |
475 | MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims | Isabelle Augenstein, Christina Lioma, Dongsheng Wang, Lucas Chaves Lima, Casper Hansen, Christian Hansen, Jakob Grue Simonsen | We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification. |
476 | A Deep Neural Information Fusion Architecture for Textual Network Embeddings | Zenan Xu, Qinliang Su, Xiaojun Quan, Weijia Zhang | In this paper, a deep neural architecture is proposed to effectively fuse the two kinds of information into one representation. |
477 | You Shall Know a User by the Company It Keeps: Dynamic Representations for Social Media Users in NLP | Marco Del Tredici, Diego Marcheggiani, Sabine Schulte im Walde, Raquel Fernández | We present a model based on Graph Attention Networks that captures this observation. |
478 | Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis | Shrey Desai, Barea Sinno, Alex Rosenfeld, Junyi Jessy Li | To bridge this gap, we present adaptive ensembling, an unsupervised domain adaptation framework, equipped with a novel text classification model and time-aware training to ensure our methods work well with diachronic corpora. |
479 | Macrocosm: Social Media Persona Linking for Open Source Intelligence Applications | Graham Horwood, Ning Yu, Thomas Boggs, Changjiang Yang, Chad Holvenstot | This paper presents a multi-modal analysis of cross-contextual online social media (Macrocosm), a data-driven approach to detect similarities among user personas over six modalities: usernames, patterns-of-life, stylometry, semantic content, image content, and social network associations. |
480 | A Hierarchical Location Prediction Neural Network for Twitter User Geolocation | Binxuan Huang, Kathleen Carley | In this paper, we propose a hierarchical location prediction neural network for Twitter user geolocation. |
481 | Trouble on the Horizon: Forecasting the Derailment of Online Conversations as they Develop | Jonathan P. Chang, Cristian Danescu-Niculescu-Mizil | In this work we introduce a conversational forecasting model that learns an unsupervised representation of conversational dynamics and exploits it to predict future derailment as the conversation develops. |
482 | A Benchmark Dataset for Learning to Intervene in Online Hate Speech | Jing Qian, Anna Bethke, Yinyin Liu, Elizabeth Belding, William Yang Wang | In this paper, we propose a novel task of generative hate speech intervention, where the goal is to automatically generate responses to intervene during online conversations that contain hate speech. As a part of this work, we introduce two fully-labeled large-scale hate speech intervention datasets collected from Gab and Reddit. |
483 | Detecting and Reducing Bias in a High Stakes Domain | Ruiqi Zhong, Yanda Chen, Desmond Patton, Charlotte Selous, Kathy McKeown | To address the possibility of bias in this sensitive application, we developed an approach to systematically interpret the state of the art model. |
484 | CodeSwitch-Reddit: Exploration of Written Multilingual Discourse in Online Discussion Forums | Ella Rabinovich, Masih Sultani, Suzanne Stevenson | We introduce a novel, large, and diverse dataset of written code-switched productions, curated from topical threads of multiple bilingual communities on the Reddit discussion platform, and explore questions that were mainly addressed in the context of spoken language thus far. |
485 | Modeling Conversation Structure and Temporal Dynamics for Jointly Predicting Rumor Stance and Veracity | Penghui Wei, Nan Xu, Wenji Mao | In this paper, we propose a hierarchical multi-task learning framework for jointly predicting rumor stance and veracity on Twitter, which consists of two components. |
486 | Reconstructing Capsule Networks for Zero-shot Intent Classification | Han Liu, Xiaotong Zhang, Lu Fan, Xuandi Fu, Qimai Li, Xiao-Ming Wu, Albert Y.S. Lam | To overcome these limitations, we propose to reconstruct capsule networks for zero-shot intent classification. |
487 | Domain Adaptation for Person-Job Fit with Transferable Deep Global Match Network | Shuqing Bian, Wayne Xin Zhao, Yang Song, Tao Zhang, Ji-Rong Wen | We study the domain adaptation problem for person-job fit. |
488 | Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification | Hu Linmei, Tianchi Yang, Chuan Shi, Houye Ji, Xiaoli Li | In this paper, we propose a novel heterogeneous graph neural network based method for semi-supervised short text classification, leveraging full advantage of few labeled data and large unlabeled data through information propagation along the graph. |
489 | Comparing and Developing Tools to Measure the Readability of Domain-Specific Texts | Elissa Redmiles, Lisa Maszkiewicz, Emily Hwang, Dhruv Kuchhal, Everest Liu, Miraida Morales, Denis Peskov, Sudha Rao, Rock Stevens, Kristina Gligorić, Sean Kross, Michelle Mazurek, Hal Daumé III | In this work, we present a comparison of the validity of well-known readability measures and introduce a novel approach, Smart Cloze, which is designed to address shortcomings of existing measures. |
490 | News2vec: News Network Embedding with Subnode Information | Ye Ma, Lu Zong, Yikang Yang, Jionglong Su | With the aim of filling this gap, the News2vec model is proposed to allow the distributed representation of news taking into account its associated features. |
491 | Recursive Context-Aware Lexical Simplification | Sian Gooding, Ekaterina Kochmar | This paper presents a novel architecture for recursive context-aware lexical simplification, REC-LS, that is capable of (1) making use of the wider context when detecting the words in need of simplification and suggesting alternatives, and (2) taking previous simplification steps into account. |
492 | Leveraging Medical Literature for Section Prediction in Electronic Health Records | Sara Rosenthal, Ken Barker, Zhicheng Liang | We propose using sections from medical literature (e.g., textbooks, journals, web content) that contain content similar to that found in EHR sections. |
493 | Neural News Recommendation with Heterogeneous User Behavior | Chuhan Wu, Fangzhao Wu, Mingxiao An, Tao Qi, Jianqiang Huang, Yongfeng Huang, Xing Xie | In this paper, we propose a neural news recommendation approach which can exploit heterogeneous user behaviors. |
494 | Reviews Meet Graphs: Enhancing User and Item Representations for Recommendation with Hierarchical Attentive Graph Neural Network | Chuhan Wu, Fangzhao Wu, Tao Qi, Suyu Ge, Yongfeng Huang, Xing Xie | In this paper, we propose a neural recommendation approach which can utilize useful information from both review content and user-item graphs. |
495 | Event Representation Learning Enhanced with External Commonsense Knowledge | Xiao Ding, Kuo Liao, Ting Liu, Zhongyang Li, Junwen Duan | To address this issue, this paper proposes to leverage external commonsense knowledge about the intent and sentiment of the event. |
496 | Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification | Yichao Zhou, Jyun-Yu Jiang, Kai-Wei Chang, Wei Wang | In this paper, we propose a novel framework, learning to discriminate perturbations (DISP), to identify and adjust malicious perturbations, thereby blocking adversarial attacks for text classification models. |
497 | A Neural Citation Count Prediction Model based on Peer Review Text | Siqing Li, Wayne Xin Zhao, Eddy Jing Yin, Ji-Rong Wen | In this paper, we take the initiative to utilize peer review data for the CCP task with a neural prediction model. |
498 | Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs | Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou | We thus propose an edge-oriented graph neural model for document-level relation extraction. |
499 | Semi-supervised Text Style Transfer: Cross Projection in Latent Space | Mingyue Shang, Piji Li, Zhenxin Fu, Lidong Bing, Dongyan Zhao, Shuming Shi, Rui Yan | With these two types of training data, we introduce a projection function between the latent space of different styles and design two constraints to train it. |
500 | Question Answering for Privacy Policies: Combining Computational and Legal Perspectives | Abhilasha Ravichander, Alan W Black, Shomir Wilson, Thomas Norton, Norman Sadeh | We present PrivacyQA, a corpus consisting of 1750 questions about the privacy policies of mobile applications, and over 3500 expert annotations of relevant answers. |
501 | Stick to the Facts: Learning towards a Fidelity-oriented E-Commerce Product Description Generation | Zhangming Chan, Xiuying Chen, Yongliang Wang, Juntao Li, Zhiqiang Zhang, Kun Gai, Dongyan Zhao, Rui Yan | To bridge this gap, we propose a model named Fidelity-oriented Product Description Generator (FPDG). |
502 | Fine-Grained Entity Typing via Hierarchical Multi Graph Convolutional Networks | Hailong Jin, Lei Hou, Juanzi Li, Tiansi Dong | We convert this problem into the task of graph-based semi-supervised classification, and propose Hierarchical Multi Graph Convolutional Network (HMGCN), a novel Deep Learning architecture to tackle this problem. |
503 | Learning to Infer Entities, Properties and their Relations from Clinical Conversations | Nan Du, Mingqiu Wang, Linh Tran, Gang Lee, Izhak Shafran | We extend the SAT model to jointly infer not only entities and their properties but also relations between them. |
504 | Practical Correlated Topic Modeling and Analysis via the Rectified Anchor Word Algorithm | Moontae Lee, Sungjun Cho, David Bindel, David Mimno | This paper aims to solidify the foundations of spectral topic inference and provide a practical implementation for anchor-based topic modeling. |
505 | Modeling the Relationship between User Comments and Edits in Document Revision | Xuchao Zhang, Dheeraj Rajagopal, Michael Gamon, Sujay Kumar Jauhar, ChangTien Lu | Thus, in this paper we explore the relationship between comments and edits by defining two novel, related tasks: Comment Ranking and Edit Anchoring. |
506 | PRADO: Projection Attention Networks for Document Classification On-Device | Karthik Krishnamoorthi, Sujith Ravi, Zornitsa Kozareva | We propose a novel projection attention neural network PRADO that combines trainable projections with attention and convolutions. |
507 | Subword Language Model for Query Auto-Completion | Gyuwan Kim | We present how to utilize subword language models for the fast and accurate generation of query completion candidates. |
508 | Enhancing Dialogue Symptom Diagnosis with Global Attention and Symptom Graph | Xinzhu Lin, Xiahui He, Qin Chen, Huaixiao Tou, Zhongyu Wei, Ting Chen | In order to further enhance the performance of symptom diagnosis over dialogues, we propose a global attention mechanism to capture more symptom related information, and build a symptom graph to model the associations between symptoms rather than treating each symptom independently. |
509 | Counterfactual Story Reasoning and Generation | Lianhui Qin, Antoine Bosselut, Ari Holtzman, Chandra Bhagavatula, Elizabeth Clark, Yejin Choi | In this paper, we propose Counterfactual Story Rewriting: given an original story and an intervening counterfactual event, the task is to minimally revise the story to make it compatible with the given counterfactual event. |
510 | Encode, Tag, Realize: High-Precision Text Editing | Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn | To predict the edit operations, we propose a novel model, which combines a BERT encoder with an autoregressive Transformer decoder. |
511 | Answer-guided and Semantic Coherent Question Generation in Open-domain Conversation | Weichao Wang, Shi Feng, Daling Wang, Yifei Zhang | Thus, we devise two methods to further enhance the semantic coherence between post and question under the guidance of the answer. |
512 | Read, Attend and Comment: A Deep Architecture for Automatic News Comment Generation | Ze Yang, Can Xu, Wei Wu, Zhoujun Li | In this paper, we propose a “read-attend-comment” procedure for news comment generation and formalize the procedure with a reading network and a generation network. |
513 | A Topic Augmented Text Generation Model: Joint Learning of Semantics and Structural Features | Hongyin Tang, Miao Li, Beihong Jin | In this paper, we propose a text generation model that learns semantics and structural features simultaneously. |
514 | LXMERT: Learning Cross-Modality Encoder Representations from Transformers | Hao Tan, Mohit Bansal | We thus propose the LXMERT (Learning Cross-Modality Encoder Representations from Transformers) framework to learn these vision-and-language connections. |
515 | Phrase Grounding by Soft-Label Chain Conditional Random Field | Jiacheng Liu, Julia Hockenmaier | In this paper, we formulate phrase grounding as a sequence labeling task where we treat candidate regions as potential labels, and use neural chain Conditional Random Fields (CRFs) to model dependencies among regions for adjacent mentions. |
516 | What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues | Xintong Yu, Hongming Zhang, Yangqiu Song, Yan Song, Changshui Zhang | To tackle this challenge, in this paper, we formally define the task of visual-aware pronoun coreference resolution (PCR) and introduce VisPro, a large-scale dialogue PCR dataset, to investigate whether and how the visual information can help resolve pronouns in dialogues. |
517 | YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension | Weiying Wang, Yongcheng Wang, Shizhe Chen, Qin Jin | In this work, we introduce “YouMakeup”, a large-scale multimodal instructional video dataset to support fine-grained semantic comprehension research in a specific domain. |
518 | DEBUG: A Dense Bottom-Up Grounding Approach for Natural Language Video Localization | Chujie Lu, Long Chen, Chilie Tan, Xiaolin Li, Jun Xiao | In this paper, we focus on natural language video localization: localizing (i.e., grounding) a natural language description in a long and untrimmed video sequence. |
519 | CrossWeigh: Training Named Entity Tagger from Imperfect Annotations | Zihan Wang, Jingbo Shang, Liyuan Liu, Lihao Lu, Jiacheng Liu, Jiawei Han | In this study, we dive deep into one of the widely-adopted NER benchmark datasets, CoNLL03 NER. |
520 | A Little Annotation does a Lot of Good: A Study in Bootstrapping Low-resource Named Entity Recognizers | Aditi Chaudhary, Jiateng Xie, Zaid Sheikh, Graham Neubig, Jaime Carbonell | In this paper, we ask the question: given this recent progress, and some amount of human annotation, what is the most effective method for efficiently creating high-quality entity recognizers in under-resourced languages? |
521 | Open Domain Web Keyphrase Extraction Beyond Language Modeling | Lee Xiong, Chuan Hu, Chenyan Xiong, Daniel Campos, Arnold Overwijk | To handle the variations of domain and content quality, we develop BLING-KPE, a neural keyphrase extraction model that goes beyond language understanding using visual presentations of documents and weak supervision from search queries. |
522 | TuckER: Tensor Factorization for Knowledge Graph Completion | Ivana Balazevic, Carl Allen, Timothy Hospedales | We propose TuckER, a relatively straightforward but powerful linear model based on Tucker decomposition of the binary tensor representation of knowledge graph triples. |
523 | Human-grounded Evaluations of Explanation Methods for Text Classification | Piyawat Lertvittayakumjorn, Francesca Toni | In this paper, we consider several model-agnostic and model-specific explanation methods for CNNs for text classification and conduct three human-grounded evaluations, focusing on different purposes of explanations: (1) revealing model behavior, (2) justifying model predictions, and (3) helping humans investigate uncertain predictions. |
524 | A Context-based Framework for Modeling the Role and Function of On-line Resource Citations in Scientific Literature | He Zhao, Zhunchen Luo, Chong Feng, Anqing Zheng, Xiaopeng Liu | In this paper, we propose a possible solution by using a multi-task framework to build the scientific resource classifier (SciResCLF) for jointly recognizing the role and function types. |
525 | Adversarial Reprogramming of Text Classification Neural Networks | Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koushanfar | In this work, we develop methods to repurpose text classification neural networks for alternate tasks without modifying the network architecture or parameters. |
526 | Document Hashing with Mixture-Prior Generative Models | Wei Dong, Qinliang Su, Dinghan Shen, Changyou Chen | In this paper, two mixture-prior generative models are proposed, with the objective of producing high-quality hashing codes for documents. |
527 | On Efficient Retrieval of Top Similarity Vectors | Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, Ping Li | In this paper, we demonstrate an efficient method for searching vectors via a typical non-metric matching function: inner product. |
528 | Multiplex Word Embeddings for Selectional Preference Acquisition | Hongming Zhang, Jiaxin Bai, Yan Song, Kun Xu, Changlong Yu, Yangqiu Song, Wilfred Ng, Dong Yu | Therefore, in this paper, we propose a multiplex word embedding model, which can be easily extended according to various relations among words. |
529 | MulCode: A Multiplicative Multi-way Model for Compressing Neural Language Model | Yukun Ma, Patrick H. Chen, Cho-Jui Hsieh | To compress these embedding layers, we propose MulCode, a novel multi-way multiplicative neural compressor. |
530 | It’s All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution | Rowan Hall Maudslay, Hila Gonen, Ryan Cotterell, Simone Teufel | We propose two improvements to CDA: Counterfactual Data Substitution (CDS), a variant of CDA in which potentially biased text is randomly substituted to avoid duplication, and the Names Intervention, a novel name-pairing technique that vastly increases the number of words being treated. |
531 | Examining Gender Bias in Languages with Grammatical Gender | Pei Zhou, Weijia Shi, Jieyu Zhao, Kuan-Hao Huang, Muhao Chen, Ryan Cotterell, Kai-Wei Chang | In this paper, we propose new metrics for evaluating gender bias in word embeddings of these languages and further demonstrate evidence of gender bias in bilingual embeddings which align these languages with English. |
532 | Weakly Supervised Cross-lingual Semantic Relation Classification via Knowledge Distillation | Yogarshi Vyas, Marine Carpuat | We introduce a cross-lingual relation classifier trained only with English examples and a bilingual dictionary. |
533 | Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations | Christian Hadiwinoto, Hwee Tou Ng, Wee Chung Gan | In this paper, we explore different strategies of integrating pre-trained contextualized word representations and our best strategy achieves accuracies exceeding the best prior published accuracies by significant margins on multiple benchmark WSD datasets. |
534 | Do NLP Models Know Numbers? Probing Numeracy in Embeddings | Eric Wallace, Yizhong Wang, Sujian Li, Sameer Singh, Matt Gardner | We begin by investigating the numerical reasoning capabilities of a state-of-the-art question answering model on the DROP dataset. We find this model excels on questions that require numerical reasoning, i.e., it already captures numeracy. |
535 | A Split-and-Recombine Approach for Follow-up Query Analysis | Qian Liu, Bei Chen, Haoyan Liu, Jian-Guang LOU, Lei Fang, Bin Zhou, Dongmei Zhang | To leverage the advances in context-independent semantic parsing, we propose to perform follow-up query analysis, aiming to restate context-dependent natural language queries with contextual information. |
536 | Text2Math: End-to-end Parsing Text into Math Expressions | Yanyan Zou, Wei Lu | We propose Text2Math, a model for semantically parsing text into math expressions. |
537 | Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions | Rui Zhang, Tao Yu, Heyang Er, Sungrok Shim, Eric Xue, Xi Victoria Lin, Tianze Shi, Caiming Xiong, Richard Socher, Dragomir Radev | Based on the observation that adjacent natural language questions are often linguistically dependent and their corresponding SQL queries tend to overlap, we utilize the interaction history by editing the previous predicted query to improve the generation quality. |
538 | Syntax-aware Multilingual Semantic Role Labeling | Shexia He, Zuchao Li, Hai Zhao | Unlike existing work, we propose a novel method guided by syntactic rules to prune arguments, which enables us to integrate syntax into a multilingual SRL model simply and effectively. |
539 | Cloze-driven Pretraining of Self-attention Networks | Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli | We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems. |
540 | Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling | Jinfeng Rao, Linqing Liu, Yi Tay, Wei Yang, Peng Shi, Jimmy Lin | To bridge this gap, we propose a novel model, HCAN (Hybrid Co-Attention Network), that comprises (1) a hybrid encoder module that includes ConvNet-based and LSTM-based encoders, (2) a relevance matching module that measures soft term matches with importance weighting at multiple granularities, and (3) a semantic matching module with co-attention mechanisms that capture context-aware semantic relatedness. |
541 | A Syntax-aware Multi-task Learning Framework for Chinese Semantic Role Labeling | Qingrong Xia, Zhenghua Li, Min Zhang | In this paper, we adopt a simple unified span-based model for both span-based and word-based Chinese SRL as a strong baseline. |
542 | Transfer Fine-Tuning: A BERT Case Study | Yuki Arase, Jun’ichi Tsujii | Herein, we propose to inject phrasal paraphrase relations into BERT in order to generate suitable representations for semantic equivalence assessment instead of increasing the model size. |
543 | Data-Anonymous Encoding for Text-to-SQL Generation | Zhen Dong, Shizhao Sun, Hongzhi Liu, Jian-Guang Lou, Dongmei Zhang | In this work, we propose a more efficient approach to handle table-related tokens before the semantic parser. |
544 | Capturing Argument Interaction in Semantic Role Labeling with Capsule Networks | Xinchi Chen, Chunchuan Lyu, Ivan Titov | We propose a new approach to modeling these interactions while maintaining efficient inference. |
545 | Learning Programmatic Idioms for Scalable Semantic Parsing | Srinivasan Iyer, Alvin Cheung, Luke Zettlemoyer | In this paper, we introduce an iterative method to extract code idioms from large source code corpora by repeatedly collapsing most-frequent depth-2 subtrees of their syntax trees, and train semantic parsers to apply these idioms during decoding. (A toy sketch of the collapsing step follows this table.) |
546 | JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generation | Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer | To study code generation conditioned on a long context history, we present JuICe, a corpus of 1.5 million examples with a curated test set of 3.7K instances based on online programming assignments. |
547 | Model-based Interactive Semantic Parsing: A Unified Framework and A Text-to-SQL Case Study | Ziyu Yao, Yu Su, Huan Sun, Wen-tau Yih | In this paper, we propose a new, unified formulation of the interactive semantic parsing problem, where the goal is to design a model-based intelligent agent. |
548 | Modeling Graph Structure in Transformer for Better AMR-to-Text Generation | Jie Zhu, Junhui Li, Muhua Zhu, Longhua Qian, Min Zhang, Guodong Zhou | In this paper we eliminate such a strong limitation and propose a novel structure-aware self-attention approach to better model the relations between indirectly connected concepts in the state-of-the-art seq2seq model, i.e. the Transformer. |
549 | Syntax-Aware Aspect Level Sentiment Classification with Graph Attention Networks | Binxuan Huang, Kathleen Carley | In this paper, we propose a novel target-dependent graph attention network (TD-GAT) for aspect level sentiment classification, which explicitly utilizes the dependency relationship among words. |
550 | Learning Explicit and Implicit Structures for Targeted Sentiment Analysis | Hao Li, Wei Lu | In this work, we argue that both types of information (implicit and explicit structural information) are crucial for building a successful targeted sentiment analysis model. |
551 | Capsule Network with Interactive Attention for Aspect-Level Sentiment Classification | Chunning Du, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao, Tong Xu, Ming Liu | To solve this problem, we propose to utilize capsule network to construct vector-based feature representation and cluster features by an EM routing algorithm. |
552 | Emotion Detection with Neural Personal Discrimination | Xiabing Zhou, Zhongqing Wang, Shoushan Li, Guodong Zhou, Min Zhang | Accordingly, we propose a Neural Personal Discrimination (NPD) approach to address the above challenges by determining personal attributes from posts, and connecting relevant posts with similar attributes to jointly learn their emotions. |
553 | Specificity-Driven Cascading Approach for Unsupervised Sentiment Modification | Pengcheng Yang, Junyang Lin, Jingjing Xu, Jun Xie, Qi Su, Xu Sun | To remedy this, we propose a specificity-driven cascading approach in this work, which can effectively increase the specificity of the generated text and further improve content preservation. |
554 | LexicalAT: Lexical-Based Adversarial Reinforcement Training for Robust Sentiment Classification | Jingjing Xu, Liang Zhao, Hanqi Yan, Qi Zeng, Yun Liang, Xu Sun | In this work, we propose a novel adversarial training approach, LexicalAT, to improve the robustness of current classification models. |
555 | Leveraging Structural and Semantic Correspondence for Attribute-Oriented Aspect Sentiment Discovery | Zhe Zhang, Munindar Singh | We propose Trait, an unsupervised probabilistic model that discovers aspects and sentiments from text and associates them with different attributes. |
556 | From the Token to the Review: A Hierarchical Multimodal approach to Opinion Mining | Alexandre Garcia, Pierre Colombo, Florence d’Alché-Buc, Slim Essid, Chloé Clavel | In this work we aim to bridge the gap between fine-grained opinion models already developed for written language and coarse-grained models developed for spontaneous multimodal opinion mining. |
557 | Shallow Domain Adaptive Embeddings for Sentiment Analysis | Prathusha K Sarma, Yingyu Liang, William Sethares | This paper proposes a way to improve the performance of existing algorithms for text classification in domains with strong language semantics. |
558 | Domain-Invariant Feature Distillation for Cross-Domain Sentiment Classification | Mengting Hu, Yike Wu, Shiwan Zhao, Honglei Guo, Renhong Cheng, Zhong Su | In this paper, we focus on aspect-level cross-domain sentiment classification, and propose to distill the domain-invariant sentiment features with the help of an orthogonal domain-dependent task, i.e. aspect detection, which is built on the aspects varying widely in different domains. |
559 | A Novel Aspect-Guided Deep Transition Model for Aspect Based Sentiment Analysis | Yunlong Liang, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie Zhou | In this paper, we propose a novel Aspect-Guided Deep Transition model, named AGDT, which utilizes the given aspect to guide the sentence encoding from scratch with the specially-designed deep transition architecture. |
560 | Human-Like Decision Making: Document-level Aspect Sentiment Classification via Hierarchical Reinforcement Learning | Jingjing Wang, Changlong Sun, Shoushan Li, Jiancheng Wang, Luo Si, Min Zhang, Xiaozhong Liu, Guodong Zhou | In this paper, to simulate the steps by which humans analyze aspect sentiment in a document, we propose a new Hierarchical Reinforcement Learning (HRL) approach to DASC. |
561 | A Dataset of General-Purpose Rebuttal | Matan Orbach, Yonatan Bilu, Ariel Gera, Yoav Kantor, Lena Dankin, Tamar Lavee, Lili Kotlerman, Shachar Mirkin, Michal Jacovi, Ranit Aharonov, Noam Slonim | Here we present a novel task of producing a critical response to a long argumentative text, and suggest a method based on general rebuttal arguments to address it. |
562 | Rethinking Attribute Representation and Injection for Sentiment Classification | Reinald Kim Amplayo | The de facto standard method is to incorporate them as additional biases in the attention mechanism, and more performance gains are achieved by extending the model architecture. In this paper, we show that the above method is the least effective way to represent and inject attributes. |
563 | A Knowledge Regularized Hierarchical Approach for Emotion Cause Analysis | Chuang Fan, Hongyu Yan, Jiachen Du, Lin Gui, Lidong Bing, Min Yang, Ruifeng Xu, Ruibin Mao | In this paper, we propose a new method to extract emotion cause with a hierarchical neural model and knowledge-based regularizations, which aims to incorporate discourse context information and restrain the parameters by sentiment lexicon and common knowledge. |
564 | Automatic Argument Quality Assessment – New Datasets and Methods | Assaf Toledo, Shai Gretz, Edo Cohen-Karlik, Roni Friedman, Elad Venezian, Dan Lahav, Michal Jacovi, Ranit Aharonov, Noam Slonim | We explore the task of automatic assessment of argument quality. |
565 | Fine-Grained Analysis of Propaganda in News Article | Giovanni Da San Martino, Seunghak Yu, Alberto Barrón-Cedeño, Rostislav Petrov, Preslav Nakov | To overcome these limitations, we propose a novel task: performing fine-grained analysis of texts by detecting all fragments that contain propaganda techniques as well as their type. |
566 | Context-aware Interactive Attention for Multi-modal Sentiment and Emotion Analysis | Dushyant Singh Chauhan, Md Shad Akhtar, Asif Ekbal, Pushpak Bhattacharyya | In this paper, we introduce a recurrent neural network based approach for multi-modal sentiment and emotion analysis. The proposed model learns the inter-modal interaction among the participating modalities through an auto-encoder mechanism. |
567 | Sequential Learning of Convolutional Features for Effective Text Classification | Avinash Madasu, Vijjini Anvesh Rao | In this paper, we present an experimental study on the fundamental blocks of CNNs in text categorization. |
568 | The Role of Pragmatic and Discourse Context in Determining Argument Impact | Esin Durmus, Faisal Ladhak, Claire Cardie | This paper presents a new dataset to initiate the study of this aspect of argumentation: it consists of a diverse collection of arguments covering 741 controversial topics and comprising over 47,000 claims. |
569 | Aspect-Level Sentiment Analysis Via Convolution over Dependency Tree | Kai Sun, Richong Zhang, Samuel Mensah, Yongyi Mao, Xudong Liu | We propose a method based on neural networks to identify the sentiment polarity of opinion words expressed on a specific aspect of a sentence. |
570 | Understanding Data Augmentation in Neural Machine Translation: Two Perspectives towards Generalization | Guanlin Li, Lemao Liu, Guoping Huang, Conghui Zhu, Tiejun Zhao | Based on the observation, this paper makes an initial attempt to answer a fundamental question: what benefits, which are consistent across different methods and tasks, does DA in general obtain? |
571 | Simple and Effective Noisy Channel Modeling for Neural Machine Translation | Kyra Yee, Yann Dauphin, Michael Auli | We pursue an alternative approach based on standard sequence to sequence models which utilize the entire source. |
572 | MultiFiT: Efficient Multi-lingual Language Model Fine-tuning | Julian Eisenschlos, Sebastian Ruder, Piotr Czapla, Marcin Kardas, Sylvain Gugger, Jeremy Howard | We propose Multi-lingual language model Fine-Tuning (MultiFiT) to enable practitioners to train and fine-tune language models efficiently in their own language. |
573 | Hint-Based Training for Non-Autoregressive Machine Translation | Zhuohan Li, Zi Lin, Di He, Fei Tian, Tao Qin, Liwei Wang, Tie-Yan Liu | In this paper, we propose a novel approach to leveraging the hints from hidden states and word alignments to help the training of NART models. |
574 | Working Hard or Hardly Working: Challenges of Integrating Typology into Neural Dependency Parsers | Adam Fisch, Jiang Guo, Regina Barzilay | This paper explores the task of leveraging typology in the context of cross-lingual dependency parsing. |
575 | Cross-Lingual BERT Transformation for Zero-Shot Dependency Parsing | Yuxuan Wang, Wanxiang Che, Jiang Guo, Yijia Liu, Ting Liu | We propose Cross-Lingual BERT Transformation (CLBT), a simple and efficient approach to generate cross-lingual contextualized word embeddings based on publicly available pre-trained BERT models (Devlin et al., 2018). |
576 | Multilingual Grammar Induction with Continuous Language Identification | Wenjuan Han, Ge Wang, Yong Jiang, Kewei Tu | In this work, we propose a novel universal grammar induction approach that represents language identities with continuous vectors and employs a neural network to predict grammar parameters based on the representation. |
577 | Quantifying the Semantic Core of Gender Systems | Adina Williams, Damian Blasi, Lawrence Wolf-Sonkin, Hanna Wallach, Ryan Cotterell | In this work, we present the first large-scale investigation of the arbitrariness of gender assignment that uses canonical correlation analysis as a method for correlating the gender of inanimate nouns with their lexical semantic meaning. |
578 | Perturbation Sensitivity Analysis to Detect Unintended Model Biases | Vinodkumar Prabhakaran, Ben Hutchinson, Margaret Mitchell | Based on this idea, we propose a generic evaluation framework, Perturbation Sensitivity Analysis, which detects unintended model biases related to named entities, and requires no new annotations or corpora. |
579 | Automatically Inferring Gender Associations from Language | Serina Chang, Kathy McKeown | In this paper, we pose the question: do people talk about women and men in different ways? We introduce two datasets and a novel integration of approaches for automatically inferring gender associations from language, discovering coherent word clusters, and labeling the clusters for the semantic concepts they represent. |
580 | Reporting the Unreported: Event Extraction for Analyzing the Local Representation of Hate Crimes | Aida Mostafazadeh Davani, Leigh Yeh, Mohammad Atari, Brendan Kennedy, Gwenyth Portillo Wightman, Elaine Gonzalez, Natalie Delong, Rhea Bhatia, Arineh Mirinjian, Xiang Ren, Morteza Dehghani | Here, we first demonstrate that event extraction and multi-instance learning, applied to a corpus of local news articles, can be used to predict instances of hate crime. We then use the trained model to detect incidents of hate in cities for which the FBI lacks statistics. |
581 | Minimally Supervised Learning of Affective Events Using Discourse Relations | Jun Saito, Yugo Murawaki, Sadao Kurohashi | In this paper, we propose to propagate affective polarity using discourse relations. |
582 | Event Detection with Multi-Order Graph Convolution and Aggregated Attention | Haoran Yan, Xiaolong Jin, Xiangbin Meng, Jiafeng Guo, Xueqi Cheng | For this reason, this paper proposes a new method for event detection, which uses a dependency tree based graph convolution network with aggregative attention to explicitly model and aggregate multi-order syntactic representations in sentences. |
583 | Coverage of Information Extraction from Sentences and Paragraphs | Simon Razniewski, Nitisha Jain, Paramita Mirza, Gerhard Weikum | In this paper we discuss the importance of scalar implicatures in the context of textual information extraction. |
584 | HMEAE: Hierarchical Modular Event Argument Extraction | Xiaozhi Wang, Ziqi Wang, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Maosong Sun, Jie Zhou, Xiang Ren | In this paper, we propose a Hierarchical Modular Event Argument Extraction (HMEAE) model, to provide effective inductive bias from the concept hierarchy of event argument roles. |
585 | Entity, Relation, and Event Extraction with Contextualized Span Representations | David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi | We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction. |
586 | Next Sentence Prediction helps Implicit Discourse Relation Classification within and across Domains | Wei Shi, Vera Demberg | We here show that this shortcoming can be effectively addressed by using the bidirectional encoder representation from transformers (BERT) proposed by Devlin et al. (2019), which was trained on a next-sentence prediction task and thus encodes a representation of likely next sentences. |
587 | Split or Merge: Which is Better for Unsupervised RST Parsing? | Naoki Kobayashi, Tsutomu Hirao, Kengo Nakamura, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata | In this paper, we present two language-independent unsupervised RST parsing methods based on dynamic programming. |
588 | BERT for Coreference Resolution: Baselines and Analysis | Mandar Joshi, Omer Levy, Luke Zettlemoyer, Daniel Weld | We apply BERT to coreference resolution, achieving a new state of the art on the GAP (+11.5 F1) and OntoNotes (+3.9 F1) benchmarks. |
589 | Linguistic Versus Latent Relations for Modeling Coherent Flow in Paragraphs | Dongyeop Kang, Eduard Hovy | In order to produce a coherent flow of text, we explore two forms of intersentential relations in a paragraph: one is a human-created linguistic relation that forms a structure (e.g., discourse tree) and the other is a relation from latent representations learned from the sentences themselves. |
590 | Event Causality Recognition Exploiting Multiple Annotators’ Judgments and Background Knowledge | Kazuma Kadowaki, Ryu Iida, Kentaro Torisawa, Jong-Hoon Oh, Julien Kloetzer | We propose new BERT-based methods for recognizing event causality such as “smoke cigarettes” → “die of lung cancer” written in web texts. |
591 | What Part of the Neural Network Does This? Understanding LSTMs by Measuring and Dissecting Neurons | Ji Xin, Jimmy Lin, Yaoliang Yu | We find inspiration from biologists and study the affinity between individual neurons and labels, propose a novel metric to quantify the sensitivity of neurons to each label, and conduct experiments to show the validity of our proposed metric. |
592 | Quantity doesn’t buy quality syntax with neural language models | Marten van Schijndel, Aaron Mueller, Tal Linzen | We investigate to what extent these shortcomings can be mitigated by increasing the size of the network and the corpus on which it is trained. |
593 | Higher-order Comparisons of Sentence Encoder Representations | Mostafa Abdou, Artur Kulmizev, Felix Hill, Daniel M. Low, Anders Søgaard | We demonstrate the utility of RSA by establishing a previously unknown correspondence between widely-employed pretrained language encoders and human processing difficulty via eye-tracking data, showcasing its potential in the interpretability toolbox for neural models. |
594 | Text Genre and Training Data Size in Human-like Parsing | John Hale, Adhiguna Kuncoro, Keith Hall, Chris Dyer, Jonathan Brennan | Domain-specific training typically makes NLP systems work better. We show that this extends to cognitive modeling as well by relating the states of a neural phrase-structure parser to electrophysiological measures from human participants. |
595 | Feature2Vec: Distributional semantic modelling of human property knowledge | Steven Derby, Paul Miller, Barry Devereux | We propose a method for mapping human property knowledge onto a distributional semantic space, which adapts the word2vec architecture to the task of modelling concept features. |
596 | Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation | Arijit Ray, Karan Sikka, Ajay Divakaran, Stefan Lee, Giedrius Burachas | In this work, we introduce a dataset, ConVQA, and metrics that enable quantitative evaluation of consistency in VQA. Further, we propose a consistency-improving data augmentation module, a Consistency Teacher Module (CTM). |
597 | GeoSQA: A Benchmark for Scenario-based Question Answering in the Geography Domain at High School Level | Zixian Huang, Yulin Shen, Xiao Li, Yu’ang Wei, Gong Cheng, Lin Zhou, Xinyu Dai, Yuzhong Qu | In this paper, we introduce the GeoSQA dataset. |
598 | Revisiting the Evaluation of Theory of Mind through Question Answering | Matthew Le, Y-Lan Boureau, Maximilian Nickel | In this work, we revisit the evaluation of theory of mind through question answering. |
599 | Multi-passage BERT: A Globally Normalized BERT Model for Open-domain Question Answering | Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang | To tackle this issue, we propose a multi-passage BERT model to globally normalize answer scores across all passages of the same question; this change enables our QA model to find better answers by utilizing more passages. (A minimal sketch of the global normalization follows this table.) |
600 | A Span-Extraction Dataset for Chinese Machine Reading Comprehension | Yiming Cui, Ting Liu, Wanxiang Che, Li Xiao, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu | In this paper, we introduce a Span-Extraction dataset for Chinese machine reading comprehension to add language diversities in this area. |
601 | MICRON: Multigranular Interaction for Contextualizing RepresentatiON in Non-factoid Question Answering | Hojae Han, Seungtaek Choi, Haeju Park, Seung-won Hwang | Specifically, we propose MICRON: Multigranular Interaction for Contextualizing RepresentatiON, a novel approach which derives contextualized uni-gram representation from n-grams. |
602 | Machine Reading Comprehension Using Structural Knowledge Graph-aware Network | Delai Qiu, Yuanzhe Zhang, Xinwei Feng, Xiangwen Liao, Wenbin Jiang, Yajuan Lyu, Kang Liu, Jun Zhao | To this end, we propose a Structural Knowledge Graph-aware Network (SKG) model, constructing sub-graphs for entities in the machine comprehension context. |
603 | Answering Conversational Questions on Structured Data without Logical Forms | Thomas Mueller, Francesco Piccinno, Peter Shaw, Massimo Nicosia, Yasemin Altun | We present a novel approach to answering sequential questions based on structured objects such as knowledge bases or tables without using a logical form as an intermediate representation. |
604 | Improving Answer Selection and Answer Triggering using Hard Negatives | Sawan Kumar, Shweta Garg, Kartik Mehta, Nikhil Rasiwasia | In this paper, we establish the effectiveness of using hard negatives, coupled with a siamese network and a suitable loss function, for the tasks of answer selection and answer triggering. |
605 | Can You Unpack That? Learning to Rewrite Questions-in-Context | Ahmed Elgohary, Denis Peskov, Jordan Boyd-Graber | We introduce the task of question-in-context rewriting: given the context of a conversation’s history, rewrite a context-dependent question into a self-contained question with the same answer. |
606 | Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning | Pradeep Dasigi, Nelson F. Liu, Ana Marasovic, Noah A. Smith, Matt Gardner | We present a new crowdsourced dataset containing more than 24K span-selection questions that require resolving coreference among entities in over 4.7K English paragraphs from Wikipedia. |
607 | Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model | Tsung-Yuan Hsu, Chi-Liang Liu, Hung-yi Lee | In this paper, we systematically explore zero-shot cross-lingual transfer learning on reading comprehension tasks with language representation model pre-trained on multi-lingual corpus. |
608 | QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions | Oyvind Tafjord, Matt Gardner, Kevin Lin, Peter Clark | We introduce the first open-domain dataset, called QuaRTz, for reasoning about textual qualitative relationships. |
609 | Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension | Daniel Andor, Luheng He, Kenton Lee, Emily Pitler | We enable a BERT-based reading comprehension model to perform lightweight numerical reasoning. |
610 | A Gated Self-attention Memory Network for Answer Selection | Tuan Lai, Quan Hung Tran, Trung Bui, Daisuke Kihara | In this work, we take a departure from the popular Compare-Aggregate architecture, and instead, propose a new gated self-attention memory network for the task. |
611 | Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets | Hong-Ren Mao, Hung-Yi Lee | In this paper, we analyze datasets commonly used for paraphrase generation research, and show that simply parroting input sentences surpasses state-of-the-art models in the literature when evaluated on standard metrics. |
612 | Query-focused Sentence Compression in Linear Time | Abram Handler, Brendan O’Connor | This work introduces a new transition-based sentence compression technique developed for such settings. |
613 | Generating Personalized Recipes from Historical User Preferences | Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley | We propose a new task of personalized recipe generation to help these users: expanding a name and incomplete ingredient details into complete natural-text instructions aligned with the user’s historical preferences. |
614 | Generating Highly Relevant Questions | Jiazuo Qiu, Deyi Xiong | The neural seq2seq based question generation (QG) is prone to generating generic and undiversified questions that are poorly relevant to the given passage and target answer. In this paper, we propose two methods to address the issue. |
615 | Improving Neural Story Generation by Targeted Common Sense Grounding | Huanru Henry Mao, Bodhisattwa Prasad Majumder, Julian McAuley, Garrison Cottrell | We propose a simple multi-task learning scheme to achieve quantitatively better common sense reasoning in language models by leveraging auxiliary training signals from datasets designed to provide common sense grounding. |
616 | Abstract Text Summarization: A Low Resource Challenge | Shantipriya Parida, Petr Motlicek | We propose an iterative data augmentation approach which uses synthetic data along with the real summarization data for the German language. |
617 | Generating Modern Poetry Automatically in Finnish | Mika Hämäläinen, Khalid Alnajjar | We present a novel approach for generating poetry automatically for the morphologically rich Finnish language by using a genetic algorithm. |
618 | SUM-QE: a BERT-based Summary Quality Estimation Model | Stratos Xenouleas, Prodromos Malakasiotis, Marianna Apidianaki, Ion Androutsopoulos | We propose SUM-QE, a novel Quality Estimation model for summarization based on BERT. |
619 | An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation | Wanyu Du, Yangfeng Ji | In this work, we present an empirical study on how RL and IL can help boost the performance of generating paraphrases, with the pointer-generator as a base model. |
620 | Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses | Matt Grenander, Yue Dong, Jackie Chi Kit Cheung, Annie Louis | We propose two techniques to make systems sensitive to the importance of content in different parts of the article. |
621 | Learning Rhyming Constraints using Structured Adversaries | Harsh Jhamtani, Sanket Vaibhav Mehta, Jaime Carbonell, Taylor Berg-Kirkpatrick | We propose an alternate approach that uses a structured discriminator to learn a poetry generator that directly captures rhyming constraints in a generative adversarial setup. |
622 | Question-type Driven Question Generation | Wenjie Zhou, Minghua Zhang, Yunfang Wu | We propose to automatically predict the question type based on the input answer and context. |
623 | Deep Reinforcement Learning with Distributional Semantic Rewards for Abstractive Summarization | Siyao Li, Deren Lei, Pengda Qin, William Yang Wang | In this paper, instead of Rouge-L, we explore the practicability of utilizing the distributional semantics to measure the matching degrees. |
624 | Clause-Wise and Recursive Decoding for Complex and Cross-Domain Text-to-SQL Generation | Dongjun Lee | In this paper, we propose a SQL clause-wise decoding neural architecture with a self-attention based database schema encoder to address the Spider task. |
625 | Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects | James Mullenbach, Jonathan Gordon, Nanyun Peng, Jonathan May | To attack this challenge, we crowdsource a set of human judgments that answer the English-language question “Given a whole described by an adjective, does the adjective also describe a given part?” |
626 | Aggregating Bidirectional Encoder Representations Using MatchLSTM for Sequence Matching | Bo Shao, Yeyun Gong, Weizhen Qi, Nan Duan, Xiaola Lin | In this work, we propose an aggregation method to combine the Bidirectional Encoder Representations from Transformer (BERT) with a MatchLSTM layer for Sequence Matching. |
627 | What Does This Word Mean? Explaining Contextualized Embeddings with Natural Language Definition | Ting-Yun Chang, Yun-Nung Chen | To further investigate what contextualized word embeddings capture, this paper analyzes whether they can indicate the corresponding sense definitions and proposes a general framework that is capable of explaining word meanings given contextualized word embeddings for better interpretation. |
628 | Pre-Training BERT on Domain Resources for Short Answer Grading | Chul Sung, Tejas Dhamecha, Swarnadeep Saha, Tengfei Ma, Vinay Reddy, Rishi Arora | In this paper, we explore ways of improving the pre-trained contextual representations for the task of automatic short answer grading, a critical component of intelligent tutoring systems. |
629 | WIQA: A dataset for “What if…” reasoning over procedural text | Niket Tandon, Bhavana Dalvi, Keisuke Sakaguchi, Peter Clark, Antoine Bosselut | We introduce WIQA, the first large-scale dataset of “What if…” questions over procedural text. |
630 | Evaluating BERT for natural language inference: A case study on the CommitmentBank | Nanjiang Jiang, Marie-Catherine de Marneffe | We address this problem by recasting the CommitmentBank for NLI, which contains items involving reasoning over the extent to which a speaker is committed to complements of clause-embedding verbs under entailment-canceling environments (conditional, negation, modal and question). |
631 | Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs | Soumya Sharma, Bishal Santra, Abhik Jana, Santosh Tokala, Niloy Ganguly, Pawan Goyal | In this paper, we explore how to incorporate structured domain knowledge, available in the form of a knowledge graph (UMLS), for the Medical NLI task. |
632 | The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English | Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc’Aurelio Ranzato | In this work, we introduce the FLORES evaluation datasets for Nepali-English and Sinhala-English, based on sentences translated from Wikipedia. |
633 | Mask-Predict: Parallel Decoding of Conditional Masked Language Models | Marjan Ghazvininejad, Omer Levy, Yinhan Liu, Luke Zettlemoyer | We, instead, use a masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a partially masked target translation. (A schematic decoding sketch follows this table.) |
634 | Learning to Copy for Automatic Post-Editing | Xuancheng Huang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun | In this work, we propose a new method for modeling copying for APE. |
635 | Exploring Human Gender Stereotypes with Word Association Test | Yupei Du, Yuanbin Wu, Man Lan | In this work, we utilize word association test, which contains rich types of word connections annotated by human participants, to explore how gender stereotypes spread within our minds. |
636 | A Modular Architecture for Unsupervised Sarcasm Generation | Abhijit Mishra, Tarun Tater, Karthik Sankaranarayanan | In this paper, we propose a novel framework for sarcasm generation; the system takes a literal negative opinion as input and translates it into a sarcastic version. |
637 | Generating Classical Chinese Poems from Vernacular Chinese | Zhichao Yang, Pengshan Cai, Yansong Feng, Fei Li, Weijiang Feng, Elena Suet-Ying Chiu, Hong Yu | In this paper, we propose a novel task of generating classical Chinese poems from vernacular Chinese, which allows users to have more control over the semantics of generated poems. |
638 | Set to Ordered Text: Generating Discharge Instructions from Medical Billing Codes | Litton J Kurisinkel, Nancy Chen | We present set to ordered text, a natural language generation task applied to automatically generating discharge instructions from admission ICD (International Classification of Diseases) codes. |
639 | Constraint-based Learning of Phonological Processes | Shraddha Barke, Rose Kunkel, Nadia Polikarpova, Eric Meinhardt, Eric Bakovic, Leon Bergen | We present an unsupervised approach to learning human-readable descriptions of phonological processes from collections of related utterances. |
640 | Detect Camouflaged Spam Content via StoneSkipping: Graph and Text Joint Embedding for Chinese Character Variation Representation | Zhuoren Jiang, Zhe Gao, Guoxiu He, Yangyang Kang, Changlong Sun, Qiong Zhang, Luo Si, Xiaozhong Liu | This paper proposes a novel framework to jointly model Chinese variational, semantic, and contextualized representations for the Chinese text spam detection task. |
641 | An Attentive Fine-Grained Entity Typing Model with Latent Type Representation | Ying Lin, Heng Ji | We propose a fine-grained entity typing model with a novel attention mechanism and a hybrid type classifier. |
642 | An Improved Neural Baseline for Temporal Relation Extraction | Qiang Ning, Sanjay Subramanian, Dan Roth | This paper proposes a new neural system that achieves about 10% absolute improvement in accuracy over the previous best system (25% error reduction) on two benchmark datasets. |
643 | Improving Fine-grained Entity Typing with Entity Linking | Hongliang Dai, Donghong Du, Xin Li, Yangqiu Song | In this paper, we use entity linking to help with the fine-grained entity type classification process. |
644 | Combining Spans into Entities: A Neural Two-Stage Approach for Recognizing Discontiguous Entities | Bailin Wang, Wei Lu | In this work, we propose a neural two-stage approach to recognizing discontiguous and overlapping entities by decomposing this problem into two subtasks: 1) it first detects all the overlapping spans that either form entities on their own or present as segments of discontiguous entities, based on the representation of segmental hypergraph, 2) next it learns to combine these segments into discontiguous entities with a classifier, which filters out other incorrect combinations of segments. |
645 | Cross-Sentence N-ary Relation Extraction using Lower-Arity Universal Schemas | Kosuke Akimoto, Takuya Hiraoka, Kunihiko Sadamasa, Mathias Niepert | In this paper, we propose a novel approach to cross-sentence n-ary relation extraction based on universal schemas. |
646 | Gazetteer-Enhanced Attentive Neural Networks for Named Entity Recognition | Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun, Bin Dong, Shanshan Jiang | To alleviate this problem, this paper proposes Gazetteer-Enhanced Attentive Neural Networks, which can enhance region-based NER by learning name knowledge of entity mentions from easily-obtainable gazetteers, rather than only from fully-annotated data. |
647 | “A Buster Keaton of Linguistics”: First Automated Approaches for the Extraction of Vossian Antonomasia | Michel Schwab, Robert Jäschke, Frank Fischer, Jannik Strötgen | In this paper, we propose a first method for the extraction of VAs that works completely automatically. |
648 | Multi-Task Learning for Chemical Named Entity Recognition with Chemical Compound Paraphrasing | Taiki Watanabe, Akihiro Tamura, Takashi Ninomiya, Takuya Makino, Tomoya Iwakura | We propose a method to improve named entity recognition (NER) for chemical compounds using multi-task learning by jointly training a chemical NER model and a chemical compound paraphrase model. |
649 | FewRel 2.0: Towards More Challenging Few-Shot Relation Classification | Tianyu Gao, Xu Han, Hao Zhu, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou | We present FewRel 2.0, a more challenging task to investigate two aspects of few-shot relation classification models: (1) Can they adapt to a new domain with only a handful of instances? |
650 | ner and pos when nothing is capitalized | Stephen Mayhew, Tatiana Tsygankova, Dan Roth | In this work, we perform a systematic analysis of solutions to this problem, modifying only the casing of the train or test data using lowercasing and truecasing methods. |
651 | CaRB: A Crowdsourced Benchmark for Open IE | Sangnie Bhardwaj, Samarth Aggarwal, Mausam Mausam | We contribute CaRB, an improved dataset and framework for testing Open IE systems. |
652 | Weakly Supervised Attention Networks for Entity Recognition | Barun Patra, Joel Ruben Antony Moniz | In this work, we aim to circumvent this requirement of word-level annotated data. |
653 | Revealing and Predicting Online Persuasion Strategy with Elementary Units | Gaku Morio, Ryo Egawa, Katsuhide Fujita | Our contributions are as follows: (1) annotating five types of EUs in a persuasive forum, the so-called ChangeMyView, (2) revealing both intuitive and non-intuitive strategic insights for the persuasion by analyzing 4612 annotated EUs, and (3) proposing baseline neural models that identify the EU boundary and type. |
654 | A Challenge Dataset and Effective Models for Aspect-Based Sentiment Analysis | Qingnan Jiang, Lei Chen, Ruifeng Xu, Xiang Ao, Min Yang | In this paper, we present a new large-scale Multi-Aspect Multi-Sentiment (MAMS) dataset, in which each sentence contains at least two different aspects with different sentiment polarities. |
655 | Learning with Noisy Labels for Sentence-level Sentiment Classification | Hao Wang, Bing Liu, Chaozhuo Li, Yan Yang, Tianrui Li | We propose a novel DNN model called NetAb (as shorthand for convolutional neural Networks with Ab-networks) to handle noisy labels during training. |
656 | DENS: A Dataset for Multi-class Emotion Analysis | Chen Liu, Muhammad Osama, Anderson De Andrade | We introduce a new dataset for multi-class emotion analysis from long-form narratives in English. |
657 | Multi-Task Stance Detection with Sentiment and Stance Lexicons | Yingjie Li, Cornelia Caragea | In this paper, we propose a multi-task framework that incorporates target-specific attention mechanism and at the same time takes sentiment classification as an auxiliary task. |
658 | A Robust Self-Learning Framework for Cross-Lingual Text Classification | Xin Dong, Gerard de Melo | In this paper, we present an elegantly simple robust self-learning framework to include unlabeled non-English samples in the fine-tuning process of pretrained multilingual representation models. |
659 | Learning to Flip the Sentiment of Reviews from Non-Parallel Corpora | Canasai Kruengkrai | We introduce a method for acquiring imperfectly aligned sentences from non-parallel corpora and propose a model that learns to minimize the sentiment and content losses in a fully end-to-end manner. |
660 | Label Embedding using Hierarchical Structure of Labels for Twitter Classification | Taro Miyazaki, Kiminobu Makino, Yuka Takei, Hiroki Okamoto, Jun Goto | Therefore, we propose a method that can consider the hierarchical structure of labels and label texts themselves. |
661 | Interpretable Word Embeddings via Informative Priors | Miriam Hurtado Bodell, Martin Arvidsson, Måns Magnusson | We propose the use of informative priors to create interpretable and domain-informed dimensions for probabilistic word embeddings. |
662 | Adversarial Removal of Demographic Attributes Revisited | Maria Barrett, Yova Kementchedjhieva, Yanai Elazar, Desmond Elliott, Anders Søgaard | We revisit their experiments and conduct a series of follow-up experiments showing that, in fact, the diagnostic classifier generalizes poorly to both new in-domain samples and new domains, indicating that it relies on correlations specific to their particular data sample. |
663 | A deep-learning framework to detect sarcasm targets | Jasabanta Patro, Srijan Bansal, Animesh Mukherjee | In this paper we propose a deep learning framework for sarcasm target detection in predefined sarcastic texts. |
664 | In Plain Sight: Media Bias Through the Lens of Factual Reporting | Lisa Fan, Marshall White, Eva Sharma, Ruisi Su, Prafulla Kumar Choubey, Ruihong Huang, Lu Wang | In this work, we investigate the effects of informational bias: factual content that can nevertheless be deployed to sway reader opinion. |
665 | Incorporating Label Dependencies in Multilabel Stance Detection | William Ferreira, Andreas Vlachos | In this paper, we address versions of the task in which an utterance can have multiple labels, thus corresponding to multilabel classification. |
666 | Investigating Sports Commentator Bias within a Large Corpus of American Football Broadcasts | Jack Merullo, Luke Yeh, Abram Handler, Alvin Grissom II, Brendan O’Connor, Mohit Iyyer | We identify major confounding factors for researchers examining racial bias in FOOTBALL, and perform a computational analysis that supports conclusions from prior social science studies. |
667 | Charge-Based Prison Term Prediction with Deep Gating Network | Huajie Chen, Deng Cai, Wei Dai, Zehui Dai, Yadong Ding | In this paper, we argue that charge-based prison term prediction (CPTP) not only better fits realistic needs, but also makes the total prison term prediction more accurate and interpretable. We collect the first large-scale structured data for CPTP and evaluate several competitive baselines. |
668 | Restoring ancient text using deep learning: a case study on Greek epigraphy | Yannis Assael, Thea Sommerschield, Jonathan Prag | This work presents Pythia, the first ancient text restoration model that recovers missing characters from a damaged text input using deep neural networks. |
669 | Embedding Lexical Features via Tensor Decomposition for Small Sample Humor Recognition | Zhenjie Zhao, Andrew Cattle, Evangelos Papalexakis, Xiaojuan Ma | We propose a novel tensor embedding method that can effectively extract lexical features for humor recognition. |
670 | EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks | Jason Wei, Kai Zou | We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. (A minimal sketch of two EDA operations follows this table.) |
671 | Neural News Recommendation with Multi-Head Self-Attention | Chuhan Wu, Fangzhao Wu, Suyu Ge, Tao Qi, Yongfeng Huang, Xing Xie | In this paper, we propose a neural news recommendation approach with multi-head self-attention (NRMS). |
672 | What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis | Xiaolei Huang, Jonathan May, Nanyun Peng | In this paper, we first propose a simple and efficient neural architecture for cross-lingual NER. |
673 | Telling the Whole Story: A Manually Annotated Chinese Dataset for the Analysis of Humor in Jokes | Dongyu Zhang, Heting Zhang, Xikai Liu, Hongfei Lin, Feng Xia | We propose a novel annotation scheme to give scenarios of how humor arises in text. We therefore create a dataset on humor with 9,123 manually annotated jokes in Chinese. |
674 | Generating Natural Anagrams: Towards Language Generation Under Hard Combinatorial Constraints | Masaaki Nishino, Sho Takase, Tsutomu Hirao, Masaaki Nagata | In this paper, we show that simple depth-first search can yield natural anagrams when it is combined with modern neural language models. |
675 | STANCY: Stance Classification Based on Consistency Cues | Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, Gerhard Weikum | In this work, we present a neural network model for stance classification leveraging BERT representations and augmenting them with a novel consistency constraint. |
676 | Cross-lingual intent classification in a low resource industrial setting | Talaat Khalil, Kornel Kiełczewski, Georgios Christos Chouliaras, Amina Keldibek, Maarten Versteegh | This paper explores different approaches to multilingual intent classification in a low resource setting. |
677 | SoftRegex: Generating Regex from Natural Language Descriptions using Softened Regex Equivalence | Jun-U Park, Sang-Ki Ko, Marco Cognetta, Yo-Sub Han | Since the regular expression equivalence problem is PSPACE-complete, we introduce the EQ_Reg model for computing the similarity of two regular expressions using deep neural networks. |
678 | Using Clinical Notes with Time Series Data for ICU Management | Swaraj Khadanga, Karan Aggarwal, Shafiq Joty, Jaideep Srivastava | We propose a method to model them jointly, achieving considerable improvement across benchmark tasks over baseline time-series model. |
679 | Spelling-Aware Construction of Macaronic Texts for Teaching Foreign-Language Vocabulary | Adithya Renduchintala, Philipp Koehn, Jason Eisner | We present a machine foreign-language teacher that modifies text in a student’s native language (L1) by replacing some word tokens with glosses in a foreign language (L2), in such a way that the student can acquire L2 vocabulary simply by reading the resulting macaronic text. |
680 | Towards Machine Reading for Interventions from Humanitarian-Assistance Program Literature | Bonan Min, Yee Seng Chan, Haoling Qiu, Joshua Fasching | In this paper, we developed a corpus annotated with interventions to foster research, and developed an information extraction system for extracting interventions and their location and time from text. |
681 | RUN through the Streets: A New Dataset and Baseline Models for Realistic Urban Navigation | Tzuf Paz-Argaman, Reut Tsarfaty | Here we introduce the Realistic Urban Navigation (RUN) task, aimed at interpreting NL navigation instructions based on a real, dense, urban map. Using Amazon Mechanical Turk, we collected a dataset of 2515 instructions aligned with actual routes over three regions of Manhattan. |
682 | Context-Aware Conversation Thread Detection in Multi-Party Chat | Ming Tan, Dakuo Wang, Yupeng Gao, Haoyu Wang, Saloni Potdar, Xiaoxiao Guo, Shiyu Chang, Mo Yu | In this work, we propose a novel Context-Aware Thread Detection (CATD) model that automatically disentangles these conversation threads. |
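For paper 522 (TuckER), the highlight names the core mechanism directly: a triple's plausibility is the contraction of a learned core tensor with the subject, relation, and object embeddings. Below is a minimal NumPy sketch of that scoring function only, using toy dimensions and random placeholders in place of learned parameters; the variable names are ours, not the paper's.

```python
import numpy as np

# Minimal sketch of a TuckER-style scoring function: the plausibility of a
# triple (s, r, o) is the core tensor W contracted with the subject, relation,
# and object embeddings. All values here are random toy placeholders; in the
# paper they are learned end-to-end.
d_e, d_r = 4, 3                          # entity / relation embedding sizes
rng = np.random.default_rng(0)
W = rng.normal(size=(d_e, d_r, d_e))     # core tensor of the Tucker decomposition
e_s = rng.normal(size=d_e)               # subject entity embedding
w_r = rng.normal(size=d_r)               # relation embedding
e_o = rng.normal(size=d_e)               # object entity embedding

# phi(s, r, o) = W x_1 e_s x_2 w_r x_3 e_o  (n-mode tensor products)
score = np.einsum("irj,i,r,j->", W, e_s, w_r, e_o)
print(float(score))                      # higher = more plausible triple
```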
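Paper 545 describes its idiom-extraction loop concretely: count depth-2 subtrees across a corpus of syntax trees, collapse the most frequent one into a single fused node, and repeat. The sketch below is a toy rendering of that loop under our own simplifying assumptions (trees as nested tuples, leaves as plain strings); it is not the authors' implementation.

```python
from collections import Counter

# Toy sketch of iterative depth-2 subtree collapsing. A tree is a nested
# tuple (label, child, child, ...); leaves are plain strings.

def depth2_subtrees(tree, counts):
    if isinstance(tree, str):
        return
    # a depth-2 fragment: this node's label plus its children's labels
    key = (tree[0], tuple(c if isinstance(c, str) else c[0] for c in tree[1:]))
    counts[key] += 1
    for child in tree[1:]:
        depth2_subtrees(child, counts)

def collapse(tree, key):
    if isinstance(tree, str):
        return tree
    label, kids = tree[0], [collapse(c, key) for c in tree[1:]]
    this = (label, tuple(c if isinstance(c, str) else c[0] for c in kids))
    if this == key:
        # fuse the matched fragment into one idiom node whose children are
        # the fragment's grandchildren (leaf children are kept as-is)
        fused_label = label + "(" + ",".join(this[1]) + ")"
        flat = []
        for c in kids:
            flat.extend([c] if isinstance(c, str) else list(c[1:]))
        return (fused_label, *flat)
    return (label, *kids)

def extract_idioms(trees, rounds=3):
    idioms = []
    for _ in range(rounds):
        counts = Counter()
        for t in trees:
            depth2_subtrees(t, counts)
        if not counts:
            break
        key, _ = counts.most_common(1)[0]  # most frequent depth-2 fragment
        idioms.append(key)
        trees = [collapse(t, key) for t in trees]
    return idioms

print(extract_idioms([("S", ("NP", "the", "cat"), ("VP", "sat"))], rounds=2))
```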
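Paper 599's key change fits in one line of math: instead of a per-passage softmax over candidate answer spans, apply a single softmax over the spans of every passage retrieved for the question, so scores become comparable across passages. A minimal sketch with made-up placeholder logits:

```python
import numpy as np

# Minimal sketch of global normalization across passages: one softmax over
# the candidate-span logits of ALL retrieved passages, rather than one
# softmax per passage. The logits below are illustrative placeholders.
span_logits = {
    "passage_1": np.array([2.1, 0.3]),        # candidate answer spans
    "passage_2": np.array([1.7, 1.5, 0.2]),
}

all_logits = np.concatenate(list(span_logits.values()))
global_probs = np.exp(all_logits - all_logits.max())
global_probs /= global_probs.sum()            # one distribution over all spans
best = int(np.argmax(global_probs))
print(best, float(global_probs[best]))
```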
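Paper 633 (Mask-Predict) decodes non-autoregressively: predict all target positions in parallel with a conditional masked LM, then repeatedly re-mask the lowest-confidence predictions and re-predict them, with the number of re-masked tokens decaying linearly over iterations. The sketch below is schematic; `model` is a hypothetical stand-in returning one (token, probability) pair per target position, and the schedule is simplified relative to the paper.

```python
def mask_predict(model, source, target_len, T=10, MASK="<MASK>"):
    """Schematic mask-predict loop; `model(source, tokens)` is a hypothetical
    conditional masked LM returning one (token, probability) per position."""
    tokens = [MASK] * target_len
    probs = [0.0] * target_len
    for t in range(1, T + 1):
        # fill every currently masked position in parallel
        for i, (tok, p) in enumerate(model(source, tokens)):
            if tokens[i] == MASK:
                tokens[i], probs[i] = tok, p
        n = target_len * (T - t) // T          # linearly decaying mask count
        if n == 0:
            break                              # final iteration: keep everything
        # re-mask the n lowest-confidence tokens and try them again
        for i in sorted(range(target_len), key=probs.__getitem__)[:n]:
            tokens[i] = MASK
    return tokens
```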
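Paper 670's EDA consists of four simple word-level operations: synonym replacement, random insertion, random swap, and random deletion. Below is a minimal sketch of the two operations that need no external resources (swap and deletion); the other two would additionally require a synonym source such as WordNet. Parameter names and defaults are ours, not the paper's.

```python
import random

# Minimal sketch of two of the four EDA operations: random swap and
# random deletion. Input is assumed to be a non-empty list of words.

def random_swap(words, n=1):
    words = words[:]
    for _ in range(n):
        if len(words) < 2:
            break
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p=0.1):
    # keep each word with probability 1 - p; never return an empty sentence
    kept = [w for w in words if random.random() > p]
    return kept if kept else [random.choice(words)]

sentence = "easy data augmentation boosts text classification".split()
print(" ".join(random_swap(sentence, n=2)))
print(" ".join(random_deletion(sentence, p=0.2)))
```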