Paper Digest: EMNLP 2022 Highlights
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly grasp the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website and are behind our services, including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: EMNLP 2022 Highlights
# | Paper | Author(s) |
---|---|---|
1 | Generative Knowledge Graph Construction: A Review Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we summarize the recent compelling progress in generative knowledge graph construction. |
Hongbin Ye; Ningyu Zhang; Hui Chen; Huajun Chen; |
2 | CDConv: A Benchmark for Contradiction Detection in Chinese Conversations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a benchmark for Contradiction Detection in Chinese Conversations, namely CDConv. |
Chujie Zheng; Jinfeng Zhou; Yinhe Zheng; Libiao Peng; Zhen Guo; Wenquan Wu; Zheng-Yu Niu; Hua Wu; Minlie Huang; |
3 | Transformer Feed-Forward Layers Build Predictions By Promoting Concepts in The Vocabulary Space Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood. In this work, we make a substantial step towards unveiling this underlying prediction process, by reverse-engineering the operation of the feed-forward network (FFN) layers, one of the building blocks of transformer models. |
Mor Geva; Avi Caciularu; Kevin Wang; Yoav Goldberg; |
4 | Learning to Generate Question By Asking Question: A Primal-Dual Approach with Uncommon Word Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, unseen or rare word generation has not been studied in previous works. In this paper, we propose a novel approach which incorporates question generation with its dual problem, question answering, into a unified primal-dual framework. |
Qifan Wang; Li Yang; Xiaojun Quan; Fuli Feng; Dongfang Liu; Zenglin Xu; Sinong Wang; Hao Ma; |
5 | Graph-based Model Generation for Few-Shot Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing models follow a “one-for-all” scheme where one general large model performs all individual N-way-K-shot tasks in FSRE, which prevents the model from achieving the optimal point on each task. In view of this, we propose a model generation framework that consists of one general model for all tasks and many tiny task-specific models for each individual task. |
Wanli Li; Tieyun Qian; |
6 | Backdoor Attacks in Federated Learning By Rare Embeddings and Gradient Ensembling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper investigates the feasibility of model poisoning for backdoor attacks through rare word embeddings of NLP models. |
Ki Yoon Yoo; Nojun Kwak; |
7 | Generating Natural Language Proofs with Verifier-Guided Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we focus on proof generation: Given a hypothesis and a set of supporting facts, the model generates a proof tree indicating how to derive the hypothesis from supporting facts. |
Kaiyu Yang; Jia Deng; Danqi Chen; |
8 | Toward Unifying Text Segmentation and Long Document Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore the role that section segmentation plays in extractive summarization of written and spoken documents. |
Sangwoo Cho; Kaiqiang Song; Xiaoyang Wang; Fei Liu; Dong Yu; |
9 | The Geometry of Multilingual Language Model Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We assess how multilingual language models maintain a shared multilingual representation space while still encoding language-sensitive information in each language. |
Tyler Chang; Zhuowen Tu; Benjamin Bergen; |
10 | Improving Complex Knowledge Base Question Answering Via Question-to-Action and Question-to-Question Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, there is a significant semantic and structural gap between natural language and action sequences, which makes this conversion difficult. In this paper, we introduce an alignment-enhanced complex question answering framework, called ALCQA, which mitigates this gap through question-to-action alignment and question-to-question alignment. |
Yechun Tang; Xiaoxia Cheng; Weiming Lu; |
11 | PAIR: Prompt-Aware MargIn Ranking for Counselor Reflection Scoring in Motivational Interviewing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a system for the analysis of counselor reflections. |
Do June Min; Verónica Pérez-Rosas; Kenneth Resnicow; Rada Mihalcea; |
12 | Co-guiding Net: Achieving Mutual Guidances Between Multiple Intent Detection and Slot Filling Via Heterogeneous Semantics-Label Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel model termed Co-guiding Net, which implements a two-stage framework achieving the mutual guidances between the two tasks. |
Bowen Xing; Ivor Tsang; |
13 | The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a general task-agnostic method, namely intra-distillation, appended to the regular training loss to balance parameter sensitivity. |
Haoran Xu; Philipp Koehn; Kenton Murray; |
14 | Interpreting Language Models with Contrastive Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To disentangle the different decisions in language modeling, we focus on explaining language models contrastively: we look for salient input tokens that explain why the model predicted one token instead of another. |
Kayo Yin; Graham Neubig; |
15 | RankGen: Improving Text Generation with Large Ranking Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Given an input sequence (or prefix), modern language models often assign high probabilities to output sequences that are repetitive, incoherent, or irrelevant to the prefix; as such, model-generated text also contains such artifacts. To address these issues we present RankGen, a 1.2B parameter encoder model for English that scores model generations given a prefix. |
Kalpesh Krishna; Yapei Chang; John Wieting; Mohit Iyyer; |
16 | Learning A Grammar Inducer from Massive Uncurated Instructional Videos Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While previous work focuses on building systems for inducing grammars on text that is well-aligned with video content, we investigate the scenario in which text and video are only in loose correspondence. |
Songyang Zhang; Linfeng Song; Lifeng Jin; Haitao Mi; Kun Xu; Dong Yu; Jiebo Luo; |
17 | Normalized Contrastive Learning for Text-Video Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we show that many test instances are either over- or under-represented during retrieval, significantly hurting the retrieval performance. To address this problem, we propose Normalized Contrastive Learning (NCL) which utilizes the Sinkhorn-Knopp algorithm to compute the instance-wise biases that properly normalize the sum retrieval probabilities of each instance so that every text and video instance is fairly represented during cross-modal retrieval. |
Yookoon Park; Mahmoud Azab; Seungwhan Moon; Bo Xiong; Florian Metze; Gourab Kundu; Kirmani Ahmed; |
18 | Estimating Soft Labels for Out-of-Domain Intent Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an adaptive soft pseudo labeling (ASoul) method that can estimate soft labels for pseudo OOD samples when training OOD detectors. |
Hao Lang; Yinhe Zheng; Jian Sun; Fei Huang; Luo Si; Yongbin Li; |
19 | Multi-VQG: Generating Engaging Questions for Multiple Images Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose generating engaging questions from multiple images. |
Min-Hsuan Yeh; Vincent Chen; Ting-Hao Huang; Lun-Wei Ku; |
20 | Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present the first systematic conceptual and data-driven analysis to examine the shortcomings of token-level equivalence measures. |
Jannis Bulian; Christian Buck; Wojciech Gajewski; Benjamin Börschinger; Tal Schuster; |
21 | Non-Parametric Domain Adaptation for End-to-End Speech Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel non-parametric method that leverages in-domain text translation corpus to achieve domain adaptation for E2E-ST systems. |
Yichao Du; Weizhi Wang; Zhirui Zhang; Boxing Chen; Tong Xu; Jun Xie; Enhong Chen; |
22 | Prompting for Multimodal Hateful Meme Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, there is no known explicit external knowledge base that could provide such hate speech contextual information. To address this gap, we propose PromptHate, a simple yet effective prompt-based model that prompts pre-trained language models (PLMs) for hateful meme classification. |
Rui Cao; Roy Ka-Wei Lee; Wen-Haw Chong; Jing Jiang; |
23 | Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the concept of certified error control of candidate set pruning for relevance ranking, which means that the test error after pruning is guaranteed to be controlled under a user-specified threshold with high probability. |
Minghan Li; Xinyu Zhang; Ji Xin; Hongyang Zhang; Jimmy Lin; |
24 | Linearizing Transformer with Key-Value Memory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose Memsizer, an approach towards closing the performance gap while improving the efficiency even with short generation. |
Yizhe Zhang; Deng Cai; |
25 | Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate the robustness of multimodal classifiers to cross-modal dilutions, a plausible variation. |
Gaurav Verma; Vishwa Vinay; Ryan Rossi; Srijan Kumar; |
26 | Translation Between Molecules and Natural Language Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present MolT5 – a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings. |
Carl Edwards; Tuan Lai; Kevin Ros; Garrett Honke; Kyunghyun Cho; Heng Ji; |
27 | What Makes Instruction Learning Hard? An Investigation and A New Challenge in A Synthetic Environment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We thus propose Hard RegSet as a challenging instruction learning dataset, and a controlled environment for studying instruction learning. |
Matthew Finlayson; Kyle Richardson; Ashish Sabharwal; Peter Clark; |
28 | Sentence-Incremental Neural Coreference Resolution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a sentence-incremental neural coreference resolution system which incrementally builds clusters after marking mention boundaries in a shift-reduce method. |
Matt Grenander; Shay B. Cohen; Mark Steedman; |
29 | SNaC: Coherence Error Detection for Narrative Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce SNaC, a narrative coherence evaluation framework for fine-grained annotations of long summaries. |
Tanya Goyal; Junyi Jessy Li; Greg Durrett; |
30 | HydraSum: Disentangling Style Features in Text Summarization with Multi-Decoder Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these are implicitly encoded within model parameters and specific styles cannot be enforced. To address this, we introduce HydraSum, a new summarization architecture that extends the single decoder framework of current models to a mixture-of-experts version with multiple decoders. |
Tanya Goyal; Nazneen Rajani; Wenhao Liu; Wojciech Kryscinski; |
31 | A Good Neighbor, A Found Treasure: Mining Treasured Neighbors for Knowledge Graph Entity Typing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Besides, we also observe that there are co-occurrence relations between types, which is very helpful to alleviate false-negative problem. In this paper, we propose a novel method called Mining Treasured Neighbors (MiNer) to make use of these two characteristics. |
Zhuoran Jin; Pengfei Cao; Yubo Chen; Kang Liu; Jun Zhao; |
32 | Guiding Neural Entity Alignment with Compatibility Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we argue that different entities within one KG should have compatible counterparts in the other KG due to the potential dependencies among the entities. |
Bing Liu; Harrisen Scells; Wen Hua; Guido Zuccon; Genghong Zhao; Xia Zhang; |
33 | InstructDial: Improving Zero and Few-shot Generalization in Dialogue Through Instruction Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce InstructDial, an instruction tuning framework for dialogue, which consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets. |
Prakhar Gupta; Cathy Jiao; Yi-Ting Yeh; Shikib Mehri; Maxine Eskenazi; Jeffrey Bigham; |
34 | Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we suggest unsupervised statistical boundary information instead, and propose an architecture to encode the information directly into pre-trained language models, resulting in Boundary-Aware BERT (BABERT). |
Peijie Jiang; Dingkun Long; Yanzhao Zhang; Pengjun Xie; Meishan Zhang; Min Zhang; |
35 | RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose RetroMAE, a new retrieval oriented pre-training paradigm based on Masked Auto-Encoder (MAE). |
Shitao Xiao; Zheng Liu; Yingxia Shao; Zhao Cao; |
36 | Aligning Recommendation and Conversation Via Dual Imitation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing conversational recommendation systems (CRS) ignore the advantage of user interest shift in connecting recommendation and conversation, which leads to an ineffective loose coupling structure of CRS. To address this issue, by modeling the recommendation actions as recommendation paths in a knowledge graph (KG), we propose DICR (Dual Imitation for Conversational Recommendation), which designs a dual imitation to explicitly align the recommendation paths and user interest shift paths in a recommendation module and a conversation module, respectively. |
Jinfeng Zhou; Bo Wang; Minlie Huang; Dongming Zhao; Kun Huang; Ruifang He; Yuexian Hou; |
37 | QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose QRelScore, a context-aware Relevance evaluation metric for Question Generation. |
Xiaoqiang Wang; Bang Liu; Siliang Tang; Lingfei Wu; |
38 | Abstract Visual Reasoning with Tangram Shapes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce KiloGram, a resource for studying abstract visual reasoning in humans and machines. |
Anya Ji; Noriyuki Kojima; Noah Rush; Alane Suhr; Wai Keen Vong; Robert Hawkins; Yoav Artzi; |
39 | UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Since the inputs and outputs of SKG tasks are heterogeneous, they have been studied separately by different communities, which limits systematic and compatible research on SKG. In this paper, we overcome this limitation by proposing the UnifiedSKG framework, which unifies 21 SKG tasks into a text-to-text format, aiming to promote systematic SKG research, instead of being exclusive to a single task, domain, or dataset. |
Tianbao Xie; Chen Henry Wu; Peng Shi; Ruiqi Zhong; Torsten Scholak; Michihiro Yasunaga; Chien-Sheng Wu; Ming Zhong; Pengcheng Yin; Sida I. Wang; Victor Zhong; Bailin Wang; Chengzu Li; Connor Boyle; Ansong Ni; Ziyu Yao; Dragomir Radev; Caiming Xiong; Lingpeng Kong; Rui Zhang; Noah A. Smith; Luke Zettlemoyer; Tao Yu; |
40 | Balanced Adversarial Training: Balancing Tradeoffs Between Fickleness and Obstinacy in NLP Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that standard adversarial training methods focused on reducing vulnerability to fickle adversarial examples may make a model more vulnerable to obstinate adversarial examples, with experiments for both natural language inference and paraphrase identification tasks. To counter this phenomenon, we introduce Balanced Adversarial Training, which incorporates contrastive learning to increase robustness against both fickle and obstinate adversarial examples. |
Hannah Chen; Yangfeng Ji; David Evans; |
41 | When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a simple transformer-based model that outperforms specialized architectures on ReaSCAN and a modified version (Qiu et al., 2021) of gSCAN (Ruis et al., 2020). |
Ankur Sikarwar; Arkil Patel; Navin Goyal; |
42 | Generative Language Models for Paragraph-Level Question Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, it is difficult to measure advances in QG research since there are no standardized resources that allow a uniform comparison among approaches. In this paper, we introduce QG-Bench, a multilingual and multidomain benchmark for QG that unifies existing question answering datasets by converting them to a standard QG setting. |
Asahi Ushio; Fernando Alva-Manchego; Jose Camacho-Collados; |
43 | A Unified Encoder-Decoder Framework with Entity Memory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an encoder-decoder framework with an entity memory, namely EDMem. |
Zhihan Zhang; Wenhao Yu; Chenguang Zhu; Meng Jiang; |
44 | Segmenting Numerical Substitution Ciphers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose the first automatic methods to segment those ciphers using Byte Pair Encoding (BPE) and unigram language models. |
Nada Aldarrab; Jonathan May; |
45 | Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we present the Crossmodal-3600 dataset (XM3600 in short), a geographically diverse set of 3600 images annotated with human-generated reference captions in 36 languages. |
Ashish V. Thapliyal; Jordi Pont Tuset; Xi Chen; Radu Soricut; |
46 | ReSel: N-ary Relation Extraction from Scientific Text and Tables By Learning to Retrieve and Select Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study the problem of extracting N-ary relation tuples from scientific articles. |
Yuchen Zhuang; Yinghao Li; Junyang Zhang; Yue Yu; Yingjun Mou; Xiang Chen; Le Song; Chao Zhang; |
47 | GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: An additional limitation is that the union operator is non-closure, which undermines the model to handle a series of union operators. To address these problems, we propose a novel probabilistic embedding model, namely Gamma Embeddings (GammaE), for encoding entities and queries to answer different types of FOL queries on KGs. |
Dong Yang; Peijun Qing; Yang Li; Haonan Lu; Xiaodong Lin; |
48 | Reasoning Like Program Executors Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we showcase two simple instances POET-Math and POET-Logic, in addition to a complex instance, POET-SQL. |
Xinyu Pi; Qian Liu; Bei Chen; Morteza Ziyadi; Zeqi Lin; Qiang Fu; Yan Gao; Jian-Guang Lou; Weizhu Chen; |
49 | SEM-F1: An Automatic Way for Semantic Evaluation of Multi-Narrative Overlap Summaries at Scale Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we exclusively focus on the automated evaluation of the SOS task using the benchmark dataset. |
Naman Bansal; Mousumi Akter; Shubhra Kanti Karmaker Santu; |
50 | Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to understand and further develop prefix-tuning through the kernel lens. |
Yifan Chen; Devamanyu Hazarika; Mahdi Namazifar; Yang Liu; Di Jin; Dilek Hakkani-Tur; |
51 | DocInfer: Document-level Natural Language Inference Using Optimal Evidence Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present DocInfer – a novel, end-to-end Document-level Natural Language Inference model that builds a hierarchical document graph enriched through inter-sentence relations (topical, entity-based, concept-based), performs paragraph pruning using the novel SubGraph Pooling layer, followed by optimal evidence selection based on REINFORCE algorithm to identify the most important context sentences for a given hypothesis. |
Puneet Mathur; Gautam Kunapuli; Riyaz Bhat; Manish Shrivastava; Dinesh Manocha; Maneesh Singh; |
52 | LightEA: A Scalable, Robust, and Interpretable Entity Alignment Framework Via Three-view Label Propagation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we argue that existing complex EA methods inevitably inherit the inborn defects from their neural network lineage: poor interpretability and weak scalability. |
Xin Mao; Wenting Wang; Yuanbin Wu; Man Lan; |
53 | Metric-guided Distillation: Distilling Knowledge from The Metric to Ranker and Retriever for Generative Commonsense Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Another problem is that re-ranking is very expensive, but only using retrievers will seriously degrade the performance of their generation models. To solve these problems, we propose the metric distillation rule to distill knowledge from the metric (e.g., BLEU) to the ranker. |
Xingwei He; Yeyun Gong; A-Long Jin; Weizhen Qi; Hang Zhang; Jian Jiao; Bartuer Zhou; Biao Cheng; Sm Yiu; Nan Duan; |
54 | Efficient Document Retrieval By End-to-End Refining and Quantizing BERT Embedding with Contrastive Product Quantization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Furthermore, the Hamming distance can only be equal to one of several integer values, significantly limiting its representational ability for document distances. To address these issues, in this paper, we propose to leverage BERT embeddings to perform efficient retrieval based on the product quantization technique, which will assign for every document a real-valued codeword from the codebook, instead of a binary code as in semantic hashing. |
Zexuan Qiu; Qinliang Su; Jianxing Yu; Shijing Si; |
55 | Curriculum Knowledge Distillation for Emoji-supervised Cross-lingual Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, based on the intuitive assumption that the relationships between emojis and sentiments are consistent across different languages, we investigate transferring sentiment knowledge across languages with the help of emojis. |
Jianyang Zhang; Tao Liang; Mingyang Wan; Guowu Yang; Fengmao Lv; |
56 | Correctable-DST: Mitigating Historical Context Mismatch Between Training and Inference for Improved Dialogue State Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, only the previously predicted dialogue state can be used in inference. This discrepancy might lead to error propagation, i.e., mistakes made by the model in the current turn are likely to be carried over to the following turns. To solve this problem, we propose Correctable Dialogue State Tracking (Correctable-DST). |
Hongyan Xie; Haoxiang Su; Shuangyong Song; Hao Huang; Bo Zou; Kun Deng; Jianghua Lin; Zhihui Zhang; Xiaodong He; |
57 | DropMix: A Textual Data Augmentation Combining Dropout with Mixup Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we argue that the property is essential to overcome overfitting in text learning. |
Fanshuang Kong; Richong Zhang; Xiaohui Guo; Samuel Mensah; Yongyi Mao; |
58 | Cross-document Event Coreference Search: Task, Dataset and Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an appealing, and often more applicable, complementary setup for the task: Cross-document Coreference Search, focusing in this paper on event coreference. |
Alon Eirew; Avi Caciularu; Ido Dagan; |
59 | VIRT: Improving Representation-based Text Matching Via Virtual Interaction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these models suffer from severe performance degradation due to the lack of interactions between the pair of texts. To remedy this, we propose a Virtual InteRacTion mechanism (VIRT) for improving representation-based text matching while maintaining its efficiency. |
Dan Li; Yang Yang; Hongyin Tang; Jiahao Liu; Qifan Wang; Jingang Wang; Tong Xu; Wei Wu; Enhong Chen; |
60 | MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Different types of event relations naturally interact with each other, but existing datasets only cover limited relation types at once, which prevents models from taking full advantage of relation interactions. To address these issues, we construct a unified large-scale human-annotated ERE dataset MAVEN-ERE with improved annotation schemes. |
Xiaozhi Wang; Yulin Chen; Ning Ding; Hao Peng; Zimu Wang; Yankai Lin; Xu Han; Lei Hou; Juanzi Li; Zhiyuan Liu; Peng Li; Jie Zhou; |
61 | Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce effective ways to select data from unlabeled corpora of target domains for language model pretraining to improve the performances in target entity extraction tasks. |
Aniruddha Mahapatra; Sharmila Reddy Nangi; Aparna Garimella; Anandhavelu N; |
62 | How Large Language Models Are Transforming Machine-Paraphrase Plagiarism Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work explores T5 and GPT3 for machine-paraphrase generation on scientific articles from arXiv, student theses, and Wikipedia. |
Jan Philip Wahle; Terry Ruas; Frederic Kirstein; Bela Gipp; |
63 | M2D2: A Massively Multi-Domain Language Modeling Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present M2D2, a fine-grained, massively multi-domain corpus for studying domain adaptation in language models (LMs). |
Machel Reid; Victor Zhong; Suchin Gururangan; Luke Zettlemoyer; |
64 | “Will You Find These Shortcuts?” A Protocol for Evaluating The Faithfulness of Input Salience Methods for Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing work on faithfulness evaluation is not conclusive and does not provide a clear answer as to how different methods are to be compared. Focusing on text classification and the model debugging scenario, our main contribution is a protocol for faithfulness evaluation that makes use of partially synthetic data to obtain ground truth for feature importance ranking. |
Jasmijn Bastings; Sebastian Ebert; Polina Zablotskaia; Anders Sandholm; Katja Filippova; |
65 | Information-Transport-based Policy for Simultaneous Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we treat the translation as information transport from source to target and accordingly propose an Information-Transport-based Simultaneous Translation (ITST). |
Shaolei Zhang; Yang Feng; |
66 | Learning to Adapt to Low-Resource Paraphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, transferring a paraphrasing model to another domain encounters the problem of domain shifting especially when the data is sparse. At the same time, widely using large pre-trained language models (PLMs) faces the overfitting problem when training on scarce labeled data. To mitigate these two issues, we propose, LAPA, an effective adapter for PLMs optimized by meta-learning. |
Zhigen Li; Yanmeng Wang; Rizhao Fan; Ye Wang; Jianfeng Li; Shaojun Wang; |
67 | A Distributional Lens for Multi-Aspect Controllable Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing methods achieve complex multi-aspect control by fusing multiple controllers learned from single-aspect, but suffer from attribute degeneration caused by the mutual interference of these controllers. To address this, we provide observations on attribute fusion from a distributional perspective and propose to directly search for the intersection areas of multiple attribute distributions as their combination for generation. |
Yuxuan Gu; Xiaocheng Feng; Sicheng Ma; Lingyuan Zhang; Heng Gong; Bing Qin; |
68 | ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose ELMER: an efficient and effective PLM for NAR text generation to explicitly model the token dependency during NAR generation. |
Junyi Li; Tianyi Tang; Wayne Xin Zhao; Jian-Yun Nie; Ji-Rong Wen; |
69 | Multilingual Relation Classification Via Efficient and Effective Prompting Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present the first work on prompt-based multilingual relation classification (RC), by introducing an efficient and effective method that constructs prompts from relation triples and involves only minimal translation for the class labels. |
Yuxuan Chen; David Harbecke; Leonhard Hennig; |
70 | Topic-Regularized Authorship Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To handle a large number of unseen authors and topics, we propose Authorship Representation Regularization (ARR), a distillation framework that creates authorship representation with reduced reliance on topic-specific information. |
Jitkapat Sawatphol; Nonthakit Chaiwong; Can Udomcharoenchaikit; Sarana Nutanong; |
71 | Fine-grained Contrastive Learning for Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose fine-grained contrastive learning (FineCL) for RE, which leverages fine-grained information about which silver labels are and are not noisy to improve the quality of learned relationship representations for RE. |
William Hogan; Jiacheng Li; Jingbo Shang; |
72 | Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Succinctly summarizing dialogue is a task of growing interest, but inherent challenges, such as insufficient training data and low information density, impede our ability to train abstractive models. In this work, we propose a novel curriculum-based prompt learning method with self-training to address these problems. |
Changqun Li; Linlin Wang; Xin Lin; Gerard de Melo; Liang He; |
73 | Zero-Shot Text Classification with Self-Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the fact that such models are unfamiliar with the target task can lead to instability and performance issues. We propose a plug-and-play method to bridge this gap using a simple self-training approach, requiring only the class names along with an unlabeled dataset, and without the need for domain expertise or trial and error. |
Ariel Gera; Alon Halfon; Eyal Shnarch; Yotam Perlitz; Liat Ein-Dor; Noam Slonim; |
74 | Deconfounding Legal Judgment Prediction for European Court of Human Rights Cases Towards Better Alignment with Experts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work demonstrates that Legal Judgement Prediction systems without expert-informed adjustments can be vulnerable to shallow, distracting surface signals that arise from corpus construction, case distribution, and confounding factors. |
T.y.s.s Santosh; Shanshan Xu; Oana Ichim; Matthias Grabmair; |
75 | SQuALITY: Building A Long-Document Summarization Dataset The Hard Way Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we turn to a slower but more straightforward approach to developing summarization benchmark data: We hire highly-qualified contractors to read stories and write original summaries from scratch. |
Alex Wang; Richard Yuanzhe Pang; Angelica Chen; Jason Phang; Samuel R. Bowman; |
76 | MetaASSIST: Robust Dialogue State Tracking with Meta Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose three schemes with varying degrees of flexibility, ranging from slot-wise to both slot-wise and instance-wise, to convert the weighting parameter into learnable functions. |
Fanghua Ye; Xi Wang; Jie Huang; Shenghui Li; Samuel Stern; Emine Yilmaz; |
77 | Multilingual Machine Translation with Hyper-Adapters Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, adapters of related languages are unable to transfer information, and their total number of parameters becomes prohibitively expensive as the number of languages grows. In this work, we overcome these drawbacks using hyper-adapters – hyper-networks that generate adapters from language and layer embeddings. |
Christos Baziotis; Mikel Artetxe; James Cross; Shruti Bhosale; |
78 | Z-LaVI: Zero-Shot Language Solver Fueled By Visual Imagination Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they generally suffer from reporting bias, the phenomenon describing the lack of explicit commonsense knowledge in written text, e.g., “an orange is orange”. To overcome this limitation, we develop a novel approach, Z-LaVI, to endow language models with visual imagination capabilities. |
Yue Yang; Wenlin Yao; Hongming Zhang; Xiaoyang Wang; Dong Yu; Jianshu Chen; |
79 | Using Commonsense Knowledge to Answer Why-Questions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: What aspects can be made accessible via external commonsense resources? We study these questions in the context of answering questions in the TellMeWhy dataset using COMET as a source of relevant commonsense relations. |
Yash Kumar Lal; Niket Tandon; Tanvi Aggarwal; Horace Liu; Nathanael Chambers; Raymond Mooney; Niranjan Balasubramanian; |
80 | Affective Idiosyncratic Responses to Music Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Despite consensus that idiosyncratic factors play a key role in regulating how listeners emotionally respond to music, precisely measuring the marginal effects of these variables has proved challenging. To address this gap, we develop computational methods to measure affective responses to music from over 403M listener comments on a Chinese social music platform. |
Sky CH-Wang; Evan Li; Oliver Li; Smaranda Muresan; Zhou Yu; |
81 | Successive Prompting for Decomposing Complex Questions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a way to generate a synthetic dataset which can be used to bootstrap a model’s ability to decompose and answer intermediate questions. |
Dheeru Dua; Shivanshu Gupta; Sameer Singh; Matt Gardner; |
82 | Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop Maieutic Prompting, which aims to infer a correct answer to a question even from the unreliable generations of LM. |
Jaehun Jung; Lianhui Qin; Sean Welleck; Faeze Brahman; Chandra Bhagavatula; Ronan Le Bras; Yejin Choi; |
83 | DANLI: Deliberative Agent for Following Natural Language Instructions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These reactive agents are insufficient for long-horizon complex tasks. To address this limitation, we propose a neuro-symbolic deliberative agent that, while following language instructions, proactively applies reasoning and planning based on its neural and symbolic representations acquired from past experience (e.g., natural language and egocentric vision). |
Yichi Zhang; Jianing Yang; Jiayi Pan; Shane Storks; Nikhil Devraj; Ziqiao Ma; Keunwoo Yu; Yuwei Bao; Joyce Chai; |
84 | Tracing Semantic Variation in Slang Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore these theories using computational models and test them against historical slang dictionary entries, with a focus on characterizing regularity in the geographical variation of slang usages attested in the US and the UK over the past two centuries. |
Zhewei Sun; Yang Xu; |
85 | Fine-grained Category Discovery Under Coarse-grained Supervision with Hierarchical Weighted Self-contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Considering most current methods cannot transfer knowledge from coarse-grained level to fine-grained level, we propose a hierarchical weighted self-contrastive network by building a novel weighted self-contrastive module and combining it with supervised learning in a hierarchical manner. |
Wenbin An; Feng Tian; Ping Chen; Siliang Tang; Qinghua Zheng; QianYing Wang; |
86 | PLM-based World Models for Text-based Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As the core tasks of world models are future prediction and commonsense understanding, our claim is that pre-trained language models (PLMs) already provide a strong base upon which to build world models. |
Minsoo Kim; Yeonjoon Jung; Dohyeon Lee; Seung-won Hwang; |
87 | Prompt-Based Meta-Learning For Few-shot Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Prompt-tuning has recently proved to be another effective few-shot learner by bridging the gap between pre-train and downstream tasks. In this work, we closely combine the two promising few-shot learning methodologies in structure and propose a Prompt-Based Meta-Learning (PBML) model to overcome the above meta-learning problem by adding the prompting mechanism. |
Haoxing Zhang; Xiaofeng Zhang; Haibo Huang; Lei Yu; |
88 | How Well Can Text-to-Image Generative Models Understand Ethical Natural Language Interventions? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce an Ethical NaTural Language Interventions in Text-to-Image GENeration (ENTIGEN) benchmark dataset to evaluate the change in image generations conditional on ethical interventions across three social axes – gender, skin color, and culture. |
Hritik Bansal; Da Yin; Masoud Monajatipoor; Kai-Wei Chang; |
89 | Geographic Citation Gaps in NLP Research Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In the spirit of “what we do not measure, we cannot improve”, this work asks a series of questions on the relationship between geographical location and publication success (acceptance in top NLP venues and citation impact). |
Mukund Rungta; Janvijay Singh; Saif M. Mohammad; Diyi Yang; |
90 | Language Models of Code Are Few-Shot Commonsense Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that when we instead frame structured commonsense reasoning tasks as code generation tasks, pre-trained LMs of code are better structured commonsense reasoners than LMs of natural language, even when the downstream task does not involve source code at all. We demonstrate our approach across three diverse structured commonsense reasoning tasks. |
Aman Madaan; Shuyan Zhou; Uri Alon; Yiming Yang; Graham Neubig; |
91 | Numerical Optimizations for Weighted Low-rank Estimation on Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unlike standard SVD, weighted value decomposition is a non-convex optimization problem that lacks a closed-form solution. We systematically investigated multiple optimization strategies to tackle the problem and examined our method by compressing Transformer-based language models. |
Ting Hua; Yen-Chang Hsu; Felicity Wang; Qian Lou; Yilin Shen; Hongxia Jin; |
92 | Generative Multi-hop Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such a bi-encoder approach has limitations in multi-hop settings; (1) the reformulated query gets longer as the number of hops increases, which further tightens the embedding bottleneck of the query vector, and (2) it is prone to error propagation. In this paper, we focus on alleviating these limitations in multi-hop settings by formulating the problem in a fully generative way. |
Hyunji Lee; Sohee Yang; Hanseok Oh; Minjoon Seo; |
93 | Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Accordingly, we annotate a dataset manually to facilitate the investigation of the newly-introduced task, and then build several benchmark encoder-decoder models by using VL-BART and VL-T5 as backbones. |
Yu Zhao; Jianguo Wei; ZhiChao Lin; Yueheng Sun; Meishan Zhang; Min Zhang; |
94 | M3: A Multi-View Fusion and Multi-Decoding Network for Multi-Document Reading Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel method that tries to employ a multi-view fusion and multi-decoding mechanism to achieve it. |
Liang Wen; Houfeng Wang; Yingwei Luo; Xiaolin Wang; |
95 | COCO-DR: Combating The Distribution Shift in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new zero-shot dense retrieval (ZeroDR) method, COCO-DR, to improve the generalization ability of dense retrieval by combating the distribution shifts between source training tasks and target scenarios. |
Yue Yu; Chenyan Xiong; Si Sun; Chao Zhang; Arnold Overwijk; |
96 | Language Model Pre-Training with Sparse Latent Typing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we manage to push the language models to obtain a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. |
Liliang Ren; Zixuan Zhang; Han Wang; Clare Voss; ChengXiang Zhai; Heng Ji; |
97 | On The Transformation of Latent Space in Fine-Tuned NLP Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the evolution of latent space in fine-tuned NLP models. |
Nadir Durrani; Hassan Sajjad; Fahim Dalvi; Firoj Alam; |
98 | Watch The Neighbors: A Unified K-Nearest Neighbor Contrastive Learning Framework for OOD Intent Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified K-nearest neighbor contrastive learning framework to discover OOD intents. |
Yutao Mou; Keqing He; Pei Wang; Yanan Wu; Jingang Wang; Wei Wu; Weiran Xu; |
99 | Extracted BERT Model Leaks More Information Than You Think! Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work bridges this gap by launching an attribute inference attack against the extracted BERT model. |
Xuanli He; Lingjuan Lyu; Chen Chen; Qiongkai Xu; |
100 | Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We take a first step in closing this gap by creating a new multimodal task targeted at evaluating understanding of predicate-noun dependencies in a controlled setup. |
Mitja Nikolaus; Emmanuelle Salin; Stephane Ayache; Abdellah Fourtassi; Benoit Favre; |
101 | A Multilingual Perspective Towards The Evaluation of Attribution Methods in Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a multilingual approach for evaluating attribution methods for the Natural Language Inference (NLI) task in terms of faithfulness and plausibility. |
Kerem Zaman; Yonatan Belinkov; |
102 | Graph-Based Multilingual Label Propagation for Low-Resource Part-of-Speech Tagging Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel method for transferring labels from multiple high-resource source to low-resource target languages. |
Ayyoob ImaniGooghari; Silvia Severini; Masoud Jalili Sabet; François Yvon; Hinrich Schütze; |
103 | SubeventWriter: Iterative Sub-event Sequence Generation with Coherence Controller Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new task of sub-event generation for an unseen process to evaluate the understanding of the coherence of sub-event actions and objects. |
Zhaowei Wang; Hongming Zhang; Tianqing Fang; Yangqiu Song; Ginny Wong; Simon See; |
104 | Infinite SCAN: An Infinite Model of Diachronic Semantic Change Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we propose a Bayesian model that can jointly estimate the number of senses of words and their changes through time. The model combines a dynamic topic model on Gaussian Markov random fields with a logistic stick-breaking process that realizes a Dirichlet process. |
Seiichi Inoue; Mamoru Komachi; Toshinobu Ogiso; Hiroya Takamura; Daichi Mochihashi; |
105 | Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study how IT can be improved with unlabeled data. |
Yuxian Gu; Pei Ke; Xiaoyan Zhu; Minlie Huang; |
106 | Counterfactual Data Augmentation Via Perspective Transition for Open-Domain Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a data augmentation method to automatically augment high-quality responses with different semantics by counterfactual inference. |
Jiao Ou; Jinchao Zhang; Yang Feng; Jie Zhou; |
107 | SQUIRE: A Sequence-to-sequence Framework for Multi-hop Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we present SQUIRE, the first Sequence-to-sequence based multi-hop reasoning framework, which utilizes an encoder-decoder Transformer structure to translate the query to a path. |
Yushi Bai; Xin Lv; Juanzi Li; Lei Hou; Yincen Qu; Zelin Dai; Feiyu Xiong; |
108 | SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified-modal speech-unit-text pre-training model, SpeechUT, to connect the representations of a speech encoder and a text decoder with a shared unit encoder. |
Ziqiang Zhang; Long Zhou; Junyi Ao; Shujie Liu; Lirong Dai; Jinyu Li; Furu Wei; |
109 | Learning Label Modular Prompts for Text Classification in The Wild Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, current modular approaches in NLP do not take advantage of recent advances in parameter efficient tuning of pretrained language models. To close this gap, we propose ModularPrompt, a label-modular prompt tuning framework for text classification tasks. |
Hailin Chen; Amrita Saha; Shafiq Joty; Steven C.H. Hoi; |
110 | Unbiased and Efficient Sampling of Dependency Trees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we show that their fastest algorithm for sampling with replacement, Wilson-RC, is in fact producing biased samples and we provide two alternatives that are unbiased. |
Miloš Stanojević; |
111 | Continual Learning of Neural Machine Translation Within Low Forgetting Risk Regions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To solve the problem, we propose a two-stage training method based on the local features of the real loss. |
Shuhao Gu; Bojie Hu; Yang Feng; |
112 | COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the trade-off of early exiting, we propose a joint training approach that calibrates slenderization and preserves contributive structures to each exit instead of only the final layer. |
Bowen Shen; Zheng Lin; Yuanxin Liu; Zhengxiao Liu; Lei Wang; Weiping Wang; |
113 | Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a simple enhancement of RE using k nearest neighbors (kNN-RE). |
Zhen Wan; Qianying Liu; Zhuoyuan Mao; Fei Cheng; Sadao Kurohashi; Jiwei Li; |
114 | StoryER: Automatic Story Evaluation Via Ranking, Rating and Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing automatic story evaluation methods place a premium on story lexical level coherence, deviating from human preference. We go beyond this limitation by considering a novel Story Evaluation method that mimics human preference when judging a story, namely StoryER, which consists of three sub-tasks: Ranking, Rating and Reasoning. |
Hong Chen; Duc Vo; Hiroya Takamura; Yusuke Miyao; Hideki Nakayama; |
115 | Enhancing Self-Consistency and Performance of Pre-Trained Language Models Through Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address this failure mode, we propose a framework, Consistency Correction through Relation Detection, or ConCoRD, for boosting the consistency and accuracy of pre-trained NLP models using pre-trained natural language inference (NLI) models without fine-tuning or re-training. |
Eric Mitchell; Joseph Noh; Siyan Li; Will Armstrong; Ananth Agarwal; Patrick Liu; Chelsea Finn; Christopher Manning; |
116 | Robustness of Demonstration-based Learning Under Limited Data Scenario Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we design pathological demonstrations by gradually removing intuitively useful information from the standard ones to take a deep dive into the robustness of demonstration-based sequence labeling and show that (1) demonstrations composed of random tokens still make the model a better few-shot learner; (2) the length of random demonstrations and the relevance of random tokens are the main factors affecting the performance; (3) demonstrations increase the confidence of model predictions on captured superficial patterns. |
Hongxin Zhang; Yanzhe Zhang; Ruiyi Zhang; Diyi Yang; |
117 | Modeling Information Change in Science Communication with Semantically Matched Paraphrases Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we present the SCIENTIFIC PARAPHRASE AND INFORMATION CHANGE DATASET (SPICED), the first paraphrase dataset of scientific findings annotated for degree of information change. |
Dustin Wright; Jiaxin Pei; David Jurgens; Isabelle Augenstein; |
118 | Word Order Matters When You Increase Masking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, recent experiments have shown that explicit position encoding is not always useful, since some models without such a feature managed to achieve state-of-the-art performance on some tasks. To better understand this phenomenon, we examine the effect of removing position encodings on the pre-training objective itself (i.e., masked language modelling), to test whether models can reconstruct position information from co-occurrences alone. |
Karim Lasri; Alessandro Lenci; Thierry Poibeau; |
119 | An Empirical Analysis of Memorization in Fine-tuned Autoregressive Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we empirically study memorization of fine-tuning methods using membership inference and extraction attacks, and show that their susceptibility to attacks is very different. |
Fatemehsadat Mireshghallah; Archit Uniyal; Tianhao Wang; David Evans; Taylor Berg-Kirkpatrick; |
120 | Style Transfer As Data Augmentation: A Case Study on Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we take the named entity recognition task in the English language as a case study and explore style transfer as a data augmentation method to increase the size and diversity of training data in low-resource scenarios. |
Shuguang Chen; Leonardo Neves; Thamar Solorio; |
121 | Linguistic Corpus Annotation for Automatic Text Simplification Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose annotations of the ASSET corpus that can be used to shed more light on ATS evaluation. |
Rémi Cardon; Adrien Bibal; Rodrigo Wilkens; David Alfter; Magali Norré; Adeline Müller; Patrick Watrin; Thomas François; |
122 | Semantic Framework Based Query Generation for Temporal Question Answering Over Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on the semantic framework, we propose a temporal question answering method, SF-TQA, which generates query graphs by exploring the relevant facts of mentioned entities, where the exploring process is restricted by SF-TCons. |
Wentao Ding; Hao Chen; Huayu Li; Yuzhong Qu; |
123 | There Is No Standard Answer: Knowledge-Grounded Dialogue Generation with Adversarial Activated Multi-Reference Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we establish a multi-reference KGC dataset and propose a series of metrics to systematically assess the one-to-many efficacy of existing KGC models. |
Xueliang Zhao; Tingchen Fu; Chongyang Tao; Rui Yan; |
124 | Stop Measuring Calibration When Humans Disagree Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recently, calibration to human majority has been measured on tasks where humans inherently disagree about which class applies. We show that measuring calibration to human majority given inherent disagreements is theoretically problematic, demonstrate this empirically on the ChaosNLI dataset, and derive several instance-level measures of calibration that capture key statistical properties of human judgements – including class frequency, ranking and entropy. |
Joris Baan; Wilker Aziz; Barbara Plank; Raquel Fernandez; |
125 | Improving Compositional Generalization for Multi-step Quantitative Reasoning in Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Quantitative reasoning is an important aspect of question answering, especially when numeric and verbal cues interact to indicate sophisticated, multi-step programs. In this paper, we demonstrate how modeling the compositional nature of quantitative text can enhance the performance and robustness of QA models, allowing them to capture arithmetic logic that is expressed verbally. |
Armineh Nourbakhsh; Cathy Jiao; Sameena Shah; Carolyn Rosé; |
126 | A Comprehensive Comparison of Neural Networks As Cognitive Models of Inflection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This debate has gravitated into NLP by way of the question: Are neural networks a feasible account for human behavior in morphological inflection? We address that question by measuring the correlation between human judgments and neural network probabilities for unknown word inflections. |
Adam Wiemerslage; Shiran Dudy; Katharina Kann; |
127 | Can Visual Context Improve Automatic Speech Recognition for An Embodied Agent? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a method to incorporate a robot’s visual information into an ASR system and improve the recognition of a spoken utterance containing a visible entity. |
Pradip Pramanick; Chayan Sarkar; |
128 | AfroLID: A Neural Language Identification Tool for African Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Problematically, most of the world’s 7000+ languages today are not covered by LID technologies. We address this pressing issue for Africa by introducing AfroLID, a neural LID toolkit for 517 African languages and varieties. |
Ife Adebara; AbdelRahim Elmadany; Muhammad Abdul-Mageed; Alcides Inciarte; |
129 | EvEntS ReaLM: Event Reasoning of Entity States Via Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper investigates models of event implications. |
Evangelia Spiliopoulou; Artidoro Pagnoni; Yonatan Bisk; Eduard Hovy; |
130 | Large Language Models Are Few-shot Clinical Information Extractors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show that large language models, such as InstructGPT (Ouyang et al., 2022), perform well at zero- and few-shot information extraction from clinical text despite not being trained specifically for the clinical domain. |
Monica Agrawal; Stefan Hegselmann; Hunter Lang; Yoon Kim; David Sontag; |
131 | Towards A Unified Multi-Dimensional Evaluator for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified multi-dimensional evaluator UniEval for NLG. |
Ming Zhong; Yang Liu; Da Yin; Yuning Mao; Yizhu Jiao; Pengfei Liu; Chenguang Zhu; Heng Ji; Jiawei Han; |
132 | GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a benchmark dataset, Geo-diverse Commonsense Multilingual Language Models Analysis (GeoMLAMA), for probing the diversity of the relational knowledge in multilingual PLMs. |
Da Yin; Hritik Bansal; Masoud Monajatipoor; Liunian Harold Li; Kai-Wei Chang; |
133 | The (Undesired) Attenuation of Human Biases By Multilinguality Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce and release CA-WEAT, multilingual cultural aware tests to quantify biases, and compare them to previous English-centric tests. |
Cristina España-Bonet; Alberto Barrón-Cedeño; |
134 | Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our goal is a question-answering (QA) system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning. |
Oyvind Tafjord; Bhavana Dalvi Mishra; Peter Clark; |
135 | Near-Negative Distinction: Giving A Second Life to Human Evaluation Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new and simple automatic evaluation method for NLG called Near-Negative Distinction (NND) that repurposes prior human annotations into NND tests.In an NND test, an NLG model must place a higher likelihood on a high-quality output candidate than on a near-negative candidate with a known error. |
Philippe Laban; Chien-Sheng Wu; Wenhao Liu; Caiming Xiong; |
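The NND test described in this highlight reduces to a likelihood comparison between a high-quality candidate and a near-negative one. The sketch below illustrates the idea only; `toy_score` is a hypothetical stand-in for a real NLG model's sequence log-likelihood, not the paper's implementation:

```python
# Sketch of a Near-Negative Distinction (NND) test: the model "passes" if it
# assigns a higher (log-)likelihood to the high-quality candidate than to a
# near-negative candidate containing a known error.
def nnd_test(score_fn, good_candidate, near_negative):
    """Return True if the scorer prefers the high-quality candidate."""
    return score_fn(good_candidate) > score_fn(near_negative)

def toy_score(text):
    # Hypothetical scorer: heavily penalizes a marked error token; a real
    # NND test would sum an NLG model's token log-probabilities instead.
    return -len(text) - (100 if "<ERR>" in text else 0)

passed = nnd_test(toy_score, "The cat sat on the mat.", "The cat sat on the <ERR>.")
```

In the paper's setup, many such pairwise tests are built from prior human annotations, so a model's pass rate serves as an automatic evaluation signal.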
136 | ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It is also difficult to collect a large-scale hate speech annotated dataset. In this work, we frame this problem as a few-shot learning task, and show significant gains with decomposing the task into its “constituent” parts. |
Badr AlKhamissi; Faisal Ladhak; Srinivasan Iyer; Veselin Stoyanov; Zornitsa Kozareva; Xian Li; Pascale Fung; Lambert Mathias; Asli Celikyilmaz; Mona Diab; |
137 | Are Hard Examples Also Harder to Explain? A Study with Human and Model-Generated Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study the connection between explainability and sample hardness by investigating the following research question: “Are LLMs and humans equally good at explaining data labels for both easy and hard samples?” |
Swarnadeep Saha; Peter Hase; Nazneen Rajani; Mohit Bansal; |
138 | Stanceosaurus: Classifying Stance Towards Multicultural Misinformation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Stanceosaurus, a new corpus of 28,033 tweets in English, Hindi and Arabic annotated with stance towards 250 misinformation claims. |
Jonathan Zheng; Ashutosh Baheti; Tarek Naous; Wei Xu; Alan Ritter; |
139 | Gendered Mental Health Stigma in Masked Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate gendered mental health stigma in masked language models. |
Inna Lin; Lucille Njoo; Anjalie Field; Ashish Sharma; Katharina Reinecke; Tim Althoff; Yulia Tsvetkov; |
140 | Efficient Nearest Neighbor Search for Cross-Encoder Models Using Matrix Factorization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder. |
Nishant Yadav; Nicholas Monath; Rico Angell; Manzil Zaheer; Andrew McCallum; |
141 | Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a method for arbitrary textual style transfer (TST), the task of transforming a text into any given style, utilizing general-purpose pre-trained language models. |
Mirac Suzgun; Luke Melas-Kyriazi; Dan Jurafsky; |
142 | Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we look at large-scale intermediate pre-training of decomposition-based transformers using distant supervision from comparable texts, particularly large-scale parallel news. |
Ben Zhou; Kyle Richardson; Xiaodong Yu; Dan Roth; |
143 | Why Is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we identify the dataset's main challenges through a suite of experiments on related tasks (probing task, image retrieval task), data augmentation, and manual inspection of the dataset. |
Anuj Diwan; Layne Berry; Eunsol Choi; David Harwath; Kyle Mahowald; |
144 | Gradient-based Constrained Sampling from Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Large pretrained language models are successful at generating fluent text but are notoriously hard to controllably sample from. In this work, we study constrained sampling from such language models, i.e., generating text that satisfies user-defined constraints, while maintaining fluency and the model's performance in a downstream task. |
Sachin Kumar; Biswajit Paria; Yulia Tsvetkov; |
145 | TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions Over Tabular Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, auto-regressive PLMs are challenged by recent emerging numerical reasoning datasets, such as TAT-QA, due to the error-prone implicit calculation. In this paper, we present TaCube, to pre-compute aggregation/arithmetic results for the table in advance, so that they are handy and readily available for PLMs to answer numerical reasoning questions. |
Fan Zhou; Mengkang Hu; Haoyu Dong; Zhoujun Cheng; Fan Cheng; Shi Han; Dongmei Zhang; |
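The TaCube highlight describes pre-computing aggregation and arithmetic results over a table so they are "handy and readily available" at answer time. A minimal sketch of that pre-computation idea follows; the column names and the particular set of aggregators are illustrative assumptions, not the paper's exact design:

```python
# Sketch of TaCube-style pre-computation: aggregates over numeric columns are
# computed ahead of time, so a reader model can copy a result instead of
# performing error-prone implicit arithmetic during generation.
from statistics import mean

def precompute_cube(table, numeric_columns):
    """Build a lookup of (aggregator, column) -> pre-computed value."""
    ops = {"sum": sum, "avg": mean, "max": max, "min": min}
    cube = {}
    for col in numeric_columns:
        values = [row[col] for row in table]
        for name, op in ops.items():
            cube[(name, col)] = op(values)
    return cube

# Toy table with one numeric column (hypothetical data).
table = [{"year": 2020, "revenue": 10.0}, {"year": 2021, "revenue": 14.0}]
cube = precompute_cube(table, ["revenue"])
```

A numerical question like "what was the average revenue?" can then be answered by retrieving `cube[("avg", "revenue")]` rather than having the language model compute it implicitly.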
146 | Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we simulate knowledge conflicts (i.e., where parametric knowledge suggests one answer and different passages suggest different answers) and examine model behaviors. |
Hung-Ting Chen; Michael Zhang; Eunsol Choi; |
147 | QA Domain Adaptation Using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel self-supervised framework called QADA for QA domain adaptation. |
Zhenrui Yue; Huimin Zeng; Bernhard Kratzwald; Stefan Feuerriegel; Dong Wang; |
148 | When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel domain specific Financial LANGuage model (FLANG) which uses financial keywords and phrases for better masking, together with span boundary objective and in-filing objective. |
Raj Shah; Kunal Chawla; Dheeraj Eidnani; Agam Shah; Wendi Du; Sudheer Chava; Natraj Raman; Charese Smiley; Jiaao Chen; Diyi Yang; |
149 | Retrieval As Attention: End-to-end Learning of Retrieval and Reading Within A Single Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These two components are usually modeled separately, which necessitates a cumbersome implementation and is awkward to optimize in an end-to-end fashion. In this paper, we revisit this design and eschew the separate architecture and training in favor of a single Transformer that performs retrieval as attention (RAA), and end-to-end training solely based on supervision from the end QA task. |
Zhengbao Jiang; Luyu Gao; Zhiruo Wang; Jun Araki; Haibo Ding; Jamie Callan; Graham Neubig; |
150 | Reproducibility in Computational Linguistics: Is Source Code Enough? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work studies trends in source code availability at major computational linguistics conferences, namely, ACL, EMNLP, LREC, NAACL, and COLING. |
Mohammad Arvan; Luís Pina; Natalie Parde; |
151 | Generating Information-Seeking Conversations from Unlabeled Documents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a novel framework, **SimSeek** (**Sim**ulating information-**Seek**ing conversation from unlabeled documents), and compare its two variants. |
Gangwoo Kim; Sungdong Kim; Kang Min Yoo; Jaewoo Kang; |
152 | Distill The Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thus, in this work, we introduce IKD-MMT, a novel MMT framework to support the image-free inference phase via an inversion knowledge distillation scheme. |
Ru Peng; Yawen Zeng; Jake Zhao; |
153 | A Multifaceted Framework to Evaluate Evasion, Content Preservation, and Misattribution in Authorship Obfuscation Techniques Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by recent work on cross-topic authorship identification and content preservation in summarization, we re-evaluate different authorship obfuscation techniques on detection evasion and content preservation. |
Malik Altakrori; Thomas Scialom; Benjamin C. M. Fung; Jackie Chi Kit Cheung; |
154 | SafeText: A Benchmark for Exploring Physical Safety in Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We create the first benchmark dataset, SafeText, comprising real-life scenarios with paired safe and physically unsafe pieces of advice. |
Sharon Levy; Emily Allaway; Melanie Subbiah; Lydia Chilton; Desmond Patton; Kathleen McKeown; William Yang Wang; |
155 | Ground-Truth Labels Matter: A Deeper Look Into Input-Label Demonstrations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Intrigued by this counter-intuitive observation, we re-examine the importance of ground-truth labels in in-context learning. |
Kang Min Yoo; Junyeob Kim; Hyuhng Joon Kim; Hyunsoo Cho; Hwiyeol Jo; Sang-Woo Lee; Sang-goo Lee; Taeuk Kim; |
156 | D4: A Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on clinical depression diagnostic criteria ICD-11 and DSM-5, we designed a 3-phase procedure to construct D4: a Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat, which simulates the dialogue between doctors and patients during the diagnosis of depression, including diagnosis results and symptom summary given by professional psychiatrists for each conversation. |
Binwei Yao; Chao Shi; Likai Zou; Lingfeng Dai; Mengyue Wu; Lu Chen; Zhen Wang; Kai Yu; |
157 | Exploiting Domain-slot Related Keywords Description for Few-Shot Cross-Domain Dialogue State Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel framework based on domain-slot related description to tackle the challenge of few-shot cross-domain DST. |
Gao Qixiang; Guanting Dong; Yutao Mou; Liwen Wang; Chen Zeng; Daichi Guo; Mingyang Sun; Weiran Xu; |
158 | CoCoa: An Encoder-Decoder Model for Controllable Code-switched Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present CoCoa, an encoder-decoder translation model that converts monolingual Hindi text to Hindi-English code-switched text with both encoder-side and decoder-side interventions to achieve fine-grained controllable generation. |
Sneha Mondal; Ritika .; Shreya Pathak; Preethi Jyothi; Aravindan Raghuveer; |
159 | Towards Climate Awareness in NLP Research Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a remedy, we propose a climate performance model card with the primary purpose of being practically usable with only limited information about experiments and the underlying computer hardware. |
Daniel Hershcovich; Nicolas Webersinke; Mathias Kraus; Julia Bingler; Markus Leippold; |
160 | Navigating Connected Memories with A Task-oriented Dialog System Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose dialogs for connected memories as a powerful tool to empower users to search their media collection through a multi-turn, interactive conversation. |
Satwik Kottur; Seungwhan Moon; Alborz Geramifard; Babak Damavandi; |
161 | Language Model Decomposition: Quantifying The Dependency and Correlation of Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, a theoretical framework for studying their relationships is still missing. In this paper, we fill this gap by investigating the linear dependency between pre-trained LMs. |
Hao Zhang; |
162 | SynGEC: Syntax-Enhanced Grammatical Error Correction with A Tailored GEC-Oriented Parser Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work proposes a syntax-enhanced grammatical error correction (GEC) approach named SynGEC that effectively incorporates dependency syntactic information into the encoder part of GEC models. |
Yue Zhang; Bo Zhang; Zhenghua Li; Zuyi Bao; Chen Li; Min Zhang; |
163 | Varifocal Question Generation for Fact-checking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present Varifocal, a method that generates questions based on different focal points within a given claim, i.e. different spans of the claim and its metadata, such as its source and date. |
Nedjma Ousidhoum; Zhangdie Yuan; Andreas Vlachos; |
164 | Bilingual Lexicon Induction for Low-Resource Languages Using Graph Matching Via Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we improve bilingual lexicon induction performance across 40 language pairs with a graph-matching method based on optimal transport. |
Kelly Marchisio; Ali Saad-Eldin; Kevin Duh; Carey Priebe; Philipp Koehn; |
165 | Whose Language Counts As High Quality? Measuring Language Ideologies in Text Data Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Using a new dataset of U.S. high school newspaper articles—written by students from across the country—we investigate whose language is preferred by the quality filter used for GPT-3. |
Suchin Gururangan; Dallas Card; Sarah Dreier; Emily Gade; Leroy Wang; Zeyu Wang; Luke Zettlemoyer; Noah A. Smith; |
166 | ConReader: Exploring Implicit Relations in Contracts for Contract Clause Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study automatic Contract Clause Extraction (CCE) by modeling implicit relations in legal contracts. |
Weiwen Xu; Yang Deng; Wenqiang Lei; Wenlong Zhao; Tat-Seng Chua; Wai Lam; |
167 | Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current approaches for Natural Language Understanding (NLU) tasks use CL to improve in-distribution data performance often via heuristic-oriented or task-agnostic difficulties. In this work, instead, we employ CL for NLU by taking advantage of training dynamics as difficulty metrics, i.e., statistics that measure the behavior of the model at hand on specific task-data instances during training and propose modifications of existing CL schedulers based on these statistics. |
Fenia Christopoulou; Gerasimos Lampouras; Ignacio Iacobacci; |
168 | Revisiting Parameter-Efficient Tuning: Are We Really There Yet? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By tuning just a fraction of the parameters compared to full model finetuning, PETuning methods claim to have achieved performance on par with or even better than finetuning. In this work, we take a step back and re-examine these PETuning methods by conducting the first comprehensive investigation into their training and evaluation. |
Guanzheng Chen; Fangyu Liu; Zaiqiao Meng; Shangsong Liang; |
169 | Transfer Learning from Semantic Role Labeling to Event Argument Extraction with Template-based Slot Querying Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate transfer learning from semantic role labeling (SRL) to event argument extraction (EAE), considering their similar argument structures. |
Zhisong Zhang; Emma Strubell; Eduard Hovy; |
170 | Calibrating Zero-shot Cross-lingual (Un-)structured Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study different post-training calibration methods in structured and unstructured prediction tasks. |
Zhengping Jiang; Anqi Liu; Benjamin Van Durme; |
171 | PRINCE: Prefix-Masked Decoding for Knowledge Enhanced Sequence-to-Sequence Pre-Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces a simple yet effective pre-training paradigm, equipped with a knowledge-enhanced decoder that predicts the next entity token with noises in the prefix, explicitly strengthening the representation learning of entities that span over multiple input tokens. |
Song Xu; Haoran Li; Peng Yuan; Youzheng Wu; Xiaodong He; |
172 | How Far Are We from Robust Long Abstractive Summarization? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Abstractive summarization has made tremendous progress in recent years. In this work, we perform fine-grained human annotations to evaluate long document abstractive summarization systems (i.e., models and metrics) with the aim of implementing them to generate reliable summaries. |
Huan Yee Koh; Jiaxin Ju; He Zhang; Ming Liu; Shirui Pan; |
173 | Measuring Context-Word Biases in Lexical Semantic Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We question this assumption by presenting the first quantitative analysis on the context-word interaction being tested in major contextual lexical semantic tasks. To achieve this, we run probing baselines on masked input, and propose measures to calculate and visualize the degree of context or word biases in existing datasets. |
Qianchu Liu; Diana McCarthy; Anna Korhonen; |
174 | Iteratively Prompt Pre-trained Language Models for Chain of Thought Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explore an iterative prompting framework, a new prompting paradigm which progressively elicits relevant knowledge from PLMs for multi-step inference. |
Boshi Wang; Xiang Deng; Huan Sun; |
175 | Unobserved Local Structures Make Compositional Generalization Hard Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the factors that make generalization to certain test instances challenging. |
Ben Bogin; Shivanshu Gupta; Jonathan Berant; |
176 | Mitigating Data Sparsity for Short Text Topic Modeling By Topic-Semantic Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To better address data sparsity, in this paper we propose a novel short text topic modeling framework, Topic-Semantic Contrastive Topic Model (TSCTM). |
Xiaobao Wu; Anh Tuan Luu; Xinshuai Dong; |
177 | Back to The Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent studies of dialogue modeling commonly employ pre-trained language models (PrLMs) to encode the dialogue history as successive tokens, which is insufficient in capturing the temporal characteristics of dialogues. Therefore, we propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder, which explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks. |
Yiyang Li; Hai Zhao; Zhuosheng Zhang; |
178 | Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Similarly, the explanations may tell us when the model might know and when it does not. Inspired by this, we propose a method named CME that leverages model explanations to make the model less confident with non-inductive attributions. |
Dongfang Li; Baotian Hu; Qingcai Chen; |
179 | Non-Autoregressive Neural Machine Translation: A Call for Clarity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we take a step back and revisit several techniques that have been proposed for improving non-autoregressive translation models and compare their combined translation quality and speed implications under third-party testing environments. |
Robin Schmidt; Telmo Pires; Stephan Peitz; Jonas Lööf; |
180 | RED-ACE: Robust Error Detection for ASR Using Confidence Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we add an ASR Confidence Embedding (ACE) layer to the AED model’s encoder, allowing us to jointly encode the confidence scores and the transcribed text into a contextualized representation. |
Zorik Gekhman; Dina Zverinski; Jonathan Mallinson; Genady Beryozkin; |
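The RED-ACE highlight describes adding a confidence embedding layer that jointly encodes ASR confidence scores with the transcribed tokens. A minimal sketch of that general idea, assuming a simple bucketing scheme and small illustrative dimensions (not the paper's actual architecture or sizes):

```python
# Sketch of a confidence-embedding layer in the spirit of RED-ACE: each
# token's ASR confidence score is bucketed, mapped to a learned vector, and
# added to the token embedding before the encoder sees it.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, BUCKETS = 100, 16, 10  # illustrative sizes
token_emb = rng.normal(size=(VOCAB, DIM))   # stand-in for learned embeddings
conf_emb = rng.normal(size=(BUCKETS, DIM))  # one vector per confidence bucket

def embed(token_ids, confidences):
    """Combine token and confidence embeddings into one contextual input."""
    buckets = np.minimum((np.array(confidences) * BUCKETS).astype(int),
                         BUCKETS - 1)
    return token_emb[np.array(token_ids)] + conf_emb[buckets]

x = embed([3, 7], [0.95, 0.20])  # two tokens with high / low ASR confidence
```

The design choice here is that confidence enters as an additive embedding, so the downstream error-detection encoder can condition on both what was transcribed and how sure the ASR system was.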
181 | Fast-R2D2: A Pretrained Recursive Neural Network Based on Pruned CKY for Grammar Induction and Text Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, its rule-based pruning process suffers from local optima and slow inference. In this paper, we propose a unified R2D2 method that overcomes these issues. |
Xiang Hu; Haitao Mi; Liang Li; Gerard de Melo; |
182 | A Localized Geometric Method to Match Knowledge in Low-dimensional Hyperbolic Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a localized geometric method to find equivalent entities in hyperbolic space. |
Bo Hui; Tian Xia; Wei-Shinn Ku; |
183 | Memory-assisted Prompt Editing to Improve GPT-3 After Deployment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Large LMs such as GPT-3 are powerful, but can commit mistakes that are obvious to humans. For example, GPT-3 would mistakenly interpret “What word is similar to good?” to mean a homophone, while the user intended a synonym. Our goal is to effectively correct such errors via user interactions with the system but without retraining, which will be prohibitively costly. |
Aman Madaan; Niket Tandon; Peter Clark; Yiming Yang; |
184 | LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we first propose the Multilingual MMT task by establishing two new Multilingual MMT benchmark datasets covering seven languages. Then, an effective baseline LVP-M3 using visual prompts is proposed to support translations between different languages, which includes three stages (token encoding, language-aware visual prompt generation, and language translation). |
Hongcheng Guo; Jiaheng Liu; Haoyang Huang; Jian Yang; Zhoujun Li; Dongdong Zhang; Zheng Cui; |
185 | PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to formulate EHRs generation as a text-to-text translation task by language models (LMs), which suffices to highly flexible event imputation during generation. |
Zifeng Wang; Jimeng Sun; |
186 | ROSE: Robust Selective Fine-tuning for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they are still limited due to redundant attack search spaces and the inability to defend against various types of attacks. In this work, we present a novel fine-tuning approach called RObust SEletive fine-tuning (ROSE) to address this issue. |
Lan Jiang; Hao Zhou; Yankai Lin; Peng Li; Jie Zhou; Rui Jiang; |
187 | CodeRetriever: A Large Scale Contrastive Pre-Training Method for Code Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the CodeRetriever model, which learns the function-level code semantic representations through large-scale code-text contrastive pre-training. |
Xiaonan Li; Yeyun Gong; Yelong Shen; Xipeng Qiu; Hang Zhang; Bolun Yao; Weizhen Qi; Daxin Jiang; Weizhu Chen; Nan Duan; |
188 | Open-Topic False Information Detection on Social Networks with Contrastive Adversarial Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this open-topic scenario, we empirically find that the existing models suffer from impairment in the detection performance for seen or unseen topic data, resulting in poor overall model performance. To address this issue, we propose a novel Contrastive Adversarial Learning Network, CALN, that employs an unsupervised topic clustering method to capture topic-specific features to enhance the model's performance for seen topics and an unsupervised adversarial learning method to align data representation distributions to enhance the model's generalisation to unseen topics. |
Guanghui Ma; Chunming Hu; Ling Ge; Hong Zhang; |
189 | Mitigating Inconsistencies in Multimodal Sentiment Analysis Under Uncertain Missing Modalities Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle the issue, we propose an Ensemble-based Missing Modality Reconstruction (EMMR) network to detect and recover semantic features of the key missing modality. |
Jiandian Zeng; Jiantao Zhou; Tianyi Liu; |
190 | ConvTrans: Transforming Web Search Sessions for Conversational Dense Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present ConvTrans, a data augmentation method that can automatically transform easily-accessible web search sessions into conversational search sessions to fundamentally alleviate the data scarcity problem for conversational dense retrieval. |
Kelong Mao; Zhicheng Dou; Hongjin Qian; Fengran Mo; Xiaohua Cheng; Zhao Cao; |
191 | MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a pioneering exploration that expands event detection to the scenarios involving informal and heterogeneous texts, we propose a new large-scale Chinese event detection dataset based on user reviews, text conversations, and phone conversations in a leading e-commerce platform for food service. |
Xiangyu Xi; Jianwei Lv; Shuaipeng Liu; Wei Ye; Fan Yang; Guanglu Wan; |
192 | Reproducibility Issues for BERT-based Evaluation Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we ask whether results and claims from four recent BERT-based metrics can be reproduced. |
Yanran Chen; Jonas Belouadi; Steffen Eger; |
193 | Improving Multi-task Stance Detection with Multi-task Interaction Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, they neglect to explore capturing the fine-grained task-specific interaction between stance detection and sentiment tasks, thus degrading performance. To address this issue, this paper proposes a novel multi-task interaction network (MTIN) for improving the performance of stance detection and sentiment analysis tasks simultaneously. |
Heyan Chai; Siyu Tang; Jinhao Cui; Ye Ding; Binxing Fang; Qing Liao; |
194 | Neural-based Mixture Probabilistic Query Embedding for Answering FOL Queries on Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Neural-based Mixture Probabilistic Query Embedding Model (NMP-QEM) that encodes the answer set of each mini-query as a mixed Gaussian distribution with multiple means and covariance parameters, which can approximate any random distribution arbitrarily well in real KGs. |
Xiao Long; Liansheng Zhuang; Li Aodi; Shafei Wang; Houqiang Li; |
195 | Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In comparison, multi-turn ES conversation systems can provide ES more effectively, but face several new technical challenges, including: (1) how to adopt appropriate support strategies to achieve the long-term dialogue goal of comforting the user's emotion; (2) how to dynamically model the user's state. In this paper, we propose a novel system MultiESC to address these issues. |
Yi Cheng; Wenge Liu; Wenjie Li; Jiashuo Wang; Ruihui Zhao; Bang Liu; Xiaodan Liang; Yefeng Zheng; |
196 | Conformal Predictor for Improving Zero-Shot Text Classification Efficiency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we improve the efficiency of such cross-encoder-based 0shot models by restricting the number of likely labels using another fast base classifier-based conformal predictor (CP) calibrated on samples labeled by the 0shot model. |
Prafulla Kumar Choubey; Yu Bai; Chien-Sheng Wu; Wenhao Liu; Nazneen Rajani; |
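The efficiency idea in this highlight, running the expensive cross-encoder only on a restricted label set kept by a fast, calibrated base classifier, can be sketched as follows. The scores, threshold, and scoring functions are all illustrative stand-ins, not the paper's actual models or calibration procedure:

```python
# Sketch of conformal-predictor-based label restriction for zero-shot
# classification: a cheap base classifier keeps a prediction set of
# plausible labels, and only those survivors are scored by the slow
# zero-shot cross-encoder.
def conformal_set(base_scores, threshold):
    """Keep every label whose calibrated base score clears the threshold."""
    return [label for label, s in base_scores.items() if s >= threshold]

def classify(base_scores, threshold, slow_scorer):
    candidates = conformal_set(base_scores, threshold) or list(base_scores)
    # The expensive model scores only the surviving candidates.
    return max(candidates, key=slow_scorer)

# Toy example: the base classifier prunes "science"; the slow scorer then
# decides between the two remaining labels.
base = {"sports": 0.7, "politics": 0.25, "science": 0.05}
label = classify(base, 0.2,
                 lambda l: {"sports": 0.4, "politics": 0.6}.get(l, 0.0))
```

The saving comes from the pruning step: the cross-encoder's cost scales with the number of candidate labels, so a well-calibrated prediction set that keeps coverage high while discarding most labels makes zero-shot inference substantially cheaper.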
197 | Effective and Efficient Query-aware Snippet Extraction for Web Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an effective query-aware webpage snippet extraction method named DeepQSE. |
Jingwei Yi; Fangzhao Wu; Chuhan Wu; Xiaolong Huang; Binxing Jiao; Guangzhong Sun; Xing Xie; |
198 | You Only Need One Model for Open-domain Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This allows us to use a single question answering model trained end-to-end, which is a more efficient use of model capacity and also leads to better gradient flow. We present a pre-training method to effectively train this architecture and evaluate our model on the Natural Questions and TriviaQA open datasets. |
Haejun Lee; Akhil Kedia; Jongwon Lee; Ashwin Paranjape; Christopher Manning; Kyoung-Gu Woo; |
199 | Generative Entity Typing with Curriculum Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The traditional classification-based entity typing paradigm has two unignorable drawbacks: 1) it fails to assign an entity to the types beyond the predefined type set, and 2) it can hardly handle few-shot and zero-shot situations where many long-tail types only have few or even no training instances. To overcome these drawbacks, we propose a novel generative entity typing (GET) paradigm: given a text with an entity mention, the multiple types for the role that the entity plays in the text are generated with a pre-trained language model (PLM). |
Siyu Yuan; Deqing Yang; Jiaqing Liang; Zhixu Li; Jinxi Liu; Jingyue Huang; Yanghua Xiao; |
200 | SetGNER: General Named Entity Recognition As Entity Set Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the observation that the target output of NER is essentially a set of sequences, we propose a novel entity set generation framework for general NER scenes in this paper. |
Yuxin He; Buzhou Tang; |
201 | Opinion Summarization By Weak-Supervision from Mix-structured Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we convert each review into a mix of structured and unstructured data, which we call opinion-aspect pairs (OAs) and implicit sentences (ISs). |
Yizhu Liu; Qi Jia; Kenny Zhu; |
202 | Multi-level Distillation of Semantic Knowledge for Pre-training Multilingual Language Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Multi-level Multilingual Knowledge Distillation (MMKD), a novel method for improving multilingual language models. |
Mingqi Li; Fei Ding; Dan Zhang; Long Cheng; Hongxin Hu; Feng Luo; |
203 | Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to use a query generator as the teacher in the cross-lingual setting, which is less dependent on enough training samples and high-quality negative samples. |
Houxing Ren; Linjun Shou; Ning Wu; Ming Gong; Daxin Jiang; |
204 | R2F: A General Retrieval, Reading and Fusion Framework for Document-level Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we establish a general solution, named Retrieval, Reading and Fusion (R2F) framework, and a new setting, by analyzing the main challenges of DOCNLI: interpretability, long-range dependency, and cross-sentence inference. |
Hao Wang; Yixin Cao; Yangguang Li; Zhen Huang; Kun Wang; Jing Shao; |
205 | Revisiting Pre-trained Language Models and Their Evaluation for Arabic Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we revisit pre-trained language models and their evaluation for Arabic natural language processing. |
Abbas Ghaddar; Yimeng Wu; Sunyam Bagga; Ahmad Rashid; Khalil Bibi; Mehdi Rezagholizadeh; Chao Xing; Yasheng Wang; Xinyu Duan; Zhefeng Wang; Baoxing Huai; Xin Jiang; Qun Liu; Phillippe Langlais; |
206 | KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, most existing approaches for MRC may perform poorly in the few-shot learning scenario. To solve this issue, we propose a novel framework named Knowledge Enhanced Contrastive Prompt-tuning (KECP). |
Jianing Wang; Chengyu Wang; Minghui Qiu; Qiuhui Shi; Hongbin Wang; Jun Huang; Ming Gao; |
207 | Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Then we design multiple continuous prompts rules and transform the knowledge sub-graph into natural language prompts. To further leverage the factual knowledge from these prompts, we propose two novel knowledge-aware self-supervised tasks including prompt relevance inspection and masked prompt modeling. |
Jianing Wang; Wenkang Huang; Minghui Qiu; Qiuhui Shi; Hongbin Wang; Xiang Li; Ming Gao; |
208 | On The Evaluation Metrics for Paraphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we revisit automatic metrics for paraphrase evaluation and obtain two findings that disobey conventional wisdom: (1) Reference-free metrics achieve better performance than their reference-based counterparts. |
Lingfeng Shen; Lemao Liu; Haiyun Jiang; Shuming Shi; |
209 | Curriculum Learning Meets Weakly Supervised Multimodal Correlation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we inject curriculum learning into weakly supervised multimodal correlation learning. |
Sijie Mai; Ya Sun; Haifeng Hu; |
210 | Rethinking Positional Encoding in Tree Transformer for Code Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel tree Transformer encoding node positions based on our new description method for tree structures. |
Han Peng; Ge Li; Yunfei Zhao; Zhi Jin; |
211 | RASAT: Integrating Relational Structures Into Pretrained Seq2Seq Model for Text-to-SQL Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, introducing these structural relations comes at a price: they often result in a specialized model structure, which largely prohibits using large pretrained models in text-to-SQL. To address this problem, we propose RASAT: a Transformer seq2seq architecture augmented with relation-aware self-attention that could leverage a variety of relational structures while inheriting the pretrained parameters from the T5 model effectively. |
Jiexing Qi; Jingyao Tang; Ziwei He; Xiangpeng Wan; Yu Cheng; Chenghu Zhou; Xinbing Wang; Quanshi Zhang; Zhouhan Lin; |
212 | COM-MRC: A COntext-Masked Machine Reading Comprehension Framework for Aspect Sentiment Triplet Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel COntext-Masked MRC (COM-MRC) framework for ASTE. |
Zepeng Zhai; Hao Chen; Fangxiang Feng; Ruifan Li; Xiaojie Wang; |
213 | CEM: Machine-Human Chatting Handoff Via Causal-Enhance Module Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These variables are significantly associated with handoff decisions, resulting in prediction bias and increased cost. Therefore, we propose the Causal-Enhance Module (CEM) by establishing the causal graph of MHCH based on these two variables, which is a simple yet effective module that can be easily plugged into existing MHCH methods. |
Shanshan Zhong; Jinghui Qin; Zhongzhan Huang; Daifeng Li; |
214 | Nearest Neighbor Zero-Shot Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Retrieval-augmented language models (LMs) use non-parametric memory to substantially outperform their non-retrieval counterparts on perplexity-based evaluations, but it is an open question whether they achieve similar gains in few- and zero-shot end-task accuracy. We extensively study one such model, the k-nearest neighbor LM (kNN-LM), showing that the gains marginally transfer. |
Weijia Shi; Julian Michael; Suchin Gururangan; Luke Zettlemoyer; |
215 | Robots-Dont-Cry: Understanding Falsely Anthropomorphic Utterances in Dialog Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We collect human ratings on the feasibility of approximately 900 two-turn dialogs sampled from 9 diverse data sources. |
David Gros; Yu Li; Zhou Yu; |
216 | A Joint Learning Framework for Restaurant Survival Prediction and Explanation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we tackle the practical problem of restaurant survival prediction. |
Xin Li; Xiaojie Zhang; Peng JiaHao; Rui Mao; Mingyang Zhou; Xing Xie; Hao Liao; |
217 | Making Pretrained Language Models Good Long-tailed Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This motivates us to check the hypothesis that prompt-tuning is also a promising choice for long-tailed classification, since the tail classes are intuitively few-shot ones. To achieve this aim, we conduct empirical studies to examine the hypothesis. |
Chen Zhang; Lei Ren; Jingang Wang; Wei Wu; Dawei Song; |
218 | UniGeo: Unifying Geometry Logical Reasoning Via Reformulating Mathematical Expression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, in essence, these two tasks have similar problem representations and overlapping math knowledge, which can improve the understanding and reasoning ability of a deep model on both tasks. Therefore, we construct a large-scale Unified Geometry problem benchmark, UniGeo, which contains 4,998 calculation problems and 9,543 proving problems. |
Jiaqi Chen; Tong Li; Jinghui Qin; Pan Lu; Liang Lin; Chongyu Chen; Xiaodan Liang; |
219 | Face-Sensitive Image-to-Emotional-Text Cross-modal Translation for Multimodal Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a face-sensitive image-to-emotional-text translation (FITE) method, which focuses on capturing visual sentiment cues through facial expressions and selectively matching and fusing with the target aspect in textual modality. |
Hao Yang; Yanyan Zhao; Bing Qin; |
220 | FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we are motivated to propose a multi-dimensional dialogue-level metric, which consists of three sub-metrics with each targeting a specific dimension. |
Chen Zhang; Luis Fernando D’Haro; Qiquan Zhang; Thomas Friedrichs; Haizhou Li; |
221 | Sentence Representation Learning with Generative Objective Rather Than Contrastive Objective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We instead propose a novel generative self-supervised learning objective based on phrase reconstruction. |
Bohong Wu; Hai Zhao; |
222 | RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL). |
Mingkai Deng; Jianyu Wang; Cheng-Ping Hsieh; Yihan Wang; Han Guo; Tianmin Shu; Meng Song; Eric Xing; Zhiting Hu; |
223 | DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new CTG approach, namely DisCup, which incorporates the attribute knowledge of discriminator to optimize the control-prompts, steering a frozen CLM to produce attribute-specific texts. |
Hanqing Zhang; Dawei Song; |
224 | CPL: Counterfactual Prompt Learning for Vision and Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Towards non-spurious and efficient prompt learning from limited examples, this paper presents a novel Counterfactual Prompt Learning (CPL) method for vision and language models, which simultaneously employs counterfactual generation and contrastive learning in a joint optimization framework. |
Xuehai He; Diji Yang; Weixi Feng; Tsu-Jui Fu; Arjun Akula; Varun Jampani; Pradyumna Narayana; Sugato Basu; William Yang Wang; Xin Wang; |
225 | Red Teaming Language Models with Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we automatically find cases where a target LM behaves in a harmful way, by generating test cases (“red teaming”) using another LM. |
Ethan Perez; Saffron Huang; Francis Song; Trevor Cai; Roman Ring; John Aslanides; Amelia Glaese; Nat McAleese; Geoffrey Irving; |
226 | CapOnImage: Context-driven Dense-Captioning on Image Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a new task called captioning on image (CapOnImage), which aims to generate dense captions at different locations of the image based on contextual information. |
Yiqi Gao; Xinglin Hou; Yuanmeng Zhang; Tiezheng Ge; Yuning Jiang; Peng Wang; |
227 | SpanProto: A Two-stage Span-based Prototypical Network for Few-shot Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a seminal span-based prototypical network (SpanProto) that tackles few-shot NER via a two-stage approach, including span extraction and mention classification. |
Jianing Wang; Chengyu Wang; Chuanqi Tan; Minghui Qiu; Songfang Huang; Jun Huang; Ming Gao; |
228 | Discovering Differences in The Representation of People Using Contextualized Semantic Axes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In particular, past work has compared embeddings against “semantic axes” that represent two opposing concepts. We extend this paradigm to BERT embeddings, and construct contextualized axes that mitigate the pitfall where antonyms have neighboring representations. |
Li Lucy; Divya Tadimeti; David Bamman; |
229 | Generating Literal and Implied Subquestions to Fact-check Complex Claims Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on decomposing a complex claim into a comprehensive set of yes-no subquestions whose answers influence the veracity of the claim. |
Jifan Chen; Aniruddh Sriram; Eunsol Choi; Greg Durrett; |
230 | Machine Translation Robustness to Natural Asemantic Variation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: An important yet under-studied category involves minor variations in nuance (non-typos) that preserve meaning w.r.t. the target language. We introduce and formalize this category as Natural Asemantic Variation (NAV) and investigate it in the context of MT robustness. |
Jacob Bremerman; Xiang Ren; Jonathan May; |
231 | Natural Language to Code Translation with Execution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce execution result-based minimum Bayes risk decoding (MBR-EXEC) for program selection and show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks. |
Freda Shi; Daniel Fried; Marjan Ghazvininejad; Luke Zettlemoyer; Sida I. Wang; |
232 | Life Is A Circus and We Are The Clowns: Automatically Finding Analogies Between Situations and Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our goal is to automatically extract entities and their relations from the text and find a mapping between the different domains based on relational similarity (e.g., blood is mapped to water). |
Oren Sultan; Dafna Shahaf; |
233 | Language Contamination Helps Explains The Cross-lingual Capabilities of English Pretrained Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These models are generally presented as being trained only on English text but have been found to transfer surprisingly well to other languages. We investigate this phenomenon and find that common English pretraining corpora actually contain significant amounts of non-English text: even when less than 1% of data is not English (well within the error rate of strong language classifiers), this leads to hundreds of millions of foreign language tokens in large-scale datasets. |
Terra Blevins; Luke Zettlemoyer; |
234 | Analyzing The Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, because these analyses have focused on fully trained multilingual models, little is known about the dynamics of the multilingual pretraining process. We investigate when these models acquire their in-language and cross-lingual abilities by probing checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks. |
Terra Blevins; Hila Gonen; Luke Zettlemoyer; |
235 | Neural Machine Translation with Contrastive Translation Memories Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Different from previous works that make use of mutually similar but redundant translation memories (TMs), we propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence while individually contrastive to each other, providing maximal information gain in three phases. |
Xin Cheng; Shen Gao; Lemao Liu; Dongyan Zhao; Rui Yan; |
236 | Distilling Causal Effect from Miscellaneous Other-Class for Continual Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thanks to the causal inference, we identify that the forgetting is caused by the missing causal effect from the old data. To this end, we propose a unified causal framework to retrieve the causality from both new entity types and Other-Class. |
Junhao Zheng; Zhanxian Liang; Haibin Chen; Qianli Ma; |
237 | Exploring The Secrets Behind The Learning Difficulty of Meaning Representations for Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a data-aware metric called ISS (denoting incremental structural stability) of MRs, and demonstrate that ISS is highly correlated with the final performance. |
Zhenwen Li; Jiaqi Guo; Qian Liu; Jian-Guang Lou; Tao Xie; |
238 | That’s The Wrong Lung! Evaluating and Improving The Interpretability of Unsupervised Multimodal Encoders for Medical Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main finding is that the text has an often weak or unintuitive influence on attention; alignments do not consistently reflect basic anatomical information. |
Jered McInerney; Geoffrey Young; Jan-Willem van de Meent; Byron Wallace; |
239 | Unsupervised Tokenization Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In the presented study, we discover that the so-called “transition freedom” metric appears superior for unsupervised tokenization purposes in comparison to statistical metrics such as mutual information and conditional probability, providing F-measure scores in the range from 0.71 to 1.0 across explored multilingual corpora. |
Anton Kolonin; Vignav Ramesh; |
240 | A Template-based Method for Constrained Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a template-based method that can yield results with high translation quality and match accuracy, while the inference speed of our method is comparable with that of unconstrained NMT models. |
Shuo Wang; Peng Li; Zhixing Tan; Zhaopeng Tu; Maosong Sun; Yang Liu; |
241 | PATS: Sensitivity-aware Noisy Learning for Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present PATS (Perturbation According To Sensitivity), a noisy training mechanism which considers each parameter’s importance in the downstream task to help fine-tune PLMs. |
Yupeng Zhang; Hongzhi Zhang; Sirui Wang; Wei Wu; Zhoujun Li; |
242 | Towards Reinterpreting Neural Topic Models Via Composite Activations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a model-free two-stage process to reinterpret NTM and derive further insights on the state of the trained model. |
Jia Peng Lim; Hady Lauw; |
243 | Few-shot Query-Focused Summarization with Prefix-Merging Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate whether we can integrate and transfer the knowledge of text summarization and question answering to assist few-shot learning in query-focused summarization. |
Ruifeng Yuan; Zili Wang; Ziqiang Cao; Wenjie Li; |
244 | Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, we find that the existing approaches capture few interactions between the input sentence pairs, which degrades the word alignment quality severely, especially for the ambiguous words in the monolingual context. To remedy this problem, we propose Cross-Align to model deep interactions between the input sentence pairs, in which the source and target sentences are encoded separately with the shared self-attention modules in the shallow layers, while cross-lingual interactions are explicitly constructed by the cross-attention modules in the upper layers. |
Siyu Lai; Zhen Yang; Fandong Meng; Yufeng Chen; Jinan Xu; Jie Zhou; |
245 | BERTScore Is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To that end, this work presents the first systematic study on the social bias in PLM-based metrics. |
Tianxiang Sun; Junliang He; Xipeng Qiu; Xuanjing Huang; |
246 | HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To bridge the gap, in this paper, we propose HPT, a Hierarchy-aware Prompt Tuning method to handle HTC from a multi-label MLM perspective. |
Zihan Wang; Peiyi Wang; Tianyu Liu; Binghuai Lin; Yunbo Cao; Zhifang Sui; Houfeng Wang; |
247 | Not to Overfit or Underfit The Source Domains? An Empirical Study of Domain Generalization in Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we examine the contrasting view that multi-source domain generalization (DG) is first and foremost a problem of mitigating source domain underfitting: models not adequately learning the signal already present in their multi-domain training data. |
Md Arafat Sultan; Avi Sil; Radu Florian; |
248 | Neural Theory-of-Mind? On The Limits of Social Intelligence in Large LMs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we examine the open question of social intelligence and Theory of Mind in modern NLP systems from an empirical and theory-based perspective. |
Maarten Sap; Ronan Le Bras; Daniel Fried; Yejin Choi; |
249 | Improving Passage Retrieval with Zero-Shot Question Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. |
Devendra Sachan; Mike Lewis; Mandar Joshi; Armen Aghajanyan; Wen-tau Yih; Joelle Pineau; Luke Zettlemoyer; |
250 | Summarizing Community-based Question-Answer Pairs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To help users quickly digest the key information, we propose the novel CQA summarization task that aims to create a concise summary from CQA pairs. |
Ting-Yao Hsu; Yoshi Suhara; Xiaolan Wang; |
251 | Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Unlike prior work, we show that improved interpretability can be achieved without decreasing the predictive accuracy. |
Joe Stacey; Pasquale Minervini; Haim Dubossarsky; Marek Rei; |
252 | How to Disagree Well: Investigating The Dispute Tactics Used on Wikipedia Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Disagreements are frequently studied from the perspective of either detecting toxicity or analysing argument structure. We propose a framework of dispute tactics which unifies these two perspectives, as well as other dialogue acts which play a role in resolving disputes, such as asking questions and providing clarification. |
Christine De Kock; Andreas Vlachos; |
253 | Chapter Ordering in Novels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Understanding narrative flow and text coherence in long-form documents (novels) remains an open problem in NLP. To gain insight, we explore the task of chapter ordering, reconstructing the original order of chapters in a novel given a random permutation of the text. |
Allen Kim; Steve Skiena; |
254 | Open-ended Knowledge Tracing for Computer Science Education Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop an initial solution to the OKT problem, a student knowledge-guided code generation approach, that combines program synthesis methods using language models with student knowledge tracing methods. |
Naiming Liu; Zichao Wang; Richard Baraniuk; Andrew Lan; |
255 | Logical Neural Networks for Knowledge Base Completion with Embeddings & Rules Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose to utilize logical neural networks (LNN), a powerful neuro-symbolic AI framework that can express both kinds of rules and learn these end-to-end using gradient-based optimization. |
Prithviraj Sen; Breno William Carvalho; Ibrahim Abdelaziz; Pavan Kapanipathi; Salim Roukos; Alexander Gray; |
256 | MedCLIP: Contrastive Learning from Unpaired Medical Images and Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we decouple images and texts for multimodal contrastive learning, thus scaling the usable training data in a combinatorial magnitude with low cost. |
Zifeng Wang; Zhenbang Wu; Dinesh Agarwal; Jimeng Sun; |
257 | GA-SAM: Gradient-Strength Based Adaptive Sharpness-Aware Minimization for Improved Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, it is difficult to apply SAM to some natural language tasks, especially to models with drastic gradient changes, such as RNNs. In this work, we analyze the relation between the flatness of the local minimum and its generalization ability from a novel and straightforward theoretical perspective. |
Zhiyuan Zhang; Ruixuan Luo; Qi Su; Xu Sun; |
258 | Sparse Teachers Can Be Dense with Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To remove the parameters that result in student-unfriendliness, we propose a sparse teacher trick under the guidance of an overall knowledgeable score for each teacher parameter. |
Yi Yang; Chen Zhang; Dawei Song; |
259 | BBTv2: Towards A Gradient-Free Future with Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present BBTv2, an improved version of Black-Box Tuning, to drive PTMs for few-shot learning. |
Tianxiang Sun; Zhengfu He; Hong Qian; Yunhua Zhou; Xuanjing Huang; Xipeng Qiu; |
260 | Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Retriever-reader models achieve competitive performance across many different NLP tasks such as open question answering and dialogue conversations. In this work, we notice these models easily overfit the top-rank retrieval passages and standard training fails to reason over the entire retrieval passages. |
Shujian Zhang; Chengyue Gong; Xingchao Liu; |
261 | Mixed-effects Transformers for Hierarchical Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce the mixed-effects transformer (MET), a novel approach for learning hierarchically-structured prefixes (lightweight modules prepended to an input sequence) to account for structured variation in language use. |
Julia White; Noah Goodman; Robert Hawkins; |
262 | On Measuring The Intrinsic Few-Shot Hardness of Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider an extensive set of recent few-shot learning methods and show that their performance across a large number of datasets is highly correlated, showing that few-shot hardness may be intrinsic to datasets, for a given pre-trained model. |
Xinran Zhao; Shikhar Murty; Christopher Manning; |
263 | Group Is Better Than Individual: Exploiting Label Topologies and Label Relations for Joint Multiple Intent Detection and Slot Filling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, in this paper, we first construct a Heterogeneous Label Graph (HLG) containing two kinds of topologies: (1) statistical dependencies based on labels’ co-occurrence patterns and hierarchies in slot labels; (2) rich relations among the label nodes. Then we propose a novel model termed ReLa-Net. It can capture beneficial correlations among the labels from HLG. |
Bowen Xing; Ivor Tsang; |
264 | An Empirical Study on Finding Spans Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks. |
Weiwei Gu; Boyuan Zheng; Yunmo Chen; Tongfei Chen; Benjamin Van Durme; |
265 | MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features. To deal with these issues, we propose MGDoc, a new multi-modal multi-granular pre-training framework that encodes page-level, region-level, and word-level information at the same time. |
Zilong Wang; Jiuxiang Gu; Chris Tensmeyer; Nikolaos Barmpalios; Ani Nenkova; Tong Sun; Jingbo Shang; Vlad Morariu; |
266 | Understanding Jargon: Combining Extraction and Generation for Definition Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to combine extraction and generation for jargon definition modeling: first extract self- and correlative definitional information of target jargon from the Web and then generate the final definitions by incorporating the extracted definitional information. |
Jie Huang; Hanyin Shao; Kevin Chen-Chuan Chang; Jinjun Xiong; Wen-mei Hwu; |
267 | ProsocialDialog: A Prosocial Backbone for Conversational Agents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Most existing dialogue systems fail to respond properly to potentially unsafe user utterances by either ignoring or passively agreeing with them. To address this issue, we introduce ProsocialDialog, the first large-scale multi-turn dialogue dataset to teach conversational agents to respond to problematic content following social norms. |
Hyunwoo Kim; Youngjae Yu; Liwei Jiang; Ximing Lu; Daniel Khashabi; Gunhee Kim; Yejin Choi; Maarten Sap; |
268 | Exploiting Global and Local Hierarchies for Hierarchical Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To exploit global and local hierarchies, we propose Hierarchy-guided BERT with Global and Local hierarchies (HBGL), which utilizes the large-scale parameters and prior language knowledge of BERT to model both global and local hierarchies. |
Ting Jiang; Deqing Wang; Leilei Sun; Zhongzhi Chen; Fuzhen Zhuang; Qinghong Yang; |
269 | Semantic-aware Contrastive Learning for More Accurate Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a semantic-aware contrastive learning algorithm, which can learn to distinguish fine-grained meaning representations and take the overall sequence-level semantic into consideration. |
Shan Wu; Chunlei Xin; Bo Chen; Xianpei Han; Le Sun; |
270 | Scientific Paper Extractive Summarization Enhanced By Citation Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings. |
Xiuying Chen; Mingzhe Li; Shen Gao; Rui Yan; Xin Gao; Xiangliang Zhang; |
271 | Hardness-guided Domain Adaptation to Recognise Biomedical Named Entities Under Low-resource Scenarios Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a simple yet effective hardness-guided domain adaptation framework for bioNER tasks that can effectively leverage the domain hardness information to improve the adaptability of the learnt model in the low-resource scenarios. |
Ngoc Dang Nguyen; Lan Du; Wray Buntine; Changyou Chen; Richard Beare; |
272 | Syntactic Multi-view Learning for Open Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we model both constituency and dependency trees into word-level graphs, and enable neural OpenIE to learn from the syntactic structures. |
Kuicai Dong; Aixin Sun; Jung-Jae Kim; Xiaoli Li; |
273 | TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Though previous VLP works have proved the effectiveness of ViTs, they still suffer from the computational inefficiency caused by long visual sequences. To tackle this problem, in this paper, we propose an efficient vision-and-language pre-training model with Text-Relevant Image Patch Selection, namely TRIPS, which reduces the visual sequence progressively with a text-guided patch-selection layer in the visual backbone for efficient training and inference. |
Chaoya Jiang; Haiyang Xu; Chenliang Li; Ming Yan; Wei Ye; Shikun Zhang; Bin Bi; Songfang Huang; |
274 | CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Practical dialog systems need to deal with various knowledge sources, noisy user expressions, and the shortage of annotated data. To better solve the above problems, we propose CGoDial, a new challenging and comprehensive Chinese benchmark for multi-domain Goal-oriented Dialog evaluation. |
Yinpei Dai; Wanwei He; Bowen Li; Yuchuan Wu; Zheng Cao; Zhongqi An; Jian Sun; Yongbin Li; |
275 | Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such two-stage methods scale up the computational complexity of the training process and obstruct valid feature information while mitigating bias. To address this issue, we utilize the representation normalization method which aims at disentangling the correlations between features of encoded sentences. |
SongYang Gao; Shihan Dou; Qi Zhang; Xuanjing Huang; |
276 | A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To solve the common incomplete labeling problem, we propose a unified positive-unlabeled learning framework – shift and squared ranking loss positive-unlabeled (SSR-PU) learning. |
Ye Wang; Xinxin Liu; Wenxin Hu; Tao Zhang; |
277 | Automatic Generation of Socratic Subquestions for Teaching Math Word Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose various guided question generation schemes based on input conditioning and reinforcement learning. |
Kumar Shridhar; Jakub Macina; Mennatallah El-Assady; Tanmay Sinha; Manu Kapur; Mrinmaya Sachan; |
278 | Mixture of Attention Heads: Selecting Attention Heads Per Token Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes the Mixture of Attention Heads (MoA), a new architecture that combines multi-head attention with the MoE mechanism. |
Xiaofeng Zhang; Yikang Shen; Zeyu Huang; Jie Zhou; Wenge Rong; Zhang Xiong; |
279 | The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider the problem of sparsifying BERT models, which are a key building block for natural language processing, in order to reduce their storage and computational cost. |
Eldar Kurtic; Daniel Campos; Tuan Nguyen; Elias Frantar; Mark Kurtz; Benjamin Fineran; Michael Goin; Dan Alistarh; |
280 | Information-Theoretic Text Hallucination Reduction for Video-grounded Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, we design Text Hallucination Mitigating (THAM) framework, which incorporates Text Hallucination Regularization (THR) loss derived from the proposed information-theoretic text hallucination measurement approach. |
Sunjae Yoon; Eunseop Yoon; Hee Suk Yoon; Junyeong Kim; Chang Yoo; |
281 | DSM: Question Generation Over Knowledge Base Via Modeling Diverse Subgraphs with Meta-learner Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show that making use of the past experience on semantically similar subgraphs can reduce the learning difficulty and promote the performance of KBQG models. To achieve this, we propose a novel approach to model diverse subgraphs with meta-learner (DSM). |
Shasha Guo; Jing Zhang; Yanling Wang; Qianyi Zhang; Cuiping Li; Hong Chen; |
282 | RelU-Net: Syntax-aware Graph U-Net for Relational Triple Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This is due to the absence of entity locations, which is the prerequisite for pruning noisy edges from the dependency tree, when extracting relational triples. In this paper, we propose a unified framework to tackle this challenge and incorporate syntactic information for relational triple extraction. |
Yunqi Zhang; Yubo Chen; Yongfeng Huang; |
283 | Evidence > Intuition: Transferability Estimation for Encoder Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to generate quantitative evidence to predict which LM, out of a pool of models, will perform best on a target task without having to fine-tune all candidates. |
Elisa Bassignana; Max Müller-Eberstein; Mike Zhang; Barbara Plank; |
284 | Chunk-based Nearest Neighbor Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a chunk-based kNN-MT model which retrieves chunks of tokens from the datastore, instead of a single token. |
Pedro Henrique Martins; Zita Marinho; André F. T. Martins; |
285 | FiE: Building A Global Probability Space By Leveraging Early Fusion in Encoder for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose to extend transformer encoders with the ability to fuse information from multiple passages, using global representation to provide cross-sample attention over all tokens across samples. |
Akhil Kedia; Mohd Abbas Zaidi; Haejun Lee; |
286 | Inductive Relation Prediction with Logical Reasoning Using Contrastive Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a novel graph convolutional network (GCN)-based model LogCo with logical reasoning by contrastive representations. |
Yudai Pan; Jun Liu; Lingling Zhang; Tianzhe Zhao; Qika Lin; Xin Hu; Qianying Wang; |
287 | Improving Chinese Spelling Check By Character Pronunciation Prediction: The Effects of Adaptivity and Granularity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As most of these spelling errors are caused by phonetic similarity, effectively modeling the pronunciation of Chinese characters is a key factor for CSC. In this paper, we consider introducing an auxiliary task of Chinese pronunciation prediction (CPP) to improve CSC, and, for the first time, systematically discuss the adaptivity and granularity of this auxiliary task. |
Jiahao Li; Quan Wang; Zhendong Mao; Junbo Guo; Yanyan Yang; Yongdong Zhang; |
288 | MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eight widely-spoken languages. |
Anna Currey; Maria Nadejde; Raghavendra Reddy Pappagari; Mia Mayer; Stanislas Lauly; Xing Niu; Benjamin Hsu; Georgiana Dinu; |
289 | A Span-level Bidirectional Network for Aspect Sentiment Triplet Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a span-level bidirectional network which utilizes all possible spans as input and extracts triplets from spans bidirectionally. |
Yuqi Chen; Chen Keming; Xian Sun; Zequn Zhang; |
290 | On The Calibration of Massively Multilingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Overall, our work contributes towards building more reliable multilingual models by highlighting the issue of their miscalibration, understanding what language and model-specific factors influence it, and pointing out the strategies to improve the same. |
Kabir Ahuja; Sunayana Sitaram; Sandipan Dandapat; Monojit Choudhury; |
291 | Momentum Contrastive Pre-training for Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing pre-training methods for extractive Question Answering (QA) generate cloze-like queries different from natural questions in syntax structure, which could overfit pre-trained models to simple keyword matching. In order to address this problem, we propose a novel Momentum Contrastive pRe-training fOr queStion anSwering (MCROSS) method for extractive QA. |
Minda Hu; Muzhi Li; Yasheng Wang; Irwin King; |
292 | A Second Wave of UD Hebrew Treebanking and Cross-Domain Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a new, freely available UD treebank of Hebrew stratified from a range of topics selected from Hebrew Wikipedia. |
Amir Zeldes; Nick Howell; Noam Ordan; Yifat Ben Moshe; |
293 | Finding Dataset Shortcuts with Grammar Induction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to use probabilistic grammars to characterize and discover shortcuts in NLP datasets. |
Dan Friedman; Alexander Wettig; Danqi Chen; |
294 | Retrieval Augmentation for Commonsense Reasoning: A Unified Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we systematically investigate how to leverage commonsense knowledge retrieval to improve commonsense reasoning tasks. |
Wenhao Yu; Chenguang Zhu; Zhihan Zhang; Shuohang Wang; Zhuosheng Zhang; Yuwei Fang; Meng Jiang; |
295 | Open World Classification with Adaptive Negative Samples Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an approach based on Adaptive Negative Samples (ANS) designed to generate effective synthetic open category samples in the training stage and without requiring any prior knowledge or external datasets. |
Ke Bai; Guoyin Wang; Jiwei Li; Sunghyun Park; Sungjin Lee; Puyang Xu; Ricardo Henao; Lawrence Carin; |
296 | Re3: Generating Longer Stories With Recursive Reprompting and Revision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Compared to prior work on shorter stories, long-range plot coherence and relevance are more central challenges here. We propose the Recursive Reprompting and Revision framework (Re3) to address these challenges by (a) prompting a general-purpose language model to construct a structured overarching plan, and (b) generating story passages by repeatedly injecting contextual information from both the plan and current story state into a language model prompt. |
Kevin Yang; Yuandong Tian; Nanyun Peng; Dan Klein; |
297 | Does Joint Training Really Help Cascaded Speech Translation? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we seek to answer the question of whether joint training really helps cascaded speech translation. |
Viet Anh Khoa Tran; David Thulke; Yingbo Gao; Christian Herold; Hermann Ney; |
298 | MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Multiple challenges exist, including the limited availability of annotated training and evaluation datasets as well as the lack of understanding of which settings, languages, and recently proposed methods like cross-lingual transfer will be effective. In this paper, we aim to move towards solutions for these challenges, focusing on the task of named entity recognition (NER). |
David Adelani; Graham Neubig; Sebastian Ruder; Shruti Rijhwani; Michael Beukman; Chester Palen-Michel; Constantine Lignos; Jesujoba Alabi; Shamsuddeen Muhammad; Peter Nabende; Cheikh M. Bamba Dione; Andiswa Bukula; Rooweither Mabuya; Bonaventure F. P. Dossou; Blessing Sibanda; Happy Buzaaba; Jonathan Mukiibi; Godson Kalipe; Derguene Mbaye; Amelia Taylor; Fatoumata Kabore; Chris Chinenye Emezue; Anuoluwapo Aremu; Perez Ogayo; Catherine Gitau; Edwin Munkoh-Buabeng; Victoire Memdjokam Koagne; Allahsera Auguste Tapo; Tebogo Macucwa; Vukosi Marivate; Mboning Tchiaze Elvis; Tajuddeen Gwadabe; Tosin Adewumi; Orevaoghene Ahia; Joyce Nakatumba-Nabende; Neo Lerato Mokono; Ignatius Ezeani; Chiamaka Chukwuneke; Mofetoluwa Oluwaseun Adeyemi; Gilles Quentin Hacheme; Idris Abdulmumin; Odunayo Ogundepo; Oreen Yousuf; Tatiana Moteu; Dietrich Klakow; |
299 | Ethics Consideration Sections in Natural Language Processing Papers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present the results of a manual classification of all ethical consideration sections for ACL 2021. |
Luciana Benotti; Patrick Blackburn; |
300 | Continued Pretraining for Better Zero- and Few-Shot Promptability Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate if a dedicated continued pretraining stage could improve "promptability", i.e., zero-shot performance with natural language prompts or few-shot performance with prompt tuning. |
Zhaofeng Wu; Robert L Logan IV; Pete Walsh; Akshita Bhagia; Dirk Groeneveld; Sameer Singh; Iz Beltagy; |
301 | Less Is More: Summary of Long Instructions Is Better for Program Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that LMs benefit from the summarized version of complicated questions. |
Kirby Kuznia; Swaroop Mishra; Mihir Parmar; Chitta Baral; |
302 | Is A Question Decomposition Unit All We Need? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: With the growing number of new benchmarks, we build bigger and more complex LMs. |
Pruthvi Patel; Swaroop Mishra; Mihir Parmar; Chitta Baral; |
303 | Discourse-Aware Soft Prompting for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that structured design of prefix parameters yields more coherent, faithful and relevant generations than the baseline prefix-tuning on all generation tasks. |
Marjan Ghazvininejad; Vladimir Karpukhin; Vera Gor; Asli Celikyilmaz; |
304 | ExPUNations: Augmenting Puns with Keywords and Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present the ExPUNations (ExPUN) dataset, in which we augment an existing dataset of puns with detailed crowdsourced annotations of keywords denoting the most distinctive words that make the text funny, pun explanations describing why the text is funny, and fine-grained funniness ratings. |
Jiao Sun; Anjali Narayan-Chen; Shereen Oraby; Alessandra Cervone; Tagyoung Chung; Jing Huang; Yang Liu; Nanyun Peng; |
305 | SLING: Sino Linguistic Evaluation of Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To understand what kinds of linguistic knowledge are encoded by pretrained Chinese language models (LMs), we introduce the benchmark of Sino LINGuistics (SLING), which consists of 38K minimal sentence pairs in Mandarin Chinese grouped into 9 high-level linguistic phenomena. |
Yixiao Song; Kalpesh Krishna; Rajesh Bhatt; Mohit Iyyer; |
306 | Context-Situated Pun Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a new task, context-situated pun generation, where a specific context represented by a set of keywords is provided, and the task is to first identify suitable pun words that are appropriate for the context, then generate puns based on the context keywords and the identified pun words. |
Jiao Sun; Anjali Narayan-Chen; Shereen Oraby; Shuyang Gao; Tagyoung Chung; Jing Huang; Yang Liu; Nanyun Peng; |
307 | Retrieval-Augmented Generative Question Answering for Event Argument Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a retrieval-augmented generative QA model (R-GQA) for event argument extraction. |
Xinya Du; Heng Ji; |
308 | Concadia: Towards Image-Based Text Generation with A Purpose Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Descriptions focus on visual features and are meant to replace an image (often to increase accessibility), whereas captions appear alongside an image to supply additional information. To motivate this distinction and help people put it into practice, we introduce the publicly available Wikipedia-based dataset Concadia consisting of 96,918 images with corresponding English-language descriptions, captions, and surrounding context. |
Elisa Kreiss; Fei Fang; Noah Goodman; Christopher Potts; |
309 | Context Matters for Image Descriptions for Accessibility: Challenges for Referenceless Evaluation Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The fundamental shortcoming of these metrics is that they do not take context into account, whereas contextual information is highly valued by BLV users. To substantiate these claims, we present a study with BLV participants who rated descriptions along a variety of dimensions. |
Elisa Kreiss; Cynthia Bennett; Shayan Hooshmand; Eric Zelikman; Meredith Ringel Morris; Christopher Potts; |
310 | MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a comprehensive benchmark to investigate models' logical reasoning capabilities in complex real-life scenarios. |
Yinya Huang; Hongming Zhang; Ruixin Hong; Xiaodan Liang; Changshui Zhang; Dong Yu; |
311 | Explicit Query Rewriting for Conversational Dense Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a model CRDR that can perform query rewriting and context modelling in a unified framework in which the query rewriting's supervision signals further enhance the context modelling. |
Hongjin Qian; Zhicheng Dou; |
312 | Efficient Nearest Neighbor Emotion Classification with BERT-whitening Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose kNN-EC, a simple and efficient non-parametric emotion classification (EC) method using nearest neighbor retrieval. |
Wenbiao Yin; Lin Shang; |
313 | FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes FastClass, an efficient weakly-supervised classification approach. |
Tingyu Xia; Yue Wang; Yuan Tian; Yi Chang; |
314 | Neural-Symbolic Inference for Robust Autoregressive Graph Parsing Via Compositional Uncertainty Quantification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study compositionality-aware approach to neural-symbolic inference informed by model confidence, performing fine-grained neural-symbolic reasoning at subgraph level (i.e., nodes and edges) and precisely targeting subgraph components with high uncertainty in the neural parser. |
Zi Lin; Jeremiah Liu; Jingbo Shang; |
315 | A Speaker-Aware Co-Attention Framework for Medical Dialogue Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, in this paper, we propose a speaker-aware co-attention framework for medical dialogue information extraction. |
Yuan Xia; Zhenhui Shi; Jingbo Zhou; Jiayu Xu; Chao Lu; Yehui Yang; Lei Wang; Haifeng Huang; Xia Zhang; Junwei Liu; |
316 | Towards Interactivity and Interpretability: A Rationale-based Legal Judgment Prediction Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Following the judge's real trial logic, in this paper, we propose a novel Rationale-based Legal Judgment Prediction (RLJP) framework. |
Yiquan Wu; Yifei Liu; Weiming Lu; Yating Zhang; Jun Feng; Changlong Sun; Fei Wu; Kun Kuang; |
317 | RelCLIP: Adapting Language-Image Pretraining for Visual Relationship Detection Via Relational Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce compact language information of relation labels for regularizing the representation learning of visual relations. |
Yi Zhu; Zhaoqing Zhu; Bingqian Lin; Xiaodan Liang; Feng Zhao; Jianzhuang Liu; |
318 | Candidate Soups: Fusing Candidate Results Improves Translation Quality for Non-Autoregressive Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a simple but effective method called "Candidate Soups," which can obtain high-quality translations while maintaining the inference speed of NAT models. |
Huanran Zheng; Wei Zhu; Pengfei Wang; Xiaoling Wang; |
319 | Evaluating Parameter Efficient Learning for Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present comparisons between PERMs and finetuning from three new perspectives: (1) the effect of sample and model size on in-domain evaluations, (2) generalization to unseen domains and new datasets, and (3) the faithfulness of generations. |
Peng Xu; Mostofa Patwary; Shrimai Prabhumoye; Virginia Adams; Ryan Prenger; Wei Ping; Nayeon Lee; Mohammad Shoeybi; Bryan Catanzaro; |
320 | McQueen: A Benchmark for Multimodal Conversational Query Rewrite Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the task of multimodal conversational query rewrite (McQR), which performs query rewrite under the multimodal visual conversation setting. |
Yifei Yuan; Chen Shi; Runze Wang; Liyi Chen; Feijun Jiang; Yuan You; Wai Lam; |
321 | Self-supervised Graph Masking Pre-training for Graph-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Additionally, PLMs are typically pre-trained on free text which introduces domain mismatch between pre-training and downstream G2T generation tasks. To address these shortcomings, we propose graph masking pre-training strategies that neither require supervision signals nor adjust the architecture of the underlying pre-trained encoder-decoder model. |
Jiuzhou Han; Ehsan Shareghi; |
322 | Improving Stability of Fine-Tuning Pretrained Language Models Via Component-Wise Gradient Norm Clipping Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we first point out that this method does not always work out due to the different convergence speeds of different layers/modules. Inspired by this observation, we propose a simple component-wise gradient norm clipping method to adjust the convergence speed for different components. |
Chenghao Yang; Xuezhe Ma; |
323 | Differentially Private Language Models for Secure Data Sharing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we approach the problem at hand using global differential privacy, particularly by training a generative language model in a differentially private manner and consequently sampling data from it. |
Justus Mattern; Zhijing Jin; Benjamin Weggenmann; Bernhard Schoelkopf; Mrinmaya Sachan; |
324 | Conditional Set Generation Using Seq2seq Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel algorithm for effectively sampling informative orders over the combinatorial space of label orders. |
Aman Madaan; Dheeraj Rajagopal; Niket Tandon; Yiming Yang; Antoine Bosselut; |
325 | Analyzing and Evaluating Faithfulness in Dialogue Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we first perform a fine-grained human analysis on the faithfulness of dialogue summaries and observe that over 35% of generated summaries are faithfully inconsistent with respect to the source dialogues. Furthermore, we present a new model-level faithfulness evaluation method. |
Bin Wang; Chen Zhang; Yan Zhang; Yiming Chen; Haizhou Li; |
326 | Twist Decoding: Diverse Generators Guide Each Other Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Twist decoding, a simple and general text generation algorithm that benefits from diverse models at inference time. |
Jungo Kasai; Keisuke Sakaguchi; Ronan Le Bras; Hao Peng; Ximing Lu; Dragomir Radev; Yejin Choi; Noah A. Smith; |
327 | Exploring Representation-level Augmentation for Code Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore augmentation methods that augment data (both code and query) at representation level which does not require additional data processing and training, and based on this we propose a general format of representation-level augmentation that unifies existing methods. |
Haochen Li; Chunyan Miao; Cyril Leung; Yanxian Huang; Yuan Huang; Hongyu Zhang; Yanlin Wang; |
328 | Learning Semantic Textual Similarity Via Topic-informed Discrete Latent Variables Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop a topic-informed discrete latent variable model for semantic textual similarity, which learns a shared latent space for sentence-pair representation via vector quantization. |
Erxin Yu; Lan Du; Yuan Jin; Zhepei Wei; Yi Chang; |
329 | STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel type of dialogue summarization task – STRUctured DiaLoguE Summarization (STRUDEL) – that can help pre-trained language models to better understand dialogues and improve their performance on important dialogue comprehension tasks. |
Borui Wang; Chengcheng Feng; Arjun Nair; Madelyn Mao; Jai Desai; Asli Celikyilmaz; Haoran Li; Yashar Mehdad; Dragomir Radev; |
330 | Competency-Aware Neural Machine Translation: Can Machine Translation Know Its Own Translation Quality? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This is in sharp contrast to human translators who give feedback or conduct further investigations whenever they are in doubt about predictions. To fill this gap, we propose a novel competency-aware NMT by extending conventional NMT with a self-estimator, offering abilities to translate a source sentence and estimate its competency. |
Pei Zhang; Baosong Yang; Hao-Ran Wei; Dayiheng Liu; Kai Fan; Luo Si; Jun Xie; |
331 | PASTA: Table-Operations Aware Fact Verification Via Sentence-Table Cloze Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, progress has been limited due to the lack of datasets that can be used to pre-train language models (LMs) to be aware of common table operations, such as aggregating a column or comparing tuples. To bridge this gap, this paper introduces PASTA for table-based fact verification via pre-training with synthesized sentence-table cloze questions. |
Zihui Gu; Ju Fan; Nan Tang; Preslav Nakov; Xiaoman Zhao; Xiaoyong Du; |
332 | Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose SentiWSP, a novel Sentiment-aware pre-trained language model with combined Word-level and Sentence-level Pre-training tasks. |
Shuai Fan; Chen Lin; Haonan Li; Zhenghao Lin; Jinsong Su; Hang Zhang; Yeyun Gong; Jian Guo; Nan Duan; |
333 | Towards Multi-Modal Sarcasm Detection Via Hierarchical Congruity Modeling with Knowledge Enhancement Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel hierarchical framework for sarcasm detection by exploring both the atomic-level congruity based on multi-head cross attentions and the composition-level congruity based on graph neural networks, where a post with low congruity can be identified as sarcasm. |
Hui Liu; Wenya Wang; Haoliang Li; |
334 | Efficiently Tuned Parameters Are Task Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we anticipate that task-specific parameters updated in parameter-efficient tuning methods are likely to encode task-specific information. |
Wangchunshu Zhou; Canwen Xu; Julian McAuley; |
335 | COPEN: Probing Conceptual Knowledge in Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by knowledge representation schemata, we comprehensively evaluate conceptual knowledge of PLMs by designing three tasks to probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively. |
Hao Peng; Xiaozhi Wang; Shengding Hu; Hailong Jin; Lei Hou; Juanzi Li; Zhiyuan Liu; Qun Liu; |
336 | Capturing Global Structural Information in Long Document Question Answering with Compressive Graph Selector Network Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these methods usually ignore the global structure of the long document, which is essential for long-range understanding. To tackle this problem, we propose Compressive Graph Selector Network (CGSN) to capture the global structure in a compressive and iterative manner. |
Yuxiang Nie; Heyan Huang; Wei Wei; Xian-Ling Mao; |
337 | Structural Generalization Is Hard for Sequence-to-sequence Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, recent work on compositional generalization has shown that seq2seq models achieve very low accuracy in generalizing to linguistic structures that were not seen in training. We present new evidence that this is a general limitation of seq2seq models that is present not just in semantic parsing, but also in syntactic parsing and in text-to-text tasks, and that this limitation can often be overcome by neurosymbolic models that have linguistic knowledge built in. |
Yuekun Yao; Alexander Koller; |
338 | Contrastive Learning Enhanced Author-Style Headline Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Besides, we propose two methods to use the learned stylistic features to guide both the pointer and the decoder during the generation. |
Hui Liu; Weidong Guo; Yige Chen; Xiangyang Li; |
339 | Multi-Granularity Optimization for Non-Autoregressive Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This assumption is further strengthened by cross-entropy loss, which encourages a strict match between the hypothesis and the reference token by token. To alleviate this issue, we propose multi-granularity optimization for NAT, which collects model behaviours on translation segments of various granularities and integrates feedback for backpropagation. |
Yafu Li; Leyang Cui; Yongjing Yin; Yue Zhang; |
340 | Super-NaturalInstructions: Generalization Via Declarative Instructions on 1600+ NLP Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Furthermore, we build Tk-Instruct, a transformer model trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples). |
Yizhong Wang; Swaroop Mishra; Pegah Alipoormolabashi; Yeganeh Kordi; Amirreza Mirzaei; Atharva Naik; Arjun Ashok; Arut Selvan Dhanasekaran; Anjana Arunkumar; David Stap; Eshaan Pathak; Giannis Karamanolakis; Haizhi Lai; Ishan Purohit; Ishani Mondal; Jacob Anderson; Kirby Kuznia; Krima Doshi; Kuntal Kumar Pal; Maitreya Patel; Mehrad Moradshahi; Mihir Parmar; Mirali Purohit; Neeraj Varshney; Phani Rohitha Kaza; Pulkit Verma; Ravsehaj Singh Puri; Rushang Karia; Savan Doshi; Shailaja Keyur Sampat; Siddhartha Mishra; Sujan Reddy A; Sumanta Patro; Tanay Dixit; Xudong Shen; |
341 | MetaFill: Text Infilling for Meta-Path Generation on Heterogeneous Information Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing meta-path generation approaches cannot fully exploit the rich textual information in HINs, such as node names and edge type names. To address this problem, we propose MetaFill, a text-infilling-based approach for meta-path generation. |
Zequn Liu; Kefei Duan; Junwei Yang; Hanwen Xu; Ming Zhang; Sheng Wang; |
342 | DRLK: Dynamic Hierarchical Reasoning with Language Model and Knowledge Graph for Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose DRLK (Dynamic Hierarchical Reasoning with Language Model and Knowledge Graphs), a novel model that utilizes dynamic hierarchical interactions between the QA context and KG for reasoning. |
Miao Zhang; Rufeng Dai; Ming Dong; Tingting He; |
343 | AEG: Argumentative Essay Generation Via A Dual-Decoder Model with Content Planning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new task, Argumentative Essay Generation (AEG). |
Jianzhu Bao; Yasheng Wang; Yitong Li; Fei Mi; Ruifeng Xu; |
344 | BotsTalk: Machine-sourced Framework for Automatic Curation of Large-scale Multi-skill Dialogue Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To build open-domain chatbots that are able to use diverse communicative skills, we propose a novel framework BotsTalk, where multiple agents grounded to the specific target skills participate in a conversation to automatically annotate multi-skill dialogues. |
Minju Kim; Chaehyeong Kim; Yong Ho Song; Seung-won Hwang; Jinyoung Yeo; |
345 | Wider & Closer: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, a mixture of short-channel distillers (MSD) method is proposed to fully interact the rich hierarchical information in the teacher model and to transfer knowledge to the student model sufficiently and efficiently. |
Jun-Yu Ma; Beiduo Chen; Jia-Chen Gu; Zhenhua Ling; Wu Guo; Quan Liu; Zhigang Chen; Cong Liu; |
346 | An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To combine the strength of both approaches, we propose the Efficient Memory-Augmented Transformer (EMAT) – it encodes external knowledge into a key-value memory and exploits the fast maximum inner product search for memory querying. |
Yuxiang Wu; Yu Zhao; Baotian Hu; Pasquale Minervini; Pontus Stenetorp; Sebastian Riedel; |
347 | Supervised Prototypical Contrastive Learning for Emotion Recognition in Conversation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a Supervised Prototypical Contrastive Learning (SPCL) loss for the ERC task. |
Xiaohui Song; Longtao Huang; Hui Xue; Songlin Hu; |
348 | RuCoLA: Russian Corpus of Linguistic Acceptability Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce the Russian Corpus of Linguistic Acceptability (RuCoLA), built from the ground up under the well-established binary LA approach. |
Vladislav Mikhailov; Tatiana Shamardina; Max Ryabinin; Alena Pestova; Ivan Smurov; Ekaterina Artemova; |
349 | Complex Hyperbolic Knowledge Graph Embeddings with Fast Fourier Transform Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper aims to utilize the representation capacity of the complex hyperbolic geometry in multi-relational KG embeddings. |
Huiru Xiao; Xin Liu; Yangqiu Song; Ginny Wong; Simon See; |
350 | Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. |
Longxu Dou; Yan Gao; Xuqi Liu; Mingyang Pan; Dingzirui Wang; Wanxiang Che; Dechen Zhan; Min-Yen Kan; Jian-Guang Lou; |
351 | Should We Ban English NLP for A Year? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Many have argued that it is almost impossible to mitigate inequality amplification. I argue that, on the contrary, it is quite simple to do so, and that counter-measures would have little-to-no negative impact, except for, perhaps, in the very short term. |
Anders Søgaard; |
352 | LittleBird: Efficient Faster & Longer Transformer for Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: But it has a limitation dealing with long inputs due to its attention mechanism. Longformer, ETC and BigBird addressed this issue and effectively solved the quadratic dependency problem. However, we find that these models are not sufficient, and propose LittleBird, a novel model based on BigBird with improved speed and memory footprint while maintaining accuracy. |
Minchul Lee; Kijong Han; Myeong Cheol Shin; |
353 | WeTS: A Benchmark for Translation Suggestion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To break these limitations mentioned above and spur the research in TS, we create a benchmark dataset, called WeTS, which is a golden corpus annotated by expert translators on four translation directions. |
Zhen Yang; Fandong Meng; Yingxue Zhang; Ernan Li; Jie Zhou; |
354 | Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In order to enable zero-shot ST, we propose a novel Discrete Cross-Modal Alignment (DCMA) method that employs a shared discrete vocabulary space to accommodate and match both modalities of speech and text. |
Chen Wang; Yuchen Liu; Boxing Chen; Jiajun Zhang; Wei Luo; Zhongqiang Huang; Chengqing Zong; |
355 | Abstractive Summarization Guided By Latent Hierarchical Document Structure Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this shortcoming, we propose a hierarchy-aware graph neural network (HierGNN) which captures such dependencies through three main steps: 1) learning a hierarchical document structure through a latent structure tree learned by a sparse matrix-tree computation; 2) propagating sentence information over this structure using a novel message-passing node propagation mechanism to identify salient information; 3) using graph-level attention to concentrate the decoder on salient information. |
Yifu Qiu; Shay B. Cohen; |
356 | Explainable Question Answering Based on Semantic Graph By Global Differentiable Learning and Dynamic Adaptive Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To alleviate it, we propose a simple yet effective Global Differentiable Learning strategy to explore optimal reasoning paths from the latent probability space so that the model learns to solve intermediate reasoning processes without expert annotations. |
Jianguo Mao; Wenbin Jiang; Xiangdong Wang; Hong Liu; Yu Xia; Yajuan Lyu; QiaoQiao She; |
357 | DuReader-Retrieval: A Large-scale Chinese Benchmark for Passage Retrieval from Web Search Engine Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present DuReader-retrieval, a large-scale Chinese dataset for passage retrieval. |
Yifu Qiu; Hongyu Li; Yingqi Qu; Ying Chen; QiaoQiao She; Jing Liu; Hua Wu; Haifeng Wang; |
358 | Pair-Based Joint Encoding with Relational Graph Convolutional Networks for Emotion-Cause Pair Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This leads to an imbalance in inter-task feature interaction, where features extracted later have no direct contact with the former. To address this issue, we propose a novel **P**air-**B**ased **J**oint **E**ncoding (**PBJE**) network, which generates pair and clause features simultaneously in a joint feature encoding manner to model the causal relationship in clauses. |
Junlong Liu; Xichen Shang; Qianli Ma; |
359 | Affective Knowledge Enhanced Multiple-Graph Fusion Networks for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel multi-graph fusion network (MGFN) based on latent graph to leverage the richer syntax dependency relation label information and affective semantic information of words. |
Siyu Tang; Heyan Chai; Ziyi Yao; Ye Ding; Cuiyun Gao; Binxing Fang; Qing Liao; |
360 | IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present the IndicNLG Benchmark, a collection of datasets for benchmarking NLG for 11 Indic languages. |
Aman Kumar; Himani Shrotriya; Prachi Sahu; Amogh Mishra; Raj Dabre; Ratish Puduppully; Anoop Kunchukuttan; Mitesh M. Khapra; Pratyush Kumar; |
361 | Improving Machine Translation with Phrase Pair Injection and Corpus Filtering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that the combination of Phrase Pair Injection and Corpus Filtering boosts the performance of Neural Machine Translation (NMT) systems. |
Akshay Batheja; Pushpak Bhattacharyya; |
362 | An Anchor-based Relative Position Embedding Method for Cross-Modal Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a unified position embedding method for these problems, called AnChor-basEd Relative Position Embedding (ACE-RPE), in which we first introduce an anchor locating mechanism to bridge the semantic gap and locate anchors from different modalities. |
Ya Wang; Xingwu Sun; Lian Fengzong; ZhanHui Kang; Chengzhong Xu Xu; |
363 | Norm-based Noisy Corpora Filtering and Refurbishing in Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a norm-based noisy corpora filtering and refurbishing method with no external data and costly scorers. |
Yu Lu; Jiajun Zhang; |
364 | TeleMelody: Lyric-to-Melody Generation with A Template-Based Two-Stage Method Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop TeleMelody, a two-stage lyric-to-melody generation system with music template (e.g., tonality, chord progression, rhythm pattern, and cadence) to bridge the gap between lyrics and melodies (i.e., the system consists of a lyric-to-template module and a template-to-melody module). |
Zeqian Ju; Peiling Lu; Xu Tan; Rui Wang; Chen Zhang; Songruoyao Wu; Kejun Zhang; Xiang-Yang Li; Tao Qin; Tie-Yan Liu; |
365 | SEEN: Structured Event Enhancement Network for Explainable Need Detection of Information Recall Assistance Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a pilot model, the structured event enhancement network (SEEN), that detects life event inconsistency, additional information in life events, and forgotten events. |
You-En Lin; An-Zi Yen; Hen-Hsen Huang; Hsin-Hsi Chen; |
366 | Rethinking Style Transformer with Energy-based Interpretation: Adversarial Unsupervised Style Transfer Using A Pretrained Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, adversarial training significantly degrades fluency compared to the other two metrics. In this work, we explain this phenomenon using energy-based interpretation, and leverage a pretrained language model to improve fluency. |
Hojun Cho; Dohee Kim; Seungwoo Ryu; ChaeHun Park; Hyungjong Noh; Jeong-in Hwang; Minseok Choi; Edward Choi; Jaegul Choo; |
367 | Towards Robust K-Nearest-Neighbor Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To alleviate the impact of noise, we propose a confidence-enhanced kNN-MT model with robust training. |
Hui Jiang; Ziyao Lu; Fandong Meng; Chulun Zhou; Jie Zhou; Degen Huang; Jinsong Su; |
368 | Tiny-NewsRec: Effective and Efficient PLM-based News Recommendation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Tiny-NewsRec, which can improve both the effectiveness and the efficiency of PLM-based news recommendation. |
Yang Yu; Fangzhao Wu; Chuhan Wu; Jingwei Yi; Qi Liu; |
369 | TABS: Efficient Textual Adversarial Attack for Pre-trained NL Code Model Using Semantic Beam Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose TABS, an efficient beam search black-box adversarial attack method. |
YunSeok Choi; Hyojun Kim; Jee-Hyong Lee; |
370 | Investigating The Robustness of Natural Language Generation from Logical Forms Via Counterfactual Samples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: State-of-the-art methods based on pre-trained models have achieved remarkable performance on the standard test dataset. However, we question whether these methods really learn how to perform logical reasoning, rather than just relying on the spurious correlations between the headers of the tables and operators of the logical form. |
Chengyuan Liu; Leilei Gan; Kun Kuang; Fei Wu; |
371 | Helping The Weak Makes You Strong: Simple Multi-Task Learning Improves Non-Autoregressive Translators Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a simple and model-agnostic multi-task learning framework to provide more informative learning signals. |
Xinyou Wang; Zaixiang Zheng; Shujian Huang; |
372 | RACE: Retrieval-augmented Commit Message Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose RACE, a new retrieval-augmented neural commit message generation method, which treats the retrieved similar commit as an exemplar and leverages it to generate an accurate commit message. |
Ensheng Shi; Yanlin Wang; Wei Tao; Lun Du; Hongyu Zhang; Shi Han; Dongmei Zhang; Hongbin Sun; |
373 | PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a Pretrained Logical Form Generator (PLOG) framework to improve generation fidelity. |
Ao Liu; Haoyu Dong; Naoaki Okazaki; Shi Han; Dongmei Zhang; |
374 | GHAN: Graph-Based Hierarchical Aggregation Network for Text-Video Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, there are structural and semantic differences between text and video, making this approach challenging for fine-grained understanding. In order to solve this, we propose an end-to-end graph-based hierarchical aggregation network for text-video retrieval according to the hierarchy possessed by text and video. |
Yahan Yu; Bojie Hu; Yu Li; |
375 | MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering Over Images and Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these methods are restricted to retrieving only textual knowledge, neglecting the ubiquitous amount of knowledge in other modalities like images, much of which contains information not covered by any text. To address this limitation, we propose the first Multimodal Retrieval-Augmented Transformer (MuRAG), which accesses an external non-parametric multimodal memory to augment language generation. |
Wenhu Chen; Hexiang Hu; Xi Chen; Pat Verga; William Cohen; |
376 | PHEE: A Dataset for Pharmacovigilance Event Extraction from Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present PHEE, a novel dataset for pharmacovigilance comprising over 5000 annotated events from medical case reports and biomedical literature, making it the largest such public dataset to date. |
Zhaoyue Sun; Jiazheng Li; Gabriele Pergola; Byron Wallace; Bino John; Nigel Greene; Joseph Kim; Yulan He; |
377 | OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for Extreme Multi-label Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the labels in XMTC tasks are essentially an unordered set rather than an ordered sequence, the default order of labels restrains Seq2Seq models in training. To address this limitation in Seq2Seq, we propose an autoregressive sequence-to-set model for XMTC tasks named OTSeq2Set. |
Jie Cao; Yin Zhang; |
378 | SimQA: Detecting Simultaneous MT Errors Through Word-by-Word Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Yet, evaluations of simultaneous machine translation (SimulMT) fail to capture if systems correctly translate the most salient elements of a question: people, places, and dates. To address this problem, we introduce a downstream word-by-word question answering evaluation task (SimQA): given a source language question, translate the question word by word into the target language, and answer as soon as possible. |
HyoJung Han; Marine Carpuat; Jordan Boyd-Graber; |
379 | Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide a novel view of projecting away language-specific factors from a multilingual embedding space. |
Zhihui Xie; Handong Zhao; Tong Yu; Shuai Li; |
380 | Rethinking The Authorship Verification Experimental Setups Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we improve the experimental setup by proposing five new public splits over the PAN dataset, specifically designed to isolate and identify biases related to the text topic and to the author's writing style. |
Florin Brad; Andrei Manolache; Elena Burceanu; Antonio Barbalau; Radu Tudor Ionescu; Marius Popescu; |
381 | Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To better glue the cross-modal semantics therein, we capture hinting features from user comments, which are retrieved via jointly leveraging visual and lingual similarity. |
Chunpu Xu; Jing Li; |
382 | Training Language Models with Memory Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present TRIME, a novel yet simple training approach designed for training LMs with memory augmentation. |
Zexuan Zhong; Tao Lei; Danqi Chen; |
383 | Data-Efficient Strategies for Expanding Hate Speech Detection Into Under-Resourced Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To mitigate these issues, we explore data-efficient strategies for expanding hate speech detection into under-resourced languages. In a series of experiments with mono- and multilingual models across five non-English languages, we find that 1) a small amount of target-language fine-tuning data is needed to achieve strong performance, 2) the benefits of using more such data decrease exponentially, and 3) initial fine-tuning on readily-available English data can partially substitute target-language data and improve model generalisability. |
Paul Röttger; Debora Nozza; Federico Bianchi; Dirk Hovy; |
384 | Dimension Reduction for Efficient Dense Retrieval Via Conditional Autoencoder Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To reduce the embedding dimensions of dense retrieval, this paper proposes a Conditional Autoencoder (ConAE) to compress the high-dimensional embeddings to maintain the same embedding distribution and better recover the ranking features. |
Zhenghao Liu; Han Zhang; Chenyan Xiong; Zhiyuan Liu; Yu Gu; Xiaohua Li; |
385 | Controlled Text Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Concretely, we formalize Controlled Text Reduction as a standalone task, whose input is a source text with marked spans of targeted content ("highlighting"). |
Aviv Slobodkin; Paul Roit; Eran Hirsch; Ori Ernst; Ido Dagan; |
386 | Questioning The Validity of Summarization Datasets and Improving Their Factual Consistency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Due to this lack of well-defined formulation, a large number of popular abstractive summarization datasets are constructed in a manner that neither guarantees validity nor meets one of the most essential criteria of summarization: factual consistency. In this paper, we address this issue by combining state-of-the-art factual consistency models to identify the problematic instances present in popular summarization datasets. |
Yanzhu Guo; Chloé Clavel; Moussa Kamal Eddine; Michalis Vazirgiannis; |
387 | Invariant Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by recent progress in causal machine learning, in particular the invariant risk minimization (IRM) paradigm, we propose invariant language modeling, a framework for learning invariant representations that generalize better across multiple environments. |
Maxime Peyrard; Sarvjeet Ghotra; Martin Josifoski; Vidhan Agarwal; Barun Patra; Dean Carignan; Emre Kiciman; Saurabh Tiwary; Robert West; |
388 | AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose AdaMix as a general PEFT method that tunes a mixture of adaptation modules (given the underlying PEFT method of choice) introduced in each Transformer layer while keeping most of the PLM weights frozen. |
Yaqing Wang; Sahaj Agarwal; Subhabrata Mukherjee; Xiaodong Liu; Jing Gao; Ahmed Hassan Awadallah; Jianfeng Gao; |
389 | How "Multi" Is Multi-Document Summarization? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Accordingly, it is expected that both reference summaries in MDS datasets, as well as system summaries, would indeed be based on such dispersed information. In this paper, we argue for quantifying and assessing this expectation. |
Ruben Wolhandler; Arie Cattan; Ori Ernst; Ido Dagan; |
390 | BioReader: A Retrieval-Enhanced Text-to-Text Transformer for Biomedical Literature Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce BioReader, the first retrieval-enhanced text-to-text model for biomedical natural language processing. |
Giacomo Frisoni; Miki Mizutani; Gianluca Moro; Lorenzo Valgimigli; |
391 | T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new approach to perform zero-shot cross-modal transfer between speech and text for translation tasks. |
Paul-Ambroise Duquenne; Hongyu Gong; Benoît Sagot; Holger Schwenk; |
392 | LILA: A Unified Benchmark for Mathematical Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Towards evaluating and improving AI systems in this domain, we propose LILA, a unified mathematical reasoning benchmark consisting of 23 diverse tasks along four dimensions: (i) mathematical abilities, e.g., arithmetic, calculus; (ii) language format, e.g., question-answering, fill-in-the-blanks; (iii) language diversity, e.g., no language, simple language; (iv) external knowledge, e.g., commonsense, physics. |
Swaroop Mishra; Matthew Finlayson; Pan Lu; Leonard Tang; Sean Welleck; Chitta Baral; Tanmay Rajpurohit; Oyvind Tafjord; Ashish Sabharwal; Peter Clark; Ashwin Kalyan; |
393 | Leveraging Affirmative Interpretations from Negation Improves Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by the fact that understanding a negated statement often requires humans to infer affirmative interpretations, in this paper we show that doing so benefits models for three natural language understanding tasks. |
Md Mosharaf Hossain; Eduardo Blanco; |
394 | GraphQ IR: Unifying The Semantic Parsing of Graph Query Languages with One Intermediate Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a unified intermediate representation for graph query languages, named GraphQ IR. |
Lunyiu Nie; Shulin Cao; Jiaxin Shi; Jiuding Sun; Qi Tian; Lei Hou; Juanzi Li; Jidong Zhai; |
395 | InforMask: Unsupervised Informative Masking for Language Model Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose InforMask, a new unsupervised masking strategy for training masked language models. |
Nafis Sadeq; Canwen Xu; Julian McAuley; |
396 | CTRLsum: Towards Generic Controllable Text Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current summarization systems yield generic summaries that are disconnected from users' preferences and expectations. To address this limitation, we present CTRLsum, a generic framework to control generated summaries through a set of keywords. |
Junxian He; Wojciech Kryscinski; Bryan McCann; Nazneen Rajani; Caiming Xiong; |
397 | Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we contrast and compare NLP fact-checking with how professional fact-checkers combat misinformation in the absence of counter-evidence. |
Max Glockner; Yufang Hou; Iryna Gurevych; |
398 | A Framework for Adapting Pre-Trained Language Models to Knowledge Graph Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we conduct a comprehensive exploration of how to best extract and incorporate those embeddings into knowledge graph completion models. |
Justin Lovelace; Carolyn Rosé; |
399 | Mutual Information Alleviates Hallucinations in Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we identify a simple criterion under which models are significantly more likely to assign more probability to hallucinated content during generation: high model uncertainty. |
Liam van der Poel; Ryan Cotterell; Clara Meister; |
400 | Toward The Limitation of Code-Switching in Cross-Lingual Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper mitigates the limitation of the code-switching method by not only making token replacements but also considering the similarity between the context and the switched tokens, so that the newly substituted sentences are grammatically consistent during both training and inference. |
Yukun Feng; Feng Li; Philipp Koehn; |
401 | Syntactically Rich Discriminative Training: An Effective Method for Open Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose several new methods for training neural OIE models in this paper. |
Frank Mtumbuka; Thomas Lukasiewicz; |
402 | Transformer-based Entity Typing in Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel Transformer-based Entity Typing (TET) approach, effectively encoding the content of neighbours of an entity by means of a transformer mechanism. |
Zhiwei Hu; Victor Gutierrez-Basulto; Zhiliang Xiang; Ru Li; Jeff Pan; |
403 | NewsClaims: A New Benchmark for Claim Detection from News with Attribute Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present NewsClaims, a new benchmark for attribute-aware claim detection in the news domain. |
Revanth Gangi Reddy; Sai Chetan Chinthakindi; Zhenhailong Wang; Yi Fung; Kathryn Conger; Ahmed ELsayed; Martha Palmer; Preslav Nakov; Eduard Hovy; Kevin Small; Heng Ji; |
404 | IsoVec: Controlling The Relative Isomorphism of Word Embedding Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We incorporate global measures of isomorphism directly into the skipgram loss function, successfully increasing the relative isomorphism of trained word embedding spaces and improving their ability to be mapped to a shared cross-lingual space. |
Kelly Marchisio; Neha Verma; Kevin Duh; Philipp Koehn; |
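The idea in the IsoVec highlight, adding a global isomorphism measure to the skip-gram objective, can be sketched with a toy penalty term; the Gram-matrix comparison and the weight `lam` below are illustrative assumptions, not the paper's actual measures:

```python
import numpy as np

def isomorphism_penalty(X, Y):
    """Toy global isomorphism measure: mean squared difference between the
    pairwise-similarity (Gram) matrices of two embedding spaces, assumed to
    be row-aligned on a shared seed vocabulary. Illustrates the idea of a
    differentiable isomorphism term, not IsoVec's exact formulation."""
    gx = X @ X.T
    gy = Y @ Y.T
    return float(np.mean((gx - gy) ** 2))

def combined_loss(skipgram_loss, X, Y, lam=0.1):
    # Total objective = skip-gram loss + weighted isomorphism penalty.
    return skipgram_loss + lam * isomorphism_penalty(X, Y)
```

Identical spaces incur zero penalty, so the term only activates when the two spaces' internal geometries diverge.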
405 | Adversarial Concept Erasure in Kernel Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a kernelization of the recently-proposed linear concept-removal objective, and show that it is effective in guarding against the ability of certain nonlinear adversaries to recover the concept. |
Shauli Ravfogel; Francisco Vargas; Yoav Goldberg; Ryan Cotterell; |
406 | The Authenticity Gap in Human Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We suggest improvements to the standard protocol to make it more theoretically sound, but even in its improved form, it cannot be used to evaluate open-ended tasks like story generation. |
Kawin Ethayarajh; Dan Jurafsky; |
407 | BERT in Plutarch's Shadows Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a BERT language model for Ancient Greek. |
Ivan Yamshchikov; Alexey Tikhonov; Yorgos Pantis; Charlotte Schubert; Jürgen Jost; |
408 | Leveraging Locality in Abstractive Text Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the quadratic memory complexity of the self-attention module with respect to the input length hinders their applications in long text summarization. Instead of designing more efficient attention modules, we approach this problem by investigating if models with a restricted context can have competitive performance compared with the memory-efficient attention models that maintain a global context by treating the input as a single sequence. |
Yixin Liu; Ansong Ni; Linyong Nan; Budhaditya Deb; Chenguang Zhu; Ahmed Hassan Awadallah; Dragomir Radev; |
409 | Salience Allocation As Guidance for Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel summarization approach with a flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON). |
Fei Wang; Kaiqiang Song; Hongming Zhang; Lifeng Jin; Sangwoo Cho; Wenlin Yao; Xiaoyang Wang; Muhao Chen; Dong Yu; |
410 | Fine-tuned Language Models Are Continual Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this limitation, we argue that a model should be able to keep extending its knowledge and abilities, without forgetting previous skills. |
Thomas Scialom; Tuhin Chakrabarty; Smaranda Muresan; |
411 | Natural Logic-guided Autoregressive Multi-hop Document Retrieval for Fact Verification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel retrieve-and-rerank method for multi-hop retrieval, that consists of a retriever that jointly scores documents in the knowledge source and sentences from previously retrieved documents using an autoregressive formulation and is guided by a proof system based on natural logic that dynamically terminates the retrieval process if the evidence is deemed sufficient. |
Rami Aly; Andreas Vlachos; |
412 | AX-MABSA: A Framework for Extremely Weakly Supervised Multi-label Aspect Based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore unsupervised language model post-training to improve the overall performance, and propose a multi-label generator model to generate multiple aspect category-sentiment pairs per review sentence. |
Sabyasachi Kamila; Walid Magdy; Sourav Dutta; MingXue Wang; |
413 | Transfer Learning with Synthetic Corpora for Spatial Role Labeling and Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide two new data resources on multiple spatial language processing tasks. |
Roshanak Mirzaee; Parisa Kordjamshidi; |
414 | A Survey of Active Learning for Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we provide a literature review of active learning (AL) for its applications in natural language processing (NLP). |
Zhisong Zhang; Emma Strubell; Eduard Hovy; |
415 | Bernice: A Multilingual Pre-trained Encoder for Twitter Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Bernice, the first multilingual RoBERTa language model trained from scratch on 2. |
Alexandra DeLucia; Shijie Wu; Aaron Mueller; Carlos Aguirre; Philip Resnik; Mark Dredze; |
416 | CEFR-Based Sentence Difficulty Annotation and Assessment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this problem, we created the CEFR-based Sentence Profile (CEFR-SP) corpus, containing 17k English sentences annotated with the levels based on the Common European Framework of Reference for Languages assigned by English-education professionals. |
Yuki Arase; Satoru Uchida; Tomoyuki Kajiwara; |
417 | Simple Questions Generate Named Entity Recognition Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work introduces an ask-to-generate approach, which automatically generates NER datasets by asking simple natural language questions to an open-domain question answering system (e. g. , �Which disease?�) |
Hyunjae Kim; Jaehyo Yoo; Seunghyun Yoon; Jinhyuk Lee; Jaewoo Kang; |
418 | TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce TemporalWiki, a lifelong benchmark for ever-evolving LMs that utilizes the difference between consecutive snapshots of English Wikipedia and English Wikidata for training and evaluation, respectively. |
Joel Jang; Seonghyeon Ye; Changho Lee; Sohee Yang; Joongbo Shin; Janghoon Han; Gyeonghun Kim; Minjoon Seo; |
419 | Bi-Directional Iterative Prompt-Tuning for Event Argument Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a bi-directional iterative prompt-tuning method for EAE, where the EAE task is treated as a cloze-style task to take full advantage of entity information and pre-trained language models (PLMs). |
Lu Dai; Bang Wang; Wei Xiang; Yijun Mo; |
420 | Learning Robust Representations for Continual Relation Extraction Via Adversarial Class Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Most previous work attributes catastrophic forgetting to the corruption of the learned representations as new relations come, with an implicit assumption that the CRE models have adequately learned the old relations. In this paper, through empirical studies we argue that this assumption may not hold, and an important reason for catastrophic forgetting is that the learned representations do not have good robustness against the appearance of analogous relations in the subsequent learning process. |
Peiyi Wang; Yifan Song; Tianyu Liu; Binghuai Lin; Yunbo Cao; Sujian Li; Zhifang Sui; |
421 | ConvFinQA: Exploring The Chain of Numerical Reasoning in Conversational Finance Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the application domain of finance that involves real-world, complex numerical reasoning. |
Zhiyu Chen; Shiyang Li; Charese Smiley; Zhiqiang Ma; Sameena Shah; William Yang Wang; |
422 | A Span-based Multimodal Variational Autoencoder for Semi-supervised Multimodal Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To fuse the text and image features for MNER effectively under semi-supervised setting, we propose a novel span-based multimodal variational autoencoder (SMVAE) model for semi-supervised MNER. |
Baohang Zhou; Ying Zhang; Kehui Song; Wenya Guo; Guoqing Zhao; Hongbin Wang; Xiaojie Yuan; |
423 | R-TeaFor: Regularized Teacher-Forcing for Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Nevertheless, they do not consider the pairwise relationship between the original training data and the modified ones, which provides more information during training. Hence, we propose Regularized Teacher-Forcing (R-TeaFor) to utilize this relationship for better regularization. |
Guan-Yu Lin; Pu-Jen Cheng; |
424 | Modeling Consistency Preference Via Lexical Chains for Document-level Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we aim to relieve the issue of lexical translation inconsistency for document-level neural machine translation (NMT) by modeling consistency preference for lexical chains, which consist of repeated words in a source-side document and provide a representation of the lexical consistency structure of the document. |
Xinglin Lyu; Junhui Li; Shimin Tao; Hao Yang; Ying Qin; Min Zhang; |
425 | Just Fine-tune Twice: Selective Differential Privacy for Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop a novel framework, *Just Fine-tune Twice* (JFT), that achieves SDP for state-of-the-art large transformer-based models. |
Weiyan Shi; Ryan Shea; Si Chen; Chiyuan Zhang; Ruoxi Jia; Zhou Yu; |
426 | Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We argue that disentangling content selection from the budget used to cover salient content improves the performance and applicability of abstractive summarizers. |
Marcio Fonseca; Yftah Ziser; Shay B. Cohen; |
427 | Open-Domain Sign Language Translation Learned from Online Video Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce OpenASL, a large-scale American Sign Language (ASL) – English dataset collected from online video sites (e.g., YouTube). |
Bowen Shi; Diane Brentari; Gregory Shakhnarovich; Karen Livescu; |
428 | Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we empirically observe that temporal generalization is closely affiliated with lexical semantic change, which is one of the essential phenomena of natural languages. |
Zhaochen Su; Zecheng Tang; Xinyan Guan; Lijun Wu; Min Zhang; Juntao Li; |
429 | ULN: Towards Underspecified Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a primary step toward ULN, we propose a VLN framework that consists of a classification module, a navigation agent, and an Exploitation-to-Exploration (E2E) module. |
Weixi Feng; Tsu-Jui Fu; Yujie Lu; William Yang Wang; |
430 | Federated Model Decomposition with Private Vocabulary for Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a federated model decomposition method that protects the privacy of vocabularies, abbreviated as FEDEVOCAB. |
Zhuo Zhang; Xiangjing Hu; Lizhen Qu; Qifan Wang; Zenglin Xu; |
431 | ReCo: Reliable Causal Chain Reasoning Via Structural Causal Recurrent Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In other words, the causal pairs to be spliced may have a conflicting threshold boundary or scenario. To address these issues, we propose a novel Reliable Causal chain reasoning framework (ReCo), which introduces exogenous variables to represent the threshold and scene factors of each causal pair within the causal chain, and estimates the threshold and scene contradictions across exogenous variables via structural causal recurrent neural networks (SRNN). |
Kai Xiong; Xiao Ding; Zhongyang Li; Li Du; Ting Liu; Bing Qin; Yi Zheng; Baoxing Huai; |
432 | Video Question Answering: Datasets, Algorithms and Challenges Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This survey aims to sort out the recent advances in video question answering (VideoQA) and point towards future directions. |
Yaoyao Zhong; Wei Ji; Junbin Xiao; Yicong Li; Weihong Deng; Tat-Seng Chua; |
433 | Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR). |
Deng Cai; Xin Li; Jackie Chun-Sing Ho; Lidong Bing; Wai Lam; |
434 | Breaking The Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a novel representation method for Chinese characters to break the bottlenecks, namely StrokeNet, which represents a Chinese character by a Latinized stroke sequence. |
Zhijun Wang; Xuebo Liu; Min Zhang; |
435 | Boundary-Driven Table-Filling for Aspect Sentiment Triplet Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To overcome these issues, we propose Boundary-Driven Table-Filling (BDTF), which represents each triplet as a relation region in the 2D table and transforms the ASTE task into detection and classification of relation regions. |
Yice Zhang; Yifan Yang; Yihui Li; Bin Liang; Shiwei Chen; Yixue Dang; Min Yang; Ruifeng Xu; |
436 | Attention and Edge-Label Guided Graph Convolutional Networks for Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Attention and Edge-Label guided Graph Convolution Network (AELGCN) model. |
Renjie Zhou; Zhongyi Xie; Jian Wan; Jilin Zhang; Yong Liao; Qiang Liu; |
437 | Title2Event: Benchmarking Open Event Extraction with A Large-scale Chinese Title Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present Title2Event, a large-scale sentence-level dataset benchmarking Open Event Extraction without restricting event types. |
Haolin Deng; Yanan Zhang; Yangfan Zhang; Wangyang Ying; Changlong Yu; Jun Gao; Wei Wang; Xiaoling Bai; Nan Yang; Jin Ma; Xiang Chen; Tianhua Zhou; |
438 | Cascading Biases: Investigating The Effect of Heuristic Annotation Strategies on Data and Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study cognitive heuristic use in the context of annotating multiple-choice reading comprehension datasets. |
Chaitanya Malaviya; Sudeep Bhatia; Mark Yatskar; |
439 | Teaching Broad Reasoning Skills for Multi-Step QA By Generating Hard Contexts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how to use question decompositions to teach language models these broad reasoning skills in a robust fashion. |
Harsh Trivedi; Niranjan Balasubramanian; Tushar Khot; Ashish Sabharwal; |
440 | ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing methods show worse than random guess performance under this scenario. To overcome this limitation, we propose a new technique, ADDMU, adversary detection with data and model uncertainty, which combines two types of uncertainty estimation for both regular and FB adversarial example detection. |
Fan Yin; Yao Li; Cho-Jui Hsieh; Kai-Wei Chang; |
441 | G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, this domain-adaptive pre-training (DAPT (CITATION)) tends to forget the previous general knowledge acquired by general PLMs, which leads to a catastrophic forgetting phenomenon and sub-optimal performance. To alleviate this problem, we propose a new framework of Memory-Augmented Pre-trained Language Model (MAP), which augments the domain-specific PLM by a memory built from the frozen general PLM without losing the general knowledge. |
Zhongwei Wan; Yichun Yin; Wei Zhang; Jiaxin Shi; Lifeng Shang; Guangyong Chen; Xin Jiang; Qun Liu; |
442 | Towards Unifying Reference Expression Generation and Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the problems, we propose a unified model for REG and REC, named UniRef. |
Duo Zheng; Tao Kong; Ya Jing; Jiaan Wang; Xiaojie Wang; |
443 | Textual Manifold-based Defense Against Natural Language Adversarial Examples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we find a similar phenomenon occurs in the contextualized embedding space of natural sentences induced by pretrained language models in which textual adversarial examples tend to have their embeddings diverge off the manifold of natural sentence embeddings. |
Dang Nguyen Minh; Anh Tuan Luu; |
444 | Tiny-Attention Adapter: Contexts Are More Important Than The Number of Parameters Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the effectiveness of using tiny-attention, i.e., attention with extremely small per-head dimensionality, as adapters. |
Hongyu Zhao; Hao Tan; Hongyuan Mei; |
445 | Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the instability in the standard dense retrieval training, which iterates between model training and hard negative selection using the being-trained model. |
Si Sun; Chenyan Xiong; Yue Yu; Arnold Overwijk; Zhiyuan Liu; Jie Bao; |
446 | ATTEMPT: Parameter-Efficient Multi-task Tuning Via Attentional Mixtures of Soft Prompts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work introduces a new multi-task, parameter-efficient language model (LM) tuning method that learns to transfer knowledge across different tasks via a mixture of soft prompts: small prefix embedding vectors pre-trained for different tasks. |
Akari Asai; Mohammadreza Salehi; Matthew Peters; Hannaneh Hajishirzi; |
447 | Exploration of The Usage of Color Terms By Color-blind Participants in Online Discussion Platforms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study this question by making a step forward towards a better understanding of the conceptual perception of colors by color-blind individuals, as reflected in their spontaneous linguistic productions. |
Ella Rabinovich; Boaz Carmeli; |
448 | DEER: Descriptive Knowledge Graph for Explaining Entity Relationships Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To construct DEER, we propose a self-supervised learning method to extract relation descriptions with the analysis of dependency patterns and generate relation descriptions with a transformer-based relation description synthesizing model, where no human labeling is required. |
Jie Huang; Kerui Zhu; Kevin Chen-Chuan Chang; Jinjun Xiong; Wen-mei Hwu; |
449 | META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new TOD architecture: GUI-based task-oriented dialogue system (GUI-TOD). |
Liangtai Sun; Xingyu Chen; Lu Chen; Tianle Dai; Zichen Zhu; Kai Yu; |
450 | Understanding and Improving Knowledge Distillation for Quantization Aware Training of Large Transformer Encoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide an in-depth analysis of the mechanism of KD on attention recovery of quantized large Transformers. |
Minsoo Kim; Sihwa Lee; Suk-Jin Hong; Du-Seong Chang; Jungwook Choi; |
451 | Exploring Mode Connectivity for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the geometric connections of different minima through the lens of mode connectivity, which measures whether two minima can be connected with a low-loss path. |
Yujia Qin; Cheng Qian; Jing Yi; Weize Chen; Yankai Lin; Xu Han; Zhiyuan Liu; Maosong Sun; Jie Zhou; |
452 | Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Translation has played a crucial role in improving the performance on multilingual tasks: (1) to generate the target language data from the source language data for training and (2) to generate the source language data from the target language data for inference. However, prior works have not considered the use of both translations simultaneously. This paper shows that combining them can synergize the results on various multilingual sentence classification tasks. |
Jaehoon Oh; Jongwoo Ko; Se-Young Yun; |
453 | Increasing Visual Awareness in Multimodal Neural Machine Translation from An Information Theoretic Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we endeavor to improve MMT performance by increasing visual awareness from an information theoretic perspective. |
Baijun Ji; Tong Zhang; Yicheng Zou; Bojie Hu; Si Shen; |
454 | Improving Event Coreference Resolution Using Document-level and Topic-level Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: They failed to capture the interactions and contextual cues among those long-distance event mentions. Besides, high-level information, such as event topics, is rarely considered to enhance representation learning for ECR. To address the above two issues, we first apply a Longformer-based encoder to obtain the document-level embeddings and an encoder with a trigger-mask mechanism to learn sentence-level embeddings based on local context. In addition, we propose an event topic generator to infer the latent topic-level representations. |
Sheng Xu; Peifeng Li; Qiaoming Zhu; |
455 | Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A fixed prompt, however, may not generalize well to the diverse kinds of inputs the task comprises. In order to address this, we propose Vector-quantized Input-contextualized Prompts (VIP) as an extension to the soft prompt tuning framework. |
Rishabh Bhardwaj; Amrita Saha; Steven C.H. Hoi; Soujanya Poria; |
456 | Boosting Natural Language Generation from Instructions with Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we investigate whether meta-learning applied to MTIL can further improve generalization to unseen tasks in a zero-shot setting. |
Budhaditya Deb; Ahmed Hassan Awadallah; Guoqing Zheng; |
457 | Topical Segmentation of Spoken Narratives: A Test Case on Holocaust Survivor Testimonies Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We tackle the task of segmenting running (spoken) narratives, which poses hitherto unaddressed challenges. |
Eitan Wagner; Renana Keydar; Amit Pinchevski; Omri Abend; |
458 | Unifying The Convergences in Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel training strategy named LSSD (Language-Specific Self-Distillation), which can alleviate the convergence inconsistency and help MNMT models achieve the best performance on each language pair simultaneously. |
Yichong Huang; Xiaocheng Feng; Xinwei Geng; Bing Qin; |
459 | Modeling Label Correlations for Ultra-Fine Entity Typing with Neural Pairwise Conditional Random Field Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we use an undirected graphical model called pairwise conditional random field (PCRF) to formulate the UFET problem, in which the type variables are not only unarily influenced by the input but also pairwisely relate to all the other type variables. |
Chengyue Jiang; Yong Jiang; Weiqi Wu; Pengjun Xie; Kewei Tu; |
460 | Help Me Write A Poem – Instruction Tuning As A Vehicle for Collaborative Poetry Writing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Building on the prior success of large language models in the realm of computer-assisted creativity, in this work, we present CoPoet, a collaborative poetry writing system, with the goal of studying whether LLMs actually improve the quality of the generated content. |
Tuhin Chakrabarty; Vishakh Padmakumar; He He; |
461 | Open Relation and Event Type Discovery with Type Abstraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This calls for systems that can automatically infer new types from given corpora, a task which we refer to as type discovery. To tackle this problem, we introduce the idea of type abstraction, where the model is prompted to generalize and name the type. |
Sha Li; Heng Ji; Jiawei Han; |
462 | Enhancing Multilingual Language Model with Massive Multilingual Knowledge Triples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explore methods to make better use of the multilingual annotation and language agnostic property of KG triples, and present novel knowledge based multilingual language models (KMLMs) trained directly on the knowledge triples. |
Linlin Liu; Xin Li; Ruidan He; Lidong Bing; Shafiq Joty; Luo Si; |
463 | Revisiting Grammatical Error Correction Evaluation and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To alleviate the limitation, we propose a novel GEC evaluation metric to achieve the best of both worlds, namely PT-M2 which only uses PT-based metrics to score those corrected parts. |
Peiyuan Gong; Xuebo Liu; Heyan Huang; Min Zhang; |
464 | R2D2: Robust Data-to-Text with Replacement Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce R2D2, a training framework that addresses unfaithful Data-to-Text generation by training a system both as a generator and a faithfulness discriminator with additional replacement detection and unlikelihood learning tasks. |
Linyong Nan; Lorenzo Jaime Flores; Yilun Zhao; Yixin Liu; Luke Benson; Weijin Zou; Dragomir Radev; |
465 | IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing Indonesian MRC datasets (Purwarianti et al., 2007; Clark et al., 2020) are still inadequate because of the small size and limited question types, i.e., they only cover answerable questions. To fill this gap, we build a new Indonesian MRC dataset called I(n)don'tKnow-MRC (IDK-MRC) by combining the automatic and manual unanswerable question generation to minimize the cost of manual dataset construction while maintaining the dataset quality. |
Rifki Afina Putri; Alice Oh; |
466 | XLM-D: Decorate Cross-lingual Pre-training Model As Non-Autoregressive Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we establish the connection between a pre-trained masked language model (MLM) and non-autoregressive generation on machine translation. |
Yong Wang; Shilin He; Guanhua Chen; Yun Chen; Daxin Jiang; |
467 | Cross-stitching Text and Knowledge Graph Encoders for Distantly Supervised Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we introduce cross-stitch bi-encoders, which allow full interaction between the text encoder and the KG encoder via a cross-stitch mechanism. |
Qin Dai; Benjamin Heinzerling; Kentaro Inui; |
468 | Assist Non-native Viewers: Multimodal Cross-Lingual Summarization for How2 Videos Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing works are restricted to monolingual video scenarios, ignoring the demands of non-native video viewers to understand the cross-language videos in practical applications. It stimulates us to propose a new task, named Multimodal Cross-Lingual Summarization for videos (MCLS), which aims to generate cross-lingual summaries from multimodal inputs of videos. |
Nayu Liu; Kaiwen Wei; Xian Sun; Hongfeng Yu; Fanglong Yao; Li Jin; Guo Zhi; Guangluan Xu; |
469 | PACIFIC: Towards Proactive Conversational Question Answering Over Tabular and Textual Data in Finance Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To facilitate conversational question answering (CQA) over hybrid contexts in finance, we present a new dataset, named PACIFIC. |
Yang Deng; Wenqiang Lei; Wenxuan Zhang; Wai Lam; Tat-Seng Chua; |
470 | Generative Data Augmentation with Contrastive Learning for Zero-Shot Stance Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Among them, one of the important challenges is to reduce the domain transfer between seen and unseen targets. To tackle this problem, we propose a generative data augmentation approach to generate training samples containing targets and stances for testing data, and map the real samples and generated synthetic samples into the same embedding space with contrastive learning, then perform the final classification based on the augmented data. |
Yang Li; Jiawei Yuan; |
471 | Better Few-Shot Relation Extraction with Label Prompt Dropout Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we present a novel approach called label prompt dropout, which randomly removes label descriptions in the learning process. |
Peiyuan Zhang; Wei Lu; |
472 | Break It Down Into BTS: Basic, Tiniest Subword Units for Korean Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Basic, Tiniest Subword (BTS) units for the Korean language, which are inspired by the invention principle of Hangeul, the Korean writing system. |
Nayeon Kim; Jun-Hyung Park; Joon-Young Choi; Eojin Jeon; Youjin Kang; SangKeun Lee; |
473 | The Devil in Linear Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they usually suffer from degraded performance on various tasks and corpora. In this paper, we examine existing kernel-based linear transformers and identify two key issues that lead to such performance gaps: 1) unbounded gradients in the attention computation adversely impact the convergence of linear transformer models; 2) attention dilution which trivially distributes attention scores over long sequences while neglecting neighbouring structures. |
Zhen Qin; Xiaodong Han; Weixuan Sun; Dongxu Li; Lingpeng Kong; Nick Barnes; Yiran Zhong; |
474 | Zero-Shot Learners for Natural Language Understanding Via A Unified Multiple Choice Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new paradigm for zero-shot learners that is format agnostic, i.e., it is compatible with any format and applicable to a list of language tasks, such as text classification, commonsense reasoning, coreference resolution, and sentiment analysis. |
Ping Yang; Junjie Wang; Ruyi Gan; Xinyu Zhu; Lin Zhang; Ziwei Wu; Xinyu Gao; Jiaxing Zhang; Tetsuya Sakai; |
475 | Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To compress and accelerate Transformer, we propose a Hybrid Tensor-Train (HTT) decomposition, which retains full rank and meanwhile reduces operations and parameters. |
Sunzhu Li; Peng Zhang; Guobing Gan; Xiuqing Lv; Benyou Wang; Junqiu Wei; Xin Jiang; |
476 | FigMemes: A Dataset for Figurative Language Identification in Politically-Opinionated Memes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To enable future research into this area, we first present FigMemes, a dataset for figurative language classification in politically-opinionated memes. |
Chen Liu; Gregor Geigle; Robin Krebs; Iryna Gurevych; |
477 | UniRel: Unified Representation and Interaction for Joint Relational Triple Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Therefore, the rich correlations are not fully exploited by existing works. In this paper, we propose UniRel to address these challenges. |
Wei Tang; Benfeng Xu; Yuyue Zhao; Zhendong Mao; Yifeng Liu; Yong Liao; Haiyong Xie; |
478 | X-FACTOR: A Cross-metric Evaluation of Factual Correctness in Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present X-FACTOR, a cross-evaluation of three high-performing fact-aware abstractive summarization methods. |
Subhajit Chaudhury; Sarathkrishna Swaminathan; Chulaka Gunasekara; Maxwell Crouse; Srinivas Ravishankar; Daiki Kimura; Keerthiram Murugesan; Ramón Fernandez Astudillo; Tahira Naseem; Pavan Kapanipathi; Alexander Gray; |
479 | ParaTag: A Dataset of Paraphrase Tagging for Fine-Grained Labels, NLG Evaluation, and Data Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a novel fine-grained paraphrase annotation schema that labels the minimum spans of tokens in a sentence that don’t have the corresponding paraphrases in the other sentence. |
Shuohang Wang; Ruochen Xu; Yang Liu; Chenguang Zhu; Michael Zeng; |
480 | Factual Accuracy Is Not Enough: Planning Consistent Description Order for Radiology Report Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We employ a planning-based radiology report generation system that generates the overall structure of reports as “plans” prior to generating reports that are accurate and consistent in order. Additionally, we propose a novel reinforcement learning and inference method, Coordinated Planning (CoPlan), that includes a content planner and a text generator to train and infer in a coordinated manner to alleviate the cascading of errors that are often inherent in planning-based models. |
Toru Nishino; Yasuhide Miura; Tomoki Taniguchi; Tomoko Ohkuma; Yuki Suzuki; Shoji Kido; Noriyuki Tomiyama; |
481 | FLUTE: Figurative Language Understanding Through Textual Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet no such data exists for figurative language, making it harder to assess genuine understanding of such expressions. To address this issue, we release FLUTE, a dataset of 9,000 figurative NLI instances with explanations, spanning four categories: Sarcasm, Simile, Metaphor, and Idioms. |
Tuhin Chakrabarty; Arkadiy Saakyan; Debanjan Ghosh; Smaranda Muresan; |
482 | Precisely The Point: Adversarial Augmentations for Faithful and Informative Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we conduct the first quantitative analysis on the robustness of pre-trained Seq2Seq models. |
Wenhao Wu; Wei Li; Jiachen Liu; Xinyan Xiao; Sujian Li; Yajuan Lyu; |
483 | RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose RLET, a Reinforcement Learning based Entailment Tree generation framework, which is trained utilising the cumulative signals across the whole tree. |
Tengxiao Liu; Qipeng Guo; Xiangkun Hu; Yue Zhang; Xipeng Qiu; Zheng Zhang; |
484 | Let The CAT Out of The Bag: Contrastive Attributed Explanations for Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a method Contrastive Attributed explanations for Text (CAT) which provides contrastive explanations for natural language text data with a novel twist as we build and exploit attribute classifiers leading to more semantically meaningful explanations. |
Saneem Chemmengath; Amar Prakash Azad; Ronny Luss; Amit Dhurandhar; |
485 | MonoQA: Multi-Task Learning of Reranking and Answer Extraction for Open-Retrieval Conversational Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the use of Multi-Task Learning (MTL) to improve performance on the ORConvQA task by sharing the reranker and reader’s learned structure in a generative model. |
Sarawoot Kongyoung; Craig Macdonald; Iadh Ounis; |
486 | Composing Ci with Reinforced Non-autoregressive Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, with the format prepared, Ci generation can proceed as an efficient synchronous process, whereas autoregressive models are limited in doing so since they follow a character-by-character generation protocol. Therefore, in this paper, we propose to compose Ci through a non-autoregressive approach, which not only ensures that the generation process accommodates tune patterns by controlling the rhythm and essential meaning of each sentence, but also allows the model to perform synchronous generation. |
Yan Song; |
487 | MetaTKG: Learning Evolutionary Meta-Knowledge for Temporal Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Since existing models highly rely on historical information to learn embeddings for entities, they perform poorly on such entities with little historical information. To tackle these issues, we propose a novel Temporal Meta-learning framework for TKG reasoning, MetaTKG for brevity. |
Yuwei Xia; Mengqi Zhang; Qiang Liu; Shu Wu; Xiao-Yu Zhang; |
488 | MPLUG: Effective and Efficient Vision-Language Learning By Cross-modal Skip-connections Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents mPLUG, a new vision-language foundation model for both cross-modal understanding and generation. |
Chenliang Li; Haiyang Xu; Junfeng Tian; Wei Wang; Ming Yan; Bin Bi; Jiabo Ye; He Chen; Guohai Xu; Zheng Cao; Ji Zhang; Songfang Huang; Fei Huang; Jingren Zhou; Luo Si; |
489 | Q-TOD: A Query-driven Task-oriented Dialogue System Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a novel query-driven task-oriented dialogue system, namely Q-TOD. |
Xin Tian; Yingzhan Lin; Mengfei Song; Siqi Bao; Fan Wang; Huang He; Shuqi Sun; Hua Wu; |
490 | Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce the task of learning unsupervised dialogue embeddings. |
Che Liu; Rui Wang; Junfeng Jiang; Yongbin Li; Fei Huang; |
491 | WR-One2Set: Towards Well-Calibrated Keyphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Nevertheless, we observe serious calibration errors outputted by ONE2SET, especially in the over-estimation of the ∅ token (meaning “no corresponding keyphrase”). In this paper, we deeply analyze this limitation and identify two main reasons behind it: 1) the parallel generation has to introduce excessive ∅ as padding tokens into training instances; and 2) the training mechanism assigning target to each slot is unstable and further aggravates the ∅ token over-estimation. |
Binbin Xie; Xiangpeng Wei; Baosong Yang; Huan Lin; Jun Xie; Xiaoli Wang; Min Zhang; Jinsong Su; |
492 | Eeny, Meeny, Miny, Moe. How to Choose Data for Morphological Inflection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we explore four sampling strategies for the task of morphological inflection using a Transformer model: a pair of oracle experiments where data is chosen based on correct/incorrect predictions by the model, model confidence, entropy, and random selection. |
Saliha Muradoglu; Mans Hulden; |
493 | An Adaptive Logical Rule Embedding Model for Inductive Reasoning Over Temporal Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We combine the two methods to capture deep causal logic by learning rule embeddings, and propose an interpretable model for temporal knowledge graph reasoning called adaptive logical rule embedding model for inductive reasoning (ALRE-IR). |
Xin Mei; Libin Yang; Xiaoyan Cai; Zuowei Jiang; |
494 | UniNL: Aligning Representation Learning with Scoring Function for OOD Detection Via Unified Neighborhood Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified neighborhood learning framework (UniNL) to detect OOD intents. |
Yutao Mou; Pei Wang; Keqing He; Yanan Wu; Jingang Wang; Wei Wu; Weiran Xu; |
495 | Open-domain Video Commentary Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We detail the construction of a new large-scale dataset of transcribed commentary aligned with videos containing various human actions in a variety of domains, and propose approaches based on well-known neural architectures to tackle the task. |
Edison Marrese-Taylor; Yumi Hamazono; Tatsuya Ishigaki; Goran Topic; Yusuke Miyao; Ichiro Kobayashi; Hiroya Takamura; |
496 | One Size Does Not Fit All: Investigating Strategies for Differentially-private Learning Across NLP Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this short paper, we provide an extensive analysis of different privacy preserving strategies on seven downstream datasets in five different “typical” NLP tasks with varying complexity using modern neural models based on BERT and XtremeDistil architectures. |
Manuel Senge; Timour Igamberdiev; Ivan Habernal; |
497 | Counterfactual Recipe Generation: Exploring Compositional Generalization in A Realistic Scenario Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate whether pretrained language models can perform compositional generalization in a realistic setting: recipe generation. |
Xiao Liu; Yansong Feng; Jizhi Tang; Chengang Hu; Dongyan Zhao; |
498 | Tutoring Helps Students Learn Better: Improving Knowledge Distillation for BERT with Tutor Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel KD framework, Tutor-KD, which improves the distillation effectiveness by controlling the difficulty of training examples during pre-training. |
Junho Kim; Jun-Hyung Park; Mingyu Lee; Wing-Lam Mok; Joon-Young Choi; SangKeun Lee; |
499 | Does Corpus Quality Really Matter for Low-Resource Languages? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Taking representation learning in Basque as a case study, we explore tailored crawling (manually identifying and scraping websites with high-quality content) as an alternative to filtering CommonCrawl. |
Mikel Artetxe; Itziar Aldabe; Rodrigo Agerri; Olatz Perez-de-Viñaspre; Aitor Soroa; |
500 | Unifying Data Perspectivism and Personalization: An Application to Social Norms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we examine a corpus of social media posts about conflict from a set of 13k annotators and 210k judgements of social norms. |
Joan Plepi; Béla Neuendorf; Lucie Flek; Charles Welch; |
501 | Does Self-Rationalization Improve Robustness to Spurious Correlations? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we evaluate how training self-rationalization models with free-text rationales affects robustness to spurious correlations in fine-tuned encoder-decoder and decoder-only models of six different sizes. |
Alexis Ross; Matthew Peters; Ana Marasovic; |
502 | Efficient Pre-training of Masked Language Model Via Concept-based Curriculum Masking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel concept-based curriculum masking (CCM) method to efficiently pre-train a language model. |
Mingyu Lee; Jun-Hyung Park; Junho Kim; Kang-Min Kim; SangKeun Lee; |
503 | Subword Evenness (SuE) As A Predictor of Cross-lingual Transfer to Low-resource Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we show that languages written in non-Latin and non-alphabetic scripts (mostly Asian languages) are the best choices for improving performance on the task of Masked Language Modelling (MLM) in a diverse set of 30 low-resource languages and that the success of the transfer is well predicted by our novel measure of Subword Evenness (SuE). |
Olga Pelloni; Anastassia Shaitarova; Tanja Samardzic; |
504 | A Unified Neural Network Model for Readability Assessment with Feature Projection and Length-Balanced Loss Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a BERT-based model with feature projection and length-balanced loss (BERT-FP-LBL) to determine the difficulty level of a given text. |
Wenbiao Li; Wang Ziyang; Yunfang Wu; |
505 | Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To overcome the disadvantages, we reformulate overlapped speaker diarization task as a single-label prediction problem via the proposed power set encoding (PSE). |
Zhihao Du; ShiLiang Zhang; Siqi Zheng; Zhi-Jie Yan; |
506 | GREENER: Graph Neural Networks for News Media Profiling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. |
Panayot Panayotov; Utsav Shukla; Husrev Taha Sencar; Mohamed Nabeel; Preslav Nakov; |
507 | Graph Hawkes Transformer for Extrapolated Reasoning on Temporal Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a Graph Hawkes Transformer (GHT) for both TKG entity prediction and time prediction tasks in the future time. |
Haohai Sun; Shangyi Geng; Jialun Zhong; Han Hu; Kun He; |
508 | UniRPG: Unified Discrete Reasoning Over Table and Text As Program Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose UniRPG, a semantic-parsing-based approach advanced in interpretability and scalability, to perform Unified discrete Reasoning over heterogeneous knowledge resources, i.e., table and text, as Program Generation. |
Yongwei Zhou; Junwei Bao; Chaoqun Duan; Youzheng Wu; Xiaodong He; Tiejun Zhao; |
509 | Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an alternative mining-based approach for zero-shot learning. |
Mozes van de Kar; Mengzhou Xia; Danqi Chen; Mikel Artetxe; |
510 | SEMGraph: Incorporating Sentiment Knowledge and Eye Movement Into Graph Model for Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper investigates the sentiment analysis task from a novel perspective by incorporating sentiment knowledge and eye movement into a graph architecture, aiming to draw the eye movement-based sentiment relationships for learning the sentiment expression of the context. |
Bingbing Wang; Bin Liang; Jiachen Du; Min Yang; Ruifeng Xu; |
511 | Cross-lingual Neural Fuzzy Matching for Exploiting Target-language Monolingual Corpora in Computer-aided Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the reduced availability of in-domain TMs, as compared to in-domain monolingual corpora, limits its adoption for a number of translation tasks. In this paper, we introduce a novel neural approach aimed at overcoming this limitation by exploiting not only TMs, but also in-domain target-language (TL) monolingual corpora, and still enabling a similar functionality to that offered by conventional TM-based CAT tools. |
Miquel Esplà-Gomis; Víctor M. Sánchez-Cartagena; Juan Antonio Pérez-Ortiz; Felipe Sánchez-Martínez; |
512 | Multi-Label Intent Detection Via Contrastive Task Specialization of Sentence Encoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Deploying task-oriented dialog ToD systems for new domains and tasks requires natural language understanding models that are 1) resource-efficient and work under low-data regimes; 2) adaptable, efficient, and quick-to-train; 3) expressive and can handle complex ToD scenarios with multiple user intents in a single utterance. Motivated by these requirements, we introduce a novel framework for multi-label intent detection (mID): MultI-ConvFiT (Multi-Label Intent Detection via Contrastive Conversational Fine-Tuning). |
Ivan Vulic; Iñigo Casanueva; Georgios Spithourakis; Avishek Mondal; Tsung-Hsien Wen; Pawel Budzianowski; |
513 | Discovering Language-neutral Sub-networks in Multilingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the extent to which they learn language-neutral representations (i.e., shared representations that encode similar phenomena across languages), and the effect of such representations on cross-lingual transfer performance, remain open questions. In this work, we conceptualize language neutrality of multilingual models as a function of the overlap between language-encoding sub-networks of these models. |
Negar Foroutan; Mohammadreza Banaei; Rémi Lebret; Antoine Bosselut; Karl Aberer; |
514 | Parameter-Efficient Tuning Makes A Good Classification Head Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we find that parameter-efficient tuning makes a good classification head, with which we can simply replace the randomly initialized heads for a stable performance gain. |
Zhuoyi Yang; Ming Ding; Yanhui Guo; Qingsong Lv; Jie Tang; |
515 | STGN: An Implicit Regularization Method for Learning with Noisy Labels in Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, previous studies exert identical perturbation for all samples, which may cause overfitting on incorrect ones or optimizing correct ones inadequately. To facilitate this, we propose a novel stochastic tailor-made gradient noise (STGN), mitigating the effect of inherent label noise by introducing tailor-made benign noise for each sample. |
Tingting Wu; Xiao Ding; Minji Tang; Hao Zhang; Bing Qin; Ting Liu; |
516 | Cross-Modal Similarity-Based Curriculum Learning for Image Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a simple yet efficient difficulty measurement for image captioning using cross-modal similarity calculated by a pretrained vision-language model. |
Hongkuan Zhang; Saku Sugawara; Akiko Aizawa; Lei Zhou; Ryohei Sasano; Koichi Takeda; |
517 | Debiasing Masks: A New Framework for Shortcut Mitigation in NLU Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new debiasing method in which we identify debiased pruning masks that can be applied to a finetuned model. |
Johannes Mario Meissner; Saku Sugawara; Akiko Aizawa; |
518 | Extending Phrase Grounding with Pronouns in Visual Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: First, we construct a dataset of phrase grounding with both noun phrases and pronouns to image regions. Based on the dataset, we test the performance of phrase grounding by using a state-of-the-art literature model of this line. Then, we enhance the baseline grounding model with coreference information which should help our task potentially, modeling the coreference structures with graph convolutional networks. |
Panzhong Lu; Xin Zhang; Meishan Zhang; Min Zhang; |
519 | EUR-Lex-Sum: A Multi- and Cross-lingual Dataset for Long-form Summarization in The Legal Domain Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel dataset, called EUR-Lex-Sum, based on manually curated document summaries of legal acts from the European Union law platform (EUR-Lex). |
Dennis Aumiller; Ashish Chouhan; Michael Gertz; |
520 | Differentiable Data Augmentation for Contrastive Sentence Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Although the contrastive learning framework has shown its superiority on sentence representation learning over previous methods, the potential of such a framework is under-explored so far due to the simple method it used to construct positive pairs. Motivated by this, we propose a method that makes hard positives from the original training examples. |
Tianduo Wang; Wei Lu; |
521 | Text Style Transferring Via Adversarial Masking and Styled Filling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle both challenges, in this study, we propose a style transfer model, with an adversarial masking approach and a styled filling technique (AMSF). |
Jiarui Wang; Richong Zhang; Junfan Chen; Jaein Kim; Yongyi Mao; |
522 | Character-level White-Box Adversarial Attacks Against Transformers Via Attachable Subwords Substitution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the first character-level white-box adversarial attack method against transformer models. |
Aiwei Liu; Honghai Yu; Xuming Hu; Shu’ang Li; Li Lin; Fukun Ma; Yawen Yang; Lijie Wen; |
523 | Query-based Instance Discrimination Network for Relational Triple Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, they still suffer from error propagation, relation redundancy and lack of high-level connections between triples. To address these issues, we propose a novel query-based approach to construct instance-level representations for relational triples. |
Zeqi Tan; Yongliang Shen; Xuming Hu; Wenqi Zhang; Xiaoxia Cheng; Weiming Lu; Yueting Zhuang; |
524 | Learning Inter-Entity-Interaction for Few-Shot Knowledge Graph Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Such practice, however, ignores the inter-entity interaction, resulting in low-discrimination representations for entity pairs, especially when these entity pairs are associated with 1-to-N, N-to-1, and N-to-N relations. To address this issue, this paper proposes a novel FKGC model, named Cross-Interaction Attention Network (CIAN) to investigate the inter-entity interaction between head and tail entities. |
Yuling Li; Kui Yu; Xiaoling Huang; Yuhong Zhang; |
525 | Empowering The Fact-checkers! Automatic Identification of Claim Spans on Twitter Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce the novel task of Claim Span Identification (CSI). |
Megha Sundriyal; Atharva Kulkarni; Vaibhav Pulastya; Md. Shad Akhtar; Tanmoy Chakraborty; |
526 | ClidSum: A Benchmark Dataset for Cross-Lingual Dialogue Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present ClidSum, a benchmark dataset towards building cross-lingual summarization systems on dialogue documents. |
Jiaan Wang; Fandong Meng; Ziyao Lu; Duo Zheng; Zhixu Li; Jianfeng Qu; Jie Zhou; |
527 | Spectral Probing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Contextualized embeddings have analogously been found to capture these phenomena at distinctive layers and frequencies. Leveraging these findings, we develop a fully learnable frequency filter to identify spectral profiles for any given task. |
Max Müller-Eberstein; Rob van der Goot; Barbara Plank; |
528 | QASem Parsing: Text-to-text Modeling of QA-based Semantics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: More recently, an appealing trend introduces semi-structured natural-language structures as an intermediate meaning-capturing representation, often in the form of questions and answers. In this work, we further promote this line of research by considering three prior QA-based semantic representations. |
Ayal Klein; Eran Hirsch; Ron Eliav; Valentina Pyatkin; Avi Caciularu; Ido Dagan; |
529 | Keyphrase Generation Via Soft and Hard Semantic Corrections Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle the above biases, we propose a novel correction model CorrKG on top of the MLE pipeline, where the biases are corrected via the optimal transport (OT) and a frequency-based filtering-and-sorting (FreqFS) strategy. |
Guangzhen Zhao; Guoshun Yin; Peng Yang; Yu Yao; |
530 | Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previous works have shown promising results; however, they relied on the expensive query annotations for the VCMR, i.e., the corresponding moment intervals. To overcome this problem, we propose a self-supervised learning framework: Modal-specific Pseudo Query Generation Network (MPGN). |
Minjoon Jung; SeongHo Choi; JooChan Kim; Jin-Hwa Kim; Byoung-Tak Zhang; |
531 | DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating The Robustness of Question Matching Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on the robustness evaluation of Chinese Question Matching (QM) models. |
Hongyu Zhu; Yan Chen; Jing Yan; Jing Liu; Yu Hong; Ying Chen; Hua Wu; Haifeng Wang; |
532 | DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. |
Gabriele Sarti; Arianna Bisazza; Ana Guerberof-Arenas; Antonio Toral; |
533 | Bridging Fairness and Environmental Sustainability in Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This lacuna is highly problematic, since there is increasing evidence that an exclusive focus on fairness can actually hinder environmental sustainability, and vice versa. In this work, we shed light on this crucial intersection in NLP by (1) investigating the efficiency of current fairness approaches through surveying example methods for reducing unfair stereotypical bias from the literature, and (2) evaluating a common technique to reduce energy consumption (and thus environmental impact) of English NLP models, knowledge distillation (KD), for its impact on fairness. |
Marius Hessenthaler; Emma Strubell; Dirk Hovy; Anne Lauscher; |
534 | UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a multimodal sentiment knowledge-sharing framework (UniMSE) that unifies MSA and ERC tasks from features, labels, and models. |
Guimin Hu; Ting-En Lin; Yi Zhao; Guangming Lu; Yuchuan Wu; Yongbin Li; |
535 | Is The Brain Mechanism for Hierarchical Structure Building Universal Across Languages? An FMRI Study of Chinese and English Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first analyze the differences in language structure between two diverse languages: Chinese and English. By computing the working memory requirements when applying parsing strategies to different language structures, we find that top-down parsing generates less memory load for the right-branching English and bottom-up parsing is less memory-demanding for Chinese. |
Xiaohan Zhang; Shaonan Wang; Nan Lin; Chengqing Zong; |
536 | HashFormers: Towards Vocabulary-independent Pre-trained Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these methods are not pre-trained. Inspired by this line of work, we propose HashFormers, a new family of vocabulary-independent pre-trained transformers that support an unlimited vocabulary (i.e., all possible tokens in a corpus) given a substantially smaller fixed-sized embedding matrix. |
Huiyin Xue; Nikolaos Aletras; |
537 | MatchPrompt: Prompt-based Open Relation Extraction with Semantic Consistency Guided Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a new prompt-based framework named MatchPrompt, which can realize OpenRE with efficient knowledge transfer from only a few pre-defined relational instances as well as mine the specific meanings for cluster interpretability. |
Jiaxin Wang; Lingling Zhang; Jun Liu; Xi Liang; Yujie Zhong; Yaqiang Wu; |
538 | Improving Aspect Sentiment Quad Prediction Via Template-Order Data Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we use the pre-trained language model to select the orders with minimal entropy. |
Mengting Hu; Yike Wu; Hang Gao; Yinhao Bai; Shiwan Zhao; |
539 | SocioProbe: What, When, and Where Language Models Learn About Sociodemographics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We address this research gap by probing the sociodemographic knowledge of different single-GPU PLMs on multiple English data sets via traditional classifier probing and information-theoretic minimum description length probing. |
Anne Lauscher; Federico Bianchi; Samuel R. Bowman; Dirk Hovy; |
540 | When Does Parameter-Efficient Transfer Learning Work for Machine Translation? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We conduct a comprehensive empirical study of PEFTs for MT, considering (1) various parameter budgets, (2) a diverse set of language-pairs, and (3) different pre-trained models. |
Ahmet Üstün; Asa Cooper Stickland; |
541 | Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. |
Ahmet Üstün; Arianna Bisazza; Gosse Bouma; Gertjan van Noord; Sebastian Ruder; |
542 | Towards Robust Numerical Question Answering: Diagnosing Numerical Capabilities of NLP Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to conduct numerical capability diagnosis on a series of Numerical Question Answering systems and datasets. |
Jialiang Xu; Mengyu Zhou; Xinyi He; Shi Han; Dongmei Zhang; |
543 | Enhancing Joint Multiple Intent Detection and Slot Filling with Global Intent-Slot Co-occurrence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we aim to make full use of the statistical co-occurrence frequency between intents and slots as prior knowledge to enhance joint multiple intent detection and slot filling. |
Mengxiao Song; Bowen Yu; Li Quangang; Wang Yubin; Tingwen Liu; Hongbo Xu; |
544 | Towards Pragmatic Production Strategies for Natural Language Generation Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This position paper proposes a conceptual framework for the design of Natural Language Generation (NLG) systems that follow efficient and effective production strategies in order to achieve complex communicative goals. |
Mario Giulianelli; |
545 | LiteVL: Efficient Video-Language Learning with Enhanced Spatial-Temporal Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the pre-training process is computationally expensive due to the requirement of millions of video-text pairs and the redundant data structure of each video. To mitigate these problems, we propose LiteVL, which adapts a pre-trained image-language model BLIP into a video-text model directly on downstream tasks, without heavy pre-training. |
Dongsheng Chen; Chaofan Tao; Lu Hou; Lifeng Shang; Xin Jiang; Qun Liu; |
546 | Communication Breakdown: On The Low Mutual Intelligibility Between Human and Neural Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We compare the 0-shot performance of a neural caption-based image retriever when given as input either human-produced captions or captions generated by a neural captioner. |
Roberto Dessì; Eleonora Gualdoni; Francesca Franzon; Gemma Boleda; Marco Baroni; |
547 | Normalizing Mutual Information for Robust Adaptive Training for Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The score is obtained by combining the probability from the translation model and the target language model, which is then used to assign different weights to losses from sentences and tokens. Meanwhile, we argue this metric is not properly normalized, for which we propose Normalized Pointwise Mutual Information (NPMI). |
Youngwon Lee; Changmin Lee; Hojin Lee; Seung-won Hwang; |
548 | Bilingual Synchronization: Restoring Translational Relationships with Editing Operations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We consider here a more general setting which assumes an initial target sequence, that must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. |
Jitao Xu; Josep Crego; François Yvon; |
549 | Human-Machine Collaboration Approaches to Build A Dialogue Dataset for Hate Speech Countering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a hybrid approach for dialogical data collection, which combines the intervention of human expert annotators over machine generated dialogues obtained using 19 different configurations. |
Helena Bonaldi; Sara Dellantonio; Serra Sinem Tekiroglu; Marco Guerini; |
550 | JANUS: Joint Autoregressive and Non-autoregressive Training with Auxiliary Loss for Sequence Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose JANUS, a Joint Autoregressive and Non-autoregressive training method using aUxiliary losS to enhance the model performance in both AR and NAR manner simultaneously and effectively alleviate the problem of distribution discrepancy. |
Xiaobo Liang; Lijun Wu; Juntao Li; Min Zhang; |
551 | Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Also, the naturally available supervision (whether the passage contains the correct answer) is weak and does not guarantee question relevancy. To address these issues, we propose an Entity-Focused Retrieval (EnFoRe) model that provides stronger supervision during training and recognizes question-relevant entities to help retrieve more specific knowledge. |
Jialin Wu; Raymond Mooney; |
552 | Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good Is It and How Does It Affect Transfer? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the distributions of grammatical relations induced from mBERT in the context of 24 typologically different languages. |
Ningyu Xu; Tao Gui; Ruotian Ma; Qi Zhang; Jingting Ye; Menghan Zhang; Xuanjing Huang; |
553 | “It’s Not Just Hate”: A Multi-Dimensional Perspective on Detecting Harmful Speech Online Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that a more fine-grained multi-label approach to predicting incivility and hateful or intolerant content addresses both conceptual and performance issues. |
Federico Bianchi; Stefanie Hills; Patricia Rossini; Dirk Hovy; Rebekah Tromble; Nava Tintarev; |
554 | Long Text Generation with Topic-aware Discrete Latent Variable Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate whether discrete latent codes can learn information of topics. |
Erguang Yang; Mingtong Liu; Deyi Xiong; Yujie Zhang; Yufeng Chen; Jinan Xu; |
555 | TIARA: Multi-grained Retrieval for Robust Question Answering Over Large Knowledge Base Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a new KBQA model, TIARA, which addresses those issues by applying multi-grained retrieval to help the PLM focus on the most relevant KB context, viz., entities, exemplary logical forms, and schema items. |
Yiheng Shu; Zhiwei Yu; Yuhan Li; Börje Karlsson; Tingting Ma; Yuzhong Qu; Chin-Yew Lin; |
556 | Structure-Unified M-Tree Coding Solver for Math Word Problem Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Structure-Unified M-Tree Coding Solver (SUMC-Solver), which applies a tree with any M branches (M-tree) to unify the output structures. |
Bin Wang; Jiangzhou Ju; Yang Fan; Xinyu Dai; Shujian Huang; Jiajun Chen; |
557 | FormLM: Recommending Creation Ideas for Online Forms By Modelling Semantic and Structural Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To assist form designers, in this work we present FormLM to model online forms (by enhancing pre-trained language model with form structural information) and recommend form creation ideas (including question / options recommendations and block type suggestion). |
Yijia Shao; Mengyu Zhou; Yifan Zhong; Tao Wu; Hongwei Han; Shi Han; Gideon Huang; Dongmei Zhang; |
558 | Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a semi-supervised sentence embedding framework, GenSE, that effectively leverages large-scale unlabeled data. |
Yiming Chen; Yan Zhang; Bin Wang; Zuozhu Liu; Haizhou Li; |
559 | GPS: Genetic Prompt Search for Efficient Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce Genetic Prompt Search (GPS) to improve few-shot learning with prompts, which utilizes a genetic algorithm to automatically search for the best prompt. |
Hanwei Xu; Yujun Chen; Yulun Du; Nan Shao; Wang Yanggang; Haiyu Li; Zhilin Yang; |
560 | Multitask Instruction-based Prompting for Fallacy Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, a big challenge for computational models lies in the fact that fallacies are formulated differently across the datasets with differences in the input format (e.g., question-answer pair, sentence with fallacy fragment), genre (e.g., social media, dialogue, news), as well as types and number of fallacies (from 5 to 18 types per dataset). To move towards solving the fallacy recognition task, we approach these differences across datasets as multiple tasks and show how instruction-based prompting in a multitask setup based on the T5 model improves the results against approaches built for a specific dataset such as T5, BERT or GPT-3. |
Tariq Alhindi; Tuhin Chakrabarty; Elena Musi; Smaranda Muresan; |
561 | Rethinking Multi-Modal Alignment in Multi-Choice VideoQA from Feature and Sample Perspectives Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we reconsider the multi-modal alignment problem in VideoQA from feature and sample perspectives to achieve better performance. |
Shaoning Xiao; Long Chen; Kaifeng Gao; Zhao Wang; Yi Yang; Zhimeng Zhang; Jun Xiao; |
562 | Towards Table-to-Text Generation with Pretrained Language Model: A Table Structure Understanding and Text Deliberating Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, to implement the table-to-text generation with pretrained language model, we propose a table structure understanding and text deliberating approach, namely TASD. |
Miao Chen; Xinjiang Lu; Tong Xu; Yanyan Li; Zhou Jingbo; Dejing Dou; Hui Xiong; |
563 | Hierarchical Phrase-Based Sequence-to-Sequence Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper describes a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference. |
Bailin Wang; Ivan Titov; Jacob Andreas; Yoon Kim; |
564 | Natural Language Deduction with Incomplete Information Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new system that can handle the underspecified setting where not all premises are stated at the outset; that is, additional assumptions need to be materialized to prove a claim. |
Zayne Sprague; Kaj Bostrom; Swarat Chaudhuri; Greg Durrett; |
565 | Character-centric Story Visualization Via Visual Planning and Token Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To tackle the challenge, we propose to adapt a recent work that augments VQ-VAE with a text-to-visual-token (transformer) architecture. |
Hong Chen; Rujun Han; Te-Lin Wu; Hideki Nakayama; Nanyun Peng; |
566 | ASQA: Factoid Questions Meet Long-Form Answers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The hurdles include a lack of high-quality data and the absence of a well-defined notion of an answer’s quality. In this work, we address these problems by releasing a novel dataset and a task that we call ASQA (Answer Summaries for Questions which are Ambiguous); and proposing a reliable metric for measuring performance on ASQA. |
Ivan Stelmakh; Yi Luan; Bhuwan Dhingra; Ming-Wei Chang; |
567 | Algorithms for Acyclic Weighted Finite-State Automata with Failure Arcs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present more efficient algorithms for computing the pathsum in sparse acyclic WFSAs, i.e., WFSAs with average out-symbol fraction s ≪ 1. |
Anej Svete; Benjamin Dayan; Ryan Cotterell; Tim Vieira; Jason Eisner; |
568 | Towards Better Document-level Relation Extraction Via Iterative Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing methods usually directly predict the relations of all entity pairs of input document in a one-pass manner, ignoring the fact that predictions of some entity pairs heavily depend on the predicted results of other pairs. To deal with this issue, in this paper, we propose a novel document-level RE model with iterative inference. |
Liang Zhang; Jinsong Su; Yidong Chen; Zhongjian Miao; Min Zijun; Qingguo Hu; Xiaodong Shi; |
569 | Efficient Adversarial Training with Robust Early-Bird Tickets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Delving into the optimization process of adversarial training, we find that robust connectivity patterns emerge in the early training phase (typically 0. |
Zhiheng Xi; Rui Zheng; Tao Gui; Qi Zhang; Xuanjing Huang; |
570 | Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Prior attempts at measuring leakage of MLMs via membership inference attacks have been inconclusive, implying potential robustness of MLMs to privacy attacks. In this work, we posit that prior attempts were inconclusive because they based their attack solely on the MLM’s model score. |
Fatemehsadat Mireshghallah; Kartik Goyal; Archit Uniyal; Taylor Berg-Kirkpatrick; Reza Shokri; |
571 | SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce SMaLL-100, a distilled version of the M2M-100(12B) model, a massively multilingual machine translation model covering 100 languages. |
Alireza Mohammadshahi; Vassilina Nikoulina; Alexandre Berard; Caroline Brun; James Henderson; Laurent Besacier; |
572 | TextFusion: Privacy-Preserving Pre-trained Model Inference Via Token Fusion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, recent studies have shown that intermediate representations can also be recovered to plain text with reasonable accuracy, thus the risk of privacy leakage still exists. To address this issue, we propose TextFusion, a novel method for preserving inference privacy. |
Xin Zhou; Jinzhu Lu; Tao Gui; Ruotian Ma; Zichu Fei; Yuran Wang; Yong Ding; Yibo Cheung; Qi Zhang; Xuanjing Huang; |
573 | Learning to Explain Selectively: A Case Study on Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose learning to explain “selectively”: for each decision that the user makes, we use a model to choose the best explanation from a set of candidates and update this model with feedback to optimize human performance. |
Shi Feng; Jordan Boyd-Graber; |
574 | ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model. |
Zhaocong Li; Xuebo Liu; Derek F. Wong; Lidia S. Chao; Min Zhang; |
575 | Better Hit The Nail on The Head Than Beat Around The Bush: Removing Protected Attributes with A Single Projection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce two methods that find a single targeted projection: Mean Projection (MP, more efficient) and Tukey Median Projection (TMP, with theoretical guarantees). |
Pantea Haghighatkhah; Antske Fokkens; Pia Sommerauer; Bettina Speckmann; Kevin Verbeek; |
576 | IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new open information extraction (OIE) benchmark for pre-trained language models (LM). |
Chenguang Wang; Xiao Liu; Dawn Song; |
577 | ConNER: Consistency Training for Cross-lingual Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose ConNER as a novel consistency training framework for cross-lingual NER, which comprises of: (1) translation-based consistency training on unlabeled target-language data, and (2) dropout-based consistency training on labeled source-language data. |
Ran Zhou; Xin Li; Lidong Bing; Erik Cambria; Luo Si; Chunyan Miao; |
578 | A Sequential Flow Control Framework for Multi-hop Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing methods, however, (i) infer the dynamic question representation only through coarse-grained attention mechanisms, which may bring information loss, (ii) and have not effectively modeled the sequential logic, which is crucial for the multi-hop reasoning process in KBQA. To address these issues, we propose a sequential reasoning self-attention mechanism to capture the crucial reasoning information of each single hop in a more fine-grained way. |
Minghui Xie; Chuzhan Hao; Peng Zhang; |
579 | ACENet: Attention Guided Commonsense Reasoning on Hybrid Knowledge Graph Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an Attention guided Commonsense rEasoning Network (ACENet) to endow the neural network with the capability of integrating hybrid knowledge. |
Chuzhan Hao; Minghui Xie; Peng Zhang; |
580 | Revisiting DocRED – Addressing The False Negative Problem in Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, we find that the annotation of DocRED is incomplete, i.e., false negative samples are prevalent. We analyze the causes and effects of the overwhelming false negative problem in the DocRED dataset. |
Qingyu Tan; Lu Xu; Lidong Bing; Hwee Tou Ng; Sharifah Mahani Aljunied; |
581 | Towards Summary Candidates Fusion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To bypass this limitation, we propose a new paradigm in second-stage abstractive summarization called SummaFusion that fuses several summary candidates to produce a novel abstractive second-stage summary. |
Mathieu Ravaut; Shafiq Joty; Nancy Chen; |
582 | Multimodal Robustness for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we look at the case of a Generic text-to-text NMT model that has to deal with data coming from various modalities, like speech, images, or noisy text extracted from the web. |
Yuting Zhao; Ioan Calapodescu; |
583 | TranSHER: Translating Knowledge Graph Embedding with Hyper-Ellipsoidal Restriction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such a method strictly restricts entities on the hyper-ellipsoid surfaces which limits the optimization of entity distribution, leading to suboptimal performance of knowledge graph completion. To address this issue, we propose a novel score function TranSHER, which leverages relation-specific translations between head and tail entities to relax the constraint of hyper-ellipsoid restrictions. |
Yizhi Li; Wei Fan; Chao Liu; Chenghua Lin; Jiang Qian; |
584 | IRRGN: An Implicit Relational Reasoning Graph Network for Multi-turn Response Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In addition, few studies consider differences between the options before and after reasoning. In this paper, we propose an Implicit Relational Reasoning Graph Network to address these issues, which consists of the Utterance Relational Reasoner (URR) and the Option Dual Comparator (ODC). |
Jingcheng Deng; Hengwei Dai; Xuewei Guo; Yuanchen Ju; Wei Peng; |
585 | Predicting Prerequisite Relations for Unseen Concepts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, many real-world scenarios deal with concepts that are left undiscovered at training time, which is relatively unexplored. This paper studies this problem and proposes a novel alternating knowledge distillation approach to take advantage of both content- and graph-based models for this task. |
Yaxin Zhu; Hamed Zamani; |
586 | Contrastive Learning with Expectation-Maximization for Weakly Supervised Phrase Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we propose a novel contrastive learning framework based on the expectation-maximization algorithm that adaptively refines the target prediction. |
Keqin Chen; Richong Zhang; Samuel Mensah; Yongyi Mao; |
587 | Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners By Clustering Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we show that zero-shot text classification can be improved simply by clustering texts in the embedding spaces of PLMs. |
Yu Fei; Zhao Meng; Ping Nie; Roger Wattenhofer; Mrinmaya Sachan; |
588 | Generalizing Over Long Tail Concepts for Medical Term Normalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we introduce a simple and effective learning strategy that leverages such information to enhance the generalizability of both discriminative and generative models. |
Beatrice Portelli; Simone Scaboro; Enrico Santus; Hooman Sedghamiz; Emmanuele Chersoni; Giuseppe Serra; |
589 | Unsupervised Opinion Summarisation in The Wasserstein Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Such posts are noisy and have unpredictable structure, posing additional challenges for the construction of the summary distribution and the preservation of meaning compared to online reviews, which have so far been the focus of opinion summarisation. To address these challenges we present WassOS, an unsupervised abstractive summarization model which makes use of the Wasserstein distance. |
Jiayu Song; Iman Munire Bilal; Adam Tsakalidis; Rob Procter; Maria Liakata; |
590 | Bloom Library: Multimodal Datasets in 300+ Languages for A Variety of Downstream Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Bloom Library, a linguistically diverse set of multimodal and multilingual datasets for language modeling, image captioning, visual storytelling, and speech synthesis/recognition. |
Colin Leong; Joshua Nemecek; Jacob Mansdorfer; Anna Filighera; Abraham Owodunni; Daniel Whitenack; |
591 | Disentangling Uncertainty in Machine Translation Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose more powerful and efficient uncertainty predictors for MT evaluation, and we assess their ability to target different sources of aleatoric and epistemic uncertainty. |
Chrysoula Zerva; Taisiya Glushkova; Ricardo Rei; André F. T. Martins; |
592 | Does Your Model Classify Entities Reasonably? Diagnosing and Mitigating Spurious Correlations in Entity Typing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To comprehensively investigate the faithfulness and reliability of entity typing methods, we first systematically define distinct kinds of model biases that are reflected mainly from spurious correlations. Particularly, we identify six types of existing model biases, including mention-context bias, lexical overlapping bias, named entity bias, pronoun bias, dependency bias, and overgeneralization bias. To mitigate model biases, we then introduce a counterfactual data augmentation method. |
Nan Xu; Fei Wang; Bangzheng Li; Mingtao Dong; Muhao Chen; |
593 | EDIN: An End-to-end Benchmark and Pipeline for Unknown Entity Discovery and Indexing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Building on dense-retrieval based entity linking, we introduce the end-to-end EDIN-pipeline that detects, clusters, and indexes mentions of unknown entities in context. |
Nora Kassner; Fabio Petroni; Mikhail Plekhanov; Sebastian Riedel; Nicola Cancedda; |
594 | POQue: Asking Participant-specific Outcome Questions for A Deeper Understanding of Complex Events Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that by pre-identifying a participant in a complex event, crowdworkers are able to (1) infer the collective impact of salient events that make up the situation, (2) annotate the volitional engagement of participants in causing the situation, and (3) ground the outcome of the situation in state changes of the participants. |
Sai Vallurupalli; Sayontan Ghosh; Katrin Erk; Niranjan Balasubramanian; Francis Ferraro; |
595 | Measuring The Mixing of Contextual Information in The Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider the whole attention block (multi-head attention, residual connection, and layer normalization) and define a metric to measure token-to-token interactions within each layer. |
Javier Ferrando; Gerard I. Gállego; Marta R. Costa-jussà; |
596 | Dealing with Abbreviations in The Slovenian Biographical Lexicon Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new method for addressing the problems caused by a high density of domain-specific abbreviations in a text. |
Angel Daza; Antske Fokkens; Tomaž Erjavec; |
597 | AfriCLIRMatrix: Enabling Cross-Lingual Information Retrieval for African Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For search, most existing datasets feature few or no African languages, directly impacting researchers’ ability to build and improve information access capabilities in those languages. Motivated by this, we created AfriCLIRMatrix, a test collection for cross-lingual information retrieval research in 15 diverse African languages. |
Odunayo Ogundepo; Xinyu Zhang; Shuo Sun; Kevin Duh; Jimmy Lin; |
598 | CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning About Negation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To facilitate the future development of models that can process negation effectively, we present CONDAQA, the first English reading comprehension dataset which requires reasoning about the implications of negated statements in paragraphs. |
Abhilasha Ravichander; Matt Gardner; Ana Marasovic; |
599 | Towards Opening The Black Box of Neural Machine Translation: Source and Target Interpretations of The Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose an interpretability method that tracks input tokens� attributions for both contexts. |
Javier Ferrando; Gerard I. Gállego; Belen Alastruey; Carlos Escolano; Marta R. Costa-jussà; |
600 | ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity Over Language and Culture Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces ArtELingo, a new benchmark and dataset, designed to encourage work on diversity across languages and cultures. |
Youssef Mohamed; Mohamed Abdelfattah; Shyma Alhuwaider; Feifan Li; Xiangliang Zhang; Kenneth Church; Mohamed Elhoseiny; |
601 | Decoding A Neural Retriever's Latent Space for Query Suggestion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, neural systems lack the interpretability of bag-of-words models; it is not trivial to connect a query change to a change in the latent space that ultimately determines the retrieval results. To shed light on this embedding space, we learn a "query decoder" that, given a latent representation of a neural search engine, generates the corresponding query. |
Leonard Adolphs; Michelle Chen Huebscher; Christian Buck; Sertan Girgin; Olivier Bachem; Massimiliano Ciaramita; Thomas Hofmann; |
602 | T-STAR: Truthful Style Transfer Using AMR Graph As Intermediate Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we study the usefulness of Abstract Meaning Representation (AMR) graph as the intermediate style agnostic representation. |
Anubhav Jangra; Preksha Nema; Aravindan Raghuveer; |
603 | PromptBERT: Improving BERT Sentence Embeddings with Prompts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose PromptBERT, a novel contrastive learning method for learning better sentence representation. |
Ting Jiang; Jian Jiao; Shaohan Huang; Zihan Zhang; Deqing Wang; Fuzhen Zhuang; Furu Wei; Haizhen Huang; Denvy Deng; Qi Zhang; |
604 | Extending Logic Explained Networks to Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these models have only been applied to vision and tabular data, and they mostly favour the generation of global explanations, while local ones tend to be noisy and verbose. For these reasons, we propose LENp, improving local explanations by perturbing input words, and we test it on text classification. |
Rishabh Jain; Gabriele Ciravegna; Pietro Barbiero; Francesco Giannini; Davide Buffelli; Pietro Lio; |
605 | Uni-Parser: Unified Semantic Parser for Question Answering on Knowledge Base and Database Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Uni-Parser, a unified semantic parser for question answering (QA) on both KB and DB. |
Ye Liu; Semih Yavuz; Rui Meng; Dragomir Radev; Caiming Xiong; Yingbo Zhou; |
606 | RAPO: An Adaptive Ranking Paradigm for Bilingual Lexicon Induction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel ranking-oriented induction model RAPO to learn personalized mapping function for each word. |
Zhoujin Tian; Chaozhuo Li; Shuo Ren; Zhiqiang Zuo; Zengxuan Wen; Xinyue Hu; Xiao Han; Haizhen Huang; Denvy Deng; Qi Zhang; Xing Xie; |
607 | On Parsing As Tagging Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: There are many proposals to reduce constituency parsing to tagging. To figure out what these approaches have in common, we offer a unifying pipeline, which consists of three steps: linearization, learning, and decoding. |
Afra Amini; Ryan Cotterell; |
608 | Distilled Dual-Encoder Model for Vision-Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To get the best of both worlds, we propose DiDE, a framework that distills the knowledge of the fusion-encoder teacher model into the dual-encoder student model. |
Zekun Wang; Wenhui Wang; Haichao Zhu; Ming Liu; Bing Qin; Furu Wei; |
609 | Argument Mining for Review Helpfulness Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we present the AMazon Argument Mining (AM2) corpus, a corpus of 878 Amazon reviews on headphones annotated according to a theoretical argumentation model designed to evaluate argument quality. |
Zaiqian Chen; Daniel Verdi do Amarante; Jenna Donaldson; Yohan Jo; Joonsuk Park; |
610 | Hierarchical Multi-Label Classification of Scientific Documents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a new dataset for hierarchical multi-label text classification (HMLTC) of scientific papers called SciHTC, which contains 186,160 papers and 1,234 categories from the ACM CCS tree. |
Mobashir Sadat; Cornelia Caragea; |
611 | Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Rainier, or Reinforced Knowledge Introspector, that learns to generate contextually relevant knowledge in response to given questions. |
Jiacheng Liu; Skyler Hallinan; Ximing Lu; Pengfei He; Sean Welleck; Hannaneh Hajishirzi; Yejin Choi; |
612 | A Major Obstacle for NLP Research: Let's Talk About Time Allocation! Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, this paper argues that we have been less successful than we *should* have been and reflects on where and how the field fails to tap its full potential. |
Katharina Kann; Shiran Dudy; Arya D. McCarthy; |
613 | Towards Inter-character Relationship-driven Story Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce the task of modeling interpersonal relationships for story generation. |
Anvesh Rao Vijjini; Faeze Brahman; Snigdha Chaturvedi; |
614 | Incorporating Relevance Feedback for Information-Seeking Retrieval Using Few-Shot Document Re-Ranking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant. |
Tim Baumgärtner; Leonardo F. R. Ribeiro; Nils Reimers; Iryna Gurevych; |
615 | ReasTAP: Injecting Table Reasoning Skills During Pre-training Via Synthetic Reasoning Examples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we develop ReasTAP to show that high-level table reasoning skills can be injected into models during pre-training without a complex table-specific architecture design. |
Yilun Zhao; Linyong Nan; Zhenting Qi; Rui Zhang; Dragomir Radev; |
616 | Few-shot Learning with Multilingual Generative Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we train multilingual generative language models on a corpus covering a diverse set of languages, and study their few- and zero-shot learning capabilities in a wide range of tasks. |
Xi Victoria Lin; Todor Mihaylov; Mikel Artetxe; Tianlu Wang; Shuohui Chen; Daniel Simig; Myle Ott; Naman Goyal; Shruti Bhosale; Jingfei Du; Ramakanth Pasunuru; Sam Shleifer; Punit Singh Koura; Vishrav Chaudhary; Brian O'Horo; Jeff Wang; Luke Zettlemoyer; Zornitsa Kozareva; Mona Diab; Veselin Stoyanov; Xian Li; |
617 | Are Representations Built from The Ground Up? An Empirical Examination of Local Composition in Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: At the same time, many phrases are non-compositional, carrying a meaning beyond that of each part in isolation. Representing both of these types of phrases is critical for language understanding, but it is an open question whether modern language models (LMs) learn to do so; in this work we examine this question. |
Emmy Liu; Graham Neubig; |
618 | Detecting Label Errors By Using Pre-Trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we contribute a novel method for introducing realistic, human-originated label noise into existing crowdsourced datasets such as SNLI and TweetNLP. |
Derek Chong; Jenny Hong; Christopher Manning; |
619 | Intriguing Properties of Compression on Multilingual Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an experimental framework to characterize the impact of sparsifying multilingual pre-trained language models during fine-tuning. |
Kelechi Ogueji; Orevaoghene Ahia; Gbemileke Onilude; Sebastian Gehrmann; Sara Hooker; Julia Kreutzer; |
620 | Sequence Models for Document Structure Identification in An Undeciphered Script Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work describes the first thorough analysis of "header" signs in proto-Elamite, an undeciphered script from 3100-2900 BCE. |
Logan Born; M. Monroe; Kathryn Kelley; Anoop Sarkar; |
621 | English Contrastive Learning Can Learn Universal Cross-lingual Sentence Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose mSimCSE, which extends SimCSE to multilingual settings and reveal that contrastive learning on English data can surprisingly learn high-quality universal cross-lingual sentence embeddings without any parallel data. |
Yaushian Wang; Ashley Wu; Graham Neubig; |
622 | Active Example Selection for In-Context Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We formulate example selection for in-context learning as a sequential decision problem, and propose a reinforcement learning algorithm for identifying generalizable policies to select demonstration examples. |
Yiming Zhang; Shi Feng; Chenhao Tan; |
623 | Improving Factual Consistency in Summarization with Compression-Based Post-Editing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose to use sentence-compression data to train the post-editing model to take a summary with extrinsic entity errors marked with special tokens and output a compressed, well-formed summary with those errors removed. |
Alex Fabbri; Prafulla Kumar Choubey; Jesse Vig; Chien-Sheng Wu; Caiming Xiong; |
624 | Evaluating The Impact of Model Scale for Compositional Generalization in Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We evaluate encoder-decoder models up to 11B parameters and decoder-only models up to 540B parameters, and compare model scaling curves for three different methods for applying a pre-trained language model to a new task: fine-tuning all parameters, prompt tuning, and in-context learning. |
Linlu Qiu; Peter Shaw; Panupong Pasupat; Tianze Shi; Jonathan Herzig; Emily Pitler; Fei Sha; Kristina Toutanova; |
625 | "I'm Sorry to Hear That": Finding New Biases in Language Models with A Holistic Descriptor Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a new, more inclusive bias measurement dataset, HolisticBias, which includes nearly 600 descriptor terms across 13 different demographic axes. |
Eric Michael Smith; Melissa Hall; Melanie Kambadur; Eleonora Presani; Adina Williams; |
626 | Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To provide an in-depth analysis, we present a Multimodal Evaluation (ME) pipeline to automatically generate question-answer pairs to test models' understanding of the visual scene, text, and related knowledge. |
Zhecan Wang; Haoxuan You; Yicheng He; Wenhao Li; Kai-Wei Chang; Shih-Fu Chang; |
627 | Semantic Novelty Detection and Characterization in Factual Text Involving Named Entities Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes an effective model (called PAT-SND) to solve the problem, which can also characterize the novelty. |
Nianzu Ma; Sahisnu Mazumder; Alexander Politowicz; Bing Liu; Eric Robertson; Scott Grigsby; |
628 | CN-AutoMIC: Distilling Chinese Commonsense Knowledge from Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a large-scale Chinese CKG generated from multilingual PLMs, named as **CN-AutoMIC**, aiming to fill the research gap of non-English CKGs. |
Chenhao Wang; Jiachun Li; Yubo Chen; Kang Liu; Jun Zhao; |
629 | Calibrating Student Models for Emotion-related Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study KD on the emotion-related tasks from a new perspective: calibration. |
Mahshid Hosseini; Cornelia Caragea; |
630 | Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study. |
Tu Vu; Aditya Barua; Brian Lester; Daniel Cer; Mohit Iyyer; Noah Constant; |
631 | Improving Large-scale Paraphrase Acquisition and Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new Multi-Topic Paraphrase in Twitter (MultiPIT) corpus that consists of a total of 130k sentence pairs with crowdsourcing (MultiPIT_crowd) and expert (MultiPIT_expert) annotations using two different paraphrase definitions for paraphrase identification, in addition to a multi-reference test set (MultiPIT_NMR) and a large automatically constructed training set (MultiPIT_Auto) for paraphrase generation. |
Yao Dou; Chao Jiang; Wei Xu; |
632 | Entropy- and Distance-Based Predictors From GPT-2 Attention Patterns Predict Reading Times Over and Above GPT-2 Surprisal Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Under this framework, this work first defines an entropy-based predictor that quantifies the diffuseness of self-attention, as well as distance-based predictors that capture the incremental change in attention patterns across timesteps. |
Byung-Doh Oh; William Schuler; |
633 | A Survey of Computational Framing Analysis Approaches Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The growing scholarship, however, lacks a comprehensive understanding and resources of computational framing analysis methods. Aiming to address the gap, this article surveys existing computational framing analysis approaches and puts them together. |
Mohammad Ali; Naeemul Hassan; |
634 | Learning Cross-Task Dependencies for Joint Extraction of Entities, Events, Event Arguments, and Relations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the cross-task dependencies in prior work are not optimal as they are only designed manually according to some task heuristics. To address this issue, we propose a novel model for JointIE that aims to learn cross-task dependencies from data. |
Minh Van Nguyen; Bonan Min; Franck Dernoncourt; Thien Nguyen; |
635 | Don�t Copy The Teacher: Data and Model Challenges in Embodied Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper contributes to that conversation, by arguing that imitation learning (IL) and related low-level metrics are actually misleading and do not align with the goals of embodied dialogue research and may hinder progress. |
So Yeon Min; Hao Zhu; Ruslan Salakhutdinov; Yonatan Bisk; |
636 | ALFRED-L: Investigating The Role of Language for Action Learning in Interactive Visual Environments Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we examine ALFRED, a challenging benchmark for embodied task completion, with the goal of gaining insight into how effectively models utilize language. |
Arjun Akula; Spandana Gella; Aishwarya Padmakumar; Mahdi Namazifar; Mohit Bansal; Jesse Thomason; Dilek Hakkani-Tur; |
637 | Dungeons and Dragons As A Dialog Challenge for Artificial Intelligence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we frame D&D specifically as a dialogue system challenge, where the tasks are to both generate the next conversational turn in the game and predict the state of the game given the dialogue history. |
Chris Callison-Burch; Gaurav Singh Tomar; Lara Martin; Daphne Ippolito; Suma Bailis; David Reitter; |
638 | Unsupervised Entity Linking with Guided Summarization and Multiple-Choice Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We address two challenges in entity linking: how to leverage wider contexts surrounding a mention, and how to deal with limited training data. |
Young Min Cho; Li Zhang; Chris Callison-Burch; |
639 | Weakly-Supervised Temporal Article Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a new challenging grounding task: Weakly-Supervised temporal Article Grounding (WSAG). Specifically, given an article and a relevant video, WSAG aims to localize all "groundable" sentences to the video, and these sentences are possibly at different semantic scales. |
Long Chen; Yulei Niu; Brian Chen; Xudong Lin; Guangxing Han; Christopher Thomas; Hammad Ayyubi; Heng Ji; Shih-Fu Chang; |
640 | Exploring Dual Encoder Architectures for Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explore the dual encoder architectures for QA retrieval tasks. |
Zhe Dong; Jianmo Ni; Dan Bikel; Enrique Alfonseca; Yuan Wang; Chen Qu; Imed Zitouni; |
641 | ArXivEdits: Understanding The Human Revision Process in Scientific Writing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide a complete computational framework for studying text revision in scientific writing. |
Chao Jiang; Wei Xu; Samuel Stevens; |
642 | Why Do You Feel This Way? Summarizing Triggers of Emotions in Social Media Posts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper takes a novel angle, namely, emotion detection and trigger summarization, aiming to both detect perceived emotions in text, and summarize events and their appraisals that trigger each emotion. To support this goal, we introduce CovidET (Emotions and their Triggers during Covid-19), a dataset of ~1,900 English Reddit posts related to COVID-19, which contains manual annotations of perceived emotions and abstractive summaries of their triggers described in the post. |
Hongli Zhan; Tiberiu Sosea; Cornelia Caragea; Junyi Jessy Li; |
643 | Analogical Math Word Problems Solving with Enhanced Problem-Solution Association Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to build a novel MWP solver by leveraging analogical MWPs, which advance the solver's generalization ability across different kinds of MWPs. |
Zhenwen Liang; Jipeng Zhang; Xiangliang Zhang; |
644 | Towards Teachable Reasoning Systems: Using A Dynamic Memory of User Feedback for Continual System Improvement Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our goal is a teachable reasoning system for question-answering (QA), where a user can interact with faithful answer explanations, and correct its errors so that the system improves over time. |
Bhavana Dalvi Mishra; Oyvind Tafjord; Peter Clark; |
645 | Knowledge Transfer from Answer Ranking to Answer Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to train a GenQA model by transferring knowledge from a trained AS2 model, to overcome the aforementioned issue. |
Matteo Gabburo; Rik Koncel-Kedziorski; Siddhant Garg; Luca Soldaini; Alessandro Moschitti; |
646 | Perturbation Augmentation for Fairer NLP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we ask whether training on demographically perturbed data leads to fairer language models. |
Rebecca Qian; Candace Ross; Jude Fernandes; Eric Michael Smith; Douwe Kiela; Adina Williams; |
647 | Automatic Document Selection for Efficient Encoder Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an alternative to larger training sets by automatically identifying smaller yet domain-representative subsets. |
Yukun Feng; Patrick Xia; Benjamin Van Durme; João Sedoc; |
648 | The Aligned Multimodal Movie Treebank: An Audio, Video, Dependency-parse Treebank Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the Aligned Multimodal Movie Treebank (AMMT), an English language treebank derived from dialog in Hollywood movies which includes transcriptions of the audio-visual streams with word-level alignment, as well as part of speech tags and dependency parses in the Universal Dependencies formalism. |
Adam Yaari; Jan DeWitt; Henry Hu; Bennett Stankovits; Sue Felshin; Yevgeni Berzak; Helena Aparicio; Boris Katz; Ignacio Cases; Andrei Barbu; |
649 | DEMETR: Diagnosing Evaluation Metrics for Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The operations of newer learned metrics (e.g., BLEURT, COMET), which leverage pretrained language models to achieve higher correlations with human quality judgments than BLEU, are opaque in comparison. In this paper, we shed light on the behavior of these learned metrics by creating DEMETR, a diagnostic dataset with 31K English examples (translated from 10 source languages) for evaluating the sensitivity of MT evaluation metrics to 35 different linguistic perturbations spanning semantic, syntactic, and morphological error categories. |
Marzena Karpinska; Nishant Raj; Katherine Thai; Yixiao Song; Ankita Gupta; Mohit Iyyer; |
650 | Empowering Language Models with Knowledge Graph Reasoning for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose knOwledge REasOning empowered Language Model (OREO-LM), which consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs to interact with a differentiable Knowledge Graph Reasoning module collaboratively. |
Ziniu Hu; Yichong Xu; Wenhao Yu; Shuohang Wang; Ziyi Yang; Chenguang Zhu; Kai-Wei Chang; Yizhou Sun; |
651 | Debiasing Pretrained Text Encoders By Paying Attention to Paying Attention Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a debiasing method for pre-trained text encoders that both reduces social stereotypes, and inflicts next to no semantic damage. |
Yacine Gaci; Boualem Benatallah; Fabio Casati; Khalid Benabdeslem; |
652 | MEE: A Novel Multilingual Event Extraction Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, one limitation of current research for EE involves the under-exploration for non-English languages in which the lack of high-quality multilingual EE datasets for model training and evaluation has been the main hindrance. To address this limitation, we propose a novel Multilingual Event Extraction dataset (MEE) that provides annotation for more than 50K event mentions in 8 typologically different languages. |
Amir Pouran Ben Veyseh; Javid Ebrahimi; Franck Dernoncourt; Thien Nguyen; |
653 | RobustLR: A Diagnostic Benchmark for Evaluating Logical Robustness of Deductive Reasoners Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we present RobustLR, a diagnostic benchmark that evaluates the robustness of language models to minimal logical edits in the inputs and different logical equivalence conditions. |
Soumya Sanyal; Zeyi Liao; Xiang Ren; |
654 | Evaluating and Improving Factuality in Multimodal Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose CLIPBERTSCORE, a simple weighted combination of CLIPScore and BERTScore to leverage the robustness and strong factuality detection performance between image-summary and document-summary, respectively. |
David Wan; Mohit Bansal; |
655 | Referee: Reference-Free Sentence Summarization with Sharper Controllability Through Symbolic Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Referee, a novel framework for sentence summarization that can be trained reference-free (i.e., requiring no gold summaries for supervision), while allowing direct control for compression ratio. |
Melanie Sclar; Peter West; Sachin Kumar; Yulia Tsvetkov; Yejin Choi; |
656 | Algorithms for Weighted Pushdown Automata Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop novel algorithms that operate directly on WPDAs. |
Alexandra Butoi; Brian DuSell; Tim Vieira; Ryan Cotterell; David Chiang; |
657 | MABEL: Attenuating Gender Bias Using Textual Entailment Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose MABEL (a Method for Attenuating Gender Bias using Entailment Labels), an intermediate pre-training approach for mitigating gender bias in contextualized representations. |
Jacqueline He; Mengzhou Xia; Christiane Fellbaum; Danqi Chen; |
658 | Breakpoint Transformers for Modeling and Tracking Intermediate Beliefs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a representation learning framework called breakpoint modeling that allows for efficient and robust learning of this type. |
Kyle Richardson; Ronen Tamari; Oren Sultan; Dafna Shahaf; Reut Tsarfaty; Ashish Sabharwal; |
659 | Late Fusion with Triplet Margin Objective for Multimodal Ideology Prediction and Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce the task of multimodal ideology prediction, where a model predicts binary or five-point scale ideological leanings, given a text-image pair with political content. |
Changyuan Qiu; Winston Wu; Xinliang Frederick Zhang; Lu Wang; |
660 | Leveraging QA Datasets to Improve Generative Data Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose CONDA, an approach to further improve GLM's ability to generate synthetic data by reformulating data generation as context generation for a given question-answer (QA) pair and leveraging QA datasets for training context generators. |
Dheeraj Mekala; Tu Vu; Timo Schick; Jingbo Shang; |
661 | Meta-Learning Fast Weight Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Fast Weight Layers (FWLs), a neural component that provides the benefits of dynamic evaluation much more efficiently by expressing gradient updates as linear attention. |
Kevin Clark; Kelvin Guu; Ming-Wei Chang; Panupong Pasupat; Geoffrey Hinton; Mohammad Norouzi; |
662 | CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we introduce CTL++, a new diagnostic dataset based on compositions of unary symbolic functions. |
Róbert Csordás; Kazuki Irie; Juergen Schmidhuber; |
663 | Learning with Rejection for Abstractive Text Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a training objective for abstractive summarization based on rejection learning, in which the model learns whether or not to reject potentially noisy tokens. |
Meng Cao; Yue Dong; Jingyi He; Jackie Chi Kit Cheung; |
664 | Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In other words, label smoothing does not reflect the change in probability distribution mapped by a model over the course of training. To address this issue, we propose a regularization scheme that brings dynamic nature into the smoothing parameter by taking model probability distribution into account, thereby varying the parameter per instance. |
Dongkyu Lee; Ka Chun Cheung; Nevin Zhang; |
665 | Hard Gate Knowledge Distillation – Leverage Calibration for Robust and Reliable Language Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Knowledge of a teacher is considered a subject that holds inter-class relations which send a meaningful supervision to a student; hence, much effort has been put to find such knowledge to be distilled. In this paper, we explore a question that has been given little attention: "when to distill such knowledge." |
Dongkyu Lee; Zhiliang Tian; Yingxiu Zhao; Ka Chun Cheung; Nevin Zhang; |
666 | Are All Spurious Features in Natural Language Alike? An Analysis Through A Causal Lens Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, a more fine-grained treatment of spurious features is needed to specify the desired model behavior. We formalize this distinction using a causal model and probabilities of necessity and sufficiency, which delineates the causal relations between a feature and a label. |
Nitish Joshi; Xiang Pan; He He; |
667 | Correcting Diverse Factual Errors in Abstractive Summarization Via Post-Editing and Language Model Infilling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to generate hard, representative synthetic examples of non-factual summaries through infilling language models. |
Vidhisha Balachandran; Hannaneh Hajishirzi; William Cohen; Yulia Tsvetkov; |
668 | Coordinated Topic Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new problem called coordinated topic modeling that imitates human behavior while describing a text corpus. |
Pritom Saha Akash; Jie Huang; Kevin Chen-Chuan Chang; |
669 | Large Dual Encoders Are Generalizable Retrievers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: One widespread belief is that the bottleneck layer of a dual encoder, where the final score is simply a dot-product between a query vector and a passage vector, is too limited compared to models with fine-grained interactions between the query and the passage. In this paper, we challenge this belief by scaling up the size of the dual encoder model while keeping the bottleneck layer as a single dot-product with a fixed size. |
Jianmo Ni; Chen Qu; Jing Lu; Zhuyun Dai; Gustavo Hernandez Abrego; Ji Ma; Vincent Zhao; Yi Luan; Keith Hall; Ming-Wei Chang; Yinfei Yang; |
670 | CRIPP-VQA: Counterfactual Reasoning About Implicit Physical Properties Via Video Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce CRIPP-VQA, a new video question answering dataset for reasoning about the implicit physical properties of objects in a scene. |
Maitreya Patel; Tejas Gokhale; Chitta Baral; Yezhou Yang; |
671 | Entity-centered Cross-document Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Moreover, they utilize all the text paths in a document bag in a coarse-grained way, without considering the connections between these text paths. In this paper, we aim to address both of these shortcomings and push the state-of-the-art for cross-document RE. |
Fengqi Wang; Fei Li; Hao Fei; Jingye Li; Shengqiong Wu; Fangfang Su; Wenxuan Shi; Donghong Ji; Bo Cai; |
672 | Exploring Document-Level Literary Machine Translation with Parallel Paragraphs from World Literature Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The experts note that MT outputs contain not only mistranslations, but also discourse-disrupting errors and stylistic inconsistencies. To address these problems, we train a post-editing model whose output is preferred over normal MT output at a rate of 69% by experts. |
Katherine Thai; Marzena Karpinska; Kalpesh Krishna; Bill Ray; Moira Inghilleri; John Wieting; Mohit Iyyer; |
673 | Label-aware Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to model the utterance-slot-word structure by a multi-level contrastive learning framework at the utterance, slot and word levels to facilitate explicit alignment. |
Shining Liang; Linjun Shou; Jian Pei; Ming Gong; Wanli Zuo; Xianglin Zuo; Daxin Jiang; |
674 | Polyglot Prompt: Multilingual Multitask Prompt Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper aims for a potential architectural improvement for multilingual learning and asks: Can different tasks from different languages be modeled in a monolithic framework, i.e., without any task/language-specific module? |
Jinlan Fu; See-Kiong Ng; Pengfei Liu; |
675 | VisToT: Vision-Augmented Table-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For example, in the tourism domain, images can be used to infer knowledge such as the type of landmark (e.g., church), its architecture (e.g., Ancient Roman), and composition (e.g., white marble). Therefore, in this paper, we introduce the novel task of Vision-augmented Table-To-Text Generation (VisToT), defined as follows: given a table and an associated image, produce a descriptive sentence conditioned on the multimodal input. |
Prajwal Gatti; Anand Mishra; Manish Gupta; Mithun Das Gupta; |
676 | Generative Entity-to-Entity Stance Detection with Knowledge Graph Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we emphasize the imperative need for studying interactions among entities when inferring stances. |
Xinliang Frederick Zhang; Nick Beauchamp; Lu Wang; |
677 | Symptom Identification for Interpretable Detection of Multiple Mental Disorders on Social Media Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces PsySym, the first annotated symptom identification corpus of multiple psychiatric disorders, to facilitate further research progress. |
Zhiling Zhang; Siyuan Chen; Mengyue Wu; Kenny Zhu; |
678 | Improving Iterative Text Revision By Learning Where to Edit from Other Revision Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we aim to build an end-to-end text revision system that can iteratively generate helpful edits by explicitly detecting editable spans (where-to-edit) with their corresponding edit intents and then instructing a revision model to revise the detected edit spans. |
Zae Myung Kim; Wanyu Du; Vipul Raheja; Dhruv Kumar; Dongyeop Kang; |
679 | CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, it can be expensive to re-train well-established retrievers such as search engines that are originally developed for non-conversational queries. To facilitate their use, we develop a query rewriting model CONQRR that rewrites a conversational question in the context into a standalone question. |
Zeqiu Wu; Yi Luan; Hannah Rashkin; David Reitter; Hannaneh Hajishirzi; Mari Ostendorf; Gaurav Singh Tomar; |
680 | Specializing Multi-domain NMT Via Penalizing Low Mutual Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate domain-specific information through the lens of mutual information (MI) and propose a new objective that penalizes low MI, encouraging it to become higher. |
Jiyoung Lee; Hantae Kim; Hyunchang Cho; Edward Choi; Cheonbok Park; |
681 | A Simple Contrastive Learning Framework for Interactive Argument Pair Identification Via Argument-Context Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, current context-based methods achieve limited improvements since the entire context typically contains much irrelevant information. In this paper, we propose a simple contrastive learning framework to solve this problem by extracting valuable information from the context. |
Lida Shi; Fausto Giunchiglia; Rui Song; Daqian Shi; Tongtong Liu; Xiaolei Diao; Hao Xu; |
682 | Sentence-level Media Bias Analysis Informed By Discourse Structures Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we aim to identify sentences within an article that can illuminate and explain the overall bias of the entire article. |
Yuanyuan Lei; Ruihong Huang; Lu Wang; Nick Beauchamp; |
683 | Towards Efficient Dialogue Pre-training with Transferable and Interpretable Latent Structure Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a novel dialogue generation model with a latent structure that is easily transferable from the general domain to downstream tasks in a lightweight and transparent way. |
Xueliang Zhao; Lemao Liu; Tingchen Fu; Shuming Shi; Dongyan Zhao; Rui Yan; |
684 | An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Hence we call for attention to using trivial graphs as necessary baselines to design advanced knowledge fusion methods in the future. |
Changlong Yu; Tianyi Xiao; Lingpeng Kong; Yangqiu Song; Wilfred Ng; |
685 | Unsupervised Non-transferable Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel unsupervised non-transferable learning method for the text classification task that does not require annotated target domain data. |
Guangtao Zeng; Wei Lu; |
686 | Adaptive Contrastive Learning on Multimodal Transformer for Review Helpfulness Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To overcome the aforementioned issues, we propose Multi-modal Contrastive Learning for Multimodal Review Helpfulness Prediction (MRHP) problem, concentrating on mutual information between input modalities to explicitly elaborate cross-modal relations. |
Thong Nguyen; Xiaobao Wu; Anh Tuan Luu; Zhen Hai; Lidong Bing; |
687 | Adaptive Token-level Cross-lingual Feature Mixing for Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel token-level feature mixing method that enables the model to capture different features and dynamically determine the feature sharing across languages. |
Junpeng Liu; Kaiyu Huang; Jiuyi Li; Huan Liu; Jinsong Su; Degen Huang; |
688 | A Dataset for Hyper-Relational Extraction and A Cube-Filling Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Hence, we propose CubeRE, a cube-filling model inspired by table-filling approaches and explicitly considers the interaction between relation triplets and qualifiers. |
Yew Ken Chia; Lidong Bing; Sharifah Mahani Aljunied; Luo Si; Soujanya Poria; |
689 | Low-resource Neural Machine Translation with Cross-modal Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we turn to connect several low-resource languages to a particular high-resource one by additional visual modality. |
Zhe Yang; Qingkai Fang; Yang Feng; |
690 | Prompt-based Distribution Alignment for Domain Generalization in Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To improve domain generalization with prompting, we learn distributional invariance across source domains via two alignment regularization loss functions. |
Chen Jia; Yue Zhang; |
691 | Two Is Better Than Many? Binary Classification As An Effective Approach to Multi-Choice Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simple refactoring of multi-choice question answering (MCQA) tasks as a series of binary classifications. |
Deepanway Ghosal; Navonil Majumder; Rada Mihalcea; Soujanya Poria; |
692 | HEGEL: Hypergraph Transformer for Long Document Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes HEGEL, a hypergraph neural network for long document summarization by capturing high-order cross-sentence relations. |
Haopeng Zhang; Xiao Liu; Jiawei Zhang; |
693 | Adapting A Language Model While Preserving Its General Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper shows that the existing methods are suboptimal and proposes a novel method to perform a more informed adaptation of the knowledge in the LM by (1) soft-masking the attention heads based on their importance to best preserve the general knowledge in the LM and (2) contrasting the representations of the general and the full (both general and domain knowledge) to learn an integrated representation with both general and domain-specific knowledge. |
Zixuan Ke; Yijia Shao; Haowei Lin; Hu Xu; Lei Shu; Bing Liu; |
694 | Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In another vein, researchers propose new attention augmentation methods to make transformers more accurate, efficient and interpretable. In this paper, we combine these two lines of research in a human-in-the-loop pipeline to first discover important task-specific attention patterns. |
Raymond Li; Wen Xiao; Linzi Xing; Lanjun Wang; Gabriel Murray; Giuseppe Carenini; |
695 | Continual Training of Language Models for Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Adapting or post-training an LM using an unlabeled domain corpus can produce even better performance for end-tasks in the domain. This paper proposes the problem of continually extending an LM by incrementally post-training it with a sequence of unlabeled domain corpora to expand its knowledge without forgetting its previous skills. |
Zixuan Ke; Haowei Lin; Yijia Shao; Hu Xu; Lei Shu; Bing Liu; |
696 | Dictionary-Assisted Supervised Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce the dictionary-assisted supervised contrastive learning (DASCL) objective, allowing researchers to leverage specialized dictionaries when fine-tuning pretrained language models. |
Patrick Wu; Richard Bonneau; Joshua Tucker; Jonathan Nagler; |
697 | Fine-Tuning Pre-trained Transformers Into Decaying Fast Weights Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent work proposes kernel-based methods to approximate causal self-attention by replacing it with recurrent formulations with various update rules and feature maps to achieve O(1) time and memory complexity. We explore these approaches and find that they are unnecessarily complex, and propose a simple alternative – decaying fast weights – that runs fast on GPU, outperforms prior methods, and retains 99% of attention's performance for GPT-2. |
Huanru Henry Mao; |
698 | PRO-CS : An Instance-Based Prompt Composition Technique for Code-Switched Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel instance-based prompt composition technique, PRO-CS, for CS tasks that combine language and task knowledge. |
Srijan Bansal; Suraj Tripathi; Sumit Agarwal; Teruko Mitamura; Eric Nyberg; |
699 | SentBS: Sentence-level Beam Search for Controllable Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, current structure-controlling methods have limited effectiveness in enforcing the desired structure. To address this limitation, we propose a sentence-level beam search generation method (SentBS), where evaluation is conducted throughout the generation process to select suitable sentences for subsequent generations. |
Chenhui Shen; Liying Cheng; Lidong Bing; Yang You; Luo Si; |
700 | A Fine-grained Chinese Software Privacy Policy Dataset for Sequence Labeling and Regulation Compliant Identification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we construct the first Chinese privacy policy dataset, namely CA4P-483, to facilitate the sequence labeling tasks and regulation compliance identification between privacy policies and software. |
Kaifa Zhao; Le Yu; Shiyao Zhou; Jing Li; Xiapu Luo; Yat Fei Aemon Chiu; Yutong Liu; |
701 | Saving Dense Retriever from Shortcut Dependency in Conversational Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we demonstrate the existence of a retrieval shortcut in CS, which causes models to retrieve passages solely relying on partial history while disregarding the latest question. |
Sungdong Kim; Gangwoo Kim; |
702 | Graph-Induced Transformers for Efficient Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our work proposes the Graph-Induced Transformer (GIT) that applies graph-derived attention patterns directly into a PLM, without the need to employ external graph modules. |
Giwon Hong; Jeonghwan Kim; Junmo Kang; Sung-Hyon Myaeng; |
703 | DiscoSense: Commonsense Reasoning with Discourse Connectives Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present DiscoSense, a benchmark for commonsense reasoning via understanding a wide variety of discourse connectives. |
Prajjwal Bhargava; Vincent Ng; |
704 | Boosting Document-Level Relation Extraction By Mining and Injecting Logical Rules Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose MILR, a logic enhanced framework that boosts DocRE by Mining and Injecting Logical Rules. |
Shengda Fan; Shasha Mo; Jianwei Niu; |
705 | MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel multi-task training strategy for long text generation grounded on the cognitive theory of writing, which empowers the model to learn essential subskills needed for writing including planning and reviewing besides end-to-end generation. |
Zhe Hu; Hou Pong Chan; Lifu Huang; |
706 | Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a variational autoencoder with disentanglement priors, VAE-Dprior, for task-specific natural language generation with none or a handful of task-specific labeled examples. |
Zhuang Li; Lizhen Qu; Qiongkai Xu; Tongtong Wu; Tianyang Zhan; Gholamreza Haffari; |
707 | CISLR: Corpus for Indian Sign Language Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In recent years researchers have actively worked on sign languages like American Sign Language; however, Indian Sign Language is still far from data-driven tasks like machine translation. To address this gap, in this paper, we introduce a new dataset CISLR (Corpus for Indian Sign Language Recognition) for word-level recognition in Indian Sign Language using videos. |
Abhinav Joshi; Ashwani Bhat; Pradeep S; Priya Gole; Shashwat Gupta; Shreyansh Agarwal; Ashutosh Modi; |
708 | Mask The Correct Tokens: An Embarrassingly Simple Approach for Error Correction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Since the error rate of the incorrect sentence is usually low (e.g., 10%), the correction model can only learn to correct on limited error tokens but trivially copy on most tokens (correct tokens), which harms the effective training of error correction. In this paper, we argue that the correct tokens should be better utilized to facilitate effective training and then propose a simple yet effective masking strategy to achieve this goal. |
Kai Shen; Yichong Leng; Xu Tan; Siliang Tang; Yuan Zhang; Wenjie Liu; Edward Lin; |
709 | AMAL: Meta Knowledge-Driven Few-Shot Adapter Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we present a meta-learning-driven low-rank adapter pooling method, called AMAL, for leveraging pre-trained language models even with just a few data points. |
S. K. Hong; Tae Young Jang; |
710 | Discourse Context Predictability Effects in Hindi Word Order Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We test the hypothesis that discourse predictability influences Hindi syntactic choice. |
Sidharth Ranjan; Marten van Schijndel; Sumeet Agarwal; Rajakrishnan Rajkumar; |
711 | "Covid Vaccine Is Against Covid But Oxford Vaccine Is Made at Oxford!" Semantic Interpretation of Proper Noun Compounds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These are commonly used in short-form domains, such as news headlines, but are largely ignored in information-seeking applications. To address this limitation, we release a new manually annotated dataset, ProNCI, consisting of 22.5K proper noun compounds along with their free-form semantic interpretations. |
Keshav Kolluru; Gabriel Stanovsky; Mausam –; |
712 | Context Limitations Make Neural Language Models More Human-Like Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This study highlights a limitation of modern neural LMs as the model of choice for this purpose: there is a discrepancy between their context access capacities and that of humans. |
Tatsuki Kuribayashi; Yohei Oseki; Ana Brassard; Kentaro Inui; |
713 | A Generative Model for End-to-End Argument Mining with Reconstructed Positional Encoding and Constrained Pointer Mechanism Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the end-to-end AM task from a novel perspective by proposing a generative framework, in which the expected outputs of AM are framed as a simple target sequence. |
Jianzhu Bao; Yuhang He; Yang Sun; Bin Liang; Jiachen Du; Bing Qin; Min Yang; Ruifeng Xu; |
714 | Reflect, Not Reflex: Inference-Based Common Ground Improves Dialogue Response Quality Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Human communication relies on common ground (CG), the mutual knowledge and beliefs shared by participants, to produce coherent and interesting conversations. In this paper, we demonstrate that current response generation (RG) models produce generic and dull responses in dialogues because they act reflexively, failing to explicitly model CG, both due to the lack of CG in training data and the standard RG training procedure. |
Pei Zhou; Hyundong Cho; Pegah Jandaghi; Dong-Ho Lee; Bill Yuchen Lin; Jay Pujara; Xiang Ren; |
715 | FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, we propose segment act, an extension of dialog act from utterance level to segment level, and crowdsource a large-scale dataset for it. |
Jianqiao Zhao; Yanyang Li; Wanyu Du; Yangfeng Ji; Dong Yu; Michael Lyu; Liwei Wang; |
716 | FaD-VLP: Fashion Vision-and-Language Pre-training Towards Unified Retrieval and Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Additionally, these works have mainly been restricted to multimodal understanding tasks. To address these gaps, we make two key contributions. First, we propose a novel fashion-specific pre-training framework based on weakly-supervised triplets constructed from fashion image-text pairs. |
Suvir Mirchandani; Licheng Yu; Mengjiao Wang; Animesh Sinha; Wenwen Jiang; Tao Xiang; Ning Zhang; |
717 | MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel approach named MM-Align to address the missing-modality inference problem. |
Wei Han; Hui Chen; Min-Yen Kan; Soujanya Poria; |
718 | Evaluating The Knowledge Dependency of Questions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: They fail to evaluate the MCQ's ability to assess the student's knowledge of the corresponding target fact. To tackle this issue, we propose a novel automatic evaluation metric, coined Knowledge Dependent Answerability (KDA), which measures the MCQ's answerability given knowledge of the target fact. |
Hyeongdon Moon; Yoonseok Yang; Hangyeol Yu; Seunghyun Lee; Myeongho Jeong; Juneyoung Park; Jamin Shin; Minsam Kim; Seungtaek Choi; |
719 | MoSE: Modality Split and Ensemble for Multimodal Knowledge Graph Completion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose MoSE, a Modality Split representation learning and Ensemble inference framework for MKGC. |
Yu Zhao; Xiangrui Cai; Yike Wu; Haiwei Zhang; Ying Zhang; Guoqing Zhao; Ning Jiang; |
720 | Entropy-Based Vocabulary Substitution for Incremental Learning in Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an entropy-based vocabulary substitution (EVS) method that only needs to walk through new language pairs for incremental learning in a large-scale multilingual data update, while keeping the vocabulary size unchanged. |
Kaiyu Huang; Peng Li; Jin Ma; Yang Liu; |
721 | Eliciting Knowledge from Large Pre-Trained Models for Unsupervised Knowledge-Grounded Conversation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we answer the aforementioned question in unsupervised knowledge-grounded conversation. |
Yanyang Li; Jianqiao Zhao; Michael Lyu; Liwei Wang; |
722 | An Unsupervised, Geometric and Syntax-aware Quantification of Polysemy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: With scarce attention paid to polysemy in computational linguistics, and even scarcer attention toward quantifying polysemy, in this paper, we propose a novel, unsupervised framework to compute and estimate polysemy scores for words in multiple languages. |
Anmol Goel; Charu Sharma; Ponnurangam Kumaraguru; |
723 | Reorder and Then Parse, Fast and Accurate Discontinuous Constituency Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the observation that a discontinuous constituent tree can be simply transformed into a pseudo-continuous one by artificially reordering words in the sentence, we propose a novel reordering method and thereby construct fast and accurate discontinuous constituency parsing systems that work in a continuous way. |
Kailai Sun; Zuchao Li; Hai Zhao; |
724 | Making Science Simple: Corpora for The Lay Summarisation of Scientific Literature Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, current corpora for this task are limited in their size and scope, hindering the development of broadly applicable data-driven approaches. Aiming to rectify these issues, we present two novel lay summarisation datasets, PLOS (large-scale) and eLife (medium-scale), each of which contains biomedical journal articles alongside expert-written lay summaries. |
Tomas Goldsack; Zhihao Zhang; Chenghua Lin; Carolina Scarton; |
725 | Looking at The Overlooked: An Analysis on The Word-Overlap Bias in Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on an overlooked aspect of the overlap bias in the NLI models: the reverse word-overlap bias. |
Sara Rajaee; Yadollah Yaghoobzadeh; Mohammad Taher Pilehvar; |
726 | An Empirical Study on The Transferability of Transformer Modules in Parameter-efficient Fine-tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Parameter-efficient fine-tuning has garnered lots of attention in recent studies. On this subject, we investigate the capability of different transformer modules in transferring knowledge from a pre-trained model to a downstream task. |
Mohammad AkbarTajari; Sara Rajaee; Mohammad Taher Pilehvar; |
727 | CODER: An Efficient Framework for Improving Retrieval Through COntextual Document Embedding Reranking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the impact of ranking context – an often overlooked aspect of learning dense retrieval models. |
George Zerveas; Navid Rekabsaz; Daniel Cohen; Carsten Eickhoff; |
728 | AdapterShare: Task Correlation Modeling with Adapter Differentiation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose AdapterShare, an adapter differentiation method to explicitly model the task correlation among multiple tasks. |
Zhi Chen; Bei Chen; Lu Chen; Kai Yu; Jian-Guang Lou; |
729 | Rethinking Task-Specific Knowledge Distillation: Contextualized Corpus As Better Textbook Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To mitigate the issues in the two gapped corpora, we present a better textbook for the student to learn: contextualized corpus that contextualizes task corpus with large-scale general corpus through relevance-based text retrieval. |
Chang Liu; Chongyang Tao; Jianxin Liang; Tao Shen; Jiazhan Feng; Quzhe Huang; Dongyan Zhao; |
730 | Recovering Gold from Black Sand: Multilingual Dense Passage Retrieval with Hard and False Negative Samples Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel multilingual dense passage retrieval framework, mHFN, to recover and utilize hard and false negative samples. |
Tianhao Shen; Mingtong Liu; Ming Zhou; Deyi Xiong; |
731 | The "Problem" of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, this conventional practice assumes that there exists a *ground truth*, and neglects that there exists genuine human variation in labeling due to disagreement, subjectivity in annotation or multiple plausible answers. In this position paper, we argue that this big open problem of human label variation persists and critically needs more attention to move our field forward. |
Barbara Plank; |
732 | Quality Scoring of Source Words in Neural Translation Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a simple approach based on comparing the difference of probabilities from two language models. |
Priyesh Jain; Sunita Sarawagi; Tushar Tomar; |
733 | Pneg: Prompt-based Negative Response Generation for Dialogue Response Selection Task Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Nevertheless, collecting human-written adversarial responses is expensive, and existing synthesizing methods often have limited scalability. To overcome these limitations, this paper proposes a simple but efficient method for generating adversarial negative responses leveraging a large-scale language model. |
Nyoungwoo Lee; ChaeHun Park; Ho-Jin Choi; Jaegul Choo; |
734 | Facilitating Contrastive Learning of Discourse Relational Senses By Exploiting The Hierarchy of Sense Relations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we do more — incorporating the sense hierarchy into the recognition process itself and using it to select the negative examples used in contrastive learning. |
Wanqiu Long; Bonnie Webber; |
735 | Simplified Graph Learning for Inductive Short Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present SimpleSTC, which handles the inductive STC problem but only leverages words. |
Kaixin Zheng; Yaqing Wang; Quanming Yao; Dejing Dou; |
736 | Don't Stop Fine-Tuning: On Training Regimes for Few-Shot Cross-Lingual Transfer with Multilingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a systematic study focused on a spectrum of FS-XLT fine-tuning regimes, analyzing key properties such as effectiveness, (in)stability, and modularity. |
Fabian David Schmidt; Ivan Vulic; Goran Glavaš; |
737 | Towards Compositional Generalization in Code Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Thus we propose CTBERT, or Code Template BERT, representing codes using automatically extracted templates as building blocks. |
Hojae Han; Seung-won Hwang; Shuai Lu; Nan Duan; Seungtaek Choi; |
738 | Towards Relation Extraction from Speech Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new listening information extraction task, i.e., speech relation extraction. |
Tongtong Wu; Guitao Wang; Jinming Zhao; Zhaoran Liu; Guilin Qi; Yuan-Fang Li; Gholamreza Haffari; |
739 | Structural Constraints and Natural Language Inference for End-to-End Flowchart Grounded Dialog Response Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In such cases, it fails to understand the correct polarity of the answer. To overcome these issues, we propose Structure-Aware FLONET (SA-FLONET) which infuses structural constraints derived from the connectivity structure of flowcharts into the RAG framework. |
Dinesh Raghu; Suraj Joshi; Sachindra Joshi; Mausam; |
740 | SLICER: Sliced Fine-Tuning for Low-Resource Cross-Lingual Transfer for Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a simple yet highly effective approach for improving zero-shot transfer for NER to low-resource languages. |
Fabian David Schmidt; Ivan Vulic; Goran Glavaš; |
741 | EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce EdgeFormer, a parameter-efficient Transformer for on-device seq2seq generation under strict computation and memory constraints. |
Tao Ge; Si-Qing Chen; Furu Wei; |
742 | End-to-End Unsupervised Vision-and-Language Pre-training with Referring Expression Matching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we explore end-to-end unsupervised VLP with a vision encoder to directly encode images. |
Chi Chen; Peng Li; Maosong Sun; Yang Liu; |
743 | Faithful Knowledge Graph Explanations in Commonsense Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A common way of incorporating facts from the graph is to encode them separately from the question, and then combine the two representations to select an answer. In this paper, we argue that highly faithful graph-based explanations cannot be extracted from existing models of this type. |
Guy Aglionby; Simone Teufel; |
744 | KOLD: Korean Offensive Language Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present the Korean Offensive Language Dataset (KOLD) comprising 40,429 comments, which are annotated hierarchically with the type and the target of offensive language, accompanied by annotations of the corresponding text spans. |
Younghoon Jeong; Juhyun Oh; Jongwon Lee; Jaimeen Ahn; Jihyung Moon; Sungjoon Park; Alice Oh; |
745 | Evade The Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation Via Concentrating Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Nevertheless, these models tend to produce dull high-frequency phrases, severely hurting the diversity and novelty of generated text. In this work, we dig into the intrinsic mechanism of this problem and find that sparser attention values in Transformer could improve diversity. |
Wenhao Li; Xiaoyuan Yi; Jinyi Hu; Maosong Sun; Xing Xie; |
746 | The Better Your Syntax, The Better Your Semantics? Probing Pretrained Language Models for The English Comparative Correlative Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a first step towards assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investigation of their capability to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC). |
Leonie Weissweiler; Valentin Hofmann; Abdullatif Köksal; Hinrich Schütze; |
747 | ProofInfer: Generating Proof Via Iterative Hierarchical Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a divide-and-conquer algorithm to encode the proof tree as the plain text without losing structure information. |
Zichu Fei; Qi Zhang; Xin Zhou; Tao Gui; Xuanjing Huang; |
748 | ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present ECTSum, a new dataset with transcripts of earnings calls (ECTs), hosted by publicly traded companies, as documents, and expert-written short telegram-style bullet point summaries derived from corresponding Reuters articles. |
Rajdeep Mukherjee; Abhinav Bohra; Akash Banerjee; Soumya Sharma; Manjunath Hegde; Afreen Shaikh; Shivani Shrivastava; Koustuv Dasgupta; Niloy Ganguly; Saptarshi Ghosh; Pawan Goyal; |
749 | Cross-domain Generalization for AMR Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on our observation, we investigate two approaches to reduce the domain distribution divergence of text and AMR features, respectively. |
Xuefeng Bai; Sen Yang; Leyang Cui; Linfeng Song; Yue Zhang; |
750 | CiteSum: Citation Text-guided Scientific Extreme Summarization and Domain Adaptation with Limited Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a simple yet effective approach to automatically extracting TLDR summaries for scientific papers from their citation texts. |
Yuning Mao; Ming Zhong; Jiawei Han; |
751 | FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We utilize three popular language models and three learning algorithms to analyze the transferability between 132 source-target task pairs and create a baseline for future work. |
Alon Albalak; Yi-Lin Tuan; Pegah Jandaghi; Connor Pryor; Luke Yoffe; Deepak Ramachandran; Lise Getoor; Jay Pujara; William Yang Wang; |
752 | Do Children Texts Hold The Key To Commonsense Knowledge? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper explores whether children’s texts hold the key to commonsense knowledge compilation, based on the hypothesis that such content makes fewer assumptions on the reader’s knowledge, and therefore spells out commonsense more explicitly. |
Julien Romero; Simon Razniewski; |
753 | On The Limitations of Reference-Free Evaluations of Generated Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, we recommend that reference-free metrics should be used as diagnostic tools for analyzing and understanding model behavior instead of measures of how well models perform a task, in which the goal is to achieve as high of a score as possible. |
Daniel Deutsch; Rotem Dror; Dan Roth; |
754 | Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recently, an approximation to minimum Bayes risk (MBR) decoding has been proposed as an alternative decision rule that would likely not suffer from the same problems. We analyse this approximation and establish that it has no equivalent to the beam search curse. |
Bryan Eikema; Wilker Aziz; |
755 | IndicXNLI: Evaluating Multilingual Inference for Indian Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce INDICXNLI, an NLI dataset for 11 Indic languages. |
Divyanshu Aggarwal; Vivek Gupta; Anoop Kunchukuttan; |
756 | Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present an explorative study on “model cascading”, a simple technique that utilizes a collection of models of varying capacities to accurately yet efficiently output predictions. |
Neeraj Varshney; Chitta Baral; |
757 | Semantic Simplification for Sentiment Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we enhance the original text with a sentiment-driven simplified clause to intensify its sentiment. |
Xiaotong Jiang; Zhongqing Wang; Guodong Zhou; |
758 | XPrompt: Exploring The Extreme of Prompt Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While prompt tuning has gradually reached the performance level of fine-tuning as the model scale increases, there is still a large performance gap between prompt tuning and fine-tuning for models of moderate and small scales (typically less than 11B parameters). In this paper, we empirically show that the trained prompt tokens can have a negative impact on a downstream task and thus degrade its performance. |
Fang Ma; Chen Zhang; Lei Ren; Jingang Wang; Qifan Wang; Wei Wu; Xiaojun Quan; Dawei Song; |
759 | Rethinking The Role of Demonstrations: What Makes In-Context Learning Work? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that ground truth demonstrations are in fact not required: randomly replacing labels in the demonstrations barely hurts performance on a range of classification and multi-choice tasks, consistently over 12 different models including GPT-3. |
Sewon Min; Xinxi Lyu; Ari Holtzman; Mikel Artetxe; Mike Lewis; Hannaneh Hajishirzi; Luke Zettlemoyer; |
760 | The Curious Case of Control Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We find that models can be categorized by behavior into three separate groups, with broad differences between the groups. |
Elias Stengel-Eskin; Benjamin Van Durme; |
761 | SHARE: A System for Hierarchical Assistive Recipe Editing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To help them, we propose the task of controllable recipe editing: adapt a base recipe to satisfy a user-specified dietary constraint. |
Shuyang Li; Yufei Li; Jianmo Ni; Julian McAuley; |
762 | IM^2: An Interpretable and Multi-category Integrated Metric Framework for Automatic Dialogue Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To mitigate the problem, this paper proposes an interpretable, multi-faceted, and controllable framework IM^2 (Interpretable and Multi-category Integrated Metric) to combine a large number of metrics which are good at measuring different qualities. |
Zhihua Jiang; Guanghui Ye; Dongning Rao; Di Wang; Xin Miao; |
763 | PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the challenge, we introduce PEVL that enhances the pre-training and prompt tuning of VLP models with explicit object position modeling. |
Yuan Yao; Qianyu Chen; Ao Zhang; Wei Ji; Zhiyuan Liu; Tat-Seng Chua; Maosong Sun; |
764 | Pre-training Language Models with Deterministic Factual Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, some analyses reveal that PLMs fail to perform it robustly, e.g., being sensitive to the changes of prompts when extracting factual knowledge. To mitigate this issue, we propose to let PLMs learn the deterministic relationship between the remaining context and the masked content. |
Shaobo Li; Xiaoguang Li; Lifeng Shang; Chengjie Sun; Bingquan Liu; Zhenzhou Ji; Xin Jiang; Qun Liu; |
765 | Finding Skill Neurons in Pre-trained Transformer-based Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we find that after prompt tuning for specific tasks, the activations of some neurons within pre-trained Transformers are highly predictive of the task labels. |
Xiaozhi Wang; Kaiyue Wen; Zhengyan Zhang; Lei Hou; Zhiyuan Liu; Juanzi Li; |
766 | Prompt Conditioned VAE: Enhancing Generative Replay for Lifelong Learning in Task-Oriented Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel method, prompt conditioned VAE for lifelong learning (PCLL), to enhance generative replay by incorporating tasks' statistics. |
Yingxiu Zhao; Yinhe Zheng; Zhiliang Tian; Chang Gao; Jian Sun; Nevin L. Zhang; |
767 | PreQuEL: Quality Estimation of Machine Translation Outputs in Advance Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present the task of PreQuEL, Pre-(Quality-Estimation) Learning. |
Shachar Don-Yehiya; Leshem Choshen; Omri Abend; |
768 | Can Transformers Reason in Fragments of Natural Language? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we carry out a large-scale empirical study investigating the detection of formally valid inferences in controlled fragments of natural language for which the satisfiability problem becomes increasingly complex. |
Viktor Schlegel; Kamen Pavlov; Ian Pratt-Hartmann; |
769 | Textless Speech Emotion Conversion Using Discrete & Decomposed Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we cast the problem of emotion conversion as a spoken language translation task. |
Felix Kreuk; Adam Polyak; Jade Copet; Eugene Kharitonov; Tu Anh Nguyen; Morgan Rivière; Wei-Ning Hsu; Abdelrahman Mohamed; Emmanuel Dupoux; Yossi Adi; |
770 | Textual Backdoor Attacks Can Be More Harmful Via Two Simple Tricks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we find two simple tricks that can make existing textual backdoor attacks much more harmful. |
Yangyi Chen; Fanchao Qi; Hongcheng Gao; Zhiyuan Liu; Maosong Sun; |
771 | Why Should Adversarial Perturbations Be Imperceptible? Rethink The Research Paradigm in Adversarial NLP Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we rethink the research paradigm of textual adversarial samples in security scenarios. |
Yangyi Chen; Hongcheng Gao; Ganqu Cui; Fanchao Qi; Longtao Huang; Zhiyuan Liu; Maosong Sun; |
772 | Retrieval Augmented Visual Question Answering with Outside Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we propose a joint training scheme which includes differentiable DPR integrated with answer generation so that the system can be trained in an end-to-end fashion. |
Weizhe Lin; Bill Byrne; |
773 | Instance Regularization for Discriminative Language Model Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To model explicit signals of instance contribution, this work proposes to estimate the complexity of restoring the original sentences from corrupted ones in language model pre-training. |
Zhuosheng Zhang; Hai Zhao; Ming Zhou; |
774 | GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To bridge the data and evaluation gaps, we propose a benchmark testset for target evaluation on Chinese-English ZP translation. |
Mingzhou Xu; Longyue Wang; Derek F. Wong; Hongye Liu; Linfeng Song; Lidia S. Chao; Shuming Shi; Zhaopeng Tu; |
775 | ScienceWorld: Is Your Agent Smarter Than A 5th Grader? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present ScienceWorld, a benchmark to test agents' scientific reasoning abilities in a new interactive text environment at the level of a standard elementary school science curriculum. |
Ruoyao Wang; Peter Jansen; Marc-Alexandre Côté; Prithviraj Ammanabrolu; |
776 | Improving Embeddings Representations for Comparing Higher Education Curricula: A Use Case in Computing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an approach for comparing curricula of study programs in higher education. |
Jeffri Murrugarra-Llerena; Fernando Alva-Manchego; Nils Murrugarra-LLerena; |
777 | Mitigating Spurious Correlation in Natural Language Understanding with Counterfactual Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a causal analysis framework to help debias NLU models. |
Can Udomcharoenchaikit; Wuttikorn Ponwitayarat; Patomporn Payoungkhamdee; Kanruethai Masuk; Weerayut Buaphet; Ekapol Chuangsuwanich; Sarana Nutanong; |
778 | End-to-End Neural Discourse Deixis Resolution in Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We adapt Lee et al.'s (2018) span-based entity coreference model to the task of end-to-end discourse deixis resolution in dialogue, specifically by proposing extensions to their model that exploit task-specific characteristics. |
Shengjie Li; Vincent Ng; |
779 | Balancing Out Bias: Achieving Fairness Through Balanced Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To achieve Equal Opportunity fairness, such as equal job opportunity without regard to demographics, this paper introduces a simple, but highly effective, objective for countering bias using balanced training. |
Xudong Han; Timothy Baldwin; Trevor Cohn; |
780 | Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we adapt prompt-based few-shot learning to ELECTRA and show that it outperforms masked language models in a wide range of tasks. |
Mengzhou Xia; Mikel Artetxe; Jingfei Du; Danqi Chen; Veselin Stoyanov; |
781 | Identifying Physical Object Use in Sentences Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We define a new task called Object Use Classification that determines whether a physical object mentioned in a sentence was used or likely will be used. We introduce a new dataset for this task and present a classification model that exploits data augmentation methods and FrameNet when fine-tuning a pre-trained language model. |
Tianyu Jiang; Ellen Riloff; |
782 | CDialog: A Multi-turn Covid-19 Conversation Dataset for Entity-Aware Dialog Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Abstract: n/a … |
Deeksha Varshney; Aizan Zafar; Niranshu Behera; Asif Ekbal; |
783 | Robustifying Sentiment Classification By Maximally Exploiting Few Counterfactuals Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thus, we propose a novel solution that only requires annotation of a small fraction (e.g., 1%) of the original training data, and uses automatic generation of extra counterfactuals in an encoding vector space. |
Maarten De Raedt; Fréderic Godin; Chris Develder; Thomas Demeester; |
784 | Data-Efficient Playlist Captioning With Musical and Linguistic Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose PlayNTell, a data-efficient multi-modal encoder-decoder model for automatic playlist captioning. |
Giovanni Gabbolini; Romain Hennequin; Elena Epure; |
785 | Improved Grammatical Error Correction By Ranking Elementary Edits Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We offer a two-stage reranking method for grammatical error correction: the first model serves as edit generator, while the second classifies the proposed edits as correct or false. |
Alexey Sorokin; |
786 | Improving Tokenisation By Alternative Treatment of Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Abstract: n/a … |
Edward Gow-Smith; Harish Tayyar Madabushi; Carolina Scarton; Aline Villavicencio; |
787 | GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Furthermore, we develop an automated mechanism for maintaining annotator quality via a probabilistic model that detects and excludes noisy annotators. Putting these lessons together, we introduce GENIE: a system for running standardized human evaluations across different generation tasks. |
Daniel Khashabi; Gabriel Stanovsky; Jonathan Bragg; Nicholas Lourie; Jungo Kasai; Yejin Choi; Noah A. Smith; Daniel Weld; |
788 | Attentional Probe: Estimating A Module's Functional Potential Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Attentional Probe: Estimating a Module's Functional Potential |
Tiago Pimentel; Josef Valvoda; Niklas Stoehr; Ryan Cotterell; |
789 | When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This requires additional training data and results in ever-growing datasets. We present the first systematic investigation into this incremental symbol learning scenario. |
Elias Stengel-Eskin; Emmanouil Antonios Platanios; Adam Pauls; Sam Thomson; Hao Fang; Benjamin Van Durme; Jason Eisner; Yu Su; |
790 | Zero-shot Cross-lingual Transfer of Prompt-based Tuning with A Unified Multilingual Prompt Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To alleviate the effort of designing different prompts for multiple languages, we propose a novel model that uses a unified prompt for all languages, called UniPrompt. |
Lianzhe Huang; Shuming Ma; Dongdong Zhang; Furu Wei; Houfeng Wang; |
791 | Three Real-World Datasets and Neural Computational Models for Classification Tasks in Patent Landscaping Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: With this paper, we release three labeled datasets for PLS-oriented classification tasks covering two diverse domains. |
Subhash Pujari; Jannik Strötgen; Mark Giereth; Michael Gertz; Annemarie Friedrich; |
792 | Topic Modeling With Topological Data Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an unsupervised topic modelling method which harnesses Topological Data Analysis (TDA) to extract a topological skeleton of the manifold upon which contextualised word embeddings lie. |
Ciarán Byrne; Danijela Horak; Karo Moilanen; Amandla Mabona; |
793 | Predicting Fine-Tuning Performance with Probing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper explores the utility of probing deep NLP models to extract a proxy signal widely used in model development: the fine-tuning performance. |
Zining Zhu; Soroosh Shahtalebi; Frank Rudzicz; |
794 | Diverse Parallel Data Synthesis for Cross-Database Adaptation of Text-to-SQL Parsers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present ReFill, a framework for synthesizing high-quality and textually diverse parallel datasets for adapting Text-to-SQL parsers. |
Abhijeet Awasthi; Ashutosh Sathe; Sunita Sarawagi; |
795 | Agent-Specific Deontic Modality Detection in Legal Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we introduce LEXDEMOD, a corpus of English contracts annotated with deontic modality expressed with respect to a contracting party or agent, along with the modal triggers. |
Abhilasha Sancheti; Aparna Garimella; Balaji Vasan Srinivasan; Rachel Rudinger; |
796 | COLD: A Benchmark for Chinese Offensive Language Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we propose a benchmark, COLD, for Chinese offensive language analysis, including a Chinese Offensive Language Dataset (COLDATASET) and a baseline detector (COLDETECTOR) which is trained on the dataset. |
Jiawen Deng; Jingyan Zhou; Hao Sun; Chujie Zheng; Fei Mi; Helen Meng; Minlie Huang; |
797 | Fixing Model Bugs with Natural Language Patches Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We model the task of determining if a patch applies separately from the task of integrating patch information, and show that with a small amount of synthetic data, we can teach models to effectively use real patches on real data: 1 to 7 patches improve accuracy by ~1–4 accuracy points on different slices of a sentiment analysis dataset, and F1 by 7 points on a relation extraction dataset. |
Shikhar Murty; Christopher Manning; Scott Lundberg; Marco Tulio Ribeiro; |
798 | WeDef: Weakly Supervised Backdoor Defense for Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To defend different trigger types at once, we start from the class-irrelevant nature of the poisoning process and propose a novel weakly supervised backdoor defense framework WeDef. |
Lesheng Jin; Zihan Wang; Jingbo Shang; |
799 | Interventional Training for Out-Of-Distribution Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel interventional training method called Bottom-up Automatic Intervention (BAI) that performs multi-granular intervention with identified multifactorial confounders. |
Sicheng Yu; Jing Jiang; Hao Zhang; Yulei Niu; Qianru Sun; Lidong Bing; |
800 | Pseudo-Relevance for Enhancing Document Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our contribution is to reduce the size of the multi-vector representation, without compromising the effectiveness, supervised by query logs. |
Jihyuk Kim; Seung-won Hwang; Seoho Song; Hyeseon Ko; Young-In Song; |
801 | ZeroGen: Efficient Zero-shot Learning Via Dataset Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study a flexible and efficient zero-shot learning method, ZeroGen. |
Jiacheng Ye; Jiahui Gao; Qintong Li; Hang Xu; Jiangtao Feng; Zhiyong Wu; Tao Yu; Lingpeng Kong; |
802 | Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we use controlled nearest neighbor sampling over citation graph embeddings for contrastive learning. |
Malte Ostendorff; Nils Rethmeier; Isabelle Augenstein; Bela Gipp; Georg Rehm; |
803 | SPE: Symmetrical Prompt Enhancement for Fact Probing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose Symmetrical Prompt Enhancement (SPE), a continuous prompt-based method for factual probing in PLMs that leverages the symmetry of the task by constructing symmetrical prompts for subject and object prediction. |
Yiyuan Li; Tong Che; Yezhen Wang; Zhengbao Jiang; Caiming Xiong; Snigdha Chaturvedi; |
804 | Efficient Large Scale Language Modeling with Mixtures of Experts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning. |
Mikel Artetxe; Shruti Bhosale; Naman Goyal; Todor Mihaylov; Myle Ott; Sam Shleifer; Xi Victoria Lin; Jingfei Du; Srinivasan Iyer; Ramakanth Pasunuru; Giridharan Anantharaman; Xian Li; Shuohui Chen; Halil Akin; Mandeep Baines; Louis Martin; Xing Zhou; Punit Singh Koura; Brian O'Horo; Jeffrey Wang; Luke Zettlemoyer; Mona Diab; Zornitsa Kozareva; Veselin Stoyanov; |
805 | MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a new natural language processing (NLP) application for identifying medical jargon terms potentially difficult for patients to comprehend from electronic health record (EHR) notes. |
Sunjae Kwon; Zonghai Yao; Harmon Jordan; David Levy; Brian Corner; Hong Yu; |
806 | Discourse Comprehension: A Question Answering Framework to Represent Sentence Connections Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A key challenge in building and evaluating models for this type of discourse comprehension is the lack of annotated data, especially since collecting answers to such questions requires high cognitive load for annotators. This paper presents a novel paradigm that enables scalable data collection targeting the comprehension of news documents, viewing these questions through the lens of discourse. |
Wei-Jen Ko; Cutter Dalton; Mark Simmons; Eliza Fisher; Greg Durrett; Junyi Jessy Li; |
807 | Learning to Generate Overlap Summaries Through Noisy Synthetic Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: One of the major challenges for solving this task is the lack of existing datasets for supervised training. To address this challenge, we propose a novel data augmentation technique, which allows us to create large amount of synthetic data for training a seq-to-seq model that can perform the SOS task. |
Naman Bansal; Mousumi Akter; Shubhra Kanti Karmaker Santu; |
808 | Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models. In this work, we analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias (one target sequence can only be mapped to one source sequence), and the tendency to memorize whole examples rather than separating structures from contents. |
Yichen Jiang; Xiang Zhou; Mohit Bansal; |
809 | Directions for NLP Practices Applied to Online Hate Speech Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, hate speech is a deeply complex and situated concept that eludes such static and disembodied practices. In this position paper, we critically reflect on these methodologies for hate speech detection, we argue that many conventions in NLP are poorly suited for the problem and encourage researchers to develop methods that are more appropriate for the task. |
Paula Fortuna; Monica Dominguez; Leo Wanner; Zeerak Talat; |
810 | Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transformers for AS2, and mitigate the requirement of large labeled datasets. |
Luca Di Liello; Siddhant Garg; Luca Soldaini; Alessandro Moschitti; |
811 | OpenCQA: Open-ended Question Answering with Charts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Answering such questions is often difficult and time-consuming, as it requires substantial cognitive and perceptual effort. To address this challenge, we introduce a new task called OpenCQA, where the goal is to answer an open-ended question about a chart with descriptive texts. |
Shankar Kantharaj; Xuan Long Do; Rixie Tiffany Leong; Jia Qing Tan; Enamul Hoque; Shafiq Joty; |
812 | A Systematic Investigation of Commonsense Knowledge in Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Language models (LMs) trained on large amounts of data have shown impressive performance on many NLP tasks under the zero-shot and few-shot setup. Here we aim to better understand the extent to which such models learn commonsense knowledge, a critical component of many NLP applications. |
Xiang Lorraine Li; Adhiguna Kuncoro; Jordan Hoffmann; Cyprien de Masson d'Autume; Phil Blunsom; Aida Nematzadeh; |
813 | Transforming Sequence Tagging Into A Seq2Seq Task Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we rigorously study different formats one could use for casting input text sentences and their output labels into the input and target (i.e., output) of a Seq2Seq model. |
Karthik Raman; Iftekhar Naim; Jiecao Chen; Kazuma Hashimoto; Kiran Yalasangi; Krishna Srinivasan; |
814 | CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the keyword-question rewriting task to improve query understanding capabilities of NLU systems for all surface forms. To achieve this, we present CycleKQR, an unsupervised approach, enabling effective rewriting between keyword and question queries using non-parallel data. |
Andrea Iovine; Anjie Fang; Besnik Fetahu; Jie Zhao; Oleg Rokhlenko; Shervin Malmasi; |
815 | Model Criticism for Long-Form Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we propose to apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of the generated text. |
Yuntian Deng; Volodymyr Kuleshov; Alexander Rush; |
816 | Improving Faithfulness By Augmenting Negative Summaries from Fake Documents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the commonly used maximum likelihood training does not disentangle factual errors from other model errors. To address this issue, we propose a back-translation-style approach to augment negative samples that mimic factual errors made by the model. |
Tianshu Wang; Faisal Ladhak; Esin Durmus; He He; |
817 | Joint Completion and Alignment of Multilingual Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Many effective algorithms have been proposed for completion and alignment as separate tasks. Here we show that these tasks are synergistic and best solved together. |
Soumen Chakrabarti; Harkanwar Singh; Shubham Lohiya; Prachi Jain; Mausam; |
818 | Offer A Different Perspective: Modeling The Belief Alignment of Arguments in Multi-party Debates Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We adopt a hierarchical generative Variational Autoencoder as our model and impose structural constraints that reflect competing hypotheses about the nature of argumentation. |
Suzanna Sia; Kokil Jaidka; Hansin Ahuja; Niyati Chhaya; Kevin Duh; |
819 | A Federated Approach to Predicting Emojis in Hindi Tweets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we seek to address the dual concerns of emphasising high resource languages for emoji prediction and risking the privacy of people's data. |
Deep Gandhi; Jash Mehta; Nirali Parekh; Karan Waghela; Lynette D'Mello; Zeerak Talat; |
820 | Injecting Domain Knowledge in Language Models for Task-oriented Dialogue Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we showcase the advantages of injecting domain-specific knowledge prior to fine-tuning on TOD tasks. |
Denis Emelin; Daniele Bonadiman; Sawsan Alqahtani; Yi Zhang; Saab Mansour; |
821 | TASA: Deceiving Question Answering Models By Twin Answer Sentences Attack Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models that produces fluent and grammatical adversarial contexts while maintaining gold answers. |
Yu Cao; Dianqi Li; Meng Fang; Tianyi Zhou; Jun Gao; Yibing Zhan; Dacheng Tao; |
822 | Improving Low-Resource Languages in Pre-Trained Multilingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, languages with small available monolingual corpora are often not well-supported by these models leading to poor performance. We propose an unsupervised approach to improve the cross-lingual representations of low-resource languages by bootstrapping word translation pairs from monolingual corpora and using them to improve language alignment in pre-trained language models. |
Viktor Hangya; Hossain Shaikh Saadi; Alexander Fraser; |
823 | SCROLLS: Standardized CompaRison Over Long Language Sequences Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce SCROLLS, a suite of tasks that require reasoning over long texts. |
Uri Shaham; Elad Segal; Maor Ivgi; Avia Efrat; Ori Yoran; Adi Haviv; Ankit Gupta; Wenhan Xiong; Mor Geva; Jonathan Berant; Omer Levy; |
824 | PAR: Political Actor Representation Learning with Social Context and Expert Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose PAR, a Political Actor Representation learning framework that jointly leverages social context and expert knowledge. |
Shangbin Feng; Zhaoxuan Tan; Zilong Chen; Ningnan Wang; Peisheng Yu; Qinghua Zheng; Xiaojun Chang; Minnan Luo; |
825 | JDDC 2.1: A Multimodal Chinese Dialogue Dataset with Joint Tasks of Query Rewriting, Response Generation, Discourse Parsing, and Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we construct JDDC 2.1, a large-scale multimodal multi-turn dialogue dataset collected from a mainstream Chinese E-commerce platform, containing about 246K dialogue sessions, 3M utterances, and 507K images, along with product knowledge bases and image category annotations. |
Nan Zhao; Haoran Li; Youzheng Wu; Xiaodong He; |
826 | PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As one answer, we propose a novel Peer-Contrastive Learning (PCL) with diverse augmentations. |
Qiyu Wu; Chongyang Tao; Tao Shen; Can Xu; Xiubo Geng; Daxin Jiang; |
827 | Digging Errors in NMT: Evaluating and Understanding Model Errors from Partial Hypothesis Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle the problem of exponentially large space, we propose two approximation methods, top region evaluation along with an exact top-k decoding algorithm, which finds top-ranked hypotheses in the whole hypothesis space, and Monte Carlo sampling evaluation, which simulates hypothesis space from a broader perspective. |
Jianhao Yan; Chenming Wu; Fandong Meng; Jie Zhou; |
828 | DialogConv: A Lightweight Fully Convolutional Network for Multi-view Response Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel lightweight fully convolutional architecture, called DialogConv, for response selection. |
Yongkang Liu; Shi Feng; Wei Gao; Daling Wang; Yifei Zhang; |