Paper Digest: ACL 2023 Highlights
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2023, it will be held in Toronto, Canada.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. These models power this website and are behind our services, including "search engine", "summarization", "question answering", and "literature review".
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to get updated with new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: ACL 2023 Highlights
Paper | Author(s) | |
---|---|---|
1 | One Cannot Stand for Everyone! Leveraging Multiple User Simulators to Train Task-oriented Dialogue Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a framework called MUST to optimize ToD systems via leveraging Multiple User SimulaTors. |
Yajiao Liu; Xin Jiang; Yichun Yin; Yasheng Wang; Fei Mi; Qun Liu; Xiang Wan; Benyou Wang; |
2 | SafeConv: Explaining and Correcting Conversational Unsafe Behavior Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we construct a new dataset called SafeConv for the research of conversational safety: (1) Besides the utterance-level safety labels, SafeConv also provides unsafe spans in an utterance, information able to indicate which words contribute to the detected unsafe behavior; (2) SafeConv provides safe alternative responses to continue the conversation when unsafe behavior detected, guiding the conversation to a gentle trajectory. |
Mian Zhang; Lifeng Jin; Linfeng Song; Haitao Mi; Wenliang Chen; Dong Yu; |
3 | Detecting and Mitigating Hallucinations in Machine Translation: Model Internal Workings Alone Do Well, Sentence Similarity Even Better Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to use a method that evaluates the percentage of the source contribution to a generated translation. |
David Dale; Elena Voita; Loic Barrault; Marta R. Costa-juss�; |
4 | Explainable Recommendation with Personalized Review Retrieval and Aspect Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, historical user reviews of items are often insufficient, making it challenging to ensure the precision of generated explanation text. To address this issue, we propose a novel model, ERRA (Explainable Recommendation by personalized Review retrieval and Aspect learning). |
Hao Cheng; Shuo Wang; Wensheng Lu; Wei Zhang; Mingyang Zhou; Kezhong Lu; Hao Liao; |
5 | Binary and Ternary Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We approach the problem with a mix of statistics-based quantization for the weights and elastic quantization of the activations and demonstrate the first ternary and binary transformer models on the downstream tasks of summarization and machine translation. |
Zechun Liu; Barlas Oguz; Aasish Pappu; Yangyang Shi; Raghuraman Krishnamoorthi; |
6 | Span-Selective Linear Attention Transformers for Effective and Robust Schema-Guided Dialogue State Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SPLAT, a novel architecture which achieves better generalization and efficiency than prior approaches by constraining outputs to a limited prediction space. |
Bj�rn Bebensee; Haejun Lee; |
7 | EM Pre-training for Multi-party Dialogue Response Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, due to the lack of annotated addressee labels in multi-party dialogue datasets, it is hard to use them to pre-train a response generation model for multi-party dialogues. To tackle this obstacle, we propose an Expectation-Maximization (EM) approach that iteratively performs the expectation steps to generate addressee labels, and the maximization steps to optimize a response generation model. |
Yiyang Li; Hai Zhao; |
8 | ACLM: A Selective-Denoising Based Generative Data Augmentation Approach for Low-Resource Complex NER Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present ACLM Attention-map aware keyword selection for Conditional Language Model fine-tuning), a novel data augmentation approach based on conditional generation, to address the data scarcity problem in low-resource complex NER. |
Sreyan Ghosh; Utkarsh Tyagi; Manan Suri; Sonal Kumar; Ramaneswaran S; Dinesh Manocha; |
9 | Natural Language to Code Generation in Interactive Data Science Notebooks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It requires a model to understand rich multi-modal contexts, such as existing notebook cells and their execution states as well as previous turns of interaction. To establish a strong baseline on this challenging task, we develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs. |
Pengcheng Yin; Wen-Ding Li; Kefan Xiao; Abhishek Rao; Yeming Wen; Kensen Shi; Joshua Howland; Paige Bailey; Michele Catasta; Henryk Michalewski; Oleksandr Polozov; Charles Sutton; |
10 | Subset Retrieval Nearest Neighbor Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose �Subset kNN-MT�, which improves the decoding speed of kNN-MT by two methods: (1) retrieving neighbor target tokens from a subset that is the set of neighbor sentences of the input sentence, not from all sentences, and (2) efficient distance computation technique that is suitable for subset neighbor search using a look-up table. |
Hiroyuki Deguchi; Taro Watanabe; Yusuke Matsui; Masao Utiyama; Hideki Tanaka; Eiichiro Sumita; |
11 | MIL-Decoding: Detoxifying Language Models at Token-Level Via Multiple Instance Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce MIL-Decoding, which detoxifies language models at token-level by interpolating it with a trained multiple instance learning (MIL) network. |
Xu Zhang; Xiaojun Wan; |
12 | Dependency Resolution at The Syntax-semantics Interface: Psycholinguistic and Computational Insights on Control Dependencies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our results show that while humans correctly identify the (un)acceptability of the strings, language models often fail to identify the correct antecedent in non-adjacent dependencies, showing their reliance on linearity. |
Iria de-Dios-Flores; Juan Garcia Amboage; Marcos Garcia; |
13 | Open-ended Long Text Generation Via Masked Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the long text generation capability of MLMs, we introduce two simple yet effective strategies for the iterative NAR model: dynamic sliding window attention (DSWA) and linear temperature decay (LTD). |
Xiaobo Liang; Zecheng Tang; Juntao Li; Min Zhang; |
14 | A Method for Studying Semantic Construal in Grammatical Constructions with Interpretable Contextual Embedding Spaces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study semantic construal in grammatical constructions using large language models. |
Gabriella Chronis; Kyle Mahowald; Katrin Erk; |
15 | Holographic CCG Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a method for formulating CCG as a recursive composition in a continuous vector space. |
Ryosuke Yamaki; Tadahiro Taniguchi; Daichi Mochihashi; |
16 | Prompts Can Play Lottery Tickets Well: Achieving Lifelong Information Extraction Via Lottery Prompt Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the UIE system under a more challenging yet practical scenario, i. e. , �lifelong learning� settings, to evaluate its abilities in three aspects, including knowledge sharing and expansion, catastrophic forgetting prevention, and rapid generalization on few-shot and unseen tasks. To achieve these three goals, we present a novel parameter- and deployment-efficient prompt tuning method namely Lottery Prompt Tuning (LPT). |
Zujie Liang; Feng Wei; Yin Jie; Yuxi Qian; Zhenghong Hao; Bing Han; |
17 | Retrieve-and-Sample: Document-level Event Argument Extraction Via Hybrid Retrieval Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate various retrieval settings from the input and label distribution views in this paper. |
Yubing Ren; Yanan Cao; Ping Guo; Fang Fang; Wei Ma; Zheng Lin; |
18 | WeCheck: Strong Factual Consistency Checker Via Weakly Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Bias in synthetic text or upstream tasks makes them perform poorly on text actually generated by language models, especially for general evaluation for various tasks. To alleviate this problem, we propose a weakly supervised framework named WeCheck that is directly trained on actual generated samples from language models with weakly annotated labels. |
Wenhao Wu; Wei Li; Xinyan Xiao; Jiachen Liu; Sujian Li; Yajuan Lyu; |
19 | AMR-based Network for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, further improvement is limited due to the potential mismatch between the dependency tree as a syntactic structure and the sentiment classification as a semantic task. To alleviate this gap, we replace the syntactic dependency tree with the semantic structure named Abstract Meaning Representation (AMR) and propose a model called AMR-based Path Aggregation Relational Network (APARN) to take full advantage of semantic structures. |
Fukun Ma; Xuming Hu; Aiwei Liu; Yawen Yang; Shuang Li; Philip S. Yu; Lijie Wen; |
20 | Text Adversarial Purification As Defense Against Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel adversarial purification method that focuses on defending against textual adversarial attacks. |
Linyang Li; Demin Song; Xipeng Qiu; |
21 | SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In most NLP cases, event structures are complex with manifold dependency, and it is challenging to effectively represent these complicated structured events. To address these issues, we propose Structured Prediction with Energy-based Event-Centric Hyperspheres (SPEECH). |
Shumin Deng; Shengyu Mao; Ningyu Zhang; Bryan Hooi; |
22 | Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Rule By Example (RBE): a novel exemplar-based contrastive learning approach for learning from logical rules for the task of textual content moderation. |
Christopher Clarke; Matthew Hall; Gaurav Mittal; Ye Yu; Sandra Sajeev; Jason Mars; Mei Chen; |
23 | What About �em�? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Wrong pronoun translations can discriminate against marginalized groups, e. g. , non-binary individuals (Dev et al. , 2021). In this �reality check�, we study how three commercial MT systems translate 3rd-person pronouns. |
Anne Lauscher; Debora Nozza; Ehm Miltersen; Archie Crowley; Dirk Hovy; |
24 | What Is Overlap Knowledge in Event Argument Extraction? APE: A Cross-datasets Transfer Learning Model for EAE Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we clearly define the overlap knowledge across datasets and split the knowledge of the EAE task into overlap knowledge across datasets and specific knowledge of the target dataset. |
Kaihang Zhang; Kai Shuang; Xinyue Yang; Xuyang Yao; Jinyu Guo; |
25 | Tailor: A Soft-Prompt-Based Approach to Attribute-Based Controlled Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work usually utilize fine-tuning or resort to extra attribute classifiers, yet suffer from increases in storage and inference time. To address these concerns, we explore attribute-based CTG in a parameter-efficient manner. |
Kexin Yang; Dayiheng Liu; Wenqiang Lei; Baosong Yang; Mingfeng Xue; Boxing Chen; Jun Xie; |
26 | Knowledge of Cultural Moral Norms in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate the extent to which monolingual English language models contain knowledge about moral norms in different countries. |
Aida Ramezani; Yang Xu; |
27 | Songs Across Borders: Singable and Controllable Neural Lyric Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper bridges the singability quality gap by formalizing lyric translation into a constrained translation problem, converting theoretical guidance and practical techniques from translatology literature to prompt-driven NMT approaches, exploring better adaptation methods, and instantiating them to an English-Chinese lyric translation system. |
Longshen Ou; Xichu Ma; Min-Yen Kan; Ye Wang; |
28 | Fantastic Expressions and Where to Find Them: Chinese Simile Generation with Multiple Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce controllable simile generation (CSG), a new task that requires the model to generate a simile with multiple simile elements, e. g. , context and vehicle. |
Kexin Yang; Dayiheng Liu; Wenqiang Lei; Baosong Yang; Xiangpeng Wei; Zhengyuan Liu; Jun Xie; |
29 | Revealing Single Frame Bias for Video-and-Language Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore single-frame models for video-and-language learning. |
Jie Lei; Tamara Berg; Mohit Bansal; |
30 | Learning with Partial Annotations for Event Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conduct a seminal study for learning with partial annotations for ED. |
Jian Liu; Dianbo Sui; Kang Liu; Haoyan Liu; Zhe Zhao; |
31 | World-to-Words: Grounded Open Vocabulary Acquisition Through Fast Mapping in Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As an initial attempt, we propose World-to-Words (W2W), a novel visually-grounded language model by pre-training on image-text pairs highlighting grounding as an objective. |
Ziqiao Ma; Jiayi Pan; Joyce Chai; |
32 | A Causal Framework to Quantify The Robustness of Mathematical Reasoning with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Building on the idea of behavioral testing, we propose a novel framework, which pins down the causal effect of various factors in the input, e. g. , the surface form of the problem text, the operands, and math operators on the output solution. |
Alessandro Stolfo; Zhijing Jin; Kumar Shridhar; Bernhard Schoelkopf; Mrinmaya Sachan; |
33 | Evaluating Open-Domain Dialogues in Latent Space with Next Sentence Prediction and Mutual Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The long-standing one-to-many issue of the open-domain dialogues poses significant challenges for automatic evaluation methods, i. e. , there may be multiple suitable responses which differ in semantics for a given conversational context. To tackle this challenge, we propose a novel learning-based automatic evaluation metric (CMN), which can robustly evaluate open-domain dialogues by augmenting Conditional Variational Autoencoders (CVAEs) with a Next Sentence Prediction (NSP) objective and employing Mutual Information (MI) to model the semantic similarity of text in the latent space. |
Kun Zhao; Bohao Yang; Chenghua Lin; Wenge Rong; Aline Villavicencio; Xiaohui Cui; |
34 | Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore human-AI partnerships to facilitate high diversity and accuracy in LLM-based text data generation. |
John Chung; Ece Kamar; Saleema Amershi; |
35 | Pruning Pre-trained Language Models Without Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we argue fine-tuning is redundant for first-order pruning, since first-order pruning is sufficient to converge PLMs to downstream tasks without fine-tuning. |
Ting Jiang; Deqing Wang; Fuzhen Zhuang; Ruobing Xie; Feng Xia; |
36 | When Does Translation Require Context? A Data-driven, Multilingual Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop the Multilingual Discourse-Aware (MuDA) benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena in any given dataset. |
Patrick Fernandes; Kayo Yin; Emmy Liu; Andr� Martins; Graham Neubig; |
37 | Causal Intervention and Counterfactual Reasoning for Multi-modal Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we analyze and identify the psycholinguistic bias in the text and the bias of inferring news label based on only image features. |
Ziwei Chen; Linmei Hu; Weixin Li; Yingxia Shao; Liqiang Nie; |
38 | LexSym: Compositionality As Lexical Symmetry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a domain-general and model-agnostic formulation of compositionality as a constraint on symmetries of data distributions rather than models. |
Ekin Akyurek; Jacob Andreas; |
39 | Layer-wise Fusion with Modality Independence Modeling for Multi-modal Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose that maintaining modality independence is beneficial for the model performance. |
Jun Sun; Shoukang Han; Yu-Ping Ruan; Xiaoning Zhang; Shu-Kai Zheng; Yulong Liu; Yuxin Huang; Taihao Li; |
40 | CASN:Class-Aware Score Network for Textual Adversarial Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods suffer from significant performance degradation when the adversarial samples lie close to the non-adversarial data manifold. To address this limitation, we propose a score-based generative method to implicitly model the data distribution. |
Rong Bao; Rui Zheng; Liang Ding; Qi Zhang; Dacheng Tao; |
41 | Do Androids Laugh at Electric Sheep? Humor �Understanding� Benchmarks from The New Yorker Caption Contest Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate both multimodal and language-only models: the former are challenged with the cartoon images directly, while the latter are given multifaceted descriptions of the visual scene to simulate human-level visual understanding. |
Jack Hessel; Ana Marasovic; Jena D. Hwang; Lillian Lee; Jeff Da; Rowan Zellers; Robert Mankoff; Yejin Choi; |
42 | Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate whether data augmentation techniques could help improve low-resource ASR performance, focusing on four typologically diverse minority languages or language variants (West Germanic: Gronings, West-Frisian; Malayo-Polynesian: Besemah, Nasal). |
Martijn Bartelds; Nay San; Bradley McDonnell; Dan Jurafsky; Martijn Wieling; |
43 | CLCL: Non-compositional Expression Detection with Contrastive Learning and Curriculum Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging contrastive learning techniques to build improved representations it tackles the non-compositionality challenge. |
Jianing Zhou; Ziheng Zeng; Suma Bhat; |
44 | Multi-VALUE: A Framework for Cross-Dialectal English NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a suite of resources for evaluating and achieving English dialect invariance. |
Caleb Ziems; William Held; Jingfeng Yang; Jwala Dhamala; Rahul Gupta; Diyi Yang; |
45 | Self-Edit: Fault-Aware Code Editor for Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the process of human programming, we propose a generate-and-edit approach named Self-Edit that utilizes execution results of the generated code from LLMs to improve the code quality on the competitive programming task. |
Kechi Zhang; Zhuo Li; Jia Li; Ge Li; Zhi Jin; |
46 | ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose ColD Fusion, a method that provides the benefits of multitask learning but leverages distributed computation and requires limited communication and no sharing of data. |
Shachar Don-Yehiya; Elad Venezian; Colin Raffel; Noam Slonim; Leshem Choshen; |
47 | Test-time Adaptation for Machine Translation Evaluation By Uncertainty Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to address the inference bias of neural metrics through uncertainty minimization during test time, without requiring additional data. |
Runzhe Zhan; Xuebo Liu; Derek F. Wong; Cuilian Zhang; Lidia S. Chao; Min Zhang; |
48 | Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Multi-CLS BERT, a novel ensembling method for CLS-based prediction tasks that is almost as efficient as a single BERT model. |
Haw-Shiuan Chang; Ruei-Yao Sun; Kathryn Ricci; Andrew McCallum; |
49 | On-the-fly Cross-lingual Masking for Multilingual Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present CLPM (Cross-lingual Prototype Masking), a dynamic and token-wise masking scheme, for multilingual pre-training, using a special token [??] |
Xi Ai; Bin Fang; |
50 | How About Kind of Generating Hedges Using End-to-End Neural Models? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we develop a model of hedge generation based on i) fine-tuning state-of-the-art language models trained on human-human tutoring data, followed by ii) reranking to select the candidate that best matches the expected hedging strategy within a candidate pool using a hedge classifier. |
Alafate Abulimiti; Chlo� Clavel; Justine Cassell; |
51 | DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, generating images with desired details requires proper prompts, and it is often unclear how a model reacts to different prompts or what the best prompts are. To help researchers tackle these critical challenges, we introduce DiffusionDB, the first large-scale text-to-image prompt dataset totaling 6. |
Zijie J. Wang; Evan Montoya; David Munechika; Haoyang Yang; Benjamin Hoover; Duen Horng Chau; |
52 | From Key Points to Key Point Hierarchy: Structured and Expressive Opinion Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While key points are more expressive than word clouds and key phrases, making sense of a long, flat list of key points, which often express related ideas in varying levels of granularity, may still be challenging. To address this limitation of KPA, we introduce the task of organizing a given set of key points into a hierarchy, according to their specificity. |
Arie Cattan; Lilach Eden; Yoav Kantor; Roy Bar-Haim; |
53 | When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE Systems for Downstream Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an application-focused empirical survey of neural OpenIE models, training sets, and benchmarks in an effort to help users choose the most suitable OpenIE systems for their applications. |
Kevin Pei; Ishan Jindal; Kevin Chen-Chuan Chang; ChengXiang Zhai; Yunyao Li; |
54 | Subjective Crowd Disagreements for Subjective Data: Uncovering Meaningful CrowdOpinion with Population-level Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce CrowdOpinion, an unsupervised learning based approach that uses language features and label distributions to pool similar items into larger samples of label distributions. |
Tharindu Cyril Weerasooriya; Sarah Luger; Saloni Poddar; Ashiqur KhudaBukhsh; Christopher Homan; |
55 | Post-Abstention: Towards Reliably Re-Attempting The Abstained Instances in QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present an explorative study on �Post-Abstention�, a task that allows re-attempting the abstained instances with the aim of increasing **coverage** of the system without significantly sacrificing its **accuracy**. |
Neeraj Varshney; Chitta Baral; |
56 | UniLG: A Unified Structure-aware Framework for Lyrics Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a unified structure-aware lyrics generation framework named UniLG. |
Tao Qian; Fan Lou; Jiatong Shi; Yuning Wu; Shuai Guo; Xiang Yin; Qin Jin; |
57 | FC-KBQA: A Fine-to-Coarse Composition Framework for Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a Fine-to-Coarse Composition framework for KBQA (FC-KBQA) to both ensure the generalization ability and executability of the logical expression. |
Lingxi Zhang; Jing Zhang; Yanling Wang; Shulin Cao; Xinmei Huang; Cuiping Li; Hong Chen; Juanzi Li; |
58 | Does GPT-3 Grasp Metaphors? Identifying Metaphor Mappings with Generative Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes to probe the ability of GPT-3 to detect metaphoric language and predict the metaphor�s source domain without any pre-set domains. |
Lennart Wachowiak; Dagmar Gromann; |
59 | Being Right for Whose Right Reasons? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents what we think is a first of its kind, a collection of human rationale annotations augmented with the annotators demographic information. |
Terne Sasha Thorn Jakobsen; Laura Cabello; Anders S�gaard; |
60 | ALERT: Adapt Language Models to Reasoning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, it is unclear whether these models are applying reasoning skills they have learnt during pre-training , or if they are simply memorizing their training corpus at finer granularity and have learnt to better understand their context. To address this question, we introduce {pasted macro �OUR�}model, a benchmark and suite of analyses for evaluating reasoning skills of language models. |
Ping Yu; Tianlu Wang; Olga Golovneva; Badr AlKhamissi; Siddharth Verma; Zhijing Jin; Gargi Ghosh; Mona Diab; Asli Celikyilmaz; |
61 | Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our work addresses an important goal of NLP research: we should notlimit NLP to a small fraction of the world�s languages and instead strive to support as many languages as possible to bring the benefits of NLP technology to all languages and cultures. |
Ayyoob ImaniGooghari; Peiqin Lin; Amir Hossein Kargaran; Silvia Severini; Masoud Jalili Sabet; Nora Kassner; Chunlan Ma; Helmut Schmid; Andr� Martins; Fran�ois Yvon; Hinrich Sch�tze; |
62 | Joint Constrained Learning with Boundary-adjusting for Emotion-Cause Pair Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a **J**oint **C**onstrained Learning framework with **B**oundary-adjusting for Emotion-Cause Pair Extraction (**JCB**). |
Huawen Feng; Junlong Liu; Junhao Zheng; Haibin Chen; Xichen Shang; Qianli Ma; |
63 | Pretrained Bidirectional Distillation for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Pretrained Bidirectional Distillation (PBD) for NMT, which aims to efficiently transfer bidirectional language knowledge from masked language pretraining to NMT models. |
Yimeng Zhuang; Mei Tu; |
64 | Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that language modeling applied directly to task-specific user histories achieves excellent results on diverse recommendation tasks. |
Kyuyong Shin; Hanock Kwak; Wonjae Kim; Jisu Jeong; Seungjae Jung; Kyungmin Kim; Jung-Woo Ha; Sang-Woo Lee; |
65 | Improving Continual Relation Extraction By Distinguishing Analogous Semantics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We conduct an empirical study on existing works and observe that their performance is severely affected by analogous relations. To address this issue, we propose a novel continual extraction model for analogous relations. |
Wenzheng Zhao; Yuanning Cui; Wei Hu; |
66 | Improving Pretraining Techniques for Code-Switched NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore different masked language modeling (MLM) pretraining techniques for code-switched text that are cognizant of language boundaries prior to masking. |
Richeek Das; Sahasra Ranjan; Shreya Pathak; Preethi Jyothi; |
67 | A Theory of Unsupervised Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a general theoretical framework to study the properties of ASR-U (unsupervised speech recognition) systems based on random matrix theory and the theory of neural tangent kernels. |
Liming Wang; Mark Hasegawa-Johnson; Chang Yoo; |
68 | ThinkSum: Probabilistic Reasoning Over Sets Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a two-stage probabilistic inference paradigm, ThinkSum, which reasons over sets of objects or facts in a structured manner. |
Batu Ozturkler; Nikolay Malkin; Zhen Wang; Nebojsa Jojic; |
69 | NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we analyze automatic evaluation metrics for Natural Language Generation (NLG), specifically task-agnostic metrics and human-aligned metrics. |
Iftitahu Nimah; Meng Fang; Vlado Menkovski; Mykola Pechenizkiy; |
70 | DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose DialoGue Path Sampling (DialoGPS) method in continuous semantic space, the first many-to-many augmentation method for multi-turn dialogues. |
Ang Lv; Jinpeng Li; Yuhan Chen; Gao Xing; Ji Zhang; Rui Yan; |
71 | TECHS: Temporal Logical Graph Networks for Explainable Extrapolation Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose an explainable extrapolation reasoning framework TEemporal logiCal grapH networkS (TECHS), which mainly contains a temporal graph encoder and a logical decoder. |
Qika Lin; Jun Liu; Rui Mao; Fangzhi Xu; Erik Cambria; |
72 | Consistency Regularization Training for Compositional Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Without modifying model architectures, we improve the capability of Transformer on compositional generalization through consistency regularization training, which promotes representation consistency across samples and prediction consistency for a single sample. |
Yongjing Yin; Jiali Zeng; Yafu Li; Fandong Meng; Jie Zhou; Yue Zhang; |
73 | NUWA-XL: Diffusion Over Diffusion for EXtremely Long Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose NUWA-XL, a novel Diffusion over Diffusion architecture for eXtremely Long video generation. |
Shengming Yin; Chenfei Wu; Huan Yang; Jianfeng Wang; Xiaodong Wang; Minheng Ni; Zhengyuan Yang; Linjie Li; Shuguang Liu; Fan Yang; Jianlong Fu; Ming Gong; Lijuan Wang; Zicheng Liu; Houqiang Li; Nan Duan; |
74 | Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that a simple and practical recipe in the text domain is effective: simply fine-tuning a pretrained generative language model with DP enables the model to generate useful synthetic text with strong privacy protection. |
Xiang Yue; Huseyin Inan; Xuechen Li; Girish Kumar; Julia McAnallen; Hoda Shajari; Huan Sun; David Levitan; Robert Sim; |
75 | A Close Look Into The Calibration of Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Pre-trained language models (PLMs) may fail in giving reliable estimates of their predictive uncertainty. We take a close look into this problem, aiming to answer two questions: (1) Do PLMs learn to become calibrated in the training process? |
Yangyi Chen; Lifan Yuan; Ganqu Cui; Zhiyuan Liu; Heng Ji; |
76 | DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose DIONYSUS (dynamic input optimization in pre-training for dialogue summarization), a pre-trained encoder-decoder model for summarizing dialogues in any new domain. |
Yu Li; Baolin Peng; Pengcheng He; Michel Galley; Zhou Yu; Jianfeng Gao; |
77 | MS-DETR: Natural Language Video Localization with Sampling Moment-Moment Interaction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we adopt a proposal-based solution that generates proposals (i.e., candidate moments) and then selects the best matching proposal. |
Wang Jing; Aixin Sun; Hao Zhang; Xiaoli Li; |
78 | Diverse Demonstrations Improve In-context Compositional Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, in the setup of compositional generalization, where models are tested on outputs with structures that are absent from the training set, selecting similar demonstrations is insufficient, as often no example will be similar enough to the input. In this work, we propose a method to select diverse demonstrations that aims to collectively cover all of the structures required in the output program, in order to encourage the model to generalize to new structures from these demonstrations. |
Itay Levy; Ben Bogin; Jonathan Berant; |
79 | Self-Adaptive In-Context Learning: An Information Compression Perspective for In-Context Example Selection and Ordering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To validate the effectiveness of self-adaptive ICL, we propose a general select-then-rank framework and instantiate it with new selection and ranking algorithms. |
Zhiyong Wu; Yaoxiang Wang; Jiacheng Ye; Lingpeng Kong; |
80 | On The Efficacy of Sampling Adapters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate this issue, various modifications to a model’s sampling distribution, such as top-p or top-k sampling, have been introduced and are now ubiquitously used in language generation systems. We propose a unified framework for understanding these techniques, which we term sampling adapters. |
Clara Meister; Tiago Pimentel; Luca Malagutti; Ethan Wilcox; Ryan Cotterell; |
81 | Cross-Domain Data Augmentation with Domain-Adaptive Language Modeling for Aspect-Based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these CDDA methods still suffer from several issues: 1) preserving many source-specific attributes such as syntactic structures; 2) lack of fluency and coherence; 3) limiting the diversity of generated data. To address these issues, we propose a new cross-domain Data Augmentation approach based on Domain-Adaptive Language Modeling named DA2LM, which contains three stages: 1) assigning pseudo labels to unlabeled target-domain data; 2) unifying the process of token generation and labeling with a Domain-Adaptive Language Model (DALM) to learn the shared context and annotation across domains; 3) using the trained DALM to generate labeled target-domain data. |
Jianfei Yu; Qiankun Zhao; Rui Xia; |
82 | Compositional Data Augmentation for Abstractive Conversation Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, collecting and annotating these conversations can be a time-consuming and labor-intensive task. To address this issue, in this work, we present a sub-structure level compositional data augmentation method, Compo, for generating diverse and high-quality pairs of conversations and summaries. |
Siru Ouyang; Jiaao Chen; Jiawei Han; Diyi Yang; |
83 | PMAES: Prompt-mapping Contrastive Learning for Cross-prompt Automated Essay Scoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In fact, when the representations of two prompts are more similar, we can gain more shared features between them. Based on this motivation, in this paper, we propose a learning strategy called “prompt-mapping” to learn about more consistent representations of source and target prompts. |
Yuan Chen; Xia Li; |
84 | Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To recognize and mitigate harms from large language models (LLMs), we need to understand the prevalence and nuances of stereotypes in LLM outputs. Toward this end, we present Marked Personas, a prompt-based method to measure stereotypes in LLMs for intersectional demographic groups without any lexicon or data labeling. |
Myra Cheng; Esin Durmus; Dan Jurafsky; |
85 | On Prefix-tuning for Lightweight Out-of-distribution Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we depart from the classic fine-tuning based OOD detection toward a parameter-efficient alternative, and propose an unsupervised prefix-tuning based OOD detection framework termed PTO. |
Yawen Ouyang; Yongchang Cao; Yuan Gao; Zhen Wu; Jianbing Zhang; Xinyu Dai; |
86 | GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network that outputs a self-attention weight matrix that can be used in beam search to find the best permutation of input tokens (with auxiliary tokens) and a decoder network based on a step-unrolled denoising autoencoder that fills in specific tokens. |
Konstantin Yakovlev; Alexander Podolskiy; Andrey Bout; Sergey Nikolenko; Irina Piontkovskaya; |
87 | Measuring Progress in Fine-grained Vision-and-Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This has resulted in an increased interest in the community to either develop new benchmarks or models for such capabilities. To better understand and quantify progress in this direction, we investigate four competitive V&L models on four fine-grained benchmarks. |
Emanuele Bugliarello; Laurent Sartran; Aishwarya Agrawal; Lisa Anne Hendricks; Aida Nematzadeh; |
88 | Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces an unsupervised VWSD approach that uses gloss information of an external lexical knowledge-base, especially the sense definitions. |
Sunjae Kwon; Rishabh Garodia; Minhwa Lee; Zhichao Yang; Hong Yu; |
89 | Chain-of-Skills: A Configurable Model for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a modular retriever where individual modules correspond to key skills that can be reused across datasets. |
Kaixin Ma; Hao Cheng; Yu Zhang; Xiaodong Liu; Eric Nyberg; Jianfeng Gao; |
90 | Elaboration-Generating Commonsense Question Answering at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working with such models is very high; in this work, we finetune smaller language models to generate useful intermediate context, referred to here as elaborations. |
Wenya Wang; Vivek Srikumar; Hannaneh Hajishirzi; Noah A. Smith; |
91 | Neural Unsupervised Reconstruction of Protolanguage Word Forms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a state-of-the-art neural approach to the unsupervised reconstruction of ancient word forms. |
Andre He; Nicholas Tomlin; Dan Klein; |
92 | DaMSTF: Domain Adversarial Learning Enhanced Meta Self-Training for Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new self-training framework for domain adaptation, namely Domain adversarial learning enhanced Self-Training Framework (DaMSTF). |
Menglong Lu; Zhen Huang; Yunxiang Zhao; Zhiliang Tian; Yang Liu; Dongsheng Li; |
93 | On Evaluating Multilingual Compositional Generalization with Translated Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, we show that this entails critical semantic distortion. To address this limitation, we craft a faithful rule-based translation of the MCWQ dataset from English to Chinese and Japanese. |
Zi Wang; Daniel Hershcovich; |
94 | FAA: Fine-grained Attention Alignment for Cascade Document Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In fact, the document ranker can provide fine-grained supervision to make the selector more generalizable and compatible, and the selector built upon a different structure can offer a distinct perspective to assist in document ranking. Inspired by this, we propose a fine-grained attention alignment approach to jointly optimize a cascade document ranking model. |
Zhen Li; Chongyang Tao; Jiazhan Feng; Tao Shen; Dongyan Zhao; Xiubo Geng; Daxin Jiang; |
95 | Fine-tuning Happens in Tiny Subspaces: Exploring Intrinsic Task-specific Subspaces of Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the observation, in this paper, we study the problem of re-parameterizing and fine-tuning PLMs from a new perspective: Discovery of intrinsic task-specific subspace. |
Zhong Zhang; Bang Liu; Junming Shao; |
96 | Facilitating Multi-turn Emotional Support Conversation with Positive Emotion Elicitation: A Reinforcement Learning Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Supporter, a mixture-of-expert-based reinforcement learning model, and carefully design ES and dialogue coherence rewards to guide the policy’s learning for responding. |
Jinfeng Zhou; Zhuang Chen; Bo Wang; Minlie Huang; |
97 | Query Enhanced Knowledge-Intensive Conversation Via Unsupervised Joint Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an unsupervised query enhanced approach for knowledge-intensive conversations, namely QKConv. |
Mingzhu Cai; Siqi Bao; Xin Tian; Huang He; Fan Wang; Hua Wu; |
98 | Why Aren’t We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we examine in detail the complex relationship between ASR and NER errors which limit the ability of NER models to recover entity mentions from spontaneous speech transcripts. |
Piotr Szymanski; Lukasz Augustyniak; Mikolaj Morzy; Adrian Szymczak; Krzysztof Surdyk; Piotr Zelasko; |
99 | Precise Zero-Shot Dense Retrieval Without Relevance Labels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we recognize the difficulty of zero-shot learning and encoding relevance. |
Luyu Gao; Xueguang Ma; Jimmy Lin; Jamie Callan; |
100 | White-Box Multi-Objective Adversarial Attack on Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose a white-box multi-objective attack method called DGSlow. |
Yufei Li; Zexin Li; Yingfan Gao; Cong Liu; |
101 | A Cautious Generalization Goes A Long Way: Learning Morphophonological Rules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel approach for automatically learning morphophonological rules of Arabic from a corpus. |
Salam Khalifa; Sarah Payne; Jordan Kodner; Ellen Broselow; Owen Rambow; |
102 | Few-shot Adaptation Works with UnpredicTable Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Prior work on language models (LMs) shows that training on a large number of diverse tasks improves few-shot learning (FSL) performance on new tasks. We take this to the extreme, automatically extracting 413,299 tasks from internet tables – orders of magnitude more than the next-largest public datasets. |
Jun Shern Chan; Michael Pieler; Jonathan Jao; Jérémy Scheurer; Ethan Perez; |
103 | Cross-lingual Science Journalism: Select, Simplify and Rewrite Summaries for Non-expert Readers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CSJ as a downstream task of text simplification and cross-lingual scientific summarization to facilitate science journalists’ work. |
Mehwish Fatima; Michael Strube; |
104 | HuCurl: Human-induced Curriculum Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the problem of curriculum discovery and describe a curriculum learning framework capable of discovering effective curricula in a curriculum space based on prior knowledge about sample difficulty. |
Mohamed Elgaar; Hadi Amiri; |
105 | KNN-TL: K-Nearest-Neighbor Transfer Learning for Low-Resource Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a k-Nearest-Neighbor Transfer Learning (kNN-TL) approach for low-resource NMT, which leverages the parent knowledge throughout the entire developing process of the child model. |
Shudong Liu; Xuebo Liu; Derek F. Wong; Zhaocong Li; Wenxiang Jiao; Lidia S. Chao; Min Zhang; |
106 | Do Language Models Have Coherent Mental Models of Everyday Things? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Do language models similarly have a coherent picture of such everyday things? To investigate this, we propose a benchmark dataset consisting of 100 everyday things, their parts, and the relationships between these parts, expressed as 11,720 “X relation Y? |
Yuling Gu; Bhavana Dalvi Mishra; Peter Clark; |
107 | Rogue Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Does this widespread benchmark metric meet these three evaluation criteria? This systematic review of over two thousand publications using ROUGE finds: (A) Critical evaluation decisions and parameters are routinely omitted, making most reported scores irreproducible. |
Max Grusky; |
108 | Instruction Induction: From Few Examples to Natural Language Task Descriptions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that language models can explicitly infer an underlying task from a few demonstrations by prompting them to generate a natural language instruction that fits the examples. To explore this ability, we introduce the instruction induction challenge, compile a dataset consisting of 24 tasks, and define a novel evaluation metric based on executing the generated instruction. |
Or Honovich; Uri Shaham; Samuel R. Bowman; Omer Levy; |
109 | In-Context Analogical Reasoning with Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we apply large pre-trained language models (PLMs) to visual Raven’s Progressive Matrices (RPM), a common relational reasoning test. |
Xiaoyang Hu; Shane Storks; Richard Lewis; Joyce Chai; |
110 | Peek Across: Improving Multi-Document Modeling Via Cross-Document Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks. In this work, we propose extending this idea by pre-training a generic multi-document model from a novel cross-document question answering pre-training objective. |
Avi Caciularu; Matthew Peters; Jacob Goldberger; Ido Dagan; Arman Cohan; |
111 | Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Learning Good Teacher Matters (LGTM), an efficient training technique for incorporating distillation influence into the teacher’s learning process. |
Yuxin Ren; Zihan Zhong; Xingjian Shi; Yi Zhu; Chun Yuan; Mu Li; |
112 | REV: Information-Theoretic Evaluation of Free-Text Rationales Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: More concretely, we propose a metric called REV (Rationale Evaluation with conditional V-information), to quantify the amount of new, label-relevant information in a rationale beyond the information already available in the input or the label. |
Hanjie Chen; Faeze Brahman; Xiang Ren; Yangfeng Ji; Yejin Choi; Swabha Swayamdipta; |
113 | ELQA: A Corpus of Metalinguistic Questions and Answers About English Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present ELQA, a corpus of questions and answers in and about the English language. |
Shabnam Behzad; Keisuke Sakaguchi; Nathan Schneider; Amir Zeldes; |
114 | Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a simple and effective “divide, conquer and combine” solution, which explicitly disentangles the semantics of seen data, and leverages the performance and robustness with the mixture-of-experts mechanism. |
Qingyue Wang; Liang Ding; Yanan Cao; Yibing Zhan; Zheng Lin; Shi Wang; Dacheng Tao; Li Guo; |
115 | BIG-C: A Multimodal Multi-Purpose Dataset for Bemba Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present BIG-C (Bemba Image Grounded Conversations), a large multimodal dataset for Bemba. |
Claytone Sikasote; Eunice Mukonde; Md Mahfuz Ibn Alam; Antonios Anastasopoulos; |
116 | Schema-Guided User Satisfaction Modeling for Task-Oriented Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose SG-USM, a novel schema-guided user satisfaction modeling framework. |
Yue Feng; Yunlong Jiao; Animesh Prasad; Nikolaos Aletras; Emine Yilmaz; Gabriella Kazai; |
117 | Robust Multi-bit Natural Language Watermarking Through Invariant Features Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore ways to advance both payload and robustness by following a well-known proposition from image watermarking and identify features in natural language that are invariant to minor corruption. |
KiYoon Yoo; Wonhyuk Ahn; Jiho Jang; Nojun Kwak; |
118 | KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, incorporating varying contexts can especially benefit long document understanding tasks that leverage pre-trained LMs, typically bounded by the input sequence length. In light of these challenges, we propose KALM, a language model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding. |
Shangbin Feng; Zhaoxuan Tan; Wenqian Zhang; Zhenyu Lei; Yulia Tsvetkov; |
119 | AtTGen: Attribute Tree Generation for Real-World Attribute Joint Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: And prior studies attempting to integrate these paradigms through ensemble, pipeline, and co-training models, still face challenges like cascading errors, high computational overhead, and difficulty in training. To address these existing problems, this paper presents Attribute Tree, a unified formulation for real-world attribute extraction application, where closed-world, open-world, and semi-open attribute extraction tasks are modeled uniformly. |
Yanzeng Li; Bingcong Xue; Ruoyu Zhang; Lei Zou; |
120 | Extractive Is Not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we define a typology with five types of broad unfaithfulness problems (including and beyond not-entailment) that can appear in extractive summaries, including incorrect coreference, incomplete coreference, incorrect discourse, incomplete discourse, as well as other misleading information. |
Shiyue Zhang; David Wan; Mohit Bansal; |
121 | Improving Translation Quality Estimation with Bias Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel method to mitigate the bias of the QE model and improve estimation performance. |
Hui Huang; Shuangzhi Wu; Kehai Chen; Hui Di; Muyun Yang; Tiejun Zhao; |
122 | Breeding Machine Translations: Evolutionary Approach to Survive and Thrive in The World of Automated Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a genetic algorithm (GA) based method for modifying n-best lists produced by a machine translation (MT) system. |
Josef Jon; Ondrej Bojar; |
123 | MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems Via Moral Discussions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a framework, MoralDial, to train and evaluate moral dialogue systems. |
Hao Sun; Zhexin Zhang; Fei Mi; Yasheng Wang; Wei Liu; Jianwei Cui; Bin Wang; Qun Liu; Minlie Huang; |
124 | Denoising Bottleneck with Mutual Information Maximization for Video Multimodal Fusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: They often suppress the redundant and noisy information at the risk of losing critical information. Therefore, we propose a denoising bottleneck fusion (DBF) model for fine-grained video multimodal fusion. |
Shaoxiang Wu; Damai Dai; Ziwei Qin; Tianyu Liu; Binghuai Lin; Yunbo Cao; Zhifang Sui; |
125 | SimLM: Pre-training with Representation Bottleneck for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose SimLM (Similarity matching with Language Model pre-training), a simple yet effective pre-training method for dense passage retrieval. |
Liang Wang; Nan Yang; Xiaolong Huang; Binxing Jiao; Linjun Yang; Daxin Jiang; Rangan Majumder; Furu Wei; |
126 | From Ultra-Fine to Fine: Fine-tuning Ultra-Fine Entity Typing Models to Fine-grained Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new approach that can avoid the need of creating distantly labeled data whenever there is a new type schema. |
Hongliang Dai; Ziqian Zeng; |
127 | Controlling Learned Effects to Reduce Spurious Correlations in Text Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this can be counter-productive when the features have a non-zero causal effect on the target label and thus are important for prediction. Therefore, using methods from the causal inference literature, we propose an algorithm to regularize the learnt effect of the features on the model’s prediction to the estimated effect of feature on label. |
Parikshit Bansal; Amit Sharma; |
128 | What Makes Pre-trained Language Models Better Zero-shot Learners? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose a simple yet effective method for screening reasonable prompt templates in zero-shot text classification: Perplexity Selection (Perplection). |
Jinghui Lu; Dongsheng Zhu; Weidong Han; Rui Zhao; Brian Mac Namee; Fei Tan; |
129 | Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Z-ICL, a new zero-shot method that closes the gap by constructing pseudo-demonstrations for a given test input using a raw text corpus. |
Xinxi Lyu; Sewon Min; Iz Beltagy; Luke Zettlemoyer; Hannaneh Hajishirzi; |
130 | Learning Optimal Policy for Simultaneous Machine Translation Via Binary Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a new method for constructing the optimal policy online via binary search. |
Shoutao Guo; Shaolei Zhang; Yang Feng; |
131 | Better Simultaneous Translation with Monotonic Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach that leverages traditional translation models as teachers and employs a two-stage beam search algorithm to generate monotonic yet accurate reference translations for sequence-level knowledge distillation. |
Shushu Wang; Jing Wu; Kai Fan; Wei Luo; Jun Xiao; Zhongqiang Huang; |
132 | StoryARG: A Corpus of Narratives and Personal Experiences in Argumentative Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis of the annotations in StoryARG uncovers a positive impact on effectiveness for stories that illustrate a solution to a problem, and, in general, annotator-specific preferences that we investigate with regression analysis. |
Neele Falk; Gabriella Lapesa; |
133 | Injecting Knowledge Into Language Generation: A Case Study in Auto-charting After-visit Care Instructions from Medical Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the “utilization rate” that encodes knowledge and serves as a regularizer by maximizing the marginal probability of selected tokens. |
Maksim Eremeev; Ilya Valmianski; Xavier Amatriain; Anitha Kannan; |
134 | Sequence Parallelism: Long Sequence Training from System Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work focuses on reducing time and space complexity from an algorithm perspective. In this work, we propose sequence parallelism, a memory-efficient parallelism that solves this issue from a system perspective instead. |
Shenggui Li; Fuzhao Xue; Chaitanya Baranwal; Yongbin Li; Yang You; |
135 | MUSTIE: Multimodal Structural Transformer for Web Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel MUltimodal Structural Transformer (MUST) that incorporates multiple modalities for web information extraction. |
Qifan Wang; Jingang Wang; Xiaojun Quan; Fuli Feng; Zenglin Xu; Shaoliang Nie; Sinong Wang; Madian Khabsa; Hamed Firooz; Dongfang Liu; |
136 | Augmentation-Adapted Retriever Improves Generalization of Language Models As Generic Plug-In Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the scheme of generic retrieval plug-in: the retriever is to assist target LMs that may not be known beforehand or are unable to be fine-tuned together. |
Zichun Yu; Chenyan Xiong; Shi Yu; Zhiyuan Liu; |
137 | TableVLM: Multi-modal Pre-training for Table Structure Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a novel multi-modal pre-training model for table structure recognition, named TableVLM. |
Leiyuan Chen; Chengsong Huang; Xiaoqing Zheng; Jinshu Lin; Xuanjing Huang; |
138 | Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present NBR, which converts biomedical RE as natural language inference formulation through indirect supervision. |
Jiashu Xu; Mingyu Derek Ma; Muhao Chen; |
139 | Dynamic Routing Transformer Network for Multimodal Sarcasm Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by routing-based dynamic network, we model the dynamic mechanism in multimodal sarcasm detection and propose the Dynamic Routing Transformer Network (DynRT-Net). |
Yuan Tian; Nan Xu; Ruike Zhang; Wenji Mao; |
140 | What Are You Token About? Dense Retrieval As Distributions Over The Vocabulary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Yet, we have little understanding of how they represent text, and why this leads to good performance. In this work, we shed light on this question via distributions over the vocabulary. |
Ori Ram; Liat Bezalel; Adi Zicher; Yonatan Belinkov; Jonathan Berant; Amir Globerson; |
141 | Cold-Start Data Selection for Better Few-shot Language Model Fine-tuning: A Prompt-based Uncertainty Propagation Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present PATRON, a prompt-based data selection method for pre-trained language model fine-tuning under cold-start scenarios, i.e., no initial labeled data are available. |
Yue Yu; Rongzhi Zhang; Ran Xu; Jieyu Zhang; Jiaming Shen; Chao Zhang; |
142 | Training-free Neural Architecture Search for RNNs and Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate training-free NAS metrics for recurrent neural network (RNN) and BERT-based transformer architectures, targeted towards language modeling tasks. |
Aaron Serianni; Jugal Kalita; |
143 | CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a multistage data sampling algorithm to effectively train a cross-lingual summarization model capable of summarizing an article in any target language. |
Abhik Bhattacharjee; Tahmid Hasan; Wasi Uddin Ahmad; Yuan-Fang Li; Yong-Bin Kang; Rifat Shahriyar; |
144 | Improving Gradient Trade-offs Between Tasks in Multi-task Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel gradient trade-off approach to mitigate the task conflict problem, dubbed GetMTL, which can achieve a specific trade-off among different tasks nearby the main objective of multi-task text classification (MTC), so as to improve the performance of each task simultaneously. |
Heyan Chai; Jinhao Cui; Ye Wang; Min Zhang; Binxing Fang; Qing Liao; |
145 | Bi-Phone: Modeling Inter Language Phonetic Influences in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a method to mine phoneme confusions (sounds in L2 that an L1 speaker is likely to conflate) for pairs of L1 and L2. |
Abhirut Gupta; Ananya B. Sai; Richard Sproat; Yuri Vasilevski; James Ren; Ambarish Jash; Sukhdeep Sodhi; Aravindan Raghuveer; |
146 | Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unpaired cross-lingual image captioning has long suffered from irrelevancy and disfluency issues, due to the inconsistencies of the semantic scene and syntax attributes during transfer. In this work, we propose to address the above problems by incorporating the scene graph (SG) structures and the syntactic constituency (SC) trees. |
Shengqiong Wu; Hao Fei; Wei Ji; Tat-Seng Chua; |
147 | Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the missing-step errors, we propose Plan-and-Solve (PS) Prompting. |
Lei Wang; Wanyu Xu; Yihuai Lan; Zhiqiang Hu; Yunshi Lan; Roy Ka-Wei Lee; Ee-Peng Lim; |
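The plan-then-execute prompting idea above can be sketched as a simple prompt constructor. The trigger phrase below paraphrases the idea (first devise a plan, then carry it out step by step); treat the exact wording as an illustrative assumption rather than the paper's prompt.

```python
def plan_and_solve_prompt(question: str) -> str:
    """Wrap a question in a Plan-and-Solve style zero-shot trigger.

    The trigger text is a paraphrase of the plan-then-solve idea,
    not necessarily the exact prompt used in the paper."""
    trigger = ("Let's first understand the problem and devise a plan. "
               "Then let's carry out the plan and solve the problem "
               "step by step.")
    return f"Q: {question}\nA: {trigger}"

prompt = plan_and_solve_prompt("A train travels 60 km in 1.5 hours. "
                               "What is its average speed?")
```

The returned string would be sent to an LLM as-is; the model continues after the trigger, producing the plan and the step-by-step solution.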
148 | RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel pre-training method called Duplex Masked Auto-Encoder, a.k.a. DupMAE. |
Zheng Liu; Shitao Xiao; Yingxia Shao; Zhao Cao; |
149 | DecompX: Explaining Transformers Decisions By Propagating Token Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, providing a faithful vector-based explanation for a multi-layer model could be challenging in three aspects: (1) Incorporating all components into the analysis, (2) Aggregating the layer dynamics to determine the information flow and mixture throughout the entire model, and (3) Identifying the connection between the vector-based analysis and the model's predictions. In this paper, we present DecompX to tackle these challenges. |
Ali Modarressi; Mohsen Fayyaz; Ehsan Aghazadeh; Yadollah Yaghoobzadeh; Mohammad Taher Pilehvar; |
150 | Symbolic Chain-of-Thought Distillation: Small Models Can Also “Think” Step-by-Step Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that considerably smaller models can still benefit from chain-of-thought prompting. To achieve this, we introduce Symbolic Chain-of-Thought Distillation (SCoTD), a method to train a smaller student model on rationalizations sampled from a significantly larger teacher model. |
Liunian Harold Li; Jack Hessel; Youngjae Yu; Xiang Ren; Kai-Wei Chang; Yejin Choi; |
151 | Generating EDU Extracts for Plan-Guided Summary Re-Ranking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Yet, standard decoding methods (i.e., beam search, nucleus sampling, and diverse beam search) produce candidates with redundant, and often low quality, content. In this paper, we design a novel method to generate candidates for re-ranking that addresses these issues. |
Griffin Adams; Alex Fabbri; Faisal Ladhak; Noémie Elhadad; Kathleen McKeown; |
152 | A Survey on Asking Clarification Questions Datasets in Conversational Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, it is noticeable that a key limitation of the existing ACQs studies is their incomparability, stemming from inconsistent use of data, distinct experimental setups, and differing evaluation strategies. Therefore, in this paper, to assist the development of ACQs techniques, we comprehensively analyse the current ACQs research status, offering a detailed comparison of publicly available datasets and discussing the applied evaluation metrics, together with benchmarks for multiple ACQs-related tasks. |
Hossein A. Rahmani; Xi Wang; Yue Feng; Qiang Zhang; Emine Yilmaz; Aldo Lipani; |
153 | Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that CoT reasoning is possible even with invalid demonstrations – prompting with invalid reasoning steps can achieve over 80-90% of the performance obtained using CoT under various metrics, while still generating coherent lines of reasoning during inference. |
Boshi Wang; Sewon Min; Xiang Deng; Jiaming Shen; You Wu; Luke Zettlemoyer; Huan Sun; |
154 | Small Data, Big Impact: Leveraging Minimal Data for Effective Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe a broad data collection effort involving around 6k professionally translated sentence pairs for each of 39 low-resource languages, which we make publicly available. |
Jean Maillard; Cynthia Gao; Elahe Kalbassi; Kaushik Ram Sadagopan; Vedanuj Goswami; Philipp Koehn; Angela Fan; Francisco Guzman; |
155 | RMLM: A Flexible Defense Framework for Proactively Mitigating Word-level Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they often neglect to proactively mitigate adversarial attacks during inference. To address this overlooked aspect, we propose a defense framework that aims to mitigate attacks by confusing attackers and correcting adversarial contexts that are caused by malicious perturbations. |
Zhaoyang Wang; Zhiyue Liu; Xiaopeng Zheng; Qinliang Su; Jiahai Wang; |
156 | Gradient-based Intra-attention Pruning on Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a structured pruning method GRAIN (gradient-based intra-attention pruning), which performs task-specific pruning with knowledge distillation and yields highly effective models. |
Ziqing Yang; Yiming Cui; Xin Yao; Shijin Wang; |
157 | Learning to Substitute Spans Towards Improving Compositional Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the two challenges, we first propose a novel compositional augmentation strategy dubbed Span Substitution (SpanSub) that enables multi-grained composition of substantial substructures in the whole training set. Over and above that, we introduce the Learning to Substitute Span (L2S2) framework which empowers the learning of span substitution probabilities in SpanSub in an end-to-end manner by maximizing the loss of neural sequence models, so as to outweigh those challenging compositions with elusive concepts and novel surroundings. |
Zhaoyi Li; Ying Wei; Defu Lian; |
158 | DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to use explicit control to guide the empathy expression and design a framework DiffusEmp based on conditional diffusion language model to unify the utilization of dialogue context and attribute-oriented control signals. |
Guanqun Bi; Lei Shen; Yanan Cao; Meng Chen; Yuqiang Xie; Zheng Lin; Xiaodong He; |
159 | BREAK: Breaking The Dialogue State Tracking Barrier with Beam Search and Re-ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our preliminary error analysis, we find that beam search produces a pool of candidates that is likely to include the correct dialogue state. Motivated by this observation, we introduce a novel framework, called BREAK (Beam search and RE-rAnKing), that achieves outstanding performance on DST. |
Seungpil Won; Heeyoung Kwak; Joongbo Shin; Janghoon Han; Kyomin Jung; |
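The generate-then-re-rank scheme described above can be sketched generically: collect beam candidates, then reorder them with a separate scorer. The candidate format and the scoring function here are hypothetical stand-ins, not the model used in the paper.

```python
def rerank_beam_candidates(candidates, scorer):
    """Re-rank dialogue-state candidates from beam search, best first.

    `scorer` is a stand-in for whatever re-ranking model is used;
    here it is simply any callable mapping a candidate to a float."""
    return sorted(candidates, key=scorer, reverse=True)

# Toy example: two candidate states from the beam, where the
# re-ranker disagrees with the original beam order.
beam = [{"hotel-area": "north"}, {"hotel-area": "centre"}]
rerank_score = {"north": 0.2, "centre": 0.9}
best = rerank_beam_candidates(
    beam, lambda s: rerank_score[s["hotel-area"]])[0]
```

The point of the observation in the paper is that the correct state is often *somewhere* in the beam, so a good scorer over the candidate pool can recover it even when the top beam hypothesis is wrong.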
160 | Faithful Low-Resource Data-to-Text Generation Through Cycle Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Sufficient annotated data is often not available for specific domains, leading us to seek an unsupervised approach to improve the faithfulness of output text. Since the problem is fundamentally one of consistency between the representations of the structured data and text, we evaluate the effectiveness of cycle training in this work. |
Zhuoer Wang; Marcus Collins; Nikhita Vedula; Simone Filice; Shervin Malmasi; Oleg Rokhlenko; |
161 | Towards Stable Natural Language Understanding Via Information Entropy Guided Debiasing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, our analyses show that the empirical debiasing methods may fail to capture part of the potential dataset biases and mistake semantic information of input text as biases, which limits the effectiveness of debiasing. To address these issues, we propose a debiasing framework IEGDB that comprehensively detects the dataset biases to induce a set of biased features, and then purifies the biased features with the guidance of information entropy. |
Li Du; Xiao Ding; Zhouhao Sun; Ting Liu; Bing Qin; Jingshuo Liu; |
162 | Dynamic and Efficient Inference for Text Generation Via BERT Family Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel fine-tuning method DEER, which can make a single pre-trained model support Dynamic and Efficient infERence and achieve an adaptive trade-off between model performance and latency. |
Xiaobo Liang; Juntao Li; Lijun Wu; Ziqiang Cao; Min Zhang; |
163 | Learning to Generate Equitable Text in Dialogue from Biased Training Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, there is no comprehensive study of equitable text generation in dialogue. Aptly, in this work, we use theories of computational learning to study this problem. |
Anthony Sicilia; Malihe Alikhani; |
164 | Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the issue, we propose the hierarchical verbalizer (“HierVerb”), a multi-verbalizer framework treating HTC as a single- or multi-label classification problem at multiple layers and learning vectors as verbalizers constrained by hierarchical structure and hierarchical contrastive learning. |
Ke Ji; Yixin Lian; Jingsheng Gao; Baoyuan Wang; |
165 | Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to improve the summary quality through summary-oriented visual features. |
Yunlong Liang; Fandong Meng; Jinan Xu; Jiaan Wang; Yufeng Chen; Jie Zhou; |
166 | Helping A Friend or Supporting A Cause? Disentangling Active and Passive Cosponsorship in The U.S. Congress Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we develop an Encoder+RGCN based model that learns legislator representations from bill texts and speech transcripts. |
Giuseppe Russo; Christoph Gote; Laurence Brandenberger; Sophia Schlosser; Frank Schweitzer; |
167 | TREA: Tree-Structure Reasoning Schema for Conversational Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent reasoning-based models heavily rely on simplified structures such as linear structures or fixed-hierarchical structures for causality reasoning, hence they cannot fully figure out sophisticated relationships among utterances with external knowledge. To address this, we propose a novel Tree structure Reasoning schEmA named TREA. |
Wendi Li; Wei Wei; Xiaoye Qu; Xian-Ling Mao; Ye Yuan; Wenfeng Xie; Dangyang Chen; |
168 | CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Last, current datasets are biased toward English, leaving other languages underexplored. To alleviate these limitations, in this paper, we present CATS, a pragmatic Chinese answer-to-sequence dataset with large scale and high quality. |
Liang Li; Ruiying Geng; Chengyang Fang; Bing Li; Can Ma; Rongyu Cao; Binhua Li; Fei Huang; Yongbin Li; |
169 | Multilingual Multifaceted Understanding of Online News in Terms of Genre, Framing, and Persuasion Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new multilingual multifacet dataset of news articles, each annotated for genre (objective news reporting vs. opinion vs. satire), framing (what key aspects are highlighted), and persuasion techniques (logical fallacies, emotional appeals, ad hominem attacks, etc.). |
Jakub Piskorski; Nicolas Stefanovitch; Nikolaos Nikolaidis; Giovanni Da San Martino; Preslav Nakov; |
170 | Learning Action Conditions from Instructional Manuals for Instruction Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a task dubbed action condition inference, which extracts mentions of preconditions and postconditions of actions in instructional manuals. |
Te-Lin Wu; Caiqi Zhang; Qingyuan Hu; Alexander Spangher; Nanyun Peng; |
171 | StoryWars: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Understanding and generating such stories remains an underexplored area due to the lack of open-domain corpora. To address this, we introduce StoryWars, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform. |
Yulun Du; Lydia Chilton; |
172 | Did You Read The Instructions? Rethinking The Effectiveness of Task Definitions in Instruction Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we systematically study the role of task definitions in instruction learning. |
Fan Yin; Jesse Vig; Philippe Laban; Shafiq Joty; Caiming Xiong; Chien-Sheng Wu; |
173 | Do PLMs Know and Understand Ontological Knowledge? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on probing whether PLMs store ontological knowledge and have a semantic understanding of the knowledge rather than rote memorization of the surface form. |
Weiqi Wu; Chengyue Jiang; Yong Jiang; Pengjun Xie; Kewei Tu; |
174 | CORE: Cooperative Training of Retriever-Reranker for Effective Dialogue Response Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present a cooperative training of the response retriever and the reranker whose parameters are dynamically optimized by the ground-truth labels as well as list-wise supervision signals from each other. |
Chongyang Tao; Jiazhan Feng; Tao Shen; Chang Liu; Juntao Li; Xiubo Geng; Daxin Jiang; |
175 | Exploring How Generative Adversarial Networks Learn Phonological Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores how Generative Adversarial Networks (GANs) learn representations of phonological phenomena. |
Jingyi Chen; Micha Elsner; |
176 | Interpretable Word Sense Representations Via Definition Generation: The Case of Semantic Change Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose using automatically generated natural language definitions of contextualised word usages as interpretable word and word sense representations. |
Mario Giulianelli; Iris Luden; Raquel Fernandez; Andrey Kutuzov; |
177 | Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new task of simulating NL feedback for interactive semantic parsing. |
Hao Yan; Saurabh Srivastava; Yintao Tai; Sida I. Wang; Wen-tau Yih; Ziyu Yao; |
178 | InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead, we humans can easily identify the problems of captions in detail, e.g., which words are inaccurate and which salient objects are not described, and then rate the caption quality. To support such informative feedback, we propose an Informative Metric for Reference-free Image Caption evaluation (InfoMetIC). |
Anwen Hu; Shizhe Chen; Liang Zhang; Qin Jin; |
179 | An Invariant Learning Characterization of Controlled Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that the performance of controlled generation may be poor if the distributions of text in response to user prompts differ from the distribution the predictor was trained on. |
Carolina Zheng; Claudia Shi; Keyon Vafa; Amir Feder; David Blei; |
180 | HistRED: A Historical Document-Level Relation Extraction Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To demonstrate the usefulness of our dataset, we propose a bilingual RE model that leverages both Korean and Hanja contexts to predict relations between entities. |
Soyoung Yang; Minseok Choi; Youngwoo Cho; Jaegul Choo; |
181 | A Critical Evaluation of Evaluations for Long-form Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a careful analysis of experts' evaluation, which focuses on new aspects such as the comprehensiveness of the answer. |
Fangyuan Xu; Yixiao Song; Mohit Iyyer; Eunsol Choi; |
182 | HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, problems still arise when fine-tuning pre-trained language models on downstream tasks, such as over-fitting or representation collapse. In this work, we propose HyPe, a simple yet effective fine-tuning technique to alleviate such problems by perturbing hidden representations of Transformers layers. |
Hongyi Yuan; Zheng Yuan; Chuanqi Tan; Fei Huang; Songfang Huang; |
183 | Generating User-Engaging News Headlines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, presenting the same news headline to all readers is a suboptimal strategy, because it does not take into account the different preferences and interests of diverse readers, who may be confused about why a particular article has been recommended to them and do not see a clear connection between their interests and the recommended article. In this paper, we present a novel framework that addresses these challenges by incorporating user profiling to generate personalized headlines, and a combination of automated and human evaluation methods to determine user preference for personalized headlines. |
Pengshan Cai; Kaiqiang Song; Sangwoo Cho; Hongwei Wang; Xiaoyang Wang; Hong Yu; Fei Liu; Dong Yu; |
184 | Word Sense Extension Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a paradigm of word sense extension (WSE) that enables words to spawn new senses toward novel contexts. |
Lei Yu; Yang Xu; |
185 | PVGRU: Generating Diverse and Relevant Dialogue Responses Via Pseudo-Variational Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Pseudo-Variational Gated Recurrent Unit (PVGRU). |
Yongkang Liu; Shi Feng; Daling Wang; Yifei Zhang; Hinrich Schütze; |
186 | Decoding Symbolism in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our evaluative framework, Symbolism Analysis (SymbA), which compares LMs (e.g., RoBERTa, GPT-J) on different types of symbolism and analyzes the outcomes along multiple metrics. |
Meiqi Guo; Rebecca Hwa; Adriana Kovashka; |
187 | A Survey on Zero Pronoun Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This phenomenon has been studied extensively in machine translation (MT), as it poses a significant challenge for MT systems due to the difficulty in determining the correct antecedent for the pronoun. This survey paper highlights the major works that have been undertaken in zero pronoun translation (ZPT) after the neural revolution so that researchers can recognize the current state and future directions of this field. |
Longyue Wang; Siyou Liu; Mingzhou Xu; Linfeng Song; Shuming Shi; Zhaopeng Tu; |
188 | We Understand Elliptical Sentences, and Language Models Should Too: A New Dataset for Studying Ellipsis and Its Interaction with Thematic Fit Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explored the issue of how the prototypicality of event participants affects the ability of Language Models (LMs) to handle elliptical sentences and to identify the omitted arguments at different degrees of thematic fit, ranging from highly typical participants to semantically anomalous ones. |
Davide Testa; Emmanuele Chersoni; Alessandro Lenci; |
189 | MPCHAT: Towards Multimodal Persona-Grounded Conversation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we extend persona-based dialogue to the multimodal domain and make two main contributions. First, we present the first multimodal persona-based dialogue dataset named MPCHAT, which extends persona with both text and images to contain episodic memories. |
Jaewoo Ahn; Yeda Song; Sangdoo Yun; Gunhee Kim; |
190 | DOC: Improving Long Story Coherence With Detailed Outline Control Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the Detailed Outline Control (DOC) framework for improving long-range plot coherence when automatically generating several-thousand-word-long stories. |
Kevin Yang; Dan Klein; Nanyun Peng; Yuandong Tian; |
191 | Dual-Alignment Pre-training for Cross-lingual Sentence Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on our findings, we propose a dual-alignment pre-training (DAP) framework for cross-lingual sentence embedding that incorporates both sentence-level and token-level alignment. |
Ziheng Li; Shaohan Huang; Zihan Zhang; Zhi-Hong Deng; Qiang Lou; Haizhen Huang; Jian Jiao; Furu Wei; Weiwei Deng; Qi Zhang; |
192 | Exploring Better Text Image Translation with Multimodal Codebook Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we first annotate a Chinese-English TIT dataset named OCRMT30K, providing convenience for subsequent studies. |
Zhibin Lan; Jiawei Yu; Xiang Li; Wen Zhang; Jian Luan; Bin Wang; Degen Huang; Jinsong Su; |
193 | FEDLEGAL: The First Real-World Federated Learning Benchmark for Legal NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, to the best of our knowledge, there is no work on applying FL to legal NLP. To fill this gap, this paper presents the first real-world FL benchmark for legal NLP, coined FEDLEGAL, which comprises five legal NLP tasks and one privacy task based on the data from Chinese courts. |
Zhuo Zhang; Xiangjing Hu; Jingyuan Zhang; Yating Zhang; Hui Wang; Lizhen Qu; Zenglin Xu; |
194 | A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This problem might result in forgetting the backdoor. Based on this finding, we propose a gradient control method to consolidate the attack effect, comprising two strategies. |
Naibin Gu; Peng Fu; Xiyu Liu; Zhengxiao Liu; Zheng Lin; Weiping Wang; |
195 | History Semantic Graph Enhanced Conversational KBQA with Temporal Information Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a History Semantic Graph Enhanced KBQA model (HSGE) that is able to effectively model long-range semantic dependencies in conversation history while maintaining low computational cost. |
Hao Sun; Yang Li; Liwei Deng; Bowen Li; Binyuan Hui; Binhua Li; Yunshi Lan; Yan Zhang; Yongbin Li; |
196 | From The One, Judge of The Whole: Typed Entailment Graph Construction with Predicate Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, EGs built by previous methods often suffer from severe sparsity issues, due to limited corpora available and the long-tail phenomenon of predicate distributions. In this paper, we propose a multi-stage method, Typed Predicate-Entailment Graph Generator (TP-EGG), to tackle this problem. |
Zhibin Chen; Yansong Feng; Dongyan Zhao; |
197 | Alleviating Over-smoothing for Unsupervised Sentence Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experimentally, we observe that the over-smoothing problem reduces the capacity of these powerful PLMs, leading to sub-optimal sentence representations. In this paper, we present a Simple method named Self-Contrastive Learning (SSCL) to alleviate this issue, which samples negatives from PLMs' intermediate layers, improving the quality of the sentence representation. |
Nuo Chen; Linjun Shou; Jian Pei; Ming Gong; Bowen Cao; Jianhui Chang; Jia Li; Daxin Jiang; |
198 | Memory-efficient NLLB-200: Language-specific Expert Pruning of A Massively Multilingual Machine Translation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a pruning method that enables the removal of up to 80% of experts without further finetuning and with a negligible loss in translation quality, which makes it feasible to run the model on a single 32GB GPU. |
Yeskendir Koishekenov; Alexandre Berard; Vassilina Nikoulina; |
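The idea of removing most experts without retraining can be sketched as ranking experts by an activity statistic and keeping only the top fraction. The usage counts and the keep-by-rank criterion here are illustrative assumptions; the paper's actual language-specific selection criterion may differ.

```python
def prune_experts(usage, keep_fraction=0.2):
    """Keep the most-used experts of a mixture-of-experts layer.

    `usage` maps expert id -> an activity statistic (e.g. how often
    the router selected that expert for a given language). Keeping
    20% of experts corresponds to dropping up to 80%, as in the
    paper's headline result; the exact criterion is an assumption."""
    k = max(1, round(len(usage) * keep_fraction))
    ranked = sorted(usage, key=usage.get, reverse=True)
    return set(ranked[:k])

# Toy router statistics for five experts; keep the top 40%.
usage = {0: 120, 1: 3, 2: 87, 3: 1, 4: 54}
kept = prune_experts(usage, keep_fraction=0.4)
```

After selection, the pruned experts' parameters would simply not be loaded, which is what shrinks the model enough to fit on a single GPU.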
199 | DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we dramatically improve the zero-shot performance of a multilingual and codeswitched semantic parsing system using two stages of multilingual alignment. |
William Held; Christopher Hidey; Fei Liu; Eric Zhu; Rahul Goel; Diyi Yang; Rushin Shah; |
200 | From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel open-vocabulary language model that adopts a hierarchical two-level approach: one at the word level and another at the sequence level. |
Li Sun; Florian Luisier; Kayhan Batmanghelich; Dinei Florencio; Cha Zhang; |
201 | MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MatSci-NLP, a natural language benchmark for evaluating the performance of natural language processing (NLP) models on materials science text. |
Yu Song; Santiago Miret; Bang Liu; |
202 | Code4Struct: Code Generation for Few-Shot Event Structure Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We observe that semantic structures can be conveniently translated into code and propose Code4Struct to leverage such text-to-structure translation capability to tackle structured prediction tasks. |
Xingyao Wang; Sha Li; Heng Ji; |
203 | GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Then, exhaustive human expert annotations are collected to build the ontology, concluding with 115 events and 220 argument roles, with a significant portion of roles not being entities. We utilize this ontology to further introduce GENEVA, a diverse generalizability benchmarking dataset comprising four test suites aimed at evaluating models' ability to handle limited data and unseen event type generalization. |
Tanmay Parekh; I-Hung Hsu; Kuan-Hao Huang; Kai-Wei Chang; Nanyun Peng; |
204 | Efficient Semiring-Weighted Earley Parsing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Earley's (1970) context-free parsing algorithm as a deduction system, incorporating various known and new speed-ups. |
Andreas Opedal; Ran Zmigrod; Tim Vieira; Ryan Cotterell; Jason Eisner; |
205 | Tree-Based Representation and Generation of Natural and Mathematical Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a series of modifications to existing language models to jointly represent and generate text and math: representing mathematical expressions as sequences of node tokens in their operator tree format, using math symbol and tree position embeddings to preserve the semantic and structural properties of mathematical expressions, and using a constrained decoding method to generate mathematically valid expressions. |
Alexander Scarlatos; Andrew Lan; |
206 | ParaLS: Lexical Substitution Via Pretrained Paraphraser Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Since we cannot directly generate the substitutes via commonly used decoding strategies, we propose two simple decoding strategies that focus on the variations of the target word during decoding. |
Jipeng Qiang; Kang Liu; Yun Li; Yunhao Yuan; Yi Zhu; |
207 | Peer-Label Assisted Hierarchical Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fully explore the peer-label relationship, we develop a PeerHTC method. |
Junru Song; Feifei Wang; Yang Yang; |
208 | Free Lunch for Efficient Textual Commonsense Integration in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, incorporating textual commonsense descriptions is computationally expensive, as compared to encoding conventional symbolic knowledge. In this paper, we propose a method to improve its efficiency without modifying the model. |
Wanyun Cui; Xingran Chen; |
209 | A Probabilistic Framework for Discovering New Intents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, starting from the intuition that discovering intents could be beneficial for identifying known intents, we propose a probabilistic framework for discovering intents where intent assignments are treated as latent variables. |
Yunhua Zhou; Guofeng Quan; Xipeng Qiu; |
210 | MultiTACRED: A Multilingual Version of The TAC Relation Extraction Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). To address this gap, we introduce the MultiTACRED dataset, covering 12 typologically diverse languages from 9 language families, which is created by machine-translating TACRED instances and automatically projecting their entity annotations. |
Leonhard Hennig; Philippe Thomas; Sebastian Möller; |
211 | Towards Higher Pareto Frontier in Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new training framework, Pareto Mutual Distillation (Pareto-MD), towards pushing the Pareto frontier outwards rather than making trade-offs. |
Yichong Huang; Xiaocheng Feng; Xinwei Geng; Baohang Li; Bing Qin; |
212 | Small Pre-trained Language Models Can Be Fine-tuned As Large Models Via Over-Parameterization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on scaling up the parameters of PLMs only during fine-tuning, to benefit from the over-parameterization, while without increasing the inference latency. |
Ze-Feng Gao; Kun Zhou; Peiyu Liu; Wayne Xin Zhao; Ji-Rong Wen; |
213 | Entity Tracking in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a task probing to what extent a language model can infer the final state of an entity given an English description of the initial state and a series of state-changing operations. |
Najoung Kim; Sebastian Schuster; |
214 | A Textual Dataset for Situated Proactive Response Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the problem, we introduce a task of proactive response selection based on situational information. |
Naoki Otani; Jun Araki; HyeongSik Kim; Eduard Hovy; |
215 | DiffusionNER: Boundary Diffusion for Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose DiffusionNER, which formulates the named entity recognition task as a boundary-denoising diffusion process and thus generates named entities from noisy spans. |
Yongliang Shen; Kaitao Song; Xu Tan; Dongsheng Li; Weiming Lu; Yueting Zhuang; |
216 | WACO: Word-Aligned Contrastive Learning for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Word-Aligned COntrastive learning (WACO), a simple and effective method for extremely low-resource speech-to-text translation. |
Siqi Ouyang; Rong Ye; Lei Li; |
217 | Cross-lingual Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a principled Cross-lingual Continual Learning (CCL) evaluation paradigm, where we analyze different categories of approaches used to continually adapt to emerging data from different languages. |
Meryem M'hamdi; Xiang Ren; Jonathan May; |
218 | Faithful Question Answering with Monte-Carlo Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose FAME (FAithful question answering with MontE-carlo planning) to answer questions based on faithful reasoning steps. |
Ruixin Hong; Hongming Zhang; Hong Zhao; Dong Yu; Changshui Zhang; |
219 | Unbalanced Optimal Transport for Unbalanced Word Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To achieve unbalanced word alignment that values both alignment and null alignment, this study shows that the family of optimal transport (OT), i.e., balanced, partial, and unbalanced OT, are natural and powerful approaches even without tailor-made techniques. |
Yuki Arase; Han Bao; Sho Yokoi; |
220 | Guiding Computational Stance Detection with Expanded Stance Triangle Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the limited amount of available training data leads to subpar performance in out-of-domain and cross-target scenarios, as data-driven approaches are prone to rely on superficial and domain-specific features. In this work, we decompose the stance detection task from a linguistic perspective, and investigate key components and inference paths in this task. |
Zhengyuan Liu; Yong Keong Yap; Hai Leong Chieu; Nancy Chen; |
221 | Analyzing and Reducing The Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper analyzes the fine-tuning process, discovers when the performance gap changes and identifies which network weights affect the overall performance most. |
Yiduo Guo; Yaobo Liang; Dongyan Zhao; Bing Liu; Nan Duan; |
222 | Improving Self-training for Cross-lingual Named Entity Recognition with Contrastive and Prototype Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim to improve self-training for cross-lingual NER by combining representation learning and pseudo label refinement in one coherent framework. |
Ran Zhou; Xin Li; Lidong Bing; Erik Cambria; Chunyan Miao; |
223 | MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead, we propose MM-SHAP, a performance-agnostic multimodality score based on Shapley values that reliably quantifies in which proportions a multimodal model uses individual modalities. |
Letitia Parcalabescu; Anette Frank; |
224 | Towards Boosting The Open-Domain Chatbot with Human Feedback Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel and efficient framework Diamante to boost the open-domain chatbot, where two kinds of human feedback (including explicit demonstration and implicit preference) are collected and leveraged. |
Hua Lu; Siqi Bao; Huang He; Fan Wang; Hua Wu; Haifeng Wang; |
225 | Knowledge-enhanced Mixed-initiative Dialogue System for Emotional Support Conversations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the problem of mixed-initiative ESC where the user and system can both take the initiative in leading the conversation. |
Yang Deng; Wenxuan Zhang; Yifei Yuan; Wai Lam; |
226 | UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the reformulation, we propose a Unified Token-pair Classification architecture for Information Extraction (UTC-IE), where we introduce Plusformer on top of the token-pair feature matrix. |
Hang Yan; Yu Sun; Xiaonan Li; Yunhua Zhou; Xuanjing Huang; Xipeng Qiu; |
227 | Social-Group-Agnostic Bias Mitigation Via The Stereotype Content Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose that the Stereotype Content Model (SCM), a theoretical framework developed in social psychology for understanding the content of stereotyping, can help debiasing efforts to become social-group-agnostic by capturing the underlying connection between bias and stereotypes. |
Ali Omrani; Alireza Salkhordeh Ziabari; Charles Yu; Preni Golazizian; Brendan Kennedy; Mohammad Atari; Heng Ji; Morteza Dehghani; |
228 | Revisiting The Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing human evaluation studies for summarization either exhibit a low inter-annotator agreement or have insufficient scale, and an in-depth analysis of human evaluation is lacking. Therefore, we address the shortcomings of existing summarization evaluation along the following axes: (1) We propose a modified summarization salience protocol, Atomic Content Units (ACUs), which is based on fine-grained semantic units and allows for a high inter-annotator agreement. |
Yixin Liu; Alex Fabbri; Pengfei Liu; Yilun Zhao; Linyong Nan; Ruilin Han; Simeng Han; Shafiq Joty; Chien-Sheng Wu; Caiming Xiong; Dragomir Radev; |
229 | FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present FIREBALL, a large dataset containing nearly 25,000 unique sessions from real D&D gameplay on Discord with true game state info. |
Andrew Zhu; Karmanya Aggarwal; Alexander Feng; Lara Martin; Chris Callison-Burch; |
230 | A Fine-grained Comparison of Pragmatic Language Understanding in Humans and Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We perform a fine-grained comparison of language models and humans on seven pragmatic phenomena, using zero-shot prompting on an expert-curated set of English materials. |
Jennifer Hu; Sammy Floyd; Olessia Jouravlev; Evelina Fedorenko; Edward Gibson; |
231 | Counterfactual Multihop QA: A Cause-Effect Approach for Reducing Disconnected Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the existing QA models always rely on shortcuts, e.g., providing the true answer by only one fact, rather than multi-hop reasoning, which is referred to as the disconnected reasoning problem. To alleviate this issue, we propose a novel counterfactual multihop QA, a causal-effect approach that enables reducing the disconnected reasoning. |
Wangzhen Guo; Qinkang Gong; Yanghui Rao; Hanjiang Lai; |
232 | Causal-Debias: Unifying Debiasing in Pretrained Language Models and Fine-tuning Via Causal Invariant Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a unified debiasing framework Causal-Debias to remove unwanted stereotypical associations in PLMs during fine-tuning. |
Fan Zhou; Yuzhou Mao; Liu Yu; Yi Yang; Ting Zhong; |
233 | Parameter-Efficient Fine-Tuning Without Introducing New Latency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we demonstrate the feasibility of generating a sparse mask in a task-agnostic manner, wherein all downstream tasks share a common mask. |
Baohao Liao; Yan Meng; Christof Monz; |
234 | MANNER: A Variational Memory-Augmented Model for Cross Domain Few-Shot Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on the task of cross domain few-shot named entity recognition (NER), which aims to adapt the knowledge learned from source domain to recognize named entities in target domain with only a few labeled examples. To address this challenging task, we propose MANNER, a variational memory-augmented few-shot NER model. |
Jinyuan Fang; Xiaobin Wang; Zaiqiao Meng; Pengjun Xie; Fei Huang; Yong Jiang; |
235 | MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the MASSIVE dataset: Multilingual Amazon Slu resource package (SLURP) for Slot-filling, Intent classification, and Virtual assistant Evaluation. |
Jack FitzGerald; Christopher Hench; Charith Peris; Scott Mackie; Kay Rottmann; Ana Sanchez; Aaron Nash; Liam Urbach; Vishesh Kakarala; Richa Singh; Swetha Ranganath; Laurie Crist; Misha Britan; Wouter Leeuwis; Gokhan Tur; Prem Natarajan; |
236 | Distilling Script Knowledge from Large Language Models for Constrained Language Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we define the task of constrained language planning for the first time. |
Siyu Yuan; Jiangjie Chen; Ziquan Fu; Xuyang Ge; Soham Shah; Charles Jankowski; Yanghua Xiao; Deqing Yang; |
237 | REDFM: A Filtered and Multilingual Relation Extraction Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current RE models often rely on small datasets with low coverage of relation types, particularly when working with languages other than English. In this paper, we address the above issue and provide two new resources that enable the training and evaluation of multilingual RE systems. |
Pere-Lluís Huguet Cabot; Simone Tedeschi; Axel-Cyrille Ngonga Ngomo; Roberto Navigli; |
238 | Modeling Appropriate Language in Argumentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we operationalize appropriate language in argumentation for the first time. |
Timon Ziegenbein; Shahbaz Syed; Felix Lange; Martin Potthast; Henning Wachsmuth; |
239 | CELDA: Leveraging Black-box Language Model As Enhanced Classifier Without Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Clustering-enhanced Linear Discriminative Analysis (CELDA), a novel approach that improves the text classification accuracy with a very weak-supervision signal (i. e. , name of the labels). |
Hyunsoo Cho; Youna Kim; Sang-goo Lee; |
240 | MvP: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Multi-view Prompting (MVP) that aggregates sentiment elements generated in different orders, leveraging the intuition of human-like problem-solving processes from different views. |
Zhibin Gou; Qingyan Guo; Yujiu Yang; |
241 | ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose ACCENT, an event commonsense evaluation metric empowered by commonsense knowledge bases (CSKBs). |
Sarik Ghazarian; Yijia Shao; Rujun Han; Aram Galstyan; Nanyun Peng; |
242 | Explanation-based Finetuning Makes Models More Robust to Spurious Cues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose explanation-based finetuning as a general approach to mitigate LLMs' reliance on spurious correlations. |
Josh Magnus Ludan; Yixuan Meng; Tai Nguyen; Saurabh Shah; Qing Lyu; Marianna Apidianaki; Chris Callison-Burch; |
243 | CAME: Confidence-guided Adaptive Memory Efficient Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first study a confidence-guided strategy to reduce the instability of existing memory efficient optimizers. Based on this strategy, we propose CAME to simultaneously achieve two goals: fast convergence as in traditional adaptive methods, and low memory usage as in memory-efficient methods. |
Yang Luo; Xiaozhe Ren; Zangwei Zheng; Zhuo Jiang; Xin Jiang; Yang You; |
244 | On Second Thought, Let's Not Think Step By Step! Bias and Toxicity in Zero-Shot Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Concretely, we perform a controlled evaluation of zero-shot CoT across two socially sensitive domains: harmful questions and stereotype benchmarks. |
Omar Shaikh; Hongxin Zhang; William Held; Michael Bernstein; Diyi Yang; |
245 | Solving Math Word Problems Via Cooperative Reasoning Induced Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We notice that human reasoning has a dual reasoning framework that consists of an immediate reaction system (system 1) and a delicate reasoning system (system 2), where the entire reasoning is determined by their interaction. This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), resulting in a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier. |
Xinyu Zhu; Junjie Wang; Lin Zhang; Yuxiang Zhang; Yongfeng Huang; Ruyi Gan; Jiaxing Zhang; Yujiu Yang; |
246 | Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To eliminate the rule-based nature of data creation, we instead propose using machine translation models to create gender-biased text from real gender-fair text via round-trip translation. |
Chantal Amrhein; Florian Schottmann; Rico Sennrich; Samuel Läubli; |
247 | Early Discovery of Disappearing Entities in Microblogs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We make decisions by reacting to changes in the real world, particularly the emergence and disappearance of impermanent entities such as restaurants, services, and events. |
Satoshi Akasaki; Naoki Yoshinaga; Masashi Toyoda; |
248 | DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present DiffusionBERT, a new generative masked language model based on discrete diffusion models. |
Zhengfu He; Tianxiang Sun; Qiong Tang; Kuanning Wang; Xuanjing Huang; Xipeng Qiu; |
249 | Lifting The Curse of Capacity Gap in Distilling Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim at lifting the curse of capacity gap via enlarging the capacity of the student without notably increasing the inference compute. |
Chen Zhang; Yang Yang; Jiahao Liu; Jingang Wang; Yunsen Xian; Benyou Wang; Dawei Song; |
250 | Towards Faithful Dialogues Via Focus Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing models heavily rely on elaborate data engineering or increasing the model's parameters, neglecting to track the tokens that significantly influence losses, which is decisive for the optimization direction of the model in each iteration. To address this issue, we propose Focus Learning (FocusL), a novel learning approach that adjusts the contribution of each token to the optimization direction by directly scaling the corresponding objective loss. |
Yifan Deng; Xingsheng Zhang; Heyan Huang; Yue Hu; |
251 | Back Translation for Speech-to-text Translation Without Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to utilize large amounts of target-side monolingual data to enhance ST without transcripts. |
Qingkai Fang; Yang Feng; |
252 | Prompter: Zero-shot Adaptive Prefixes for Dialogue State Tracking Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our method, Prompter, uses descriptions of target domain slots to generate dynamic prefixes that are concatenated to the key and values at each layer's self-attention mechanism. |
Ibrahim Taha Aksu; Min-Yen Kan; Nancy Chen; |
253 | Enhancing Dialogue Generation Via Dynamic Graph Knowledge Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we propose a novel framework for knowledge graph enhanced dialogue generation. |
Chen Tang; Hongbo Zhang; Tyler Loakman; Chenghua Lin; Frank Guerin; |
254 | Multi-modal Action Chain Abductive Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate this research community, this paper sheds new light on Abductive Reasoning by studying a new vision-language task, Multi-modal Action chain abductive Reasoning (MAR), together with a large-scale Abductive Reasoning dataset: Given an incomplete set of language described events, MAR aims to imagine the most plausible event by spatio-temporal grounding in past video and then infer the hypothesis of subsequent action chain that can best explain the language premise. |
Mengze Li; Tianbao Wang; Jiahe Xu; Kairong Han; Shengyu Zhang; Zhou Zhao; Jiaxu Miao; Wenqiao Zhang; Shiliang Pu; Fei Wu; |
255 | Exploring The Capacity of Pretrained Language Models for Reasoning About Actions and Change Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose four essential RAC tasks as a comprehensive textual benchmark and generate problems in a way that minimizes the influence of other linguistic requirements (e.g., grounding) to focus on RAC. |
Weinan He; Canming Huang; Zhanhao Xiao; Yongmei Liu; |
256 | Unified Demonstration Retriever for In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Unified Demonstration Retriever (UDR), a single model to retrieve demonstrations for a wide range of tasks. |
Xiaonan Li; Kai Lv; Hang Yan; Tianyang Lin; Wei Zhu; Yuan Ni; Guotong Xie; Xiaoling Wang; Xipeng Qiu; |
257 | Movie101: A New Movie Understanding Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing works benchmark this challenge as a normal video captioning task via some simplifications, such as removing role names and evaluating narrations with n-gram-based metrics, which makes it difficult for automatic systems to meet the needs of real application scenarios. To narrow this gap, we construct a large-scale Chinese movie benchmark, named Movie101. |
Zihao Yue; Qi Zhang; Anwen Hu; Liang Zhang; Ziheng Wang; Qin Jin; |
258 | Enhancing Language Representation with Constructional Information for Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, PLMs primarily focus on acquiring lexico-semantic information, while they may be unable to adequately handle the meaning of constructions. To address this issue, we introduce construction grammar (CxG), which highlights the pairings of form and meaning, to enrich language representation. |
Lvxiaowei Xu; Jianwang Wu; Jiawei Peng; Zhilin Gong; Ming Cai; Tianxiang Wang; |
259 | Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs. |
Siyuan Wang; Zhongyu Wei; Meng Han; Zhihao Fan; Haijun Shan; Qi Zhang; Xuanjing Huang; |
260 | DimonGen: Diversified Generative Commonsense Reasoning for Explaining Concept Relationships Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose DimonGen, which aims to generate diverse sentences describing concept relationships in various everyday scenarios. |
Chenzhengyi Liu; Jie Huang; Kerui Zhu; Kevin Chen-Chuan Chang; |
261 | Incorporating Attribution Importance for Improving Faithfulness Metrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a simple yet effective soft erasure criterion. |
Zhixue Zhao; Nikolaos Aletras; |
262 | Reward Gaming in Conditional Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Under this framework, we identify three common cases where high rewards are incorrectly assigned to undesirable patterns: noise-induced spurious correlation, naturally occurring spurious correlation, and covariate shift. |
Richard Yuanzhe Pang; Vishakh Padmakumar; Thibault Sellam; Ankur Parikh; He He; |
263 | Hidden Schema Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we introduce a novel neural language model that enforces, via inductive biases, explicit relational structures which allow for compositionality onto the output representations of pretrained language models. |
Ramses Sanchez; Lukas Conrads; Pascal Welke; Kostadin Cvejoski; Cesar Ojeda Marin; |
264 | Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a novel method that operates on the hidden representations of a PLM to reduce overfitting. |
Linlin Liu; Xingxuan Li; Megh Thakkar; Xin Li; Shafiq Joty; Luo Si; Lidong Bing; |
265 | An Ordinal Latent Variable Model of Conflict Intensity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a probabilistic generative model that assumes each observed event is associated with a latent intensity class. |
Niklas Stoehr; Lucas Torroba Hennigen; Josef Valvoda; Robert West; Ryan Cotterell; Aaron Schein; |
266 | Multilingual Conceptual Coverage in Text-to-Image Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose "Conceptual Coverage Across Languages" (CoCo-CroLa), a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns. |
Michael Saxon; William Yang Wang; |
267 | Pre-Training to Learn in Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability by pre-training the model on a large collection of "intrinsic tasks" in the general plain-text corpus using the simple language modeling objective. |
Yuxian Gu; Li Dong; Furu Wei; Minlie Huang; |
268 | Ethical Considerations for Machine Translation of Indigenous Languages: Giving A Voice to The Speakers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The data collection, modeling and deploying machine translation systems thus result in new ethical questions that must be addressed. Motivated by this, we first survey the existing literature on ethical considerations for the documentation, translation, and general natural language processing for Indigenous languages. Afterward, we conduct and analyze an interview study to shed light on the positions of community leaders, teachers, and language activists regarding ethical concerns for the automatic translation of their languages. |
Manuel Mager; Elisabeth Mager; Katharina Kann; Ngoc Thang Vu; |
269 | Revisiting Non-English Text Simplification: A Unified Multilingual Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces the MultiSim benchmark, a collection of 27 resources in 12 distinct languages containing over 1. |
Michael Ryan; Tarek Naous; Wei Xu; |
270 | Don’t Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Pangu, a generic framework for grounded language understanding that capitalizes on the discriminative ability of LMs instead of their generative ability. |
Yu Gu; Xiang Deng; Yu Su; |
271 | Privacy-Preserving Domain Adaptation of Semantic Parsers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study ways in which realistic user utterances can be generated synthetically, to help increase the linguistic and functional coverage of the system, without compromising the privacy of actual users. |
Fatemehsadat Mireshghallah; Yu Su; Tatsunori Hashimoto; Jason Eisner; Richard Shin; |
272 | Guide The Many-to-One Assignment: Open Information Extraction Via IoU-aware Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The commonly utilized Hungarian algorithm for this procedure is restricted to handling one-to-one assignment among the desired tuples and tuple proposals, which ignores the correlation between proposals and affects the recall of the models. To solve this problem, we propose a dynamic many-to-one label assignment strategy named IOT. |
Kaiwen Wei; Yiran Yang; Li Jin; Xian Sun; Zequn Zhang; Jingyuan Zhang; Xiao Li; Linhao Zhang; Jintao Liu; Guo Zhi; |
273 | Actively Supervised Clustering for Open Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel setting, named actively supervised clustering for OpenRE. |
Jun Zhao; Yongxin Zhang; Qi Zhang; Tao Gui; Zhongyu Wei; Minlong Peng; Mingming Sun; |
274 | ConvGQR: Generative Query Reformulation for Conversational Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose ConvGQR, a new framework to reformulate conversational queries based on generative pre-trained language models (PLMs), one for query rewriting and another for generating potential answers. |
Fengran Mo; Kelong Mao; Yutao Zhu; Yihong Wu; Kaiyu Huang; Jian-Yun Nie; |
275 | KILM: Knowledge Injection Into Encoder-Decoder Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large pre-trained language models (PLMs) have been shown to retain implicit knowledge within their parameters. To enhance this implicit knowledge, we propose Knowledge Injection into Language Models (KILM), a novel approach that injects entity-related knowledge into encoder-decoder PLMs, via a generative knowledge infilling objective through continued pre-training. |
Yan Xu; Mahdi Namazifar; Devamanyu Hazarika; Aishwarya Padmakumar; Yang Liu; Dilek Hakkani-Tur; |
276 | VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Video-grounded Scene&Topic AwaRe dialogue (VSTAR) dataset, a large scale video-grounded dialogue understanding dataset based on 395 TV series. |
Yuxuan Wang; Zilong Zheng; Xueliang Zhao; Jinpeng Li; Yueqian Wang; Dongyan Zhao; |
277 | NLPeer: A Unified Resource for The Computational Study of Peer Review Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To remedy this, we introduce NLPeer, the first ethically sourced multidomain corpus of more than 5k papers and 11k review reports from five different venues. |
Nils Dycke; Ilia Kuznetsov; Iryna Gurevych; |
278 | IM-TQA: A Chinese Table Question Answering Dataset with Implicit and Multi-type Table Structures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Correspondingly, we propose an RGCN-RCI framework outperforming recent baselines. |
Mingyu Zheng; Yang Hao; Wenbin Jiang; Zheng Lin; Yajuan Lyu; QiaoQiao She; Weiping Wang; |
279 | Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents Z-Code++, a new pre-trained language model optimized for abstractive text summarization. |
Pengcheng He; Baolin Peng; Song Wang; Yang Liu; Ruochen Xu; Hany Hassan; Yu Shi; Chenguang Zhu; Wayne Xiong; Michael Zeng; Jianfeng Gao; Xuedong Huang; |
280 | Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models’ Memories Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether we can adapt PLMs both effectively and efficiently by only tuning a few parameters. |
Shizhe Diao; Tianyang Xu; Ruijia Xu; Jiawei Wang; Tong Zhang; |
281 | Unsupervised Graph-Text Mutual Conversion with A Unified Pretrained Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose INFINITY, a simple yet effective unsupervised method with a unified pretrained language model that does not introduce external annotation tools or additional parallel information. |
Yi Xu; Shuqian Sheng; Jiexing Qi; Luoyi Fu; Zhouhan Lin; Xinbing Wang; Chenghu Zhou; |
282 | Randomized Smoothing with Masked Inference for Adversarially Robust Text Classifications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce RSMI, a novel two-stage framework that combines randomized smoothing (RS) with masked inference (MI) to improve the adversarial robustness of NLP systems. |
Han Cheol Moon; Shafiq Joty; Ruochen Zhao; Megh Thakkar; Chi Xu; |
283 | SESCORE2: Learning Text Generation Evaluation Via Synthesizing Realistic Mistakes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SEScore2, a self-supervised approach for training a model-based metric for text generation evaluation. |
Wenda Xu; Xian Qian; Mingxuan Wang; Lei Li; William Yang Wang; |
284 | Tokenization and The Noiseless Channel Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose that good tokenizers lead to efficient channel usage, where the channel is the means by which some input is conveyed to the model and efficiency can be quantified in information-theoretic terms as the ratio of the Shannon entropy to the maximum entropy of the subword distribution. |
Vilém Zouhar; Clara Meister; Juan Gastaldi; Li Du; Mrinmaya Sachan; Ryan Cotterell; |
285 | Contextual Distortion Reveals Constituency: Masked Language Models Are Implicit Parsers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent advancements in pre-trained language models (PLMs) have demonstrated that these models possess some degree of syntactic awareness. To leverage this knowledge, we propose a novel chart-based method for extracting parse trees from masked language models (LMs) without the need to train separate parsers. |
Jiaxi Li; Wei Lu; |
286 | MetaAdapt: Domain Adaptive Few-Shot Misinformation Detection Via Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the data scarcity issue, we propose MetaAdapt, a meta learning based approach for domain adaptive few-shot misinformation detection. |
Zhenrui Yue; Huimin Zeng; Yang Zhang; Lanyu Shang; Dong Wang; |
287 | Tackling Modality Heterogeneity with Multi-View Calibration Network for Multimodal Sentiment Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: All these issues will increase the difficulty in understanding the sentiment of the multimodal content. In this paper, we propose a novel Multi-View Calibration Network (MVCN) to alleviate the above issues systematically. |
Yiwei Wei; Shaozu Yuan; Ruosong Yang; Lei Shen; Zhangmeizhi Li; Longbiao Wang; Meng Chen; |
288 | COLA: Contextualized Commonsense Causal Reasoning from The Causal Inference Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new task to detect commonsense causation between two events in an event sequence (i.e., context), called contextualized commonsense causal reasoning. |
Zhaowei Wang; Quyet V. Do; Hongming Zhang; Jiayao Zhang; Weiqi Wang; Tianqing Fang; Yangqiu Song; Ginny Wong; Simon See; |
289 | MEMEX: Detecting Explanatory Evidence for Memes Via Knowledge-Enriched Contextualization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel task, MEMEX – given a meme and a related document, the aim is to mine the context that succinctly explains the background of the meme. |
Shivam Sharma; Ramaneswaran S; Udit Arora; Md. Shad Akhtar; Tanmoy Chakraborty; |
290 | WikiHowQA: A Comprehensive Benchmark for Multi-Document Non-Factoid Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is a critical need for high-quality resources for multi-document NFQA (MD-NFQA) to train new models and evaluate answers’ grounding and factual consistency in relation to supporting documents. To address this gap, we introduce WikiHowQA, a new multi-document NFQA benchmark built on WikiHow, a website dedicated to answering “how-to” questions. |
Valeriia Bolotova-Baranova; Vladislav Blinov; Sofya Filippova; Falk Scholer; Mark Sanderson; |
291 | Making Language Models Better Reasoners with Step-Aware Verifier Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present DiVeRSe (Diverse Verifier on Reasoning Step), a novel approach that further enhances the reasoning capability of language models. |
Yifei Li; Zeqi Lin; Shizhuo Zhang; Qiang Fu; Bei Chen; Jian-Guang Lou; Weizhu Chen; |
292 | Distributed Marker Representation for Ambiguous Discourse Markers and Entangled Relations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose to learn a Distributed Marker Representation (DMR) by utilizing the (potentially) unlimited discourse marker data with a latent discourse sense, thereby bridging markers with sentence pairs. |
Dongyu Ru; Lin Qiu; Xipeng Qiu; Yue Zhang; Zheng Zhang; |
293 | MISGENDERED: Limits of Large Language Models in Understanding Pronouns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we comprehensively evaluate popular language models for their ability to correctly use English gender-neutral pronouns (e.g., singular they, them) and neo-pronouns (e.g., ze, xe, thon) that are used by individuals whose gender identity is not represented by binary pronouns. |
Tamanna Hossain; Sunipa Dev; Sameer Singh; |
294 | Reasoning with Language Model Prompting: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce research works with comparisons and summaries and provide systematic resources to help beginners. |
Shuofei Qiao; Yixin Ou; Ningyu Zhang; Xiang Chen; Yunzhi Yao; Shumin Deng; Chuanqi Tan; Fei Huang; Huajun Chen; |
295 | Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new MMT approach based on a strong text-only MT model, which uses neural adapters, a novel guided self-attention mechanism and which is jointly trained on both visually-conditioned masking and MMT. |
Matthieu Futeral; Cordelia Schmid; Ivan Laptev; Benoît Sagot; Rachel Bawden; |
296 | Hybrid Knowledge Transfer for Improved Cross-Lingual Event Detection Via Hierarchical Sample Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the Event Detection task under a zero-shot cross-lingual setting where a model is trained on a source language but evaluated on a distinct target language for which there is no labeled data available. |
Luis Guzman Nateras; Franck Dernoncourt; Thien Nguyen; |
297 | BLEURT Has Universal Translations: An Analysis of Automatic Metrics By Minimum Risk Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we systematically analyze and compare various mainstream and cutting-edge automatic metrics from the perspective of their guidance for training machine translation systems. |
Yiming Yan; Tao Wang; Chengqi Zhao; Shujian Huang; Jiajun Chen; Mingxuan Wang; |
298 | Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that a critical component lacking from current vision-language models is relation-level alignment: the ability to match directional semantic relations in text (e.g., “mug in grass”) with spatial relationships in the image (e.g., the position of the mug relative to the grass). To tackle this problem, we show that relation alignment can be enforced by encouraging the language attention from “mug” to “grass” (capturing the semantic relation “in”) to match the visual attention from the mug to the grass (capturing the corresponding physical relation). |
Rohan Pandey; Rulin Shao; Paul Pu Liang; Ruslan Salakhutdinov; Louis-Philippe Morency; |
299 | Enhancing Personalized Dialogue Generation with Contrastive Latent Variables: Combining Sparse and Dense Persona Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we combine the advantages of the three resources to obtain a richer and more accurate persona. |
Yihong Tang; Bo Wang; Miao Fang; Dongming Zhao; Kun Huang; Ruifang He; Yuexian Hou; |
300 | Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We take a step forward and study LMs’ abilities to make inferences based on injected facts (or propagate those facts): for example, after learning that something is a TV show, does an LM predict that you can watch it? |
Yasumasa Onoe; Michael Zhang; Shankar Padmanabhan; Greg Durrett; Eunsol Choi; |
301 | Explaining How Transformers Use Context to Build Predictions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we leverage recent advances in explainability of the Transformer and present a procedure to analyze models for language generation. |
Javier Ferrando; Gerard I. Gállego; Ioannis Tsiamas; Marta R. Costa-jussà; |
302 | DISCO: Distilling Counterfactuals with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce DISCO (DIStilled COunterfactual Data), a new method for automatically generating high-quality counterfactual data at scale. |
Zeming Chen; Qiyue Gao; Antoine Bosselut; Ashish Sabharwal; Kyle Richardson; |
303 | Non-Sequential Graph Script Induction Via Multimedia Grounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the new challenging task of non-sequential graph script induction, aiming to capture optional and interchangeable steps in procedural planning. |
Yu Zhou; Sha Li; Manling Li; Xudong Lin; Shih-Fu Chang; Mohit Bansal; Heng Ji; |
304 | SCOTT: Self-Consistent Chain-of-Thought Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose SCOTT, a faithful knowledge distillation method to learn a small, self-consistent CoT model from a teacher model that is orders of magnitude larger. |
Peifeng Wang; Zhengyang Wang; Zheng Li; Yifan Gao; Bing Yin; Xiang Ren; |
305 | Clinical Note Owns Its Hierarchy: Multi-Level Hypergraph Neural Networks for Patient-Level Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, we propose a taxonomy-aware multi-level hypergraph neural network (TM-HGNN), where multi-level hypergraphs assemble useful neutral words with rare keywords via note and taxonomy level hyperedges to retain the clinical semantic information. |
Nayeon Kim; Yinhua Piao; Sun Kim; |
306 | Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the “RSTformer”, a novel summarization model that comprehensively incorporates both the types and uncertainty of rhetorical relations. |
Dongqi Pu; Yifan Wang; Vera Demberg; |
307 | Evaluating Open-Domain Question Answering in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conduct a thorough analysis of various open-domain QA models, including LLMs, by manually evaluating their answers on a subset of NQ-open, a popular benchmark. |
Ehsan Kamalloo; Nouha Dziri; Charles Clarke; Davood Rafiei; |
308 | No Clues Good Clues: Out of Context Lexical Relation Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are indications that commonly used PTLMs already encode enough linguistic knowledge to allow the use of minimal (or no) textual context for some linguistically motivated tasks, thus notably reducing human effort and the need for data pre-processing, and favoring techniques that are language neutral since they do not rely on syntactic structures. In this work, we explore this idea for the tasks of lexical relation classification (LRC) and graded Lexical Entailment (LE). |
Lucia Pitarch; Jordi Bernad; Lacramioara Dranca; Carlos Bobed Lisbona; Jorge Gracia; |
309 | Won’t Get Fooled Again: Answering Questions with False Premises Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such frailties of PLMs often allude to the lack of knowledge within them. In this paper, we find that the PLMs already possess the knowledge required to rebut such questions, and the key is how to activate the knowledge. |
Shengding Hu; Yifan Luo; Huadong Wang; Xingyi Cheng; Zhiyuan Liu; Maosong Sun; |
310 | What The DAAM: Interpreting Stable Diffusion Using Cross Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we perform a text-image attribution analysis on Stable Diffusion, a recently open-sourced model. |
Raphael Tang; Linqing Liu; Akshat Pandey; Zhiying Jiang; Gefei Yang; Karun Kumar; Pontus Stenetorp; Jimmy Lin; Ferhan Ture; |
311 | Zero-shot Faithful Factual Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Faithfully correcting factual errors is critical for maintaining the integrity of textual knowledge bases and preventing hallucinations in sequence-to-sequence models. Drawing on humans’ ability to identify and correct factual errors, we present a zero-shot framework that formulates questions about input claims, looks for correct answers in the given evidence, and assesses the faithfulness of each correction based on its consistency with the evidence. |
Kung-Hsiang Huang; Hou Pong Chan; Heng Ji; |
312 | Open-Domain Hierarchical Event Schema Induction By Incremental Prompting and Verification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast, we propose to treat event schemas as a form of commonsense knowledge that can be derived from large language models (LLMs). |
Sha Li; Ruining Zhao; Manling Li; Heng Ji; Chris Callison-Burch; Jiawei Han; |
313 | Zero-shot Approach to Overcome Perturbation Sensitivity of Prompts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study aims to find high-quality prompts for the given task in a zero-shot setting. |
Mohna Chakraborty; Adithya Kulkarni; Qi Li; |
314 | Free Lunch: Robust Cross-Lingual Transfer Via Model Checkpoint Averaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, aiming to improve the robustness of “true” ZS-XLT and FS-XLT, we propose a simple and effective method that averages different checkpoints (i.e., model snapshots) during task fine-tuning. |
Fabian Schmidt; Ivan Vulic; Goran Glavaš; |
315 | Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Cross-View Language Modeling, a simple and effective pre-training framework that unifies cross-lingual and cross-modal pre-training with shared architectures and objectives. |
Yan Zeng; Wangchunshu Zhou; Ao Luo; Ziming Cheng; Xinsong Zhang; |
316 | Unsupervised Discontinuous Constituency Parsing with Mildly Context-Sensitive Grammars Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study grammar induction with mildly context-sensitive grammars for unsupervised discontinuous parsing. |
Songlin Yang; Roger Levy; Yoon Kim; |
317 | Simplicity Bias in Transformers and Their Ability to Learn Sparse Boolean Functions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conduct an extensive empirical study on Boolean functions to demonstrate the following: (i) Random Transformers are relatively more biased towards functions of low sensitivity. |
Satwik Bhattamishra; Arkil Patel; Varun Kanade; Phil Blunsom; |
318 | Counterspeeches Up My Sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore intent-conditioned counterspeech generation. |
Rishabh Gupta; Shaily Desai; Manvi Goel; Anil Bandhakavi; Tanmoy Chakraborty; Md. Shad Akhtar; |
319 | DITTO: Data-efficient and Fair Targeted Subset Selection for ASR Accent Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Choosing an informative subset of speech samples that are most representative of the target accents becomes important for effective ASR finetuning. To address this problem, we propose DITTO (Data-efficient and faIr Targeted subseT selectiOn), which uses Submodular Mutual Information (SMI) functions as acquisition functions to find the most informative set of utterances matching a target accent within a fixed budget. |
Suraj Kothawade; Anmol Mekala; D. Chandra Sekhara Hetha Havya; Mayank Kothyari; Rishabh Iyer; Ganesh Ramakrishnan; Preethi Jyothi; |
320 | Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the Verify-and-Edit framework for CoT prompting, which seeks to increase prediction factuality by post-editing reasoning chains according to external knowledge. |
Ruochen Zhao; Xingxuan Li; Shafiq Joty; Chengwei Qin; Lidong Bing; |
321 | Bridging The Domain Gaps in Context Representations for K-Nearest Neighbor Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there often exists a significant gap between upstream and downstream domains, which hurts the datastore retrieval and the final translation quality. To deal with this issue, we propose a novel approach to boost the datastore retrieval of kNN-MT by reconstructing the original datastore. |
Zhiwei Cao; Baosong Yang; Huan Lin; Suhang Wu; Xiangpeng Wei; Dayiheng Liu; Jun Xie; Min Zhang; Jinsong Su; |
322 | Node Placement in Argument Maps: Modeling Unidirectional Relations in High & Low-Resource Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support those users, we introduce the task of node placement: suggesting candidate nodes as parents for a new contribution. We establish an upper-bound of human performance, and conduct experiments with models of various sizes and training strategies. |
Iman Jundi; Neele Falk; Eva Maria Vecchi; Gabriella Lapesa; |
323 | Towards A Common Understanding of Contributing Factors for Cross-Lingual Transfer in Multilingual Language Models: A Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, given that the aspiration for such an ability has not been explicitly incorporated in the design of the majority of MLLMs, it is challenging to obtain a unique and straightforward explanation for its emergence. In this review paper, we survey literature that investigates different factors contributing to the capacity of MLLMs to perform zero-shot cross-lingual transfer and subsequently outline and discuss these factors in detail. |
Fred Philippy; Siwen Guo; Shohreh Haddadan; |
324 | Toward Human-Like Evaluation for Natural Language Generation with Error Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we argue that the ability to estimate sentence confidence is the tip of the iceberg for PLM-based metrics. |
Qingyu Lu; Liang Ding; Liping Xie; Kanjian Zhang; Derek F. Wong; Dacheng Tao; |
325 | Connective Prediction for Implicit Discourse Relation Recognition Via Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, these approaches spend lots of effort on template construction, negatively affecting the generalization capability. To address these problems, we propose a novel Connective Prediction via Knowledge Distillation (CP-KD) approach to instruct large-scale pre-trained language models (PLMs) to mine the latent correlations between connectives and discourse relations, which is meaningful for IDRR. |
Hongyi Wu; Hao Zhou; Man Lan; Yuanbin Wu; Yadong Zhang; |
326 | What Is The Best Recipe for Character-level Encoder-only Modelling? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to benchmark recent progress in language understanding models that output contextualised representations at the character level. |
Kris Cao; |
327 | Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore a more practical and scalable setting: weakly supervised multilingual VLP with only English image-text pairs and multilingual text corpora. |
Zejun Li; Zhihao Fan; Jingjing Chen; Qi Zhang; Xuanjing Huang; Zhongyu Wei; |
328 | Learning “O” Helps for Learning More: Handling The Unlabeled Entity Problem for Class-incremental NER Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conduct an empirical study on the “Unlabeled Entity Problem” and find that it leads to severe confusion between “O” and entities, decreasing class discrimination of old classes and declining the model’s ability to learn new classes. |
Ruotian Ma; Xuanting Chen; Zhang Lin; Xin Zhou; Junzhe Wang; Tao Gui; Qi Zhang; Xiang Gao; Yun Wen Chen; |
329 | Scene Graph As Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate a more realistic unsupervised multimodal machine translation (UMMT) setup, inference-time image-free UMMT, where the model is trained with source-text image pairs, and tested with only source-text inputs. |
Hao Fei; Qian Liu; Meishan Zhang; Min Zhang; Tat-Seng Chua; |
330 | CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these methods may suffer from label noise due to the automatic labeling process. In this paper, we propose CoLaDa, a Collaborative Label Denoising Framework, to address this problem. |
Tingting Ma; Qianhui Wu; Huiqiang Jiang; Börje Karlsson; Tiejun Zhao; Chin-Yew Lin; |
331 | Dialect-robust Evaluation of Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a suite of methods to assess whether metrics are dialect robust. |
Jiao Sun; Thibault Sellam; Elizabeth Clark; Tu Vu; Timothy Dozat; Dan Garrette; Aditya Siddhant; Jacob Eisenstein; Sebastian Gehrmann; |
332 | Understanding and Improving The Robustness of Terminology Constraints in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the robustness of two typical terminology translation methods: Placeholder (PH) and Code-Switch (CS), concerning (1) the number of constraints and (2) the target constraint length. |
Huaao Zhang; Qiang Wang; Bo Qin; Zelin Shi; Haibo Wang; Ming Chen; |
333 | Language Model Acceptability Judgements Are Not Always Robust to Context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we vary the input contexts based on: length, the types of syntactic phenomena it contains, and whether or not there are grammatical violations. |
Koustuv Sinha; Jon Gauthier; Aaron Mueller; Kanishka Misra; Keren Fuentes; Roger Levy; Adina Williams; |
334 | RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our results indicate that both state-of-the-art Table QA models and large language models (e.g., GPT-3) with few-shot learning falter in these adversarial sets. We propose to address this problem by using large language models to generate adversarial examples to enhance training, which significantly improves the robustness of Table QA models. |
Yilun Zhao; Chen Zhao; Linyong Nan; Zhenting Qi; Wenlin Zhang; Xiangru Tang; Boyu Mi; Dragomir Radev; |
335 | Morphological Inflection: A Reality Check Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve generalizability and reliability of results, we propose new data sampling and evaluation strategies that better reflect likely use-cases. |
Jordan Kodner; Sarah Payne; Salam Khalifa; Zoey Liu; |
336 | TOME: A Two-stage Approach for Model-based Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its attractive qualities, there remain several major challenges in model-based retrieval, including the discrepancy between pre-training and fine-tuning, and the discrepancy between training and inference. To deal with the above challenges, we propose a novel two-stage model-based retrieval approach called TOME, which makes two major technical contributions, including the utilization of tokenized URLs as identifiers and the design of a two-stage generation architecture. |
Ruiyang Ren; Wayne Xin Zhao; Jing Liu; Hua Wu; Ji-Rong Wen; Haifeng Wang; |
337 | Using Neural Machine Translation for Generating Diverse Challenging Exercises for Language Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel approach to automatically generate distractors for cloze exercises for English language learners, using round-trip neural machine translation. |
Frank Palma Gomez; Subhadarshi Panda; Michael Flor; Alla Rozovskaya; |
338 | Similarity-weighted Construction of Contextualized Commonsense Knowledge Graphs for Knowledge-intense Argumentation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new unsupervised method for constructing Contextualized Commonsense Knowledge Graphs (CCKGs) that selects contextually relevant knowledge from large knowledge graphs (KGs) efficiently and at high quality. |
Moritz Plenz; Juri Opitz; Philipp Heinisch; Philipp Cimiano; Anette Frank; |
339 | MiCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents miCSE, a mutual information-based contrastive learning framework that significantly advances the state-of-the-art in few-shot sentence embedding. |
Tassilo Klein; Moin Nabi; |
340 | Learning Non-linguistic Skills Without Sacrificing Linguistic Proficiency Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we take a closer look into the phenomena of catastrophic forgetting as it pertains to LLMs and subsequently offer a novel framework for non-linguistic skill injection for LLMs based on information-theoretic interventions and skill-specific losses that enable the learning of strict arithmetic reasoning. |
Mandar Sharma; Nikhil Muralidhar; Naren Ramakrishnan; |
341 | Forgotten Knowledge: Examining The Citational Amnesia in NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work systematically and empirically examines: How far back in time do we tend to go to cite papers? |
Janvijay Singh; Mukund Rungta; Diyi Yang; Saif Mohammad; |
342 | Measuring The Instability of Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze SD and six other measures quantifying instability of different granularity levels. |
Yupei Du; Dong Nguyen; |
343 | FairPrism: Evaluating Fairness-Related Harms in Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, we introduce FairPrism, a dataset of 5,000 examples of AI-generated English text with detailed human annotations covering a diverse set of harms relating to gender and sexuality. |
Eve Fleisig; Aubrie Amstutz; Chad Atalla; Su Lin Blodgett; Hal Daumé III; Alexandra Olteanu; Emily Sheng; Dan Vann; Hanna Wallach; |
344 | Factually Consistent Summarization Via Reinforcement Learning with Textual Entailment Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work we leverage recent progress on textual entailment models to directly address this problem for abstractive summarization systems. |
Paul Roit; Johan Ferret; Lior Shani; Roee Aharoni; Geoffrey Cideron; Robert Dadashi; Matthieu Geist; Sertan Girgin; Leonard Hussenot; Orgad Keller; Nikola Momchev; Sabela Ramos Garea; Piotr Stanczyk; Nino Vieillard; Olivier Bachem; Gal Elidan; Avinatan Hassidim; Olivier Pietquin; Idan Szpektor; |
345 | SIMMC-VR: A Task-oriented Multimodal Dialog Dataset with Situated and Immersive VR Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose SIMMC-VR, an extension of the SIMMC-2. |
Te-Lin Wu; Satwik Kottur; Andrea Madotto; Mahmoud Azab; Pedro Rodriguez; Babak Damavandi; Nanyun Peng; Seungwhan Moon; |
346 | Multilingual LLMs Are Better Cross-lingual In-context Learners with Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that the prevalent mode of selecting random input-label pairs to construct the prompt-context is severely limited in the case of cross-lingual ICL, primarily due to the lack of alignment in the input as well as the output spaces. To mitigate this, we propose a novel prompt construction strategy: Cross-lingual In-context Source Target Alignment (X-InSTA). |
Eshaan Tanwar; Subhabrata Dutta; Manish Borthakur; Tanmoy Chakraborty; |
347 | APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose APOLLO, a simple adaptive pretraining approach to improve the logical reasoning skills of language models. |
Soumya Sanyal; Yichong Xu; Shuohang Wang; Ziyi Yang; Reid Pryzant; Wenhao Yu; Chenguang Zhu; Xiang Ren; |
348 | MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Furthermore, multi-table operations often result in a tabular output, which necessitates table generation capabilities of tabular QA models. To fill this gap, we propose a new task of answering questions over multiple tables. |
Vaishali Pal; Andrew Yates; Evangelos Kanoulas; Maarten de Rijke; |
349 | To Copy Rather Than Memorize: A Vertical Learning Paradigm for Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this paper points out that the multi-hop relation rules are hard to memorize reliably due to the inherent deficiencies of such an implicit memorization strategy, making embedding models underperform in predicting links between distant entity pairs. To alleviate this problem, we present the Vertical Learning Paradigm (VLP), which extends embedding models by allowing them to explicitly copy target information from related factual triples for more accurate prediction. |
Rui Li; Xu Chen; Chaozhuo Li; Yanming Shen; Jianan Zhao; Yujing Wang; Weihao Han; Hao Sun; Weiwei Deng; Qi Zhang; Xing Xie; |
350 | CoAD: Automatic Diagnosis Through Symptom and Disease Collaborative Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its simplicity and superior performance demonstrated, a decline in disease diagnosis accuracy is observed caused by 1) a mismatch between symptoms observed during training and generation, and 2) the effect of different symptom orders on disease prediction. To address the above obstacles, we introduce the CoAD, a novel disease and symptom collaborative generation framework, which incorporates several key innovations to improve AD: 1) aligning sentence-level disease labels with multiple possible symptom inquiry steps to bridge the gap between training and generation; 2) expanding symptom labels for each sub-sequence of symptoms to enhance annotation and eliminate the effect of symptom order; 3) developing a repeated symptom input schema to effectively and efficiently learn the expanded disease and symptom labels. |
Huimin Wang; Wai Chung Kwan; Kam-Fai Wong; Yefeng Zheng; |
351 | Long-Tailed Question Answering in An Open World Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we define Open Long-Tailed QA (OLTQA) as learning from long-tailed distributed data and optimizing performance over seen and unseen QA tasks. |
Yi Dai; Hao Lang; Yinhe Zheng; Fei Huang; Yongbin Li; |
352 | Parallel Context Windows for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Parallel Context Windows (PCW), a method that alleviates the context window restriction for any off-the-shelf LLM without further training. |
Nir Ratner; Yoav Levine; Yonatan Belinkov; Ori Ram; Inbal Magar; Omri Abend; Ehud Karpas; Amnon Shashua; Kevin Leyton-Brown; Yoav Shoham; |
353 | Efficient Transformers with Dynamic Token Pooling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nevertheless, natural units of meaning, such as words or phrases, display varying sizes. To address this mismatch, we equip language models with a dynamic-pooling mechanism, which predicts segment boundaries in an autoregressive fashion. |
Piotr Nawrot; Jan Chorowski; Adrian Lancucki; Edoardo Maria Ponti; |
354 | Did The Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Document-level relation extraction (DocRE) has attracted increasing research interest recently. While models achieve consistent performance gains in DocRE, their underlying decision rules are still understudied: Do they make the right predictions according to rationales? In this paper, we take the first step toward answering this question and then introduce a new perspective on comprehensively evaluating a model. |
Haotian Chen; Bingsheng Chen; Xiangdong Zhou; |
355 | ContraCLM: Contrastive Learning For Causal Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite exciting progress in causal language models, the expressiveness of their representations is largely limited due to poor discrimination ability. To remedy this issue, we present CONTRACLM, a novel contrastive learning framework at both the token-level and the sequence-level. |
Nihal Jain; Dejiao Zhang; Wasi Uddin Ahmad; Zijian Wang; Feng Nan; Xiaopeng Li; Ming Tan; Ramesh Nallapati; Baishakhi Ray; Parminder Bhatia; Xiaofei Ma; Bing Xiang; |
356 | Advancing Multi-Criteria Chinese Word Segmentation Through Criterion Classification and Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that through a simple yet elegant input-hint-based MCCWS model, we can achieve state-of-the-art (SoTA) performances on several datasets simultaneously. |
Tzu Hsuan Chou; Chun-Yi Lin; Hung-Yu Kao; |
357 | Infusing Hierarchical Guidance Into Prompt Tuning: A Parameter-Efficient Framework for Multi-level Implicit Discourse Relation Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, the comprehension of hierarchical semantics for MIDRR makes the conversion much harder. In this paper, we propose a prompt-based Parameter-Efficient Multi-level IDRR (PEMI) framework to solve the above problems. |
Haodong Zhao; Ruifang He; Mengnan Xiao; Jing Xu; |
358 | Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explain the model pathology from the view of sentence representation and argue that the counter-intuitive bias degree and direction of the out-of-distribution examples' representation cause the pathology. |
Pengwei Zhan; Jing Yang; Xiao Huang; Chunlei Jing; Jingying Li; Liming Wang; |
359 | Are Fairy Tales Fair? Analyzing Gender Bias in Temporal Narrative Event Chains of Children's Fairy Tales Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work joins this interdisciplinary effort and makes a unique contribution by taking into account the event narrative structures when analyzing the social bias of stories. We propose a computational pipeline that automatically extracts a story's temporal narrative verb-based event chain for each of its characters as well as character attributes such as gender. |
Paulina Toro Isaza; Guangxuan Xu; Toye Oloko; Yufang Hou; Nanyun Peng; Dakuo Wang; |
360 | FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel dialogue pre-training model, FutureTOD, which distills future knowledge to the representation of the previous dialogue context using a self-training framework. |
Weihao Zeng; Keqing He; Yejie Wang; Chen Zeng; Jingang Wang; Yunsen Xian; Weiran Xu; |
361 | LAMBADA: Backward Chaining for Automated Reasoning in Natural Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The classical automated reasoning literature has shown that reasoning in the backward direction (i.e., from intended conclusion to supporting axioms) is significantly more efficient at proof-finding. Importing this intuition into the LM setting, we develop a Backward Chaining algorithm, called LAMBADA, that decomposes reasoning into four sub-modules, that are simply implemented by few-shot prompted LLM inference. |
Mehran Kazemi; Najoung Kim; Deepti Bhatia; Xin Xu; Deepak Ramachandran; |
362 | PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we construct a new large-scale persona commonsense knowledge graph, PeaCoK, containing ~100K human-validated persona facts. |
Silin Gao; Beatriz Borges; Soyoung Oh; Deniz Bayazit; Saya Kanno; Hiromi Wakaki; Yuki Mitsufuji; Antoine Bosselut; |
363 | OpenSR: Open-Modality Speech Recognition Via Maintaining Multi-Modality Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, when training the specific model of new domain, it often gets stuck in the lack of new-domain utterances, especially the labeled visual utterances. To break through this restriction, we attempt to achieve zero-shot modality transfer by maintaining the multi-modality alignment in phoneme space learned with unlabeled multimedia utterances in the high resource domain during the pre-training, and propose a training system Open-modality Speech Recognition (OpenSR) that enables the models trained on a single modality (e.g., audio-only) applicable to more modalities (e.g., visual-only and audio-visual). |
Xize Cheng; Tao Jin; Linjun Li; Wang Lin; Xinyu Duan; Zhou Zhao; |
364 | Retrieval-free Knowledge Injection Through Multi-Document Traversal for Dialogue Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, retrieval-augmented approaches rely on finely annotated retrieval training data and knowledge-grounded response generation data, making it costly to transfer. To tackle this challenge, this paper proposes a retrieval-free approach, KiDG, by automatically turning knowledge documents into simulated multi-turn dialogues through a Multi-Document Traversal algorithm. |
Rui Wang; Jianzhu Bao; Fei Mi; Yi Chen; Hongru Wang; Yasheng Wang; Yitong Li; Lifeng Shang; Kam-Fai Wong; Ruifeng Xu; |
365 | BERM: Training The Balanced and Extractable Representation for Matching to Improve Generalization Ability of Dense Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method to improve the generalization of dense retrieval via capturing matching signal called BERM. |
Shicheng Xu; Liang Pang; Huawei Shen; Xueqi Cheng; |
366 | Multiview Identifiers Enhanced Generative Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Current approaches use either a numeric ID or a text piece (such as a title or substrings) as the identifier. However, these identifiers cannot cover a passage's content well. As such, we are motivated to propose a new type of identifier, synthetic identifiers, that are generated based on the content of a passage and could integrate contextualized information that text pieces lack. |
Yongqi Li; Nan Yang; Liang Wang; Furu Wei; Wenjie Li; |
367 | Prompting Language Models for Linguistic Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although pretrained language models (PLMs) can be prompted to perform a wide range of language tasks, it remains an open question how much this ability comes from generalizable linguistic understanding versus surface-level lexical patterns. To test this, we present a structured prompting approach for linguistic structured prediction tasks, allowing us to perform zero- and few-shot sequence tagging with autoregressive PLMs. |
Terra Blevins; Hila Gonen; Luke Zettlemoyer; |
368 | Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we develop a novel task of hawkish-dovish classification and benchmark various pre-trained language models on the proposed dataset. |
Agam Shah; Suvan Paturi; Sudheer Chava; |
369 | RE-Matching: A Fine-Grained Semantic Matching Method for Zero-Shot Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a fine-grained semantic matching method tailored for zero-shot relation extraction. |
Jun Zhao; WenYu Zhan; Xin Zhao; Qi Zhang; Tao Gui; Zhongyu Wei; Junzhe Wang; Minlong Peng; Mingming Sun; |
370 | SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, discussions on sensitive issues can become toxic even if the users are well-intentioned. For safer models in such scenarios, we present the Sensitive Questions and Acceptable Response (SQuARe) dataset, a large-scale Korean dataset of 49k sensitive questions with 42k acceptable and 46k non-acceptable responses. |
Hwaran Lee; Seokhee Hong; Joonsuk Park; Takyoung Kim; Meeyoung Cha; Yejin Choi; Byoungpil Kim; Gunhee Kim; Eun-Ju Lee; Yong Lim; Alice Oh; Sangchul Park; Jung-Woo Ha; |
371 | Towards Standardizing Korean Grammatical Error Correction: Datasets and Annotation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we collect three datasets from different sources (Kor-Lang8, Kor-Native, and Kor-Learner) that cover a wide range of Korean grammatical errors. |
Soyoung Yoon; Sungjoon Park; Gyuwan Kim; Junhee Cho; Kihyo Park; Gyu Tae Kim; Minjoon Seo; Alice Oh; |
372 | FLamE: Few-shot Learning from Natural Language Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To effectively learn from explanations, we present FLamE, a two-stage few-shot learning framework that first generates explanations using GPT-3, and then fine-tunes a smaller model (e.g., RoBERTa) with generated explanations. |
Yangqiaoyu Zhou; Yiming Zhang; Chenhao Tan; |
373 | Learning Symbolic Rules Over Abstract Meaning Representations for Textual Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Therefore, in this paper, we propose a modular, NEuro-Symbolic Textual Agent (NESTA) that combines a generic semantic parser with a rule induction system to learn abstract interpretable rules as policies. |
Subhajit Chaudhury; Sarathkrishna Swaminathan; Daiki Kimura; Prithviraj Sen; Keerthiram Murugesan; Rosario Uceda-Sosa; Michiaki Tatsubori; Achille Fokoue; Pavan Kapanipathi; Asim Munawar; Alexander Gray; |
374 | Counterfactual Debiasing for Fact Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike previous works, we propose a novel method from a counterfactual view, namely CLEVER, which is augmentation-free and mitigates biases on the inference stage. |
Weizhi Xu; Qiang Liu; Shu Wu; Liang Wang; |
375 | What Social Attitudes About Gender Does BERT Encode? Leveraging Insights from Psycholinguistics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Much research has sought to evaluate the degree to which large language models reflect social biases. We complement such work with an approach to elucidating the connections between language model predictions and people's social attitudes. |
Julia Watson; Barend Beekhuizen; Suzanne Stevenson; |
376 | Rethinking Multimodal Entity and Relation Extraction from A Translation Point of View Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We revisit the multimodal entity and relation extraction from a translation point of view. |
Changmeng Zheng; Junhao Feng; Yi Cai; Xiaoyong Wei; Qing Li; |
377 | Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first dataset with fine-grained factual error annotations named DIASUMFACT. |
Rongxin Zhu; Jianzhong Qi; Jey Han Lau; |
378 | Improving The Robustness of Summarization Systems with Dual Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To create semantic-consistent substitutes, we propose a SummAttacker, which is an efficient approach to generating adversarial samples based on pre-trained language models. |
Xiuying Chen; Guodong Long; Chongyang Tao; Mingzhe Li; Xin Gao; Chengqi Zhang; Xiangliang Zhang; |
379 | Interpretable Math Word Problem Solution Generation Via Step-by-step Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a step-by-step planning approach for intermediate solution generation, which strategically plans the generation of the next solution step based on the MWP and the previous solution steps. |
Mengxue Zhang; Zichao Wang; Zhichao Yang; Weiqi Feng; Andrew Lan; |
380 | TemplateGEC: Improving Grammatical Error Correction with Detection Template Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Grammatical error correction (GEC) can be divided into sequence-to-edit (Seq2Edit) and sequence-to-sequence (Seq2Seq) frameworks, both of which have their pros and cons. To utilize the strengths and make up for the shortcomings of these frameworks, this paper proposes a novel method, TemplateGEC, which capitalizes on the capabilities of both Seq2Edit and Seq2Seq frameworks in error detection and correction respectively. |
Yinghao Li; Xuebo Liu; Shuo Wang; Peiyuan Gong; Derek F. Wong; Yang Gao; Heyan Huang; Min Zhang; |
381 | Deep Model Compression Also Helps Models Capture Ambiguity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this problem, we must consider how to exactly capture the degree of relationship between each sample and its candidate classes. In this work, we propose a novel method with deep model compression and show how such relationship can be accounted for. |
Hancheol Park; Jong Park; |
382 | Are Experts Needed? On Human Evaluation of Counselling Reflection Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Laypeople-based evaluation is less expensive and easier to scale, but its quality is unknown for reflections. Therefore, we explore whether laypeople can be an alternative to experts in evaluating a fundamental quality aspect: coherence and context-consistency. |
Zixiu Wu; Simone Balloccu; Ehud Reiter; Rim Helaoui; Diego Reforgiato Recupero; Daniele Riboni; |
383 | PairSpanBERT: An Enhanced Language Model for Bridging Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present PairSpanBERT, a SpanBERT-based pre-trained model specialized for bridging resolution. |
Hideo Kobayashi; Yufang Hou; Vincent Ng; |
384 | Compounding Geometric Operations for Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the synergy, we propose a new KGE model by leveraging all three operations in this work. |
Xiou Ge; Yun Cheng Wang; Bin Wang; C.-C. Jay Kuo; |
385 | Few-shot In-context Learning on Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To handle questions over diverse KBQA datasets with a unified training-free framework, we propose KB-BINDER, which for the first time enables few-shot in-context learning over KBQA tasks. |
Tianle Li; Xueguang Ma; Alex Zhuang; Yu Gu; Yu Su; Wenhu Chen; |
386 | Fact-Checking Complex Claims with Program-Guided Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Program-Guided Fact-Checking (ProgramFC), a novel fact-checking model that decomposes complex claims into simpler sub-tasks that can be solved using a shared library of specialized functions. |
Liangming Pan; Xiaobao Wu; Xinyuan Lu; Anh Tuan Luu; William Yang Wang; Min-Yen Kan; Preslav Nakov; |
387 | Patton: Language Model Pretraining on Text-Rich Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose our PretrAining on TexT-Rich NetwOrk framework Patton. |
Bowen Jin; Wentao Zhang; Yu Zhang; Yu Meng; Xinyang Zhang; Qi Zhu; Jiawei Han; |
388 | Soft Language Clustering for Multilingual Model Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose XLM-P, a method that contextually retrieves prompts as flexible guidance for encoding instances conditionally. |
Jiali Zeng; Yufan Jiang; Yongjing Yin; Yi Jing; Fandong Meng; Binghuai Lin; Yunbo Cao; Jie Zhou; |
389 | Curriculum Learning for Graph Neural Networks: A Multiview Competence-based Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new perspective on curriculum learning by introducing a novel approach that builds on graph complexity formalisms (as difficulty criteria) and model competence during training. |
Nidhi Vakil; Hadi Amiri; |
390 | When and How to Paraphrase for Named Entity Recognition? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we utilize simple strategies to annotate entity spans in generations and compare established and novel methods of paraphrasing in NLP such as back translation, specialized encoder-decoder models such as Pegasus, and GPT-3 variants for their effectiveness in improving downstream performance for NER across different levels of gold annotations and paraphrasing strength on 5 datasets. |
Saket Sharma; Aviral Joshi; Yiyun Zhao; Namrata Mukhija; Hanoz Bhathena; Prateek Singh; Sashank Santhanam; |
391 | UniEvent: Unified Generative Model with Multi-Dimensional Prefix for Zero-Shot Event-Relational Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance knowledge transfer and enable zero-shot generalization among various combinations, in this work we propose a novel unified framework, called UNIEVENT. |
Zhengwei Tao; Zhi Jin; Haiyan Zhao; Chengfeng Dou; Yongqiang Zhao; Tao Shen; Chongyang Tao; |
392 | Are Machine Rationales (Not) Useful to Humans? Measuring and Improving Human Utility of Free-text Rationales Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that, by estimating a rationale's helpfulness in answering similar unseen instances, we can measure its human utility to a better extent. We also translate this finding into an automated score, Gen-U, that we propose, which can help improve LMs' ability to generate rationales with better human utility, while maintaining most of its task performance. |
Brihi Joshi; Ziyi Liu; Sahana Ramnath; Aaron Chan; Zhewei Tong; Shaoliang Nie; Qifan Wang; Yejin Choi; Xiang Ren; |
393 | Automatic Annotation of Direct Speech in Written French Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our goal is to create a unified framework to design and evaluate AADS models in French. |
Noé Durandard; Viet Anh Tran; Gaspard Michel; Elena Epure; |
394 | Automatic Creation of Named Entity Recognition Datasets By Querying Phrase Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present a novel framework, HighGEN, that generates NER datasets with high-coverage pseudo-dictionaries. |
Hyunjae Kim; Jaehyo Yoo; Seunghyun Yoon; Jaewoo Kang; |
395 | Dynamic Transformers Provide A False Sense of Efficiency Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose a simple yet effective attacking framework, SAME, a novel slowdown attack framework on multi-exit models, which is specially tailored to reduce the efficiency of the multi-exit models. |
Yiming Chen; Simin Chen; Zexin Li; Wei Yang; Cong Liu; Robby Tan; Haizhou Li; |
396 | Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose M2C, a morphologically-aware framework for behavioral testing of NLP models. |
Ester Hlavnova; Sebastian Ruder; |
397 | Local Byte Fusion for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a Local Byte Fusion (LOBEF) method for byte-based machine translation, utilizing byte n-gram and word boundaries to aggregate local semantic information. |
Makesh Narsimhan Sreedhar; Xiangpeng Wan; Yu Cheng; Junjie Hu; |
398 | Where's The Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we thus introduce a multilingual punctuation-agnostic sentence segmentation method, currently covering 85 languages, trained in a self-supervised fashion on unsegmented text, by making use of newline characters which implicitly perform segmentation into paragraphs. |
Benjamin Minixhofer; Jonas Pfeiffer; Ivan Vulic; |
399 | Multi-target Backdoor Attacks for Code Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose task-agnostic backdoor attacks for code pre-trained models. |
Yanzhou Li; Shangqing Liu; Kangjie Chen; Xiaofei Xie; Tianwei Zhang; Yang Liu; |
400 | Learning Better Masking for Better Language Model Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the model may be influenced in complex ways by its pre-training status, which changes accordingly as training time goes on. In this paper, we show that such time-invariant MLM settings on masking ratio and masked content are unlikely to deliver an optimal outcome, which motivates us to explore the influence of time-variant MLM settings. |
Dongjie Yang; Zhuosheng Zhang; Hai Zhao; |
401 | VisText: A Benchmark for Semantically Rich Chart Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce VisText: a dataset of 12,441 pairs of charts and captions that describe the charts' construction, report key statistics, and identify perceptual and cognitive phenomena. |
Benny Tang; Angie Boggust; Arvind Satyanarayan; |
402 | Byte-Level Grammatical Error Correction Using Synthetic and Curated Corpora Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Grammatical error correction (GEC) is the task of correcting typos, spelling, punctuation and grammatical issues in text. Approaching the problem as a sequence-to-sequence task, we compare the use of a common subword unit vocabulary and byte-level encoding. Initial synthetic training data is created using an error-generating pipeline, and used for finetuning two subword-level models and one byte-level model. |
Svanhvít Lilja Ingólfsdóttir; Petur Ragnarsson; Haukur Jónsson; Haukur Simonarson; Vilhjalmur Thorsteinsson; Vésteinn Snæbjarnarson; |
403 | Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze the complementary characteristic of both methods and propose a multi-level knowledge distillation approach that integrates their strengths while mitigating their limitations. |
Qianhui Wu; Huiqiang Jiang; Haonan Yin; Börje Karlsson; Chin-Yew Lin; |
404 | Peeking Inside The Black Box: A Commonsense-aware Generative Framework for Explainable Complaint Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We address the task of explainable complaint detection and propose a commonsense-aware unified generative framework by reframing the multitask problem as a text-to-text generation task. |
Apoorva Singh; Raghav Jain; Prince Jha; Sriparna Saha; |
405 | MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce the MMDialog dataset to better facilitate multi-modal conversation. |
Jiazhan Feng; Qingfeng Sun; Can Xu; Pu Zhao; Yaming Yang; Chongyang Tao; Dongyan Zhao; Qingwei Lin; |
406 | ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate end-to-end poetry generation conditioned on styles such as rhyme, meter, and alliteration. |
Jonas Belouadi; Steffen Eger; |
407 | Envisioning Future from The Past: Hierarchical Duality Learning for Multi-Turn Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we define a widely neglected property in dialogue text, duality, which is a hierarchical property that is reflected in human behaviours in daily conversations: Based on the logic in a conversation (or a sentence), people can infer follow-up utterances (or tokens) based on the previous text, and vice versa. |
Ang Lv; Jinpeng Li; Shufang Xie; Rui Yan; |
408 | DualGATs: Dual Graph Attention Networks for Emotion Recognition in Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Dual Graph ATtention networks (DualGATs) to concurrently consider the complementary aspects of discourse structure and speaker-aware context, aiming for more precise ERC. |
Duzhen Zhang; Feilong Chen; Xiuyi Chen; |
409 | Consistent Prototype Learning for Few-Shot Continual Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we design a new N-way-K-shot Continual Relation Extraction (NK-CRE) task and propose a novel few-shot continual relation extraction method with Consistent Prototype Learning (ConPL) to address the aforementioned issues. |
Xiudi Chen; Hui Wu; Xiaodong Shi; |
410 | Matching Pairs: Attributing Fine-Tuned Models to Their Pre-Trained Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we need a method to investigate how a model was trained or a piece of text was generated, and what its pre-trained base model was. In this paper we take the first step to address this open problem by tracing back the origin of a given fine-tuned LLM to its corresponding pre-trained base model. |
Myles Foley; Ambrish Rawat; Taesung Lee; Yufang Hou; Gabriele Picco; Giulio Zizzo; |
411 | Large Language Models Meet NL2Code: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate further research and applications in this field, in this paper, we present a comprehensive survey of 27 existing large language models for NL2Code, and also review benchmarks and metrics. |
Daoguang Zan; Bei Chen; Fengji Zhang; Dianjie Lu; Bingchao Wu; Bei Guan; Wang Yongji; Jian-Guang Lou; |
412 | When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we conduct a case study in Financial NLP where multiple datasets exist for skills relevant to the domain, such as numeric reasoning and sentiment analysis. |
Jingwei Ni; Zhijing Jin; Qian Wang; Mrinmaya Sachan; Markus Leippold; |
413 | Enhancing Grammatical Error Correction Systems with Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To enhance GEC systems with explanations, we introduce EXPECT, a large dataset annotated with evidence words and grammatical error types. We propose several baselines and analyses to understand this task. |
Yuejiao Fei; Leyang Cui; Sen Yang; Wai Lam; Zhenzhong Lan; Shuming Shi; |
414 | Linguistic Representations for Fewer-shot Relation Extraction Across Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We focus on the task of relation extraction on three datasets of procedural text in two domains, cooking and materials science. |
Sireesh Gururaja; Ritam Dutt; Tinglong Liao; Carolyn Rosé; |
415 | DarkBERT: A Language Model for The Dark Side of The Internet Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce DarkBERT, a language model pretrained on Dark Web data. |
Youngjin Jin; Eugene Jang; Jian Cui; Jin-Woo Chung; Yongjae Lee; Seungwon Shin; |
416 | MDACE: MIMIC Documents Annotated with Code Evidence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a dataset for evidence/rationale extraction on an extreme multi-label classification task over long medical documents. |
Hua Cheng; Rana Jafari; April Russell; Russell Klopfer; Edmond Lu; Benjamin Striner; Matthew Gormley; |
417 | Towards Zero-Shot Multilingual Transfer for Code-Switched Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new adapter-based framework that allows for efficient transfer by learning task-specific representations and encapsulating source and target language representations. |
Ting-Wei Wu; Changsheng Zhao; Ernie Chang; Yangyang Shi; Pierce Chuang; Vikas Chandra; Biing Juang; |
418 | One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To achieve even greater storage reduction, we propose ProPETL, a novel method that enables efficient sharing of a single prototype PETL network (e.g., adapter, LoRA, and prefix-tuning) across layers and tasks. |
Guangtao Zeng; Peiyuan Zhang; Wei Lu; |
419 | Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we aim to preliminarily test *whether NLG can generate humor as humans do*. |
Jianquan Li; XiangBo Wu; Xiaokang Liu; Qianqian Xie; Prayag Tiwari; Benyou Wang; |
420 | Convergence and Diversity in The Control Hierarchy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We adapt Weir's definition of a controllable CFG (called a labeled distinguished CFG) to give a definition of controllable pushdown automata (PDAs), called labeled distinguished PDAs. |
Alexandra Butoi; Ryan Cotterell; David Chiang; |
421 | ConFEDE: Contrastive Feature Decomposition for Multimodal Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose ConFEDE, a unified learning framework that jointly performs contrastive representation learning and contrastive feature decomposition to enhance the representation of multimodal information. |
Jiuding Yang; Yakun Yu; Di Niu; Weidong Guo; Yu Xu; |
422 | Using Domain Knowledge to Guide Dialog Structure Induction Via Neural Probabilistic Soft Logic Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Neural Probabilistic Soft Logic Dialogue Structure Induction (NEUPSL DSI), a principled approach that injects symbolic knowledge into the latent space of a generative neural model. |
Connor Pryor; Quan Yuan; Jeremiah Liu; Mehran Kazemi; Deepak Ramachandran; Tania Bedrax-Weiss; Lise Getoor; |
423 | Are You Copying My Model? Protecting The Copyright of Large Language Models for EaaS Via Backdoor Watermark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To protect the copyright of LLMs for EaaS, we propose an Embedding Watermark method called EmbMarker that implants backdoors on embeddings. |
Wenjun Peng; Jingwei Yi; Fangzhao Wu; Shangxi Wu; Bin Bin Zhu; Lingjuan Lyu; Binxing Jiao; Tong Xu; Guangzhong Sun; Xing Xie; |
424 | Answering Ambiguous Questions Via Iterative Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions. |
Weiwei Sun; Hengyi Cai; Hongshen Chen; Pengjie Ren; Zhumin Chen; Maarten de Rijke; Zhaochun Ren; |
425 | A Dataset of Argumentative Dialogues on Scientific Papers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ArgSciChat, a dataset of 41 argumentative dialogues between scientists on 20 NLP papers. |
Federico Ruggeri; Mohsen Mesgar; Iryna Gurevych; |
426 | Massively Multilingual Lexical Specialization of Multilingual Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Concretely, we use BabelNet's multilingual synsets to create synonym pairs (or synonym-gloss pairs) across 50 languages and then subject the MMTs (mBERT and XLM-R) to a lexical specialization procedure guided by a contrastive objective. We show that such massively multilingual lexical specialization brings substantial gains in two standard cross-lingual lexical tasks, bilingual lexicon induction and cross-lingual word similarity, as well as in cross-lingual sentence retrieval. |
Tommaso Green; Simone Paolo Ponzetto; Goran Glavaš; |
427 | RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, in the era of large general-purpose language agents, fine-tuning is neither computationally nor spatially efficient as it results in multiple copies of the network. In this work, we introduce RL4F (Reinforcement Learning for Feedback), a multi-agent collaborative framework where the critique generator is trained to maximize end-task performance of GPT-3, a fixed model more than 200 times its size. |
Afra Feyza Akyurek; Ekin Akyurek; Ashwin Kalyan; Peter Clark; Derry Tanti Wijaya; Niket Tandon; |
428 | WebIE: Faithful and Robust Information Extraction on The Web Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present WebIE, the first large-scale, entity-linked closed IE dataset consisting of 1. |
Chenxi Whitehouse; Clara Vania; Alham Fikri Aji; Christos Christodoulopoulos; Andrea Pierleoni; |
429 | NormBank: A Knowledge Bank of Situational Social Norms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present NormBank, a knowledge bank of 155k situational norms. |
Caleb Ziems; Jane Dwivedi-Yu; Yi-Chia Wang; Alon Halevy; Diyi Yang; |
430 | DIP: Dead Code Insertion Based Black-box Attack for Programming Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose DIP (Dead code Insertion based Black-box Attack for Programming Language Model), a high-performance and effective black-box attack method to generate adversarial examples using dead code insertion. |
CheolWon Na; YunSeok Choi; Jee-Hyong Lee; |
431 | Modeling Structural Similarities Between Documents for Coherence Assessment with Graph Convolutional Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate a GCN-based coherence model that is capable of capturing structural similarities between documents. |
Wei Liu; Xiyan Fu; Michael Strube; |
432 | HiTIN: Hierarchy-aware Tree Isomorphism Network for Hierarchical Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Hierarchy-aware Tree Isomorphism Network (HiTIN) to enhance the text representations with only syntactic information of the label hierarchy. |
He Zhu; Chong Zhang; Junjie Huang; Junran Wu; Ke Xu; |
433 | Contextual Knowledge Learning for Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach to context and knowledge weighting as an integral part of model training. |
Wen Zheng; Natasa Milic-Frayling; Ke Zhou; |
434 | Easy Guided Decoding in Providing Suggestions for Interactive Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we utilize the parameterized objective function of neural machine translation (NMT) and propose a novel constrained decoding algorithm, namely Prefix-Suffix Guided Decoding (PSGD), to deal with the TS problem without additional training. |
Ke Wang; Xin Ge; Jiayi Wang; Yuqi Zhang; Yu Zhao; |
435 | Discourse-Centric Evaluation of Document-level Machine Translation with A New Densely Annotated Parallel Corpus of Novels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using these annotations, we systematically investigate the similarities and differences between the discourse structures of source and target languages, and the challenges they pose to MT. We discover that MT outputs differ fundamentally from human translations in terms of their latent discourse structures. This gives us a new perspective on the challenges and opportunities in document-level MT. We make our resource publicly available to spur future research in document-level MT and its generalization to other language translation tasks. |
Yuchen Eleanor Jiang; Tianyu Liu; Shuming Ma; Dongdong Zhang; Mrinmaya Sachan; Ryan Cotterell; |
436 | CMOT: Cross-modal Mixup Via Optimal Transport for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Cross-modal Mixup via Optimal Transport (CMOT) to overcome the modality gap. |
Yan Zhou; Qingkai Fang; Yang Feng; |
437 | On The Evaluation of Neural Selective Prediction Methods for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a survey and empirical comparison of the state-of-the-art in neural selective classification for NLP tasks. |
Zhengyao Gu; Mark Hopkins; |
438 | Speech-Text Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Speech-text Pre-training for spoken dialog understanding with ExpliCiT cRoss-Modal Alignment (SPECTRA), which is the first-ever speech-text dialog pre-training model. |
Tianshu Yu; Haoyu Gao; Ting-En Lin; Min Yang; Yuchuan Wu; Wentao Ma; Chao Wang; Fei Huang; Yongbin Li; |
439 | Text Style Transfer with Contrastive Transfer Pattern Mining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing studies mainly focus on the transformation between styles, yet ignore that this transformation can be actually carried out via different hidden transfer patterns. To address this problem, we propose a novel approach, contrastive transfer pattern mining (CTPM), which automatically mines and utilizes inherent latent transfer patterns to improve the performance of TST. |
Jingxuan Han; Quan Wang; Licheng Zhang; Weidong Chen; Yan Song; Zhendong Mao; |
440 | Zero- and Few-Shot Event Detection Via Prompt-Based Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In our framework, we propose to use the cloze-based prompt and a trigger-aware soft verbalizer to efficiently project output to unseen event types. |
Zhenrui Yue; Huimin Zeng; Mengfei Lan; Heng Ji; Dong Wang; |
441 | Text Style Transfer Back-Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For natural inputs, BT brings only slight improvements and sometimes even adverse effects. To address this issue, we propose Text Style Transfer Back Translation (TST BT), which uses a style transfer to modify the source side of BT data. |
Daimeng Wei; Zhanglin Wu; Hengchao Shang; Zongyao Li; Minghan Wang; Jiaxin Guo; Xiaoyu Chen; Zhengzhe Yu; Hao Yang; |
442 | Generating Visual Spatial Description Via Holistic 3D Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the incorporation of 3D scene features for VSD. |
Yu Zhao; Hao Fei; Wei Ji; Jianguo Wei; Meishan Zhang; Min Zhang; Tat-Seng Chua; |
443 | Continual Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a method called continual knowledge distillation to take advantage of existing translation models to improve one model of interest. |
Yuanchi Zhang; Peng Li; Maosong Sun; Yang Liu; |
444 | Query Refinement Prompts for Closed-Book Long-Form QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We resolve the difficulty of evaluating long-form output by doing both tasks at once: question answering that requires long-form answers. Such questions tend to be multifaceted, i.e., they may have ambiguities and/or require information from multiple sources. |
Reinald Kim Amplayo; Kellie Webster; Michael Collins; Dipanjan Das; Shashi Narayan; |
445 | CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared with short videos, long videos are also highly demanded but less explored, which brings new challenges in higher inference computation cost and weaker multi-modal alignment. To address these challenges, we propose CONE, an efficient COarse-to-fiNE alignment framework. |
Zhijian Hou; Wanjun Zhong; Lei Ji; Difei Gao; Kun Yan; W.k. Chan; Chong-Wah Ngo; Mike Zheng Shou; Nan Duan; |
446 | Few-Shot Document-Level Event Argument Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study to capture event arguments that actually spread across sentences in documents. |
Xianjun Yang; Yujie Lu; Linda Petzold; |
447 | ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset By AMR Back-Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present ParaAMR, a large-scale syntactically diverse paraphrase dataset created by abstract meaning representation back-translation. |
Kuan-Hao Huang; Varun Iyer; I-Hung Hsu; Anoop Kumar; Kai-Wei Chang; Aram Galstyan; |
448 | Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Firstly, the current objective of KD spreads its focus to whole distributions to learn the knowledge, yet lacks special treatment on the most crucial top-1 information. Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. To address these issues, we propose a new method named Top-1 Information Enhanced Knowledge Distillation (TIE-KD). |
Songming Zhang; Yunlong Liang; Shuaibo Wang; Yufeng Chen; Wenjuan Han; Jian Liu; Jinan Xu; |
449 | Multi-Row, Multi-Span Distant Supervision For Table+Text Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This leads to a noisy multi-instance training regime. We present MITQA, a transformer-based TextTableQA system that is explicitly designed to cope with distant supervision along both these axes, through a multi-instance loss objective, together with careful curriculum design. |
Vishwajeet Kumar; Yash Gupta; Saneem Chemmengath; Jaydeep Sen; Soumen Chakrabarti; Samarth Bharadwaj; Feifei Pan; |
450 | HAHE: Hierarchical Attention for Hyper-Relational Knowledge Graphs in Global and Local Level Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing research seldom simultaneously models the graphical and sequential structure of HKGs, limiting HKGs' representation. To overcome this limitation, we propose a novel Hierarchical Attention model for HKG Embedding (HAHE), including global-level and local-level attention. |
Haoran Luo; Haihong E; Yuhao Yang; Yikai Guo; Mingzhi Sun; Tianyu Yao; Zichen Tang; Kaiyang Wan; Meina Song; Wei Lin; |
451 | ORGAN: Observation-Guided Radiology Report Generation Via Tree Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an Observation-guided radiology Report Generation framework (ORGan). |
Wenjun Hou; Kaishuai Xu; Yi Cheng; Wenjie Li; Jiang Liu; |
452 | Data Curation Alone Can Stabilize In-context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that carefully curating a subset of training data greatly stabilizes ICL performance without any other changes to the ICL algorithm (e.g., prompt retrieval or calibration). |
Ting-Yun Chang; Robin Jia; |
453 | MidMed: Towards Mixed-Type Dialogues for Medical Consultation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, in many real situations, due to the lack of medical knowledge, it is usually difficult for patients to determine clear goals with all necessary slots. In this paper, we identify this challenge as how to construct medical consultation dialogue systems to help patients clarify their goals. |
Xiaoming Shi; Zeming Liu; Chuan Wang; Haitao Leng; Kui Xue; Xiaofan Zhang; Shaoting Zhang; |
454 | FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by fusion-in-decoder (FiD) models, which efficiently aggregate more passages and thus outperform concatenation-based models in open-domain QA, we hypothesize that similar techniques can be applied to improve the efficiency and end-task performance of ICL. To verify this, we present a comprehensive study on applying three fusion methods to ICL: concatenation-based (early fusion), FiD (intermediate), and ensemble-based (late). |
Qinyuan Ye; Iz Beltagy; Matthew Peters; Xiang Ren; Hannaneh Hajishirzi; |
455 | S2ynRE: Two-stage Self-training with Synthetic Data for Low-resource Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose S2ynRE, a framework of two-stage Self-training with Synthetic data for Relation Extraction. |
Benfeng Xu; Quan Wang; Yajuan Lyu; Dai Dai; Yongdong Zhang; Zhendong Mao; |
456 | DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, two pain points persist for this paradigm: (a) as the pre-trained models grow bigger (e.g., 175B parameters for GPT-3), even the fine-tuning process can be time-consuming and computationally expensive; (b) the fine-tuned model has the same size as its starting point by default, which is neither sensible due to its more specialized functionality, nor practical since many fine-tuned models will be deployed in resource-constrained environments. To address these pain points, we propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights. |
Xuxi Chen; Tianlong Chen; Weizhu Chen; Ahmed Hassan Awadallah; Zhangyang Wang; Yu Cheng; |
457 | CASE: Aligning Coarse-to-Fine Cognition and Affection for Empathetic Response Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose the CASE model for empathetic dialogue generation. |
Jinfeng Zhou; Chujie Zheng; Bo Wang; Zheng Zhang; Minlie Huang; |
458 | Comparative Evaluation of Boundary-relaxed Annotation for Entity Linking Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For those cases, a lenient annotation guideline could relieve the annotators' workload and speed up the process. This paper presents a case study designed to verify the feasibility of such annotation process and evaluate the impact of boundary-relaxed annotation in an Entity Linking pipeline. |
Gabriel Herman Bernardim Andrade; Shuntaro Yada; Eiji Aramaki; |
459 | Do CoNLL-2003 Named Entity Taggers Still Work Well in 2023? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the generalization of over 20 different models trained on CoNLL-2003, and show that NER models have very different generalization. |
Shuheng Liu; Alan Ritter; |
460 | READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, little study has been done to construct such benchmarks for Chinese, where various language-specific input noises happen in the real world. In order to fill this important gap, we construct READIN: a Chinese multi-task benchmark with REalistic And Diverse Input Noises. |
Chenglei Si; Zhengyan Zhang; Yingfa Chen; Xiaozhi Wang; Zhiyuan Liu; Maosong Sun; |
461 | MAD-TSC: A Multilingual Aligned News Dataset for Target-dependent Sentiment Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce MAD-TSC, a new dataset which differs substantially from existing resources. |
Evan Dufraisse; Adrian Popescu; Julien Tourille; Armelle Brun; Jerome Deshayes; |
462 | A New Dataset and Empirical Study for Sentence Simplification in Chinese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The development of Chinese sentence simplification is relatively slow due to the lack of data. To alleviate this limitation, this paper introduces CSS, a new dataset for assessing sentence simplification in Chinese. |
Shiping Yang; Renliang Sun; Xiaojun Wan; |
463 | Factual or Contextual? Disentangling Error Types in Entity Description Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop an evaluation paradigm that enables us to disentangle these two types of errors in naturally occurring textual contexts. |
Navita Goyal; Ani Nenkova; Hal Daumé III; |
464 | Weakly Supervised Vision-and-Language Pre-training with Relative Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This affects the data quality and thus the effectiveness of pre-training. In this paper, we propose to directly take a small number of aligned image-text pairs as anchors, and represent each unaligned image and text by its similarities to these anchors, i.e., relative representations. |
Chi Chen; Peng Li; Maosong Sun; Yang Liu; |
465 | HermEs: Interactive Spreadsheet Formula Prediction Via Hierarchical Formulet Expansion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose HermEs, the first approach for spreadsheet formula prediction via HiEraRchical forMulet ExpanSion, where hierarchical expansion means generating formulas following the underlying parse tree structure, and Formulet refers to commonly-used multi-level patterns mined from real formula parse trees. |
Wanrong He; Haoyu Dong; Yihuai Gao; Zhichao Fan; Xingzhuo Guo; Zhitao Hou; Xiao Lv; Ran Jia; Shi Han; Dongmei Zhang; |
466 | ArgU: A Controllable Factual Argument Generator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce ArgU: a neural argument generator capable of producing factual arguments from input facts and real-world concepts that can be explicitly controlled for stance and argument structure using Walton's argument scheme-based control codes. |
Sougata Saha; Rohini Srihari; |
467 | Learning Answer Generation Using Supervision from Automatic Question Answering Evaluators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel training paradigm for GenQA using supervision from automatic QA evaluation models (GAVA). |
Matteo Gabburo; Siddhant Garg; Rik Koncel-Kedziorski; Alessandro Moschitti; |
468 | RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new retrieval-enhanced approach for personalized response generation. |
Shuai Liu; Hyundong Cho; Marjorie Freedman; Xuezhe Ma; Jonathan May; |
469 | Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing Via Autoregressive Span Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a simple and unified approach for both continuous and discontinuous constituency parsing via autoregressive span selection. |
Songlin Yang; Kewei Tu; |
470 | Laziness Is A Virtue When It Comes to Compositionality in Neural Semantic Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we approach semantic parsing from, quite literally, the opposite direction; that is, we introduce a neural semantic parsing generation method that constructs logical forms from the bottom up, beginning from the logical form's leaves. |
Maxwell Crouse; Pavan Kapanipathi; Subhajit Chaudhury; Tahira Naseem; Ramon Fernandez Astudillo; Achille Fokoue; Tim Klinger; |
471 | AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel attribution-driven knowledge distillation approach, which explores the token-level rationale behind the teacher model based on Integrated Gradients (IG) and transfers attribution knowledge to the student model. |
Siyue Wu; Hongzhan Chen; Xiaojun Quan; Qifan Wang; Rui Wang; |
472 | (QA)2: Question Answering with Questionable Assumptions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose (QA)2 (Question Answering with Questionable Assumptions), an open-domain evaluation dataset consisting of naturally occurring search engine queries that may or may not contain questionable assumptions. |
Najoung Kim; Phu Mon Htut; Samuel R. Bowman; Jackson Petty; |
473 | Attributable and Scalable Opinion Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a method for unsupervised opinion summarization that encodes sentences from customer reviews into a hierarchical discrete latent space, then identifies common opinions based on the frequency of their encodings. |
Tom Hosking; Hao Tang; Mirella Lapata; |
474 | Targeted Data Generation: Finding and Fixing Model Weaknesses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Targeted Data Generation (TDG), a framework that automatically identifies challenging subgroups, and generates new data for those subgroups using large language models (LLMs) with a human in the loop. |
Zexue He; Marco Tulio Ribeiro; Fereshte Khani; |
475 | HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this paradigm poses issues of inefficient updating and resource over-consumption for fine-tuning in data-scarce and resource-limited scenarios, because of the large scale of parameters in PLMs. To alleviate these concerns, in this paper, we propose a parameter-efficient fine-tuning method, HiFi, in which only the attention heads that are highly informative and strongly correlated for the specific task are fine-tuned. |
Anchun Gui; Han Xiao; |
476 | CFSum Coarse-to-Fine Contribution Network for Multimodal Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing multimodal summarization approaches focus on designing the fusion methods of different modalities, while ignoring the adaptive conditions under which visual modalities are useful. Therefore, we propose a novel Coarse-to-Fine contribution network for multimodal Summarization (CFSum) to consider different contributions of images for summarization. |
Min Xiao; Junnan Zhu; Haitao Lin; Yu Zhou; Chengqing Zong; |
477 | On “Scientific Debt” in NLP: A Case for More Rigour in Language Model Pre-Training Research Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a case in point by revisiting the success of BERT over its baselines, ELMo and GPT-1, and demonstrate how, under comparable conditions where the baselines are tuned to a similar extent, these baselines (and even simpler variants thereof) can, in fact, achieve competitive or better performance than BERT. |
Made Nindyatama Nityasya; Haryo Wibowo; Alham Fikri Aji; Genta Winata; Radityo Eko Prasojo; Phil Blunsom; Adhiguna Kuncoro; |
478 | End-to-end Knowledge Retrieval with Multi-modal Queries Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a retriever model “ReViz” that can directly process input text and images to retrieve relevant knowledge in an end-to-end fashion without being dependent on intermediate modules such as object detectors or caption generators. |
Man Luo; Zhiyuan Fang; Tejas Gokhale; Yezhou Yang; Chitta Baral; |
479 | AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present AV-TranSpeech, the first audio-visual speech-to-speech (AV-S2ST) translation model without relying on intermediate text. |
Rongjie Huang; Huadai Liu; Xize Cheng; Yi Ren; Linjun Li; Zhenhui Ye; Jinzheng He; Lichao Zhang; Jinglin Liu; Xiang Yin; Zhou Zhao; |
480 | Dual Class Knowledge Propagation Network for Multi-label Few-shot Intent Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies are confused by the identical representation of the utterance with multiple labels and overlook the intrinsic intra-class and inter-class interactions. To address these two limitations, we propose a novel dual class knowledge propagation network in this paper. |
Feng Zhang; Wei Chen; Fei Ding; Tengjiao Wang; |
481 | VendorLink: An NLP Approach for Identifying & Linking Vendor Migrants & Potential Aliases on Darknet Markets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To identify relationships between illegal markets and their vendors, we propose VendorLink, an NLP-based approach that examines writing patterns to verify, identify, and link unique vendor accounts across text advertisements (ads) on seven public Darknet markets. |
Vageesh Saxena; Nils Rethmeier; Gijs van Dijck; Gerasimos Spanakis; |
482 | Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the reference summaries of those datasets turn out to be noisy, mainly in terms of factual hallucination and information redundancy. To address this challenge, we first annotate new expert-writing Element-aware test sets following the “Lasswell Communication Model” proposed by Lasswell, allowing reference summaries to focus on more fine-grained news elements objectively and comprehensively. |
Yiming Wang; Zhuosheng Zhang; Rui Wang; |
483 | Efficient Shapley Values Estimation By Amortization for Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the trade-off between stability and efficiency, we develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations. |
Chenghao Yang; Fan Yin; He He; Kai-Wei Chang; Xiaofei Ma; Bing Xiang; |
484 | PeerDA: Data Augmentation Via Modeling Peer Relation for Span Identification Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Different from previous works that merely leverage the Subordinate (SUB) relation (i. e. if a span is an instance of a certain category) to train models, this paper for the first time explores the Peer (PR) relation, which indicates that two spans are instances of the same category and share similar features. |
Weiwen Xu; Xin Li; Yang Deng; Wai Lam; Lidong Bing; |
485 | Dynamic Regularization in UDA for Transformers in Multimodal Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work focuses on two key challenges in multimodal machine learning. The first is finding efficient ways to combine information from different data types. The second is that often, one modality (e.g., text) is stronger and more relevant, making it difficult to identify meaningful patterns in the weaker modality (e.g., image). |
Ivonne Monter-Aldana; Adrian Pastor Lopez Monroy; Fernando Sanchez-Vega; |
486 | Conflicts, Villains, Resolutions: Towards Models of Narrative Media Framing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite increasing interest in the automatic detection of media frames in NLP, the problem is typically simplified as single-label classification and adopts a topic-like view on frames, evading modelling the broader document-level narrative. In this work, we revisit a widely used conceptualization of framing from the communication sciences which explicitly captures elements of narratives, including conflict and its resolution, and integrate it with the narrative framing of key entities in the story as heroes, victims or villains. |
Lea Frermann; Jiatong Li; Shima Khanehzar; Gosia Mikolajczak; |
487 | BgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present bgGLUE (Bulgarian General Language Understanding Evaluation), a benchmark for evaluating language models on Natural Language Understanding (NLU) tasks in Bulgarian. |
Momchil Hardalov; Pepa Atanasova; Todor Mihaylov; Galia Angelova; Kiril Simov; Petya Osenova; Veselin Stoyanov; Ivan Koychev; Preslav Nakov; Dragomir Radev; |
488 | DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Augmented only by self-generated pseudo text, generation models over-exploit the previously learned text space and fail to explore a larger one, suffering from a restricted generalization boundary and limited controllability. In this work, we propose DuNST, a novel ST framework to tackle these problems. |
Yuxi Feng; Xiaoyuan Yi; Xiting Wang; Laks Lakshmanan; V.S.; Xing Xie; |
489 | What Does The Failure to Reason with “Respectively” in Zero/Few-Shot Settings Tell Us About Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a controlled synthetic dataset WikiResNLI and a naturally occurring dataset NatResNLI to encompass various explicit and implicit realizations of “respectively”. |
Ruixiang Cui; Seolhwa Lee; Daniel Hershcovich; Anders Søgaard; |
490 | BLIND: Bias Removal With No Demographics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce BLIND, a method for bias removal with no prior knowledge of the demographics in the dataset. |
Hadas Orgad; Yonatan Belinkov; |
491 | How Do Humans Perceive Adversarial Text? A Reality Check on The Validity and Naturalness of Word-based Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This entails that adversarial perturbations would not pass any human quality gate and do not represent real threats to human-checked NLP systems. To bypass this limitation and enable proper assessment (and later, improvement) of NLP model robustness, we have surveyed 378 human participants about the perceptibility of text adversarial examples produced by state-of-the-art methods. |
Salijona Dyrmishi; Salah Ghamizi; Maxime Cordy; |
492 | Soft Alignment Objectives for Robust Adaptation of Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces novel training objectives built upon a semantic similarity of the predicted tokens to the reference. |
Michal Štefánik; Marek Kadlčík; Petr Sojka; |
493 | The CRINGE Loss: Learning What Language Not to Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Growing evidence shows that even with very large amounts of positive training data, issues remain that can be alleviated with relatively small amounts of negative data: examples of what the model should not do. In this work, we propose a novel procedure to train with such data called the “CRINGE” loss (ContRastive Iterative Negative GEneration). |
Leonard Adolphs; Tianyu Gao; Jing Xu; Kurt Shuster; Sainbayar Sukhbaatar; Jason Weston; |
494 | Modeling User Satisfaction Dynamics in Dialogue Via Hawkes Process Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In order to fully simulate users, it is crucial to take satisfaction dynamics into account. To fill this gap, we propose a new estimator ASAP (sAtisfaction eStimation via HAwkes Process) that treats user satisfaction across turns as an event sequence and employs a Hawkes process to effectively model the dynamics in this sequence. |
Fanghua Ye; Zhiyuan Hu; Emine Yilmaz; |
495 | Towards Identifying Fine-Grained Depression Symptoms from Memes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we conduct a focused study on depression disorder and introduce a new task of identifying fine-grained depressive symptoms from memes. |
Shweta Yadav; Cornelia Caragea; Chenye Zhao; Naincy Kumari; Marvin Solberg; Tanmay Sharma; |
496 | SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce several new annotated SLU benchmark tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape. |
Suwon Shon; Siddhant Arora; Chyi-Jiunn Lin; Ankita Pasad; Felix Wu; Roshan S Sharma; Wei-Lun Wu; Hung-yi Lee; Karen Livescu; Shinji Watanabe; |
497 | My Side, Your Side and The Evidence: Discovering Aligned Actor Groups and The Narratives They Weave Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider a more feasible proxy task: Identify the distinct sets of aligned story actors responsible for sustaining the issue-specific narratives. |
Pavan Holur; David Chong; Timothy Tangherlini; Vwani Roychowdhury; |
498 | Characterizing and Measuring Linguistic Dataset Drift Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose three dimensions of linguistic dataset drift: vocabulary, structural, and semantic drift. |
Tyler Chang; Kishaloy Halder; Neha Anna John; Yogarshi Vyas; Yassine Benajiba; Miguel Ballesteros; Dan Roth; |
499 | WebCPM: Interactive Web Search for Chinese Long-form Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce WebCPM, the first Chinese LFQA dataset. |
Yujia Qin; Zihan Cai; Dian Jin; Lan Yan; Shihao Liang; Kunlun Zhu; Yankai Lin; Xu Han; Ning Ding; Huadong Wang; Ruobing Xie; Fanchao Qi; Zhiyuan Liu; Maosong Sun; Jie Zhou; |
500 | Synthesize, Prompt and Transfer: Zero-shot Conversational Question Generation with Pre-trained Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a more realistic and less explored setting, Zero-shot Conversational Question Generation (ZeroCQG), which requires no human-labeled conversations for training. |
Hongwei Zeng; Bifan Wei; Jun Liu; Weiping Fu; |
501 | FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In FormNetV2, we introduce a centralized multimodal graph contrastive learning strategy to unify self-supervised pre-training for all modalities in one loss. |
Chen-Yu Lee; Chun-Liang Li; Hao Zhang; Timothy Dozat; Vincent Perot; Guolong Su; Xiang Zhang; Kihyuk Sohn; Nikolay Glushnev; Renshen Wang; Joshua Ainslie; Shangbang Long; Siyang Qin; Yasuhisa Fujii; Nan Hua; Tomas Pfister; |
502 | MixCE: Training Autoregressive Language Models By Mixing Forward and Reverse Cross-Entropies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Hence, we propose learning with MixCE, an objective that mixes the forward and reverse cross-entropies. |
Shiyue Zhang; Shijie Wu; Ozan Irsoy; Steven Lu; Mohit Bansal; Mark Dredze; David Rosenberg; |
503 | Knowledgeable Parameter Efficient Tuning Network for Commonsense Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a simple knowledgeable parameter efficient tuning network to couple PLMs with external knowledge for commonsense question answering. |
Ziwang Zhao; Linmei Hu; Hanyu Zhao; Yingxia Shao; Yequan Wang; |
504 | BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR systems. |
Mingda Chen; Paul-Ambroise Duquenne; Pierre Andrews; Justine Kao; Alexandre Mourachko; Holger Schwenk; Marta R. Costa-jussà; |
505 | NLPositionality: Characterizing Design Biases of Datasets and Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce NLPositionality, a framework for characterizing design biases and quantifying the positionality of NLP datasets and models. |
Sebastin Santy; Jenny Liang; Ronan Le Bras; Katharina Reinecke; Maarten Sap; |
506 | Backpack Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Backpacks: a new neural architecture that marries strong modeling performance with an interface for interpretability and control. |
John Hewitt; John Thickstun; Christopher Manning; Percy Liang; |
507 | WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present WinoQueer: a benchmark specifically designed to measure whether large language models (LLMs) encode biases that are harmful to the LGBTQ+ community. |
Virginia Felkner; Ho-Chun Herbert Chang; Eugene Jang; Jonathan May; |
508 | Grounded Multimodal Named Entity Recognition on Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing MNER studies only extract entity-type pairs in text, which is useless for multimodal knowledge graph construction and insufficient for entity disambiguation. To solve these issues, in this work, we introduce a Grounded Multimodal Named Entity Recognition (GMNER) task. |
Jianfei Yu; Ziyan Li; Jieming Wang; Rui Xia; |
509 | Preserving Commonsense Knowledge from Pre-trained Language Models Via Causal Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the causal view, we propose a unified objective for fine-tuning to retrieve the causality back. |
Junhao Zheng; Qianli Ma; Shengjie Qiu; Yue Wu; Peitian Ma; Junlong Liu; Huawen Feng; Xichen Shang; Haibin Chen; |
510 | Translation-Enhanced Multilingual Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide two key contributions. 1) Relying on a multilingual multi-modal encoder, we provide a systematic empirical study of standard methods used in cross-lingual NLP when applied to mTTI: Translate Train, Translate Test, and Zero-Shot Transfer. 2) We propose Ensemble Adapter (EnsAd), a novel parameter-efficient approach that learns to weigh and consolidate the multilingual text knowledge within the mTTI framework, mitigating the language gap and thus improving mTTI performance. |
Yaoyiran Li; Ching-Yun Chang; Stephen Rawls; Ivan Vulic; Anna Korhonen; |
511 | Benchmarking Large Language Model Capabilities for Conditional Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we discuss how to adapt existing application-specific generation benchmarks to PLMs and provide an in-depth, empirical study of the limitations and capabilities of PLMs in natural language generation tasks along dimensions such as scale, architecture, input and output language. |
Joshua Maynez; Priyanka Agrawal; Sebastian Gehrmann; |
512 | LilGym: Natural Language Visual Reasoning with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present lilGym, a new benchmark for language-conditioned reinforcement learning in visual environments. |
Anne Wu; Kiante Brantley; Noriyuki Kojima; Yoav Artzi; |
513 | Unsupervised Melody-to-Lyrics Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. |
Yufei Tian; Anjali Narayan-Chen; Shereen Oraby; Alessandra Cervone; Gunnar Sigurdsson; Chenyang Tao; Wenbo Zhao; Tagyoung Chung; Jing Huang; Nanyun Peng; |
514 | Causality-aware Concept Extraction Based on Knowledge-guided Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, through the lens of a Structural Causal Model (SCM), we propose equipping the PLM-based extractor with a knowledge-guided prompt as an intervention to alleviate concept bias. |
Siyu Yuan; Deqing Yang; Jinxi Liu; Shuyu Tian; Jiaqing Liang; Yanghua Xiao; Rui Xie; |
515 | Span-level Aspect-based Sentiment Analysis Via Table Filling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel span-level model for Aspect-Based Sentiment Analysis (ABSA), which aims at identifying the sentiment polarity of the given aspect. |
Mao Zhang; Yongxin Zhu; Zhen Liu; Zhimin Bao; Yunfei Wu; Xing Sun; Linli Xu; |
516 | Limitations of Language Models in Arithmetic and Symbolic Induction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the end, we introduce LMs with tutor, which demonstrates every single step of teaching. |
Jing Qian; Hong Wang; Zekun Li; Shiyang Li; Xifeng Yan; |
517 | EEL: Efficiently Encoding Lattices for Reranking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore an approach for reranking hypotheses by using Transformers to efficiently encode lattices of generated outputs, a method we call EEL. |
Prasann Singhal; Jiacheng Xu; Xi Ye; Greg Durrett; |
518 | CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose CLAPSpeech, a cross-modal contrastive pre-training framework that learns from the prosody variance of the same text token under different contexts. |
Zhenhui Ye; Rongjie Huang; Yi Ren; Ziyue Jiang; Jinglin Liu; Jinzheng He; Xiang Yin; Zhou Zhao; |
519 | Revisiting Cross-Lingual Summarization: A Corpus-based Study and A New Benchmark with Improved Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing cross-lingual summarization (CLS) work constructs CLS corpora by simply and directly translating pre-annotated summaries from one language to another, which can contain errors from both summarization and translation processes. To address this issue, we propose ConvSumX, a cross-lingual conversation summarization benchmark, through a new annotation schema that explicitly considers source input context. |
Yulong Chen; Huajian Zhang; Yijie Zhou; Xuefeng Bai; Yueguan Wang; Ming Zhong; Jianhao Yan; Yafu Li; Judy Li; Xianchao Zhu; Yue Zhang; |
520 | Learning Dynamic Contextualised Word Embeddings Via Template-based Temporal Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a method for learning DCWEs by time-adapting a pretrained Masked Language Model (MLM) using time-sensitive templates. |
Xiaohang Tang; Yi Zhou; Danushka Bollegala; |
521 | How Poor Is The Stimulus? Evaluating Hierarchical Generalization in Neural Networks Trained on Child-directed Speech Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Is this preference due to a learning bias for hierarchical structure, or due to more general biases that interact with hierarchical cues in children's linguistic input? We explore these possibilities by training LSTMs and Transformers – two types of neural networks without a hierarchical bias – on data similar in quantity and content to children's linguistic input: text from the CHILDES corpus. |
Aditya Yedetore; Tal Linzen; Robert Frank; R. Thomas McCoy; |
522 | GanLM: Encoder-Decoder Pre-training with An Auxiliary Discriminator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model. |
Jian Yang; Shuming Ma; Li Dong; Shaohan Huang; Haoyang Huang; Yuwei Yin; Dongdong Zhang; Liqun Yang; Furu Wei; Zhoujun Li; |
523 | Linear Guardedness and Its Implications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the impact of this removal on the behavior of downstream classifiers trained on the modified representations is not fully understood. In this work, we formally define the notion of linear guardedness as the inability of an adversary to predict the concept directly from the representation, and study its implications. |
Shauli Ravfogel; Yoav Goldberg; Ryan Cotterell; |
524 | Searching for Needles in A Haystack: On The Role of Incidental Bilingualism in PaLM's Translation Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a mixed-method approach to measure and understand incidental bilingualism at scale. |
Eleftheria Briakou; Colin Cherry; George Foster; |
525 | Open Set Relation Extraction Via Unknown-Aware Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an unknown-aware training method, regularizing the model by dynamically synthesizing negative instances that can provide the missing supervision signals. |
Jun Zhao; Xin Zhao; WenYu Zhan; Qi Zhang; Tao Gui; Zhongyu Wei; Yun Wen Chen; Xiang Gao; Xuanjing Huang; |
526 | Learning to Imagine: Visually-Augmented Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we aim to utilize visual information for composition in the same manner as humans. |
Tianyi Tang; Yushuo Chen; Yifan Du; Junyi Li; Wayne Xin Zhao; Ji-Rong Wen; |
527 | Generating Hashtags for Short-form Videos with Guided Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Neither of these properties can be easily modeled with classification approaches. To bridge this gap, we formulate SVHR as a generation task that better represents how hashtags are created naturally. |
Tiezheng Yu; Hanchao Yu; Davis Liang; Yuning Mao; Shaoliang Nie; Po-Yao Huang; Madian Khabsa; Pascale Fung; Yi-Chia Wang; |
528 | NEUROSTRUCTURAL DECODING: Neural Text Generation with Structural Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While most approaches for conditional text generation have primarily focused on lexical constraints, they often struggle to effectively incorporate syntactic constraints, which provide a richer language for approximating semantic constraints. We address this gap by introducing NeuroStructural Decoding, a new decoding algorithm that incorporates syntactic constraints to further improve the quality of the generated text. |
Mohaddeseh Bastan; Mihai Surdeanu; Niranjan Balasubramanian; |
529 | The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an active learning approach that exploits the strengths of both human and machine translations by iteratively adding small batches of human translations into the machine-translated training set. |
Zhuang Li; Lizhen Qu; Philip Cohen; Raj Tumuluri; Gholamreza Haffari; |
530 | Ideology Prediction from Scarce and Biased Supervision: Learn to Disregard The “What” and Focus on The “How”! Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel supervised learning approach for political ideology prediction (PIP) that is capable of predicting out-of-distribution inputs. |
Chen Chen; Dylan Walker; Venkatesh Saligrama; |
531 | Unsupervised Extractive Summarization of Emotion Triggers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We instead pursue unsupervised systems that extract triggers from text. First, we introduce CovidET-EXT, augmenting (Zhan et al., 2022)'s abstractive dataset (in the context of the COVID-19 crisis) with extractive triggers. Second, we develop new unsupervised learning models that can jointly detect emotions and summarize their triggers. |
Tiberiu Sosea; Hongli Zhan; Junyi Jessy Li; Cornelia Caragea; |
532 | Document-Level Event Argument Extraction With A Chain Reasoning Paradigm Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce T-norm fuzzy logic for optimization, which permits end-to-end learning and shows promise for integrating the expressiveness of logical reasoning with the generalization of neural networks. |
Jian Liu; Chen Liang; Jinan Xu; Haoyan Liu; Zhe Zhao; |
533 | Pre-training Multi-party Dialogue Models with Latent Discourse Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, due to the lack of explicitly annotated discourse labels in multi-party dialogue corpora, previous works fail to scale up the pre-training process, leaving the unlabeled multi-party conversational data unused. To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model by unsupervised latent variable inference methods. |
Yiyang Li; Xinting Huang; Wei Bi; Hai Zhao; |
534 | Interpreting Positional Information in Perspective of Word Order Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although several studies have attempted to improve positional encoding and investigate the influence of word order perturbation, it remains unclear how positional encoding impacts NLP models from the perspective of word order. In this paper, we aim to shed light on this problem by analyzing the working mechanism of the attention module and investigating the root cause of its inability to encode positional information. |
Zhang Xilong; Liu Ruochen; Liu Jin; Liang Xuefeng; |
535 | I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce I2D2, a novel commonsense distillation framework that loosely follows the Symbolic Knowledge Distillation of West et al. but breaks the dependence on the extreme-scale teacher model with two innovations: (1) the novel adaptation of NeuroLogic Decoding to enhance the generation quality of the weak, off-the-shelf language models, and (2) self-imitation learning to iteratively learn from the model's own enhanced commonsense acquisition capabilities. |
Chandra Bhagavatula; Jena D. Hwang; Doug Downey; Ronan Le Bras; Ximing Lu; Lianhui Qin; Keisuke Sakaguchi; Swabha Swayamdipta; Peter West; Yejin Choi; |
536 | More Than Classification: A Unified Framework for Event Temporal Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a unified event temporal relation extraction framework, which transforms temporal relations into logical expressions of time points and completes the ETRE by predicting the relations between certain time point pairs. |
Quzhe Huang; Yutong Hu; Shengqi Zhu; Yansong Feng; Chang Liu; Dongyan Zhao; |
537 | Multi-Source Test-Time Adaptation As Dueling Bandits for Extractive Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study multi-source test-time model adaptation from user feedback, where K distinct models are established for adaptation. |
Hai Ye; Qizhe Xie; Hwee Tou Ng; |
538 | Decoupling Pseudo Label Disambiguation and Representation Learning for Generalized Intent Discovery Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a decoupled prototype learning framework (DPL) to decouple pseudo label disambiguation and representation learning. |
Yutao Mou; Xiaoshuai Song; Keqing He; Chen Zeng; Pei Wang; Jingang Wang; Yunsen Xian; Weiran Xu; |
539 | DecompEval: Evaluating Generated Texts As Unsupervised Decomposed Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, existing metrics only provide an evaluation score for each dimension without revealing the evidence to interpret how this score is obtained. To deal with these challenges, we propose a simple yet effective metric called DecompEval. |
Pei Ke; Fei Huang; Fei Mi; Yasheng Wang; Qun Liu; Xiaoyan Zhu; Minlie Huang; |
540 | Backdooring Neural Code Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This may impact the downstream software (e.g., stock trading systems and autonomous driving) and cause financial loss and/or life-threatening incidents. In this paper, we demonstrate such attacks are feasible and can be quite stealthy. |
Weisong Sun; Yuchen Chen; Guanhong Tao; Chunrong Fang; Xiangyu Zhang; Quanjun Zhang; Bin Luo; |
541 | Concise Answers to Complex Questions: Summarization of Long-form Answers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Together, we present the first study on summarizing long-form answers, taking a step forward for QA agents that can provide answers at multiple granularities. |
Abhilash Potluri; Fangyuan Xu; Eunsol Choi; |
542 | Towards Better Entity Linking with Multi-View Enhanced Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Aiming at learning entity representations that can match divergent mentions, this paper proposes a Multi-View Enhanced Distillation (MVD) framework, which can effectively transfer knowledge of multiple fine-grained and mention-relevant parts within entities from cross-encoders to dual-encoders. |
Yi Liu; Yuan Tian; Jianxun Lian; Xinlong Wang; Yanan Cao; Fang Fang; Wen Zhang; Haizhen Huang; Weiwei Deng; Qi Zhang; |
543 | A Measure-Theoretic Characterization of Tight Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to characterize the notion of leakage more precisely, this paper offers a measure-theoretic treatment of language modeling. |
Li Du; Lucas Torroba Hennigen; Tiago Pimentel; Clara Meister; Jason Eisner; Ryan Cotterell; |
544 | PAED: Zero-Shot Persona Attribute Extraction in Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although there is a public dataset for triplet-based persona attribute extraction from conversations, its automatically generated labels present many issues, including unspecific relations and inconsistent annotations. We fix such issues by leveraging more reliable text-label matching criteria to generate high-quality data for persona attribute extraction. |
Luyao Zhu; Wei Li; Rui Mao; Vlad Pandelea; Erik Cambria; |
545 | PromptRank: Unsupervised Keyphrase Extraction Using Prompt Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, in this paper, we propose a simple yet effective unsupervised approach, PromptRank, based on the PLM with an encoder-decoder architecture. |
Aobo Kong; Shiwan Zhao; Hao Chen; Qicheng Li; Yong Qin; Ruiqi Sun; Xiaoyan Bai; |
546 | When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to understand LMs' strengths and limitations in memorizing factual knowledge, by conducting large-scale knowledge probing experiments on two open-domain entity-centric QA datasets: PopQA, our new dataset with 14k questions about long-tail entities, and EntityQuestions, a widely used open-domain QA dataset. |
Alex Mallen; Akari Asai; Victor Zhong; Rajarshi Das; Daniel Khashabi; Hannaneh Hajishirzi; |
547 | InfoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce infoVerse, a universal framework for dataset characterization, which provides a new feature space that effectively captures multidimensional characteristics of datasets by incorporating various model-driven meta-information. |
Jaehyung Kim; Yekyung Kim; Karin de Langis; Jinwoo Shin; Dongyeop Kang; |
548 | SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This is especially problematic as language technologies gain hold across the globe. To address this gap, we present SeeGULL, a broad-coverage stereotype dataset, built by utilizing generative capabilities of large language models such as PaLM, and GPT-3, and leveraging a globally diverse rater pool to validate the prevalence of those stereotypes in society. |
Akshita Jha; Aida Mostafazadeh Davani; Chandan K Reddy; Shachi Dave; Vinodkumar Prabhakaran; Sunipa Dev; |
549 | Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze how automated summarization evaluation metrics correlate with lexical features of generated summaries, to other automated metrics including several we propose in this work, and to aspects of human-assessed summary quality. |
Lucy Lu Wang; Yulia Otmakhova; Jay DeYoung; Thinh Hung Truong; Bailey Kuehl; Erin Bransom; Byron Wallace; |
550 | Say What You Mean! Large Language Models Speak Too Positively About Negative Commonsense Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work examines the ability of LLMs on negative commonsense knowledge. |
Jiangjie Chen; Wei Shi; Ziquan Fu; Sijie Cheng; Lei Li; Yanghua Xiao; |
551 | An Inner Table Retriever for Robust Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Inner Table Retriever (ITR), a general-purpose approach for handling long tables in TableQA that extracts sub-tables to preserve the most relevant information for a question. |
Weizhe Lin; Rexhina Blloshmi; Bill Byrne; Adria de Gispert; Gonzalo Iglesias; |
552 | SIMSUM: Document-level Text Simplification Via Simultaneous Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new two-stage framework SIMSUM for automated document-level text simplification. |
Sofia Blinova; Xinyu Zhou; Martin Jaggi; Carsten Eickhoff; Seyed Ali Bahrainian; |
553 | SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation Via Over-sampling and Post-evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, a simple but effective two-stage SimOAP strategy is proposed, i.e., over-sampling and post-evaluation. |
Junkai Zhou; Liang Pang; Huawei Shen; Xueqi Cheng; |
554 | NatLogAttack: A Framework for Attacking Natural Language Inference Models with Natural Logic Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the fundamental problem of developing attack models based on logic formalism. |
Zi'ou Zheng; Xiaodan Zhu;
555 | Cognitive Reframing of Negative Thoughts Through Human-Language Model Interaction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conduct a human-centered study of how language models may assist people in reframing negative thoughts. |
Ashish Sharma; Kevin Rushton; Inna Lin; David Wadden; Khendra Lucas; Adam Miner; Theresa Nguyen; Tim Althoff; |
556 | Dating Greek Papyri with Text Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By creating a dataset of 389 transcriptions of documentary Greek papyri, we train 389 regression models and we predict a date for the papyri with an average MAE of 54 years and an MSE of 1. |
John Pavlopoulos; Maria Konstantinidou; Isabelle Marthot-Santaniello; Holger Essler; Asimina Paparigopoulou; |
557 | Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, what to retrieve depends on what has already been derived, which in turn may depend on what was previously retrieved. To address this, we propose IRCoT, a new approach for multi-step QA that interleaves retrieval with steps (sentences) in a CoT, guiding the retrieval with CoT and in turn using retrieved results to improve CoT. |
Harsh Trivedi; Niranjan Balasubramanian; Tushar Khot; Ashish Sabharwal; |
558 | Direct Fact Retrieval from Knowledge Graphs Without Entity Linking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this approach requires additional labels for training each of the three subcomponents in addition to pairs of input texts and facts, and also may accumulate errors propagated from failures in previous steps. To tackle these limitations, we propose a simple knowledge retrieval framework, which directly retrieves facts from the KGs given the input text based on their representational similarities, which we refer to as Direct Fact Retrieval (DiFaR). |
Jinheon Baek; Alham Fikri Aji; Jens Lehmann; Sung Ju Hwang; |
559 | DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new paradigm in which QA models are trained to disentangle the two sources of knowledge. |
Ella Neeman; Roee Aharoni; Or Honovich; Leshem Choshen; Idan Szpektor; Omri Abend; |
560 | A New Direction in Stance Detection: Target-Stance Extraction in The Wild Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given a text from social media platforms, the target information is often unknown due to implicit mentions in the source text and it is infeasible to have manual target annotations at a large scale. Therefore, in this paper, we propose a new task Target-Stance Extraction (TSE) that aims to extract the (target, stance) pair from the text. |
Yingjie Li; Krishna Garg; Cornelia Caragea; |
561 | Improved Instruction Ordering in Recipe-Grounded Conversation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study the task of instructional dialogue and focus on the cooking domain. |
Duong Le; Ruohao Guo; Wei Xu; Alan Ritter; |
562 | Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While there is much recent interest in studying why Transformer-based large language models make predictions the way they do, the complex computations performed within each layer have made their behavior somewhat opaque. To mitigate this opacity, this work presents a linear decomposition of final hidden states from autoregressive language models based on each initial input token, which is exact for virtually all contemporary Transformer architectures. |
Byung-Doh Oh; William Schuler; |
563 | Document-Level Multi-Event Extraction with Event Proxy Nodes and Hausdorff Distance Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an alternative approach for document-level multi-event extraction with event proxy nodes and Hausdorff distance minimization. |
Xinyu Wang; Lin Gui; Yulan He; |
564 | Dialog-Post: Multi-Level Self-Supervised Objectives and Hierarchical Model for Dialogue Post-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To model dialogues more comprehensively, we propose a DialPost method, Dialog-Post, with multi-level self-supervised objectives and a hierarchical model. |
Zhenyu Zhang; Lei Shen; Yuming Zhao; Meng Chen; Xiaodong He; |
565 | Language Detoxification with Attribute-Discriminative Latent Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, previous methods require excessive memory, computations, and time which are serious bottlenecks in their real-world application. To address such limitations, we propose an effective yet efficient method for language detoxification using an attribute-discriminative latent space. |
Jin Myung Kwak; Minseon Kim; Sung Ju Hwang; |
566 | Just Like A Human Would, Direct Access to Sarcasm Augmented with Potential Result and Reaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop a novel sarcasm detection method, namely Sarcasm Detector with Augmentation of Potential Result and Reaction (SD-APRR). |
Changrong Min; Ximing Li; Liang Yang; Zhilin Wang; Bo Xu; Hongfei Lin; |
567 | Adaptive and Personalized Exercise Generation for Online Language Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, in this paper, we study a novel task of adaptive and personalized exercise generation for online language learning. |
Peng Cui; Mrinmaya Sachan; |
568 | NLP Reproducibility For All: Understanding Experiences of Beginners Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As natural language processing (NLP) has recently seen an unprecedented level of excitement, and more people are eager to enter the field, it is unclear whether current research reproducibility efforts are sufficient for this group of beginners to apply the latest developments. To understand their needs, we conducted a study with 93 students in an introductory NLP course, where students reproduced the results of recent NLP papers. |
Shane Storks; Keunwoo Yu; Ziqiao Ma; Joyce Chai; |
569 | Why Did The Chicken Cross The Road? Rephrasing and Analyzing Ambiguous Questions in VQA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Focusing on questions about images, we create a dataset of ambiguous examples. |
Elias Stengel-Eskin; Jimena Guallar-Blasco; Yi Zhou; Benjamin Van Durme; |
570 | UMRSpell: Unifying The Detection and Correction Parts of Pre-trained Models Towards Chinese Missing, Redundant, and Spelling Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, we propose a novel model UMRSpell to learn detection and correction parts together at the same time from a multi-task learning perspective by using a detection transmission self-attention matrix, and flexibly deal with missing, redundant, and spelling errors through re-tagging rules. |
Zheyu He; Yujin Zhu; Linlin Wang; Liang Xu; |
571 | LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce Layer-Adjustable Interactions in Transformers (LAIT). |
Jeremiah Milbauer; Annie Louis; Mohammad Javad Hosseini; Alex Fabrikant; Donald Metzler; Tal Schuster; |
572 | Local Interpretation of Transformer Based on Linear Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes to interpret neural networks by linear decomposition and finds that the ReLU-activated Transformer can be considered as a linear model on a single input. |
Sen Yang; Shujian Huang; Wei Zou; Jianbing Zhang; Xinyu Dai; Jiajun Chen; |
573 | DataFinder: Scientific Dataset Recommendation from Natural Language Descriptions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We operationalize the task of recommending datasets given a short natural language description of a research idea, to help people find relevant datasets for their needs. |
Vijay Viswanathan; Luyu Gao; Tongshuang Wu; Pengfei Liu; Graham Neubig; |
574 | Multilingual Event Extraction from Historical Newspaper Adverts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new multilingual dataset in English, French, and Dutch composed of newspaper ads from the early modern colonial period reporting on enslaved people who liberated themselves from enslavement. |
Nadav Borenstein; Natália da Silva Perez; Isabelle Augenstein;
575 | BIC: Twitter Bot Detection with Text-Graph Interaction and Semantic Consistency Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This results in greater inconsistency across the timeline of novel Twitter bots, which warrants more attention. In light of these challenges, we propose BIC, a Twitter Bot detection framework with text-graph Interaction and semantic Consistency. |
Zhenyu Lei; Herun Wan; Wenqian Zhang; Shangbin Feng; Zilong Chen; Jundong Li; Qinghua Zheng; Minnan Luo; |
576 | Do I Have The Knowledge to Answer? Investigating Answerability of Knowledge Base Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We create GrailQAbility, a new benchmark KBQA dataset with unanswerability, by first identifying various forms of KB incompleteness that make questions unanswerable, and then systematically adapting GrailQA (a popular KBQA dataset with only answerable questions). |
Mayur Patidar; Prayushi Faldu; Avinash Singh; Lovekesh Vig; Indrajit Bhattacharya; Mausam;
577 | Understanding Client Reactions in Online Mental Health Counseling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, previous NLP research on counseling has mainly focused on studying counselors' intervention strategies rather than their clients' reactions to the intervention. This work aims to fill this gap by developing a theoretically grounded annotation framework that encompasses counselors' strategies and client reaction behaviors. |
Anqi Li; Lizhi Ma; Yaling Mei; Hongliang He; Shuai Zhang; Huachuan Qiu; Zhenzhong Lan; |
578 | Nonlinear Structural Equation Model Guided Gaussian Mixture Hierarchical Topic Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the sparsity of text data often complicates the analysis. To address these issues, we propose NSEM-GMHTM as a deep topic model, with a Gaussian mixture prior distribution to improve the model's ability to adapt to sparse data, which explicitly models hierarchical and symmetric relations between topics through the dependency matrices and nonlinear structural equations. |
HeGang Chen; Pengbo Mao; Yuyin Lu; Yanghui Rao; |
579 | Revisiting Token Dropping Strategy in Efficient BERT Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we empirically find that token dropping is prone to a semantic loss problem and falls short in handling semantic-intense tasks. Motivated by this, we propose a simple yet effective semantic-consistent learning method (ScTD) to improve the token dropping. |
Qihuang Zhong; Liang Ding; Juhua Liu; Xuebo Liu; Min Zhang; Bo Du; Dacheng Tao; |
580 | The Benefits of Bad Advice: Autocontrastive Decoding Across Model Layers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we argue that due to the gradual improvement across model layers, additional information can be gleaned from the contrast between higher and lower layers during inference. |
Ariel Gera; Roni Friedman; Ofir Arviv; Chulaka Gunasekara; Benjamin Sznajder; Noam Slonim; Eyal Shnarch; |
581 | FACTIFY-5WQA: 5W Aspect-based Fact Verification Through Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a 5W framework (who, what, when, where, and why) for question-answer-based fact explainability. |
Anku Rani; S.M Towhidul Islam Tonmoy; Dwip Dalal; Shreya Gautam; Megha Chakraborty; Aman Chadha; Amit Sheth; Amitava Das; |
582 | Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present, Naamapadam, the largest publicly available Named Entity Recognition (NER) dataset for the 11 major Indian languages from two language families. |
Arnav Mhaske; Harshit Kedia; Sumanth Doddapaneni; Mitesh M. Khapra; Pratyush Kumar; Rudra Murthy; Anoop Kunchukuttan; |
583 | CREPE: Open-Domain Question Answering with False Presuppositions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce CREPE, a QA dataset containing a natural distribution of presupposition failures from online information-seeking forums. |
Xinyan Yu; Sewon Min; Luke Zettlemoyer; Hannaneh Hajishirzi; |
584 | Joint Document-Level Event Extraction Via Token-Token Bidirectional Event Completed Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We solve the challenging document-level event extraction problem by proposing a joint extraction methodology that can avoid inefficiency and error propagation issues in classic pipeline methods. |
Qizhi Wan; Changxuan Wan; Keli Xiao; Dexi Liu; Chenliang Li; Bolong Zheng; Xiping Liu; Rong Hu; |
585 | Robust Representation Learning with Reliable Pseudo-labels Generation Via Self-Adaptive Optimal Transport for Short Text Clustering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing approaches cannot solve this problem well, since (1) they are prone to obtain degenerate solutions especially on heavy imbalanced datasets, and (2) they are vulnerable to noises. To tackle the above issues, we propose a Robust Short Text Clustering (RSTC) model to improve robustness against imbalanced and noisy data. |
Xiaolin Zheng; Mengling Hu; Weiming Liu; Chaochao Chen; Xinting Liao; |
586 | Multilingual Knowledge Graph Completion with Language-Sensitive Multi-Graph Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing approaches suffer from two main drawbacks: (a) alignment dependency: the multilingual KGC is always realized with joint entity or relation alignment, which introduces additional alignment models and increases the complexity of the whole framework; (b) training inefficiency: the trained model will only be used for the completion of one target KG, although the data from all KGs are used simultaneously. To address these drawbacks, we propose a novel multilingual KGC framework with language-sensitive multi-graph attention such that the missing links on all given KGs can be inferred by a universal knowledge completion model. |
Rongchuan Tang; Yang Zhao; Chengqing Zong; Yu Zhou; |
587 | What Are The Desired Characteristics of Calibration Sets? Identifying Correlates on Long Form Scientific Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we uncover the underlying characteristics of effective sets. |
Griffin Adams; Bichlien Nguyen; Jake Smith; Yingce Xia; Shufang Xie; Anna Ostropolets; Budhaditya Deb; Yuan-Jyue Chen; Tristan Naumann; Noémie Elhadad;
588 | Annotating Mentions Alone Enables Efficient Domain Adaptation for Coreference Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that adapting mention detection is the key component to successful domain adaptation of coreference models, rather than antecedent linking. |
Nupoor Gandhi; Anjalie Field; Emma Strubell; |
589 | A Universal Discriminator for Zero-Shot Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization. In this work, we challenge this convention by showing that discriminative approaches perform substantially better than generative ones on a large number of NLP tasks. |
Haike Xu; Zongyu Lin; Jing Zhou; Yanan Zheng; Zhilin Yang; |
590 | Syntax and Geometry of Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an information-theoretical model of syntactic generalization. |
Raphaël Bailly; Laurent Leblond; Kata Gábor;
591 | GreenKGC: A Lightweight Knowledge Graph Completion Method Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, a higher-dimensional embedding space is usually required for a better reasoning capability, which leads to larger model size and hinders applicability to real-world problems (e.g., large-scale KGs or mobile/edge computing). A lightweight modularized KGC solution, called GreenKGC, is proposed in this work to address this issue. |
Yun Cheng Wang; Xiou Ge; Bin Wang; C.-C. Jay Kuo; |
592 | Unsupervised Open-domain Keyphrase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the problem of unsupervised open-domain keyphrase generation, where the objective is a keyphrase generation model that can be built without using human-labeled data and can perform consistently across domains. |
Lam Do; Pritom Saha Akash; Kevin Chen-Chuan Chang; |
593 | A Cognitive Stimulation Dialogue System with Multi-source Knowledge Fusion for Elders with Cognitive Impairment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a multi-source knowledge fusion method for CS dialogue (CSD), to generate open-ended responses guided by the therapy principle and emotional support strategy. |
Jiyue Jiang; Sheng Wang; Qintong Li; Lingpeng Kong; Chuan Wu; |
594 | Plug-and-Play Knowledge Injection for Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we are the first to study how to improve the flexibility and efficiency of knowledge injection by reusing existing downstream models. |
Zhengyan Zhang; Zhiyuan Zeng; Yankai Lin; Huadong Wang; Deming Ye; Chaojun Xiao; Xu Han; Zhiyuan Liu; Peng Li; Maosong Sun; Jie Zhou; |
595 | Two Birds One Stone: Dynamic Ensemble for OOD Intent Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates "overthinking" in the open-world scenario and its impact on OOD intent classification. Inspired by this, we propose a two-birds-one-stone method, which allows the model to decide whether to make a decision on OOD classification early during inference and can ensure accuracy and accelerate inference. |
Yunhua Zhou; Jianqiang Yang; Pengyu Wang; Xipeng Qiu; |
596 | SWiPE: A Dataset for Document-Level Simplification of Wikipedia Pages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To scale our efforts, we propose several models to automatically label edits, achieving an F-1 score of up to 70. |
Philippe Laban; Jesse Vig; Wojciech Kryscinski; Shafiq Joty; Caiming Xiong; Chien-Sheng Wu; |
597 | Are Message Passing Neural Networks Really Helpful for Knowledge Graph Completion? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we find that surprisingly, simple MLP models are able to achieve comparable performance to MPNNs, suggesting that MP may not be as crucial as previously believed. |
Juanhui Li; Harry Shomer; Jiayuan Ding; Yiqi Wang; Yao Ma; Neil Shah; Jiliang Tang; Dawei Yin; |
598 | A Dynamic Programming Algorithm for Span-based Nested Named-entity Recognition in O(n2) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that by adding a supplementary structural constraint on the search space, nested NER has a quadratic-time complexity, which is the same asymptotic complexity as the non-nested case. |
Caio Corro; |
599 | Target-Side Augmentation for Document-Level Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Document-level machine translation faces the challenge of data sparsity due to its long input length and a small amount of training data, increasing the risk of learning spurious patterns. To address this challenge, we propose a target-side augmentation method, introducing a data augmentation (DA) model to generate many potential translations for each source document. |
Guangsheng Bao; Zhiyang Teng; Yue Zhang; |
600 | Rethinking Masked Language Modeling for Chinese Spelling Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study Chinese Spelling Correction (CSC) as a joint decision made by two separate models: a language model and an error model. |
Hongqiu Wu; Shaohua Zhang; Yuchen Zhang; Hai Zhao; |
601 | A Multi-Modal Context Reasoning Approach for Conditional Inference on Joint Textual and Visual Clues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous methods utilizing pretrained vision-language models (VLMs) have achieved impressive performances, yet they show a lack of multimodal context reasoning capability, especially for text-modal information. To address this issue, we propose a Multi-modal Context Reasoning approach, named ModCR. |
Yunxin Li; Baotian Hu; Chen Xinyu; Yuxin Ding; Lin Ma; Min Zhang; |
602 | Simple and Effective Unsupervised Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The amount of labeled data to train models for speech tasks is limited for most languages, however, the data scarcity is exacerbated for speech translation which requires labeled data covering two different languages. To address this issue, we study a simple and effective approach to build speech translation systems without labeled data by leveraging recent advances in unsupervised speech recognition, machine translation and speech synthesis, either in a pipeline approach, or to generate pseudo-labels for training end-to-end speech translation models. |
Changhan Wang; Hirofumi Inaguma; Peng-Jen Chen; Ilia Kulikov; Yun Tang; Wei-Ning Hsu; Michael Auli; Juan Pino; |
603 | Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present SG-CQG, a two-stage CQG framework. |
Xuan Long Do; Bowei Zou; Shafiq Joty; Tran Tai; Liangming Pan; Nancy Chen; Ai Ti Aw; |
604 | CHEER: Centrality-aware High-order Event Reasoning Network for Document-level Event Causality Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we manually annotate central events for a systematical investigation and propose a novel DECI model, CHEER, which performs high-order reasoning while considering event centrality. |
Meiqi Chen; Yixin Cao; Yan Zhang; Zhiwei Liu; |
605 | F-Divergence Minimization for Sequence-Level Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an FDISTILL framework, which formulates sequence-level knowledge distillation as minimizing a generalized f-divergence function. |
Yuqiao Wen; Zichao Li; Wenyu Du; Lili Mou; |
606 | Supervised Adversarial Contrastive Learning for Emotion Recognition in Conversations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Extracting generalized and robust representations is a major challenge in emotion recognition in conversations (ERC). To address this, we propose a supervised adversarial contrastive learning (SACL) framework for learning class-spread structured representations in a supervised manner. |
Dou Hu; Yinan Bao; Lingwei Wei; Wei Zhou; Songlin Hu; |
607 | A Novel Table-to-Graph Generation Approach for Document-Level Joint Entity and Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods usually assume that entities and their mentions are identified beforehand, which falls short of real-world applications. To overcome this limitation, we propose TaG, a novel table-to-graph generation model for joint extraction of entities and relations at the document level. |
Ruoyu Zhang; Yanzeng Li; Lei Zou; |
608 | A Synthetic Data Generation Framework for Grounded Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a synthetic data generation framework (SynDG) for grounded dialogues. |
Jianzhu Bao; Rui Wang; Yasheng Wang; Aixin Sun; Yitong Li; Fei Mi; Ruifeng Xu; |
609 | MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present MasakhaPOS, the largest part-of-speech (POS) dataset for 20 typologically diverse African languages. |
Cheikh M. Bamba Dione; David Ifeoluwa Adelani; Peter Nabende; Jesujoba Alabi; Thapelo Sindane; Happy Buzaaba; Shamsuddeen Hassan Muhammad; Chris Chinenye Emezue; Perez Ogayo; Anuoluwapo Aremu; Catherine Gitau; Derguene Mbaye; Jonathan Mukiibi; Blessing Sibanda; Bonaventure F. P. Dossou; Andiswa Bukula; Rooweither Mabuya; Allahsera Auguste Tapo; Edwin Munkoh-Buabeng; Victoire Memdjokam Koagne; Fatoumata Ouoba Kabore; Amelia Taylor; Godson Kalipe; Tebogo Macucwa; Vukosi Marivate; Tajuddeen Gwadabe; Mboning Tchiaze Elvis; Ikechukwu Onyenwe; Gratien Atindogbe; Tolulope Adelani; Idris Akinade; Olanrewaju Samuel; Marien Nahimana; Théogène Musabeyezu; Emile Niyomutabazi; Ester Chimhenga; Kudzai Gotosa; Patrick Mizha; Apelete Agbolo; Seydou Traore; Chinedu Uchechukwu; Aliyu Yusuf; Muhammad Abdullahi; Dietrich Klakow; |
610 | Semantic Structure Enhanced Event Causality Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The former includes important semantic elements related to the events to describe them more precisely, while the latter contains semantic paths between two events to provide possible supports for ECI. In this paper, we study the implicit associations between events by modeling the above explicit semantic structures, and propose a Semantic Structure Integration model (SemSIn). |
Zhilei Hu; Zixuan Li; Xiaolong Jin; Long Bai; Saiping Guan; Jiafeng Guo; Xueqi Cheng; |
611 | Weakly-Supervised Spoken Video Grounding Via Semantic Interaction Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To effectively represent the cross-modal semantics, we propose Semantic Interaction Learning (SIL), a novel framework consisting of the acoustic-semantic pre-training (ASP) and acoustic-visual contrastive learning (AVCL). |
Ye Wang; Wang Lin; Shengyu Zhang; Tao Jin; Linjun Li; Xize Cheng; Zhou Zhao; |
612 | Rehearsal-free Continual Language Learning Via Efficient Parameter Isolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To load correct parameters at testing time, we introduce a simple yet effective non-parametric method. |
Zhicheng Wang; Yufang Liu; Tao Ji; Xiaoling Wang; Yuanbin Wu; Congcong Jiang; Ye Chao; Zhencong Han; Ling Wang; Xu Shao; Wenqiu Zeng; |
613 | Label-Aware Hyperbolic Embeddings for Fine-grained Emotion Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose HypEmo, a novel framework that can integrate hyperbolic embeddings to improve the FEC task. |
Chih Yao Chen; Tun Min Hung; Yi-Li Hsu; Lun-Wei Ku; |
614 | Combo of Thinking and Observing for Outside-Knowledge VQA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we are inspired to constrain the cross-modality space to the natural-language space, which preserves the visual features directly while the model still benefits from the vast knowledge in the natural-language space. |
Qingyi Si; Yuchen Mo; Zheng Lin; Huishan Ji; Weiping Wang; |
615 | AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study strategies to incorporate AMR into generation-based EAE models. |
I-Hung Hsu; Zhiyu Xie; Kuan-Hao Huang; Prem Natarajan; Nanyun Peng; |
616 | Your Spouse Needs Professional Help: Determining The Contextual Appropriateness of Messages Through Modeling Social Relationships Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce a new approach to identifying inappropriate communication by explicitly modeling the social relationship between the individuals. |
David Jurgens; Agrima Seth; Jackson Sargent; Athena Aghighi; Michael Geraci; |
617 | TART: Improved Few-shot Text Classification Using Task-Adaptive Reference Transformation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel Task-Adaptive Reference Transformation (TART) network, aiming to enhance the generalization by transforming the class prototypes to per-class fixed reference points in task-adaptive metric spaces. |
Shuo Lei; Xuchao Zhang; Jianfeng He; Fanglan Chen; Chang-Tien Lu; |
618 | How Do In-Context Examples Affect Compositional Generalization? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present CoFe, a test suite to investigate in-context compositional generalization. |
Shengnan An; Zeqi Lin; Qiang Fu; Bei Chen; Nanning Zheng; Jian-Guang Lou; Dongmei Zhang; |
619 | Attractive Storyteller: Stylized Visual Storytelling with Unpaired Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a new task of Stylized Visual Storytelling (SVST), which aims to describe a photo stream with stylized stories that are more expressive and attractive. |
Dingyi Yang; Qin Jin; |
620 | Multitask Pretraining with Structured Knowledge for Text-to-SQL Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we present a large pretraining dataset and strategy for learning representations of text, tables, and SQL code that leverages the entire context of the problem. |
Robert Giaquinto; Dejiao Zhang; Benjamin Kleiner; Yang Li; Ming Tan; Parminder Bhatia; Ramesh Nallapati; Xiaofei Ma; |
621 | WSPAlign: Word Alignment Pre-training Via Large-Scale Weakly Supervised Span Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we make use of noisy, partially aligned, and non-parallel paragraphs in this paper. |
Qiyu Wu; Masaaki Nagata; Yoshimasa Tsuruoka; |
622 | Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate how to most efficiently use a fixed budget to build a compact model. |
Junmo Kang; Wei Xu; Alan Ritter; |
623 | OD-RTE: A One-Stage Object Detection Framework for Relational Triple Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we treat the RTE task based on table-filling method as an Object Detection task and propose a one-stage Object Detection framework for Relational Triple Extraction (OD-RTE). |
Jinzhong Ning; Zhihao Yang; Yuanyuan Sun; Zhizheng Wang; Hongfei Lin; |
624 | I Cast Detect Thoughts: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel task, G4C, to study teacher-student natural language interactions in a goal-driven and grounded environment. |
Pei Zhou; Andrew Zhu; Jennifer Hu; Jay Pujara; Xiang Ren; Chris Callison-Burch; Yejin Choi; Prithviraj Ammanabrolu; |
625 | Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Multi-task Pre-trained Modular Prompt (MP2) to boost prompt tuning for few-shot learning. |
Tianxiang Sun; Zhengfu He; Qin Zhu; Xipeng Qiu; Xuanjing Huang; |
626 | Is GPT-3 A Good Data Annotator? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is therefore natural to wonder whether it can be used to effectively annotate data for NLP tasks. In this paper, we evaluate the performance of GPT-3 as a data annotator by comparing it with traditional data annotation methods and analyzing its output on a range of tasks. |
Bosheng Ding; Chengwei Qin; Linlin Liu; Yew Ken Chia; Boyang Li; Shafiq Joty; Lidong Bing; |
627 | Multi-Grained Knowledge Retrieval for End-to-End Task-Oriented Dialog Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing systems blend knowledge retrieval with response generation and optimize them with direct supervision from reference responses, leading to suboptimal retrieval performance when the knowledge base becomes large-scale. To address this, we propose to decouple knowledge retrieval from response generation and introduce a multi-grained knowledge retriever (MAKER) that includes an entity selector to search for relevant entities and an attribute selector to filter out irrelevant attributes. |
Fanqi Wan; Weizhou Shen; Ke Yang; Xiaojun Quan; Wei Bi; |
628 | Few-shot Event Detection: An Empirical Study and A Unified View Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a thorough empirical study, a unified view of ED models, and a better unified baseline. |
Yubo Ma; Zehao Wang; Yixin Cao; Aixin Sun; |
629 | How to Plant Trees in Language Models: Data and Architectural Effects on The Emergence of Syntactic Inductive Biases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We focus on architectural features (depth, width, and number of parameters), as well as the genre and size of the pre-training corpus, diagnosing inductive biases using two syntactic transformation tasks: question formation and passivization, both in English. |
Aaron Mueller; Tal Linzen; |
630 | ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present ClarifyDelphi, an interactive system that learns to ask clarification questions (e. g. , why did you lie to your friend?) |
Valentina Pyatkin; Jena D. Hwang; Vivek Srikumar; Ximing Lu; Liwei Jiang; Yejin Choi; Chandra Bhagavatula; |
631 | HINT: Hypernetwork Instruction Tuning for Efficient Zero- and Few-Shot Generalisation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, many of these approaches suffer from high computational costs due to their reliance on concatenating lengthy instructions with every input example, resulting in costly reprocessing of the instruction. To avoid this, we introduce Hypernetworks for INstruction Tuning (HINT), which convert task instructions and examples into parameter-efficient modules inserted into an underlying model using a pretrained text encoder, eliminating the need to include instructions in the model input. |
Hamish Ivison; Akshita Bhagia; Yizhong Wang; Hannaneh Hajishirzi; Matthew Peters; |
632 | Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate the inductive biases of ICL from the perspective of feature bias: which feature ICL is more likely to use given a set of underspecified demonstrations in which two features are equally predictive of the labels. |
Chenglei Si; Dan Friedman; Nitish Joshi; Shi Feng; Danqi Chen; He He; |
633 | An Inclusive Notion of Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Towards that goal, we propose common terminology to discuss the production and transformation of textual data, and introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling. |
Ilia Kuznetsov; Iryna Gurevych; |
634 | AlignScore: Evaluating Factual Consistency with A Unified Alignment Function Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose AlignScore, a new holistic metric that applies to a variety of factual inconsistency scenarios as above. |
Yuheng Zha; Yichi Yang; Ruichen Li; Zhiting Hu; |
635 | Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although the existing pioneer study has achieved great success with the BART backbone, it overlooks the gap between the visual feature space and the decoder semantic space, the object-level metadata of the image, as well as the potential external knowledge. To solve these limitations, in this work, we propose a novel mulTi-source sEmantic grAph-based Multimodal sarcasm explanation scheme, named TEAM. |
Liqiang Jing; Xuemeng Song; Kun Ouyang; Mengzhao Jia; Liqiang Nie; |
636 | Counterfactual Active Learning for Out-of-Distribution Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the problem, we propose Counterfactual Active Learning (CounterAL) that empowers active learning with counterfactual thinking to bridge the seen samples with unseen cases. |
Xun Deng; Wenjie Wang; Fuli Feng; Hanwang Zhang; Xiangnan He; Yong Liao; |
637 | Multi-granularity Temporal Question Answering Over Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome the limitation, in this paper, we motivate the notion of multi-granularity temporal question answering over knowledge graphs and present a large scale dataset for multi-granularity TKGQA, namely MultiTQ. |
Ziyang Chen; Jinzhi Liao; Xiang Zhao; |
638 | A New Aligned Simple German Corpus Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new sentence-aligned monolingual corpus for Simple German – German. |
Vanessa Toborek; Moritz Busch; Malte Boßert; Christian Bauckhage; Pascal Welke; |
639 | Introducing Semantics Into Speech Encoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a task-agnostic unsupervised way of incorporating semantic information from LLMs into self-supervised speech encoders without labeled audio transcriptions. |
Derek Xu; Shuyan Dong; Changhan Wang; Suyoun Kim; Zhaojiang Lin; Bing Liu; Akshat Shrivastava; Shang-Wen Li; Liang-Hsuan Tseng; Guan-Ting Lin; Alexei Baevski; Hung-yi Lee; Yizhou Sun; Wei Wang; |
640 | Constrained Tuple Extraction with Interaction-Aware Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In practice, however, the validity of knowledge triples is associated with and changes with the spatial, temporal, or other kinds of constraints. Motivated by this observation, this paper proposes a constrained tuple extraction (CTE) task to guarantee the validity of knowledge tuples. |
Xiaojun Xue; Chunxia Zhang; Tianxiang Xu; Zhendong Niu; |
641 | MultiInstruct: Improving Multi-Modal Zero-Shot Learning Via Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce MultiInstruct, the first multimodal instruction tuning benchmark dataset that consists of 62 diverse multimodal tasks in a unified seq-to-seq format covering 10 broad categories. |
Zhiyang Xu; Ying Shen; Lifu Huang; |
642 | Single Sequence Prediction Over Reasoning Graphs for Multi-hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While such models can lead to better interpretability and high quantitative scores, they often have difficulty accurately identifying the passages corresponding to key entities in the context, resulting in incorrect passage hops and a lack of faithfulness in the reasoning path. To address this, we propose a single-sequence prediction method over a local reasoning graph that integrates a graph structure connecting key entities in each context passage to relevant subsequent passages for each question. |
Gowtham Ramesh; Makesh Narsimhan Sreedhar; Junjie Hu; |
643 | Contrastive Error Attribution for Finetuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a framework to identify and remove low-quality training instances that lead to undesirable outputs, such as faithfulness errors in text summarization. |
Faisal Ladhak; Esin Durmus; Tatsunori Hashimoto; |
644 | DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, most of these explainability methods have been shown to be brittle in the face of adversarial perturbations of their inputs in the image and generic textual domain. In this work we show that this phenomenon extends to specific and important high stakes domains like biomedical datasets. |
Adam Ivankay; Mattia Rigotti; Pascal Frossard; |
645 | Neural Machine Translation for Mathematical Formulae Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we perform the tasks of translating from LaTeX to Mathematica as well as from LaTeX to semantic LaTeX. |
Felix Petersen; Moritz Schubotz; Andre Greiner-Petter; Bela Gipp; |
646 | Query-Efficient Black-Box Red Teaming Via Bayesian Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose Bayesian red teaming (BRT), novel query-efficient black-box red teaming methods based on Bayesian optimization, which iteratively identify diverse positive test cases leading to model failures by utilizing the pre-defined user input pool and the past evaluations. |
Deokjae Lee; JunYeong Lee; Jung-Woo Ha; Jin-Hwa Kim; Sang-Woo Lee; Hwaran Lee; Hyun Oh Song; |
647 | SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present SSD-LM, a diffusion-based language model with two key design choices. |
Xiaochuang Han; Sachin Kumar; Yulia Tsvetkov; |
648 | Recall, Expand, and Multi-Candidate Cross-Encode: Fast and Accurate Ultra-Fine Entity Typing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose to perform entity typing in a recall-expand-filter manner. |
Chengyue Jiang; Wenyang Hui; Yong Jiang; Xiaobin Wang; Pengjun Xie; Kewei Tu; |
649 | MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to learn the shared representations across modalities to bridge their gap. |
Yuchen Hu; Chen Chen; Ruizhe Li; Heqing Zou; Eng Siong Chng; |
650 | Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the ever-evolving nature of summarization systems, metrics, and annotated benchmarks makes factuality evaluation a moving target, and drawing clear comparisons among metrics has become increasingly difficult. In this work, we aggregate factuality error annotations from nine existing datasets and stratify them according to the underlying summarization model. |
Liyan Tang; Tanya Goyal; Alex Fabbri; Philippe Laban; Jiacheng Xu; Semih Yavuz; Wojciech Kryscinski; Justin Rousseau; Greg Durrett; |
651 | GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present a plug-and-play and lightweight method named graph-induced fine-tuning (GIFT) which can adapt various Transformer-based pre-trained language models (PLMs) for universal MPC understanding. |
Jia-Chen Gu; Zhenhua Ling; Quan Liu; Cong Liu; Guoping Hu; |
652 | Hybrid Uncertainty Quantification for Selective Text Classification in Ambiguous Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new uncertainty estimation method that combines epistemic and aleatoric UE methods. |
Artem Vazhentsev; Gleb Kuzmin; Akim Tsvigun; Alexander Panchenko; Maxim Panov; Mikhail Burtsev; Artem Shelmanov; |
653 | BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To extend the benefits of BLOOM to other languages without incurring prohibitively large costs, it is desirable to adapt BLOOM to new languages not seen during pretraining. In this work, we apply existing language adaptation strategies to BLOOM and benchmark its zero-shot prompting performance on eight new languages in a resource-constrained setting. |
Zheng Xin Yong; Hailey Schoelkopf; Niklas Muennighoff; Alham Fikri Aji; David Ifeoluwa Adelani; Khalid Almubarak; M Saiful Bari; Lintang Sutawika; Jungo Kasai; Ahmed Baruwa; Genta Winata; Stella Biderman; Edward Raff; Dragomir Radev; Vassilina Nikoulina; |
654 | Logic-driven Indirect Supervision: An Application to Crisis Counseling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we examine whether inexpensive (and potentially noisy) session-level annotation can help improve utterance-level labeling. |
Mattia Medina Grespan; Meghan Broadbent; Xinyao Zhang; Katherine Axford; Brent Kious; Zac Imel; Vivek Srikumar; |
655 | Grounding Characters and Places in Narrative Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we annotate spatial relationships in approximately 2500 book excerpts and train a model using contextual embeddings as features to predict these relationships. |
Sandeep Soni; Amanpreet Sihra; Elizabeth Evans; Matthew Wilkens; David Bamman; |
656 | From Pretraining Data to Language Models to Downstream Tasks: Tracking The Trails of Political Biases Leading to Unfair NLP Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We discuss the implications of our findings for NLP research and propose future directions to mitigate unfairness. |
Shangbin Feng; Chan Young Park; Yuhan Liu; Yulia Tsvetkov; |
657 | SLABERT Talk Pretty One Day: Modeling Second Language Acquisition with BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that NLP literature has not given enough attention to the phenomenon of negative transfer. To understand patterns of both positive and negative transfer between L1 and L2, we model sequential second language acquisition in LMs. |
Aditya Yadavalli; Alekhya Yadavalli; Vera Tobin; |
658 | Contrastive Novelty-Augmented Learning: Anticipating Outliers with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Selective prediction, in which models abstain on low-confidence examples, provides a possible solution, but existing models are often overly confident on unseen classes. To remedy this overconfidence, we introduce Contrastive Novelty-Augmented Learning (CoNAL), a two-step method that generates OOD examples representative of novel classes, then trains to decrease confidence on them. |
Albert Xu; Xiang Ren; Robin Jia; |
659 | Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study meta prompt tuning (MPT) to systematically explore how meta-learning can help improve (if it can) cross-task generalization in PT through learning to initialize the prompt embeddings from other relevant tasks. |
Chengwei Qin; Shafiq Joty; Qian Li; Ruochen Zhao; |
660 | Rethinking The Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language models have been shown to perform better with an increase in scale on a wide variety of tasks via the in-context learning paradigm. In this paper, we investigate the hypothesis that the ability of a large language model to in-context learn (perform a task) is not uniformly spread across all of its underlying components. |
Hritik Bansal; Karthik Gopalakrishnan; Saket Dingliwal; Sravan Bodapati; Katrin Kirchhoff; Dan Roth; |
661 | Question-Answering in A Low-resourced Language: Benchmark Dataset and Models for Tigrinya Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents a native QA dataset for an East African language, Tigrinya. |
Fitsum Gaim; Wonsuk Yang; Hancheol Park; Jong Park; |
662 | ESCOXLM-R: Multilingual Taxonomy-driven Pre-training for The Job Market Domain Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce a language model called ESCOXLM-R, based on XLM-R-large, which uses domain-adaptive pre-training on the European Skills, Competences, Qualifications and Occupations (ESCO) taxonomy, covering 27 languages. |
Mike Zhang; Rob van der Goot; Barbara Plank; |
663 | CITADEL: Conditional Token Interaction Via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we unify different multi-vector retrieval models from a token routing viewpoint and propose conditional token interaction via dynamic lexical routing, namely CITADEL, for efficient and effective multi-vector retrieval. |
Minghan Li; Sheng-Chieh Lin; Barlas Oguz; Asish Ghoshal; Jimmy Lin; Yashar Mehdad; Wen-tau Yih; Xilun Chen; |
664 | MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To deal with the label shortage problem, we present a simple yet effective zero-shot approach MultiCapCLIP that can generate visual captions for different scenarios and languages without any labeled vision-caption pairs of downstream datasets. |
Bang Yang; Fenglin Liu; Xian Wu; Yaowei Wang; Xu Sun; Yuexian Zou; |
665 | Transfer and Active Learning for Dissonance Detection: Addressing The Rare-Class Challenge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose and investigate transfer- and active learning solutions to the rare class problem of dissonance detection through utilizing models trained on closely related tasks and the evaluation of acquisition strategies, including a proposed probability-of-rare-class (PRC) approach. |
Vasudha Varadarajan; Swanie Juhng; Syeda Mahwish; Xiaoran Liu; Jonah Luby; Christian Luhmann; H. Andrew Schwartz; |
666 | In-sample Curriculum Learning By Sequence Completion for Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the "easy-to-hard" intuition, we propose to do in-sample curriculum learning for natural language generation tasks. |
Qi Jia; Yizhu Liu; Haifeng Tang; Kenny Zhu; |
667 | Product Question Answering in E-Commerce: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to systematically review existing research efforts on product question answering (PQA). |
Yang Deng; Wenxuan Zhang; Qian Yu; Wai Lam; |
668 | Towards Domain-Agnostic and Domain-Adaptive Dementia Detection from Spoken Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that adapted models exhibit better performance across conversational and task-oriented datasets. |
Shahla Farzana; Natalie Parde; |
669 | Generalizing Backpropagation for Gradient-Based Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we observe that the gradient computation of a model is a special case of a more general formulation using semirings. |
Kevin Du; Lucas Torroba Hennigen; Niklas Stoehr; Alex Warstadt; Ryan Cotterell; |
670 | UPPAM: A Unified Pre-training Architecture for Political Actor Modeling Based on Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Unified Pre-training Architecture for Political Actor Modeling based on language (UPPAM). |
Xinyi Mou; Zhongyu Wei; Qi Zhang; Xuanjing Huang; |
671 | Generic Temporal Reasoning with Differential Analysis and Explanation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While temporal reasoning models can perform reasonably well on in-domain benchmarks, we have little idea of these systems’ generalizability due to existing datasets’ limitations. In this work, we introduce a novel task named TODAY that bridges this gap with temporal differential analysis, which as the name suggests, evaluates whether systems can correctly understand the effect of incremental changes. |
Yu Feng; Ben Zhou; Haoyu Wang; Helen Jin; Dan Roth; |
672 | Model-Based Simulation for Optimising Smart Reply Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a result, previous work has focused largely on post-hoc diversification, rather than explicitly learning to predict sets of responses. Motivated by this problem, we present SimSR, a novel method that employs model-based simulation to discover high-value response sets, through simulating possible user responses with a learned world model. |
Benjamin Towle; Ke Zhou; |
673 | Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we instead propose a generative model for learning multilingual text embeddings which can be used to retrieve or score sentence pairs. |
John Wieting; Jonathan Clark; William Cohen; Graham Neubig; Taylor Berg-Kirkpatrick; |
674 | On The Blind Spots of Model-Based Evaluation Metrics for Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data. |
Tianxing He; Jingyu Zhang; Tianle Wang; Sachin Kumar; Kyunghyun Cho; James Glass; Yulia Tsvetkov; |
675 | Dealing with Semantic Underspecification in Multimodal NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, we show that they struggle with it, which could negatively affect their performance and lead to harmful consequences when used for applications. In this position paper, we argue that our community should be aware of semantic underspecification if it aims to develop language technology that can successfully interact with human users. |
Sandro Pezzelle; |
676 | Trigger Warning Assignment As A Multi-Label Document Classification Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce trigger warning assignment as a multi-label classification task, create the Webis Trigger Warning Corpus 2022, and with it the first dataset of 1 million fanfiction works from Archive of our Own with up to 36 different warnings per document. |
Matti Wiegmann; Magdalena Wolska; Christopher Schröder; Ole Borchardt; Benno Stein; Martin Potthast; |
677 | WhitenedCSE: Whitening-based Contrastive Learning of Sentence Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a whitening-based contrastive learning method for sentence embedding learning (WhitenedCSE), which combines contrastive learning with a novel shuffled group whitening. |
Wenjie Zhuo; Yifan Sun; Xiaohan Wang; Linchao Zhu; Yi Yang; |
678 | Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: By leveraging data from multiple clients, the FL paradigm can be especially beneficial for clients that have little training data to develop a data-hungry neural semantic parser on their own. We propose an evaluation setup to study this task, where we re-purpose widely-used single-domain text-to-SQL datasets as clients to form a realistic heterogeneous FL setting and collaboratively train a global model. |
Tianshu Zhang; Changchang Liu; Wei-Han Lee; Yu Su; Huan Sun; |
679 | Causality-Guided Multi-Memory Interaction Network for Multivariate Stock Price Movement Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the Causality-guided Multi-memory Interaction Network (CMIN), a novel end-to-end deep neural network for stock movement prediction which, for the first time, models the multi-modality between financial text data and causality-enhanced stock correlations to achieve higher prediction accuracy. |
Di Luo; Weiheng Liao; Shuqi Li; Xin Cheng; Rui Yan; |
680 | DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, these generated samples are deficient in grammatical quality and semantic consistency, which impairs the effectiveness of adversarial training. To address these problems, we introduce a novel, effective procedure for instead adversarial training with only clean data. |
SongYang Gao; Shihan Dou; Yan Liu; Xiao Wang; Qi Zhang; Zhongyu Wei; Jin Ma; Ying Shan; |
681 | A Simple and Flexible Modeling for Mental Disorder Detection By Learning from Clinical Questionnaires Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address these challenges, we design a simple but flexible model that preserves domain-based interpretability. We propose a novel approach that captures the semantic meanings directly from the text and compares them to symptom-related descriptions. |
Hoyun Song; Jisu Shin; Huije Lee; Jong Park; |
682 | Downstream Datasets Make Surprisingly Good Pretraining Corpora Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a large-scale study of self-pretraining, where the same (downstream) training data is used for both pretraining and finetuning. |
Kundan Krishna; Saurabh Garg; Jeffrey Bigham; Zachary Lipton; |
683 | Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention. |
Liyan Xu; Chenwei Zhang; Xian Li; Jingbo Shang; Jinho D. Choi; |
684 | XDailyDialog: A Multilingual Parallel Dialogue Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we provide a multilingual parallel open-domain dialog dataset, XDailyDialog, to enable researchers to explore the challenging task of multilingual and cross-lingual open-domain dialog. |
Zeming Liu; Ping Nie; Jie Cai; Haifeng Wang; Zheng-Yu Niu; Peng Zhang; Mrinmaya Sachan; Kaiping Peng; |
685 | PAL to Lend A Helping Hand: Towards Building An Emotion Adaptive Polite and Empathetic Counseling Conversational Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, it is essential for the counselor to adequately comprehend the client’s emotions and ensure the client’s welfare, i.e., s/he should adapt and deal with the clients politely and empathetically to provide a pleasant, cordial and personalized experience. Motivated by this, in this work, we attempt to build PAL, a novel Polite and empAthetic counseLing conversational agent, to provide counseling support to substance addicts and crime victims. |
Kshitij Mishra; Priyanshu Priya; Asif Ekbal; |
686 | Bidirectional Generative Framework for Cross-domain Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To offer a more general solution, we propose a unified bidirectional generative framework to tackle various cross-domain ABSA tasks. |
Yue Deng; Wenxuan Zhang; Sinno Jialin Pan; Lidong Bing; |
687 | Contrastive Decoding: Open-ended Text Generation As Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose contrastive decoding (CD), a reliable decoding approach that optimizes a contrastive objective subject to a plausibility constraint. |
Xiang Lisa Li; Ari Holtzman; Daniel Fried; Percy Liang; Jason Eisner; Tatsunori Hashimoto; Luke Zettlemoyer; Mike Lewis; |
688 | Resolving Indirect Referring Expressions for Entity Selection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We address the problem of reference resolution, when people use natural expressions to choose between real world entities. |
Mohammad Javad Hosseini; Filip Radlinski; Silvia Pareti; Annie Louis; |
689 | Accelerating Transformer Inference for Translation Via Parallel Decoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to address the problem from the point of view of decoding algorithms, as a less explored but rather compelling direction. |
Andrea Santilli; Silvio Severino; Emilian Postolache; Valentino Maiorca; Michele Mancusi; Riccardo Marin; Emanuele Rodola; |
690 | Hard Sample Aware Prompt-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a Hard Sample Aware Prompt-Tuning framework (i.e., HardPT) to solve the non-differentiable problem in hard sample identification with reinforcement learning, and to strengthen the discrimination of the feature space without changing the original data distribution via an adaptive contrastive learning method. |
Yuanjian Xu; Qi An; Jiahuan Zhang; Peng Li; Zaiqing Nie; |
691 | WikiBio: A Semantic Resource for The Intersectional Analysis of Biographical Events Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite that, there are no corpora and models specifically designed for this task. In this paper we fill this gap by presenting a new corpus annotated for biographical event detection. |
Marco Antonio Stranisci; Rossana Damiano; Enrico Mensa; Viviana Patti; Daniele Radicioni; Tommaso Caselli; |
692 | Best-k Search Algorithm for Neural Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a deterministic search algorithm balancing both quality and diversity. |
Jiacheng Xu; Caiming Xiong; Silvio Savarese; Yingbo Zhou; |
693 | Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we aim to improve the NLU capabilities of Indic languages by making contributions along 3 important axes (i) monolingual corpora (ii) NLU testsets (iii) multilingual LLMs focusing on Indic languages. |
Sumanth Doddapaneni; Rahul Aralikatte; Gowtham Ramesh; Shreya Goyal; Mitesh M. Khapra; Anoop Kunchukuttan; Pratyush Kumar; |
694 | Transforming Visual Scene Graphs to Image Captions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to TransForm Scene Graphs into more descriptive Captions (TFSGC). |
Xu Yang; Jiawei Peng; Zihua Wang; Haiyang Xu; Qinghao Ye; Chenliang Li; Songfang Huang; Fei Huang; Zhangzikang Li; Yu Zhang; |
695 | Hybrid Transducer and Attention Based Encoder-Decoder Modeling for Speech-to-Text Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to leverage strengths of both modeling methods, we propose a solution by combining Transducer and Attention based Encoder-Decoder (TAED) for speech-to-text tasks. |
Yun Tang; Anna Sun; Hirofumi Inaguma; Xinyue Chen; Ning Dong; Xutai Ma; Paden Tomasello; Juan Pino; |
696 | Improving Domain Generalization for Prompt-Aware Essay Scoring Via Disentangled Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we propose a prompt-aware neural AES model to extract comprehensive representation for essay scoring, including both prompt-invariant and prompt-specific features. |
Zhiwei Jiang; Tianyi Gao; Yafeng Yin; Meng Liu; Hua Yu; Zifeng Cheng; Qing Gu; |
697 | What’s The Meaning of Superhuman Performance in Today’s NLU? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This has led to claims of superhuman capabilities and the provocative idea that certain tasks have been solved. In this position paper, we take a critical look at these claims and ask whether PLMs truly have superhuman abilities and what the current benchmarks are really evaluating. |
Simone Tedeschi; Johan Bos; Thierry Declerck; Jan Hajic; Daniel Hershcovich; Eduard Hovy; Alexander Koller; Simon Krek; Steven Schockaert; Rico Sennrich; Ekaterina Shutova; Roberto Navigli; |
698 | PromptNER: Prompt Locating and Typing for Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we unify entity locating and entity typing into prompt learning, and design a dual-slot multi-prompt template with the position slot and type slot to prompt locating and typing respectively. |
Yongliang Shen; Zeqi Tan; Shuhui Wu; Wenqi Zhang; Rongsheng Zhang; Yadong Xi; Weiming Lu; Yueting Zhuang; |
699 | Hints on The Data for Language Modeling of Synthetic Languages with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The question we have addressed in this paper is to what extent the critical amount of data varies for languages of different morphological typology, in particular those that have a rich inflectional morphology, and whether the tokenization method to preprocess the data can make a difference. |
Rodolfo Zevallos; Nuria Bel; |
700 | Neural Machine Translation Methods for Translating Text to Sign Language Glosses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our experiments, we improve the performance of the transformer-based models via (1) data augmentation, (2) semi-supervised Neural Machine Translation (NMT), (3) transfer learning and (4) multilingual NMT. |
Dele Zhu; Vera Czehmann; Eleftherios Avramidis; |
701 | Revisiting Event Argument Extraction: Can EAE Models Learn Better When Being Aware of Event Co-occurrences? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Event co-occurrences have been proven effective for event extraction (EE) in previous studies, but have not been considered for event argument extraction (EAE) recently. In this paper, we try to fill this gap between EE research and EAE research, by highlighting the question “Can EAE models learn better when being aware of event co-occurrences?” |
Yuxin He; Jingyue Hu; Buzhou Tang; |
702 | HAUSER: Towards Holistic and Automatic Evaluation of Simile Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the issues, we establish HAUSER, a holistic and automatic evaluation system for the SG task, which consists of five criteria from three perspectives and automatic metrics for each criterion. |
Qianyu He; Yikai Zhang; Jiaqing Liang; Yuncheng Huang; Yanghua Xiao; Yunwen Chen; |
703 | Large-scale Lifelong Learning of In-context Instructions and How to Tackle It Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Jointly fine-tuning a Pre-trained Language Model (PLM) on a pre-defined set of tasks with in-context instructions has been proven to improve its generalization performance, allowing us to build a universal language model that can be deployed across task boundaries. In this work, we explore for the first time whether this attractive property of in-context instruction learning can be extended to a scenario in which tasks are fed to the target PLM in a sequential manner. |
Jisoo Mok; Jaeyoung Do; Sungjin Lee; Tara Taghavi; Seunghak Yu; Sungroh Yoon; |
704 | Controllable Text Generation Via Probability Density Estimation in The Latent Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel control framework using probability density estimation in the latent space. |
Yuxuan Gu; Xiaocheng Feng; Sicheng Ma; Lingyuan Zhang; Heng Gong; Weihong Zhong; Bing Qin; |
705 | Learning Latent Relations for Temporal Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing methods have some drawbacks in explicitly capturing intra-time latent relations between co-occurring entities and inter-time latent relations between entities that appear at different times. To tackle these problems, we propose a novel Latent relations Learning method for TKG reasoning, namely L2TKG. |
Mengqi Zhang; Yuwei Xia; Qiang Liu; Shu Wu; Liang Wang; |
706 | DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided By Proof-level Value Function Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, to accommodate general theorems, we propose a novel Dynamic-Tree Driven Theorem Solver (DT-Solver) by guiding the search procedure with state confidence and proof-level values. |
Haiming Wang; Ye Yuan; Zhengying Liu; Jianhao Shen; Yichun Yin; Jing Xiong; Enze Xie; Han Shi; Yujun Li; Lin Li; Jian Yin; Zhenguo Li; Xiaodan Liang; |
707 | Unsupervised Selective Rationalization with Noise Injection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel training technique that effectively limits generation of implausible rationales by injecting noise between the generator and the predictor. |
Adam Storek; Melanie Subbiah; Kathleen McKeown; |
708 | Understanding In-Context Learning Via Supportive Pretraining Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike prior work that explores implicit mechanisms behind ICL, we study ICL via investigating the pretraining data. |
Xiaochuang Han; Daniel Simig; Todor Mihaylov; Yulia Tsvetkov; Asli Celikyilmaz; Tianlu Wang; |
709 | ETHICIST: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a method named Ethicist for targeted training data extraction through loss smoothed soft prompting and calibrated confidence estimation, investigating how to recover the suffix in the training data when given a prefix. |
Zhexin Zhang; Jiaxin Wen; Minlie Huang; |
710 | Effective Contrastive Weighting for Dense Query Expansion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a contrastive solution that learns to select the most useful embeddings for expansion. |
Xiao Wang; Sean MacAvaney; Craig Macdonald; Iadh Ounis; |
711 | Improving The Detection of Multilingual Online Attacks with Rich Social Media Data from Singapore Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In many cultural and geographical settings, however, it is common to code-mix languages, combining and interchanging them throughout conversations. To shine a light on this practice, and enable more research into code-mixed toxic content, we introduce SOA, a new multilingual dataset of online attacks. |
Janosch Haber; Bertie Vidgen; Matthew Chapman; Vibhor Agarwal; Roy Ka-Wei Lee; Yong Keong Yap; Paul Röttger; |
712 | Reanalyzing L2 Preposition Learning with Bayesian Mixed Effects and A Pretrained Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use both Bayesian and neural models to dissect a data set of Chinese learners’ pre- and post-interventional responses to two tests measuring their understanding of English prepositions. |
Jakob Prange; Man Ho Ivy Wong; |
713 | Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Socratic pretraining, a question-driven, unsupervised pretraining objective specifically designed to improve controllability in summarization tasks. |
Artidoro Pagnoni; Alex Fabbri; Wojciech Kryscinski; Chien-Sheng Wu; |
714 | MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose MatCha (Math reasoning and Chart derendering pretraining) to enhance visual language models’ capabilities in jointly modeling charts/plots and language data. |
Fangyu Liu; Francesco Piccinno; Syrine Krichene; Chenxi Pang; Kenton Lee; Mandar Joshi; Yasemin Altun; Nigel Collier; Julian Eisenschlos; |
715 | MGR: Multi-generator Based Rationalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a simple yet effective method named MGR to simultaneously solve the two problems. |
Wei Liu; Haozhao Wang; Jun Wang; Ruixuan Li; Xinyang Li; YuanKai Zhang; Yang Qiu; |
716 | BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While existing benchmarks measure the correlation with human judgements of faithfulness on model-generated summaries, they are insufficient for diagnosing whether metrics are: 1) consistent, i.e., indicate lower faithfulness as errors are introduced into a summary, 2) effective on human-written texts, and 3) sensitive to different error types (as summaries can contain multiple errors). To address these needs, we present a benchmark of unfaithful minimal pairs (BUMP), a dataset of 889 human-written, minimally different summary pairs, where a single error is introduced to a summary from the CNN/DailyMail dataset to produce an unfaithful summary. |
Liang Ma; Shuyang Cao; Robert L Logan IV; Di Lu; Shihao Ran; Ke Zhang; Joel Tetreault; Alejandro Jaimes; |
717 | Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we raise the question: is fine-tuning necessary for OOD detection? |
Rheeya Uppaal; Junjie Hu; Yixuan Li; |
718 | UniSumm and SummZoo: Unified Model and Diverse Benchmark for Few-Shot Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose UniSumm, a unified few-shot summarization model that is pre-trained with multiple summarization tasks and can be prefix-tuned to excel at any few-shot summarization task. |
Yulong Chen; Yang Liu; Ruochen Xu; Ziyi Yang; Chenguang Zhu; Michael Zeng; Yue Zhang; |
719 | RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose the Reference-Assisted Dialogue Evaluation (RADE) approach under the multi-task learning framework, which leverages the pre-created utterance as a reference other than the gold response to relieve the one-to-many problem. |
Zhengliang Shi; Weiwei Sun; Shuo Zhang; Zhen Zhang; Pengjie Ren; Zhaochun Ren; |
720 | An AMR-based Link Prediction Approach for Document-level Event Argument Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Since AMR is a generic structure and does not perfectly suit EAE, we propose a novel graph structure, Tailored AMR Graph (TAG), which compresses less informative subgraphs and edge types, integrates span information, and highlights surrounding events in the same document. |
Yuqing Yang; Qipeng Guo; Xiangkun Hu; Yue Zhang; Xipeng Qiu; Zheng Zhang; |
721 | PuMer: Pruning and Merging Tokens for Efficient Vision Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present PuMer: a token reduction framework that uses text-informed Pruning and modality-aware Merging strategies to progressively reduce the tokens of input image and text, improving model inference speed and reducing memory footprint. |
Qingqing Cao; Bhargavi Paranjape; Hannaneh Hajishirzi; |
722 | Gloss-Free End-to-End Sign Language Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we tackle the problem of sign language translation (SLT) without gloss annotations. |
Kezhou Lin; Xiaohan Wang; Linchao Zhu; Ke Sun; Bang Zhang; Yi Yang; |
723 | TAGPRIME: A Unified Framework for Relational Structure Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent works usually propose sophisticated models for each task independently and pay less attention to the commonality of these tasks and to building a unified framework for all of them. In this work, we propose to take a unified view of all these tasks and introduce TAGPRIME to address relational structure extraction problems. |
I-Hung Hsu; Kuan-Hao Huang; Shuning Zhang; Wenxin Cheng; Prem Natarajan; Kai-Wei Chang; Nanyun Peng; |
724 | Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Key aspects under study include the decoding target, the location of the RTD head, and the masking pattern. Based on these studies, we develop a new model, METRO-T0, which is pretrained using the redesigned ELECTRA-Style pretraining strategies and then prompt-finetuned on a mixture of NLP tasks. |
Linyuan Gong; Chenyan Xiong; Xiaodong Liu; Payal Bajaj; Yiqing Xie; Alvin Cheung; Jianfeng Gao; Xia Song; |
725 | BITE: Textual Backdoor Attacks with Iterative Trigger Injection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate that it is possible to design a backdoor attack that is both stealthy (i.e., hard to notice) and effective (i.e., has a high attack success rate). |
Jun Yan; Vansh Gupta; Xiang Ren; |
726 | A Crosslingual Investigation of Conceptualization in 1335 Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose Conceptualizer, a method that creates a bipartite directed alignment graph between source language concepts and sets of target language strings. |
Yihong Liu; Haotian Ye; Leonie Weissweiler; Philipp Wicke; Renhao Pei; Robert Zangenfeind; Hinrich Schütze; |
727 | Exploring and Verbalizing Academic Ideas By Concept Co-occurrence Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we devise a framework based on concept co-occurrence for academic idea inspiration, which has been integrated into a research assistant system. |
Yi Xu; Shuqian Sheng; Bo Xue; Luoyi Fu; Xinbing Wang; Chenghu Zhou; |
728 | MCLIP: Multilingual CLIP Via Cross-lingual Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce mCLIP, a retrieval-efficient dual-stream multilingual VLP model, trained by aligning the CLIP model and a Multilingual Text Encoder (MTE) through a novel Triangle Cross-modal Knowledge Distillation (TriKD) method. |
Guanhua Chen; Lu Hou; Yun Chen; Wenliang Dai; Lifeng Shang; Xin Jiang; Qun Liu; Jia Pan; Wenping Wang; |
729 | Distantly Supervised Course Concept Extraction in MOOCs with Academic Discipline Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the rapid growth of Massive Open Online Courses (MOOCs), it is expensive and time-consuming to extract high-quality knowledgeable concepts taught in the course by human effort to help learners grasp the essence of the course. In this paper, we propose to automatically extract course concepts using distant supervision to eliminate the heavy work of human annotations, which generates labels by matching them with an easily accessed dictionary. |
Mengying Lu; Yuquan Wang; Jifan Yu; Yexing Du; Lei Hou; Juanzi Li; |
730 | Extrinsic Evaluation of Machine Translation Metrics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate how useful MT metrics are at detecting segment-level quality by correlating metrics with how useful the translations are for downstream tasks. |
Nikita Moghe; Tom Sherborne; Mark Steedman; Alexandra Birch; |
731 | ExplainMeetSum: A Dataset for Explainable Meeting Summarization Aligned with Human Intent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using ExplainMeetSum, we propose a novel multiple extractor guided summarization, namely Multi-DYLE, which extensively generalizes DYLE to enable using a supervised extractor based on human-aligned extractive oracles. |
Hyun Kim; Minsoo Cho; Seung-Hoon Na; |
732 | A Cross-Modality Context Fusion and Semantic Refinement Network for Emotion Recognition in Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, they only consider simple contextual features ignoring semantic clues, resulting in an insufficient capture of the semantic coherence and consistency in conversations. To address these limitations, we propose a cross-modality context fusion and semantic refinement network (CMCF-SRNet). |
Xiaoheng Zhang; Yang Li; |
733 | CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This process, known as conceptual induction and deduction, is fundamental to commonsense reasoning while lacking both labeled data and methodologies to enhance commonsense modeling. To fill such a research gap, we propose CAT (Contextualized ConceptuAlization and InsTantiation), a semi-supervised learning framework that integrates event conceptualization and instantiation to conceptualize commonsense knowledge bases at scale. |
Weiqi Wang; Tianqing Fang; Baixuan Xu; Chun Yi Louis Bo; Yangqiu Song; Lei Chen; |
734 | The Elephant in The Room: Analyzing The Presence of Big Tech in Natural Language Processing Research Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we seek to quantify and characterize industry presence in the NLP community over time. |
Mohamed Abdalla; Jan Philip Wahle; Terry Lima Ruas; Aurélie Névéol; Fanny Ducel; Saif Mohammad; Karen Fort; |
735 | Language of Bargaining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging an established exercise in negotiation education, we build a novel dataset for studying how the use of language shapes bilateral bargaining. |
Mourad Heddaya; Solomon Dworkin; Chenhao Tan; Rob Voigt; Alexander Zentefis; |
736 | Do Question Answering Modeling Improvements Hold Across Benchmarks? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Do question answering (QA) modeling improvements (e.g., choice of architecture and training procedure) hold consistently across the diverse landscape of QA benchmarks? To study this question, we introduce the notion of concurrence: two benchmarks have high concurrence on a set of modeling approaches if they rank the modeling approaches similarly. |
Nelson F. Liu; Tony Lee; Robin Jia; Percy Liang; |
737 | VLN-Trans: Translator for The Vision and Language Navigation Agent Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We observe two kinds of issues in the instructions that can make the navigation task challenging: 1. The mentioned landmarks are not recognizable by the navigation agent due to the different vision abilities of the instructor and the modeled agent. 2. The mentioned landmarks are applicable to multiple targets, thus not distinctive for selecting the target among the candidate viewpoints. To deal with these issues, we design a translator module for the navigation agent to convert the original instructions into easy-to-follow sub-instruction representations at each step. |
Yue Zhang; Parisa Kordjamshidi; |
738 | Bridging The Gap Between Decision and Logits in Decision-based Knowledge Distillation for Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on decision-based KD for PLMs, where only teacher decisions (i.e., top-1 labels) are accessible. |
Qinhong Zhou; Zonghan Yang; Peng Li; Yang Liu; |
739 | Continual Contrastive Finetuning Improves Low-Resource Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim at bridging the gap and propose to pretrain and finetune the RE model using consistent objectives of contrastive learning. |
Wenxuan Zhou; Sheng Zhang; Tristan Naumann; Muhao Chen; Hoifung Poon; |
740 | KGA: A General Machine Unlearning Framework Based on Knowledge Gap Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a general unlearning framework called KGA to induce forgetfulness. |
Lingzhi Wang; Tong Chen; Wei Yuan; Xingshan Zeng; Kam-Fai Wong; Hongzhi Yin; |
741 | UniCoRN: Unified Cognitive Signal ReconstructioN Bridging Cognitive Signals and Human Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose fMRI2text, the first open-vocabulary task aiming to bridge fMRI time series and human language. |
Nuwa Xi; Sendong Zhao; Haochun Wang; Chi Liu; Bing Qin; Ting Liu; |
742 | Dense-ATOMIC: Towards Densely-connected ATOMIC with High Knowledge Coverage and Massive Multi-hop Paths Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim to construct Dense-ATOMIC with high knowledge coverage and massive multi-hop paths. |
Xiangqing Shen; Siwei Wu; Rui Xia; |
743 | Shrinking Embeddings for Hyper-Relational Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although some recent works have proposed to embed hyper-relational KGs, these methods fail to capture essential inference patterns of hyper-relational facts such as qualifier monotonicity, qualifier implication, and qualifier mutual exclusion, limiting their generalization capability. To unlock this, we present ShrinkE, a geometric hyper-relational KG embedding method aiming to explicitly model these patterns. |
Bo Xiong; Mojtaba Nayyeri; Shirui Pan; Steffen Staab; |
744 | CTC-based Non-autoregressive Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the potential of connectionist temporal classification (CTC) for non-autoregressive speech translation (NAST). |
Chen Xu; Xiaoqian Liu; Xiaowen Liu; Qingxuan Sun; Yuhao Zhang; Murun Yang; Qianqian Dong; Tom Ko; Mingxuan Wang; Tong Xiao; Anxiang Ma; Jingbo Zhu; |
745 | Attention As A Guide for Simultaneous Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Towards this objective, we propose EDAtt (Encoder-Decoder Attention), an adaptive policy that exploits the attention patterns between audio source and target textual translation to guide an offline-trained ST model during simultaneous inference. |
Sara Papi; Matteo Negri; Marco Turchi; |
746 | On Complementarity Objectives for Hybrid Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing models have focused on dense models to capture "residual" features neglected in the sparse models. Our key distinction is to show how this notion of residual complementarity is limited, and propose a new objective, denoted as RoC (Ratio of Complementarity), which captures a fuller notion of complementarity. |
Dohyeon Lee; Seung-won Hwang; Kyungjae Lee; Seungtaek Choi; Sunghyun Park; |
747 | C-STANCE: A Large Dataset for Chinese Zero-Shot Stance Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce two challenging subtasks for ZSSD: target-based ZSSD and domain-based ZSSD. |
Chenye Zhao; Yingjie Li; Cornelia Caragea; |
748 | Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Wukong-Reader, trained with new pre-training objectives to leverage the structural knowledge nested in document textlines. |
Haoli Bai; Zhiguang Liu; Xiaojun Meng; Li Wentao; Shuang Liu; Yifeng Luo; Nian Xie; Rongfu Zheng; Liangwei Wang; Lu Hou; Jiansheng Wei; Xin Jiang; Qun Liu; |
749 | PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes PaCE, a unified, structured, compositional multi-modal dialogue pre-training framework. |
Yunshui Li; Binyuan Hui; ZhiChao Yin; Min Yang; Fei Huang; Yongbin Li; |
750 | MVP-Tuning: Multi-View Knowledge Retrieval with Prompt Tuning for Commonsense Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose MultiView Knowledge Retrieval with Prompt Tuning (MVP-Tuning). |
Yongfeng Huang; Yanyang Li; Yichong Xu; Lin Zhang; Ruyi Gan; Jiaxing Zhang; Liwei Wang; |
751 | PEIT: Bridging The Modality Gap with Pre-trained Models for End-to-End Image Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose PEIT, an end-to-end image translation framework that bridges the modality gap with pre-trained models. |
Shaolin Zhu; Shangjie Li; Yikun Lei; Deyi Xiong; |
752 | Topic-Guided Sampling For Data-Efficient Multi-Domain Stance Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These make multi-domain stance detection challenging, requiring standardization and domain adaptation. To overcome this challenge, we propose Topic Efficient StancE Detection (TESTED), consisting of a topic-guided diversity sampling technique used for creating a multi-domain data efficient training set and a contrastive objective that is used for fine-tuning a stance classifier using the produced set. |
Erik Arakelyan; Arnav Arora; Isabelle Augenstein; |
753 | DiSCoMaT: Distantly Supervised Composition Extraction from Tables in Materials Science Articles Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A crucial component in the curation of KB for a scientific domain (e.g., materials science, food & nutrition, fuels) is information extraction from tables in the domain's published research articles. To facilitate research in this direction, we define a novel NLP task of extracting compositions of materials (e.g., glasses) from tables in materials science papers. |
Tanishq Gupta; Mohd Zaki; Devanshi Khatsuriya; Kausik Hira; N M Anoop Krishnan; Mausam –; |
754 | Self-Instruct: Aligning Language Models with Self-Generated Instructions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Self-Instruct, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. |
Yizhong Wang; Yeganeh Kordi; Swaroop Mishra; Alisa Liu; Noah A. Smith; Daniel Khashabi; Hannaneh Hajishirzi; |
755 | Disentangled Phonetic Representation for Chinese Spelling Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose to disentangle the two types of features to allow for direct interaction between textual and phonetic information. |
Zihong Liang; Xiaojun Quan; Qifan Wang; |
756 | Dissecting Transformer Length Extrapolation Via The Lens of Receptive Field Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We dissect ALiBi via the lens of receptive field analysis empowered by a novel cumulative normalized gradient tool. |
Ta-Chung Chi; Ting-Han Fan; Alexander Rudnicky; Peter Ramadge; |
757 | CHBias: Bias Evaluation and Mitigation of Chinese Conversational Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a new Chinese dataset, CHBias, for bias evaluation and mitigation of Chinese conversational language models. |
Jiaxu Zhao; Meng Fang; Zijing Shi; Yitong Li; Ling Chen; Mykola Pechenizkiy; |
758 | Learning New Skills After Deployment: Improving Open-domain Internet-driven Dialogue with Human Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models that can employ internet retrieval for up-to-date information and obtain feedback from humans during deployment provide the promise of both adapting to new information and improving their performance. In this work we study how to improve internet-driven conversational skills in such a learning framework. |
Jing Xu; Megan Ung; Mojtaba Komeili; Kushal Arora; Y-Lan Boureau; Jason Weston; |
759 | Uncovering and Categorizing Social Biases in Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim to uncover and mitigate social bias in Text-to-SQL models. |
Yan Liu; Yan Gao; Zhe Su; Xiaokang Chen; Elliott Ash; Jian-Guang Lou; |
760 | On The Compositional Generalization in Versatile Open-domain Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast to previous works, we develop a sparsely activated modular network: (1) We propose a well-rounded set of operators and instantiate each operator with an independent module; (2) We formulate dialogue generation as the execution of a generated programme which recursively composes and assembles modules. |
Tingchen Fu; Xueliang Zhao; Lemao Liu; Rui Yan; |
761 | What Is The Real Intention Behind This Question? Dataset Collection and Intention Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper is the first study to introduce a dataset (Question Intention Dataset) that includes questions with positive/neutral and negative intentions and the underlying intention categories within each group. |
Maryam Sadat Mirzaei; Kourosh Meshgi; Satoshi Sekine; |
762 | Conjunct Resolution in The Face of Verbal Omissions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure. |
Royi Rassin; Yoav Goldberg; Reut Tsarfaty; |
763 | Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose PATTERNREFRAME, a novel dataset of about 10k examples of thoughts containing unhelpful thought patterns conditioned on a given persona, accompanied by about 27k positive reframes. |
Mounica Maddela; Megan Ung; Jing Xu; Andrea Madotto; Heather Foran; Y-Lan Boureau; |
764 | Learning In-context Learning for Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Named entity recognition in real-world applications suffers from the diversity of entity types, the emergence of new entity types, and the lack of high-quality annotations. To address the above problems, this paper proposes an in-context learning-based NER approach, which can effectively inject in-context NER ability into PLMs and recognize entities of novel types on-the-fly using only a few demonstrative instances. |
Jiawei Chen; Yaojie Lu; Hongyu Lin; Jie Lou; Wei Jia; Dai Dai; Hua Wu; Boxi Cao; Xianpei Han; Le Sun; |
765 | Holistic Prediction on A Time-Evolving Attributed Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address two interrelated questions; (1) can we exploit task interdependence to improve prediction accuracy? |
Shohei Yamasaki; Yuya Sasaki; Panagiotis Karras; Makoto Onizuka; |
766 | Modeling Instance Interactions for Joint Information Extraction with Neural High-Order Conditional Random Field Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To better integrate cross-instance interactions, in this work, we introduce a joint IE framework (CRFIE) that formulates joint IE as a high-order Conditional Random Field. |
Zixia Jia; Zhaohui Yan; Wenjuan Han; Zilong Zheng; Kewei Tu; |
767 | Training Trajectories of Language Models Across Scales Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze the intermediate training checkpoints of differently sized OPT models (Zhang et al., 2022), ranging from 125M to 175B parameters, on next-token prediction, sequence-level generation and downstream tasks. |
Mengzhou Xia; Mikel Artetxe; Chunting Zhou; Xi Victoria Lin; Ramakanth Pasunuru; Danqi Chen; Luke Zettlemoyer; Veselin Stoyanov; |
768 | A Diverse Set of Freely Available Linguistic Resources for Turkish Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide corpora to allow practitioners to build their own applications and pretrained models that would assist industry researchers in creating quick prototypes. |
Duygu Altinok; |
769 | Measuring Consistency in Text-based Financial Forecasting Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite this, current methods for financial forecasting do not take consistency into consideration. To address this issue, we propose FinTrust, an evaluation tool that assesses logical consistency in financial text. |
Linyi Yang; Yingpeng Ma; Yue Zhang; |
770 | Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: It becomes thus crucial to implement effective preventive strategies to guarantee their proper functioning. In this paper, we address the problem of hallucination detection in NMT by following a simple intuition: as hallucinations are detached from the source content, they exhibit encoder-decoder attention patterns that are statistically different from those of good quality translations. |
Nuno M. Guerreiro; Pierre Colombo; Pablo Piantanida; André Martins; |
771 | RankCSE: Unsupervised Sentence Representations Learning Via Learning to Rank Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach, RankCSE, for unsupervised sentence representation learning, which incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework. |
Jiduan Liu; Jiahao Liu; Qifan Wang; Jingang Wang; Wei Wu; Yunsen Xian; Dongyan Zhao; Kai Chen; Rui Yan; |
772 | Entailment As Robust Self-Learner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we design a prompting strategy that formulates a number of different NLU tasks as contextual entailment. |
Jiaxin Ge; Hongyin Luo; Yoon Kim; James Glass; |
773 | ReCode: Robustness Evaluation of Code Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose ReCode, a comprehensive robustness evaluation benchmark for code generation models. |
Shiqi Wang; Zheng Li; Haifeng Qian; Chenghao Yang; Zijian Wang; Mingyue Shang; Varun Kumar; Samson Tan; Baishakhi Ray; Parminder Bhatia; Ramesh Nallapati; Murali Krishna Ramanathan; Dan Roth; Bing Xiang; |
774 | EPIC: Multi-Perspective Annotation of A Corpus of Irony Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present EPIC (English Perspectivist Irony Corpus), the first annotated corpus for irony analysis based on the principles of data perspectivism. |
Simona Frenda; Alessandro Pedrani; Valerio Basile; Soda Marem Lo; Alessandra Teresa Cignarella; Raffaella Panizzon; Cristina Marco; Bianca Scarlini; Viviana Patti; Cristina Bosco; Davide Bernardi; |
775 | Dialogue Summarization with Static-Dynamic Structure Fusion Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Static-Dynamic graph-based Dialogue Summarization model (SDDS), which fuses prior knowledge from human expertise and adaptively learns the graph structure in an end-to-end learning fashion. |
Shen Gao; Xin Cheng; Mingzhe Li; Xiuying Chen; Jinpeng Li; Dongyan Zhao; Rui Yan; |
776 | Large-Scale Correlation Analysis of Automated Metrics for Topic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct a large-scale correlation analysis of coherence metrics. |
Jia Peng Lim; Hady Lauw; |
777 | U-CREAT: Unsupervised Case Retrieval Using Events ExtrAcTion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore the role of events in legal case retrieval and propose an unsupervised retrieval method-based pipeline U-CREAT (Unsupervised Case Retrieval using Events Extraction). |
Abhinav Joshi; Akshat Sharma; Sai Kiran Tanikella; Ashutosh Modi; |
778 | ArgAnalysis35K : A Large-scale Dataset for Argument Quality Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage a combined experience of 10+ years of Parliamentary Debating to create a dataset that covers significantly more topics and has a wide range of sources to capture more diversity of opinion. |
Omkar Joshi; Priya Pitre; Yashodhara Haribhakta; |
779 | Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current FEC evaluation that relies on factuality metrics is not reliable and detailed enough. To address this problem, we are the first to manually annotate a FEC dataset for dialogue summarization containing 4000 items and propose FERRANTI, a fine-grained evaluation framework based on reference correction that automatically evaluates the performance of FEC models on different error categories. |
Mingqi Gao; Xiaojun Wan; Jia Su; Zhefeng Wang; Baoxing Huai; |
780 | Minding Language Models' (Lack Of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present SymbolicToM, a plug-and-play approach to reason about the belief states of multiple characters in reading comprehension tasks via explicit symbolic representation. |
Melanie Sclar; Sachin Kumar; Peter West; Alane Suhr; Yejin Choi; Yulia Tsvetkov; |
781 | Don't Retrain, Just Rewrite: Countering Adversarial Perturbations By Rewriting Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can language models transform inputs to protect text classifiers against adversarial attacks? In this work, we present ATINTER, a model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial for a downstream text classifier. |
Ashim Gupta; Carter Blum; Temma Choji; Yingjie Fei; Shalin Shah; Alakananda Vempala; Vivek Srikumar; |
782 | Aggregating Multiple Heuristic Signals As Supervision for Unsupervised Automated Essay Scoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel unsupervised AES approach ULRA, which does not require groundtruth scores of essays for training. |
Cong Wang; Zhiwei Jiang; Yafeng Yin; Zifeng Cheng; Shiping Ge; Qing Gu; |
783 | Mitigating Label Biases for In-context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, domain-label bias restricts LLMs to random-level performance on many tasks regardless of the choice of in-context examples. To mitigate the effect of these biases, we propose a simple bias calibration method that estimates a language model's label bias using random in-domain words from the task corpus. |
Yu Fei; Yifan Hou; Zeming Chen; Antoine Bosselut; |
784 | QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set Operations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For instance, one might search for "shorebirds that are not sandpipers" or "science-fiction films shot in England". To study the ability of retrieval systems to meet such information needs, we construct QUEST, a dataset of 3357 natural language queries with implicit set operations, that map to a set of entities corresponding to Wikipedia documents. |
Chaitanya Malaviya; Peter Shaw; Ming-Wei Chang; Kenton Lee; Kristina Toutanova; |
785 | Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a dynamic heterogeneous-graph reasoning method with LMs and knowledge representation learning (DHLK), which constructs a heterogeneous knowledge graph (HKG) based on multiple knowledge sources and optimizes the structure and knowledge representation of the HKG using a two-stage pruning strategy and knowledge representation learning (KRL). |
Yujie Wang; Hu Zhang; Jiye Liang; Ru Li; |
786 | Do You Hear The People Sing? Key Point Analysis Via Iterative Clustering and Abstractive Summarisation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Aggravating this problem is the fact that human evaluation is costly and unreproducible. To address the above issues, we propose a two-step abstractive summarisation framework based on neural topic modelling with an iterative clustering procedure, to generate key points which are aligned with how humans identify key points. |
Hao Li; Viktor Schlegel; Riza Batista-Navarro; Goran Nenadic; |
787 | Ambiguous Learning from Retrieval: Towards Zero-shot Semantic Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the Retrieval as Ambiguous Supervision framework, in which we construct a retrieval system based on pretrained language models to collect high-coverage candidates. |
Shan Wu; Chunlei Xin; Hongyu Lin; Xianpei Han; Cao Liu; Jiansong Chen; Fan Yang; Guanglu Wan; Le Sun; |
788 | Explicit Syntactic Guidance for Neural Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Accordingly, we propose a structural beam search method to find possible syntax structures hierarchically. |
Yafu Li; Leyang Cui; Jianhao Yan; Yongjing Yin; Wei Bi; Shuming Shi; Yue Zhang; |
789 | What Does A Text Classifier Learn About Morality? An Explainable Method for Cross-Domain Comparison of Moral Rhetoric Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Tomea, a method to compare a supervised classifier�s representation of moral rhetoric across domains. |
Enrico Liscio; Oscar Araque; Lorenzo Gatti; Ionut Constantinescu; Catholijn Jonker; Kyriaki Kalimeri; Pradeep Kumar Murukannaiah; |
790 | Graph-based Relation Mining for Context-free Out-of-vocabulary Word Embedding Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel graph-based relation mining method, namely GRM, for OOV word embedding learning. |
Ziran Liang; Yuyin Lu; HeGang Chen; Yanghui Rao; |
791 | Multimodal Persona Based Generation of Comic Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We focus on the novel problem of persona based dialogue generation for comic strips. |
Harsh Agrawal; Aditya Mishra; Manish Gupta; Mausam –; |
792 | LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present LLM-Blender, an ensembling framework designed to attain consistently superior performance by leveraging the diverse strengths of multiple open-source large language models (LLMs). |
Dongfu Jiang; Xiang Ren; Bill Yuchen Lin; |
793 | Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the compositional generalization for multi-attribute controllable dialogue generation where a model can learn from seen attribute values and generalize to unseen combinations. |
Weihao Zeng; Lulu Zhao; Keqing He; Ruotong Geng; Jingang Wang; Wei Wu; Weiran Xu; |
794 | Generating Structured Pseudo Labels for Noise-resistant Zero-shot Video Sentence Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate the effect of pseudo-label noise, we propose a noise-resistant iterative method that repeatedly re-weights the training samples based on noise estimation to train a grounding model and correct pseudo labels. |
Minghang Zheng; Shaogang Gong; Hailin Jin; Yuxin Peng; Yang Liu; |
795 | IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unfortunately, most meta-evaluation studies focus on European languages, the observations for which may not always apply to other languages. Indian languages, having over a billion speakers, are linguistically different from them, and to date, there are no such systematic studies focused solely on English to Indian language MT. This paper fills this gap through a Multidimensional Quality Metric (MQM) dataset consisting of 7000 fine-grained annotations, spanning 5 Indian languages and 7 MT systems. |
Ananya Sai B; Tanay Dixit; Vignesh Nagarajan; Anoop Kunchukuttan; Pratyush Kumar; Mitesh M. Khapra; Raj Dabre; |
796 | Weaker Than You Think: A Critical Look at Weakly Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, many sophisticated approaches have been proposed for robust training under label noise, reporting impressive results. In this paper, we revisit the setup of these approaches and find that the benefits brought by these approaches are significantly overestimated. |
Dawei Zhu; Xiaoyu Shen; Marius Mosbach; Andreas Stephan; Dietrich Klakow; |
797 | Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an adversarial training-inspired two-stage debiasing model using Contrastive learning with Continuous Prompt Augmentation (named CCPA) to mitigate social biases in PLMs' encoding. |
Yingji Li; Mengnan Du; Xin Wang; Ying Wang; |
798 | Towards Understanding Omission in Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the OLDS dataset, which provides high-quality omission labels for dialogue summarization. |
Yicheng Zou; Kaitao Song; Xu Tan; Zhongkai Fu; Qi Zhang; Dongsheng Li; Tao Gui; |
799 | Python Code Generation By Asking Clarification Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While recent pretrained language models demonstrate remarkable performance for this task, these models fail when the given natural language description is under-specified. In this work, we introduce a novel and more realistic setup for this task. |
Haau-Sing (Xiaocheng) Li; Mohsen Mesgar; André Martins; Iryna Gurevych; |
800 | A Compare-and-contrast Multistage Pipeline for Uncovering Financial Signals in Financial Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the challenge of discovering financial signals in narrative financial reports. |
Jia-Huei Ju; Yu-Shiang Huang; Cheng-Wei Lin; Che Lin; Chuan-Ju Wang; |
801 | Improving The Robustness of NLI Models with Minimax Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a minimax objective between a learner model being trained for the NLI task, and an auxiliary model aiming to maximize the learner's loss by up-weighting examples from regions of the input space where the learner incurs high losses. |
Michalis Korakakis; Andreas Vlachos; |
802 | USSA: A Unified Table Filling Scheme for Structured Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a niche-targeting and effective solution. |
Zepeng Zhai; Hao Chen; Ruifan Li; Xiaojie Wang; |
803 | PAD-Net: An Efficient Framework for Dynamic Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The main contributions of our work are challenging the basic commonsense in dynamic networks and proposing a partially dynamic network, namely PAD-Net, to transform the redundant dynamic parameters into static ones. |
Shwai He; Liang Ding; Daize Dong; Boan Liu; Fuqiang Yu; Dacheng Tao; |
804 | Resolving Ambiguities in Text-to-Image Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study ambiguities that arise in text-to-image generative models. |
Ninareh Mehrabi; Palash Goyal; Apurv Verma; Jwala Dhamala; Varun Kumar; Qian Hu; Kai-Wei Chang; Richard Zemel; Aram Galstyan; Rahul Gupta; |
805 | Knowledge Unlearning for Mitigating Privacy Risks in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose knowledge unlearning as an alternative method to reduce privacy risks for LMs post hoc. |
Joel Jang; Dongkeun Yoon; Sohee Yang; Sungmin Cha; Moontae Lee; Lajanugen Logeswaran; Minjoon Seo; |
806 | Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Unnatural Instructions: a large dataset of creative and diverse instructions, collected with virtually no human labor. |
Or Honovich; Thomas Scialom; Omer Levy; Timo Schick; |
807 | To Adapt or to Annotate: Challenges and Interventions for Domain Adaptation in Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a more realistic end-to-end domain shift evaluation setting covering five diverse domains. |
Dheeru Dua; Emma Strubell; Sameer Singh; Pat Verga; |
808 | A Survey for Efficient Open Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we will survey recent advancements in the efficiency of ODQA models and conclude core techniques for achieving efficiency. |
Qin Zhang; Shangsi Chen; Dongkuan Xu; Qingqing Cao; Xiaojun Chen; Trevor Cohn; Meng Fang; |
809 | Script Normalization for Unconventional Writing of Under-Resourced Languages in Bilingual Communities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This, however, comes with certain challenges in script normalization, particularly where the speakers of a language in a bilingual community rely on another script or orthography to write their native language. This paper addresses the problem of script normalization for several such languages that are mainly written in a Perso-Arabic script. |
Sina Ahmadi; Antonios Anastasopoulos; |
810 | Compositional Generalization Without Trees Using Multiset Tagging and Latent Permutations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We phrase semantic parsing as a two-step process: we first tag each input token with a multiset of output tokens. Then we arrange the tokens into an output sequence using a new way of parameterizing and predicting permutations. We formulate predicting a permutation as solving a regularized linear program and we backpropagate through the solver. |
Matthias Lindemann; Alexander Koller; Ivan Titov; |
811 | ManagerTower: Aggregating The Insights of Uni-Modal Experts for Vision-Language Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose ManagerTower, a novel VL model architecture that gathers and combines the insights of pre-trained uni-modal experts at different levels. |
Xiao Xu; Bei Li; Chenfei Wu; Shao-Yen Tseng; Anahita Bhiwandiwalla; Shachar Rosenman; Vasudev Lal; Wanxiang Che; Nan Duan; |
812 | Finding The Pillars of Strength for Multi-Head Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In particular, we propose Grouped Head Attention, trained with a self-supervised group constraint that groups attention heads, where each group focuses on an essential but distinctive feature subset. |
Jinjie Ni; Rui Mao; Zonglin Yang; Han Lei; Erik Cambria; |
813 | Jointprop: Joint Semi-supervised Learning for Entity and Relation Extraction with Heterogeneous Graph-based Propagation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate the issues, we propose Jointprop, a Heterogeneous Graph-based Propagation framework for joint semi-supervised entity and relation extraction, which captures the global structure information between individual tasks and exploits interactions within unlabeled data. |
Yandan Zheng; Anran Hao; Anh Tuan Luu; |
814 | Reasoning Over Hierarchical Question Decomposition Tree for Explainable Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to leverage question decomposing for heterogeneous knowledge integration, by breaking down a complex question into simpler ones, and selecting the appropriate knowledge source for each sub-question. |
Jiajie Zhang; Shulin Cao; Tingjian Zhang; Xin Lv; Juanzi Li; Lei Hou; Jiaxin Shi; Qi Tian; |
815 | Faking Fake News for Real Fake News Detection: Propaganda-Loaded Training Data Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: What limits the successful transfer between them is the sizable gap between machine-generated fake news and human-authored ones, including the notable differences in terms of style and underlying intent. With this in mind, we propose a novel framework for generating training examples that are informed by the known styles and strategies of human-authored propaganda. |
Kung-Hsiang Huang; Kathleen McKeown; Preslav Nakov; Yejin Choi; Heng Ji; |
816 | A Length-Extrapolatable Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we focus on length extrapolation, i.e., training on short texts while evaluating longer sequences. |
Yutao Sun; Li Dong; Barun Patra; Shuming Ma; Shaohan Huang; Alon Benhaim; Vishrav Chaudhary; Xia Song; Furu Wei; |
817 | A Survey of Deep Learning for Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this survey paper, we review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade. |
Pan Lu; Liang Qiu; Wenhao Yu; Sean Welleck; Kai-Wei Chang; |
818 | A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the potential of compressing them, which is crucial for real-world applications serving millions of users. |
Nitay Calderon; Subhabrata Mukherjee; Roi Reichart; Amir Kantor; |
819 | Vision Language Pre-training By Contrastive Learning with Cross-Modal Similarity Regulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we reconsider the problem of (partial) false negative samples from the Mutual Information (MI) Maximization perspective: the traditional contrastive loss (like the InfoNCE loss) will equally push away the anchor of all positive samples and negative samples regardless of their possible semantic similarities. |
Chaoya Jiang; Wei Ye; Haiyang Xu; Songfang Huang; Fei Huang; Shikun Zhang; |
820 | Tell2Design: A Dataset for Language-Guided Floor Plan Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider the task of generating designs directly from natural language descriptions, and consider floor plan generation as the initial research area. |
Sicong Leng; Yang Zhou; Mohammed Haroon Dupty; Wee Sun Lee; Sam Joyce; Wei Lu; |
821 | Are Human Explanations Always Helpful? Towards Objective Evaluation of Human Natural Language Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Before blindly using them as ground truth to train ML models, a vital question needs to be asked: How do we evaluate a human-annotated explanation's quality? In this paper, we build on the view that the quality of a human-annotated explanation can be measured based on its helpfulness (or impairment) to the ML models' performance for the desired NLP tasks for which the annotations were collected. |
Bingsheng Yao; Prithviraj Sen; Lucian Popa; James Hendler; Dakuo Wang; |
822 | Rethinking Annotation: Can Language Learners Contribute? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether language learners can contribute annotations to the benchmark datasets. |
Haneul Yoo; Rifki Afina Putri; Changyoon Lee; Youngin Lee; So-Yeon Ahn; Dongyeop Kang; Alice Oh; |
823 | Information Screening Whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To combat that, we propose a novel framework that simultaneously implements the idea of internal-information screening and external-information exploiting. |
Shengqiong Wu; Hao Fei; Yixin Cao; Lidong Bing; Tat-Seng Chua; |
824 | MultiEMO: An Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, existing state-of-the-art ERC models have difficulty classifying minority and semantically similar emotion categories. To address these challenges, we propose a novel attention-based correlation-aware multimodal fusion framework named MultiEMO, which effectively integrates multimodal cues by capturing cross-modal mapping relationships across textual, audio and visual modalities based on bidirectional multi-head cross-attention layers. |
Tao Shi; Shao-Lun Huang; |
825 | Learning Language-Specific Layers for Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Language-Specific Transformer Layers (LSLs), which allow us to increase model capacity, while keeping the amount of computation and the number of parameters used in the forward pass constant. |
Telmo Pires; Robin Schmidt; Yi-Hsiu Liao; Stephan Peitz; |
826 | Personality Understanding of Fictional Characters During Book Reading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The problem has not been studied in the NLP field, primarily due to the lack of appropriate datasets mimicking the process of book reading. We present the first labeled dataset PersoNet for this problem. |
Mo Yu; Jiangnan Li; Shunyu Yao; Wenjie Pang; Xiaochen Zhou; Zhou Xiao; Fandong Meng; Jie Zhou; |
827 | StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse Representations and Content Enhancing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we formulate the task of non-parallel story author-style transfer, which requires transferring an input story into a specified author style while maintaining source semantics. |
Xuekai Zhu; Jian Guan; Minlie Huang; Juan Liu; |
828 | Towards Benchmarking and Improving The Temporal Reasoning Capability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a comprehensive probing dataset TempReason to evaluate the temporal reasoning capability of large language models. |
Qingyu Tan; Hwee Tou Ng; Lidong Bing; |
829 | Finding The SWEET Spot: Analysis and Improvement of Adaptive Inference in Low Resource Settings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we compare the two main approaches for adaptive inference, Early-Exit and Multi-Model, when training data is limited. |
Daniel Rotem; Michael Hassid; Jonathan Mamou; Roy Schwartz; |
830 | Large Language Models Are Reasoning Teachers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, prompt-based CoT methods are dependent on very large models such as GPT-3 175B which are prohibitive to deploy at scale. In this paper, we use these large models as reasoning teachers to enable complex reasoning in smaller models and reduce model size requirements by several orders of magnitude. |
Namgyu Ho; Laura Schmid; Se-Young Yun; |
831 | Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead of using direct supervision, this work proposes an approach for abductive commonsense reasoning that exploits the fact that only a subset of explanations is correct for a given context. |
Wenting Zhao; Justin Chiu; Claire Cardie; Alexander Rush; |
832 | PESCO: Prompt-enhanced Self Contrastive Learning for Zero-shot Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present PESCO, a novel contrastive learning framework that substantially improves the performance of zero-shot text classification. |
Yau-Shian Wang; Ta-Chung Chi; Ruohong Zhang; Yiming Yang; |
833 | Visually-augmented Pretrained Language Models for NLP Tasks Without Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing solutions often rely on explicit images for visual knowledge augmentation (requiring time-consuming retrieval or generation), and they also conduct the augmentation for the whole input text, without considering whether it is actually needed in specific inputs or tasks. To address these issues, we propose a novel **V**isually-**A**ugmented fine-tuning approach that can be generally applied to various PLMs or NLP tasks, **W**ithout using any retrieved or generated **I**mages, namely **VAWI**. |
Hangyu Guo; Kun Zhou; Wayne Xin Zhao; Qinyu Zhang; Ji-Rong Wen; |
834 | Using Counterfactual Contrast to Improve Compositional Generalization for Multi-step Quantitative Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose CounterComp, a method that uses counterfactual scenarios to generate samples with compositional contrast. |
Armineh Nourbakhsh; Sameena Shah; Carolyn Rosé; |
835 | A Needle in A Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we investigate the recruitment of high-quality Amazon Mechanical Turk workers via a two-step pipeline. |
Lining Zhang; Simon Mille; Yufang Hou; Daniel Deutsch; Elizabeth Clark; Yixin Liu; Saad Mahamood; Sebastian Gehrmann; Miruna Clinciu; Khyathi Raghavi Chandu; João Sedoc; |
836 | TAVT: Towards Transferable Audio-Visual Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Transferable Audio-Visual Text Generation framework, named TAVT, which consists of two key components: Audio-Visual Meta-Mapper (AVMM) and Dual Counterfactual Contrastive Learning (DCCL). |
Wang Lin; Tao Jin; Wenwen Pan; Linjun Li; Xize Cheng; Ye Wang; Zhou Zhao; |
837 | MeetingQA: Extractive Question-Answering on Meeting Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, meeting discussions also have a useful question-answering (QA) component, crucial to understanding the discourse or meeting content, and can be used to build interactive interfaces on top of long transcripts. Hence, in this work, we leverage this inherent QA component of meeting discussions and introduce MeetingQA, an extractive QA dataset comprising questions asked by meeting participants and corresponding responses. |
Archiki Prasad; Trung Bui; Seunghyun Yoon; Hanieh Deilamsalehy; Franck Dernoncourt; Mohit Bansal; |
838 | FERMAT: An Alternative to Accuracy for Numerical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by CheckList (Ribeiro et al. , 2020), we introduce a multi-view evaluation set for numerical reasoning in English, called FERMAT. |
Jasivan Sivakumar; Nafise Sadat Moosavi; |
839 | Don't Forget Your ABC's: Evaluating The State-of-the-Art in Chat-Oriented Dialogue Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel human evaluation method to estimate the rates of many dialogue system behaviors. |
Sarah E. Finch; James D. Finch; Jinho D. Choi; |
840 | Decoder Tuning: Efficient Language Understanding As Decoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, we argue that input-side adaptation could be arduous due to the lack of gradient signals, and it usually requires thousands of API queries, resulting in high computation and time costs. |
Ganqu Cui; Wentao Li; Ning Ding; Longtao Huang; Zhiyuan Liu; Maosong Sun; |
841 | The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a test suite of coreference resolution subtasks that require reasoning over multiple facts. |
Akshatha Arodi; Martin Pömsl; Kaheer Suleman; Adam Trischler; Alexandra Olteanu; Jackie Chi Kit Cheung; |
842 | CREST: A Joint Framework for Rationalization and Counterfactual Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, prior work has not explored how these methods can be integrated to combine their complementary advantages. We overcome this limitation by introducing CREST (ContRastive Edits with Sparse raTionalization), a joint framework for selective rationalization and counterfactual text generation, and show that this framework leads to improvements in counterfactual quality, model robustness, and interpretability. |
Marcos Treviso; Alexis Ross; Nuno M. Guerreiro; André Martins; |
843 | Towards Unifying Multi-Lingual and Cross-Lingual Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to unify MLS and CLS into a more general setting, i.e., many-to-many summarization (M2MS), where a single model could process documents in any language and generate their summaries also in any language. |
Jiaan Wang; Fandong Meng; Duo Zheng; Yunlong Liang; Zhixu Li; Jianfeng Qu; Jie Zhou; |
844 | On Improving Summarization Factual Consistency from Natural Language Feedback Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study whether informational feedback in natural language can be leveraged to improve generation quality and user preference alignment. |
Yixin Liu; Budhaditya Deb; Milagro Teruel; Aaron Halfaker; Dragomir Radev; Ahmed Hassan Awadallah; |
845 | From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first large-scale computational investigation of dogwhistles. |
Julia Mendelsohn; Ronan Le Bras; Yejin Choi; Maarten Sap; |
846 | Exploring Large Language Models for Classical Philology Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While prior work on Classical languages unanimously uses BERT, in this work we create four language models for Ancient Greek that vary along two dimensions to study their versatility for tasks of interest for Classical languages: we explore (i) encoder-only and encoder-decoder architectures using RoBERTa and T5 as strong model types, and create for each of them (ii) a monolingual Ancient Greek and a multilingual instance that includes Latin and English. |
Frederick Riemenschneider; Anette Frank; |
847 | LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on improving text-layout interactions and proposes a novel multi-modal pre-training model, LayoutMask. |
Yi Tu; Ya Guo; Huan Chen; Jinyang Tang; |
848 | Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the noise-invariant visual modality to strengthen the robustness of AVSR, which can adapt to any testing noises without depending on noisy training data, a.k.a. unsupervised noise adaptation. |
Yuchen Hu; Ruizhe Li; Chen Chen; Chengwei Qin; Qiu-Shi Zhu; Eng Siong Chng; |
849 | An Extensible Plug-and-Play Method for Multi-Aspect Controllable Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we provide a theoretical lower bound for the interference and empirically found that the interference grows with the number of layers where prefixes are inserted. |
Xuancheng Huang; Zijun Liu; Peng Li; Tao Li; Maosong Sun; Yang Liu; |
850 | Double-Branch Multi-Attention Based Graph Neural Network for Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we find that a simple attention-based method can outperform a general GNN-based approach for KGC. |
Hongcai Xu; Junpeng Bao; Wenbo Liu; |
851 | Dual Cache for Long Document Neural Coreference Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new hybrid cache that integrates two eviction policies to capture global and local entities separately, and effectively reduces the aggregated cache misses by up to half, while improving the F1 score of coreference by 0. |
Qipeng Guo; Xiangkun Hu; Yue Zhang; Xipeng Qiu; Zheng Zhang; |
852 | Knowledge Transfer in Incremental Learning for Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To better acquire new knowledge, we propose a knowledge transfer method that can efficiently adapt original MNMT models to diverse incremental language pairs. |
Kaiyu Huang; Peng Li; Jin Ma; Ting Yao; Yang Liu; |
853 | DisorBERT: A Double Domain Adaptation Model for Detecting Signs of Mental Disorders in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Over the past years, awareness created by health campaigns and other sources has motivated the study of these disorders using information extracted from social media platforms. In this work, we aim to contribute to the study of these disorders and to the understanding of how mental problems are reflected on social media. |
Mario Aragon; Adrian Pastor Lopez Monroy; Luis Gonzalez; David E. Losada; Manuel Montes; |
854 | Toward Interactive Dictation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the feasibility of allowing users to interrupt their dictation with spoken editing commands in open-ended natural language. |
Belinda Z. Li; Jason Eisner; Adam Pauls; Sam Thomson; |
855 | CodeIE: Large Code Generation Models Are Better Few-Shot Information Extractors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to recast the structured output in the form of code instead of natural language and utilize generative LLMs of code (Code-LLMs) such as Codex to perform IE tasks, in particular, named entity recognition and relation extraction. |
Peng Li; Tianxiang Sun; Qiong Tang; Hang Yan; Yuanbin Wu; Xuanjing Huang; Xipeng Qiu; |
856 | Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we elaborate upon recipes for building multilingual representation models that are not only competitive with existing state-of-the-art models but are also more parameter efficient, thereby promoting better adoption in resource-constrained scenarios and practical applications. |
Barun Patra; Saksham Singhal; Shaohan Huang; Zewen Chi; Li Dong; Furu Wei; Vishrav Chaudhary; Xia Song; |
857 | Bridging The Gap: Entailment Fused-T5 for Open-retrieval Conversational Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the information gap still persists because these methods are still limited in pipeline framework, where decision-making and question generation are performed separately, making it hard to share the entailment reasoning used in decision-making across all stages. To tackle the above problem, we propose a novel one-stage end-to-end framework, called Entailment Fused-T5 (EFT), to bridge the information gap between decision-making and question generation in a global understanding manner. |
Xiao Zhang; Heyan Huang; Zewen Chi; Xian-Ling Mao; |
858 | LiveChat: A Large-Scale Personalized Dialogue Dataset Automatically Constructed from Live Streaming Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve the essential capability of responding and establish a benchmark in the live open-domain scenario, we introduce the LiveChat dataset, composed of 1.33 million real-life Chinese dialogues with almost 3800 average sessions across 351 personas and fine-grained profiles for each persona. |
Jingsheng Gao; Yixin Lian; Ziyi Zhou; Yuzhuo Fu; Baoyuan Wang; |
859 | Prompting PaLM for Translation: Assessing Strategies and Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate various strategies for choosing translation examples for few-shot prompting, concluding that example quality is the most important factor. |
David Vilar; Markus Freitag; Colin Cherry; Jiaming Luo; Viresh Ratnakar; George Foster; |
860 | Exploring Lottery Prompts for Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the advantage of prompting in the zero-shot setting and the observed performance fluctuation among different prompts, we explore the instance-level prompt and their generalizability. |
Yulin Chen; Ning Ding; Xiaobin Wang; Shengding Hu; Haitao Zheng; Zhiyuan Liu; Pengjun Xie; |
861 | A Facial Expression-Aware Multimodal Multi-task Learning Framework for Emotion Recognition in Multi-party Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, given an utterance, the face sequence extracted by previous methods may contain multiple people's faces, which will inevitably introduce noise to the emotion prediction of the real speaker. To tackle this issue, we propose a two-stage framework named Facial expression-aware Multimodal Multi-Task learning (FacialMMT). |
Wenjie Zheng; Jianfei Yu; Rui Xia; Shijin Wang; |
862 | TeAST: Temporal Knowledge Graph Embedding Via Archimedean Spiral Timeline Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Meanwhile, current TKGE models often lack the ability to simultaneously model important relation patterns and provide interpretability, which hinders their effectiveness and potential applications. To address these limitations, we propose a novel TKGE model which encodes Temporal knowledge graph embeddings via Archimedean Spiral Timeline (TeAST), which maps relations onto the corresponding Archimedean spiral timeline and transforms quadruple completion into a third-order tensor completion problem. |
Jiang Li; Xiangdong Su; Guanglai Gao; |
863 | Human Inspired Progressive Alignment and Comparative Learning for Grounded Word Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we took inspiration from how human babies acquire their first language, and developed a computational process for word acquisition through comparative learning. |
Yuwei Bao; Barrett Lattimer; Joyce Chai; |
864 | Conjunct Lengths in English, Dependency Length Minimization, and Dependency Structure of Coordination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper confirms that, in English binary coordinations, left conjuncts tend to be shorter than right conjuncts, regardless of the position of the governor of the coordination. |
Adam Przepiórkowski; Michal Wozniak; |
865 | LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we conduct a detailed analysis on the performance of legal-oriented pre-trained language models (PLMs). |
Ilias Chalkidis; Nicolas Garneau; Catalina Goanta; Daniel Katz; Anders Søgaard; |
866 | Revisiting Commonsense Reasoning in Machine Translation: Training, Evaluation and Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the evaluation, we propose a novel entity-aware evaluation method that takes into account both the NMT candidate and important entities in the candidate, which is more aligned with human judgement. |
Xuebo Liu; Yutong Wang; Derek F. Wong; Runzhe Zhan; Liangxuan Yu; Min Zhang; |
867 | NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose transferable backdoor attacks against prompt-based models, called NOTABLE, which is independent of downstream tasks and prompting strategies. |
Kai Mei; Zheng Li; Zhenting Wang; Yang Zhang; Shiqing Ma; |
868 | Revisiting Relation Extraction in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We address issues inherent to evaluating generative approaches to RE by doing human evaluations, in lieu of relying on exact matching. |
Somin Wadhwa; Silvio Amir; Byron Wallace; |
869 | Pre-trained Language Models Can Be Fully Zero-Shot Learners Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose nonparametric prompting PLM (NPPrompt) for fully zero-shot language understanding. |
Xuandong Zhao; Siqi Ouyang; Zhiguo Yu; Ming Wu; Lei Li; |
870 | Can Large Language Models Be An Alternative to Human Evaluations? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, large language models (LLMs) have demonstrated exceptional performance on unseen tasks when only the task instructions are provided. In this paper, we explore if such an ability of the LLMs can be used as an alternative to human evaluation. |
Cheng-Han Chiang; Hung-yi Lee; |
871 | HyperMixer: An MLP-based Low Cost Alternative to Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a simple variant, HyperMixer, which forms the token mixing MLP dynamically using hypernetworks. |
Florian Mai; Arnaud Pannatier; Fabio Fehr; Haolin Chen; Francois Marelli; Francois Fleuret; James Henderson; |
872 | UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel two-pass direct S2ST architecture, UnitY, which first generates textual representations and subsequently predicts discrete acoustic units. |
Hirofumi Inaguma; Sravya Popuri; Ilia Kulikov; Peng-Jen Chen; Changhan Wang; Yu-An Chung; Yun Tang; Ann Lee; Shinji Watanabe; Juan Pino; |
873 | Estimating The Uncertainty in Emotion Attributes Using Deep Evidential Regression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a Bayesian approach, deep evidential emotion regression (DEER), to estimate the uncertainty in emotion attributes. |
Wen Wu; Chao Zhang; Philip Woodland; |
874 | Annotation-Inspired Implicit Discourse Relation Classification with Auxiliary Discourse Connective Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Implicit discourse relation classification is a challenging task due to the absence of discourse connectives. To overcome this issue, we design an end-to-end neural model to explicitly generate discourse connectives for the task, inspired by the annotation process of PDTB. |
Wei Liu; Michael Strube; |
875 | Plug-and-Play Document Modules for Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we aim to decouple document encoding from downstream tasks, and propose to represent each document as a plug-and-play document module, i.e., a document plugin, for PTMs (PlugD). |
Chaojun Xiao; Zhengyan Zhang; Xu Han; Chi-Min Chan; Yankai Lin; Zhiyuan Liu; Xiangyang Li; Zhonghua Li; Zhao Cao; Maosong Sun; |
876 | An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate recent parameter-efficient methods in combination with counterfactual data augmentation (CDA) for bias mitigation. |
Zhongbin Xie; Thomas Lukasiewicz; |
877 | Two-Stage Fine-Tuning for Improved Bias and Variance for Large Pretrained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we first provide a variance decomposition-based justification criterion to examine whether large pretrained neural models in a fine-tuning setting are generalizable enough to have low bias and variance. We then perform theoretical and empirical analysis using ensemble methods explicitly designed to decrease variance due to optimization. |
Lijing Wang; Yingya Li; Timothy Miller; Steven Bethard; Guergana Savova; |
878 | A Comparative Study on The Impact of Model Compression Techniques on Fairness in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis involves a comprehensive evaluation of pruned, distilled, and quantized language models, which we benchmark across a range of intrinsic and extrinsic metrics for measuring bias in text classification. |
Krithika Ramesh; Arnav Chavan; Shrey Pandit; Sunayana Sitaram; |
879 | Ranking-Enhanced Unsupervised Sentence Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show that the semantic meaning of a sentence is also determined by nearest-neighbor sentences that are similar to the input sentence. |
Yeon Seonwoo; Guoyin Wang; Changmin Seo; Sajal Choudhary; Jiwei Li; Xiang Li; Puyang Xu; Sunghyun Park; Alice Oh; |
880 | To Revise or Not to Revise: Learning to Detect Improvable Claims for Argumentative Writing Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore the main challenges to identifying argumentative claims in need of specific revisions. |
Gabriella Skitalinskaya; Henning Wachsmuth; |
881 | Human-in-the-loop Evaluation for Early Misinformation Detection: A Case Study of COVID-19 Treatments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a human-in-the-loop evaluation framework for fact-checking novel misinformation claims and identifying social media messages that support them. |
Ethan Mendes; Yang Chen; Wei Xu; Alan Ritter; |
882 | Composition-contrastive Learning for Sentence Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Differently, we propose maximizing alignment between texts and a composition of their phrasal constituents. |
Sachin Chanchani; Ruihong Huang; |
883 | Causes and Cures for Interference in Multilingual Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work identifies the main factors that contribute to interference in multilingual machine translation. |
Uri Shaham; Maha Elbayad; Vedanuj Goswami; Omer Levy; Shruti Bhosale; |
884 | Understanding and Bridging The Modality Gap for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that the modality gap is relatively small during training except for some difficult cases, but keeps increasing during inference due to the cascading effect. To address these problems, we propose the Cross-modal Regularization with Scheduled Sampling (Cress) method. |
Qingkai Fang; Yang Feng; |
885 | Few-shot Reranking for Multi-hop QA Via Language Model Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To alleviate the need for a large number of labeled question-document pairs for retriever training, we propose PromptRank, which relies on language model prompting for multi-hop path reranking. |
Muhammad Khalifa; Lajanugen Logeswaran; Moontae Lee; Honglak Lee; Lu Wang; |
886 | DICE: Data-Efficient Clinical Event Extraction with Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce DICE, a robust and data-efficient generative model for clinical event extraction. |
Mingyu Derek Ma; Alexander Taylor; Wei Wang; Nanyun Peng; |
887 | XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present XSemPLR, a unified benchmark for cross-lingual semantic parsing featuring 22 natural languages and 8 meaning representations, built by examining and selecting 9 existing datasets to cover 5 tasks and 164 domains. |
Yusen Zhang; Jun Wang; Zhiguo Wang; Rui Zhang; |
888 | INK: Injecting KNN Knowledge in Nearest Neighbor Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters. |
Wenhao Zhu; Jingjing Xu; Shujian Huang; Lingpeng Kong; Jiajun Chen; |
889 | Uncertainty Guided Label Denoising for Document-level Distant Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a Document-level distant Relation Extraction framework with Uncertainty Guided label denoising, UGDRE. |
Qi Sun; Kun Huang; Xiaocui Yang; Pengfei Hong; Kun Zhang; Soujanya Poria; |
890 | Cross-Modal Attribute Insertions for Assessing The Robustness of Vision-and-Language Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, we propose cross-modal attribute insertions as a realistic perturbation strategy for vision-and-language data that inserts visual attributes of the objects in the image into the corresponding text (e.g., “girl on a chair” to “little girl on a wooden chair”). |
Shivaen Ramshetty; Gaurav Verma; Srijan Kumar; |
891 | Crosslingual Generalization Through Multitask Finetuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that finetuning large multilingual language models on English tasks with English prompts allows for task generalization to non-English languages that appear only in the pretraining corpus. |
Niklas Muennighoff; Thomas Wang; Lintang Sutawika; Adam Roberts; Stella Biderman; Teven Le Scao; M Saiful Bari; Sheng Shen; Zheng Xin Yong; Hailey Schoelkopf; Xiangru Tang; Dragomir Radev; Alham Fikri Aji; Khalid Almubarak; Samuel Albanie; Zaid Alyafeai; Albert Webson; Edward Raff; Colin Raffel; |
892 | Evaluate AMR Graph Similarity Via Self-supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current AMR metrics are all based on nodes or triples matching without considering the entire structures of AMR graphs. To address this problem, and inspired by learned similarity evaluation on plain text, we propose AMRSim, an automatic AMR graph similarity evaluation metric. |
Ziyi Shou; Fangzhen Lin; |
893 | Analyzing Transformers in Embedding Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While most interpretability methods rely on running models over inputs, recent work has shown that a zero-pass approach, where parameters are interpreted directly without a forward/backward pass, is feasible for some Transformer parameters, and for two-layer attention networks. In this work, we present a theoretical analysis where all parameters of a trained Transformer are interpreted by projecting them into the embedding space, that is, the space of vocabulary items they operate on. |
Guy Dar; Mor Geva; Ankit Gupta; Jonathan Berant; |
894 | Few-Shot Data-to-Text Generation Via Unified Representation and Multi-Source Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach for data-to-text generation that addresses the limitations of current methods that primarily focus on specific types of structured data. |
Alexander Hanbo Li; Mingyue Shang; Evangelia Spiliopoulou; Jie Ma; Patrick Ng; Zhiguo Wang; Bonan Min; William Yang Wang; Kathleen McKeown; Vittorio Castelli; Dan Roth; Bing Xiang; |
895 | FactKG: Fact Verification Via Reasoning on Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To enable the community to better use KGs, we introduce a new dataset, FactKG: Fact Verification via Reasoning on Knowledge Graphs. |
Jiho Kim; Sungjin Park; Yeonsu Kwon; Yohan Jo; James Thorne; Edward Choi; |
896 | DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an original study of PLMs in the medical domain on French language. |
Yanis Labrak; Adrien Bazoge; Richard Dufour; Mickael Rouvier; Emmanuel Morin; Béatrice Daille; Pierre-Antoine Gourraud; |
897 | Discriminative Reasoning with Sparse Event Representation for Document-level Event-Event Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel DECI model (SENDIR) for better document-level reasoning. |
Changsen Yuan; Heyan Huang; Yixin Cao; Yonggang Wen; |
898 | Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: These fine-grained annotations are crucial factors for accurately detecting the toxicity of posts involving lexical knowledge, which has been a challenge for researchers. To tackle this problem, we facilitate the fine-grained detection of Chinese toxic language by building a new dataset with benchmark results. |
Junyu Lu; Bo Xu; Xiaokun Zhang; Changrong Min; Liang Yang; Hongfei Lin; |
899 | SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present SpeechMatrix, a large-scale multilingual corpus of speech-to-speech translations mined from real speech of European Parliament recordings. |
Paul-Ambroise Duquenne; Hongyu Gong; Ning Dong; Jingfei Du; Ann Lee; Vedanuj Goswami; Changhan Wang; Juan Pino; Benoît Sagot; Holger Schwenk; |
900 | Character-Aware Models Improve Visual Text Rendering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate a key contributing factor: popular text-to-image models lack character-level input features, making it much harder to predict a word’s visual makeup as a series of glyphs. |
Rosanne Liu; Dan Garrette; Chitwan Saharia; William Chan; Adam Roberts; Sharan Narang; Irina Blok; Rj Mical; Mohammad Norouzi; Noah Constant; |
901 | IDRISI-RA: The First Arabic Location Mention Recognition Dataset of Disaster Tweets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, geolocation extraction is greatly understudied for low-resource languages such as Arabic. To fill this gap, we introduce IDRISI-RA, the first publicly available Arabic Location Mention Recognition (LMR) dataset that provides human- and automatically-labeled versions on the order of thousands and millions of tweets, respectively. |
Reem Suwaileh; Muhammad Imran; Tamer Elsayed; |
902 | FSUIE: A Novel Fuzzy Span Mechanism for Universal Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, UIE models lack attention to the limited span length feature in IE. To address these deficiencies, we propose the Fuzzy Span Universal Information Extraction (FSUIE) framework. |
Tianshuo Peng; Zuchao Li; Lefei Zhang; Bo Du; Hai Zhao; |
903 | What Do NLP Researchers Believe? Results of The NLP Community Metasurvey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the results of the NLP Community Metasurvey. |
Julian Michael; Ari Holtzman; Alicia Parrish; Aaron Mueller; Alex Wang; Angelica Chen; Divyam Madaan; Nikita Nangia; Richard Yuanzhe Pang; Jason Phang; Samuel R. Bowman; |
904 | Prototype-Guided Pseudo Labeling for Semi-Supervised Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel semi-supervised framework, namely ProtoS2, with prototypical cluster separation (PCS) and prototypical-center data selection (CDS) technology to address the issue. |
Weiyi Yang; Richong Zhang; Junfan Chen; Lihong Wang; Jaein Kim; |
905 | LENS: A Learnable Evaluation Metric for Text Simplification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Training on SimpEval, we present LENS, a Learnable Evaluation Metric for Text Simplification. |
Mounica Maddela; Yao Dou; David Heineman; Wei Xu; |
906 | MeetingBank: A Benchmark Dataset for Meeting Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present MeetingBank, a new benchmark dataset of city council meetings over the past decade. |
Yebowen Hu; Timothy Ganter; Hanieh Deilamsalehy; Franck Dernoncourt; Hassan Foroosh; Fei Liu; |
907 | UniEX: An Effective and Efficient Framework for Unified Information Extraction Via A Span-extractive Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new paradigm for universal information extraction (IE) that is compatible with any schema format and applicable to a list of IE tasks, such as named entity recognition, relation extraction, event extraction and sentiment analysis. |
Yang Ping; JunYu Lu; Ruyi Gan; Junjie Wang; Yuxiang Zhang; Pingjian Zhang; Jiaxing Zhang; |
908 | DEplain: A German Parallel Corpus with Intralingual Translations Into Plain Language for Sentence and Document Simplification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To advance sentence simplification and document simplification in German, this paper presents DEplain, a new dataset of parallel, professionally written and manually aligned simplifications in plain German (“plain DE”, or in German: “Einfache Sprache”). |
Regina Stodden; Omar Momen; Laura Kallmeyer; |
909 | A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the Divide-and-Conquer algorithm and dual-process theory, in this paper, we regard linguistically complex texts as compound proposition texts composed of multiple simple proposition sentences and propose an end-to-end Neural Divide-and-Conquer Reasoning framework, dubbed NDCR. |
Yunxin Li; Baotian Hu; Yuxin Ding; Lin Ma; Min Zhang; |
910 | RARR: Researching and Revising What Language Models Say, Using Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To enable attribution while still preserving all the powerful advantages of recent generation models, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically finds attribution for the output of any text generation model, and 2) post-edits the output to fix unsupported content while preserving the original output as much as possible. |
Luyu Gao; Zhuyun Dai; Panupong Pasupat; Anthony Chen; Arun Tejasvi Chaganty; Yicheng Fan; Vincent Zhao; Ni Lao; Hongrae Lee; Da-Cheng Juan; Kelvin Guu; |
911 | Should You Marginalize Over Possible Tokenizations? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we devise an importance-sampling-based algorithm that allows us to compute estimates of the marginal probabilities and compare them to the default procedure in a range of state-of-the-art models and datasets. |
Nadezhda Chirkova; Germán Kruszewski; Jos Rozen; Marc Dymetman; |
912 | Back to Patterns: Efficient Japanese Morphological Analysis with Feature-Sequence Trie Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Accurate neural models are much less efficient than non-neural models and are useless for processing billions of social media posts or handling user queries in real time with a limited budget. This study revisits the fastest pattern-based NLP methods to make them as accurate as possible, thus yielding a strikingly simple yet surprisingly accurate morphological analyzer for Japanese. |
Naoki Yoshinaga; |
913 | Transformed Protoform Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Meloni et al. (2021) achieved the state-of-the-art on Latin protoform reconstruction with an RNN-based encoder-decoder model with attention. We update their model with the state-of-the-art seq2seq model: the Transformer. |
Young Min Kim; Kalvin Chang; Chenxuan Cui; David R. Mortensen; |
914 | Ellipsis-Dependent Reasoning: A New Challenge for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel challenge for large language models: ellipsis-dependent reasoning. |
Daniel Hardt; |
915 | Bootstrapping Neural Relation and Explanation Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a method that self trains (or bootstraps) neural relation and explanation classifiers. |
Zheng Tang; Mihai Surdeanu; |
916 | A Fast Algorithm for Computing Prefix Probabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new speed-up of Jelinek and Lafferty’s (1991) algorithm, which runs in O(n³|N|³ + |N|⁴), where n is the input length and |N| is the number of non-terminals in the grammar. |
Franz Nowak; Ryan Cotterell; |
917 | Analyzing Text Representations By Measuring Task Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: What makes a representation good for text classification? Is it due to the geometric properties of the space, or because it is well aligned with the task? We hypothesize the latter. To test it, we develop a task alignment score based on hierarchical clustering that measures alignment at different levels of granularity. |
Cesar Gonzalez-Gutierrez; Audi Primadhanty; Francesco Cazzaro; Ariadna Quattoni; |
918 | Tracing Linguistic Markers of Influence in A Large Online Organisation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we investigate a similar question in a large, distributed, consensus-driven community with little traditional power hierarchy: the Internet Engineering Task Force (IETF), a collaborative organisation that designs internet standards. |
Prashant Khare; Ravi Shekhar; Mladen Karan; Stephen McQuistin; Colin Perkins; Ignacio Castro; Gareth Tyson; Patrick Healey; Matthew Purver; |
919 | Metaphor Detection Via Explicit Basic Meanings Modelling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel metaphor detection method, which models the basic meaning of the word based on literal annotation from the training set, and then compares this with the contextual meaning in a target sentence to identify metaphors. |
Yucheng Li; Shun Wang; Chenghua Lin; Frank Guerin; |
920 | XSIM++: An Improved Proxy to Bitext Mining Performance for Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new proxy score for evaluating bitext mining based on similarity in a multilingual embedding space: xsim++. |
Mingda Chen; Kevin Heffernan; Onur Çelebi; Alexandre Mourachko; Holger Schwenk; |
921 | Graph Propagation Based Data Augmentation for Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Graph Propagated Data Augmentation (GPDA) framework for Named Entity Recognition (NER), leveraging graph propagation to build relationships between labeled data and unlabeled natural texts. |
Jiong Cai; Shen Huang; Yong Jiang; Zeqi Tan; Pengjun Xie; Kewei Tu; |
922 | Dataset Distillation with Attention Labels for Fine-tuning BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on constructing distilled few-shot datasets for natural language processing (NLP) tasks to fine-tune pre-trained transformers. |
Aru Maekawa; Naoki Kobayashi; Kotaro Funakoshi; Manabu Okumura; |
923 | Multi-Document Summarization with Centroid-Based Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we focus on pretraining objectives for MDS. |
Ratish Surendran Puduppully; Parag Jain; Nancy Chen; Mark Steedman; |
924 | Scaling in Cognitive Modelling: A Multilingual Approach to Human Reading Times Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we focus on parameter size, showing that larger Transformer-based language models generate probabilistic estimates that are less predictive of early eye-tracking measurements reflecting lexical access and early semantic integration. |
Andrea de Varda; Marco Marelli; |
925 | Improving Generalization in Language Model-based Text-to-SQL Semantic Parsing: Two Simple Semantic Boundary-based Techniques Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we empirically investigate improving an LM’s generalization in semantic parsing with two simple techniques: at the token level, we introduce a token preprocessing method to preserve the semantic boundaries of tokens produced by LM tokenizers; at the sequence level, we propose to use special tokens to mark the boundaries of components aligned between input and output. |
Daking Rai; Bailin Wang; Yilun Zhou; Ziyu Yao; |
926 | HiPool: Modeling Long Documents Using Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, most of them apply sequential models for upper hierarchies, suffering from long dependency issues. In this paper, we alleviate these issues through a graph-based method. |
Irene Li; Aosong Feng; Dragomir Radev; Rex Ying; |
927 | A Weakly Supervised Classifier and Dataset of White Supremacist Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a dataset and classifier for detecting the language of white supremacist extremism, a growing issue in online hate speech. |
Michael Yoder; Ahmad Diab; David Brown; Kathleen Carley; |
928 | BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose BOLT, which relies on tunable biases to directly adjust the language model’s output logits. |
Xin Liu; Muhammad Khalifa; Lu Wang; |
929 | MOKB6: A Multilingual Open Knowledge Base Completion Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using the latest advances in multilingual Open IE, we construct the first multilingual Open KBC dataset, called mOKB6, containing facts from Wikipedia in six languages (including English). |
Shubham Mittal; Keshav Kolluru; Soumen Chakrabarti; Mausam –; |
930 | Covering Uncommon Ground: Gap-Focused Question Generation for Answer Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We define the task, highlight key desired aspects of a good GFQ, and propose a model that satisfies these. |
Roni Rabin; Alexandre Djerbetian; Roee Engelberg; Lidan Hackmon; Gal Elidan; Reut Tsarfaty; Amir Globerson; |
931 | Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MaRCo, a detoxification algorithm that combines controllable generation and text rewriting methods using a Product of Experts with autoencoder language models (LMs). |
Skyler Hallinan; Alisa Liu; Yejin Choi; Maarten Sap; |
932 | A Natural Bias for Language Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we show that we can effectively endow standard neural language generation models with a separate module that reflects unigram frequency statistics as prior knowledge, simply by initialising the bias term in a model’s final linear layer with the log-unigram distribution. |
Clara Meister; Wojciech Stokowiec; Tiago Pimentel; Lei Yu; Laura Rimell; Adhiguna Kuncoro; |
933 | Simple Augmentations of Logical Rules for Neuro-Symbolic Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we suggest three simple augmentations to existing rule sets: (1) transforming rules to their abductive forms, (2) generating equivalent rules that use inverse forms of constituent relations and (3) random walks that propose new rules. |
Ananjan Nandi; Navdeep Kaur; Parag Singla; Mausam –; |
934 | Parameter-efficient Weight Ensembling Facilitates Task-level Knowledge Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the many lightweight parameters such methods produce, we focus on transferring them between tasks to improve performance on new tasks, the key point of which is to obtain the similarity between tasks. In this paper, we explore 5 parameter-efficient weight ensembling methods to achieve such transferability and verify their effectiveness. |
Xingtai Lv; Ning Ding; Yujia Qin; Zhiyuan Liu; Maosong Sun; |
935 | Faithfulness Tests for Natural Language Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work explores the challenging question of evaluating the faithfulness of natural language explanations (NLEs). |
Pepa Atanasova; Oana-Maria Camburu; Christina Lioma; Thomas Lukasiewicz; Jakob Grue Simonsen; Isabelle Augenstein; |
936 | COGEN: Abductive Commonsense Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes CoGen, a model for both alphaNLI and alphaNLG tasks that employs a novel approach of combining temporal commonsense reasoning for each observation (before and after a real hypothesis) from pre-trained models with contextual filtering for training. |
Rohola Zandie; Diwanshu Shekhar; Mohammad Mahoor; |
937 | Multimodal Relation Extraction with Cross-Modal Retrieval and Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To improve the prediction, this research proposes to retrieve textual and visual evidence based on the object, sentence, and whole image. |
Xuming Hu; Zhijiang Guo; Zhiyang Teng; Irwin King; Philip S. Yu; |
938 | Characterization of Stigmatizing Language in Medical Records Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A new line of work in medical sociology has demonstrated that physicians often use stigmatizing language in the electronic medical records of certain groups, such as Black patients, which may exacerbate disparities. In this study, we characterize these instances at scale using a series of domain-informed NLP techniques. |
Keith Harrigian; Ayah Zirikly; Brant Chee; Alya Ahmad; Anne Links; Somnath Saha; Mary Catherine Beach; Mark Dredze; |
939 | Abstractive Summarizers Are Excellent Extractive Summarizers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the potential synergies of modeling extractive summarization with an abstractive summarization system and propose three novel inference algorithms using the sequence-to-sequence architecture. |
Daniel Varab; Yumo Xu; |
940 | Language Models Get A Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While the majority of current state-of-the-art debiasing methods focus on changes to the training regime, in this paper, we propose data intervention strategies as a powerful yet simple technique to reduce gender bias in pre-trained models. |
Himanshu Thakur; Atishay Jain; Praneetha Vaddamanu; Paul Pu Liang; Louis-Philippe Morency; |
941 | PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we introduce the Privacy Policy Language Understanding Evaluation (PLUE) benchmark, a multi-task benchmark for evaluating the privacy policy language understanding across various tasks. |
Jianfeng Chi; Wasi Uddin Ahmad; Yuan Tian; Kai-Wei Chang; |
942 | Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a simple yet efficient approach to adapt VLP to unseen languages using MPLM. |
Yasmine Karoui; Rémi Lebret; Negar Foroutan Eghlidi; Karl Aberer; |
943 | BUCA: A Binary Classification Approach to Unsupervised Commonsense Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to transform the downstream multiple choice question answering task into a simpler binary classification task by ranking all candidate answers according to their reasonableness. |
Jie He; Simon U; Victor Gutierrez-Basulto; Jeff Pan; |
944 | Nichelle and Nancy: The Influence of Demographic Attributes and Tokenization Length on First Name Biases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Demographic attributes of first names, however, are strongly correlated with corpus frequency and tokenization length, which may influence model behavior independent of or in addition to demographic factors. In this paper, we conduct a new series of first name substitution experiments that measures the influence of these factors while controlling for the others. |
Haozhe An; Rachel Rudinger; |
945 | Improving Syntactic Probing Correctness and Robustness with Control Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the probing methods are usually biased by the PLMs’ memorization of common word co-occurrences, even if they do not form syntactic relations. This paper presents a random-word-substitution and random-label-matching control task to reduce these biases and improve the robustness of syntactic probing methods. |
Weicheng Ma; Brian Wang; Hefan Zhang; Lili Wang; Rolando Coto-Solano; Saeed Hassanpour; Soroush Vosoughi; |
946 | Split-NER: Named Entity Recognition Via Two Question-Answering-based Classifications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the NER problem by splitting it into two logical sub-tasks: (1) Span Detection which simply extracts entity mention spans irrespective of entity type; (2) Span Classification which classifies the spans into their entity types. |
Jatin Arora; Youngja Park; |
947 | Credible Without Credit: Domain Experts Assess Generative Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using 10 domain experts across science and culture, we provide an initial assessment of the coherence, conciseness, accuracy, and sourcing of two language models across 100 expert-written questions. |
Denis Peskoff; Brandon Stewart; |
948 | Grokking of Hierarchical Structure in Vanilla Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that transformer language models can learn to generalize hierarchically after training for extremely long periods, far beyond the point when in-domain accuracy has saturated. |
Shikhar Murty; Pratyusha Sharma; Jacob Andreas; Christopher Manning; |
949 | Zero-shot Cross-lingual Transfer With Learned Projections Using Unlabeled Target-Language Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We construct language-specific subspaces using standard linear algebra constructs and selectively project source-language representations into the target language subspace during task-specific finetuning using two schemes. |
Ujan Deb; Ridayesh Parab; Preethi Jyothi; |
950 | Context-Aware Transformer Pre-Training for Answer Sentence Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose three pre-training objectives designed to mimic the downstream fine-tuning task of contextual AS2. |
Luca Di Liello; Siddhant Garg; Alessandro Moschitti; |
951 | Toward Expanding The Scope of Radiology Report Summarization to Multiple Anatomies and Modalities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: First, many prior studies conduct experiments on private datasets, preventing reproduction of results and fair comparisons across different systems and solutions. Second, most prior approaches are evaluated solely on chest X-rays. To address these limitations, we propose a dataset (MIMIC-RRS) involving three new modalities and seven new anatomies based on the MIMIC-III and MIMIC-CXR datasets. |
Zhihong Chen; Maya Varma; Xiang Wan; Curtis Langlotz; Jean-Benoit Delbrouck; |
952 | Efficient Diagnosis Assignment Using Unstructured Clinical Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, traditional methods, such as rule-based labeling functions or neural networks, require significant manual effort to tune and may not generalize well to multiple indications. To address these challenges, we propose HyDE (hybrid diagnosis extractor). |
Louis Blankemeier; Jason Fries; Robert Tinn; Joseph Preston; Nigam Shah; Akshay Chaudhari; |
953 | MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study an interesting hypothesis: can we transfer the in-context learning ability from the language domain to the VL domain? |
Masoud Monajatipoor; Liunian Harold Li; Mozhdeh Rouhsedaghat; Lin Yang; Kai-Wei Chang; |
954 | On The Interpretability and Significance of Bias Metrics in Texts: A PMI-based Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze an alternative PMI-based metric to quantify biases in texts. |
Francisco Valentini; Germán Rosati; Damián Blasi; Diego Fernandez Slezak; Edgar Altszyler; |
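The PMI-based bias quantification described in entry 954 can be illustrated with a small self-contained sketch. The toy corpus, the target/context word lists, and the sentence-level co-occurrence window below are illustrative assumptions, not the paper's actual data or exact formulation:

```python
import math

def pmi(corpus, w, c):
    """PMI(w, c) = log[ P(w, c) / (P(w) P(c)) ], with sentence-level
    co-occurrence estimated over a list of token sets."""
    n = len(corpus)
    p_w = sum(w in s for s in corpus) / n
    p_c = sum(c in s for s in corpus) / n
    p_wc = sum(w in s and c in s for s in corpus) / n
    if p_wc == 0:
        return float("-inf")
    return math.log(p_wc / (p_w * p_c))

def bias_score(corpus, context, group_a, group_b):
    """Bias of a context word toward group A vs. group B:
    difference of mean PMI with each group's words."""
    a = sum(pmi(corpus, w, context) for w in group_a) / len(group_a)
    b = sum(pmi(corpus, w, context) for w in group_b) / len(group_b)
    return a - b

# Toy corpus of token sets (illustrative only).
corpus = [
    {"he", "is", "a", "doctor"},
    {"he", "sees", "the", "doctor"},
    {"she", "is", "a", "doctor"},
    {"she", "is", "a", "nurse"},
    {"she", "sees", "the", "nurse"},
    {"he", "is", "a", "nurse"},
]
score = bias_score(corpus, "doctor", ["he"], ["she"])
```

A positive score indicates "doctor" co-occurs disproportionately with the first group in this toy corpus; the interpretability of such log-ratio scores is precisely what a PMI-based metric buys over opaque embedding-based bias measures.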
955 | Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Retrieval-augmented models commonly rely on a semantic retrieval mechanism based on the similarity between dense representations of the query chunk and potential neighbors. In this paper, we study the state-of-the-art Retro model and observe that its performance gain is better explained by surface-level similarities, such as token overlap. |
Ehsan Doostmohammadi; Tobias Norlund; Marco Kuhlmann; Richard Johansson; |
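The "surface-level similarity" that entry 955 finds so predictive can be sketched as simple token overlap between a query chunk and candidate neighbors. The Jaccard measure and the toy chunks below are illustrative assumptions, not the paper's retrieval pipeline:

```python
def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the token sets of two text chunks."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def retrieve(query: str, candidates: list[str], k: int = 2) -> list[str]:
    """Rank candidate neighbor chunks by surface overlap with the query."""
    return sorted(candidates, key=lambda c: token_overlap(query, c), reverse=True)[:k]

query = "the cat sat on the mat"
candidates = [
    "a cat sat on a mat",
    "stock prices fell sharply today",
    "the dog sat on the rug",
]
top = retrieve(query, candidates, k=1)  # highest token overlap wins
```

The point of the paper's observation is that a cheap surface score like this can explain much of the gain usually attributed to dense semantic retrieval.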
956 | MIReAD: Simple Method for Learning High-quality Representations from Scientific Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose MIReAD, a simple method that learns highquality representations of scientific papers by fine-tuning transformer model to predict the target journal class based on the abstract. |
Anastasiia Razdaibiedina; Aleksandr Brechalov; |
957 | KNOW How to Make Up Your Mind! Adversarially Detecting and Alleviating Inconsistencies in Natural Language Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we leverage external knowledge bases to significantly improve on an existing adversarial attack for detecting inconsistent NLEs. |
Myeongjun Jang; Bodhisattwa Prasad Majumder; Julian McAuley; Thomas Lukasiewicz; Oana-Maria Camburu; |
958 | Measuring The Effect of Influential Messages on Varying Personas Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new task, Response Forecasting on Personas for News Media, to estimate the response a persona (characterizing an individual or a group) might have upon seeing a news message. |
Chenkai Sun; Jinning Li; Hou Pong Chan; ChengXiang Zhai; Heng Ji; |
959 | Going Beyond Sentence Embeddings: A Token-Level Matching Algorithm for Calculating Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these embeddings are limited in representing semantics because they mix all the semantic information together in fixed-length vectors, which are difficult to recover and lack explainability. This paper presents a token-level matching inference algorithm, which can be applied on top of any language model to improve its performance on STS task. |
Hongwei Wang; Dong Yu; |
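The token-level matching idea in entry 959 can be sketched as a symmetric greedy alignment over token embeddings: each token is matched to its most similar counterpart in the other sentence, and the alignment scores are averaged. The 2-d toy "embeddings" below are illustrative values, not outputs of any particular language model:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def matching_similarity(emb_a, emb_b):
    """Symmetric token-level matching: align every token with its most
    similar token on the other side, then average both directions."""
    a_to_b = sum(max(cosine(u, v) for v in emb_b) for u in emb_a) / len(emb_a)
    b_to_a = sum(max(cosine(v, u) for u in emb_a) for v in emb_b) / len(emb_b)
    return (a_to_b + b_to_a) / 2

# Toy 2-d contextual embeddings for two near-paraphrases (assumed values).
sent_a = [(1.0, 0.0), (0.0, 1.0)]
sent_b = [(1.0, 0.1), (0.1, 1.0)]
sim = matching_similarity(sent_a, sent_b)
```

Because the score decomposes into per-token alignments, it is more explainable than a single fixed-length sentence vector, which is the motivation the highlight gives.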
960 | Robust Learning for Multi-party Addressee Recognition with Discrete Addressee Codebook Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a Robust Addressee Recognition (RAR) method, which discretizes the addressees into a character codebook, making it able to represent open-set addressees and remain robust in a noisy environment. |
Pengcheng Zhu; Wei Zhou; Kuncai Zhang; Yuankai Ma; Haiqing Chen; |
961 | TwistList: Resources and Baselines for Tongue Twister Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present work on the generation of tongue twisters – a form of language that is required to be phonetically conditioned to maximise sound overlap, whilst maintaining semantic consistency with an input topic, and still being grammatically correct. |
Tyler Loakman; Chen Tang; Chenghua Lin; |
962 | Substitution-based Semantic Change Detection Using Contextual Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a simplified approach to measuring semantic change using contextual embeddings, relying only on the most probable substitutes for masked terms. |
Dallas Card; |
963 | Probing Physical Reasoning with Counter-Commonsense Context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we create a CConS (Counter-commonsense Contextual Size comparison) dataset to investigate how physical commonsense affects the contextualized size comparison task; the proposed dataset consists of both contexts that fit physical commonsense and those that do not. |
Kazushi Kondo; Saku Sugawara; Akiko Aizawa; |
964 | Morphological Inflection with Phonological Features Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work explores effects on performance obtained through various ways in which morphological models get access to sub-character phonological features that are often the targets of morphological processes. We design two methods to achieve this goal: one that leaves models as is but manipulates the data to include features instead of characters, and another that manipulates models to take phonological features into account when building representations for phonemes. |
David Guriel; Omer Goldman; Reut Tsarfaty; |
965 | A Holistic Approach to Reference-Free Evaluation of Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a reference-free evaluation approach that characterizes evaluation as two aspects: (1) fluency: how well the translated text conforms to normal human language usage; (2) faithfulness: how well the translated text reflects the source data. |
Hanming Wu; Wenjuan Han; Hui Di; Yufeng Chen; Jinan Xu; |
966 | Balancing Lexical and Semantic Quality in Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel training method in which a re-ranker balances the lexical and semantic quality. |
Jeewoo Sul; Yong Suk Choi; |
967 | Learning Neuro-Symbolic World Models with Conversational Proprioception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we describe a method that can learn neuro-symbolic world models on the TextWorld-Commonsense set of games. |
Don Joven Agravante; Daiki Kimura; Michiaki Tatsubori; Asim Munawar; Alexander Gray; |
968 | In and Out-of-Domain Text Adversarial Robustness Via Label Smoothing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the adversarial robustness provided by label smoothing strategies in foundational models for diverse NLP tasks in both in-domain and out-of-domain settings. |
Yahan Yang; Soham Dan; Dan Roth; Insup Lee; |
969 | LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models often struggle when fine-tuned on small datasets. To address this issue, researchers have proposed various adaptation approaches. |
Amirhossein Abaskohi; Sascha Rothe; Yadollah Yaghoobzadeh; |
970 | Considerations for Meaningful Sign Language Machine Translation Based on Glosses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we review recent works on neural gloss translation. |
Mathias Müller; Zifan Jiang; Amit Moryossef; Annette Rios; Sarah Ebling; |
971 | Detecting Contradictory COVID-19 Drug Efficacy Claims from Biomedical Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The COVID-19 pandemic created a deluge of questionable and contradictory scientific claims about drug efficacy, an “infodemic” with lasting consequences for science and society. In this work, we argue that NLP models can help domain experts distill and understand the literature in this complex, high-stakes area. |
Daniel Sosa; Malavika Suresh; Christopher Potts; Russ Altman; |
972 | The Role of Global and Local Context in Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such an approach unfortunately only incorporates local context and prevents leveraging global document context in long documents such as novels, which might hinder performance. In this article, we explore the impact of global document context, and its relationships with local context. |
Arthur Amalvy; Vincent Labatut; Richard Dufour; |
973 | Joint End-to-end Semantic Proto-role Labeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We model SPRL jointly with predicate-argument extraction using a deep transformer model. We find that proto-role labeling is surprisingly robust in this setting, with only a small decrease when using predicted arguments. |
Elizabeth Spaulding; Gary Kazantsev; Mark Dredze; |
974 | Improving Automatic Quotation Attribution in Literary Novels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we approach quotation attribution as a set of four interconnected sub-tasks: character identification, coreference resolution, quotation identification, and speaker attribution. |
Krishnapriya Vishnubhotla; Frank Rudzicz; Graeme Hirst; Adam Hammond; |
975 | Modular Visual Question Answering Via Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a framework that formulates visual question answering as modular code generation. |
Sanjay Subramanian; Medhini Narasimhan; Kushal Khangaonkar; Kevin Yang; Arsha Nagrani; Cordelia Schmid; Andy Zeng; Trevor Darrell; Dan Klein; |
976 | Target-Based Offensive Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present TBO, a new dataset for Target-based Offensive language identification. |
Marcos Zampieri; Skye Morgan; Kai North; Tharindu Ranasinghe; Austin Simmmons; Paridhi Khandelwal; Sara Rosenthal; Preslav Nakov; |
977 | Unsupervised Subtitle Segmentation with Masked Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe a novel unsupervised approach to subtitle segmentation, based on pretrained masked language models, where line endings and subtitle breaks are predicted according to the likelihood of punctuation to occur at candidate segmentation points. |
David Ponce; Thierry Etchegoyhen; Victor Ruiz; |
978 | Exploring Continual Learning for Code Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a benchmark called CodeTask-CL that covers a wide range of tasks, including code generation, translation, summarization, and refinement, with different input and output programming languages. |
Prateek Yadav; Qing Sun; Hantian Ding; Xiaopeng Li; Dejiao Zhang; Ming Tan; Parminder Bhatia; Xiaofei Ma; Ramesh Nallapati; Murali Krishna Ramanathan; Mohit Bansal; Bing Xiang; |
979 | Deep Active Learning for Morphophonological Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There has been little research that focuses on applying active learning for morphological inflection and morphophonological processing. In this paper, we have proposed a deep active learning method for this task. |
Seyed Morteza Mirbostani; Yasaman Boreshban; Salam Khalifa; SeyedAbolghasem Mirroshandel; Owen Rambow; |
980 | Counterfactual Reasoning: Testing Language Models’ Understanding of Hypothetical Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a set of tests from psycholinguistic experiments, as well as larger-scale controlled datasets, to probe counterfactual predictions from five pre-trained language models. |
Jiaxuan Li; Lang Yu; Allyson Ettinger; |
981 | Bhasa-Abhijnaanam: Native-script and Romanized Language Identification for 22 Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Two major challenges for romanized text LID are the lack of training data and low-LID performance when languages are similar. We provide simple and effective solutions to these problems. |
Yash Madhani; Mitesh M. Khapra; Anoop Kunchukuttan; |
982 | Using Contradictions Improves Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work examines the use of contradiction in natural language inference (NLI) for question answering (QA). |
Etienne Fortier-Dubois; Domenic Rosati; |
983 | Token-Level Self-Evolution Training for Sequence-to-Sequence Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we present Token-Level Self-Evolution Training (SE), a simple and effective dynamic training method to fully and wisely exploit the knowledge from data. |
Keqin Peng; Liang Ding; Qihuang Zhong; Yuanxin Ouyang; Wenge Rong; Zhang Xiong; Dacheng Tao; |
984 | Gradient Ascent Post-training Enhances Language Model Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we empirically show that updating pretrained LMs (350M, 1. |
Dongkeun Yoon; Joel Jang; Sungdong Kim; Minjoon Seo; |
985 | An Open Dataset and Model for Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a LID model which achieves a macro-average F1 score of 0. |
Laurie Burchell; Alexandra Birch; Nikolay Bogoychev; Kenneth Heafield; |
986 | Evaluating Paraphrastic Robustness in Textual Entailment Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present PaRTE, a collection of 1,126 pairs of Recognizing Textual Entailment (RTE) examples to evaluate whether models are robust to paraphrasing. |
Dhruv Verma; Yash Kumar Lal; Shreyashee Sinha; Benjamin Van Durme; Adam Poliak; |
987 | Are Pre-trained Language Models Useful for Model Ensemble in Chinese Grammatical Error Correction? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We hypothesize that model ensemble based on the perplexity (PPL) computed by pre-trained language models (PLMs) should benefit the GEC system. |
Chenming Tang; Xiuyu Wu; Yunfang Wu; |
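The perplexity-based ensembling hypothesis in entry 987 can be sketched as picking, among candidate corrections, the one a language model finds most likely. The toy add-one-smoothed unigram LM below stands in for the pre-trained LM the paper uses; the corpus and candidates are assumptions for illustration:

```python
import math
from collections import Counter

def unigram_ppl(sentence: str, counts: Counter, total: int) -> float:
    """Perplexity under a toy unigram LM with add-one smoothing; a real
    ensemble would query a pre-trained LM for these scores instead."""
    vocab = len(counts)
    tokens = sentence.split()
    log_p = sum(math.log((counts[t] + 1) / (total + vocab)) for t in tokens)
    return math.exp(-log_p / len(tokens))

# Toy "training" text for the unigram counts (illustrative only).
corpus = "the cat sat on the mat the dog sat on the rug".split()
counts, total = Counter(corpus), len(corpus)

# Candidate outputs from different GEC systems; pick the lowest-perplexity one.
candidates = ["the cat sat on the mat", "cat the sat mat on zzz"]
best = min(candidates, key=lambda s: unigram_ppl(s, counts, total))
```

The same selection rule scales directly to PLM-computed perplexities: score every system's candidate and emit the argmin.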
988 | Improving Factuality of Abstractive Summarization Without Sacrificing Summary Quality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose EFactSum (i.e., Effective Factual Summarization), a candidate summary generation and ranking technique to improve summary factuality without sacrificing quality. |
Tanay Dixit; Fei Wang; Muhao Chen; |
989 | With A Little Push, NLI Models Can Robustly and Efficiently Predict Faithfulness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we show that pure NLI models _can_ outperform more complex metrics when combining task-adaptive data augmentation with robust inference procedures. |
Julius Steen; Juri Opitz; Anette Frank; Katja Markert; |
990 | A Better Way to Do Masked Language Model Scoring Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we demonstrate that the original PLL method yields inflated scores for out-of-vocabulary words and propose an adapted metric, in which we mask not only the target token, but also all within-word tokens to the right of the target. |
Carina Kauf; Anna Ivanova; |
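The adapted masking scheme of entry 990 (mask the target token plus all within-word subword tokens to its right) can be shown without a model at all, just by generating the mask set for each scoring position. The example tokenization of "souvenir" into three subwords is an illustrative assumption:

```python
def pll_word_l2r_masks(tokens, word_ids):
    """For each target position, mask the target token and every LATER token
    that belongs to the same word, per the adapted metric described above.
    Returns one list of masked positions per target position."""
    masks = []
    for i, wid in enumerate(word_ids):
        masked = [i] + [j for j in range(i + 1, len(tokens)) if word_ids[j] == wid]
        masks.append(masked)
    return masks

# Hypothetical tokenization: "a" is one word; "souvenir" splits into 3 subwords.
tokens = ["a", "souv", "##en", "##ir"]
word_ids = [0, 1, 1, 1]  # which word each token belongs to
masks = pll_word_l2r_masks(tokens, word_ids)
```

Masking the later within-word pieces prevents the model from trivially reconstructing a rare word's first subword from its continuation, which is the source of the inflated scores the paper reports for the original PLL metric.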
991 | ChatGPT for Zero-shot Dialogue State Tracking: A Solution or An Opportunity? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present preliminary experimental results on the ChatGPT research preview, showing that ChatGPT achieves state-of-the-art performance in zero-shot DST. |
Michael Heck; Nurul Lubis; Benjamin Ruppik; Renato Vukovic; Shutong Feng; Christian Geishauser; Hsien-chin Lin; Carel van Niekerk; Milica Gasic; |
992 | Controllable Mixed-Initiative Dialogue Generation Through Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We formalize prompt construction for controllable mixed-initiative dialogue. |
Maximillian Chen; Xiao Yu; Weiyan Shi; Urvi Awasthi; Zhou Yu; |
993 | Enhancing Event Causality Identification with Counterfactual Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, causal signals are ambiguous, which may lead to the context-keywords bias and the event-pairs bias. To solve this issue, we propose the counterfactual reasoning that explicitly estimates the influence of context keywords and event pairs in training, so that we are able to eliminate the biases in inference. |
Feiteng Mu; Wenjie Li; |
994 | Contrastive Bootstrapping for Label Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a lightweight contrastive clustering-based bootstrapping method to iteratively refine the labels of passages. |
Shudi Hou; Yu Xia; Muhao Chen; Sujian Li; |
995 | NollySenti: Leveraging Transfer Learning and Machine Translation for Nigerian Movie Sentiment Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we focus on the task of sentiment classification for cross-domain adaptation. |
Iyanuoluwa Shode; David Ifeoluwa Adelani; JIng Peng; Anna Feldman; |
996 | Trading Syntax Trees for Wordpieces: Target-oriented Opinion Words Extraction with Wordpieces and Aspect Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance TOWE performance, we tackle the issue of aspect representation loss during encoding. |
Samuel Mensah; Kai Sun; Nikolaos Aletras; |
997 | An (unhelpful) Guide to Selecting The Best ASR Architecture for Your Under-resourced Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we use four of the most popular ASR toolkits to train ASR models for fifteen languages with limited ASR training resources: eleven widely spoken languages of Africa, Asia, and South America, one endangered language of Central America, and three critically endangered languages of North America. |
Robert Jimerson; Zoey Liu; Emily Prud’hommeaux; |
998 | The Ecological Fallacy in Annotation: Modeling Human Label Variation Goes Beyond Sociodemographics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To account for sociodemographics in models of individual annotator behaviour, we introduce group-specific layers to multi-annotator models. |
Matthias Orlikowski; Paul Röttger; Philipp Cimiano; Dirk Hovy; |
999 | Decomposed Scoring of CCG Dependencies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: When a predicate has multiple argument slots that can be filled, the same lexical category is used for the label of multiple dependencies. In this paper, we show that this evaluation can result in disproportionate penalization of supertagging errors and obfuscate the truly erroneous dependencies. |
Aditya Bhargava; Gerald Penn; |
1000 | Do GPTs Produce Less Literal Translations? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there has been relatively little investigation on how such translations differ qualitatively from the translations generated by standard Neural Machine Translation (NMT) models. In this work, we investigate these differences in terms of the literalness of translations produced by the two systems. |
Vikas Raunak; Arul Menezes; Matt Post; Hany Hassan; |
1001 | Environmental Claim Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, this paper introduces the task of environmental claim detection. |
Dominik Stammbach; Nicolas Webersinke; Julia Bingler; Mathias Kraus; Markus Leippold; |
1002 | Black-box Language Model Explanation By Context Length Probing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present *context length probing*, a novel explanation technique for causal language models, based on tracking the predictions of a model as a function of the length of available context, and allowing to assign *differential importance scores* to different contexts. |
Ondřej Cífka; Antoine Liutkus; |
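The context length probing of entry 1002 can be sketched as re-scoring a target token under progressively longer context suffixes and taking the differences as differential importance scores. The toy scoring function below stands in for a real causal LM; its behavior (the target "mat" becomes likelier once "cat" is visible) is an assumption for illustration:

```python
import math

def context_length_scores(logprob, context, target):
    """logprob(context_suffix, target) -> log-probability of `target`.
    Returns the target's log-probability at every context length, plus
    differential importance scores: the gain from each added context token."""
    lengths = range(len(context) + 1)
    scores = [logprob(context[len(context) - n:], target) for n in lengths]
    # importance of context[-n] = score gain when it enters the context window
    importance = [scores[n] - scores[n - 1] for n in range(1, len(scores))]
    return scores, importance

def toy_logprob(ctx, target):
    """Stand-in for a causal LM's next-token log-probability (assumed)."""
    if "cat" in ctx and target == "mat":
        return math.log(0.5)
    return math.log(0.1)

context = ["the", "cat", "sat", "on", "the"]
scores, importance = context_length_scores(toy_logprob, context, "mat")
```

The importance spike lands on the step where "cat" first enters the window, which is exactly the kind of model-agnostic attribution the technique provides: it only needs black-box next-token probabilities, never gradients or attention weights.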
1003 | Let Me Check The Examples: Enhancing Demonstration Learning Via Explicit Imitation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the human learning process, in this paper, we introduce Imitation DEMOnstration learning (Imitation-Demo) to strengthen demonstration learning via explicitly imitating human review behaviour, which includes: (1) contrastive learning mechanism to concentrate on similar demonstrations. |
Sirui Wang; Kaiwen Wei; Hongzhi Zhang; Yuntao Li; Wei Wu; |
1004 | The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we develop and compare several neural explainability methods and demonstrate their effectiveness for interpreting state-of-the-art fine-tuned neural metrics. |
Ricardo Rei; Nuno M. Guerreiro; Marcos Treviso; Luisa Coheur; Alon Lavie; André Martins; |
1005 | Typo-Robust Representation Learning for Dense Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To assess the effectiveness of our proposed method, we compare it against the existing competitors using two benchmark datasets and two base encoders. |
Panuthep Tasawong; Wuttikorn Ponwitayarat; Peerat Limkonchotiwat; Can Udomcharoenchaikit; Ekapol Chuangsuwanich; Sarana Nutanong; |
1006 | Focused Prefix Tuning for Controllable Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose focused prefix tuning (FPT) to mitigate the problem and to enable the control to focus on the desired attribute. |
Congda Ma; Tianyu Zhao; Makoto Shing; Kei Sawada; Manabu Okumura; |
1007 | ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that having access to non-parametric memory in the form of a knowledge base with the teacher’s soft labels and predictions can further improve student generalization. |
Jianyi Zhang; Aashiq Muhamed; Aditya Anantharaman; Guoyin Wang; Changyou Chen; Kai Zhong; Qingjun Cui; Yi Xu; Belinda Zeng; Trishul Chilimbi; Yiran Chen; |
1008 | Debiasing Generative Named Entity Recognition By Calibrating Sequence Likelihood Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we stick to the Seq2Seq formulation and propose a reranking-based approach. |
Yu Xia; Yongwei Zhao; Wenhao Wu; Sujian Li; |
1009 | Deriving Language Models from Masked Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper studies methods for deriving explicit joint distributions from MLMs, focusing on distributions over two tokens, which makes it possible to calculate exact distributional properties. |
Lucas Torroba Hennigen; Yoon Kim; |
1010 | UniTRec: A Unified Text-to-Text Transformer and Joint Contrastive Learning Framework for Text-based Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast to previous works that either use PLM to encode user history as a whole input text, or impose an additional aggregation network to fuse multi-turn history representations, we propose a unified local- and global-attention Transformer encoder to better model two-level contexts of user history. |
Zhiming Mao; Huimin Wang; Yiming Du; Kam-Fai Wong; |
1011 | Reasoning Implicit Sentiment with Chain-of-Thought Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the recent chain-of-thought (CoT) idea, in this work we introduce a Three-hop Reasoning (THOR) CoT framework to mimic the human-like reasoning process for ISA. |
Hao Fei; Bobo Li; Qian Liu; Lidong Bing; Fei Li; Tat-Seng Chua; |
1012 | Latent Positional Information Is in The Self-Attention Variance of Transformer Language Models Without Positional Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The use of positional embeddings in transformer language models is widely accepted. However, recent research has called into question the necessity of such embeddings. We further extend this inquiry by demonstrating that a randomly initialized and frozen transformer language model, devoid of positional embeddings, inherently encodes strong positional information through the shrinkage of self-attention variance. |
Ta-Chung Chi; Ting-Han Fan; Li-Wei Chen; Alexander Rudnicky; Peter Ramadge; |
1013 | Is Anisotropy Truly Harmful? A Case Study on Text Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, despite this effort, there is no established relationship between anisotropy and performance. In this paper, we aim to bridge this gap by investigating the impact of different transformations on both the isotropy and the performance in order to assess the true impact of anisotropy. |
Mira Ait-Saada; Mohamed Nadif; |
1014 | Class Based Influence Functions for Error Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they are unstable when applied to deep networks. In this paper, we provide an explanation for the instability of IFs and develop a solution to this problem. |
Thang Nguyen-Duc; Hoang Thanh-Tung; Quan Hung Tran; Dang Huu-Tien; Hieu Nguyen; Anh T. V. Dau; Nghi Bui; |
1015 | Leveraging Prefix Transfer for Multi-Intent Text Revision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim to build a multi-intent text revision system that could revise texts without explicit intent annotation. |
Ruining Chong; Cunliang Kong; Liu Wu; Zhenghao Liu; Ziye Jin; Liner Yang; Yange Fan; Hanghang Fan; Erhong Yang; |
1016 | Learning Multi-Step Reasoning By Solving Arithmetic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work investigates how to incorporate relatively small LMs with the capabilities of multi-step reasoning. We propose to inject such abilities by continually pre-training LMs on a synthetic dataset MsAT which is composed of Multi-step Arithmetic Tasks. |
Tianduo Wang; Wei Lu; |
1017 | Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on prefix tuning, which only optimizes continuous prefix vectors (i. e. pseudo tokens) inserted into Transformer layers. |
Zhen-Ru Zhang; Chuanqi Tan; Haiyang Xu; Chengyu Wang; Jun Huang; Songfang Huang; |
1018 | Improving Gender Fairness of Pre-Trained Language Models Without Catastrophic Forgetting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Forgetting information in the original training data may damage the model's downstream performance by a large margin. In this work, we empirically show that catastrophic forgetting occurs in such methods by evaluating them with general NLP tasks in GLUE. |
Zahra Fatemi; Chen Xing; Wenhao Liu; Caiming Xiong; |
1019 | Class-Incremental Learning Based on Label Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus propose a new CIL method (VAG) that also leverages the sparsity of vocabulary to focus the generation and creates pseudo-replay samples by using label semantics. |
Yijia Shao; Yiduo Guo; Dongyan Zhao; Bing Liu; |
1020 | Evaluating Pragmatic Abilities of Image Captioners on A3DS Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Evaluating grounded neural language model performance with respect to pragmatic qualities like the trade-off between truthfulness, contrastivity and overinformativity of generated utterances remains a challenge in absence of data collected from humans. To enable such evaluation, we present a novel open source image-text dataset “Annotated 3D Shapes” (A3DS) comprising over nine million exhaustive natural language annotations and over 12 million variable-granularity captions for the 480,000 images provided by Burgess & Kim (2018). |
Polina Tsvilodub; Michael Franke; |
1021 | The Art of Prompting: Event Detection Based on Type Specific Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We compare various forms of prompts to represent event types and develop a unified framework to incorporate the event type specific prompts for supervised, few-shot, and zero-shot event detection. |
Sijia Wang; Mo Yu; Lifu Huang; |
1022 | Exploring The Impact of Layer Normalization for Zero-shot Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies the impact of layer normalization (LayerNorm) on zero-shot translation (ZST). |
Zhuoyuan Mao; Raj Dabre; Qianying Liu; Haiyue Song; Chenhui Chu; Sadao Kurohashi; |
1023 | Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze how models utilize instructions during IT by comparing model training with altered vs. original instructions. |
Po-Nien Kung; Nanyun Peng; |
1024 | Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new method called self-distilled quantization (SDQ) that minimizes accumulative quantization errors and outperforms baselines. |
James O'Neill; Sourav Dutta; |
1025 | Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In a case study, we find that the regularization plays a more important role than the well-designed modality adaption method, which achieves 29. |
Yuchen Han; Chen Xu; Tong Xiao; Jingbo Zhu; |
1026 | Uncertainty-Aware Bootstrap Learning for Joint Extraction on Distantly-Supervised Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Jointly extracting entity pairs and their relations is challenging when working on distantly-supervised data with ambiguous or noisy labels. To mitigate such impact, we propose uncertainty-aware bootstrap learning, which is motivated by the intuition that the higher uncertainty of an instance, the more likely the model confidence is inconsistent with the ground truths. |
Yufei Li; Xiao Yu; Yanchi Liu; Haifeng Chen; Cong Liu; |
1027 | Text-to-SQL Error Correction with Language Models of Code Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate how to build automatic text-to-SQL error correction models. |
Ziru Chen; Shijie Chen; Michael White; Raymond Mooney; Ali Payani; Jayanth Srinivasa; Yu Su; Huan Sun; |
1028 | The Tail Wagging The Dog: Dataset Construction Biases of Social Bias Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: How reliably can we trust the scores obtained from social bias benchmarks as faithful indicators of problematic social biases in a given model? In this work, we study this question by contrasting social biases with non-social biases that stem from choices made during dataset construction (which might not even be discernible to the human eye). |
Nikil Selvam; Sunipa Dev; Daniel Khashabi; Tushar Khot; Kai-Wei Chang; |
1029 | Summarizing, Simplifying, and Synthesizing Medical Evidence Using GPT-3 (with Varying Success) Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we enlist domain experts (individuals with medical training) to evaluate summaries of biomedical articles generated by GPT-3, given no supervision. |
Chantal Shaib; Millicent Li; Sebastian Joseph; Iain Marshall; Junyi Jessy Li; Byron Wallace; |
1030 | Prefix Propagation: Parameter-Efficient Tuning for Long Sequences Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although such models attain comparable performance with fine-tuning when applied to sequences with short to moderate lengths, we show their inferior performance when modelling long sequences. To bridge this gap, we propose prefix-propagation, a simple but effective approach that conditions prefixes on previous hidden states. |
Jonathan Li; Will Aitken; Rohan Bhambhoria; Xiaodan Zhu; |
1031 | Listener Model for The PhotoBook Referential Game with CLIPScores As Implicit Reference Chain Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Methods developed in the literature, however, cannot be deployed to real gameplay since they only tackle some subtasks of the game, and they require additional reference chain inputs, whose extraction process is imperfect. Therefore, we propose a reference chain-free listener model that directly addresses the game's predictive task, i.e., deciding whether an image is shared with the partner. |
Shih-Lun Wu; Yi-Hui Chou; Liangze Li; |
1032 | Bring More Attention to Syntactic Symmetry for Automatic Postediting of High-Quality Machine Translations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose a linguistically motivated method of regularization that is expected to enhance APE models' understanding of the target language: a loss function that encourages symmetric self-attention on the given MT. Our analysis of experimental results demonstrates that the proposed method helps improve the state-of-the-art architecture's APE quality for high-quality MTs. |
Baikjin Jung; Myungji Lee; Jong-Hyeok Lee; Yunsu Kim; |
1033 | An Embarrassingly Easy But Strong Baseline for Nested Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, previous work ignores spatial relations in the score matrix. In this paper, we propose using Convolutional Neural Network (CNN) to model these spatial relations. |
Hang Yan; Yu Sun; Xiaonan Li; Xipeng Qiu; |
1034 | Hexatagging: Projective Dependency Parsing As Tagging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags. |
Afra Amini; Tianyu Liu; Ryan Cotterell; |
1035 | Understanding Demonstration-based Learning from A Causal Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we build a Structural Causal Model (SCM) to understand demonstration-based learning from causal perspectives and interpret random demonstrations as interventions on the demonstration variable within the causal model. |
Ruiyi Zhang; Tong Yu; |
1036 | RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While ACT has garnered attention in recent years due to its usefulness in real-world applications, progress in the task is currently limited by dataset availability, since most prior approaches rely on supervised methods. To address this limitation, we propose Retrieval and Attribute-Marking enhanced Prompting (RAMP), which leverages large multilingual language models to perform ACT in few-shot and zero-shot settings. |
Gabriele Sarti; Phu Mon Htut; Xing Niu; Benjamin Hsu; Anna Currey; Georgiana Dinu; Maria Nadejde; |
1037 | Zero-Shot and Few-Shot Stance Detection on Varied Topics Via Conditional Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we instead utilize a conditional generation framework and formulate the problem as denoising from partially-filled templates, which can better utilize the semantics among input, label, and target texts. |
Haoyang Wen; Alexander Hauptmann; |
1038 | Discourse-Level Representations Can Improve Prediction of Degree of Anxiety Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we investigate the development of a modern linguistic assessment for degree of anxiety, specifically evaluating the utility of discourse-level information in addition to lexical-level large language model embeddings. |
Swanie Juhng; Matthew Matero; Vasudha Varadarajan; Johannes Eichstaedt; Adithya V Ganesan; H. Andrew Schwartz; |
1039 | Controlling The Extraction of Memorized Data from Large Language Models Via Prompt-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel approach which uses prompt-tuning to control the extraction rates of memorized content in LLMs. |
Mustafa Ozdayi; Charith Peris; Jack FitzGerald; Christophe Dupuy; Jimit Majmudar; Haidar Khan; Rahil Parikh; Rahul Gupta; |
1040 | MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To further improve the performance, we propose MultiTool-CoT, a novel framework that leverages chain-of-thought (CoT) prompting to incorporate multiple external tools, such as a calculator and a knowledge retriever, during the reasoning process. |
Tatsuro Inaba; Hirokazu Kiyomaru; Fei Cheng; Sadao Kurohashi; |
1041 | MPMR: A Multilingual Pre-trained Machine Reader at Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present multilingual Pre-trained Machine Reader (mPMR), a novel method for multilingual machine reading comprehension (MRC)-style pre-training. |
Weiwen Xu; Xin Li; Wai Lam; Lidong Bing; |
1042 | MOSPC: MOS Prediction Based on Pairwise Comparison Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Meanwhile, as each annotator scores multiple audios during annotation, the score is probably a relative value based on the first or the first few speech scores given by the annotator. Motivated by the above two points, we propose a general framework for MOS prediction based on pair comparison (MOSPC), and we utilize C-Mixup algorithm to enhance the generalization performance of MOSPC. |
Kexin Wang; Yunlong Zhao; Qianqian Dong; Tom Ko; Mingxuan Wang; |
1043 | LI-RAGE: Late Interaction Retrieval Augmented Generation with Explicit Signals for Open-Domain Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These fixed vectors can be insufficient to capture fine-grained features of potentially very big tables with heterogeneous row/column information. We address this limitation by 1) applying late interaction models which enforce a finer-grained interaction between question and table embeddings at retrieval time. In addition, we 2) incorporate a joint training scheme of the retriever and reader with explicit table-level signals, and 3) embed a binary relevance token as a prefix to the answer generated by the reader, so we can determine at inference time whether the table used to answer the question is reliable and filter accordingly. |
Weizhe Lin; Rexhina Blloshmi; Bill Byrne; Adria de Gispert; Gonzalo Iglesias; |
1044 | How Well Apply Simple MLP to Incomplete Utterance Rewriting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a simple yet efficient IUR method. |
Jiang Li; Xiangdong Su; Xinlan Ma; Guanglai Gao; |
1045 | XL-LEXEME: WiC Pretrained Model for Cross-Lingual LEXical SEMantic ChangE Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce XL-LEXEME, a Lexical Semantic Change Detection model. |
Pierluigi Cassotti; Lucia Siciliani; Marco DeGemmis; Giovanni Semeraro; Pierpaolo Basile; |
1046 | Theory-Grounded Computational Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this position paper, we argue that computational text analysis lacks and requires organizing principles. |
Arya D. McCarthy; Giovanna Maria Dora Dore; |
1047 | AMRs Assemble! Learning to Ensemble with Autoregressive Models for AMR Parsing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we examine the current state-of-the-art in AMR parsing, which relies on ensemble strategies by merging multiple graph predictions. |
Abelardo Carlos Martínez Lorenzo; Pere Lluís Huguet Cabot; Roberto Navigli; |
1048 | MolXPT: Wrapping Molecules with Text for Generative Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Considering that text is the most important record for scientific discovery, in this paper, we propose MolXPT, a unified language model of text and molecules pre-trained on SMILES (a sequence representation of molecules) wrapped by text. |
Zequn Liu; Wei Zhang; Yingce Xia; Lijun Wu; Shufang Xie; Tao Qin; Ming Zhang; Tie-Yan Liu; |
1049 | A Study on The Efficiency and Generalization of Light Hybrid Retrievers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study “Is it possible to reduce the indexing memory of hybrid retrievers without sacrificing performance?” |
Man Luo; Shashank Jain; Anchit Gupta; Arash Einolghozati; Barlas Oguz; Debojeet Chatterjee; Xilun Chen; Chitta Baral; Peyman Heidari; |
1050 | The Mechanical Bard: An Interpretable Machine Learning Approach to Shakespearean Sonnet Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider the automated generation of sonnets, a poetic form constrained according to meter, rhyme scheme, and length. |
Edwin Agnew; Michelle Qiu; Lily Zhu; Sam Wiseman; Cynthia Rudin; |
1051 | When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first unified study of the efficiency of self-attention-based Transformer variants spanning text, speech and vision. |
Anuj Diwan; Eunsol Choi; David Harwath; |
1052 | Evaluating Zero-Shot Event Structures: Recommendations for Automatic Content Extraction (ACE) Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe ACE's event structures and identify significant ambiguities and issues in current evaluation practice, including (1) coreferent argument mentions, (2) conflicting argument head conventions, and (3) ignorance of modality and event class details. |
Erica Cai; Brendan O'Connor; |
1053 | Event Extraction As Question Generation and Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose QGA-EE, which enables a Question Generation (QG) model to generate questions that incorporate rich contextual information instead of using fixed templates. |
Di Lu; Shihao Ran; Joel Tetreault; Alejandro Jaimes; |
1054 | Are Sample-Efficient NLP Models More Robust? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a large empirical study across three tasks, three broadly-applicable modeling interventions (increasing model size, using a different adaptation method, and pre-training on more data), and 14 diverse datasets to investigate the relationship between sample efficiency (amount of data needed to reach a given ID accuracy) and robustness (how models fare on OOD evaluation). |
Nelson F. Liu; Ananya Kumar; Percy Liang; Robin Jia; |
1055 | Diversity-Aware Coherence Loss for Improving Neural Topic Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel diversity-aware coherence loss that encourages the model to learn corpus-level coherence scores while maintaining high diversity between topics. |
Raymond Li; Felipe Gonzalez-Pizarro; Linzi Xing; Gabriel Murray; Giuseppe Carenini; |
1056 | NarrowBERT: Accelerating Masked Language Model Pretraining and Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose NarrowBERT, a modified transformer encoder that increases the throughput for masked language model pretraining by more than 2x. |
Haoxin Li; Phillip Keung; Daniel Cheng; Jungo Kasai; Noah A. Smith; |
1057 | S3HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a three-stage TextTableQA framework S3HQA, which comprises a retriever, a selector, and a reasoner. |
Fangyu Lei; Xiang Li; Yifan Wei; Shizhu He; Yiming Huang; Jun Zhao; Kang Liu; |
1058 | Towards Fewer Hallucinations in Knowledge-Grounded Dialogue Generation Via Augmentative and Contrastive Knowledge-Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate the hallucination, we take inspiration from human communication, in which people reply with euphemistic responses to unclear or unrecognizable knowledge, and propose an Augmentative and Contrastive Knowledge Dialogue Expansion Framework (ACK-DEF). |
Bin Sun; Yitong Li; Fei Mi; Fanhu Bie; Yiwei Li; Kan Li; |
1059 | AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the research is still stymied by the scarcity of training data. To alleviate this problem, we propose AutoConv for synthetic conversation generation, which takes advantage of the few-shot learning ability and generation capacity of large language models (LLM). |
Siheng Li; Cheng Yang; Yichun Yin; Xinyu Zhu; Zesen Cheng; Lifeng Shang; Xin Jiang; Qun Liu; Yujiu Yang; |
1060 | STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present STT4SG-350, a corpus of Swiss German speech, annotated with Standard German text at the sentence level. |
Michel Plüss; Jan Deriu; Yanick Schraner; Claudio Paonessa; Julia Hartmann; Larissa Schmidt; Christian Scheller; Manuela Hürlimann; Tanja Samardžić; Manfred Vogel; Mark Cieliebak; |
1061 | Teaching Small Language Models to Reason Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these reasoning capabilities only appear to emerge in models with at least tens of billions of parameters. In this paper, we explore the transfer of such reasoning capabilities to smaller models via knowledge distillation, also investigating model and dataset size trade-off. |
Lucie Charlotte Magister; Jonathan Mallinson; Jakub Adamek; Eric Malmi; Aliaksei Severyn; |
1062 | A Simple and Effective Framework for Strict Zero-Shot Hierarchical Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these benchmarks often do not adequately address the challenges posed in the real-world, such as that of hierarchical classification. In order to address this challenge, we propose refactoring conventional tasks on hierarchical datasets into a more indicative long-tail prediction task. |
Rohan Bhambhoria; Lei Chen; Xiaodan Zhu; |
1063 | A Simple Concatenation Can Effectively Improve Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the works of video Transformer, we propose a simple unified cross-modal ST method, which concatenates speech and text as the input, and builds a teacher that can utilize both cross-modal information simultaneously. |
Linlin Zhang; Kai Fan; Boxing Chen; Luo Si; |
1064 | ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these benchmarks lack the controlled example paradigms that would allow us to infer whether a model had truly learned how negation morphemes semantically scope. To fill these analytical gaps, we present the Scoped Negation NLI (ScoNe-NLI) benchmark, which contains contrast sets of six examples with up to two negations where either zero, one, or both negative morphemes affect the NLI label. |
Jingyuan S. She; Christopher Potts; Samuel R. Bowman; Atticus Geiger; |
1065 | Revisiting Automated Prompting: Are We Actually Doing Better? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we revisit techniques for automated prompting on six different downstream tasks and a larger range of K-shot learning settings. |
Yulin Zhou; Yiren Zhao; Ilia Shumailov; Robert Mullins; Yarin Gal; |
1066 | Mind The Gap Between The Application Track and The Real World Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While this demonstrates the growing real-world impact of the field, research papers frequently feature experiments that do not account for the complexities of realistic data and environments. To explore the extent of this gap, we investigate the relationship between the real-world motivations described in NLP papers and the models and evaluation which comprise the proposed solution. |
Ananya Ganesh; Jie Cao; E. Margaret Perkoff; Rosy Southwell; Martha Palmer; Katharina Kann; |
1067 | How to Distill Your BERT: An Empirical Study on The Impact of Weight Initialisation and Distillation Objectives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that attention transfer gives the best performance overall. We also study the impact of layer choice when initializing the student from the teacher layers, finding a significant impact on the performance in task-specific distillation. |
Xinpeng Wang; Leonie Weissweiler; Hinrich Schütze; Barbara Plank; |
1068 | ACTC: Active Threshold Calibration for Cold-Start Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we attempt for the first time cold-start calibration for KGC, where no annotated examples exist initially for calibration, and only a limited number of tuples can be selected for annotation. |
Anastasiia Sedova; Benjamin Roth; |
1069 | Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We thus propose a new architecture, Task-Aware Specialization for dEnse Retrieval (TASER), which enables parameter sharing by interleaving shared and specialized blocks in a single encoder. |
Hao Cheng; Hao Fang; Xiaodong Liu; Jianfeng Gao; |
1070 | Linear Classifier: An Often-Forgotten Baseline for Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Due to the superior performance of these advanced methods, nowadays, people often directly train them for a few epochs and deploy the obtained model. In this opinion paper, we point out that this way may only sometimes get satisfactory results. |
Yu-Chen Lin; Si-An Chen; Jie-Jyun Liu; Chih-Jen Lin; |
1071 | Randomized Positional Encodings Boost Length Generalization of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, simply training on longer sequences is inefficient due to the quadratic computation complexity of the global attention mechanism. In this work, we demonstrate that this failure mode is linked to positional encodings being out-of-distribution for longer sequences (even for relative encodings) and introduce a novel family of positional encodings that can overcome this problem. |
Anian Ruoss; Grégoire Delétang; Tim Genewein; Jordi Grau-Moya; Róbert Csordás; Mehdi Bennani; Shane Legg; Joel Veness; |
1072 | Table and Image Generation for Investigating Knowledge of Entities in Pre-trained Vision and Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a table and image generation task to verify how the knowledge about entities acquired from natural language is retained in Vision & Language (V & L) models. |
Hidetaka Kamigaito; Katsuhiko Hayashi; Taro Watanabe; |
1073 | Improving Grammar-based Sequence-to-Sequence Modeling with Decomposition and Constraints Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study two low-rank variants of Neural QCFG for faster inference with different trade-offs between efficiency and expressiveness. |
Chao Lou; Kewei Tu; |
1074 | TeCS: A Dataset and Benchmark for Tense Consistency of Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a parallel tense test set, containing French-English 552 utterances. |
Yiming Ai; Zhiwei He; Kai Yu; Rui Wang; |
1075 | CWSeg: An Efficient and General Approach to Chinese Word Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we report our efforts in advancing Chinese Word Segmentation for the purpose of rapid deployment in different applications. |
Dedong Li; Rui Zhao; Fei Tan; |
1076 | “Knowledge Is Power”: Constructing Knowledge Graph of Abdominal Organs and Using Them for Automatic Radiology Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we report our work on automatic radiology report generation from radiologists' dictation, which is in collaboration with a startup about to become Unicorn. |
Kaveri Kale; Pushpak Bhattacharyya; Aditya Shetty; Milind Gune; Kush Shrivastava; Rustom Lawyer; Spriha Biswas; |
1077 | Hunt for Buried Treasures: Extracting Unclaimed Embodiments from Patent Specifications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel task of unclaimed embodiment extraction (UEE) and a novel dataset for the task. |
Chikara Hashimoto; Gautam Kumar; Shuichiro Hashimoto; Jun Suzuki; |
1078 | MathPrompter: Mathematical Reasoning Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, no LLMs indicate their level of confidence in their responses, which fuels a trust deficit in these models, impeding their adoption. To address this deficiency, we propose “MathPrompter”, a technique that improves the performance of LLMs on arithmetic problems along with increased reliability in the predictions. |
Shima Imani; Liang Du; Harsh Shrivastava; |
1079 | Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce a scalable framework for supporting fine-grained exploration targets for individual domains via user-defined constraints. |
Mohammad Kachuee; Sungjin Lee; |
1080 | PNLP-Mixer: An Efficient All-MLP Architecture for Language Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces the pNLP-Mixer architecture, an embedding-free MLP-Mixer model for on-device NLP that achieves high weight-efficiency thanks to a novel projection layer. |
Francesco Fusco; Damian Pascual; Peter Staar; Diego Antognini; |
1081 | Extracting Text Representations for Terms and Phrases in Technical Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a fully unsupervised approach to text encoding that consists of training small character-based models with the objective of reconstructing large pre-trained embedding matrices. |
Francesco Fusco; Diego Antognini; |
1082 | CocaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although knowledge distillation techniques have been widely utilized for uni-modal model compression, how to expand them to the situation when the numbers of modalities and teachers/students are doubled has been rarely studied. In this paper, we conduct comprehensive experiments on this topic and propose the fully-Connected knowledge interaction graph (Coca) technique for cross-modal pre-training distillation. |
Jiapeng Wang; Chengyu Wang; Xiaodan Wang; Jun Huang; Lianwen Jin; |
1083 | KG-FLIP: Knowledge-guided Fashion-domain Language-Image Pre-training for E-commerce Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a knowledge-guided fashion-domain language-image pre-training (FLIP) framework that focuses on learning fine-grained representations in the e-commerce domain and utilizes external knowledge (i.e., product attribute schema) to improve the pre-training efficiency. |
Qinjin Jia; Yang Liu; Daoping Wu; Shaoyuan Xu; Huidong Liu; Jinmiao Fu; Roland Vollgraf; Bryan Wang; |
1084 | Domain-specific Transformer Models for Query Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a simple yet effective modification to the transformer training to preserve/correct Grocery brand names in the output while selectively translating the context words. |
Mandar Kulkarni; Nikesh Garera; Anusua Trivedi; |
1085 | Label Efficient Semi-supervised Conversational Intent Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a simple yet competent Semi-Supervised Learning (SSL) approach for label-efficient intent classification. |
Mandar Kulkarni; Kyung Kim; Nikesh Garera; Anusua Trivedi; |
1086 | XPQA: Cross-Lingual Product Question Answering in 12 Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While existing work on PQA focuses mainly on English, in practice there is a need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale annotated cross-lingual PQA dataset in 12 languages, and report results in (1) candidate ranking, to select the best English candidate containing the information to answer a non-English question; and (2) answer generation, to generate a natural-sounding non-English answer based on the selected English candidate. |
Xiaoyu Shen; Akari Asai; Bill Byrne; Adria De Gispert; |
1087 | Learn Over Past, Evolve for Future: Forecasting Temporal Trends for Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, very few existing works consider the temporal shift issue caused by the rapidly-evolving nature of news data in practice, resulting in significant performance degradation when training on past data and testing on future data. In this paper, we observe that the appearances of news events on the same topic may display discernible patterns over time, and posit that such patterns can assist in selecting training instances that could make the model adapt better to future data. |
Beizhe Hu; Qiang Sheng; Juan Cao; Yongchun Zhu; Danding Wang; Zhengjia Wang; Zhiwei Jin; |
1088 | AVEN-GR: Attribute Value Extraction and Normalization Using Product GRaphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper makes three contributions to QAU. First, we propose a novel end-to-end approach that jointly solves Named Entity Recognition (NER) and Entity Linking (NEL) and enables open-world reasoning for QAU. Second, we introduce a novel method for utilizing product graphs to enhance the representation of query entities. Finally, we present a new dataset constructed from public sources that can be used to evaluate the performance of future QAU systems. |
Thomas Ricatte; Donato Crisostomi; |
1089 | GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the deployment of knowledge distillation systems faces great challenges in real-world industrial-strength applications, which require the use of complex distillation methods on even larger-scale PLMs (over 10B), limited by memory on GPUs and the switching of methods. To overcome these challenges, we propose GKD, a general knowledge distillation framework that supports distillation on larger-scale PLMs using various distillation methods. |
Shicheng Tan; Weng Lam Tam; Yuanchun Wang; Wenwen Gong; Shu Zhao; Peng Zhang; Jie Tang; |
1090 | FashionKLIP: Enhancing E-Commerce Image-Text Retrieval with Fashion Multi-Modal Conceptual Knowledge Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate the problem, this paper proposes a novel e-commerce knowledge-enhanced VLP model FashionKLIP. |
Xiaodan Wang; Chengyu Wang; Lei Li; Zhixu Li; Ben Chen; Linbo Jin; Jun Huang; Yanghua Xiao; Ming Gao; |
1091 | Entity Contrastive Learning in A Large-Scale Virtual Assistant System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present both offline results, using retrospective test sets, as well as live online results from an A/B test that compared the two systems. |
Jonathan Rubin; Jason Crowley; George Leung; Morteza Ziyadi; Maria Minakova; |
1092 | Tab-Cleaner: Weakly Supervised Tabular Data Cleaning Via Pre-training for E-commerce Catalog Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Tab-Cleaner, a model designed to handle error detection over text-rich tabular data following a pre-training / fine-tuning paradigm. |
Kewei Cheng; Xian Li; Zhengyang Wang; Chenwei Zhang; Binxuan Huang; Yifan Ethan Xu; Xin Luna Dong; Yizhou Sun; |
1093 | Toward More Accurate and Generalizable Evaluation Metrics for Task-Oriented Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this contribution, we show that: (i) while dialog quality cannot be completely decomposed into dialog-level attributes, there is a strong relationship between some objective dialog attributes and judgments of dialog quality; (ii) for the task of dialog-level quality estimation, a supervised model trained on dialog-level annotations outperforms methods based purely on aggregating turn-level features; and (iii) the proposed evaluation model shows better domain generalization ability compared to the baselines. |
Abishek Komma; Nagesh Panyam Chandrasekarasastry; Timothy Leffel; Anuj Goyal; Angeliki Metallinou; Spyros Matsoukas; Aram Galstyan; |
1094 | Tab-CQA: A Tabular Conversational Question Answering Dataset on Financial Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Tab-CQA, a tabular CQA dataset created from Chinese financial reports that are extracted from listed companies in a wide range of different sectors in the past 30 years. |
Chuang Liu; Junzhuo Li; Deyi Xiong; |
1095 | KoSBI: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present KosBi, a new social bias dataset of 34k pairs of contexts and sentences in Korean covering 72 demographic groups in 15 categories. |
Hwaran Lee; Seokhee Hong; Joonsuk Park; Takyoung Kim; Gunhee Kim; Jung-woo Ha; |
1096 | Improving Knowledge Production Efficiency With Question Answering on Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The challenges of conversation-based QA include: 1) answers may be scattered among multiple dialogue turns; 2) understanding complex dialogue contexts is more complicated than documents. To address these challenges, we propose a multi-span extraction model on this task and introduce continual pre-training and multi-task learning schemes to further improve model performance. |
Changlin Yang; Siye Liu; Sen Hu; Wangshu Zhang; Teng Xu; Jing Zheng; |
1097 | Mitigating The Burden of Redundant Datasets Via Batch-Wise Unique Samples and Frequency-Aware Losses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a simple yet effective solution to reduce the increased burden of repeated computation on redundant datasets. |
Donato Crisostomi; Andrea Caciolai; Alessandro Pedrani; Kay Rottmann; Alessandro Manzotti; Enrico Palumbo; Davide Bernardi; |
1098 | Distilled Language Models Are Economically Efficient for The Enterprise. …mostly Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper assesses the practical cost and impact of LLMs for the enterprise as a function of the usefulness of the responses that they generate. We present a cost framework for evaluating an NLP model’s utility for this use case and apply it to a single brand as a case study in the context of an existing agent assistance product. |
Kristen Howell; Gwen Christian; Pavel Fomitchov; Gitit Kehat; Julianne Marzulla; Leanne Rolston; Jadin Tredup; Ilana Zimmerman; Ethan Selfridge; Joseph Bradley; |
1099 | Application-Agnostic Language Modeling for On-Device ASR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two novel feed-forward architectures that find an optimal trade off between different on-device constraints. |
Markus Nussbaum-thom; Lyan Verwimp; Youssef Oualil; |
1100 | Building Accurate Low Latency ASR for Streaming Voice Search in E-commerce Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we build accurate LSTM, attention and CTC based streaming ASR models for large-scale Hinglish (blend of Hindi and English) Voice Search. |
Abhinav Goyal; Nikesh Garera; |
1101 | PLAtE: A Large-scale Dataset for List Page Web Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the PLAtE (Pages of Lists Attribute Extraction) benchmark dataset as a challenging new web extraction task. |
Aidan San; Yuan Zhuang; Jan Bakus; Colin Lockard; David Ciemiewicz; Sandeep Atluri; Kevin Small; Yangfeng Ji; Heba Elfardy; |
1102 | Rapid Diffusion: Building Domain-Specific Text-to-Image Synthesizers with Fast Inference Speed Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Rapid Diffusion, a novel framework for training and deploying super-resolution, text-to-image latent diffusion models with rich entity knowledge injected and optimized networks. |
Bingyan Liu; Weifeng Lin; Zhongjie Duan; Chengyu Wang; Wu Ziheng; Zhang Zipeng; Kui Jia; Lianwen Jin; Cen Chen; Jun Huang; |
1103 | Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Automatically identifying these attribute values from an eCommerce product page that contains both text and images is a challenging task, especially when the attribute value is not explicitly mentioned in the catalog. In this paper, we present a scalable solution for this problem where we pose the attribute extraction problem as a question-answering task, which we solve using MXT, which consists of three key components: (i) MAG (Multimodal Adaptation Gate), (ii) Xception network, and (iii) T5 encoder-decoder. |
Anant Khandelwal; Happy Mittal; Shreyas Kulkarni; Deepak Gupta; |
1104 | Consistent Text Categorization Using Data Augmentation in E-Commerce Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This phenomenon can negatively affect downstream recommendation or search applications, leading to a sub-optimal user experience. To address this issue, we propose a new framework for consistent text categorization. |
Noa Avigdor; Guy Horowitz; Ariel Raviv; Stav Yanovsky Daye; |
1105 | An Efficient Method for Natural Language Querying on Structured Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an efficient and reliable approach to Natural Language Querying (NLQ) on databases (DB) which is not based on text-to-SQL type semantic parsing. |
Hanoz Bhathena; Aviral Joshi; Prateek Singh; |
1106 | Boosting Transformers and Language Models for Clinical Prediction in Immunotherapy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the use of transformers and language models in prognostic prediction for immunotherapy using real-world patients� clinical data and molecular profiles. |
Zekai Chen; Mariann Micsinai Balan; Kevin Brown; |
1107 | EvolveMT: An Ensemble MT Engine Improving Itself with Usage Only Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a method named EvolveMT for the efficient combination of multiple machine translation (MT) engines. |
Kamer Yüksel; Ahmet Gunduz; Mohamed Al-badrashiny; Hassan Sawaf;
1108 | A Static Evaluation of Code Completion By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees. |
Hantian Ding; Varun Kumar; Yuchen Tian; Zijian Wang; Rob Kwiatkowski; Xiaopeng Li; Murali Krishna Ramanathan; Baishakhi Ray; Parminder Bhatia; Sudipta Sengupta; |
1109 | Scalable and Safe Remediation of Defective Actions in Self-Learning Conversational Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a method for curating and leveraging high-precision samples sourced from historical regression incident reports to validate, safeguard, and improve policies prior to online deployment. |
Sarthak Ahuja; Mohammad Kachuee; Fatemeh Sheikholeslami; Weiqing Liu; Jaeyoung Do; |
1110 | MobileNMT: Enabling Translation in 15MB and 30ms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present MobileNMT, a system that can translate in 15MB and 30ms on devices. |
Ye Lin; Xiaohui Wang; Zhexi Zhang; Mingxuan Wang; Tong Xiao; Jingbo Zhu; |
1111 | Multi-doc Hybrid Summarization Via Salient Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed a multi-document hybrid summarization approach, which simultaneously generates a human-readable summary and extracts corresponding key evidences based on multi-doc inputs. |
Min Xiao; |
1112 | SaFER: A Robust and Efficient Framework for Fine-tuning BERT-based Classifier with Noisy Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SaFER, a robust and efficient fine-tuning framework for BERT-based text classifiers, combating label noises without access to any clean data for training or validation. |
Zhenting Qi; Xiaoyu Tan; Chao Qu; Yinghui Xu; Yuan Qi; |
1113 | Chemical Language Understanding Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the benchmark datasets named CLUB (Chemical Language Understanding Benchmark) to facilitate NLP research in the chemical industry. |
Yunsoo Kim; Hyuk Ko; Jane Lee; Hyun Young Heo; Jinyoung Yang; Sungsoo Lee; Kyu-hwang Lee; |
1114 | HyperT5: Towards Compute-Efficient Korean Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the sequence-to-sequence (seq2seq) language model architecture as a more practical and compute-efficient alternative to the decoder-oriented approach (e.g., GPT-3), accompanied by novel findings in compute-optimality analyses. |
Dongju Park; Soonwon Ka; Kang Min Yoo; Gichang Lee; Jaewook Kang; |
1115 | Semantic Ambiguity Detection in Sentence Classification Using Task-Specific Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, because of the structural limitations of the service, there may not be sufficient contextual information to resolve the ambiguity. In this situation, we focus on ambiguity detection so that services can be designed with ambiguity in mind. |
Jong Myoung Kim; Young-jun Lee; Sangkeun Jung; Ho-jin Choi; |
1116 | Reliable and Interpretable Drift Detection in Streams of Short Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study we propose an end-to-end framework for reliable model-agnostic change-point detection and interpretation in large task-oriented dialog systems, proven effective in multiple customer deployments. |
Ella Rabinovich; Matan Vetzler; Samuel Ackerman; Ateret Anaby Tavor; |
1117 | Sharing Encoder Representations Across Languages, Domains and Tasks in Large-Scale Spoken Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate using a larger 170M parameter BERT encoder that shares representations across languages, domains and tasks for SLU compared to using smaller 17M parameter BERT encoders with language-, domain- and task-decoupled finetuning. |
Jonathan Hueser; Judith Gaspers; Thomas Gueudre; Chandana Prakash; Jin Cao; Daniil Sorokin; Quynh Do; Nicolas Anastassacos; Tobias Falke; Turan Gojayev; |
1118 | Annotating Research Infrastructure in Scientific Papers: An NLP-driven Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a natural language processing (NLP) pipeline for the identification, extraction and linking of Research Infrastructure (RI) used in scientific publications. |
Seyed Amin Tabatabaei; Georgios Cheirmpos; Marius Doornenbal; Alberto Zigoni; Veronique Moore; Georgios Tsatsaronis; |
1119 | Event-Centric Query Expansion in Web Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present Event-Centric Query Expansion (EQE), the QE system used in a famous Chinese search engine. |
Yanan Zhang; Weijie Cui; Yangfan Zhang; Xiaoling Bai; Zhe Zhang; Jin Ma; Xiang Chen; Tianhua Zhou; |
1120 | Transferable and Efficient: Unifying Dynamic Multi-Domain Product Categorization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to unify the categorization process and ensure efficiency, we propose a two-stage taxonomy-agnostic framework that relies solely on calculating the semantic relatedness between product titles and category names in the vector space. |
Shansan Gong; Zelin Zhou; Shuo Wang; Fengjiao Chen; Xiujie Song; Xuezhi Cao; Yunsen Xian; Kenny Zhu; |
1121 | DISCOSQA: A Knowledge Base Question Answering System for Space Debris Based on Program Induction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we present a system, developed for the European Space Agency, that can answer complex natural language queries, to support engineers in accessing the information contained in a KB that models the orbital space debris environment. |
Paul Darm; Antonio Valerio Miceli Barone; Shay B. Cohen; Annalisa Riccardi; |
1122 | BADGE: Speeding Up BERT Inference After Deployment Via Block-wise Bypasses and Divergence-based Early Exiting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel framework, BADGE, which consists of two off-the-shelf methods for improving PLMs’ early exiting. |
Wei Zhu; Peng Wang; Yuan Ni; Guotong Xie; Xiaoling Wang; |
1123 | K-pop and Fake Facts: from Texts to Smart Alerting for Maritime Security Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a use case for suspect data identification in a maritime setting. |
Maxime Prieur; Souhir Gahbiche; Guillaume Gadek; Sylvain Gatepaille; Kilian Vasnier; Valerian Justine; |
1124 | Evaluating Embedding APIs for Information Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With a growing number of APIs at our disposal, in this paper, our goal is to analyze semantic embedding APIs in realistic retrieval scenarios in order to assist practitioners and researchers in finding suitable services according to their needs. |
Ehsan Kamalloo; Xinyu Zhang; Odunayo Ogundepo; Nandan Thakur; David Alfonso-hermelo; Mehdi Rezagholizadeh; Jimmy Lin; |
1125 | Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent methods with stochastic gradient learning have been shown to struggle in such setups or to have limitations such as memory buffers or being restricted to specific domains, which prevents their use in real-world scenarios. For this reason, we present a fully differentiable architecture based on the Mixture of Experts model, which enables the training of high-performance classifiers when examples from each class are presented separately. |
Mateusz Wójcik; Witold Kosciukiewicz; Mateusz Baran; Tomasz Kajdanowicz; Adam Gonczarek;
1126 | Regression-Free Model Updates for Spoken Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multiple techniques for that have been proposed in the recent literature. In this paper, we apply one such technique, focal distillation, to model updates in a goal-oriented dialog system and assess its usefulness in practice. |
Andrea Caciolai; Verena Weber; Tobias Falke; Alessandro Pedrani; Davide Bernardi; |
1127 | Reducing Cohort Bias in Natural Language Understanding Systems with Targeted Self-training Scheme Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on reducing the bias related to new customers in a digital voice assistant system. |
Dieu-thu Le; Gabriela Hernandez; Bei Chen; Melanie Bradford; |
1128 | Content Moderation for Evolving Policies Using Binary Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to model content moderation as a binary question answering problem where the questions validate the loosely coupled themes constituting a policy. |
Sankha Subhra Mullick; Mohan Bhambhani; Suhit Sinha; Akshat Mathur; Somya Gupta; Jidnya Shah; |
1129 | Weighted Contrastive Learning With False Negative Control to Help Long-tailed Product Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the FN issue in the KCL, we propose to re-weight the positive pairs in the KCL loss with a regularization that constrains the sum of weights to be as close to K+1 as possible. |
Tianqi Wang; Lei Chen; Xiaodan Zhu; Younghun Lee; Jing Gao; |
1130 | Towards Building A Robust Toxicity Predictor Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel adversarial attack, \texttt{ToxicTrap}, introducing small word-level perturbations to fool SOTA text classifiers to predict toxic text samples as benign. |
Dmitriy Bespalov; Sourav Bhabesh; Yi Xiang; Liutong Zhou; Yanjun Qi; |
1131 | AI Coach Assist: An Automated Approach for Call Recommendation in Contact Centers for Agent Coaching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present “AI Coach Assist”, which leverages pre-trained transformer-based language models to determine whether a given call is coachable or not based on the quality assurance (QA) queries/questions asked by the contact center managers or supervisors. |
Md Tahmid Rahman Laskar; Cheng Chen; Xue-yong Fu; Mahsa Azizi; Shashi Bhushan; Simon Corston-oliver; |
1132 | Unified Contextual Query Rewriting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a unified contextual query rewriting model that unifies QR for both reducing friction and contextual carryover purpose. |
Yingxue Zhou; Jie Hao; Mukund Rungta; Yang Liu; Eunah Cho; Xing Fan; Yanbin Lu; Vishal Vasudevan; Kellen Gillespie; Zeynab Raeesy; |
1133 | Context-Aware Query Rewriting for Improving Users’ Search Experience on E-commerce Websites Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing query rewriting models ignore users’ history behaviors and consider only the instant search query, which is often a short string offering limited information about the true shopping intent. We propose an end-to-end context-aware query rewriting model to bridge this gap, which takes the search context into account. |
Simiao Zuo; Qingyu Yin; Haoming Jiang; Shaohui Xi; Bing Yin; Chao Zhang; Tuo Zhao; |
1134 | Federated Learning of Gboard Language Models with Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The recent DP-Follow the Regularized Leader (DP-FTRL) algorithm is applied to achieve meaningfully formal DP guarantees without requiring uniform sampling of clients. To provide favorable privacy-utility trade-offs, we introduce a new client participation criterion and discuss the implication of its configuration in large scale systems. |
Zheng Xu; Yanxiang Zhang; Galen Andrew; Christopher Choquette; Peter Kairouz; Brendan Mcmahan; Jesse Rosenstock; Yuanbo Zhang; |
1135 | RadLing: Towards Efficient Radiology Report Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our main contribution in this paper is knowledge-aware masking which is an taxonomic knowledge-assisted pre-training task that dynamically masks tokens to inject knowledge during pretraining. |
Rikhiya Ghosh; Oladimeji Farri; Sanjeev Kumar Karn; Manuela Danu; Ramya Vunikili; Larisa Micu; |
1136 | Predicting Customer Satisfaction with Soft Labels for Ordinal Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In a typical call center, only up to 8% of callers leave a Customer Satisfaction (CSAT) survey response at the end of the call, and these tend to be customers with strongly positive or negative experiences. To manage this data sparsity and response bias, we outline a predictive CSAT deep learning algorithm that infers CSAT on the 1-5 scale on inbound calls to the call center with minimal latency. |
Etienne Manderscheid; Matthias Lee; |
1137 | Accurate Training of Web-based Question Answering Systems with Feedback from Ranked Users Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first collect a large scale (16M) QA dataset with real feedback sampled from the QA traffic of a popular Virtual Assistant. Second, we use this data to develop two strategies for filtering unreliable users and thus de-noise feedback: (i) ranking users with an automatic classifier, and (ii) aggregating feedback over similar instances and comparing users between each other. |
Liang Wang; Ivano Lauriola; Alessandro Moschitti; |
1138 | SPM: A Split-Parsing Method for Joint Multi-Intent Detection and Slot Filling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Meanwhile, it lacks the ability to assign slots to each corresponding intent. To overcome these problems, we propose a Split-Parsing Method (SPM) for joint multiple intent detection and slot filling, which is a two-stage method. |
Sheng Jiang; Su Zhu; Ruisheng Cao; Qingliang Miao; Kai Yu; |
1139 | NAG-NER: A Unified Non-Autoregressive Generation Framework for Various NER Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a unified non-autoregressive generation (NAG) framework for general NER tasks, referred to as NAG-NER. |
Xinpeng Zhang; Ming Tan; Jingfan Zhang; Wei Zhu; |
1140 | Search Query Spell Correction with Weak Supervision in E-commerce Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We overcome the constraint of limited human labelled data by proposing novel synthetic data generation techniques for voluminous generation of training pairs needed by data hungry Transformers, without any human intervention. |
Vishal Kakkar; Chinmay Sharma; Madhura Pande; Surender Kumar; |
1141 | “Let’s Not Quote Out of Context”: Unified Vision-Language Pretraining for Context Assisted Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Manual creation and updates to ensure the same are non-trivial given the scale and the tedium of this task. We propose a new unified Vision-Language (VL) model based on the One For All (OFA) model, with a focus on context-assisted image captioning where the caption is generated based on both the image and its context. |
Abisek Rajakumar Kalarani; Pushpak Bhattacharyya; Niyati Chhaya; Sumit Shekhar; |
1142 | What, When, and How to Ground: Designing User Persona-Aware Conversational Agents for Engaging Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a method for building a personalized open-domain dialogue system to address the WWH (WHAT, WHEN, and HOW) problem for natural response generation in a commercial setting, where personalized dialogue responses are heavily interleaved with casual response turns. |
Deuksin Kwon; Sunwoo Lee; Ki Hyun Kim; Seojin Lee; Taeyoon Kim; Eric Davis; |
1143 | CUPID: Curriculum Learning Based Real-Time Prediction Using Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While cross-attention in Transformers enables a more accurate relevance prediction in such a setting, its high evaluation latency makes it unsuitable for real-time predictions in which thousands of products must be evaluated against a user query within a few milliseconds. To address this issue, we propose CUPID: a Curriculum learning based real-time Prediction using Distillation that utilizes knowledge distillation within a curriculum learning setting to learn a simpler architecture that can be evaluated within low latency budgets. |
Arindam Bhattacharya; Ankith Ms; Ankit Gandhi; Vijay Huddar; Atul Saroop; Rahul Bhagat; |
1144 | Answering Unanswered Questions Through Semantic Reformulations in Spoken QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a Semantic Question Reformulation (SURF) model offering three linguistically-grounded operations (repair, syntactic reshaping, generalization) to rewrite questions to facilitate answering. |
Pedro Faustini; Zhiyu Chen; Besnik Fetahu; Oleg Rokhlenko; Shervin Malmasi; |
1145 | Exploring Zero and Few-shot Techniques for Intent Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Scaling to so many customers puts a constraint on storage space as well. In this paper, we explore four different zero and few-shot intent classification approaches with this low-resource constraint: 1) domain adaptation, 2) data augmentation, 3) zero-shot intent classification using descriptions with large language models (LLMs), and 4) parameter-efficient fine-tuning of instruction-finetuned language models. |
Soham Parikh; Mitul Tiwari; Prashil Tumbade; Quaizar Vohra; |
1146 | Referring to Screen Texts with Voice Assistants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, assistants have limited capacity to understand their users’ context. In this work, we aim to take a step in this direction. |
Shruti Bhargava; Anand Dhoot; Ing-marie Jonsson; Hoang Long Nguyen; Alkesh Patel; Hong Yu; Vincent Renkens; |
1147 | Generate-then-Retrieve: Intent-Aware FAQ Retrieval in Product Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our proposed intent-aware FAQ retrieval consists of (1) an intent classifier that predicts whether the query is looking for an FAQ; (2) a reformulation model that rewrites the query into a natural question. |
Zhiyu Chen; Jason Choi; Besnik Fetahu; Oleg Rokhlenko; Shervin Malmasi; |
1148 | KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we perform the first empirical study of image ad understanding through the lens of pre-trained VLMs. |
Zhiwei Jia; Pradyumna Narayana; Arjun Akula; Garima Pruthi; Hao Su; Sugato Basu; Varun Jampani; |
1149 | Weakly Supervised Hierarchical Multi-task Classification of Customer Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a weakly supervised Hierarchical Multi-task Classification Framework (HMCF) to identify topics from customer questions at various granularities. |
Jitenkumar Rana; Promod Yenigalla; Chetan Aggarwal; Sandeep Sricharan Mukku; Manan Soni; Rashmi Patange; |
1150 | Automated Digitization of Unstructured Medical Prescriptions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a prescription digitization system for online medicine ordering built with minimal supervision. |
Megha Sharma; Tushar Vatsal; Srujana Merugu; Aruna Rajan; |