Paper Digest: NAACL 2024 Papers & Highlights
The North American Chapter of the Association for Computational Linguistics (NAACL) is one of the top natural language processing conferences in the world. To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights to quickly get the main idea of each paper.
To search or review papers within NAACL-2024 related to a specific topic, please use the search by venue (NAACL-2024), review by venue (NAACL-2024), and question answering by venue (NAACL-2024) services. To browse papers by author, here is a list of all authors (NAACL-2024). You may also like to explore our “Best Paper” Digest (NAACL), which lists the most influential NAACL papers since 2000.
This list is created by the Paper Digest Team. Experience the cutting-edge capabilities of Paper Digest, an innovative AI-powered research platform that empowers you to write, review, get answers and more. Try us today and unlock the full potential of our services for free!
TABLE 1: Paper Digest: NAACL 2024 Papers & Highlights
# | Paper | Highlight | Author(s)
---|---|---|---
1 | Named Entity Recognition Under Domain Shift Via Metric Learning for Life Sciences | We investigate the applicability of transfer learning for enhancing a named entity recognition model trained in the biomedical domain (the source domain) to be used in the chemical domain (the target domain). | Hongyi Liu; Qingyun Wang; Payam Karisani; Heng Ji
2 | Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation | In this work, we propose SeqDiffuSeq, a text diffusion model, to approach sequence-to-sequence text generation with an encoder-decoder Transformer architecture. | Hongyi Yuan; Zheng Yuan; Chuanqi Tan; Fei Huang; Songfang Huang
3 | An Interactive Framework for Profiling News Media Sources | In this paper, we propose an interactive framework for news media profiling. | Nikhil Mehta; Dan Goldwasser
4 | Assessing Logical Puzzle Solving in Large Language Models: Insights from A Minesweeper Case Study | In our research, we introduce a novel task, Minesweeper, specifically designed in a format unfamiliar to LLMs and absent from their training datasets. | Yinghao Li; Haorui Wang; Chao Zhang
5 | TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in Conversation | In this paper, we propose Teacher-leading Multimodal fusion network for ERC (TelME). | Taeyang Yun; Hyunkuk Lim; Jeonghwan Lee; Min Song
6 | Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries | However, the approach is less suited for scaling to new domains or new annotation languages, where fine-tuning data is unavailable. To address this problem, we handle the task of conversation retrieval based on text summaries of the conversations. | Seanie Lee; Jianpeng Cheng; Joris Driesen; Alexandru Coca; Anders Johannsen
7 | Promptly Predicting Structures: The Return of Inference | In this paper, we present a framework for constructing zero- and few-shot linguistic structure predictors. | Maitrey Mehta; Valentina Pyatkin; Vivek Srikumar
8 | On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL | This work investigates the linear handling of structured data in encoder-decoder language models, specifically T5. | Yutong Shao; Ndapa Nakashole
9 | Extractive Summarization with Text Generator | In contrast, text generators, which are commonly employed in abstractive summarization, can effortlessly overcome this predicament on account of flexible sequence-to-sequence architectures. Motivated to bypass this inherent limitation, we investigate the possibility of conducting extractive summarization with text generators. | Thang Le; Anh Tuan Luu
10 | Self-generated Replay Memories for Continual Neural Machine Translation | In this work, we leverage a key property of encoder-decoder Transformers, i.e., their generative ability, to propose a novel approach to continually learning Neural Machine Translation systems. | Michele Resta; Davide Bacciu
11 | Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models | Vision-language models (VLMs) have recently demonstrated strong efficacy as visual assistants that can parse natural queries about the visual content and generate human-like outputs. In this work, we explore the ability of these models to demonstrate human-like reasoning based on the perceived information. | Yangyi Chen; Karan Sikka; Michael Cogswell; Heng Ji; Ajay Divakaran
12 | Building Knowledge-Guided Lexica to Model Cultural Variation | In this work, we introduce a new research problem for the NLP community: How do we measure variation in cultural constructs across regions using language? | Shreya Havaldar; Salvatore Giorgi; Sunny Rai; Thomas Talhelm; Sharath Chandra Guntuku; Lyle Ungar
13 | Adaptive Rank Selections for Low-Rank Approximation of Language Models | In this work, we propose a novel binary masking mechanism for optimizing the number of ranks in a differentiable framework. | Shangqian Gao; Ting Hua; Yen-Chang Hsu; Yilin Shen; Hongxia Jin
14 | An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation | In this paper, we conduct empirical studies on intra-modal and cross-modal consistency and propose two training strategies, SimRegCR and SimZeroCR, for E2E ST in regular and zero-shot scenarios. | Pengzhi Gao; Ruiqing Zhang; Zhongjun He; Hua Wu; Haifeng Wang
15 | Unleashing The Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent Through Multi-Persona Self-Collaboration | In this work, we propose Solo Performance Prompting (SPP), which transforms a single LLM into a cognitive synergist by engaging in multi-turn self-collaboration with multiple personas. | Zhenhailong Wang; Shaoguang Mao; Wenshan Wu; Tao Ge; Furu Wei; Heng Ji
16 | FPT: Feature Prompt Tuning for Few-shot Readability Assessment | Moreover, previous studies on utilizing linguistic features have shown non-robust performance in few-shot settings and may even impair model performance. To address these issues, we propose a novel prompt-based tuning framework that incorporates rich linguistic knowledge, called Feature Prompt Tuning (FPT). | Ziyang Wang; Sanwoo Lee; Hsiu-Yuan Huang; Yunfang Wu
17 | Self-Prompting Large Language Models for Zero-Shot Open-Domain QA | In this paper, we propose a Self-Prompting framework to explicitly utilize the massive knowledge encoded in the parameters of LLMs and their strong instruction understanding abilities. | Junlong Li; Jinyuan Wang; Zhuosheng Zhang; Hai Zhao
18 | Head-to-Tail: How Knowledgeable Are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs? | Since the recent prosperity of Large Language Models (LLMs), there have been interleaved discussions regarding how to reduce hallucinations from LLM responses, how to increase the factuality of LLMs, and whether Knowledge Graphs (KGs), which store the world knowledge in a symbolic form, will be replaced with LLMs. In this paper, we try to answer these questions from a new angle: How knowledgeable are LLMs? To answer this question, we constructed Head-to-Tail, a benchmark that consists of 18K question-answer (QA) pairs regarding head, torso, and tail facts in terms of popularity. | Kai Sun; Yifan Xu; Hanwen Zha; Yue Liu; Xin Luna Dong
19 | KNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning | We introduce k Nearest Neighbor In-Context Learning (kNN-ICL), which simplifies prompt engineering by allowing it to be built on top of any design strategy while providing access to all demo examples. | Wenting Zhao; Ye Liu; Yao Wan; Yibo Wang; Qingyang Wu; Zhongfen Deng; Jiangshu Du; Shuaiqi Liu; Yunlong Xu; Philip Yu
20 | ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems | We introduce ARES, an Automated RAG Evaluation System, for evaluating RAG systems along the dimensions of context relevance, answer faithfulness, and answer relevance. | Jon Saad-Falcon; Omar Khattab; Christopher Potts; Matei Zaharia
21 | DEMO: A Statistical Perspective for Efficient Image-Text Matching | However, the similarity structure could be biased at the boundaries of semantic distributions, causing error accumulation during sequential optimization. To tackle this, we introduce a novel hashing approach termed Distribution-based Structure Mining with Consistency Learning (DEMO) for efficient image-text matching. | Fan Zhang; Xian-Sheng Hua; Chong Chen; Xiao Luo
22 | SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning | We present SeaEval, a benchmark for multilingual foundation models. | Bin Wang; Zhengyuan Liu; Xin Huang; Fangkai Jiao; Yang Ding; AiTi Aw; Nancy Chen
23 | Volcano: Mitigating Multimodal Hallucination Through Self-Feedback Guided Revision | Recent works have conjectured that one of the reasons behind multimodal hallucination is the vision encoder failing to ground on the image properly. To mitigate this issue, we propose a novel approach that leverages self-feedback as visual cues. | Seongyun Lee; Sue Park; Yongrae Jo; Minjoon Seo
24 | LLMs Are Few-Shot In-Context Low-Resource Language Learners | Nonetheless, only a handful of works have explored ICL for low-resource languages, with most of them focusing on relatively high-resource languages, such as French and Spanish. In this work, we extensively study ICL and its cross-lingual variation (X-ICL) on 25 low-resource and 7 relatively higher-resource languages. | Samuel Cahyawijaya; Holy Lovenia; Pascale Fung
25 | Simple and Effective Data Augmentation for Compositional Generalization | Compositional generalization, the ability to predict complex meanings from training on simpler sentences, poses challenges for powerful pretrained seq2seq models. In this paper, we show that data augmentation methods that sample MRs and backtranslate them can be effective for compositional generalization, but only if we sample from the right distribution. | Yuekun Yao; Alexander Koller
26 | Rethinking Tabular Data Understanding with Large Language Models | Large Language Models (LLMs) have been shown to be capable of various tasks, yet their capability in interpreting and reasoning over tabular data remains an underexplored area. In this context, this study investigates three core perspectives: the robustness of LLMs to structural perturbations in tables, the comparative analysis of textual and symbolic reasoning on tables, and the potential of boosting model performance through the aggregation of multiple reasoning pathways. | Tianyang Liu; Fei Wang; Muhao Chen
27 | From Shortcuts to Triggers: Backdoor Defense with Denoised PoE | In this paper, we propose an end-to-end ensemble-based backdoor defense framework, DPoE (Denoised Product-of-Experts), which is inspired by the shortcut nature of backdoor attacks, to defend against various backdoor attacks. | Qin Liu; Fei Wang; Chaowei Xiao; Muhao Chen
28 | BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain | Given that accounting databases are used worldwide, particularly by non-technical people, there is an imminent need to develop models that could help extract information from accounting databases via natural language queries. In this resource paper, we aim to fill this gap by proposing a new large-scale Text-to-SQL dataset for the accounting and financial domain: BookSQL. | Rahul Kumar; Amar Raja Dibbu; Shrutendra Harsola; Vignesh Subrahmaniam; Ashutosh Modi
29 | FLAP: Flow-Adhering Planning with Constrained Decoding in LLMs | To study this, we propose the problem of faithful planning in TODs that needs to resolve user intents by following predefined flows and preserving API dependencies. To solve this problem, we propose FLAP, a Flow-Adhering Planning algorithm based on constrained decoding with a lookahead heuristic for LLMs. | Shamik Roy; Sailik Sengupta; Daniele Bonadiman; Saab Mansour; Arshit Gupta
30 | DuRE: Dual Contrastive Self Training for Semi-Supervised Relation Extraction | However, existing ST methods in RE fail to tackle the challenge of long-tail relations. In this work, we propose DuRE, a novel ST framework to tackle these problems. | Yuxi Feng; Laks Lakshmanan
31 | Query-Efficient Textual Adversarial Example Generation for Black-Box Attacks | In this paper, we propose a new approach that guides the word substitutions using prior knowledge from the training set to improve the attack efficiency. | Zhen Yu; Zhenhua Chen; Kun He
32 | Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and A Case Study on Summarizing Diverse Information from News Articles | In this paper, we propose a new task of summarizing diverse information encountered in multiple news articles encompassing the same event. | Kung-Hsiang Huang; Philippe Laban; Alexander Fabbri; Prafulla Kumar Choubey; Shafiq Joty; Caiming Xiong; Chien-Sheng Wu
33 | AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation | However, previous approaches to generating perturbed summaries are either of low coherence or lack error-type coverage. To address these issues, we propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs). | Haoyi Qiu; Kung-Hsiang Huang; Jingnong Qu; Nanyun Peng
34 | PILOT: Legal Case Outcome Prediction with Case Law | In this paper, we propose a new framework named PILOT (PredictIng Legal case OuTcome) for case outcome prediction. | Lang Cao; Zifeng Wang; Cao Xiao; Jimeng Sun
35 | ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models | Recognizing the need for more flexible downstream task adaptation, we extend the methodology of LoRA to an innovative approach we call allocating low-rank adaptation (ALoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process. | Zequan Liu; Jiawen Lyn; Wei Zhu; Xing Tian
36 | R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces | This paper introduces Robust Spin (R-Spin), a data-efficient domain-specific self-supervision method for speaker- and noise-invariant speech representations by learning discrete acoustic units with speaker-invariant clustering (Spin). | Heng-Jui Chang; James Glass
37 | InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions | In this work, we propose a novel paradigm called Instruction-based Continual Learning (InsCL). | Yifan Wang; Yafei Liu; Chufan Shi; Haoling Li; Chen Chen; Haonan Lu; Yujiu Yang
38 | Language Agnostic Code Embeddings | Yet, the field lacks a comprehensive deep dive into and understanding of the code embeddings of multilingual code models. In this paper, we present a comprehensive study on multilingual code embeddings, focusing on the cross-lingual capabilities of these embeddings across different programming languages. | Saiteja Utpala; Alex Gu; Pin-Yu Chen
39 | An Examination of The Compositionality of Large Generative Vision-Language Models | In this paper, we examine both the evaluation metrics (VisualGPTScore, etc.) and current benchmarks for evaluating the compositionality of GVLMs. | Teli Ma; Rong Li; Junwei Liang
40 | Two Heads Are Better Than One: Nested PoE for Robust Defense Against Multi-Backdoors | In this paper, we propose the Nested Product of Experts (NPoE) defense framework, which involves a mixture of experts (MoE) as a trigger-only ensemble within the PoE defense framework to simultaneously defend against multiple trigger types. | Victoria Graf; Qin Liu; Muhao Chen
41 | VertAttack: Taking Advantage of Text Classifiers' Horizontal Vision | Vertically written words will not be recognized by a classifier. In contrast, humans are easily able to recognize and read words written both horizontally and vertically. Hence, a human adversary could write problematic words vertically and the meaning would still be preserved to other humans. We simulate such an attack, VertAttack. | Jonathan Rusert
42 | KDMCSE: Knowledge Distillation Multimodal Sentence Embeddings with Adaptive Angular Margin Contrastive Learning | However, by taking the rest of the batch as negative samples without reviewing them when forming contrastive pairs, those studies encountered many suspicious and noisy negative examples, significantly affecting the methods' overall performance. In this work, we propose KDMCSE (Knowledge Distillation Multimodal contrastive learning of Sentence Embeddings), a novel approach that enhances the discrimination and generalizability of multimodal representations and inherits knowledge from the teacher model to learn the difference between positive and negative instances and, via that, can detect noisy and wrong negative samples effectively before they are calculated in the contrastive objective. | Cong-Duy Nguyen; Thong Nguyen; Xiaobao Wu; Anh Tuan Luu
43 | The Taste of IPA: Towards Open-vocabulary Keyword Spotting and Forced Alignment in Any Language | In this project, we demonstrate that phoneme-based models for speech processing can achieve strong crosslinguistic generalizability to unseen languages. | Jian Zhu; Changbing Yang; Farhan Samir; Jahurul Islam
44 | Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks | In this paper, we focus on mitigating gender bias towards vision-language tasks. | Yunqi Zhang; Songda Li; Chunyuan Deng; Luyi Wang; Hui Zhao
45 | BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings | Concretely, we propose a novel model: backward dependency enhanced large language model (BeLLM). | Xianming Li; Jing Li
46 | Assessing Factual Reliability of Large Language Model Knowledge | In this paper, we propose MOdel kNowledge relIabiliTy scORe (MONITOR), a novel metric designed to directly measure LLMs' factual reliability. | Weixuan Wang; Barry Haddow; Alexandra Birch; Wei Peng
47 | Dial-MAE: ConTextual Masked Auto-Encoder for Retrieval-based Dialogue Systems | Thus, we propose Dial-MAE (Dialogue Contextual Masking Auto-Encoder), a straightforward yet effective post-training technique tailored for dense encoders in dialogue response selection. | Zhenpeng Su; Xing W; Wei Zhou; Guangyuan Ma; Songlin Hu
48 | Toolink: Linking Toolkit Creation and Using Through Chain-of-Solving on Open-Source Model | In this paper, we introduce Toolink, a comprehensive framework that performs task-solving by first creating a toolkit and then integrating the planning and calling of tools through a chain-of-solving (CoS) approach. | Cheng Qian; Chenyan Xiong; Zhenghao Liu; Zhiyuan Liu
49 | Create! Don't Repeat: A Paradigm Shift in Multi-Label Augmentation Through Label Creative Generation | We propose Label Creative Generation (LCG), a new paradigm in multi-label data augmentation. | Letian Wang; Xianggen Liu; Jiancheng Lv
50 | Neurocache: Efficient Vector Retrieval for Long-range Language Modeling | This paper introduces Neurocache, an approach to extend the effective context size of large language models (LLMs) using an external vector cache to store its past states. | Ali Safaya; Deniz Yuret
51 | Unveiling The Generalization Power of Fine-Tuned Large Language Models | Intriguingly, we observe that integrating the in-context learning strategy during fine-tuning on generation tasks can enhance the model's generalization ability. Through this systematic investigation, we aim to contribute valuable insights into the evolving landscape of fine-tuning practices for LLMs. | Haoran Yang; Yumeng Zhang; Jiaqi Xu; Hongyuan Lu; Pheng-Ann Heng; Wai Lam
52 | A Closer Look at The Self-Verification Abilities of Large Language Models in Logical Reasoning | In this paper, we take a closer look at the self-verification abilities of LLMs in the context of logical reasoning, focusing on their ability to identify logical fallacies accurately. | Ruixin Hong; Hongming Zhang; Xinyu Pang; Dong Yu; Changshui Zhang
53 | Exploring Self-supervised Logic-enhanced Training for Large Language Models | Yet, our experiments reveal a gap in their performance on logical reasoning benchmarks when compared to state-of-the-art fine-tuning based models. To bridge this gap, we present LogicLLM, a first-of-its-kind, fully self-supervised framework for integrating logical reasoning capabilities into LLMs, and activating them via in-context learning. | Fangkai Jiao; Zhiyang Teng; Bosheng Ding; Zhengyuan Liu; Nancy Chen; Shafiq Joty
54 | MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning | In this work, we present MathSensei, a tool-augmented large language model for mathematical reasoning. | Debrup Das; Debopriyo Banerjee; Somak Aditya; Ashish Kulkarni
55 | CoUDA: Coherence Evaluation Via Unified Data Augmentation | In this paper, we take inspiration from the linguistic theory of discourse structure and propose a data augmentation framework named CoUDA. | Dawei Zhu; Wenhao Wu; Yifan Song; Fangwei Zhu; Ziqiang Cao; Sujian Li
56 | MEdIT: Multilingual Text Editing Via Instruction Tuning | We introduce mEdIT, a multilingual extension to CoEdIT, the recent state-of-the-art text editing model for writing assistance. | Vipul Raheja; Dimitris Alikaniotis; Vivek Kulkarni; Bashar Alhafni; Dhruv Kumar
57 | Navigation As Attackers Wish? Towards Building Robust Embodied Agents Under Federated Learning | Towards Byzantine-robust federated embodied agent learning, in this paper we study attack and defense for the task of vision-and-language navigation (VLN), where the agent is required to follow natural language instructions to navigate indoor environments. | Yunchao Zhang; Zonglin Di; Kaiwen Zhou; Cihang Xie; Xin Wang
58 | In-context Learning and Gradient Descent Revisited | In this work, we revisit evidence for ICL-GD correspondence on realistic NLP tasks and models. | Gilad Deutch; Nadav Magar; Tomer Natan; Guy Dar
59 | Corpus Considerations for Annotator Modeling and Scaling | We introduce a composite embedding approach and show distinct differences in which model performs best as a function of the agreement with a given dataset. | Sarumi Oluyemi; Béla Neuendorf; Joan Plepi; Lucie Flek; Jörg Schlötterer; Charles Welch
60 | On Large Language Models' Hallucination with Regard to Known Facts | We reveal the different dynamics of the output token probabilities along the depths of layers between the correct and hallucinated cases. | Che Jiang; Biqing Qi; Xiangyu Hong; Dayuan Fu; Yang Cheng; Fandong Meng; Mo Yu; Bowen Zhou; Jie Zhou
61 | “One-Size-Fits-All”? Examining Expectations Around What Constitute “Fair” or “Good” NLG System Behaviors | To illuminate tensions around invariance and adaptation, we conduct five case studies, in which we perturb different types of identity-related language features (names, roles, locations, dialect, and style) in NLG system inputs. | Li Lucy; Su Lin Blodgett; Milad Shokouhi; Hanna Wallach; Alexandra Olteanu
62 | Language Models Hallucinate, But May Excel at Fact Verification | Our study presents insights for developing trustworthy generation models. | Jian Guan; Jesse Dodge; David Wadden; Minlie Huang; Hao Peng
63 | A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution | We formalize the decision-making process of the baseline ECR system using a Structural Causal Model (SCM), aiming to identify spurious and causal associations (i.e., rationales) within the ECR task. | Bowen Ding; Qingkai Min; Shengkun Ma; Yingjie Li; Linyi Yang; Yue Zhang
64 | TrojFSP: Trojan Insertion in Few-shot Prompt Tuning | In this paper, we introduce TrojFSP, a method designed to address the challenges. | Mengxin Zheng; Jiaqi Xue; Xun Chen; Yanshan Wang; Qian Lou; Lei Jiang
65 | Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models | One of the current alignment techniques includes principle-driven integration, but it faces challenges arising from the imprecision of manually crafted rules and inadequate risk perception in models without safety training. To address these, we introduce Guide-Align, a two-stage approach. | Yi Luo; Zhenghao Lin; YuHao Zhang; Jiashuo Sun; Chen Lin; Chengjin Xu; Xiangdong Su; Yelong Shen; Jian Guo; Yeyun Gong
66 | X-PARADE: Cross-Lingual Textual Entailment and Information Divergence Across Paragraphs | Here, we introduce X-PARADE (Cross-lingual Paragraph-level Analysis of Divergences and Entailments), the first cross-lingual dataset of paragraph-level information divergences. | Juan Rodriguez; Katrin Erk; Greg Durrett
67 | Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K ArXiv Papers | We discuss implications around (1) how to support the influx of new authors, (2) how industry trends may affect academics, and (3) possible effects of (the lack of) collaboration. | Rajiv Movva; Sidhika Balachandar; Kenny Peng; Gabriel Agostini; Nikhil Garg; Emma Pierson
68 | E5: Zero-shot Hierarchical Table Analysis Using Augmented LLMs Via Explain, Extract, Execute, Exhibit and Extrapolate | While recent advancements in large language models (LLMs) have shown promise in flat table analysis, their application to hierarchical tables is constrained by the reliance on manually curated exemplars and the model's token capacity limitations. Addressing these challenges, we introduce a novel code-augmented LLM-based framework, E5, for zero-shot hierarchical table question answering. | Zhehao Zhang; Yan Gao; Jian-Guang Lou
69 | S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Model | In this paper, we propose using complex synthetic tasks as a proxy evaluation method, and present S3Eval, a Synthetic, Scalable, Systematic evaluation suite for LLM evaluation. | Fangyu Lei; Qian Liu; Yiming Huang; Shizhu He; Jun Zhao; Kang Liu
70 | MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning | However, a gap remains in the domain of chart image understanding due to the distinct abstract components in charts. To address this, we introduce a large-scale MultiModal Chart Instruction (MMC-Instruction) dataset comprising 600k instances supporting diverse tasks and chart types. | Fuxiao Liu; Xiaoyang Wang; Wenlin Yao; Jianshu Chen; Kaiqiang Song; Sangwoo Cho; Yaser Yacoob; Dong Yu
71 | Visual Grounding Helps Learn Word Meanings in Low-Data Regimes | Do models trained more naturalistically, with grounded supervision, exhibit more humanlike language learning? We investigate this question in the context of word learning, a key sub-task in language acquisition. | Chengxu Zhuang; Evelina Fedorenko; Jacob Andreas
72 | Accurate Knowledge Distillation Via N-best Reranking | We propose utilizing n-best reranking to enhance Sequence-Level Knowledge Distillation (Kim and Rush, 2016), where we extract pseudo-labels for the student model's training data from the top n-best hypotheses and leverage a diverse set of models with different inductive biases, objective functions or architectures, including some publicly available large language models, to pick the highest-quality hypotheses as labels. | Hendra Setiawan
73 | AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning Via Controllable Question Decomposition | Recent advancements in large language models (LLMs) have shown promise in multi-step reasoning tasks, yet their reliance on extensive manual labeling to provide procedural feedback remains a significant impediment. To address this challenge, in this paper we propose AutoPRM, a novel self-supervised framework that efficiently enhances the fine-tuning of LLMs for intricate reasoning challenges. | Zhaorun Chen; Zhuokai Zhao; Zhihong Zhu; Ruiqi Zhang; Xiang Li; Bhiksha Raj; Huaxiu Yao
74 | SEMQA: Semi-Extractive Multi-Source Question Answering | In this work, we introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion. | Tal Schuster; Adam Lelkes; Haitian Sun; Jai Gupta; Jonathan Berant; William Cohen; Donald Metzler
75 | Fine-Tuning Language Models with Reward Learning on Policy | In this paper, we propose reward learning on policy (RLP), an unsupervised framework that refines a reward model using policy samples to keep it on-distribution. | Hao Lang; Fei Huang; Yongbin Li
76 | A Universal Dependencies Treebank for Highland Puebla Nahuatl | We present a Universal Dependencies (UD) treebank for Highland Puebla Nahuatl. | Robert Pugh; Francis Tyers
77 | COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances | We present COPAL-ID, a novel, public Indonesian-language commonsense reasoning dataset. | Haryo Wibowo; Erland Fuadi; Made Nityasya; Radityo Eko Prasojo; Alham Aji
78 | IterAlign: Iterative Constitutional Alignment of Large Language Models | However, these methods require either heavy human annotations or explicitly pre-defined constitutions, which are labor-intensive and resource-consuming. To overcome these drawbacks, we study constitution-based LLM alignment and propose a data-driven constitution discovery and self-alignment framework called IterAlign. | Xiusi Chen; Hongzhi Wen; Sreyashi Nag; Chen Luo; Qingyu Yin; Ruirui Li; Zheng Li; Wei Wang
79 | OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking | Driven by findings that SLMs and LLMs exhibit complementary strengths in a structured knowledge extraction task, this work presents a novel SLM/LLM routing framework designed to improve computational efficiency and enhance task performance. | Chia-Hsuan Lee; Hao Cheng; Mari Ostendorf
80 | Multi-Operational Mathematical Derivations in Latent Space | To this end, we introduce different multi-operational representation paradigms, modelling mathematical operations as explicit geometric transformations. | Marco Valentino; Jordan Meadows; Lan Zhang; Andre Freitas
81 | Large Language Models Help Humans Verify Truthfulness – Except When They Are Convincingly Wrong | We conduct human experiments with 80 crowdworkers to compare language models with search engines (information retrieval systems) at facilitating fact-checking. | Chenglei Si; Navita Goyal; Tongshuang Wu; Chen Zhao; Shi Feng; Hal Daumé III; Jordan Boyd-Graber
82 | XferBench: A Data-Driven Benchmark for Emergent Language | In this paper, we introduce a benchmark for evaluating the overall quality of emergent languages using data-driven methods. | Brendon Boldt; David Mortensen
83 | Evaluating Large Language Models As Generative User Simulators for Conversational Recommendation | We introduce a new protocol to measure the degree to which language models can accurately emulate human behavior in conversational recommendation. | Se-eun Yoon; Zhankui He; Jessica Echterhoff; Julian McAuley
84 | A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers | This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems. | Jordan Meadows; Marco Valentino; Damien Teney; Andre Freitas
85 | Identifying Linear Relational Concepts in Large Language Models | We present a technique called linear relational concepts (LRC) for finding concept directions corresponding to human-interpretable concepts by first modeling the relation between subject and object as a linear relational embedding (LRE). | David Chanin; Anthony Hunter; Oana-Maria Camburu
86 | Benchmark Transparency: Measuring The Impact of Data on Evaluation | In this paper, we present exploratory research on quantifying the impact that data distribution has on the performance and evaluation of NLP models. | Venelin Kovatchev; Matthew Lease
87 | JAMDEC: Unsupervised Authorship Obfuscation Using Constrained Decoding Over Small Language Models | In this paper, we propose an unsupervised inference-time approach to authorship obfuscation that addresses its unique challenges: lack of supervision data for diverse authorship and domains, and the need for a sufficient level of revision beyond simple paraphrasing to obfuscate authorship, all while preserving the original content and fluency. | Jillian Fisher; Ximing Lu; Jaehun Jung; Liwei Jiang; Zaid Harchaoui; Yejin Choi
88 | REST: Retrieval-Based Speculative Decoding | We introduce Retrieval-Based Speculative Decoding (REST), a novel algorithm designed to speed up language model generation. | Zhenyu He; Zexuan Zhong; Tianle Cai; Jason Lee; Di He
89 | Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations | We introduce sub-sentence encoder, a contrastively learned contextual embedding model for fine-grained semantic representation of text. | Sihao Chen; Hongming Zhang; Tong Chen; Ben Zhou; Wenhao Yu; Dian Yu; Baolin Peng; Hongwei Wang; Dan Roth; Dong Yu
90 | MSciNLI: A Diverse Benchmark for Scientific Natural Language Inference | In this paper, we aim to introduce diversity in the scientific NLI task and present MSciNLI, a dataset containing 132,320 sentence pairs extracted from five new scientific domains. | Mobashir Sadat; Cornelia Caragea
91 | Causal Inference for Human-Language Model Collaboration | In this paper, we examine the collaborative dynamics between humans and language models (LMs), where the interactions typically involve LMs proposing text segments and humans editing or responding to these proposals. | Bohan Zhang; Yixin Wang; Paramveer Dhillon
92 | SELF-GUARD: Empower The LLM to Safeguard Itself | Nevertheless, they can only reduce a limited amount of harmful output and introduce extra computational costs. Given the distinct strengths and weaknesses of both, we combine them to balance out their flaws and propose a more effective method called Self-Guard. | Zezhong Wang; Fangkai Yang; Lu Wang; Pu Zhao; Hongru Wang; Liang Chen; Qingwei Lin; Kam-Fai Wong
93 | COSIGN: Contextual Facts Guided Generation for Knowledge Graph Completion | To improve the performance of GM-based methods for various KGC tasks, we propose a COntextual FactS GuIded GeneratioN (COSIGN) model. | Jinpeng Li; Hang Yu; Xiangfeng Luo; Qian Liu
94 | Toward Informal Language Processing: Knowledge of Slang in Large Language Models | Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. | Zhewei Sun; Qian Hu; Rahul Gupta; Richard Zemel; Yang Xu
95 | Ghostbuster: Detecting Text Ghostwritten By Large Language Models | We introduce Ghostbuster, a state-of-the-art system for detecting AI-generated text. | Vivek Verma; Eve Fleisig; Nicholas Tomlin; Dan Klein
96 | End-to-End Beam Retrieval for Multi-Hop Question Answering | In this work, we introduce Beam Retrieval, an end-to-end beam retrieval framework for multi-hop QA. | Jiahao Zhang; Haiyang Zhang; Dongmei Zhang; Liu Yong; Shen Huang
97 | Leveraging Generative Large Language Models with Visual Instruction and Demonstration Retrieval for Multimodal Sarcasm Detection | Additionally, existing methods typically do not fully utilize cross-modal features, limiting their performance on in-domain datasets. Therefore, to build a more reliable multimodal sarcasm detection model, we propose a generative multimodal sarcasm model consisting of a designed instruction template and a demonstration retrieval module based on the large language model. | Binghao Tang; Boda Lin; Haolong Yan; Si Li
98 | Multi-Scale Prompt Memory-Augmented Model for Black-Box Scenarios | However, they still require numerous LM calls to search for optimal prompts, resulting in overfitting and increased computational cost. To address this issue, we present MuSKPrompt (Multi-scale Knowledge Prompt for Memory Model), an efficient multi-scale knowledge prompt-based memory model for the black-box few-shot text classification task. | Xiaojun Kuang; C. L. Philip Chen; Shuzhen Li; Tong Zhang
99 | Ungrammatical-syntax-based In-context Example Selection for Grammatical Error Correction | In this paper, we propose a novel ungrammatical-syntax-based in-context example selection strategy for GEC. | Chenming Tang; Fanyi Qu; Yunfang Wu
100 | BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer | To establish a rigorous and equitable evaluation framework for few-shot cross-lingual transfer, we introduce a new benchmark, called BUFFET, which unifies 15 diverse tasks across 54 languages in a sequence-to-sequence format and provides a fixed set of few-shot examples and instructions. | Akari Asai; Sneha Kudugunta; Xinyan Yu; Terra Blevins; Hila Gonen; Machel Reid; Yulia Tsvetkov; Sebastian Ruder; Hannaneh Hajishirzi
101 | TISE: A Tripartite In-context Selection Method for Event Argument Extraction | In this paper, we introduce three necessary requirements when selecting an in-context example for the EAE task: semantic similarity, example diversity, and event correlation. | Yanhe Fu; Yanan Cao; Qingyue Wang; Yi Liu
102 | Reasoning or Reciting? Exploring The Capabilities and Limitations of Language Models Through Counterfactual Tasks | Are these skills general and transferable, or specialized to specific tasks seen during pretraining? To disentangle these effects, we propose an evaluation framework based on “counterfactual” task variants that deviate from the default assumptions underlying standard tasks. | Zhaofeng Wu; Linlu Qiu; Alexis Ross; Ekin Akyürek; Boyuan Chen; Bailin Wang; Najoung Kim; Jacob Andreas; Yoon Kim
103 | TRUE-UIE: Two Universal Relations Unify Information Extraction Tasks | In this study, we introduce an innovative paradigm known as TRUE-UIE, wherein all IE tasks are aligned to learn the same goals: extracting mention spans and two universal relations named NEXT and IS. | Yucheng Wang; Bowen Yu; Yilin Liu; Shudong Lu
104 | ZrLLM: Zero-Shot Relational Learning on Temporal Knowledge Graphs with Large Language Models | We first input the text descriptions of KG relations into large language models (LLMs) for generating relation representations, and then introduce them into embedding-based TKGF methods. | Zifeng Ding; Heling Cai; Jingpei Wu; Yunpu Ma; Ruotong Liao; Bo Xiong; Volker Tresp
105 | Embodied Executable Policy Learning with Language-based Scene Summarization | In this work, we introduce a novel learning paradigm that generates robots' executable actions in the form of text, derived solely from visual observations. | Jielin Qiu; Mengdi Xu; William Han; Seungwhan Moon; Ding Zhao
106 | Metacognitive Prompting Improves Understanding in Large Language Models | In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes. | Yuqing Wang; Yun Zhao
107 | MART: Improving LLM Safety with Multi-round Automatic Red-Teaming | In this paper, we propose a Multi-round Automatic Red-Teaming (MART) method, which incorporates both automatic adversarial prompt writing and safe response generation, significantly increasing red-teaming scalability and the safety of the target LLM. | Suyu Ge; Chunting Zhou; Rui Hou; Madian Khabsa; Yi-Chia Wang; Qifan Wang; Jiawei Han; Yuning Mao
108 | DialogCC: An Automated Pipeline for Creating High-Quality Multi-Modal Dialogue Dataset | In this paper, we propose an automated pipeline to construct a multi-modal dialogue dataset, ensuring both dialogue quality and image diversity while requiring minimal human effort. | Young-Jun Lee; Byungsoo Ko; Han-Gyu Kim; Jonghwan Hyeon; Ho-Jin Choi
109 | Routing to The Expert: Efficient Reward-guided Ensemble of Large Language Models | We propose ZOOTER, a reward-guided routing method distilling rewards on training queries to train a routing function, which can precisely distribute each query to the LLM with expertise about it. | Keming Lu; Hongyi Yuan; Runji Lin; Junyang Lin; Zheng Yuan; Chang Zhou; Jingren Zhou
110 | Automatic Generation of Model and Data Cards: A Step Towards Responsible AI | We propose an automated generation approach using Large Language Models (LLMs). | Jiarui Liu; Wenkai Li; Zhijing Jin; Mona Diab
111 | FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing | Standard fine-tuning of language models typically performs well on in-distribution data but suffers when generalizing to distribution shifts. In this work, we aim to improve the generalization of adapter-based cross-lingual task transfer, where such cross-language distribution shifts are imminent. | Chen Liu; Jonas Pfeiffer; Ivan Vulic; Iryna Gurevych
112 | Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation Into Multicultural Proverbs and Sayings | In this paper, we study the ability of a wide range of state-of-the-art multilingual LLMs (mLLMs) to reason with proverbs and sayings in a conversational context. | Chen Liu; Fajri Koto; Timothy Baldwin; Iryna Gurevych
113 | The Colorful Future of LLMs: Evaluating and Improving LLMs As Emotional Supporters for Queer Youth | This paper aims to comprehensively explore the potential of LLMs to revolutionize emotional support for queer youth. | Shir Lissak; Nitay Calderon; Geva Shenkman; Yaakov Ophir; Eyal Fruchter; Anat Brunstein Klomek; Roi Reichart
114 | IPED: An Implicit Perspective for Relational Triple Extraction Based on Diffusion Model | However, inherent shortcomings such as redundant information and incomplete triple recognition remain problematic. To address these challenges, we propose an Implicit Perspective for relational triple Extraction based on Diffusion model (IPED), an innovative approach for extracting relational triples. | Jianli Zhao; Changhao Xu; Bin. Jiang
115 | QualEval: Qualitative Evaluation for Model Improvement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the shortcomings of quantitative metrics by proposing QualEval, which uses automated qualitative evaluation as a vehicle for model improvement. |
Vishvak Murahari; Ameet Deshpande; Peter Clark; Tanmay Rajpurohit; Ashish Sabharwal; Karthik Narasimhan; Ashwin Kalyan; |
116 | Quantum-inspired Language Model with Lindblad Master Equation and Interference Measurement for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel quantum-inspired neural network, LI-QiLM, which integrates the Lindblad Master Equation (LME) to model the evolution process and the interferometry to the measurement process, providing more physical meaning to strengthen the interpretability. |
Kehuan Yan; Peichao Lai; Yilei Wang; |
117 | VisLingInstruct: Elevating Zero-Shot Learning in Multi-Modal Language Models with Autonomous Instruction Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents VisLingInstruct, a novel approach to advancing Multi-Modal Language Models (MMLMs) in zero-shot learning. |
Dongsheng Zhu; Daniel Tang; Weidong Han; Jinghui Lu; Yukun Zhao; Guoliang Xing; Junfeng Wang; Dawei Yin; |
118 | A Wolf in Sheep�s Clothing: Generalized Nested Jailbreak Prompts Can Fool Large Language Models Easily Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we generalize jailbreak prompt attacks into two aspects: (1) Prompt Rewriting and (2) Scenario Nesting. |
Peng Ding; Jun Kuang; Dan Ma; Xuezhi Cao; Yunsen Xian; Jiajun Chen; Shujian Huang; |
119 | P3Sum: Preserving Author�s Perspective in News Summarization with Diffusion Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we take a first step towards designing summarization systems that are faithful to the author�s intent, not only the semantic content of the article. |
Yuhan Liu; Shangbin Feng; Xiaochuang Han; Vidhisha Balachandran; Chan Young Park; Sachin Kumar; Yulia Tsvetkov; |
120 | Bridging The Novice-Expert Gap Via Models of Decision-Making: A Case Study on Remediating Math Mistakes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our work explores the potential of large language models (LLMs) to close the novice-expert knowledge gap in remediating math mistakes. |
Rose Wang; Qingyang Zhang; Carly Robinson; Susanna Loeb; Dorottya Demszky; |
121 | RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, the integration of rhetorical structure theory (RST) into parameter-efficient fine-tuning strategies for long document summarization remains unexplored. Therefore, this paper introduces RST-LoRA and proposes four RST-aware variants to explicitly incorporate RST into the LoRA model. |
Dongqi Pu; Vera Demberg; |
122 | Strings from The Library of Babel: Random Sampling As A Strong Baseline for Prompt Optimisation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we demonstrate that randomly sampling tokens from the model vocabulary as "separators" can be as effective as language models for prompt-style text classification. |
Yao Lu; Jiayi Wang; Raphael Tang; Sebastian Riedel; Pontus Stenetorp; |
123 | ReTA: Recursively Thinking Ahead to Improve The Strategic Reasoning of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the multi-turn strategic reasoning of LLMs through text-driven complete- and incomplete-information gaming, e.g., board games (Tic-Tac-Toe, Connect-4) and poker games (Texas Hold'em Poker). |
Jinhao Duan; Shiqi Wang; James Diffenderfer; Lichao Sun; Tianlong Chen; Bhavya Kailkhura; Kaidi Xu; |
124 | Fact Checking Beyond Training Set Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an adversarial algorithm to make the retriever component robust against distribution shift. |
Payam Karisani; Heng Ji; |
125 | Program-Aided Reasoners (Better) Know What They Know Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we compare the calibration of Program Aided Language Models (PAL) and text-based Chain-of-thought (COT) prompting techniques over 5 datasets and 2 model types – LLaMA models and OpenAI models. |
Anubha Kabra; Sanketh Rangreji; Yash Mathur; Aman Madaan; Emmy Liu; Graham Neubig; |
126 | The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Though annotator disagreement has long been seen as a problem to minimize, new perspectivist approaches challenge this assumption by treating disagreement as a valuable source of information. In this position paper, we examine practices and assumptions surrounding the causes of disagreement – some challenged by perspectivist approaches, and some that remain to be addressed – as well as practical and normative challenges for work operating under these assumptions. |
Eve Fleisig; Su Lin Blodgett; Dan Klein; Zeerak Talat; |
127 | Principles from Clinical Research for NLP Model Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we explore the foundations of generalizability and study the factors that affect it, articulating lessons from clinical studies. |
Aparna Elangovan; Jiayuan He; Yuan Li; Karin Verspoor; |
128 | First Tragedy, Then Parse: History Repeats Itself in The New Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify durable lessons from the first era, and more importantly, we identify evergreen problems where NLP researchers can continue to make meaningful contributions in areas where LLMs are ascendant. |
Naomi Saphra; Eve Fleisig; Kyunghyun Cho; Adam Lopez; |
129 | Found in The Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large language models (LLMs) exhibit positional bias in how they use context, which especially affects listwise ranking. To address this, we propose permutation self-consistency, a form of self-consistency over the ranking list outputs of black-box LLMs. |
Raphael Tang; Crystina Zhang; Xueguang Ma; Jimmy Lin; Ferhan Ture; |
130 | From Language Modeling to Instruction Following: Understanding The Behavior Shift in LLMs After Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate how the instruction tuning adjusts pre-trained models with a focus on intrinsic changes. |
Xuansheng Wu; Wenlin Yao; Jianshu Chen; Xiaoman Pan; Xiaoyang Wang; Ninghao Liu; Dong Yu; |
131 | POLYIE: A Dataset of Information Extraction from Polymer Material Scientific Literature Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there are no existing SciIE datasets for polymer materials, which is an important class of materials used ubiquitously in our daily lives. To bridge this gap, we introduce POLYIE, a new SciIE dataset for polymer materials. |
Jerry Cheung; Yuchen Zhuang; Yinghao Li; Pranav Shetty; Wantian Zhao; Sanjeev Grampurohit; Rampi Ramprasad; Chao Zhang; |
132 | LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we propose a novel computational bionic memory mechanism, equipped with a parameter-efficient fine-tuning (PEFT) schema, to personalize medical assistants. |
Kai Zhang; Yangyang Kang; Fubang Zhao; Xiaozhong Liu; |
133 | SumTra: A Differentiable Pipeline for Few-Shot Cross-Lingual Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the scarcity of fine-tuning samples makes this approach challenging in some cases. For this reason, in this paper we propose revisiting the summarize-and-translate pipeline, where the summarization and translation tasks are performed in a sequence. |
Jacob Parnell; Inigo Jauregi Unanue; Massimo Piccardi; |
134 | KTRL+F: Knowledge-Augmented In-Document Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze various baselines in KTRL+F and find limitations of existing models, such as hallucinations, high latency, or difficulties in leveraging external knowledge. Therefore, we propose a Knowledge-Augmented Phrase Retrieval model that shows a promising balance between speed and performance by simply augmenting external knowledge in phrase embedding. |
Hanseok Oh; Haebin Shin; Miyoung Ko; Hyunji Lee; Minjoon Seo; |
135 | How Well Do Large Language Models Truly Ground? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, previous research often narrowly defines "grounding" as just having the correct answer, which does not ensure the reliability of the entire response. To overcome this, we propose a stricter definition of grounding: a model is truly grounded if it (1) fully utilizes the necessary knowledge from the provided context, and (2) stays within the limits of that knowledge. |
Hyunji Lee; Se June Joo; Chaeeun Kim; Joel Jang; Doyoung Kim; Kyoung-Woon On; Minjoon Seo; |
136 | ALBA: Adaptive Language-Based Assessments for Mental Health Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we develop adaptive testing methods under two psychometric measurement theories: Classical Test Theory and Item Response Theory. |
Vasudha Varadarajan; Sverker Sikström; Oscar Kjell; H. Schwartz; |
137 | FREB-TQA: A Fine-Grained Robustness Evaluation Benchmark for Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we formalize three major desiderata for a fine-grained evaluation of robustness of TQA systems. |
Wei Zhou; Mohsen Mesgar; Heike Adel; Annemarie Friedrich; |
138 | MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generation-based methods, utilizing large language models (LLMs), generally lack corpus-specific knowledge and entail high fine-tuning costs. To address these gaps, we propose a novel zero-shot query expansion framework utilizing LLMs for mutual verification. |
Pengyue Jia; Yiding Liu; Xiangyu Zhao; Xiaopeng Li; Changying Hao; Shuaiqiang Wang; Dawei Yin; |
139 | Efficient Benchmarking (of Language Models) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present the problem of Efficient Benchmarking, namely, intelligently reducing the computation costs of LM evaluation without compromising reliability. |
Yotam Perlitz; Elron Bandel; Ariel Gera; Ofir Arviv; Liat Ein-Dor; Eyal Shnarch; Noam Slonim; Michal Shmueli-Scheuer; Leshem Choshen; |
140 | ReFACT: Updating Text-to-Image Models By Editing The Text Encoder Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To that end, we introduce ReFACT, a novel approach for editing factual associations in text-to-image models without relying on explicit input from end-users or costly re-training. |
Dana Arad; Hadas Orgad; Yonatan Belinkov; |
141 | A Likelihood Ratio Test of Genetic Relationship Among Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, inspired by molecular phylogenetics, we propose a likelihood ratio test to determine if given languages are related based on the proportion of invariant character sites in the aligned wordlists applied during tree inference. |
V.S.D.S.Mahesh Akavarapu; Arnab Bhattacharya; |
142 | PaD: Program-aided Distillation Can Teach Small Models Reasoning Better Than Chain-of-thought Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Program-aided Distillation (PaD), which introduces reasoning programs to suppress the errors in distilled data, and thus achieves better distillation quality for reasoning tasks. |
Xuekai Zhu; Biqing Qi; Kaiyan Zhang; Xinwei Long; Zhouhan Lin; Bowen Zhou; |
143 | MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3. |
Sanchit Ahuja; Divyanshu Aggarwal; Varun Gumma; Ishaan Watts; Ashutosh Sathe; Millicent Ochieng; Rishav Hada; Prachi Jain; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram; |
144 | Unlocking Emergent Modularity in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, focusing on unlocking the emergent modularity in LMs, we showcase that standard LMs could be fine-tuned as their Mixture-of-Expert (MoEs) counterparts without introducing any extra parameters. |
Zihan Qiu; Zeyu Huang; Jie Fu; |
145 | A School Student Essay Corpus for Analyzing Interactions of Argumentative Structure and Quality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To fill this research gap, we present a German corpus of 1,320 essays from school students of two age groups. |
Maja Stahl; Nadine Michel; Sebastian Kilsbach; Julian Schmidtke; Sara Rezat; Henning Wachsmuth; |
146 | Adjusting Interpretable Dimensions in Embedding Space with Human Judgments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We combine seed-based vectors with guidance from human ratings of where words fall along a specific dimension, and evaluate on predicting both object properties like size and danger, and the stylistic properties of formality and complexity. |
Katrin Erk; Marianna Apidianaki; |
147 | PatentEval: Understanding Errors in Patent Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. |
You Zuo; Kim Gerdes; Éric Clergerie; Benoît Sagot; |
148 | Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Surprisingly, our initial experiments found that fine-tuning with Q-LoRA for translation purposes led to performance improvements in terms of BLEU but degradation in COMET compared to in-context learning. To overcome this, we propose an alternative approach: adapting LLMs as Automatic Post-Editors (APE) rather than direct translators. |
Sai Koneru; Miriam Exel; Matthias Huck; Jan Niehues; |
149 | Metaphor Detection with Context Enhancement and Curriculum Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, they have faced challenges in tackling the problem of data sparseness due to the very limited available training data. To address these two challenges, we propose a novel model called MiceCL. |
Kaidi Jia; Rongsheng Li; |
150 | What Causes The Failure of Explicit to Implicit Discourse Relation Recognition? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Prior work claimed this is due to linguistic dissimilarity between explicit and implicit examples but provided no empirical evidence. In this study, we show that one cause for such failure is a label shift after connectives are eliminated. |
Wei Liu; Stephen Wan; Michael Strube; |
151 | UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent studies leverage large language models with multi-tasking capabilities, using natural language prompts to guide the model's behavior and surpassing performance of task-specific models. Motivated by this, we ask: can we build a single model that jointly performs various spoken language understanding (SLU) tasks? |
Siddhant Arora; Hayato Futami; Jee-weon Jung; Yifan Peng; Roshan Sharma; Yosuke Kashiwagi; Emiru Tsunoo; Karen Livescu; Shinji Watanabe; |
152 | How Trustworthy Are Open-Source LLMs? An Assessment Under Malicious Demonstrations Shows Their Vulnerabilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Deploying these models at scale without sufficient trustworthiness can pose significant risks, highlighting the need to uncover these issues promptly. In this work, we conduct an adversarial assessment of open-source LLMs on trustworthiness, scrutinizing them across eight different aspects including toxicity, stereotypes, ethics, hallucination, fairness, sycophancy, privacy, and robustness against adversarial demonstrations. |
Lingbo Mo; Boshi Wang; Muhao Chen; Huan Sun; |
153 | Paraphrase and Solve: Exploring and Exploiting The Impact of Surface Form on Mathematical Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve mathematical reasoning performance, we propose Self-Consistency-over-Paraphrases (SCoP), which diversifies reasoning paths from specific surface forms of the problem. |
Yue Zhou; Yada Zhu; Diego Antognini; Yoon Kim; Yang Zhang; |
154 | TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their large size and computational demands, coupled with privacy concerns in data transmission, limit their use in resource-constrained and privacy-centric settings. To overcome this, we introduce TriSum, a framework for distilling LLMs' text summarization abilities into a compact, local model. |
Pengcheng Jiang; Cao Xiao; Zifeng Wang; Parminder Bhatia; Jimeng Sun; Jiawei Han; |
155 | GenRES: Rethinking Evaluation for Generative Relation Extraction in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This shortfall arises because these metrics rely on exact matching with human-annotated reference relations, while GRE methods often produce diverse and semantically accurate relations that differ from the references. To fill this gap, we introduce GenRES for a multi-dimensional assessment in terms of the topic similarity, uniqueness, granularity, factualness, and completeness of the GRE results. |
Pengcheng Jiang; Jiacheng Lin; Zifeng Wang; Jimeng Sun; Jiawei Han; |
156 | Curated Datasets and Neural Models for Machine Translation of Informal Registers Between Mayan and Spanish Vernaculars Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we develop, curate, and publicly release a set of corpora in several Mayan languages spoken in Guatemala and Southern Mexico, which we call MayanV. |
Andrés Lou; Juan Antonio Pérez-Ortiz; Felipe Sánchez-Martínez; Víctor Sánchez-Cartagena; |
157 | The Effect of Data Partitioning Strategy on Model Generalizability: A Case Study of Morphological Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent work to enhance data partitioning strategies for more realistic model evaluation faces challenges in providing a clear optimal choice. This study addresses these challenges, focusing on morphological segmentation and synthesizing limitations related to language diversity, adoption of multiple datasets and splits, and detailed model comparisons. |
Zoey Liu; Bonnie Dorr; |
158 | Measuring Entrainment in Spontaneous Code-switched Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work studies code-switched spontaneous speech between humans, finding that (1) patterns of written and spoken entrainment in monolingual settings largely generalize to code-switched settings, and (2) some patterns of entrainment on code-switching in dialogue agent-generated text generalize to spontaneous code-switched speech. |
Debasmita Bhattacharya; Siying Ding; Alayna Nguyen; Julia Hirschberg; |
159 | A Survey of Meaning Representations – From Theory to Practical Utility Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this survey, we study today's most prominent Meaning Representation Frameworks. |
Zacchary Sadeddine; Juri Opitz; Fabian Suchanek; |
160 | Mitigating Language-Level Performance Disparity in MPLMs Via Teacher Language Selection and Cross-lingual Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, obtaining labeled multilingual data is time-consuming, and fine-tuning mPLM with limited labeled multilingual data merely encapsulates the knowledge specific to the labeled data. Therefore, we introduce **ALSACE** to leverage the learned knowledge from the well-performing languages to guide under-performing ones within the same mPLM, eliminating the need for additional labeled multilingual data. |
Haozhe Zhao; Zefan Cai; Shuzheng Si; Liang Chen; Yufeng He; Kaikai An; Baobao Chang; |
161 | Evaluating In-Context Learning of Libraries for Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we take a broader approach by systematically evaluating a diverse array of LLMs across three scenarios reflecting varying levels of domain specialization to understand their abilities and limitations in generating code based on libraries defined in-context. |
Arkil Patel; Siva Reddy; Dzmitry Bahdanau; Pradeep Dasigi; |
162 | Visually-Aware Context Modeling for News Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recognizing the significance of human faces in news images and the face-name co-occurrence pattern in existing datasets, we propose a face-naming module for learning better name embeddings. |
Tingyu Qu; Tinne Tuytelaars; Marie-Francine Moens; |
163 | Regularized Conventions: Equilibrium Computation As A Model of Pragmatic Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a game-theoretic model of pragmatics that we call ReCo (for Regularized Conventions). |
Athul Jacob; Gabriele Farina; Jacob Andreas; |
164 | TopicGPT: A Prompt-based Topic Modeling Framework Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Conventional topic models (e.g., LDA) represent topics as bags of words that often require "reading the tea leaves" to interpret; additionally, they offer users minimal control over the formatting and specificity of resulting topics. To tackle these issues, we introduce TopicGPT, a prompt-based framework that uses large language models (LLMs) to uncover latent topics in a text collection. |
Chau Pham; Alexander Hoyle; Simeng Sun; Philip Resnik; Mohit Iyyer; |
165 | ChatGPT As An Attack Tool: Stealthy Textual Backdoor Attack Via Blackbox Generative Model Trigger Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we conduct an in-depth examination of black-box generative models as tools for backdoor attacks, thereby emphasizing the need for effective defense strategies. |
Jiazhao Li; Yijin Yang; Zhuofeng Wu; V.G.Vinod Vydiswaran; Chaowei Xiao; |
166 | Social Meme-ing: Measuring Linguistic Variation in Memes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we argue that memes, as multimodal forms of language comprised of visual templates and text, also exhibit meaningful social variation. |
Naitian Zhou; David Jurgens; David Bamman; |
167 | ExpertQA: Expert-Curated Questions and Attributed Answers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we conduct human evaluation of responses from a few representative systems along various axes of attribution and factuality, by bringing domain experts in the loop. |
Chaitanya Malaviya; Subin Lee; Sihao Chen; Elizabeth Sieber; Mark Yatskar; Dan Roth; |
168 | What If You Said That Differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We answer these questions by analyzing the effect of rationales (or explanations) generated by QA models to support their answers. We specifically consider decomposed QA models that first extract an intermediate rationale based on a context and a question and then use solely this rationale to answer the question. |
Chaitanya Malaviya; Subin Lee; Dan Roth; Mark Yatskar; |
169 | When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses Into Good Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Juicer, a framework to make use of both binary and free-form textual human feedback. |
Weiyan Shi; Emily Dinan; Kurt Shuster; Jason Weston; Jing Xu; |
170 | Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the largest cumulative dataset to date for Creole language MT, including 14.5M unique Creole sentences with parallel translations – 11.6M of which we release publicly – and the largest bitexts gathered to date for 41 languages – the first ever for 21. |
Nathaniel Robinson; Raj Dabre; Ammon Shurtz; Rasul Dent; Onenamiyi Onesi; Claire Monroc; Loïc Grobol; Hasan Muhammad; Ashi Garg; Naome Etori; Vijay Murari Tiyyala; Olanrewaju Samuel; Matthew Stutzman; Bismarck Odoom; Sanjeev Khudanpur; Stephen Richardson; Kenton Murray; |
171 | Instructions As Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate security concerns of the emergent instruction tuning paradigm, in which models are trained on crowdsourced datasets with task instructions to achieve superior performance. |
Jiashu Xu; Mingyu Ma; Fei Wang; Chaowei Xiao; Muhao Chen; |
172 | Modeling Empathetic Alignment in Conversation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce a new approach to recognizing alignment in empathetic speech, grounded in Appraisal Theory. |
Jiamin Yang; David Jurgens; |
173 | Native Language Identification in Texts: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe several text representations and computational techniques used in text-based NLI. |
Dhiman Goswami; Sharanya Thilagan; Kai North; Shervin Malmasi; Marcos Zampieri; |
174 | LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing PEFT methods are still limited by the growing number of trainable parameters with the rapid deployment of Large Language Models (LLMs). To address this challenge, we present LoRETTA, an ultra-parameter-efficient framework that significantly reduces trainable parameters through tensor-train decomposition. |
Yifan Yang; Jiajun Zhou; Ngai Wong; Zheng Zhang; |
175 | Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As such, we present the Multi-view Approach to Grounding in Context (MAGiC) model, which selects an object referent based on language that distinguishes between two similar objects. |
Chancharik Mitra; Abrar Anwar; Rodolfo Corona; Dan Klein; Trevor Darrell; Jesse Thomason; |
176 | Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose two complementary benchmarks that evaluate the ability of localization methods to pinpoint LLM components responsible for memorized data. |
Ting-Yun Chang; Jesse Thomason; Robin Jia; |
177 | PromptFix: Few-shot Backdoor Removal Via Adversarial Prompt Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose PromptFix, a novel backdoor mitigation strategy for NLP models via adversarial prompt-tuning in few-shot settings. |
Tianrong Zhang; Zhaohan Xi; Ting Wang; Prasenjit Mitra; Jinghui Chen; |
178 | Comparing Explanation Faithfulness Between Multilingual and Monolingual Fine-tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our extensive experiments, covering five languages and five popular FAs, show that FA faithfulness varies between multilingual and monolingual models. We find that the larger the multilingual model, the less faithful the FAs are compared to its counterpart monolingual models. |
Zhixue Zhao; Nikolaos Aletras; |
179 | A Pretrainer's Guide to Training Data: Measuring The Effects of Data Age, Domain Coverage, Quality, & Toxicity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We pretrain models on data curated (1) at different collection times, (2) with varying toxicity and quality filters, and (3) with different domain compositions. |
Shayne Longpre; Gregory Yauney; Emily Reif; Katherine Lee; Adam Roberts; Barret Zoph; Denny Zhou; Jason Wei; Kevin Robinson; David Mimno; Daphne Ippolito; |
180 | Instructional Fingerprinting of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present a pilot study on LLM fingerprinting as a form of very lightweight instruction tuning. |
Jiashu Xu; Fei Wang; Mingyu Ma; Pang Wei Koh; Chaowei Xiao; Muhao Chen; |
181 | Reinforced Multiple Instance Selection for Speaker Attribute Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, only a subset of speaker utterances may be relevant to specific attributes. In this paper, we formulate speaker attribute prediction as a Multiple Instance Learning (MIL) problem and propose RL-MIL, a novel approach based on Reinforcement Learning (RL) that effectively addresses both of these challenges. |
Alireza Salkhordeh Ziabari; Ali Omrani; Parsa Hejabi; Preni Golazizian; Brendan Kennedy; Payam Piray; Morteza Dehghani; |
182 | DynaMo: Accelerating Language Model Inference with Dynamic Multi-Token Sampling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose DynaMo, a suite of multi-token prediction language models that reduce net inference times. |
Shikhar Tuli; Chi-Heng Lin; Yen-Chang Hsu; Niraj Jha; Yilin Shen; Hongxia Jin; |
183 | Few-shot Knowledge Graph Relational Reasoning Via Subgraph Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing edge-mask-based methods extract insufficient information from the KG and are highly influenced by spurious information in it. To overcome these challenges, we propose SAFER (Subgraph Adaptation for Few-shot Relational Reasoning), a novel approach that effectively adapts the information in contextualized graphs to various subgraphs generated from support and query triplets to perform the prediction. |
Haochen Liu; Song Wang; Chen Chen; Jundong Li; |
184 | Uncertainty Quantification for In-Context Learning of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing works have been devoted to quantifying the uncertainty in an LLM's response, but they often overlook the complex nature of LLMs and the uniqueness of in-context learning. In this work, we delve into the predictive uncertainty of LLMs associated with in-context learning, highlighting that such uncertainties may stem from both the provided demonstrations (aleatoric uncertainty) and ambiguities tied to the model's configurations (epistemic uncertainty). |
Chen Ling; Xujiang Zhao; Xuchao Zhang; Wei Cheng; Yanchi Liu; Yiyou Sun; Mika Oishi; Takao Osaki; Katsushi Matsuda; Jie Ji; Guangji Bai; Liang Zhao; Haifeng Chen; |
185 | HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models trained on these datasets can incidentally learn to model dataset artifacts (e.g. preferring longer but unhelpful responses only due to their length). To alleviate this problem, we collect HelpSteer, a multi-attribute helpfulness dataset annotated for the various aspects that make responses helpful. |
Zhilin Wang; Yi Dong; Jiaqi Zeng; Virginia Adams; Makesh Narsimhan Sreedhar; Daniel Egert; Olivier Delalleau; Jane Scowcroft; Neel Kant; Aidan Swope; Oleksii Kuchaiev; |
186 | A Preference-driven Paradigm for Enhanced Translation with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, the assistance from SFT often reaches a plateau once the LLMs have achieved a certain level of translation capability, and further increasing the size of parallel data does not provide additional benefits. To overcome this plateau associated with imitation-based SFT, we propose a preference-based approach built upon the Plackett-Luce model. |
Dawei Zhu; Sony Trenous; Xiaoyu Shen; Dietrich Klakow; Bill Byrne; Eva Hasler; |
187 | Fair Abstractive Summarization of Diverse Perspectives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we systematically investigate fair abstractive summarization for user-generated data. |
Yusen Zhang; Nan Zhang; Yixin Liu; Alexander Fabbri; Junru Liu; Ryo Kamoi; Xiaoxin Lu; Caiming Xiong; Jieyu Zhao; Dragomir Radev; Kathleen McKeown; Rui Zhang; |
188 | What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we perform a large-scale transfer learning experiment aimed at discovering latent VL skills from data. |
Anthony Tiong; Junqi Zhao; Boyang Li; Junnan Li; Steven Hoi; Caiming Xiong; |
189 | Show Your Work with Confidence: Confidence Bands for Tuning Curves Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first method to construct valid confidence bands for tuning curves. |
Nicholas Lourie; Kyunghyun Cho; He He; |
190 | GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Generalizable methods to harness rater disagreement and thus understand the socio-cultural leanings of subjective tasks remain elusive. In this paper, we propose GRASP, a comprehensive disagreement analysis framework to measure group association in perspectives among different rater subgroups, and demonstrate its utility in assessing the extent of systematic disagreements in two datasets: (1) safety annotations of human-chatbot conversations, and (2) offensiveness annotations of social media posts, both annotated by diverse rater pools across different socio-demographic axes. |
Vinodkumar Prabhakaran; Christopher Homan; Lora Aroyo; Aida Mostafazadeh Davani; Alicia Parrish; Alex Taylor; Mark Diaz; Ding Wang; Gregory Serapio-García; |
191 | Event Causality Is Key to Computational Story Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Leveraging recent progress in large language models, we present the first method for event causality identification that leads to material improvements in computational story understanding. |
Yidan Sun; Qin Chao; Boyang Li; |
192 | Subspace Representations for Soft Set Operations and Sentence Similarities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: By grounding our approach in the linear subspaces, we enable efficient computation of various set operations and facilitate the soft computation of membership functions within continuous spaces. |
Yoichi Ishibashi; Sho Yokoi; Katsuhito Sudoh; Satoshi Nakamura; |
193 | My Heart Skipped A Beat! Recognizing Expressions of Embodied Emotion in Natural Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work introduces a new task of recognizing expressions of embodied emotion in natural language. |
Yuan Zhuang; Tianyu Jiang; Ellen Riloff; |
194 | Low-Cost Generation and Evaluation of Dictionary Example Sentences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new automatic evaluation metric called OxfordEval that measures the win-rate of generated sentences against existing Oxford Dictionary sentences. |
Bill Cai; Ng Clarence; Daniel Liang; Shelvia Hotama; |
195 | Making Language Models Better Tool Learners with Execution Feedback Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This leads to the research question: can we teach language models when and how to use tools? To meet this need, we propose Tool leaRning wIth exeCution fEedback (TRICE), a two-stage end-to-end framework that enables the model to continually learn through feedback derived from tool execution, thereby learning when and how to use tools effectively. |
Shuofei Qiao; Honghao Gui; Chengfei Lv; Qianghuai Jia; Huajun Chen; Ningyu Zhang; |
196 | Complex Claim Verification with Evidence Retrieved in The Wild Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present the first realistic pipeline to check real-world claims by retrieving raw evidence from the web. |
Jifan Chen; Grace Kim; Aniruddh Sriram; Greg Durrett; Eunsol Choi; |
197 | Multimodal Multi-loss Fusion Network for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We compare different fusion methods and examine the impact of multi-loss training within the multi-modality fusion network, identifying surprisingly important findings relating to subnet performance. |
Zehui Wu; Ziwei Gong; Jaywon Koo; Julia Hirschberg; |
198 | Confronting LLMs with Traditional ML: Rethinking The Fairness of Large Language Models in Tabular Classifications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a series of experiments, we delve into these questions and show that LLMs tend to inherit social biases from their training data which significantly impact their fairness in tabular classification tasks. |
Yanchen Liu; Srishti Gautam; Jiaqi Ma; Himabindu Lakkaraju; |
199 | Analyzing The Use of Metaphors in News Editorials for Political Framing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the nature and utilization of metaphors and the effect on audiences of different political ideologies within political discourses are hardly explored. To enable research in this direction, in this work we create a dataset, originally based on news editorials and labeled with their persuasive effects on liberals and conservatives, and extend it with annotations pertaining to metaphorical usage of language. |
Meghdut Sengupta; Roxanne El Baff; Milad Alshomary; Henning Wachsmuth; |
200 | SharpSeq: Empowering Continual Event Detection Through Sharpness-Aware Sequential-task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While there are successful, widely cited multi-objective optimization frameworks for multi-task learning, they lack mechanisms to address data imbalance and evaluate whether a Pareto-optimal solution can effectively mitigate catastrophic forgetting, rendering them unsuitable for direct application to continual learning. To address these challenges, we propose **SharpSeq**, a novel continual learning paradigm leveraging sharpness-aware minimization combined with a generative model to balance training data distribution. |
Thanh-Thien Le; Viet Dao; Linh Nguyen; Thi-Nhung Nguyen; Linh Ngo; Thien Nguyen; |
201 | Dissecting Paraphrases: The Impact of Prompt Syntax and Supplementary Information on Knowledge Retrieval from Pretrained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We designed CONPARE-LAMA – a dedicated probe consisting of 34 million distinct prompts that facilitate comparison across minimal paraphrases. |
Stephan Linzbach; Dimitar Dimitrov; Laura Kallmeyer; Kilian Evang; Hajira Jabeen; Stefan Dietze; |
202 | Know When To Stop: A Study of Semantic Drift in Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explicitly show that modern LLMs tend to generate correct facts first, then "drift away" and generate incorrect facts later: this was occasionally observed but never properly measured. |
Ava Spataru; |
203 | Curriculum Masking in Vision-Language Pretraining to Maximize Cross Modal Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The extent of this lack of cross modal interaction depends strongly on which token(s) are masked. To address this issue, we propose a curriculum masking scheme as a replacement for random masking. |
Kraig Tou; Zijun Sun; |
204 | Elote, Choclo and Mazorca: on The Varieties of Spanish Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate the situation, we compile and curate datasets in the different varieties of Spanish around the world at an unprecedented scale and create the CEREAL corpus. With such a resource at hand, we perform a stylistic analysis to identify and characterise varietal differences. |
Cristina España-Bonet; Alberto Barrón-Cedeño; |
205 | Ada-LEval: Evaluating Long-context LLMs with Length-adaptable Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Ada-LEval, a length-adaptable benchmark for evaluating the long-context understanding of LLMs. |
Chonghua Wang; Haodong Duan; Songyang Zhang; Dahua Lin; Kai Chen; |
206 | A Zero-Shot Monolingual Dual Stage Information Retrieval System for Spanish Biomedical Systematic Literature Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a foundational zero-shot dual information retrieval (IR) baseline system, integrating traditional retrieval methods with pre-trained language models and cross-attention re-rankers for enhanced accuracy in Spanish biomedical literature retrieval. |
Regina Ofori-Boateng; Magaly Aceves-Martins; Nirmalie Wiratunga; Carlos Moreno-Garcia; |
207 | LayoutPointer: A Spatial-Context Adaptive Pointer Network for Visual Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Firstly, most existing models inadequately utilize spatial information of entities, often failing to predict connections or incorrectly linking spatially distant entities. Secondly, the improper input order of tokens makes it challenging to extract complete entity pairs from documents with multi-line entities when text is extracted via a PDF parser or OCR. To address these challenges, we propose LayoutPointer, a Spatial-Context Adaptive Pointer Network. LayoutPointer explicitly enhances spatial-context relationships by incorporating 2D relative position information and adaptive spatial constraints within self-attention. |
Huang Siyuan; Yongping Xiong; Wu Guibin; |
208 | Long-form Evaluation of Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce long-form evaluation of model editing (LEME), a novel evaluation protocol that measures the efficacy and impact of model editing in long-form generative settings. |
Domenic Rosati; Robie Gonzales; Jinkun Chen; Xuemin Yu; Yahya Kayani; Frank Rudzicz; Hassan Sajjad; |
209 | Analyzing The Role of Semantic Representations in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the question: what is the role of semantic representations in the era of LLMs? |
Zhijing Jin; Yuen Chen; Fernando Gonzalez Adauto; Jiarui Liu; Jiayi Zhang; Julian Michael; Bernhard Schölkopf; Mona Diab; |
210 | TRAQ: Trustworthy Retrieval Augmented Question Answering Via Conformal Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Retrieval augmented generation (RAG) is a promising strategy to avoid hallucinations, but it does not provide guarantees on its correctness. To address this challenge, we propose the Trustworthy Retrieval Augmented Question Answering, or *TRAQ*, which provides the first end-to-end statistical correctness guarantee for RAG. |
Shuo Li; Sangdon Park; Insup Lee; Osbert Bastani; |
211 | MapGuide: A Simple Yet Effective Method to Reconstruct Continuous Language from Brain Activities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The previous best attempt reverse-engineered this process in an indirect way: it began by learning to encode brain activity from text and then guided text generation by aligning with predicted brain responses. In contrast, we propose a simple yet effective method that guides text reconstruction by directly comparing them with the predicted text embeddings mapped from brain activities. |
Xinpei Zhao; Jingyuan Sun; Shaonan Wang; Jing Ye; Xhz Xhz; Chengqing Zong; |
212 | On-the-fly Definition Augmentation of LLMs for Biomedical NER Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we set out to improve LLM performance on biomedical NER in limited data settings via a new knowledge augmentation approach which incorporates definitions of relevant concepts on-the-fly. |
Monica Munnangi; Sergey Feldman; Byron Wallace; Silvio Amir; Tom Hope; Aakanksha Naik; |
213 | This Land Is Your, My Land: Evaluating Geopolitical Bias in Language Models Through Territorial Disputes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that LLMs recall certain geographical knowledge inconsistently when queried in different languages – a phenomenon we term geopolitical bias. |
Bryan Li; Samar Haider; Chris Callison-Burch; |
214 | Set-Aligning Framework for Auto-Regressive Event Temporal Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This discrepancy stems from the conventional text generation objectives, leading to erroneous penalisation of correct predictions caused by the misalignment of elements in target sequences. To address these challenges, we reframe the task as a conditional set generation problem, proposing a Set-aligning Framework tailored for the effective utilisation of Large Language Models (LLMs). |
Xingwei Tan; Yuxiang Zhou; Gabriele Pergola; Yulan He; |
215 | LanguageFlow: Advancing Diffusion Language Generation with Probabilistic Flows Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes Language Rectified Flow (LF). |
Shujian Zhang; Lemeng Wu; Chengyue Gong; Xingchao Liu; |
216 | Towards Improved Multi-Source Attribution for Long-Form Answer Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite gaining increasing popularity for usage in QA systems and search engines, current LLMs struggle with attribution for long-form responses which require reasoning over multiple evidence sources. To address this, in this paper we aim to improve the attribution capability of LLMs for long-form answer generation to multiple sources, with multiple citations per sentence. |
Nilay Patel; Shivashankar Subramanian; Siddhant Garg; Pratyay Banerjee; Amita Misra; |
217 | Synthetic Query Generation for Privacy-Preserving Deep Retrieval Systems Using Differentially Private Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Training these systems often involves the use of contrastive-style losses, which are typically non-per-example decomposable, making them difficult to directly DP-train with since common techniques require per-example gradients. To address this issue, we propose an approach that prioritizes ensuring query privacy prior to training a deep retrieval system. |
Aldo Carranza; Rezsa Farahani; Natalia Ponomareva; Alexey Kurakin; Matthew Jagielski; Milad Nasr; |
218 | Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate using abductive free-text rationales (FTRs) generated by modern autoregressive LLMs as distant supervision of smaller student models for cross-document coreference (CDCR) of events. |
Abhijnan Nath; Shadi Manafi Avari; Avyakta Chelle; Nikhil Krishnaswamy; |
219 | Can Knowledge Graphs Reduce Hallucinations in LLMs? : A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Among these strategies, leveraging knowledge graphs as a source of external information has demonstrated promising results. In this survey, we comprehensively review these knowledge-graph-based augmentation techniques in LLMs, focusing on their efficacy in mitigating hallucinations. |
Garima Agrawal; Tharindu Kumarage; Zeyad Alghamdi; Huan Liu; |
220 | Pedagogically Aligned Objectives Create Reliable Automatic Cloze Tests Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we first formulate the pedagogically motivated objectives of plausibility, incorrectness, and distinctiveness in terms of conditional distributions from language models. Second, we present an unsupervised, interpretable method that uses these objectives to jointly optimize sets of distractors. Third, we test the reliability and validity of the resulting cloze tests compared to other methods with human participants. |
Brian Ondov; Kush Attal; Dina Demner-Fushman; |
221 | Take One Step at A Time to Know Incremental Utility of Demonstration: An Analysis on Reranking for Few-Shot In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike the previous work, we introduce a novel labeling method, incremental utility, which estimates how much incremental knowledge is brought into the LLMs by a demonstration. |
Kazuma Hashimoto; Karthik Raman; Michael Bendersky; |
222 | LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our theoretical analysis reveals that commonly used techniques like using a sliding-window attention pattern or relative positional encodings are inadequate to address them. Answering these challenges, we propose LM-Infinite, a simple and effective method for enhancing LLMs' capabilities of handling long contexts. |
Chi Han; Qifan Wang; Hao Peng; Wenhan Xiong; Yu Chen; Heng Ji; Sinong Wang; |
223 | CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, relying on commonly used, prompt-based guardrails can be difficult to engineer correctly and comprehensively. To address these challenges, we propose CONSCENDI. |
Albert Sun; Varun Nair; Elliot Schumacher; Anitha Kannan; |
224 | Advancing Beyond Identification: Multi-bit Watermark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While existing zero-bit watermark methods focus on detection only, some malicious misuses demand tracing the adversary user for counteracting them. To address this, we propose Multi-bit Watermark via Position Allocation, embedding traceable multi-bit information during language model generation. |
KiYoon Yoo; Wonhyuk Ahn; Nojun Kwak; |
225 | HTCCN: Temporal Causal Convolutional Networks with Hawkes Process for Extrapolation Reasoning in Temporal Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose HTCCN, a novel Hawkes process-based temporal causal convolutional network designed for temporal reasoning under extrapolation settings. |
Tingxuan Chen; Jun Long; Liu Yang; Zidong Wang; Yongheng Wang; Xiongnan Jin; |
226 | SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose SemStamp, a robust sentence-level semantic watermarking algorithm that uses locality-sensitive hashing (LSH) to partition the semantic space of sentences. |
Abe Hou; Jingyu Zhang; Tianxing He; Yichen Wang; Yung-Sung Chuang; Hongwei Wang; Lingfeng Shen; Benjamin Van Durme; Daniel Khashabi; Yulia Tsvetkov; |
227 | Media Bias Detection Across Families of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we ask how well prompting of large language models can recognize media bias. |
Iffat Maab; Edison Marrese-Taylor; Sebastian Padó; Yutaka Matsuo; |
228 | Better Zero-Shot Reasoning with Role-Play Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While these capabilities have enhanced user engagement and introduced novel modes of interaction, the influence of role-playing on LLMs' reasoning abilities remains underexplored. In this study, we introduce a strategically designed role-play prompting methodology and assess its performance under the zero-shot setting across twelve diverse reasoning benchmarks. |
Aobo Kong; Shiwan Zhao; Hao Chen; Qicheng Li; Yong Qin; Ruiqi Sun; Xin Zhou; Enzhi Wang; Xiaohang Dong; |
229 | Event-Content-Oriented Dialogue Generation in Short Video Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Meanwhile, we present a multi-modal dialogue generation method, VCD (Video Commentary Dialogue), to generate human-like responses according to the event contents in the video and related external knowledge. |
Fenghua Cheng; Xue Li; Zi Huang; Jinxiang Wang; Sen Wang; |
230 | DoG-Instruct: Towards Premium Instruction-Tuning Data Via Text-Grounded Instruction Wrapping Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unfortunately, the current methods used to collect the pairs suffer from either unaffordable labor costs or severe hallucinations in the self-generation of LLM. To tackle these challenges, this paper proposes a scalable solution. |
Yongrui Chen; Haiyun Jiang; Xinting Huang; Shuming Shi; Guilin Qi; |
231 | Beyond Borders: Investigating Cross-Jurisdiction Transfer in Legal Case Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the cross-jurisdictional generalizability of legal case summarization models. |
Santosh T.y.s.s; Vatsal Venkatkrishna; Saptarshi Ghosh; Matthias Grabmair; |
232 | EDC: Effective and Efficient Dialog Comprehension For Dialog State Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The two major categories of DST methods, sequential and independent methods, face trade-offs between accuracy and efficiency. To resolve this issue, we propose Effective and Efficient Dialog Comprehension (EDC), an alternative DST approach that leverages the tree structure of the dialog state. |
Qifan Lu; Bhaskar Ramasubramanian; Radha Poovendran; |
233 | Automatic Restoration of Diacritics for Speech Data Sets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore the possibility of improving the performance of automatic diacritic restoration when applied to speech data by utilizing parallel spoken utterances. |
Sara Shatnawi; Sawsan Alqahtani; Hanan Aldarmaki; |
234 | XNLIeu: A Dataset for Cross-lingual NLI in Basque Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we expand XNLI to include Basque, a low-resource language that can greatly benefit from transfer-learning approaches. |
Maite Heredia; Julen Etxaniz; Muitze Zulaika; Xabier Saralegi; Jeremy Barnes; Aitor Soroa; |
235 | MDR: Model-Specific Demonstration Retrieval at Inference Time for In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous approaches ignore the model bias and fail to retrieve the most appropriate demonstrations for different inference LLMs, resulting in a degradation of ICL performance. To address this problem, we propose a simple yet effective metric to evaluate the appropriateness of demonstrations for a specific inference LLM. |
Huazheng Wang; Jinming Wu; Haifeng Sun; Zixuan Xia; Daixuan Cheng; Jingyu Wang; Qi Qi; Jianxin Liao; |
236 | Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most hate speech datasets neglect the cultural diversity within a single language, resulting in a critical shortcoming in hate speech detection. To address this, we introduce CREHate, a CRoss-cultural English Hate speech dataset. |
Nayeon Lee; Chani Jung; Junho Myung; Jiho Jin; Jose Camacho-Collados; Juho Kim; Alice Oh; |
237 | Enhancing Contextual Understanding in Large Language Models Through Contrastive Decoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The study addresses the open question of how LLMs effectively balance these knowledge sources during the generation process, specifically in the context of open-domain question answering. To address this issue, we introduce a novel approach integrating contrastive decoding with adversarial irrelevant passages as negative samples to enhance robust context grounding during generation. |
Zheng Zhao; Emilio Monti; Jens Lehmann; Haytham Assem; |
238 | Generalizable Sarcasm Detection Is Just Around The Corner, of Course! Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared to the existing datasets, models fine-tuned on the new dataset we release in this work showed the highest generalizability to other datasets. |
Hyewon Jang; Diego Frassinelli; |
239 | Encoding of Lexical Tone in Self-supervised Models of Spoken Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to analyze the tone encoding capabilities of SLMs, using Mandarin and Vietnamese as case studies. |
Gaofei Shen; Michaela Watkins; Afra Alishahi; Arianna Bisazza; Grzegorz Chrupala; |
240 | A Systematic Comparison of Contextualized Word Embeddings for Lexical Semantic Change Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we evaluate state-of-the-art models and approaches for GCD under equal conditions. |
Francesco Periti; Nina Tahmasebi; |
241 | IACOS: Advancing Implicit Sentiment Extraction with Informative and Adaptive Negative Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a new method iACOS for extracting Implicit Aspects with Categories and Opinions with Sentiments. |
Xiancai Xu; Jia-Dong Zhang; Lei Xiong; Zhishang Liu; |
242 | Rectifying Demonstration Shortcut in In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, LLMs often rely on their pre-trained semantic priors of demonstrations rather than on the input-label relationships to proceed with ICL prediction. In this work, we term this phenomenon the "Demonstration Shortcut". |
Joonwon Jang; Sanghwan Jang; Wonbin Kweon; Minjin Jeon; Hwanjo Yu; |
243 | Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. |
Stephen Mayhew; Terra Blevins; Shuheng Liu; Marek Suppa; Hila Gonen; Joseph Marvin Imperial; Börje Karlsson; Peiqin Lin; Nikola Ljubešić; Lester James Miranda; Barbara Plank; Arij Riabi; Yuval Pinter; |
244 | ODD: A Benchmark Dataset for The Natural Language Processing Based Opioid Related Aberrant Behavior Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a novel biomedical natural language processing benchmark dataset named ODD, for ORAB Detection Dataset. |
Sunjae Kwon; Xun Wang; Weisong Liu; Emily Druhl; Minhee Sung; Joel Reisman; Wenjun Li; Robert Kerns; William Becker; Hong Yu; |
245 | A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a framework for measuring gender bias in chemical NER models using synthetic data and a newly annotated corpus of over 92,405 words with self-identified gender information from Reddit. |
Xingmeng Zhao; Ali Niazi; Anthony Rios; |
246 | The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Different from prior research that mostly focuses on low-inference instructional practices on a singular basis, this paper presents the first study that leverages Natural Language Processing (NLP) techniques to assess multiple high-inference instructional practices in two distinct educational settings: in-person K-12 classrooms and simulated performance tasks for pre-service teachers. |
Paiheng Xu; Jing Liu; Nathan Jones; Julie Cohen; Wei Ai; |
247 | Differentially Private Next-Token Prediction of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: On the other hand, commercial LLM deployments are predominantly cloud-based; hence, adversarial access to LLMs is black-box. Motivated by these observations, we present Private Mixing of Ensemble Distributions (PMixED): a private prediction protocol for next-token prediction that utilizes the inherent stochasticity of next-token sampling and a public model to achieve Differential Privacy. |
James Flemings; Meisam Razaviyayn; Murali Annavaram; |
248 | Improving Adversarial Data Collection By Supporting Annotators: Lessons from GAHD, A German Hate Speech Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce GAHD, a new German Adversarial Hate speech Dataset comprising ca. 11k examples. |
Janis Goldzycher; Paul Röttger; Gerold Schneider; |
249 | Memory Augmented Language Models Through Mixture of Word Experts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we seek to aggressively decouple learning capacity and FLOPs through Mixture-of-Experts (MoE) style models with large knowledge-rich vocabulary based routing functions. |
Cicero Nogueira dos Santos; James Lee-Thorp; Isaac Noble; Chung-Ching Chang; David Uthus; |
250 | Impossible Distillation for Paraphrasing and Summarization: How to Make High-quality Lemonade Out of Small, Low-quality Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Impossible Distillation, a novel framework for paraphrasing and sentence summarization, that distills a high-quality dataset and model from a low-quality teacher that itself cannot perform these tasks. |
Jaehun Jung; Peter West; Liwei Jiang; Faeze Brahman; Ximing Lu; Jillian Fisher; Taylor Sorensen; Yejin Choi; |
251 | TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new evaluation benchmark on topic-focused dialogue summarization, generated by LLMs of varying sizes. |
Liyan Tang; Igor Shalyminov; Amy Wong; Jon Burnsky; Jake Vincent; Yu'an Yang; Siffi Singh; Song Feng; Hwanjun Song; Hang Su; Lijia Sun; Yi Zhang; Saab Mansour; Kathleen McKeown; |
252 | MOKA: Moral Knowledge Augmentation for Moral Event Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, values that are reflected in the intricate dynamics among participating entities and moral events are far more challenging for most NLP systems to detect, including LLMs. To study this phenomenon, we annotate a new dataset, MORAL EVENTS, consisting of 5,494 structured event annotations on 474 news articles by diverse US media across the political spectrum. We further propose MOKA, a moral event extraction framework with MOral Knowledge Augmentation, which leverages knowledge derived from moral words and moral scenarios to produce structural representations of morality-bearing events. |
Xinliang Frederick Zhang; Winston Wu; Nicholas Beauchamp; Lu Wang; |
253 | Fixing Rogue Memorization in Many-to-One Multilingual Translators of Extremely-Low-Resource Languages By Rephrasing Training Samples Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study the fine-tuning of pre-trained large high-resource language models (LLMs) into many-to-one multilingual machine translators for extremely-low-resource languages such as endangered Indigenous languages. |
Paulo Cavalin; Pedro Henrique Domingues; Claudio Pinhanez; Julio Nogima; |
254 | Backdoor Attacks on Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This type of attack is of particular concern, given the larger attack surface of languages inherent to low-resource settings. Our aim is to bring attention to these vulnerabilities within MNMT systems with the hope of encouraging the community to address security concerns in machine translation, especially in the context of low-resource languages. |
Jun Wang; Qiongkai Xu; Xuanli He; Benjamin Rubinstein; Trevor Cohn; |
255 | Personalized Jargon Identification for Enhanced Interdisciplinary Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research offers insights into features and methods for the novel task of integrating personal data into scientific jargon identification. |
Yue Guo; Joseph Chee Chang; Maria Antoniak; Erin Bransom; Trevor Cohen; Lucy Wang; Tal August; |
256 | Flames: Benchmarking Value Alignment of LLMs in Chinese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, this paper proposes a value alignment benchmark named Flames, which encompasses both common harmlessness principles and a unique morality dimension that integrates specific Chinese values such as harmony. |
Kexin Huang; Xiangyang Liu; Qianyu Guo; Tianxiang Sun; Jiawei Sun; Yaru Wang; Zeyang Zhou; Yixu Wang; Yan Teng; Xipeng Qiu; Yingchun Wang; Dahua Lin; |
257 | Mitigating Bias for Question Answering Models By Tracking Bias Influence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose BMBI, an approach to mitigate the bias of multiple-choice QA models. |
Mingyu Ma; Jiun-Yu Kao; Arpit Gupta; Yu-Hsiang Lin; Wenbo Zhao; Tagyoung Chung; Wei Wang; Kai-Wei Chang; Nanyun Peng; |
258 | Extending CLIP's Image-Text Alignment to Referring Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose RISCLIP, a novel framework that effectively leverages the cross-modal nature of CLIP for RIS. |
Seoyeon Kim; Minguk Kang; Dongwon Kim; Jaesik Park; Suha Kwak; |
259 | Generating Attractive and Authentic Copywriting from Customer Reviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Typical approaches to copywriting generation often rely solely on specified product attributes, which may result in dull and repetitive content. To tackle this issue, we propose to generate copywriting based on customer reviews, as they provide firsthand practical experiences with products, offering a richer source of information than just product attributes. |
Yu-Xiang Lin; Wei-Yun Ma; |
260 | Effective Long-Context Scaling of Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present an effective recipe to train strong long-context LLMs that are capable of utilizing massive context windows of up to 32,000 tokens. |
Wenhan Xiong; Jingyu Liu; Igor Molybog; Hejia Zhang; Prajjwal Bhargava; Rui Hou; Louis Martin; Rashi Rungta; Karthik Abinav Sankararaman; Barlas Oguz; Madian Khabsa; Han Fang; Yashar Mehdad; Sharan Narang; Kshitiz Malik; Angela Fan; Shruti Bhosale; Sergey Edunov; Mike Lewis; Sinong Wang; Hao Ma; |
261 | Empowering Diffusion Models on The Embedding Space for Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conduct systematic studies of the optimization challenges encountered with both the embedding space and the denoising model, which have not been carefully explored. |
Zhujin Gao; Junliang Guo; Xu Tan; Yongxin Zhu; Fang Zhang; Jiang Bian; Linli Xu; |
262 | Aligning As Debiasing: Causality-Aware Alignment Via Reinforcement Learning with Interventional Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, to explore how biases are formed, we revisit LLMs' text generation from a causal perspective. |
Yu Xia; Tong Yu; Zhankui He; Handong Zhao; Julian McAuley; Shuai Li; |
263 | Fake Alignment: Are LLMs Really Aligned Well? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a Fake alIgNment Evaluation (FINE) framework and two novel metrics, Consistency Score (CS) and Consistent Safety Score (CSS), which jointly assess two complementary forms of evaluation to quantify fake alignment and obtain corrected performance estimation. |
Yixu Wang; Yan Teng; Kexin Huang; Chengqi Lyu; Songyang Zhang; Wenwei Zhang; Xingjun Ma; Yu-Gang Jiang; Yu Qiao; Yingchun Wang; |
264 | Visually Guided Generative Text-Layout Pre-training for Document Intelligence Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose visually guided generative text-layout pre-training, named ViTLP. |
Zhiming Mao; Haoli Bai; Lu Hou; Lifeng Shang; Xin Jiang; Qun Liu; Kam-Fai Wong; |
265 | HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the feasibility of a contrastive learning scheme in which the semantic and syntactic information inherent in the input sample is adequately preserved in the contrastive samples and fused during the learning process. |
He Zhu; Junran Wu; Ruomei Liu; Yue Hou; Ze Yuan; Shangzhe Li; Yicheng Pan; Ke Xu; |
266 | Investigating The Emergent Audio Classification Ability of ASR Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we investigate the ability of Whisper and MMS, ASR foundation models trained primarily for speech recognition, to perform zero-shot audio classification. |
Rao Ma; Adian Liusie; Mark Gales; Kate Knill; |
267 | In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Do models guided via ICL infer the underlying structure of the task defined by the context, or do they rely on superficial heuristics that only generalize to identically distributed examples? We address this question using transformation tasks and an NLI task that assess sensitivity to syntax, a requirement for robust language understanding. |
Aaron Mueller; Albert Webson; Jackson Petty; Tal Linzen; |
268 | Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Prompt-Singer, the first SVS method that enables attribute controlling on singer gender, vocal range and volume with natural language. |
Yongqi Wang; Ruofan Hu; Rongjie Huang; Zhiqing Hong; Ruiqi Li; Wenrui Liu; Fuming You; Tao Jin; Zhou Zhao; |
269 | Lost in Transcription: Identifying and Quantifying The Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates six leading ASRs, analyzing their performance on both a real-world dataset of speech samples from individuals who stutter and a synthetic dataset derived from the widely-used LibriSpeech benchmark. |
Dena Mujtaba; Nihar Mahapatra; Megan Arney; J Yaruss; Hope Gerlach-Houck; Caryn Herring; Jia Bin; |
270 | MAFALDA: A Benchmark and Comprehensive Study of Fallacy Detection and Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new annotation scheme tailored for subjective NLP tasks, and a new evaluation method designed to handle subjectivity. |
Chadi Helwe; Tom Calamai; Pierre-Henri Paris; Chloé Clavel; Fabian Suchanek; |
271 | Diffusion Glancing Transformer for Parallel Sequence-to-Sequence Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the multi-modality modeling ability, we propose the diffusion glancing transformer, which employs a modality diffusion process and residual glancing sampling. |
Lihua Qian; Mingxuan Wang; Yang Liu; Hao Zhou; |
272 | No Context Needed: Contextual Quandary In Idiomatic Reasoning With Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the utilization of said context for idiomatic reasoning tasks, which is under-explored relative to arithmetic or commonsense reasoning (Liu et al., 2022; Yu et al., 2023). |
Kellen Cheng; Suma Bhat; |
273 | Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage a multi-stage "retrieve and re-rank" framework as a novel solution to ICD indexing, via a hybrid discrete retrieval method, and re-rank retrieved candidates with contrastive learning that allows the model to make more accurate predictions from a simplified label space. |
Xindi Wang; Robert Mercer; Frank Rudzicz; |
274 | Anisotropy Is Not Inherent to Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we identify a set of Transformer models with isotropic embedding spaces, the large Pythia models. |
Anemily Machina; Robert Mercer; |
275 | Finding Replicable Human Evaluations Via Stable Ranking Probability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Without it, there is no reliable foundation for hill-climbing or product launch decisions. In this paper, we use machine translation and its state-of-the-art human evaluation framework, MQM, as a case study to understand how to set up reliable human evaluations that yield stable conclusions. |
Parker Riley; Daniel Deutsch; George Foster; Viresh Ratnakar; Ali Dabirmoghaddam; Markus Freitag; |
276 | Stealthy and Persistent Unalignment on Large Language Models Via Backdoor Injections Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show that it is possible to conduct stealthy and persistent unalignment on large language models via backdoor injections. |
Yuanpu Cao; Bochuan Cao; Jinghui Chen; |
277 | Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they rely on a suboptimal criterion for sub-network selection, leading to suboptimal solutions. To address these limitations, we propose a regularization method based on attention-guided weight mixup for finetuning PLMs. |
Sai Ashish Somayajula; Youwei Liang; Li Zhang; Abhishek Singh; Pengtao Xie; |
278 | Detecting Bipolar Disorder from Misdiagnosed Major Depressive Disorder with Mood-Aware Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While early intervention based on social media data has been explored to uncover latent BD risk, little attention has been paid to detecting BD from those misdiagnosed as MDD. Therefore, this study presents a novel approach for identifying BD risk in individuals initially misdiagnosed with MDD. |
Daeun Lee; Hyolim Jeon; Sejung Son; Chaewon Park; Ji hyun An; Seungbae Kim; Jinyoung Han; |
279 | Leveraging Code to Improve In-Context Learning for Semantic Parsing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show how pre-existing coding abilities of LLMs can be leveraged for semantic parsing by (1) using general-purpose programming languages such as Python instead of DSLs and (2) augmenting prompts with a structured domain description that includes, e.g., the available classes and functions. |
Ben Bogin; Shivanshu Gupta; Peter Clark; Ashish Sabharwal; |
280 | Improving Pre-trained Language Model Sensitivity Via Mask Specific Losses: A Case Study on Biomedical NER Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address insensitive fine-tuning, we propose Mask Specific Language Modeling (MSLM), an approach that efficiently acquires target domain knowledge by appropriately weighting the importance of domain-specific terms (DS-terms) during fine-tuning. |
Micheal Abaho; Danushka Bollegala; Gary Leeming; Dan Joyce; Iain Buchan; |
281 | Language Models Implement Simple Word2Vec-style Vector Arithmetic Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents evidence that, despite their size and complexity, LMs sometimes exploit a simple vector arithmetic style mechanism to solve some relational tasks using regularities encoded in the hidden space of the model (e.g., Poland:Warsaw::China:Beijing). |
Jack Merullo; Carsten Eickhoff; Ellie Pavlick; |
282 | AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, LoRA's uniform rank assignment across all layers, along with its reliance on an exhaustive search to find the best rank, leads to high computation costs and suboptimal finetuning performance. To address these limitations, we introduce AutoLoRA, a meta learning based framework for automatically identifying the optimal rank of each LoRA layer. |
Ruiyi Zhang; Rushi Qiang; Sai Ashish Somayajula; Pengtao Xie; |
283 | SportQA: A Benchmark for Sports Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This holds particular significance in the context of evaluating and advancing Large Language Models (LLMs), given the existing gap in specialized benchmarks. To bridge this gap, we introduce SportQA, a novel benchmark specifically designed for evaluating LLMs in the context of sports understanding. |
Haotian Xia; Zhengbang Yang; Yuqing Wang; Rhys Tracy; Yun Zhao; Dongdong Huang; Zezhi Chen; Yan Zhu; Yuan-fang Wang; Weining Shen; |
284 | Revisiting Subword Tokenization: A Case Study on Affixal Negation in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we measure the impact of affixal negation on modern English large language models (LLMs). |
Thinh Truong; Yulia Otmakhova; Karin Verspoor; Trevor Cohn; Timothy Baldwin; |
285 | Generating Mental Health Transcripts with SAPE (Spanish Adaptive Prompt Engineering) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present SAPE, a Spanish Adaptive Prompt Engineering method utilizing genetic algorithms for prompt generation and selection. |
Daniel Lozoya; Alejandro Berazaluce; Juan Perches; Eloy Lúa; Mike Conway; Simon D'Alfonso; |
286 | Where Are You From? Geolocating Speech and Applications to Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We train models to answer the question, "Where are you from?" |
Patrick Foley; Matthew Wiesner; Bismarck Odoom; Leibny Paola Garcia Perera; Kenton Murray; Philipp Koehn; |
287 | Teaching Language Models to Self-Improve Through Interactive Demonstrations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this ability has been shown to be absent and difficult to learn for smaller models, thus widening the performance gap between state-of-the-art LLMs and more cost-effective and faster ones. To reduce this gap, we introduce TriPosT, a training algorithm that endows smaller models with such self-improvement ability, and show that our approach can improve LLaMA-7B's performance on math and reasoning tasks by up to 7. |
Xiao Yu; Baolin Peng; Michel Galley; Jianfeng Gao; Zhou Yu; |
288 | MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Multimodal Augmented Generative Images Dialogues (MAGID), a framework to augment text-only dialogues with diverse and high-quality images. |
Hossein Aboutalebi; Hwanjun Song; Yusheng Xie; Arshit Gupta; Lijia Sun; Hang Su; Igor Shalyminov; Nikolaos Pappas; Siffi Singh; Saab Mansour; |
289 | Zero-shot Generative Linguistic Steganography Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel zero-shot approach based on in-context learning for linguistic steganography to achieve better perceptual and statistical imperceptibility. |
Ke Lin; Yiyang Luo; Zijian Zhang; Luo Ping; |
290 | Does GPT-4 Pass The Turing Test? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness. |
Cameron Jones; Ben Bergen; |
291 | Polarity Calibration for Opinion Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We conduct an analysis of previous summarization models, which reveals their inclination to amplify the polarity bias, emphasizing the majority opinions while ignoring the minority opinions. To address this issue and make the summarizer express both sides of opinions, we introduce the concept of polarity calibration, which aims to align the polarity of output summary with that of input text. |
Yuanyuan Lei; Kaiqiang Song; Sangwoo Cho; Xiaoyang Wang; Ruihong Huang; Dong Yu; |
292 | Sentence-level Media Bias Analysis with Event Relation Graph Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we identify media bias at the sentence level, and pinpoint bias sentences that intend to sway readers' opinions. |
Yuanyuan Lei; Ruihong Huang; |
293 | EMONA: Event-level Moral Opinions in News Articles Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper initiates a new task to understand moral opinions towards events in news articles. |
Yuanyuan Lei; Md Messal Monem Miah; Ayesha Qamar; Sai Ramana Reddy; Jonathan Tong; Haotian Xu; Ruihong Huang; |
294 | DLM: A Decoupled Learning Model for Long-tailed Polyphone Disambiguation in Mandarin Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel model DLM: a Decoupled Learning Model for long-tailed polyphone disambiguation in Mandarin. |
Beibei Gao; Yangsen Zhang; Ga Xiang; Yushan Jiang; |
295 | You Don't Need A Personality Test to Know These Models Are Unreliable: Assessing The Reliability of Large Language Models on Psychometric Instruments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we take a cautionary step back and examine whether the current format of prompting LLMs elicits responses in a consistent and robust manner. |
Bangzhao Shu; Lechen Zhang; Minje Choi; Lavinia Dunagan; Lajanugen Logeswaran; Moontae Lee; Dallas Card; David Jurgens; |
296 | CASA: Causality-driven Argument Sufficiency Assessment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: PS measures how likely introducing the premise event would lead to the conclusion when both the premise and conclusion events are absent. To estimate this probability, we propose to use large language models (LLMs) to generate contexts that are inconsistent with the premise and conclusion and revise them by injecting the premise event. |
Xiao Liu; Yansong Feng; Kai-Wei Chang; |
297 | MacGyver: Are Large Language Models Creative Problem Solvers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work (1) introduces a fresh arena for intelligent agents focusing on intricate aspects of physical reasoning, planning, and unconventional thinking, which supplements the existing spectrum of machine intelligence; and (2) provides insight into the constrained problem-solving capabilities of both humans and AI. |
Yufei Tian; Abhilasha Ravichander; Lianhui Qin; Ronan Le Bras; Raja Marjieh; Nanyun Peng; Yejin Choi; Thomas Griffiths; Faeze Brahman; |
298 | To Translate or Not to Translate: A Systematic Investigation of Translation-Based Cross-Lingual Transfer to Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given, on the one hand, the large body of work on improving XLT with mLMs and, on the other hand, recent advances in massively multilingual MT, in this work, we systematically evaluate existing and propose new translation-based XLT approaches for transfer to low-resource languages. |
Benedikt Ebing; Goran Glavaš; |
299 | Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to reveal the behaviors of LLMs towards inductive instructions and enhance their truthfulness and helpfulness accordingly. |
Rui Wang; Hongru Wang; Fei Mi; Boyang Xue; Yi Chen; Kam-Fai Wong; Ruifeng Xu; |
300 | GLiNER: Generalist Model for Named Entity Recognition Using Bidirectional Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a compact NER model trained to identify any type of entity. |
Urchade Zaratiana; Nadi Tomeh; Pierre Holat; Thierry Charnois; |
301 | XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent anecdotal evidence suggests that some models may have struck a poor balance, so that even clearly safe prompts are refused if they use similar language to unsafe prompts or mention sensitive topics. In this paper, we introduce a new test suite called XSTest to identify such eXaggerated Safety behaviours in a systematic way. |
Paul Röttger; Hannah Kirk; Bertie Vidgen; Giuseppe Attanasio; Federico Bianchi; Dirk Hovy; |
302 | Carpe Diem: On The Evaluation of World Knowledge in Lifelong Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our work aims to model the dynamic nature of real-world information, suggesting faithful evaluations of the evolution-adaptability of language models. |
Yujin Kim; Jaehong Yoon; Seonghyeon Ye; Sangmin Bae; Namgyu Ho; Sung Ju Hwang; Se-Young Yun; |
303 | Fine-grained Gender Control in Machine Translation with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle controlled translation in a more realistic setting of inputs with multiple entities and propose Gender-of-Entity (GoE) prompting method for LLMs. |
Minwoo Lee; Hyukhun Koh; Minsung Kim; Kyomin Jung; |
304 | DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We formulate the intent detection with imperfect data in the system update as a multi-label classification task with positive but unlabeled intents, which asks the models to recognize all the proper intents, including the ones with semantic entanglement, during inference. |
Zefan Cai; Xin Zheng; Tianyu Liu; Haoran Meng; Jiaqi Han; Gang Yuan; Binghuai Lin; Baobao Chang; Yunbo Cao; |
305 | LLatrieval: LLM-Verified Retrieval for Verifiable Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: If the retriever does not correctly find the supporting documents, the LLM cannot generate the correct and verifiable answer, which overshadows the LLM's remarkable abilities. To address these limitations, we propose LLatrieval (Large Language Model Verified Retrieval), where the LLM updates the retrieval result until it verifies that the retrieved documents can sufficiently support answering the question. |
Xiaonan Li; Changtai Zhu; Linyang Li; Zhangyue Yin; Tianxiang Sun; Xipeng Qiu; |
306 | Mapping Long-term Causalities in Psychiatric Symptomatology and Life Events from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the causality between psychiatric symptoms and life events, as well as among different symptoms from social media posts, which leads to better understanding of the underlying mechanisms of mental disorders. |
Siyuan Chen; Meilin Wang; Minghao Lv; Zhiling Zhang; Juqianqian Juqianqian; Dejiyangla Dejiyangla; Yujia Peng; Kenny Zhu; Mengyue Wu; |
307 | Multimodal Chart Retrieval: A Comparison of Text, Table and Image Based Approaches Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As the table retrieval component, we introduce Tab-GTR, a text retrieval model augmented with table structure embeddings, achieving state-of-the-art results on the NQ-Tables benchmark with 48. |
Averi Nowak; Francesco Piccinno; Yasemin Altun; |
308 | Retrieval Helps or Hurts? A Deeper Dive Into The Efficacy of Retrieval Augmentation to Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, our goal is to offer a more detailed, fact-centric analysis by exploring the effects of combinations of entities and relations. |
Seiji Maekawa; Hayate Iso; Sairam Gurajada; Nikita Bhutani; |
309 | AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we extend the instruction-tuned Llama-2 model with end-to-end general-purpose speech processing and reasoning abilities while maintaining the wide range of original LLM capabilities, without using any carefully curated paired data. |
Yassir Fathullah; Chunyang Wu; Egor Lakomkin; Ke Li; Junteng Jia; Yuan Shangguan; Jay Mahadeokar; Ozlem Kalinli; Christian Fuegen; Mike Seltzer; |
310 | Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Do larger and more performant models resolve NLP's longstanding robustness issues? We investigate this question using over 20 models of different sizes spanning different architectural choices and pretraining objectives. |
Ashim Gupta; Rishanth Rajendhran; Nathan Stringham; Vivek Srikumar; Ana Marasovic; |
311 | Sequential Compositional Generalization in Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, a pressing question that remains is their genuine capability for stronger forms of generalization, which has been largely underexplored in the multimodal setting. Our study aims to address this by examining sequential compositional generalization using CompAct (Compositional Activities), a carefully constructed, perceptually grounded dataset set within a rich backdrop of egocentric kitchen activity videos. |
Semih Yagcioglu; Osman Batur Ince; Aykut Erdem; Erkut Erdem; Desmond Elliott; Deniz Yuret; |
312 | Generating Uncontextualized and Contextualized Questions for Document-Level Event Argument Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents multiple question generation strategies for document-level event argument extraction. |
Md Nayem Uddin; Enfa George; Eduardo Blanco; Steven Corman; |
313 | Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose retrieval augmented response generation for online misinformation (RARG), which collects supporting evidence from scientific sources and generates counter-misinformation responses based on the evidence. |
Zhenrui Yue; Huimin Zeng; Yimeng Lu; Lanyu Shang; Yang Zhang; Dong Wang; |
314 | Open-Vocabulary Federated Learning with Multimodal Prototyping Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A new user could come up with queries that involve data from unseen classes, and such open-vocabulary queries would directly defect such FL systems. Therefore, in this work, we explicitly focus on the under-explored open-vocabulary challenge in FL. |
Huimin Zeng; Zhenrui Yue; Dong Wang; |
315 | Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, it only models the intra-cluster relationship among arguments, disregarding the inter-cluster relationship between arguments that do not share key points. To address these limitations, we propose a novel approach for KPA with pairwise generation and graph partitioning. |
Xiao Li; Yong Jiang; Shen Huang; Pengjun Xie; Gong Cheng; Fei Huang; |
316 | Understanding The Capabilities and Limitations of Large Language Models for Cultural Commonsense Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct a comprehensive examination of the capabilities and limitations of several state-of-the-art LLMs in the context of cultural commonsense tasks. |
Siqi Shen; Lajanugen Logeswaran; Moontae Lee; Honglak Lee; Soujanya Poria; Rada Mihalcea; |
317 | Code Models Are Zero-shot Precondition Reasoners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging code representations, we extract action preconditions from demonstration trajectories in a zero-shot manner using pre-trained code models. Given these extracted preconditions, we propose a precondition-aware action sampling strategy that ensures actions predicted by a policy are consistent with preconditions. |
Lajanugen Logeswaran; Sungryull Sohn; Yiwei Lyu; Anthony Liu; Dong-Ki Kim; Dongsub Shim; Moontae Lee; Honglak Lee; |
318 | Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a two-stage method, Contrastive and Consistency Learning (CCL), that correlates error patterns between clean and noisy ASR transcripts and emphasizes the consistency of the latent features of the two transcripts. |
Suyoung Kim; Jiyeon Hwang; Ho-Young Jung; |
319 | Do Large Language Models Rank Fairly? An Empirical Study on The Fairness of LLMs As Rankers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their fairness remains largely unexplored. This paper presents an empirical study evaluating these LLMs using the TREC Fair Ranking dataset, focusing on the representation of binary protected attributes such as gender and geographic location, which are historically underrepresented in search outcomes. |
Yuan Wang; Xuyang Wu; Hsin-Tai Wu; Zhiqiang Tao; Yi Fang; |
320 | TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose TabSQLify, a novel method that leverages text-to-SQL generation to decompose tables into smaller and relevant sub-tables, containing only essential information for answering questions or verifying statements, before performing the reasoning task. |
Md Nahid; Davood Rafiei; |
321 | Contextual Label Projection for Cross-Lingual Structured Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel label projection approach, CLaP, which translates text to the target language and performs contextual translation on the labels using the translated text as the context, ensuring better accuracy for the translated labels. |
Tanmay Parekh; I-Hung Hsu; Kuan-Hao Huang; Kai-Wei Chang; Nanyun Peng; |
322 | Event Detection from Social Media for Epidemic Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In our work, we pioneer exploiting Event Detection (ED) for better preparedness and early warnings of any upcoming epidemic by developing a framework to extract and analyze epidemic-related events from social media posts. |
Tanmay Parekh; Anh Mac; Jiarui Yu; Yuxuan Dong; Syed Shahriar; Bonnie Liu; Eric Yang; Kuan-Hao Huang; Wei Wang; Nanyun Peng; Kai-Wei Chang; |
323 | RESPROMPT: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The almost linear structure of CoT, however, struggles to capture this complex reasoning graph. To address this challenge, we propose Residual Connection Prompting (ResPrompt), a new prompting strategy that advances multi-step reasoning in LLMs. |
Song Jiang; Zahra Shakeri; Aaron Chan; Maziar Sanjabi; Hamed Firooz; Yinglong Xia; Bugra Akyildiz; Yizhou Sun; Jinchao Li; Qifan Wang; Asli Celikyilmaz; |
324 | BPE-knockout: Pruning Pre-existing BPE Tokenisers with Backwards-compatible Morphological Semi-supervision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This, in turn, causes consistent intra-word patterns to be displayed inconsistently to downstream models, and bloats the vocabulary, hence requiring unnecessary embedding storage. In this paper, we address this issue by identifying blameworthy BPE merges and removing the resulting subwords from the BPE vocabulary, without impeding further use of merges that relied on them. |
Thomas Bauwens; Pieter Delobelle; |
325 | How Are Prompts Different in Terms of Sensitivity? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While there is a growing body of literature focusing on prompt engineering, there is a lack of systematic analysis comparing the effects of prompt techniques across different models and tasks. To address this, we present a comprehensive prompt analysis based on sensitivity. |
Sheng Lu; Hendrik Schuff; Iryna Gurevych; |
326 | LSTDial: Enhancing Dialogue Generation Via Long- and Short-Term Measurement Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, even though there exist a variety of quality dimensions especially designed for dialogue evaluation (e.g., coherence and diversity scores), current dialogue systems rarely utilize them to guide the response generation during training. To alleviate this issue, we propose LSTDial (Long- and Short-Term Dialogue), a novel two-stage framework which generates and utilizes conversation evaluation as explicit feedback during training. |
Guanghui Ye; Huan Zhao; Zixing Zhang; Xupeng Zha; Zhihua Jiang; |
327 | The ART of LLM Refinement: Ask, Refine, and Trust Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent empirical evidence points in the opposite direction, suggesting that LLMs often struggle to accurately identify errors when reasoning is involved. To address this, we propose a reasoning-with-refinement strategy called ART: Ask, Refine, and Trust, which asks necessary questions to decide when an LLM should refine its output, and uses it to affirm or deny trust in its refinement by ranking the refinement and the initial prediction. |
Kumar Shridhar; Koustuv Sinha; Andrew Cohen; Tianlu Wang; Ping Yu; Ramakanth Pasunuru; Mrinmaya Sachan; Jason Weston; Asli Celikyilmaz; |
328 | Modularized Multilingual NMT with Fine-grained Interlingua Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, it should be noted that this sharing structure does not guarantee the explicit propagation of language-specific features to their respective language-specific decoders. Consequently, to overcome this challenge, we present our modularized MNMT approach, where a modularized encoder is divided into three distinct encoder modules based on different sharing criteria: (1) source language-specific (Enc_s); (2) universal (Enc_all); (3) target language-specific (Enc_t). |
Sungjun Lim; Yoonjung Choi; Sangha Kim; |
329 | ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we design a data generation pipeline, ParallelPARC (Parallel Paragraph Creator) leveraging state-of-the-art Large Language Models (LLMs) to create complex, paragraph-based analogies, as well as distractors, both simple and challenging. |
Oren Sultan; Yonatan Bitton; Ron Yosef; Dafna Shahaf; |
330 | AWESOME: GPU Memory-constrained Long Document Summarization Using Memory Mechanism and Global Salient Content Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work aims to leverage the memory-efficient nature of divide-and-conquer methods while preserving global context. |
Shuyang Cao; Lu Wang; |
331 | NLP Systems That Can't Tell Use from Mention Censor Counterspeech, But Teaching The Distinction Helps Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that even recent language models fail at distinguishing use from mention, and that this failure propagates to two key downstream tasks: misinformation and hate speech detection, resulting in censorship of counterspeech. We introduce prompting mitigations that teach the use-mention distinction, and show they reduce these errors. |
Kristina Gligoric; Myra Cheng; Lucia Zheng; Esin Durmus; Dan Jurafsky; |
332 | Debiasing with Sufficient Projection: A General Theoretical Framework for Vector Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Identifying and removing unwanted biased information from vector representation is an evolving and significant challenge. Our study uniquely addresses this issue from the perspective of statistical independence, proposing a framework for reducing bias by transforming vector representations to an unbiased subspace using sufficient projection. |
Enze Shi; Lei Ding; Linglong Kong; Bei Jiang; |
333 | Semi-Supervised Dialogue Abstractive Summarization Via High-Quality Pseudolabel Selection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel scoring approach, SiCF, which encapsulates three primary dimensions of summarization model quality: Semantic invariance (indicative of model confidence), Coverage (factual recall), and Faithfulness (factual precision). |
Jianfeng He; Hang Su; Jason Cai; Igor Shalyminov; Hwanjun Song; Saab Mansour; |
334 | AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. |
Jiayi Wang; David Adelani; Sweta Agrawal; Marek Masiak; Ricardo Rei; Eleftheria Briakou; Marine Carpuat; Xuanli He; Sofia Bourhim; Andiswa Bukula; Muhidin Mohamed; Temitayo Olatoye; Tosin Adewumi; Hamam Mokayed; Christine Mwase; Wangui Kimotho; Foutse Yuehgoh; Anuoluwapo Aremu; Jessica Ojo; Shamsuddeen Muhammad; Salomey Osei; Abdul-Hakeem Omotayo; Chiamaka Chukwuneke; Perez Ogayo; Oumaima Hourrane; Salma El Anigri; Lolwethu Ndolela; Thabiso Mangwana; Shafie Mohamed; Hassan Ayinde; Oluwabusayo Awoyomi; Lama Alkhaled; Sana Al-azzawi; Naome Etori; Millicent Ochieng; Clemencia Siro; Njoroge Kiragu; Eric Muchiri; Wangari Kimotho; Toadoum Sari Sakayo; Lyse Naomi Wamba; Daud Abolade; Simbiat Ajao; Iyanuoluwa Shode; Ricky Macharm; Ruqayya Iro; Saheed Abdullahi; Stephen Moore; Bernard Opoku; Zainab Akinjobi; Abeeb Afolabi; Nnaemeka Obiefuna; Onyekachi Ogbu; Sam Ochieng'; Verrah Otiende; Chinedu Mbonu; Yao Lu; Pontus Stenetorp; |
335 | TableLlama: Towards Open Large Generalist Models for Tables Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper makes the first step towards developing open-source large language models (LLMs) as generalists for a diversity of table-based tasks. |
Tianshu Zhang; Xiang Yue; Yifei Li; Huan Sun; |
336 | PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To overcome the limitations, we introduce Plug-in External Memory Adaptation (PEMA), a Parameter-Efficient Fine-Tuning (PEFT) method, enabling PLM fine-tuning without requiring access to all the weights. |
HyunJin Kim; Young Jin Kim; JinYeong Bak; |
337 | Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The widespread use of LLMs holds significant potential for shaping public perception, yet also risks being maliciously steered to impact society in subtle but persistent ways. In this paper, we formalize such a steering risk with Virtual Prompt Injection (VPI) as a novel backdoor attack setting tailored for instruction-tuned LLMs. |
Jun Yan; Vikas Yadav; Shiyang Li; Lichang Chen; Zheng Tang; Hai Wang; Vijay Srinivasan; Xiang Ren; Hongxia Jin; |
338 | Exploring The Factual Consistency in Dialogue Comprehension of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose to perform the evaluation focusing on the factual consistency issue with the help of the dialogue summarization task. |
Shuaijie She; Shujian Huang; Xingyun Wang; Yanke Zhou; Jiajun Chen; |
339 | Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only Shallowly Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose CLiKA, a systematic framework to assess the cross-lingual knowledge alignment of LLMs at the Performance, Consistency and Conductivity levels, and explore the effect of multilingual pretraining and instruction tuning on the degree of alignment. |
Changjiang Gao; Hongda Hu; Peng Hu; Jiajun Chen; Jixing Li; Shujian Huang; |
340 | A Study on The Calibration of In-context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study in-context learning (ICL), a prevalent method for adapting static LMs through tailored prompts, and examine the balance between performance and calibration across a broad spectrum of natural language understanding and reasoning tasks. |
Hanlin Zhang; YiFan Zhang; Yaodong Yu; Dhruv Madeka; Dean Foster; Eric Xing; Himabindu Lakkaraju; Sham Kakade; |
341 | DialogBench: Evaluating LLMs As Human-like Dialogue Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose DialogBench, a dialogue evaluation benchmark that contains 12 dialogue tasks to probe the capabilities that LLMs should have as human-like dialogue systems. |
Jiao Ou; Junda Lu; Che Liu; Yihong Tang; Fuzheng Zhang; Di Zhang; Kun Gai; |
342 | GINopic: Topic Modeling with Graph Isomorphism Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce GINopic, a topic modeling framework based on graph isomorphism networks to capture the correlation between words. |
Suman Adhya; Debarshi Sanyal; |
343 | CMB: A Comprehensive Medical Benchmark in Chinese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To solve the issue, we propose a localized medical benchmark called CMB, a Comprehensive Medical Benchmark in Chinese, designed and rooted entirely within the native Chinese linguistic and cultural framework. |
Xidong Wang; Guiming Chen; Song Dingjie; Zhang Zhiyi; Zhihong Chen; Qingying Xiao; Junying Chen; Feng Jiang; Jianquan Li; Xiang Wan; Benyou Wang; Haizhou Li; |
344 | Massive End-to-end Speech Recognition Models with Time Reduction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate massive end-to-end automatic speech recognition (ASR) models with efficiency improvements achieved by time reduction. |
Weiran Wang; Rohit Prabhavalkar; Haozhe Shan; Zhong Meng; Dongseong Hwang; Qiujia Li; Khe Chai Sim; Bo Li; James Qin; Xingyu Cai; Adam Stooke; Chengjian Zheng; Yanzhang He; Tara Sainath; Pedro Moreno Mengibar; |
345 | SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models are extremely memory intensive during their fine-tuning process, making them difficult to deploy on GPUs with limited memory resources. To address this issue, we introduce a new tool called SlimFit that reduces the memory requirements of these models by dynamically analyzing their training dynamics and freezing less-contributory layers during fine-tuning. |
Arash Ardakani; Altan Haan; Shangyin Tan; Doru Thom Popovici; Alvin Cheung; Costin Iancu; Koushik Sen; |
346 | Effective Large Language Model Adaptation for Improved Grounding and Citation Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new framework, AGREE, Adaptation for GRounding EnhancEment, that improves the grounding from a holistic perspective. |
Xi Ye; Ruoxi Sun; Sercan Arik; Tomas Pfister; |
347 | Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose STORM, a writing system for the Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking. |
Yijia Shao; Yucheng Jiang; Theodore Kanell; Peter Xu; Omar Khattab; Monica Lam; |
348 | Grounding Gaps in Language Model Generations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we curate a set of grounding acts and propose corresponding metrics that quantify attempted grounding. |
Omar Shaikh; Kristina Gligoric; Ashna Khetan; Matthias Gerstgrasser; Diyi Yang; Dan Jurafsky; |
349 | When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the literature offers conflicting results on the performance of different methods of including monolingual data. To resolve this, we examine how denoising autoencoding (DAE) and backtranslation (BT) impact MMT under different data conditions and model scales. |
Christos Baziotis; Biao Zhang; Alexandra Birch; Barry Haddow; |
350 | ContraSim – Analyzing Neural Representations Based on Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a new similarity measure, dubbed ContraSim, based on contrastive learning. |
Adir Rahamim; Yonatan Belinkov; |
351 | Universal Prompt Optimizer for Safe Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose the first universal **p**rompt **o**ptimizer for **s**afe T2**I** (**POSI**) generation in a black-box scenario. |
Zongyu Wu; Hongcheng Gao; Yueze Wang; Xiang Zhang; Suhang Wang; |
352 | Language Model Based Unsupervised Dependency Parsing with Conditional Mutual Information and Grammatical Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we apply Conditional Mutual Information (CMI), an interpretable metric, to measure the bi-lexical dependence and incorporate grammatical constraints into LLM-based unsupervised parsing. |
Junjie Chen; Xiangheng He; Yusuke Miyao; |
353 | The Bias Amplification Paradox in Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study bias amplification in the text-to-image domain using Stable Diffusion by comparing gender ratios in training vs. generated images. |
Preethi Seshadri; Sameer Singh; Yanai Elazar; |
354 | Grammar-based Data Augmentation for Low-Resource Languages: The Case of Guarani-Spanish Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One of the main problems low-resource languages face in NLP can be pictured as a vicious circle: data is needed to build and test tools, but the available text is scarce and there are no powerful tools to collect it. In order to break this circle for Guarani, we explore whether text automatically generated from a grammar can work as a Data Augmentation technique to boost the performance of Guarani-Spanish Machine Translation (MT) systems. |
Agustín Lucas; Alexis Baladón; Victoria Pardiñas; Marvin Agüero-Torales; Santiago Góngora; Luis Chiruzzo; |
355 | Global Gallery: The Fine Art of Painting Culture Portraits Through Multilingual Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Exploring the intersection of language and culture in Large Language Models (LLMs), this study critically examines their capability to encapsulate cultural nuances across diverse linguistic landscapes. |
Anjishnu Mukherjee; Aylin Caliskan; Ziwei Zhu; Antonios Anastasopoulos; |
356 | Toward Interactive Regional Understanding in Vision-Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, these models heavily rely on image-text pairs that capture only coarse and global information of an image, leading to a limitation in their regional understanding ability. In this work, we introduce RegionVLM, equipped with explicit regional modeling capabilities, allowing it to understand user-indicated image regions. |
Jungbeom Lee; Sanghyuk Chun; Sangdoo Yun; |
357 | ScriptMix: Mixing Scripts for Low-resource Language Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To understand why, we identify both the complementary strengths of the two and the hurdles to realizing them. Based on this observation, we propose ScriptMix, which combines the two strengths while overcoming the hurdles. |
Jaeseong Lee; Dohyeon Lee; Seung-won Hwang; |
358 | MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a framework called MT-Patcher, which transfers knowledge from LLMs to existing MT models in a selective, comprehensive and proactive manner. |
Jiahuan Li; Shanbo Cheng; Shujian Huang; Jiajun Chen; |
359 | ToXCL: A Unified Framework for Toxic Speech Detection and Explanation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, our experiments reveal that the detection performance of such models is much lower than that of models focusing only on the detection task. To bridge these gaps, we introduce ToXCL, a unified framework for the detection and explanation of implicit toxic speech. |
Nhat Hoang; Do Long; Duc Anh Do; Duc Anh Vu; Anh Tuan Luu; |
360 | LinkPrompt: Natural and Universal Adversarial Attacks on Prompt-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we consider the naturalness of the UATs and develop LinkPrompt, an adversarial attack algorithm to generate UATs by a gradient-based beam search algorithm that not only effectively attacks the target PLMs and PFMs but also maintains the naturalness among the trigger tokens. |
Yue Xu; Wenjie Wang; |
361 | CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce our method called CoE-SQL which can prompt LLMs to generate the SQL query based on the previously generated SQL query with an edition chain. |
Hanchong Zhang; Ruisheng Cao; Hongshen Xu; Lu Chen; Kai Yu; |
362 | ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce ContraDoc, the first human-annotated dataset to study self-contradictions in long documents across multiple domains, varying document lengths, self-contradiction types, and appearance scope. |
Jierui Li; Vipul Raheja; Dhruv Kumar; |
363 | Entity Disambiguation Via Fusion Entity Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an encoder-decoder model to disambiguate entities with more detailed entity descriptions. |
Junxiong Wang; Ali Mousavi; Omar Attia; Ronak Pradeep; Saloni Potdar; Alexander Rush; Umar Farooq Minhas; Yunyao Li; |
364 | PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models As Decision Makers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conduct a study to utilize LLMs as a solution for decision making that requires complex data analysis. |
Myeonghwa Lee; Seonho An; Min-Soo Kim; |
365 | GPTScore: Evaluate As You Desire Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a novel evaluation framework, GPTScore, which utilizes the emergent abilities (e.g., in-context learning, zero-shot instruction) of generative pre-trained models to score generated texts. |
Jinlan Fu; See-Kiong Ng; Zhengbao Jiang; Pengfei Liu; |
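GPTScore's core operation, as the highlight above describes, is to score a candidate text by how likely a generative LM is to produce it under an evaluation instruction. The following is a minimal sketch of that idea, not the authors' released implementation: it computes a length-normalized conditional log-likelihood with a small Hugging Face causal LM, and the model name and prompt template are placeholder assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model: any causal LM works; GPT-2 is used only because it is small.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
lm.eval()

def gptscore(instruction: str, text: str) -> float:
    """Average log-probability of `text` conditioned on `instruction`."""
    ctx = tok(instruction, return_tensors="pt").input_ids
    cont = tok(text, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx, cont], dim=1)
    with torch.no_grad():
        logits = lm(input_ids).logits
    # Logits at position t predict token t+1, so drop the last position.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = input_ids[0, 1:]
    token_lp = logprobs[torch.arange(targets.numel()), targets]
    # Keep only the positions that predict continuation tokens.
    cont_lp = token_lp[ctx.shape[1] - 1 :]
    return cont_lp.mean().item()

# Toy usage: higher scores indicate text the LM finds more likely under the instruction.
print(gptscore("Rewrite fluently: the cat sat on the mat.\nRewrite:", " The cat sat on the mat."))
```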
366 | A Survey of Confidence Estimation and Calibration in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There has been a lot of recent research aiming to address this, but there has been no comprehensive overview to organize it and to outline the main lessons learned. The present survey aims to bridge this gap. |
Jiahui Geng; Fengyu Cai; Yuxia Wang; Heinz Koeppl; Preslav Nakov; Iryna Gurevych; |
367 | Not All Metrics Are Guilty: Improving NLG Evaluation By Diversifying References Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The underlying reason is that one semantic meaning can actually be expressed in different forms, and the evaluation with a single or few references may not accurately reflect the quality of the model's hypotheses. To address this issue, this paper presents a simple and effective method, named **Div-Ref**, to enhance existing evaluation benchmarks by enriching the number of references. |
Tianyi Tang; Hongyuan Lu; Yuchen Jiang; Haoyang Huang; Dongdong Zhang; Xin Zhao; Tom Kocmi; Furu Wei; |
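Div-Ref's key step is to enrich the reference set and score a hypothesis against the best-matching reference. A toy sketch of that evaluation pattern follows; `diversify` is a hypothetical stub where a paraphrasing LLM would go, and the unigram-F1 metric is a stand-in for whatever metric the benchmark actually uses.

```python
def unigram_f1(hyp: str, ref: str) -> float:
    """Stand-in metric: unigram F1 between hypothesis and one reference."""
    h, r = set(hyp.lower().split()), set(ref.lower().split())
    overlap = len(h & r)
    if not overlap:
        return 0.0
    p, rec = overlap / len(h), overlap / len(r)
    return 2 * p * rec / (p + rec)

def diversify(reference: str, n: int = 3) -> list[str]:
    """Stub: Div-Ref would ask an LLM for n semantically equivalent rewrites."""
    return [reference]  # placeholder; plug in an LLM paraphraser here

def div_ref_score(hyp: str, references: list[str]) -> float:
    """Score against the enriched reference set; credit the best match."""
    enriched = [p for ref in references for p in [ref, *diversify(ref)]]
    return max(unigram_f1(hyp, ref) for ref in enriched)

print(div_ref_score("the cat sat", ["a cat was sitting"]))
```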
368 | Separation and Fusion: A Novel Multiple Token Linking Model for Event Argument Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we design a novel separation-and-fusion paradigm to separately acquire cross-event information and fuse it into the argument extraction of a target event. |
Jing Xu; Dandan Song; Siu Hui; Zhijing Wu; Meihuizi Jia; Hao Wang; Yanru Zhou; Changzhi Zhou; Ziyi Yang; |
369 | The Integration of Semantic and Structural Knowledge in Knowledge Graph Entity Typing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel Semantic and Structure-aware KG Entity Typing (SSET) framework, which is composed of three modules. |
Muzhi Li; Minda Hu; Irwin King; Ho-fung Leung; |
370 | ComCLIP: Training-Free Compositional Image and Text Matching Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Towards better compositional generalization in zero-shot image and text matching, in this paper, we study the problem from a causal perspective: the erroneous semantics of individual entities are essentially confounders that cause the matching failure. Therefore, we propose a novel training-free compositional CLIP model (ComCLIP). |
Kenan Jiang; Xuehai He; Ruize Xu; Xin Wang; |
371 | ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This resulted in subpar resources for training and evaluating summarization systems, a quality compromise that is arguably due to the substantial costs associated with generating ground-truth summaries, particularly for diverse languages and specialized domains. To address this issue, we present ACLSum, a novel summarization dataset carefully crafted and evaluated by domain experts. |
Sotaro Takeshita; Tommaso Green; Ines Reinig; Kai Eckert; Simone Ponzetto; |
372 | XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most active learning techniques in classification rely on the model's uncertainty or disagreement to choose unlabeled data, suffering from the problem of over-confidence in superficial patterns and a lack of exploration. Inspired by the cognitive processes in which humans deduce and predict through causal information, we make an initial attempt at integrating rationales into AL and propose a novel Explainable Active Learning framework (XAL) for low-resource text classification, which aims to encourage classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations. |
Yun Luo; Zhen Yang; Fandong Meng; Yingjie Li; Fang Guo; Qinglin Qi; Jie Zhou; Yue Zhang; |
373 | LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we revisit diffusion models, highlighting their capacity for holistic context modeling and parallel decoding. |
Yuchi Wang; Shuhuai Ren; Rundong Gao; Linli Yao; Qingyan Guo; Kaikai An; Jianhong Bai; Xu Sun; |
374 | Intent-conditioned and Non-toxic Counterspeech Generation Using Multi-Task Instruction Tuning with RLAIF Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study introduces CoARL, a novel framework enhancing counterspeech generation by modeling the pragmatic implications underlying social biases in hateful statements. |
Amey Hengle; Aswini Padhi; Sahajpreet Singh; Anil Bandhakavi; Md Shad Akhtar; Tanmoy Chakraborty; |
375 | Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, their risks of misuse for generating harmful responses have raised serious societal concerns and spurred recent research on LLM conversation safety. Therefore, in this survey, we provide a comprehensive overview of recent studies, covering three critical aspects of LLM conversation safety: attacks, defenses, and evaluations. |
Zhichen Dong; Zhanhui Zhou; Chao Yang; Jing Shao; Yu Qiao; |
376 | Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While techniques such as chain-of-thought (CoT) distillation have displayed promise in distilling LLMs into small language models (SLMs), there is a risk that distilled SLMs may still inherit flawed reasoning and hallucinations from LLMs. To address these issues, we propose a twofold methodology: First, we introduce a novel method for distilling the self-evaluation capability from LLMs into SLMs, aiming to mitigate the adverse effects of flawed reasoning and hallucinations inherited from LLMs. Second, we advocate for distilling more comprehensive thinking by incorporating multiple distinct CoTs and self-evaluation outputs, to ensure a more thorough and robust knowledge transfer into SLMs. |
Weize Liu; Guocong Li; Kai Zhang; Bang Du; Qiyuan Chen; Xuming Hu; Hongxia Xu; Jintai Chen; Jian Wu; |
377 | Divergent Token Metrics: Measuring Degradation to Prune Away LLM Components – and Optimize Quantization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces the Divergent Token Metrics (DTMs), a novel approach to assessing compressed LLMs, addressing the limitations of traditional perplexity or accuracy measures that fail to accurately reflect text generation quality. |
Björn Deiseroth; Max Meuer; Nikolas Gritsch; Constantin Eichenberg; Patrick Schramowski; Matthias Aßenmacher; Kristian Kersting; |
378 | Beyond Performance: Quantifying and Mitigating Label Bias in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we evaluate different approaches to quantifying label bias in a model's predictions, conducting a comprehensive investigation across 279 classification tasks and ten LLMs. |
Yuval Reif; Roy Schwartz; |
379 | Instructing Large Language Models to Identify and Ignore Irrelevant Conditions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel approach named I3C that instructs LLMs to identify and ignore irrelevant conditions. |
Zhenyu Wu; Chao Shen; Meng Jiang; |
380 | Lower Bounds on The Expressivity of Recurrent Neural Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such results, however, fall short of describing the capabilities of RNN language models (LMs), which are definitionally distributions over strings. We take a fresh look at the representational capacity of RNN LMs by connecting them to probabilistic FSAs and demonstrate that RNN LMs with linearly bounded precision can express arbitrary regular LMs. |
Anej Svete; Franz Nowak; Anisha Sahabdeen; Ryan Cotterell; |
381 | Transformers Can Represent N-gram Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the relationship between transformer LMs and n-gram LMs, a simple and historically relevant class of language models. |
Anej Svete; Ryan Cotterell; |
382 | The Role of N-gram Smoothing in The Age of Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper re-opens the role classical n-gram smoothing techniques may play in the age of neural language models. |
Luca Malagutti; Andrius Buinovskij; Anej Svete; Clara Meister; Afra Amini; Ryan Cotterell; |
383 | Reliability Estimation of News Media Sources: Birds of A Feather Flock Together Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel approach for source reliability estimation that leverages reinforcement learning strategies for estimating the reliability degree of news sources. |
Sergio Burdisso; Dairazalia Sanchez-Cortes; Esaú Villatoro-Tello; Petr Motlicek; |
384 | On The Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze six languages: English, German, French, Spanish, Chinese, and Japanese, and show that language-specific neurons are unique, with a slight overlap (< 5%) between languages. |
Takeshi Kojima; Itsuki Okimura; Yusuke Iwasawa; Hitomi Yanaka; Yutaka Matsuo; |
385 | NLP Progress in Indigenous Latin American Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We highlight the cultural richness of these languages and the risk they face of being overlooked in the realm of Natural Language Processing (NLP). We aim to bridge the gap between these communities and researchers, emphasizing the need for inclusive technological advancements that respect indigenous community perspectives. |
Atnafu Tonja; Fazlourrahman Balouchzahi; Sabur Butt; Olga Kolesnikova; Hector Ceballos; Alexander Gelbukh; Thamar Solorio; |
386 | On The Effectiveness of Adversarial Robustness for Abuse Mitigation with Counterspeech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these systems for identifying and generating counterspeech have the potential for abuse mitigation, it remains unclear how robust a model is against adversarial attacks across multiple domains and how models trained on synthetic data can handle unseen user-generated abusive content in the real world. To tackle these issues, this paper first explores the dynamics of abuse and replies using our novel dataset of 6,955 labelled tweets targeted at footballers for studying public figure abuse. |
Yi-Ling Chung; Jonathan Bright; |
387 | Leveraging The Structure of Pre-trained Embeddings to Minimize Annotation Effort Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is especially true in settings of highly imbalanced class distributions. This paper proposes to tackle this bottleneck by exploiting the structural properties of pre-trained embeddings. |
Cesar Gonzalez-Gutierrez; Ariadna Quattoni; |
388 | UniArk: Improving Generalisation and Consistency for Factual Knowledge Extraction Through Debiasing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on factual probing performance over prompts unseen during tuning and, using a probabilistic view, we show the inherent misalignment between pre-training and downstream tuning objectives in language models for probing knowledge. |
Yijun Yang; Jie He; Pinzhen Chen; Victor Gutierrez Basulto; Jeff Pan; |
389 | Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models Through Question Complexity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel adaptive QA framework that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs from the simplest to the most sophisticated ones based on the query complexity. |
Soyeong Jeong; Jinheon Baek; Sukmin Cho; Sung Ju Hwang; Jong Park; |
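The Adaptive-RAG entry above hinges on routing each query to a strategy whose cost matches the query's complexity. The toy router below only illustrates that control flow; the paper trains a classifier for the complexity decision, whereas `estimate_complexity` here is a hypothetical heuristic stub, and `llm`/`retrieve` are caller-supplied functions.

```python
from typing import Callable

def estimate_complexity(query: str) -> str:
    """Hypothetical stand-in for the trained query-complexity classifier."""
    if "?" not in query:
        return "simple"
    return "multi" if " and " in query else "single"

def adaptive_rag(query: str, llm: Callable[[str], str],
                 retrieve: Callable[[str], list[str]]) -> str:
    level = estimate_complexity(query)
    if level == "simple":                 # answer directly, no retrieval
        return llm(query)
    if level == "single":                 # one retrieval round
        docs = retrieve(query)
        return llm(f"Context: {docs}\nQuestion: {query}")
    answer = ""                           # multi-step: iterate retrieval
    for _ in range(3):
        docs = retrieve(query + " " + answer)
        answer = llm(f"Context: {docs}\nQuestion: {query}\nDraft: {answer}")
    return answer
```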
390 | Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel self-detection method to detect which questions an LLM does not know. |
Yukun Zhao; Lingyong Yan; Weiwei Sun; Guoliang Xing; Chong Meng; Shuaiqiang Wang; Zhicong Cheng; Zhaochun Ren; Dawei Yin; |
391 | Are Large Language Models Temporally Grounded? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since LLMs cannot perceive and interact with the environment, it is impossible to answer this question directly. Instead, we provide LLMs with textual narratives and probe them with respect to their common-sense knowledge of the structure and duration of events, their ability to order events along a timeline, and self-consistency within their temporal model (e.g., temporal relations such as after and before are mutually exclusive for any pair of events). |
Yifu Qiu; Zheng Zhao; Yftah Ziser; Anna Korhonen; Edoardo Ponti; Shay Cohen; |
392 | Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we extend the current TIMT task and propose a novel task, **D**ocument **I**mage **M**achine **T**ranslation to **Markdown** (**DIMT2Markdown**), which aims to translate a source document image with long context and complex layout structure to markdown-formatted target translation. |
Yupu Liang; Yaping Zhang; Cong Ma; Zhiyang Zhang; Yang Zhao; Lu Xiang; Chengqing Zong; Yu Zhou; |
393 | Elastic Weight Removal for Faithful and Abstractive Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, common dialogue models still often hallucinate information that was not contained in these documents and is therefore unfaithful. In this work, we propose to alleviate such hallucinations by “subtracting” the parameters of a model trained to hallucinate from a dialogue response generation model, in order to “negate” the contribution of such hallucinated examples from it. |
Nico Daheim; Nouha Dziri; Mrinmaya Sachan; Iryna Gurevych; Edoardo Ponti; |
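The “subtracting” in the Elastic Weight Removal highlight is parameter arithmetic in weight space. Below is a deliberately simplified sketch of that operation over PyTorch state dicts, assuming an anti-expert fine-tuned to hallucinate and a uniform scaling factor; the paper's elastic variant weights parameters more selectively than this.

```python
import torch

def remove_task_vector(base_sd, main_sd, anti_sd, lam: float = 1.0):
    """theta_new = theta_main - lam * (theta_anti - theta_base).

    base_sd: pretrained weights; main_sd: the dialogue model; anti_sd: a
    model fine-tuned to hallucinate. Subtracting the anti-expert's task
    vector is meant to "negate" the hallucination direction in weight space.
    """
    return {k: main_sd[k] - lam * (anti_sd[k] - base_sd[k]) for k in main_sd}

# Toy usage with one-parameter "models":
base = {"w": torch.tensor([0.0])}
main = {"w": torch.tensor([1.0])}
anti = {"w": torch.tensor([0.4])}
print(remove_task_vector(base, main, anti, lam=0.5))  # {'w': tensor([0.8000])}
```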
394 | R-Tuning: Instructing Large Language Models to Say “I Don't Know” Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a new approach called Refusal-Aware Instruction Tuning (R-Tuning). |
Hanning Zhang; Shizhe Diao; Yong Lin; Yi Fung; Qing Lian; Xingyao Wang; Yangyi Chen; Heng Ji; Tong Zhang; |
395 | Bridging The Gap Between Different Vocabularies for LLM Ensemble Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This limitation hinders the dynamic correction and enhancement of outputs during the generation process, resulting in a limited capacity for effective ensemble. To address this issue, we propose a novel method to Ensemble LLMs via Vocabulary Alignment (EVA). |
Yangyifan Xu; Jinliang Lu; Jiajun Zhang; |
396 | KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study leveraging knowledge graph embeddings to improve the effectiveness of PEFT. |
Xindi Luo; Zequn Sun; Jing Zhao; Zhe Zhao; Wei Hu; |
397 | Extremely Weakly-supervised Text Classification with Wordsets Mining and Sync-Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Both of them have significant flaws, including zero-shot instability and context-dependent ambiguities. This paper introduces SetSync, which follows a new wordset-based paradigm that can avoid the above problems. |
Lysa Xiao; |
398 | F-MALLOC: Feed-forward Memory Allocation for Continual Learning in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While previous work has introduced Continual Learning (CL) methods to address CF, these approaches grapple with the delicate balance between avoiding forgetting and maintaining system extensibility. To address this, we propose a CL method, named F-MALLOC (Feed-forward Memory ALLOCation). |
Junhong Wu; Yuchen Liu; Chengqing Zong; |
399 | Towards Reducing Diagnostic Errors with Interpretable Risk Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we propose a method to use LLMs to identify pieces of evidence in patient EHR data that indicate increased or decreased risk of specific diagnoses; our ultimate aim is to increase access to evidence and reduce diagnostic errors. |
Denis McInerney; William Dickinson; Lucy Flynn; Andrea Young; Geoffrey Young; Jan-Willem van de Meent; Byron Wallace; |
400 | Generalizable Multilingual Hate Speech Detection on Low Resource Indian Languages Using Fair Selection in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we combine various low-resource language datasets and propose MultiFED, a federated approach that detects hate speech effectively. |
Akshay Singh; Rahul Thakur; |
401 | Key Ingredients for Effective Zero-shot Cross-lingual Knowledge Transfer in Generative Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we compare various approaches proposed from the literature in unified settings, also including alternative backbone models, namely mBART and NLLB-200. |
Nadezhda Chirkova; Vassilina Nikoulina; |
402 | The Impact of Depth on Compositional Generalization in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We report three main conclusions: (1) after fine-tuning, deeper models generalize more compositionally than shallower models do, but the benefit of additional layers diminishes rapidly; (2) within each family, deeper models show better language modeling performance, but returns are similarly diminishing; (3) the benefits of depth for compositional generalization cannot be attributed solely to better performance on language modeling. |
Jackson Petty; Sjoerd Steenkiste; Ishita Dasgupta; Fei Sha; Dan Garrette; Tal Linzen; |
403 | Pregnant Questions: The Importance of Pragmatic Awareness in Maternal Health Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In a high-risk domain such as maternal and infant health, a question-answering system must recognize these pragmatic constraints and go beyond simply answering user questions, examining them in context to respond helpfully. To achieve this, we study assumptions and implications, or pragmatic inferences, made when mothers ask questions about pregnancy and infant care by collecting a dataset of 2,727 inferences from 500 questions across three diverse sources. |
Neha Srikanth; Rupak Sarkar; Heran Mane; Elizabeth Aparicio; Quynh Nguyen; Rachel Rudinger; Jordan Boyd-Graber; |
404 | Towards Explainability in Legal Outcome Prediction Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we contribute a novel method for identifying the precedent employed by legal outcome prediction models. |
Josef Valvoda; Ryan Cotterell; |
405 | The Steerability of Large Language Models Toward Data-driven Personas Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs, which can be leveraged to produce multiple perspectives and to reflect diverse opinions. |
Junyi Li; Charith Peris; Ninareh Mehrabi; Palash Goyal; Kai-Wei Chang; Aram Galstyan; Richard Zemel; Rahul Gupta; |
406 | CCSum: A Large-Scale and High-Quality Dataset for Abstractive News Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new large-scale and high-quality dataset for supervised abstractive news summarization containing 1. |
Xiang Jiang; Markus Dreyer; |
407 | Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Annotator Aware Representations for Texts (AART) for subjective classification tasks. |
Negar Mokhberian; Myrl Marmarelis; Frederic Hopp; Valerio Basile; Fred Morstatter; Kristina Lerman; |
408 | Improving Factual Accuracy of Neural Table-to-Text Output By Addressing Input Problems in ToTTo Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We manually annotated 1,837 texts generated by multiple models in the politics domain of the ToTTo dataset. We identify the input problems that are responsible for many output errors and show that fixing these inputs reduces factual errors by between 52% and 76% (depending on the model). |
Barkavi Sundararajan; Yaji Sripada; Ehud Reiter; |
409 | CERET: Cost-Effective Extrinsic Refinement for Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose CERET, a method for refining text generations by considering semantic stability, entailment and inter-sample uncertainty measures. |
Jason Cai; Hang Su; Monica Sunkara; Igor Shalyminov; Saab Mansour; |
410 | Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study the problem of automatically annotating relevant numerals (GAAP metrics) occurring in the financial documents with their corresponding XBRL tags. |
Subhendu Khatuya; Rajdeep Mukherjee; Akash Ghosh; Manjunath Hegde; Koustuv Dasgupta; Niloy Ganguly; Saptarshi Ghosh; Pawan Goyal; |
411 | Analysis of State-Level Legislative Process in Enhanced Linguistic and Nationwide Network Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we developed the first state-level deep learning framework that (1) handles the complex and inconsistent language of policies across US states using generative large language models and (2) decodes legislators' behavior and the implications of state policies by establishing a shared nationwide network, enriched with diverse contexts, such as information on interest groups influencing public policy and legislators' courage test results, which reflect their political positions. |
Maryam Davoodi; Dan Goldwasser; |
412 | DeMuX: Data-efficient Multilingual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce DeMuX, a framework that prescribes the exact data-points to label from vast amounts of unlabelled multilingual data, having unknown degrees of overlap with the target set. |
Simran Khanuja; Srinivas Gowriraj; Lucio Dery; Graham Neubig; |
413 | DUQGen: Effective Unsupervised Domain Adaptation of Neural Rankers By Diversifying Synthetic Query Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unfortunately, acquiring sufficiently large and high-quality target training data to improve a modern neural ranker can be costly and time-consuming. To address this problem, we propose a new approach to unsupervised domain adaptation for ranking, DUQGen, which addresses a critical gap in prior literature, namely how to automatically generate both effective and diverse synthetic training data to fine-tune a modern neural ranker for a new domain. |
Ramraj Chandradevan; Kaustubh Dhole; Eugene Agichtein; |
414 | How Did We Get Here? Summarizing Conversation Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce the task of summarizing the dynamics of conversations, by constructing a dataset of human-written summaries, and exploring several automated baselines. |
Yilun Hua; Nicholas Chernogor; Yuzhe Gu; Seoyeon Jeong; Miranda Luo; Cristian Danescu-Niculescu-Mizil; |
415 | Can Language Model Moderators Improve The Health of Online Discourse? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we establish a systematic definition of conversational moderation effectiveness grounded on moderation literature and establish design criteria for conducting realistic yet safe evaluation. |
Hyundong Cho; Shuai Liu; Taiwei Shi; Darpan Jain; Basem Rizk; Yuyang Huang; Zixun Lu; Nuan Wen; Jonathan Gratch; Emilio Ferrara; Jonathan May; |
416 | LeanReasoner: Boosting Complex Logical Reasoning with Lean Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large language models (LLMs) often struggle with complex logical reasoning due to logical inconsistencies and the inherent difficulty of such reasoning. We use Lean, a theorem proving framework, to address these challenges. |
Dongwei Jiang; Marcio Fonseca; Shay Cohen; |
417 | UICoder: Finetuning Large Language Models to Generate User Interface Code Through Automated Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the use of automated feedback (compilers and multi-modal models) to guide LLMs to generate high-quality UI code. |
Jason Wu; |
418 | Measuring Cross-lingual Transfer in Bytes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We measured the amount of data transferred from a source language to a target language and found that models initialized from diverse languages perform similarly to a target language in a cross-lingual setting. |
Leandro De Souza; Thales Almeida; Roberto Lotufo; Rodrigo Frassetto Nogueira; |
419 | MisgenderMender: A Community-Informed Approach to Interventions for Misgendering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on survey insights on the prevalence of misgendering, desired solutions, and associated concerns, we introduce a misgendering interventions task and evaluation dataset, MisgenderMender. |
Tamanna Hossain; Sunipa Dev; Sameer Singh; |
420 | Interplay of Machine Translation, Diacritics, and Diacritization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate two research questions: (1) how machine translation (MT) and diacritization influence each other's performance in a multi-task learning setting, and (2) the effect of keeping (vs. removing) diacritics on MT performance. |
Wei-Rui Chen; Ife Adebara; Muhammad Abdul-Mageed; |
421 | From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In the realm of Large Language Models (LLMs), the balance between instruction data quality and quantity is a focal point. Recognizing this, we introduce a self-guided methodology for LLMs to autonomously discern and select cherry samples from open-source datasets, effectively minimizing manual curation and potential cost for instruction tuning an LLM. |
Ming Li; Yong Zhang; Zhitao Li; Jiuhai Chen; Lichang Chen; Ning Cheng; Jianzong Wang; Tianyi Zhou; Jing Xiao; |
422 | Safer-Instruct: Aligning Language Models with Automated Preference Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In response, we present Safer-Instruct, a novel pipeline for automatically constructing large-scale preference data. |
Taiwei Shi; Kai Chen; Jieyu Zhao; |
423 | PELMS: Pre-training for Effective Low-Shot Multi-Document Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present **PELMS**, a pre-trained model that uses pre-training objectives based on semantic coherence heuristics and faithfulness constraints together with unlabeled multi-document inputs, to promote the generation of concise, fluent, and faithful summaries. |
Joseph Peper; Wenzhao Qiu; Lu Wang; |
424 | Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go Without Hallucination? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate to what extent LLMs take shortcuts from certain keyword/entity biases in the prompt instead of following correct reasoning paths. To quantify this phenomenon, we propose a novel probing method and benchmark called EUREQA. |
Bangzheng Li; Ben Zhou; Fei Wang; Xingyu Fu; Dan Roth; Muhao Chen; |
425 | IndiSentiment140: Sentiment Analysis Dataset for Indian Languages with Emphasis on Low-Resource Languages Using Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The study aims to provide insights into the practicality of using machine translation in the context of India's linguistic diversity for sentiment analysis datasets. |
Saurabh Kumar; Ranbir Sanasam; Sukumar Nandi; |
426 | Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To construct SWIM-IR, we propose SAP (summarize-then-ask prompting), where the large language model (LLM) generates a textual summary prior to the query generation step. |
Nandan Thakur; Jianmo Ni; Gustavo Hernandez Abrego; John Wieting; Jimmy Lin; Daniel Cer; |
427 | SCANNER: Knowledge-Enhanced Approach for Robust Multi-modal Named Entity Recognition of Unseen Entities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A key challenge to these tasks is that the model should be able to generalize to the entities unseen during the training, and should be able to handle the training samples with noisy annotations. To address this obstacle, we propose SCANNER (Span CANdidate detection and recognition for NER), a model capable of effectively handling all three NER variants. |
Hyunjong Ok; Taeho Kil; Sukmin Seo; Jaeho Lee; |
428 | A Theory Guided Scaffolding Instruction Framework for LLM-Enabled Metaphor Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although promising, LLM-based methods for metaphor detection and reasoning still face the challenge of supplying explainable concepts for metaphor reasoning together with their linguistic manifestations. To fill this gap, we propose a novel Theory guided Scaffolding Instruction (TSI) framework that, for the first time, instructs an LLM to infer the underlying reasoning process of metaphor detection guided by metaphor theories. |
Yuan Tian; Nan Xu; Wenji Mao; |
429 | Learning to Compress Prompt in Natural Language Formats Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a Natural Language Prompt Encapsulation (Nano-Capsulator) framework compressing original prompts into NL formatted Capsule Prompt while maintaining prompt utility and transferability. |
Yu-Neng Chuang; Tianwei Xing; Chia-Yuan Chang; Zirui Liu; Xun Chen; Xia Hu; |
430 | Automatic, Meta and Human Evaluation for Multimodal Summarization with Multimodal Output Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As one of the most fundamental components for the development of MSMO, evaluation is an emerging yet underexplored research topic. In this paper, we fill this gap and propose a research framework that studies three research questions of MSMO evaluation: (1) Automatic Evaluation: We propose a novel metric, mLLM-EVAL, which utilizes a multimodal Large Language Model for MSMO EVALuation. |
Haojie Zhuang; Wei Emma Zhang; Leon Xie; Weitong Chen; Jian Yang; Quan Sheng; |
431 | Naive Bayes-based Context Extension for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel framework, called Naive Bayes-based Context Extension (NBCE), to enable existing LLMs to perform ICL with an increased number of demonstrations by significantly expanding their context size. |
Jianlin Su; Murtadha Ahmed; Bo Wen; Luo Ao; Mingren Zhu; Yunfeng Liu; |
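The name Naive Bayes-based Context Extension points at the mechanism: treat each demonstration chunk as conditionally independent and combine per-chunk next-token log-probabilities Naive-Bayes-style, so the effective context grows with the number of chunks. The pooling rule below is our reading of that assumption, sketched with NumPy on toy distributions, and is not guaranteed to match the exact variant the paper ships.

```python
import numpy as np

def nbce_pool(ctx_logprobs: np.ndarray, uncond_logprobs: np.ndarray) -> np.ndarray:
    """Combine next-token log-probs from n independent context chunks.

    ctx_logprobs: (n, vocab) log p(token | chunk_i)
    uncond_logprobs: (vocab,) log p(token) with no context.
    Under naive Bayes, log p(token | all chunks) is proportional to
    sum_i log p(token | chunk_i) - (n - 1) * log p(token).
    """
    n = ctx_logprobs.shape[0]
    scores = ctx_logprobs.sum(axis=0) - (n - 1) * uncond_logprobs
    scores -= scores.max()            # numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()        # renormalize over the vocabulary

# Toy vocabulary of 3 tokens, 2 context chunks:
ctx = np.log(np.array([[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]]))
unc = np.log(np.array([1 / 3, 1 / 3, 1 / 3]))
print(nbce_pool(ctx, unc))
```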
432 | Leitner-Guided Memory Replay for Cross-lingual Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Experience replay, which revisits data from a fixed-size memory of old languages while training on new ones, is among the most successful approaches for solving this dilemma. Faced with the challenge of dynamically storing the memory with high-quality examples while complying with its fixed size limitations, we consider Leitner queuing, a human-inspired spaced-repetition technique, to determine what should be replayed at each phase of learning. |
Meryem M'hamdi; Jonathan May; |
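Leitner queuing, the spaced-repetition technique the entry above borrows, promotes an example to a higher deck when the model currently gets it right and demotes it otherwise, revisiting higher decks less often. A generic sketch of that bookkeeping follows; the exponential revisit schedule is an illustrative assumption, not the paper's exact replay policy.

```python
class LeitnerQueue:
    """Spaced-repetition bookkeeping over training examples."""

    def __init__(self, n_decks: int = 5):
        self.decks = [set() for _ in range(n_decks)]

    def add(self, example_id: str):
        self.decks[0].add(example_id)  # new items start at deck 0

    def update(self, example_id: str, correct: bool):
        for i, deck in enumerate(self.decks):
            if example_id in deck:
                deck.remove(example_id)
                j = min(i + 1, len(self.decks) - 1) if correct else max(i - 1, 0)
                self.decks[j].add(example_id)
                return

    def due(self, step: int) -> set[str]:
        """Deck i is revisited every 2**i steps, so easy items recur rarely."""
        out = set()
        for i, deck in enumerate(self.decks):
            if step % (2 ** i) == 0:
                out |= deck
        return out

lq = LeitnerQueue()
lq.add("ex1")
lq.update("ex1", correct=True)
print(lq.due(step=2))  # {'ex1'}: deck 1 is due every 2 steps
```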
433 | Multilingual Nonce Dependency Treebanks: Understanding How Language Models Represent and Process Syntactic Structure Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SPUD (Semantically Perturbed Universal Dependencies), a framework for creating nonce treebanks for the multilingual Universal Dependencies (UD) corpora. |
David Arps; Laura Kallmeyer; Younes Samih; Hassan Sajjad; |
434 | Actively Learn from LLMs with Uncertainty Propagation for Generalized Category Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose to integrate the feedback from LLMs into an active learning paradigm. |
Jinggui Liang; Lizi Liao; Hao Fei; Bobo Li; Jing Jiang; |
435 | Explaining Text Similarity in Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights. |
Alexandros Vasileiou; Oliver Eberle; |
436 | Large Language Models Can Contrastively Refine Their Generation for Better Sentence Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, since contrastive learning models are sensitive to the quality of sentence pairs, the effectiveness of these methods is largely influenced by the content generated from LLMs, highlighting the need for more refined generation in the context of sentence representation learning. Building upon this premise, we propose MultiCSR, a multi-level contrastive sentence representation learning framework that decomposes the process of prompting LLMs to generate a corpus for training base sentence embedding models into three stages (i.e., sentence generation, sentence pair construction, in-batch training) and refines the generated content at each stage, ensuring that only high-quality sentence pairs are used to train the base contrastive learning model. |
Huiming Wang; Zhaodonghui Li; Liying Cheng; De Wen Soh; Lidong Bing; |
437 | HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, ColBERT is reported to frequently underperform in zero-shot scenarios, where traditional techniques such as BM25 still exceed it. Addressing this, we propose to balance representation isotropy and anisotropy for zero-shot model performance, based on our observations that isotropy can enhance cosine similarity computations and anisotropy may aid in generalizing to unseen data. |
Jaeyoung Kim; Dohyeon Lee; Seung-won Hwang; |
438 | SuperGLEBer: German Language Understanding Evaluation Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We assemble a broad Natural Language Understanding benchmark suite for the German language and consequently evaluate a wide array of existing German-capable models in order to create a better understanding of the current state of German LLMs. |
Jan Pfister; Andreas Hotho; |
439 | “You Are An Expert Annotator”: Automatic Best–Worst-Scaling Annotations for Emotion Intensity Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This raises the question of whether large language model-based annotation methods show similar patterns, namely that they perform worse on rating scale annotation tasks than on comparative annotation tasks. To study this, we automate emotion intensity predictions and compare direct rating scale predictions, pairwise comparisons, and best–worst scaling. |
Christopher Bagdon; Prathamesh Karmalkar; Harsha Gurulingappa; Roman Klinger; |
440 | What Matters in Training A GPT4-Style Language Model with Multimodal Inputs? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conduct a comprehensive study on training GPT4-style models. |
Yan Zeng; Hanbo Zhang; Jiani Zheng; Jiangnan Xia; Guoqiang Wei; Yang Wei; Yuchen Zhang; Tao Kong; Ruihua Song; |
441 | Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unreliable evaluation guidelines can yield inaccurate assessment outcomes, potentially impeding the advancement of NLG in the right direction. To address these challenges, we take an initial step towards reliable evaluation guidelines and propose the first human evaluation guideline dataset by collecting annotations of guidelines extracted from existing papers as well as generated via Large Language Models (LLMs). |
Jie Ruan; Wenqing Wang; Xiaojun Wan; |
442 | MOSAICo: A Multilingual Open-text Semantically Annotated Interlinked Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this issue, we put forward MOSAICo, the first endeavor aimed at equipping the research community with the key ingredients to model explicit semantic knowledge at a large scale, providing hundreds of millions of silver yet high-quality annotations for four NLU tasks across five languages. We describe the creation process of MOSAICo, demonstrate its quality and variety, and analyze the interplay between different types of semantic information. |
Simone Conia; Edoardo Barba; Abelardo Carlos Martinez Lorenzo; Pere-Lluís Huguet Cabot; Riccardo Orlando; Luigi Procopio; Roberto Navigli; |
443 | SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs. |
Brian Formento; Wenjie Feng; Chuan-Sheng Foo; Anh Tuan Luu; See-Kiong Ng; |
444 | BUST: Benchmark for The Evaluation of Detectors of LLM-Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce BUST, a comprehensive benchmark designed to evaluate detectors of texts generated by instruction-tuned large language models (LLMs). |
Joseph Cornelius; Oscar Lithgow-Serrano; Sandra Mitrovic; Ljiljana Dolamic; Fabio Rinaldi; |
445 | Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they still exhibit a performance bias toward high-resource languages and learn isolated distributions of multilingual sentence representations, which may hinder knowledge transfer across languages. To bridge this gap, we propose a simple yet effective cross-lingual alignment framework exploiting pairs of translation sentences. |
Chong Li; Shaonan Wang; Jiajun Zhang; Chengqing Zong; |
446 | MaCSC: Towards Multimodal-augmented Pre-trained Language Models Via Conceptual Prototypes and Self-balancing Calibration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel multimodal-augmented framework termed MaCSC, which can infuse multimodal semantics into PLMs and facilitate a self-balancing calibration of information allocation. |
Xianwei Zhuang; Zhichang Wang; Xuxin Cheng; Yuxin Xie; Liming Liang; Yuexian Zou; |
447 | Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For this purpose, we propose a method for constructing synthetic datasets specified in this analysis and conclude that PLMs acquire the inference abilities required for KGC through pre-training, even though the performance improvements mostly come from textual information of entities and relations. |
Yusuke Sakai; Hidetaka Kamigaito; Katsuhiko Hayashi; Taro Watanabe; |
448 | Discovering Lobby-Parliamentarian Alignments Through NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We discover alignments of views between interest groups (lobbies) and members of the European Parliament (MEPs) by automatically analyzing their texts. |
Aswin Suresh; Lazar Radojevic; Francesco Salvi; Antoine Magron; Victor Kristof; Matthias Grossglauser; |
449 | IterCQR: Iterative Conversational Query Reformulation with Retrieval Guidance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these manually crafted queries often result in sub-optimal retrieval performance and require high collection costs. To address these challenges, we propose **Iter**ative **C**onversational **Q**uery **R**eformulation (**IterCQR**), a methodology that conducts query reformulation without relying on human rewrites. |
Yunah Jang; Kang-il Lee; Hyunkyung Bae; Hwanhee Lee; Kyomin Jung; |
450 | AceGPT, Localizing Large Language Models in Arabic Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Significant concerns emerge when addressing cultural sensitivity and local values. To address this, the paper proposes a comprehensive solution that includes further pre-training with Arabic texts, Supervised Fine-Tuning (SFT) utilizing native Arabic instructions, and GPT-4 responses in Arabic, alongside Reinforcement Learning with AI Feedback (RLAIF) employing a reward model attuned to local culture and values. |
Huang Huang; Fei Yu; Jianqing Zhu; Xuening Sun; Hao Cheng; Song Dingjie; Zhihong Chen; Mosen Alharthi; Bang An; Juncai He; Ziche Liu; Junying Chen; Jianquan Li; Benyou Wang; Lian Zhang; Ruoyu Sun; Xiang Wan; Haizhou Li; Jinchao Xu; |
451 | Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation As A Reward Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the potential of employing the QE model as the reward model to predict human preferences for feedback training. |
Zhiwei He; Xing Wang; Wenxiang Jiao; Zhuosheng Zhang; Rui Wang; Shuming Shi; Zhaopeng Tu; |
452 | Depression Detection in Clinical Interviews with LLM-Empowered Structural Element Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the scarcity of participant data, due to privacy concerns and collection challenges, intrinsically constrains interview modeling. To address these limitations, in this paper, we propose a structural element graph (SEGA), which transforms the clinical interview into an expertise-inspired directed acyclic graph for comprehensive modeling. |
Zhuang Chen; Jiawen Deng; Jinfeng Zhou; Jincenzi Wu; Tieyun Qian; Minlie Huang; |
453 | SQATIN: Supervised Instruction Tuning Meets Question Answering for Improved Dialogue NLU Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce SQATIN, a new framework for dialog NLU based on (i) instruction tuning and (ii) question-answering-based formulation of ID and VE tasks. |
Evgeniia Razumovskaia; Goran Glavaš; Anna Korhonen; Ivan Vulic; |
454 | Enhancing Argument Summarization: Prioritizing Exhaustiveness in Key Point Generation and Introducing An Automatic Coverage Evaluation Metric Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel extractive approach for key point generation, that outperforms previous state-of-the-art methods for the task. |
Mohammad Khosravani; Chenyang Huang; Amine Trabelsi; |
455 | ARM: Alignment with Residual Energy-Based Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Alignment with Residual Energy-Based Model (ARM), as a simple and flexible alternative to RLHF methods. |
Bo Pang; Caiming Xiong; Yingbo Zhou; |
456 | HumanRankEval: Automatic Evaluation of LMs As Conversational Assistants Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To help accelerate the development of LMs as conversational assistants, we propose a novel automatic evaluation task: HumanRankEval (HRE). |
Milan Gritta; Gerasimos Lampouras; Ignacio Iacobacci; |
457 | FAMuS: Frames Across Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present FAMuS, a new corpus of Wikipedia passages that report on some event, paired with underlying, genre-diverse (non-Wikipedia) source articles for the same event. |
Siddharth Vashishtha; Alexander Martin; William Gantt; Benjamin Van Durme; Aaron White; |
458 | Rationale-based Opinion Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these summaries can be too generic and lack supporting details. To address these issues, we propose a new paradigm for summarizing reviews, rationale-based opinion summarization. |
Haoyuan Li; Snigdha Chaturvedi; |
459 | Mustango: Toward Controllable Text-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Mustango: a music-domain-knowledge-inspired text-to-music system based on diffusion. |
Jan Melechovsky; Zixun Guo; Deepanway Ghosal; Navonil Majumder; Dorien Herremans; Soujanya Poria; |
460 | Adaptive Cross-lingual Text Classification Through In-Context One-Shot Demonstrations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we exploit In-Context Tuning (ICT) for One-Shot Cross-lingual transfer in the classification task by introducing In-Context Cross-lingual Transfer (IC-XLT). |
Emilio Cueva; Adrian Lopez Monroy; Fernando Sánchez-Vega; Thamar Solorio; |
461 | CNER: Concept and Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We put forward a comprehensive set of categories that can be used to model concepts and named entities jointly, and propose new approaches for the creation of CNER datasets. |
Giuliano Martinelli; Francesco Molfese; Simone Tedeschi; Alberte Fernández-Castro; Roberto Navigli; |
462 | Branch-Solve-Merge Improves Large Language Model Evaluation and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance can fall short, due to the model's lack of coherence and inability to plan and decompose the problem. We propose Branch-Solve-Merge (BSM), a Large Language Model program (Schlag et al., 2023) for tackling such challenging natural language tasks. |
Swarnadeep Saha; Omer Levy; Asli Celikyilmaz; Mohit Bansal; Jason Weston; Xian Li; |
463 | REPLUG: Retrieval-Augmented Black-Box Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model. |
Weijia Shi; Sewon Min; Michihiro Yasunaga; Minjoon Seo; Richard James; Mike Lewis; Luke Zettlemoyer; Wen-tau Yih; |
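The ensembling idea behind REPLUG can be sketched without access to model internals: each retrieved document is prepended to the input in a separate LM call, and the resulting next-token distributions are mixed with weights derived from the retrieval scores. Below is a minimal sketch; the function names and the softmax weighting of retrieval scores are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def replug_ensemble(next_token_logprobs, retrieval_scores):
    """Mix next-token distributions from one black-box LM call per
    retrieved document, weighted by softmax-normalized retrieval scores."""
    weights = softmax(np.asarray(retrieval_scores, dtype=float))  # (k,)
    probs = np.exp(np.asarray(next_token_logprobs, dtype=float))  # (k, vocab)
    return weights @ probs                                        # (vocab,)

# Example: two retrieved documents, a three-token vocabulary.
mixed = replug_ensemble([[-0.1, -2.3, -4.0], [-1.2, -0.4, -3.0]], [0.9, 0.1])
```

Because only next-token probabilities are needed, the LM itself stays frozen; gradients flow into the retriever instead, which is what makes the "black box" framing possible.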
464 | David Helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Armed with a more powerful, general-purpose diffusion LM, we introduce the primary contribution of this work, SSD-2, an approach to easily ensemble at inference time a large general-purpose diffusion LM with smaller, but specialized and contextualized diffusion LMs. |
Xiaochuang Han; Sachin Kumar; Yulia Tsvetkov; Marjan Ghazvininejad; |
465 | Efficient End-to-End Visual Document Understanding with Rationale Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can small pretrained image-to-text models accurately understand visual documents through similar recognition and reasoning steps instead? We propose Rationale Distillation (RD), which incorporates the outputs of OCR tools, LLMs, and larger multimodal models as intermediate "rationales", and trains a small student model to predict both rationales and answers. |
Wang Zhu; Alekh Agarwal; Mandar Joshi; Robin Jia; Jesse Thomason; Kristina Toutanova; |
466 | A Systematic Comparison of Syllogistic Reasoning in Humans and Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Focusing on the case of syllogisms (inferences from two simple premises), we show that, within the PaLM 2 family of transformer language models, larger models are more logical than smaller ones, and also more logical than humans. At the same time, even the largest models make systematic errors, some of which mirror human reasoning biases: they show sensitivity to the (irrelevant) ordering of the variables in the syllogism, and draw confident but incorrect inferences from particular syllogisms (syllogistic fallacies). |
Tiwalayo Eisape; Michael Tessler; Ishita Dasgupta; Fei Sha; Sjoerd Steenkiste; Tal Linzen; |
467 | AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Standard pool-based active learning is computationally expensive on large pools and often reaches low accuracy by overfitting the initial decision boundary, thus failing to explore the input space and find minority instances. To address these issues we propose AnchorAL. |
Pietro Lesci; Andreas Vlachos; |
468 | ICLE++: Modeling Fine-Grained Traits for Holistic Essay Scoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For instance, it is not clear whether models trained on ASAP can generalize well when evaluated on other corpora. In light of these limitations, we introduce ICLE++, a corpus of persuasive student essays annotated with both holistic scores and trait-specific scores. |
Shengjie Li; Vincent Ng; |
469 | UNcommonsense Reasoning: Abductive Reasoning About Uncommon Situations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To instead investigate the ability to model unusual, unexpected, and unlikely situations, we explore the task of uncommonsense abductive reasoning. |
Wenting Zhao; Justin Chiu; Jena Hwang; Faeze Brahman; Jack Hessel; Sanjiban Choudhury; Yejin Choi; Xiang Li; Alane Suhr; |
470 | To Tell The Truth: Language of Deception and Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our model, built on a large language model, employs a bottleneck framework to learn discernible cues to determine truth, an act of reasoning in which human subjects often perform poorly, even with incentives. |
Sanchaita Hazra; Bodhisattwa Prasad Majumder; |
471 | Multilingual Models for ASR in Chibchan Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present experiments on Automatic Speech Recognition (ASR) for Bribri and Cabécar, two languages from the Chibchan family. |
Rolando Coto-Solano; Tai Wan Kim; Alexander Jones; Sharid Loáiciga; |
472 | LegalDiscourse: Interpreting When Laws Apply and To Whom Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We collect over 100,000 laws for 52 U.S. states and territories using 20 scrapers we built, and apply our trained models to 6,000 laws using U.S. Census population numbers. We describe two journalistic outputs stemming from this application: (1) an investigation into the increase in liquor licenses following population growth and (2) a decrease in applicable laws under different under-count projections. |
Alexander Spangher; Zihan Xue; Te-Lin Wu; Mark Hansen; Jonathan May; |
473 | X-Eval: Generalizable Multi-aspect Text Evaluation Via Augmented Instruction Tuning with Auxiliary Evaluation Aspects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce X-Eval, a two-stage instruction tuning framework to evaluate text in both seen and unseen aspects customized by end users. |
Minqian Liu; Ying Shen; Zhiyang Xu; Yixin Cao; Eunah Cho; Vaibhav Kumar; Reza Ghanadan; Lifu Huang; |
474 | Is Reference Necessary in The Evaluation of NLG Systems? When and Where? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, by employing diverse analytical approaches, we comprehensively assess the performance of both metrics across a wide range of NLG tasks, encompassing eight datasets and eight evaluation models. |
Shuqian Sheng; Yi Xu; Luoyi Fu; Jiaxin Ding; Lei Zhou; Xinbing Wang; Chenghu Zhou; |
475 | Semi-Structured Chain-of-Thought: Integrating Multiple Sources of Knowledge for Improved Language Model Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing prompting methods either rely on one or two of these sources, or require repeatedly invoking large language models to generate similar or identical content. In this work, we overcome these limitations by introducing a novel semi-structured prompting approach that seamlessly integrates the model's parametric memory with unstructured knowledge from text documents and structured knowledge from knowledge graphs. |
Xin Su; Tiep Le; Steven Bethard; Phillip Howard; |
476 | Evaluating The Deductive Competence of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate whether several LLMs can solve a classic type of deductive reasoning problem from the cognitive science literature. |
S Seals; Valerie Shalin; |
477 | Large Human Language Models: A Need and The Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This brings to the fore a range of design considerations and challenges in terms of what human aspects to capture, how to represent them, and what modeling strategies to pursue. To address these, we advocate for three positions toward creating large human language models (LHLMs) using concepts from psychological and behavioral sciences: First, LM training should include the human context. Second, LHLMs should recognize that people are more than their group(s). Third, LHLMs should be able to account for the dynamic and temporally-dependent nature of the human context. |
Nikita Soni; H. Schwartz; João Sedoc; Niranjan Balasubramanian; |
478 | On Learning to Summarize with Large Language Models As References Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent studies have found that summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets. Therefore, we study an LLM-as-reference learning setting for smaller text summarization models to investigate whether their performance can be substantially improved. |
Yixin Liu; Kejian Shi; Katherine He; Longtian Ye; Alexander Fabbri; Pengfei Liu; Dragomir Radev; Arman Cohan; |
479 | Hallucination Diversity-Aware Active Learning for Text Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, this paper proposes the first active learning framework for alleviating LLM hallucinations, reducing the costly human annotation of hallucinations that would otherwise be needed. |
Yu Xia; Xu Liu; Tong Yu; Sungchul Kim; Ryan Rossi; Anup Rao; Tung Mai; Shuai Li; |
480 | Keep It Private: Unsupervised Privatization of Online Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce an automatic text privatization framework that fine-tunes a large language model via reinforcement learning to produce rewrites that balance soundness, sense, and privacy. |
Calvin Bao; Marine Carpuat; |
481 | Tied-LoRA: Enhancing Parameter Efficiency of LoRA with Weight Tying Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Tied-LoRA, a novel paradigm leveraging weight tying and selective training to enhance the parameter efficiency of Low-rank Adaptation (LoRA). |
Adithya Renduchintala; Tugrul Konuk; Oleksii Kuchaiev; |
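One way to read "weight tying" here is that the low-rank A/B factors are shared across layers while only cheap per-layer pieces stay layer-specific. The PyTorch sketch below illustrates that scheme under this assumption; `TiedLoRALinear`, the per-layer scaling vectors `u`/`v`, and the hyperparameters are hypothetical choices for illustration, not the paper's published configuration.

```python
import torch
import torch.nn as nn

class TiedLoRALinear(nn.Module):
    """Frozen linear layer plus a LoRA update whose A/B factors are shared
    (tied) across layers; only per-layer scaling vectors u, v differ."""
    def __init__(self, base: nn.Linear, shared_A, shared_B, alpha=1.0):
        super().__init__()
        for p in base.parameters():          # freeze the pretrained weights
            p.requires_grad_(False)
        self.base, self.A, self.B, self.alpha = base, shared_A, shared_B, alpha
        self.u = nn.Parameter(torch.ones(shared_B.shape[0]))  # per-layer scale
        self.v = nn.Parameter(torch.ones(shared_A.shape[0]))  # per-layer scale

    def forward(self, x):
        # Low-rank update (B * u-scaling) @ (A * v-scaling), applied as in LoRA.
        delta = (self.u.unsqueeze(1) * self.B) @ (self.v.unsqueeze(1) * self.A)
        return self.base(x) + self.alpha * (x @ delta.T)

r, d = 8, 512
shared_A = nn.Parameter(torch.randn(r, d) * 0.01)   # one pair of factors...
shared_B = nn.Parameter(torch.zeros(d, r))
layers = [TiedLoRALinear(nn.Linear(d, d), shared_A, shared_B)
          for _ in range(6)]                         # ...serves every layer
```

Because the tied factors are stored once rather than per layer, the adapter's parameter count grows far more slowly with depth; the paper's "selective training" then chooses which of these pieces to train or keep frozen.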
482 | Investigating Data Contamination in Modern Benchmarks for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study data contamination by proposing two methods tailored for both open-source and proprietary LLMs. |
Chunyuan Deng; Yilun Zhao; Xiangru Tang; Mark Gerstein; Arman Cohan; |
483 | Pre-trained Language Models for Entity Blocking: A Reproducibility Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate state-of-the-art models for Entity Blocking along with neural IR models on a wide range of real-world datasets, and also study their in-distribution and out-of-distribution generalization abilities. |
Runhui Wang; Yongfeng Zhang; |
484 | RE2: Region-Aware Relation Extraction from Visually Rich Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose REgion-Aware Relation Extraction (RE^2) that leverages region-level spatial structure among the entity blocks to improve their relation prediction. |
Pritika Ramu; Sijia Wang; Lalla Mouatadid; Joy Rimchala; Lifu Huang; |
485 | Mix-Initiative Response Generation with Dynamic Prefix Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, obtaining plenty of human annotations for initiative labels can be expensive. To address this issue, we propose a general mix-Initiative Dynamic Prefix Tuning framework (IDPT) to decouple different initiatives from the generation model, which learns initiative-aware prefixes in both supervised and unsupervised settings. |
Yuxiang Nie; Heyan Huang; Xian-Ling Mao; Lizi Liao; |
486 | Value FULCRA: Mapping Large Language Models to The Multidimensional Spectrum of Basic Human Value Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging basic values established in humanity and social science that are compatible with values across cultures, this paper introduces a novel value space spanned by multiple basic value dimensions and proposes BaseAlign, a corresponding value alignment paradigm. |
Jing Yao; Xiaoyuan Yi; Yifan Gong; Xiting Wang; Xing Xie; |
487 | IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing efforts predominantly focus on the English language and the Western context, leaving a void for a reliable dataset that encapsulates India's unique socio-cultural nuances. To bridge this gap, we introduce IndiBias, a comprehensive benchmarking dataset designed specifically for evaluating social biases in the Indian context. |
Nihar Sahoo; Pranamya Kulkarni; Arif Ahmad; Tanu Goyal; Narjis Asad; Aparna Garimella; Pushpak Bhattacharyya; |
488 | Revisiting Zero-Shot Abstractive Summarization in The Era of Large Language Models from The Perspective of Position Bias Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. |
Anshuman Chhabra; Hadi Askari; Prasant Mohapatra; |
489 | Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study assesses LLMs' proficiency in structuring tables and introduces a novel fine-tuning method, cognizant of data structures, to bolster their performance. |
Xiangru Tang; Yiming Zong; Jason Phang; Yilun Zhao; Wangchunshu Zhou; Arman Cohan; Mark Gerstein; |
490 | Improving Toponym Resolution By Predicting Attributes to Constrain Geographical Ontology Entries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Geocoding is the task of converting location mentions in text into structured geospatial data. We propose a new prompt-based paradigm for geocoding, where the machine learning algorithm encodes only the location mention and its context. |
Zeyu Zhang; Egoitz Laparra; Steven Bethard; |
491 | Advancing Regular Language Reasoning in Linear Recurrent Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by this analysis, we propose a new LRNN equipped with a block-diagonal and input-dependent transition matrix. |
Ting-Han Fan; Ta-Chung Chi; Alexander Rudnicky; |
492 | Extracting Lexical Features from Dialects Via Interpretable Dialect Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel approach to extract distinguishing lexical features of dialects by utilizing interpretable dialect classifiers, even in the absence of human experts. |
Roy Xie; Orevaoghene Ahia; Yulia Tsvetkov; Antonios Anastasopoulos; |
493 | Clear Up Confusion: Advancing Cross-Domain Few-Shot Relation Extraction Through Relation-Aware Prompt Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a relation-aware prompt learning method with pre-training. |
Ge Bai; Chenji Lu; Daichi Guo; Shilong Li; Ying Liu; Zhang Zhang; Guanting Dong; Ruifang Liu; Sun Yong; |
494 | Fusion Makes Perfection: An Efficient Multi-Grained Matching Approach for Zero-Shot Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose an efficient multi-grained matching approach that uses virtual entity matching to reduce manual annotation cost, and fuses coarse-grained recall and fine-grained classification for rich interactions with guaranteed inference speed. |
Shilong Li; Ge Bai; Zhang Zhang; Ying Liu; Chenji Lu; Daichi Guo; Ruifang Liu; Sun Yong; |
495 | Personalized Review Recommendation Based on Implicit Dimension Mining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the issue, we propose a Large language model driven Personalized Review Recommendation model based on Implicit dimension mining (PRR-LI). |
Bei Xu; Yifan Xu; |
496 | Unlocking Structure Measuring: Introducing PDD, An Automatic Metric for Positional Discourse Coherence Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel automatic metric designed to quantify the discourse divergence between two long-form articles. |
Yinhong Liu; Yixuan Su; Ehsan Shareghi; Nigel Collier; |
497 | Returning to The Start: Generating Narratives with Related Endpoints Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our contributions include an initial exploration of how various methods of bookending from Narratology affect language modeling for stories. |
Anneliese Brei; Chao Zhao; Snigdha Chaturvedi; |
498 | Unified Examination of Entity Linking in Absence of Candidate Sets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present an alternative approach to candidate sets, demonstrating that leveraging the entire in-domain candidate set can serve as a viable substitute for certain models. |
Nicolas Ong; Hassan Shavarani; Anoop Sarkar; |
499 | MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim to extend the ParaDetox pipeline to multiple languages, presenting MultiParaDetox to automate parallel detoxification corpus collection for potentially any language. |
Daryna Dementieva; Nikolay Babakov; Alexander Panchenko; |
500 | SKICSE: Sentence Knowable Information Prompted By LLMs Improves Contrastive Sentence Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we first hand-craft a simple and effective prompt template that is able to obtain the knowable information of input sentences from LLMs (e.g., LLaMA). |
Fangwei Ou; Jinan Xu; |
501 | A Multi-Aspect Framework for Counter Narrative Evaluation Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address prior evaluation limitations, we propose a novel evaluation framework prompting LLMs to provide scores and feedback for generated counter narrative candidates using 5 defined aspects derived from guidelines from counter narrative specialized NGOs. |
Jaylen Jones; Lingbo Mo; Eric Fosler-Lussier; Huan Sun; |
502 | How Does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the combination of MTL with ICL to build models that efficiently learn tasks while being robust to out-of-distribution examples. |
Harmon Bhasin; Timothy Ossowski; Yiqiao Zhong; Junjie Hu; |
503 | CELI: Simple Yet Effective Approach to Enhance Out-of-Domain Generalization of Cross-Encoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CELI (Cross-Encoder with Late Interaction), which incorporates a late interaction layer into the current cross-encoder models. |
Crystina Zhang; Minghan Li; Jimmy Lin; |
504 | ContrastiveMix: Overcoming Code-Mixing Dilemma in Cross-Lingual Transfer for Information Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our contribution involves an in-depth investigation into the counterproductive nature of training mPLMs on code-mixed data for information retrieval (IR). |
Junggeun Do; Jaeseong Lee; Seung-won Hwang; |
505 | SLIDE: Reference-free Evaluation for Machine Translation Using A Sliding Document Window Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether additional source context can effectively substitute for a reference. |
Vikas Raunak; Tom Kocmi; Matt Post; |
506 | Separately Parameterizing Singleton Detection Improves End-to-end Neural Coreference Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast, singleton detection was often treated as a separate step in the pre-neural era. In this work, we show that separately parameterizing these two sub-tasks also benefits end-to-end neural coreference systems. |
Xiyuan Zou; Yiran Li; Ian Porada; Jackie Cheung; |
507 | Unveiling Divergent Inductive Biases of LLMs on Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3. |
Sindhu Kishore; Hangfeng He; |
508 | On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility: the "softmax bottleneck". |
Ting-Rui Chiang; Xinyan Yu; Joshua Robinson; Ollie Liu; Isabelle Lee; Dani Yogatama; |
509 | GenDecider: Integrating "None of The Candidates" Judgments in Zero-Shot Entity Linking Re-ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce GenDecider, a novel re-ranking approach for Zero-Shot Entity Linking (ZSEL), built on the Llama model. |
Kang Zhou; Yuepei Li; Qing Wang; Qiao Qiao; Qi Li; |
510 | Advancing The Robustness of Large Language Models Through Self-Denoised Smoothing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Its effectiveness is often limited by the model�s sub-optimal performance on noisy data. To address this issue, we propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions. |
Jiabao Ji; Bairu Hou; Zhen Zhang; Guanhua Zhang; Wenqi Fan; Qing Li; Yang Zhang; Gaowen Liu; Sijia Liu; Shiyu Chang; |
511 | Can LLM's Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel approach to automatically synthesize "wayfinding instructions" for an embodied robot agent. |
Vishnu Sashank Dorbala; Sanjoy Chowdhury; Dinesh Manocha; |
512 | On The Role of Summary Content Units in Text Summarization Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine two novel strategies to approximate SCUs: generating SCU approximations from AMR meaning representations (SMUs) and from large language models (SGUs), respectively. |
Marcel Nawrath; Agnieszka Nowak; Tristan Ratz; Danilo Walenta; Juri Opitz; Leonardo Ribeiro; João Sedoc; Daniel Deutsch; Simon Mille; Yixin Liu; Sebastian Gehrmann; Lining Zhang; Saad Mahamood; Miruna Clinciu; Khyathi Chandu; Yufang Hou; |
513 | More Room for Language: Investigating The Effect of Retrieval on Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We conduct an extensive evaluation to examine how retrieval augmentation affects the behavior of the underlying language model. Among other things, we observe that these models: (i) save substantially less world knowledge in their weights, (ii) are better at understanding local context and inter-word dependencies, but (iii) are worse at comprehending global context. |
David Samuel; Lucas Charpentier; Sondre Wold; |
514 | Discourse-Aware In-Context Learning for Temporal Expression Normalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore the feasibility of proprietary and open-source large language models (LLMs) for TE normalization using in-context learning to inject task, document, and example information into the model. |
Akash Gautam; Lukas Lange; Jannik Strötgen; |
515 | Contextualizing Argument Quality Assessment with Relevant Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose SPARK: a novel method for scoring argument quality based on contextualization via relevant knowledge. |
Darshan Deshpande; Zhivar Sourati; Filip Ilievski; Fred Morstatter; |
516 | Selective Perception: Learning Concise State Descriptions for Language Model Actors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While this trend permits using substantial amounts of text with SOTA LMs, requiring these large LMs to process potentially redundant or irrelevant data needlessly increases inference time and cost. To remedy this problem, we propose BLINDER, a method that leverages a small finetuned LM to sample the minimal set of input features that maximizes the performance of a downstream LM. |
Kolby Nottingham; Yasaman Razeghi; Kyungmin Kim; Jb Lanier; Pierre Baldi; Roy Fox; Sameer Singh; |
517 | ALOHa: A New Measure for Hallucination in Captioning Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a modernized open-vocabulary metric, ALOHa, which leverages large language models (LLMs) to measure object hallucinations. |
Suzanne Petryk; David Chan; Anish Kachinthaya; Haodi Zou; John Canny; Joseph Gonzalez; Trevor Darrell; |
518 | Beyond Yes and No: Improving Zero-Shot LLM Rankers Via Scoring Fine-Grained Relevance Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the lack of intermediate relevance label options may cause the LLM to provide noisy or biased answers for documents that are partially relevant to the query. We propose to incorporate fine-grained relevance labels into the prompt for LLM rankers, enabling them to better differentiate among documents with different levels of relevance to the query and thus derive a more accurate ranking. |
Honglei Zhuang; Zhen Qin; Kai Hui; Junru Wu; Le Yan; Xuanhui Wang; Michael Bendersky; |
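The scoring side of this idea is easy to sketch: instead of asking only yes/no, the prompt offers graded labels, and a document's ranking score is the expectation of the label values under the LLM's probabilities over those labels. In the sketch below, the label set, the numeric label values, and the toy log-probabilities are all illustrative assumptions.

```python
import numpy as np

LABELS = ["Not Relevant", "Somewhat Relevant", "Highly Relevant"]  # assumed set
VALUES = np.array([0.0, 0.5, 1.0])                                 # assumed values

def graded_relevance_score(label_logprobs):
    """Expected relevance under the LLM's distribution over the label options
    (the prompt would present LABELS as the allowed answers)."""
    lp = np.asarray(label_logprobs, dtype=float)
    p = np.exp(lp - lp.max())
    p /= p.sum()                     # normalize over the label options only
    return float(p @ VALUES)

# Rank two documents by their expected relevance to a query.
doc_scores = {"doc_a": [-3.0, -1.0, -0.2], "doc_b": [-0.1, -2.0, -4.0]}
ranking = sorted(doc_scores, key=lambda d: graded_relevance_score(doc_scores[d]),
                 reverse=True)       # -> ['doc_a', 'doc_b']
```

The expectation is what lets a partially relevant document land between a clear hit and a clear miss, rather than being forced to one extreme by a binary label.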
519 | LLM-Driven Knowledge Injection Advances Zero-Shot and Cross-Target Stance Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose to prompt Large Language Models (LLMs) to explicitly extract the relationship between paired text and target as contextual knowledge. |
Zhao Zhang; Yiming Li; Jin Zhang; Hui Xu; |
520 | Leveraging Prototypical Representations for Mitigating Social Bias Without Demographic Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present DAFair, a novel approach to address social bias in language models. |
Shadi Iskander; Kira Radinsky; Yonatan Belinkov; |
521 | Direct Preference Optimization for Neural Machine Translation with Minimum Bayes Risk Decoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our method uses only a small monolingual fine-tuning set and yields significantly improved performance on multiple NMT test sets compared to MLLMs without DPO. |
Guangyu Yang; Jinghong Chen; Weizhe Lin; Bill Byrne; |
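For readers unfamiliar with Minimum Bayes Risk (MBR) decoding, the core loop is compact: sample several hypotheses, then pick the one with the highest average utility against all samples treated as pseudo-references. The sketch below uses a token-overlap F1 as a stand-in utility; NMT work such as this typically uses BLEU- or COMET-style metrics instead.

```python
def mbr_decode(candidates, utility):
    """Return the candidate with the highest expected utility, using the
    sampled candidates themselves as pseudo-references (Monte Carlo MBR)."""
    def expected_utility(c):
        return sum(utility(c, h) for h in candidates) / len(candidates)
    return max(candidates, key=expected_utility)

def token_f1(a, b):
    """Toy utility: F1 over token sets (stand-in for BLEU/COMET)."""
    ta, tb = set(a.split()), set(b.split())
    inter = len(ta & tb)
    if inter == 0:
        return 0.0
    p, r = inter / len(ta), inter / len(tb)
    return 2 * p * r / (p + r)

samples = ["the cat sat", "a cat sat down", "the cat sat down"]
print(mbr_decode(samples, token_f1))  # -> "the cat sat down" (the consensus)
```

In the DPO setup described above, such MBR-selected outputs can serve as the preferred responses for preference optimization, avoiding human preference labels.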
522 | EchoPrompt: Instructing The Model to Rephrase Queries for Improved In-context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language models are achieving impressive performance on various tasks by aggressively adopting inference-time prompting techniques, such as zero-shot and few-shot prompting. In this work, we introduce EchoPrompt, a simple yet effective approach that prompts the model to rephrase its queries before answering them. |
Raja Sekhar Reddy Mekala; Yasaman Razeghi; Sameer Singh; |
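Because EchoPrompt is a pure prompting technique, a sketch fits in a few lines: the prompt instructs the model to restate the query before answering it. Here `complete` is a placeholder for any text-completion API, and the exact instruction wording is an illustrative guess rather than the paper's template.

```python
def echo_prompt(query: str, complete) -> str:
    """Wrap a query so the model first rephrases it, then answers.
    `complete` is a hypothetical LLM text-completion function."""
    prompt = (
        f"Q: {query}\n"
        "A: Let's repeat the question, and then answer it step by step.\n"
    )
    return complete(prompt)

# Usage with a stub, just to show the call shape:
answer = echo_prompt("What is 17 * 6?", lambda p: "(model output here)")
```

The rephrasing acts as a lightweight self-clarification step before the model commits to an answer.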
523 | LEAF: Language Learners' English Essays and Feedback Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The corpus comprises approximately 6K essay-feedback pairs, offering a diverse and valuable resource for developing personalized feedback generation systems that address the critical deficiencies within essays, spanning from rectifying grammatical errors to offering insights on argumentative aspects and organizational coherence. Using this corpus, we present and compare multiple feedback generation baselines. |
Shabnam Behzad; Omid Kashefi; Swapna Somasundaran; |
524 | Zero-Shot Vs. Translation-Based Cross-Lingual Transfer: The Case of Lexical Gaps Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While the former has been the dominant approach, both have been shown to be competitive. In this work, we compare the current performance and long-term viability of these methods. |
Abteen Ebrahimi; Katharina Wense; |
525 | On The True Distribution Approximation of Minimum Bayes-Risk Decoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we propose using anomaly detection to measure the degree of approximation. |
Atsumoto Ohashi; Ukyo Honda; Tetsuro Morimura; Yuu Jinnai; |
526 | Rehearsal-Free Modular and Compositional Continual Learning for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, rehearsal-based methods raise privacy and memory issues, and parameter-isolation continual learning does not consider interaction between tasks, thus hindering knowledge transfer. In this work, we propose MoCL, a rehearsal-free **Mo**dular and **C**ompositional Continual **L**earning framework which continually adds new modules to language models and composes them with existing modules. |
Mingyang Wang; Heike Adel; Lukas Lange; Jannik Strötgen; Hinrich Schuetze; |
527 | Llama Meets EU: Investigating The European Political Spectrum Through The Lens of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instruction-finetuned Large Language Models inherit clear political leanings that have been shown to influence downstream task performance. We expand this line of research beyond the two-party system in the US and audit Llama Chat in the context of EU politics in various settings to analyze the model�s political knowledge and its ability to reason in context. |
Ilias Chalkidis; Stephanie Brandl; |
528 | M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This complexity is particularly evident in widely used PDF documents, which represent information visually. This paper addresses this gap by introducing M3T, a novel benchmark dataset tailored to evaluate NMT systems on the comprehensive task of translating semi-structured documents. |
Benjamin Hsu; Xiaoyu Liu; Huayang Li; Yoshinari Fujinuma; Maria Nadejde; Xing Niu; Ron Litman; Yair Kittenplon; Raghavendra Pappagari; |
529 | Control-DAG: Constrained Decoding for Non-Autoregressive Directed Acyclic T5 Using Weighted Finite State Automata Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Control-DAG, a constrained decoding algorithm for our Directed Acyclic T5 (DA-T5) model which offers lexical, vocabulary and length control. |
Jinghong Chen; Weizhe Lin; Jingbiao Mei; Bill Byrne; |
530 | Do Vision-Language Models Understand Compound Nouns? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We employ a Large Language Model to generate multiple diverse captions that include the CN as an object in the scene described by the caption. |
Sonal Kumar; Sreyan Ghosh; S Sakshi; Utkarsh Tyagi; Dinesh Manocha; |
531 | Is Prompt Transfer Always Effective? An Empirical Study of Prompt Transfer for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we characterize the question answering task based on features such as answer format and empirically investigate the transferability of soft prompts for the first time. |
Minji Jung; Soyeon Park; Jeewoo Sul; Yong Suk Choi; |
532 | Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we use diagnostic classifiers to measure the extent to which the visual prompt produced by the resampler encodes spatial information. |
Georgios Pantazopoulos; Alessandro Suglia; Oliver Lemon; Arash Eshghi; |
533 | Do Multilingual Language Models Think Better in English? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a new approach called self-translate that leverages the few-shot translation capabilities of multilingual language models. |
Julen Etxaniz; Gorka Azkune; Aitor Soroa; Oier Lacalle; Mikel Artetxe; |
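Self-translate needs no external MT system, which makes the pipeline simple to sketch: the same model first translates the input into English from a handful of in-context examples, then solves the task on its own translation. `generate` and the few-shot examples below are illustrative placeholders.

```python
# A few hand-written translation demonstrations (illustrative):
FEW_SHOT = (
    "Spanish: El gato duerme.\nEnglish: The cat sleeps.\n\n"
    "Spanish: Hace frío hoy.\nEnglish: It is cold today.\n\n"
)

def self_translate_answer(question_es: str, generate) -> str:
    """Two calls to the *same* multilingual LM: translate, then solve.
    `generate` is a hypothetical text-completion function."""
    translation = generate(FEW_SHOT + f"Spanish: {question_es}\nEnglish:")
    return generate(f"Question: {translation.strip()}\nAnswer:")
```

Comparing this against answering directly in the source language is how the paper probes whether multilingual models "think better" in English.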
534 | A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. |
Dong Yuan; Eti Rastogi; Gautam Naik; Sree Prasanna Rajagopal; Sagar Goyal; Fen Zhao; Bharath Chintagunta; Jeffrey Ward; |
535 | Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One such benchmark, "Conceptual Coverage Across Languages" (CoCo-CroLa), assesses the tangible noun inventory of T2I models by prompting them to generate pictures from a concept list translated to seven languages and comparing the output image populations. Unfortunately, we find that this benchmark contains translation errors of varying severity in Spanish, Japanese, and Chinese. |
Michael Saxon; Yiran Luo; Sharon Levy; Chitta Baral; Yezhou Yang; William Yang Wang; |
536 | Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work pushes the performance boundary of zero-shot NER with LLMs by proposing a training-free self-improving framework, which utilizes an unlabeled corpus to stimulate the self-learning ability of LLMs. |
Tingyu Xie; Qi Li; Yan Zhang; Zuozhu Liu; Hongwei Wang; |
537 | Lifelong Event Detection with Embedding Space Separation and Compaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the challenges of forgetting and overfitting, we propose a novel method based on embedding space separation and compaction. |
Chengwei Qin; Ruirui Chen; Ruochen Zhao; Wenhan Xia; Shafiq Joty; |
538 | Language Models (Mostly) Do Not Consider Emotion Triggers When Predicting Emotion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: First, we introduce a novel dataset EmoTrigger, consisting of 900 social media posts sourced from three different datasets; these were annotated by experts for emotion triggers with high agreement. Using EmoTrigger, we evaluate the ability of large language models (LLMs) to identify emotion triggers, and conduct a comparative analysis of the features considered important for these tasks between LLMs and fine-tuned models. |
Smriti Singh; Cornelia Caragea; Junyi Jessy Li; |
539 | CPopQA: Ranking Cultural Concept Popularity By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the extent to which an LLM effectively captures corpus-level statistical trends of concepts for reasoning, especially long-tail ones, is largely underexplored. In this study, we introduce a novel few-shot question-answering task (CPopQA) that examines LLMs' statistical ranking abilities for long-tail cultural concepts (e.g., holidays), particularly focusing on these concepts' popularity in the United States and the United Kingdom, respectively. |
Ming Jiang; Mansi Joshi; |
540 | The Impact of Language on Arithmetic Proficiency: A Multilingual Investigation with Cross-Agent Checking Computation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper critically examines the arithmetic capabilities of Large Language Models (LLMs), uncovering significant limitations in their performance. |
Chung-Chi Chen; Hiroya Takamura; Ichiro Kobayashi; Yusuke Miyao; |
541 | Efficient Information Extraction in Few-Shot Relation Classification Through Contrastive Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel approach to enhance information extraction combining multiple sentence representations and contrastive learning. |
Philipp Borchert; Jochen De Weerdt; Marie-Francine Moens; |
542 | A Diverse Multilingual News Headlines Dataset from Around The World Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Designed for natural language processing and media studies, the dataset serves as a high-quality resource for training or evaluating language models, as well as a simple, accessible collection of articles for analyzing global news coverage and cultural narratives. As a simple demonstration of the analyses this dataset facilitates, we use a TF-IDF weighted similarity metric to group articles into clusters about the same event. |
Felix Leeb; Bernhard Schölkopf; |
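The clustering demonstration mentioned in the highlight can be reproduced in miniature with scikit-learn: vectorize articles with TF-IDF, compute cosine similarities, and greedily group items that exceed a similarity threshold. The greedy single-pass scheme and the 0.3 threshold are illustrative assumptions, not the authors' exact procedure.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def cluster_by_event(texts, threshold=0.3):
    """Greedy grouping: each text joins the first cluster whose seed text
    it is sufficiently TF-IDF-cosine-similar to, else it starts a cluster."""
    sim = cosine_similarity(TfidfVectorizer().fit_transform(texts))
    clusters, seeds = [], []
    for i in range(len(texts)):
        for members, s in zip(clusters, seeds):
            if sim[i, s] >= threshold:
                members.append(i)
                break
        else:                          # no existing cluster matched
            clusters.append([i])
            seeds.append(i)
    return [[texts[i] for i in members] for members in clusters]
```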
543 | The Unreasonable Effectiveness of Random Target Embeddings for Continuous-Output Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The semantic structure of the target embedding space (i.e., closeness of related words) is intuitively believed to be crucial. We challenge this assumption and show that completely random output embeddings can outperform laboriously pre-trained ones, especially on larger datasets. |
Evgeniia Tokarchuk; Vlad Niculae; |
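Continuous-output NMT replaces the softmax with a regression onto target embeddings and decodes by nearest neighbor; the paper's surprising finding is that the target table can simply be random. The sketch below shows only the decoding step, with a toy vocabulary; in high dimensions, random unit vectors are nearly orthogonal, which is one intuition for why untrained embeddings can suffice.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "down", "</s>"]
E = rng.normal(size=(len(vocab), 64))             # random, never trained
E /= np.linalg.norm(E, axis=1, keepdims=True)     # unit-norm rows

def nearest_token(predicted_vec):
    """Continuous-output decoding step: emit the token whose (random)
    target embedding is closest in cosine similarity to the model output."""
    v = predicted_vec / np.linalg.norm(predicted_vec)
    return vocab[int(np.argmax(E @ v))]

print(nearest_token(E[1] + 0.1 * rng.normal(size=64)))  # -> "cat"
```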
544 | Efficient Sample-Specific Encoder Perturbations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel inference-efficient approach to modifying the behaviour of an encoder-decoder system according to a specific attribute of interest. |
Yassir Fathullah; Mark Gales; |
545 | Diverse Perspectives, Divergent Models: Cross-Cultural Evaluation of Depression Detection on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate the generalization of benchmark datasets to build AI models on cross-cultural Twitter data. |
Nuredin Ali Abdelkadir; Charles Zhang; Ned Mayo; Stevie Chancellor; |
546 | Removing RLHF Protections in GPT-4 Via Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show the contrary: fine-tuning allows attackers to remove RLHF protections with as few as 340 examples and a 95% success rate. |
Qiusi Zhan; Richard Fang; Rohan Bindu; Akul Gupta; Tatsunori Hashimoto; Daniel Kang; |
547 | LifeTox: Unveiling Implicit Toxicity in Life Advice Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce LifeTox, a dataset designed for identifying implicit toxicity within a broad range of advice-seeking scenarios. |
Minbeom Kim; Jahyun Koo; Hwanhee Lee; Joonsuk Park; Hwaran Lee; Kyomin Jung; |
548 | Arithmetic Reasoning with LLM: Prolog Generation & Permutation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hypothesize that an LLM should focus on extracting predicates and generating symbolic formulas from the math problem description so that the underlying calculation can be done via an external code interpreter. |
Xiaocheng Yang; Bingsen Chen; Yik-Cheung Tam; |
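The division of labor described in this highlight (the LLM emits a symbolic program, an external interpreter does the arithmetic) can be sketched with SWI-Prolog as the interpreter. The Prolog program below is a hand-written stand-in for what an LLM would generate from a word problem; the sketch covers only the generate-then-execute loop, not the paper's permutation component.

```python
import os
import subprocess
import tempfile

# Hand-written stand-in for an LLM-generated program, for the problem
# "There are 3 apples and 5 oranges; how many fruits in total?"
PROLOG_PROGRAM = """
total(T) :- Apples = 3, Oranges = 5, T is Apples + Oranges.
:- initialization((total(T), write(T), nl, halt)).
"""

def solve_with_prolog(program: str) -> str:
    """Run a generated Prolog program with SWI-Prolog and return its stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".pl", delete=False) as f:
        f.write(program)
        path = f.name
    try:
        result = subprocess.run(["swipl", "-q", path], capture_output=True,
                                text=True, timeout=10)
        return result.stdout.strip()
    finally:
        os.remove(path)

# print(solve_with_prolog(PROLOG_PROGRAM))  # -> "8", if swipl is installed
```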
549 | Verifying Claims About Metaphors with Large-Scale Automatic Metaphor Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study entails a large-scale, corpus-based analysis of certain existing claims about verb metaphors, by applying metaphor detection to sentences extracted from Common Crawl and using the statistics obtained from the results. |
Kotaro Aono; Ryohei Sasano; Koichi Takeda; |
550 | InstructABSA: Instruction Learning for Aspect Based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce InstructABSA, an instruction learning paradigm for Aspect-Based Sentiment Analysis (ABSA) subtasks. |
Kevin Scaria; Himanshu Gupta; Siddharth Goyal; Saurabh Sawant; Swaroop Mishra; Chitta Baral; |
551 | MEMORY-VQ: Compression for Tractable Internet-Scale Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose MEMORY-VQ, a new method to reduce storage requirements of memory-augmented models without sacrificing performance. |
Yury Zemlyanskiy; Michiel de Jong; Luke Vilnis; Santiago Ontanon; William Cohen; Sumit Sanghai; Joshua Ainslie; |
552 | Unveiling The Magic: Investigating Attention Distillation in Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its growing popularity, the detailed mechanisms behind the success of attention distillation remain unexplored, particularly the specific patterns it leverages to benefit training. In this paper, we address this gap by conducting a comprehensive investigation of attention distillation workflow and identifying key factors influencing the learning performance of retrieval-augmented language models. |
Zizhong Li; Haopeng Zhang; Jiawei Zhang; |
553 | Improving Factuality in Clinical Abstractive Multi-Document Summarization By Guided Continued Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a guided continued pre-training stage for encoder-decoder models that improves their understanding of the factual attributes of documents, which is followed by supervised fine-tuning on summarization. |
Ahmed Elhady; Khaled Elsayed; Eneko Agirre; Mikel Artetxe; |
554 | MuLan: A Study of Fact Mutability in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We create MuLan, a benchmark for evaluating the ability of English language models to anticipate time-contingency, covering both 1:1 and 1:N relations. |
Constanza Fierro; Nicolas Garneau; Emanuele Bugliarello; Yova Kementchedjhieva; Anders Søgaard; |
555 | Language-Independent Representations Improve Zero-Shot Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on summarization and tackle the problem through the lens of language-independent representations. |
Vladimir Solovyev; Danni Liu; Jan Niehues; |
556 | Trusting Your Evidence: Hallucinate Less with Context-aware Decoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations. To mitigate this issue, we present context-aware decoding (CAD), which follows a contrastive output distribution that amplifies the difference between the output probabilities when a model is used with and without context. |
Weijia Shi; Xiaochuang Han; Mike Lewis; Yulia Tsvetkov; Luke Zettlemoyer; Wen-tau Yih; |
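The highlight describes the contrastive distribution directly, so context-aware decoding reduces to a one-line adjustment in logit space: upweight tokens whose probability rises when the context is present. The sketch assumes access to next-token logits from two forward passes (with and without the context); `alpha` controls the amplification strength.

```python
import numpy as np

def cad_logits(logits_with_ctx, logits_without_ctx, alpha=0.5):
    """Context-aware decoding: sample the next token from
    p(y | context, x) * (p(y | context, x) / p(y | x)) ** alpha,
    computed here as a weighted difference of the two logit vectors."""
    lw = np.asarray(logits_with_ctx, dtype=float)
    lo = np.asarray(logits_without_ctx, dtype=float)
    return (1 + alpha) * lw - alpha * lo
```

Setting `alpha = 0` recovers ordinary decoding; larger values push generation harder toward tokens the context specifically supports, which is the mechanism for reducing context-unfaithful hallucinations.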
557 | GuyLingo: The Republic of Guyana Creole Corpora Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Guylingo: a comprehensive corpus designed for advancing NLP research in the domain of Creolese (Guyanese English-lexicon Creole), the most widely spoken language in the culturally rich nation of Guyana. |
Christopher Clarke; Roland Daynauth; Jason Mars; Charlene Wilkinson; Hubert Devonish; |
558 | DoubleLingo: Causal Estimation with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The simple statistical models which have sufficient convergence criteria for causal estimation are not well-equipped to handle noisy unstructured text, but flexible large language models that excel at predictive tasks with text data do not meet the statistical assumptions necessary for causal estimation. Our method enables theoretically consistent estimation of causal effects using LLM-based nuisance models by incorporating them within the framework of Double Machine Learning. |
Marko Veljanovski; Zach Wood-Doughty; |
559 | Improved Text Emotion Prediction Using Combined Valence and Arousal Ordinal Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a method for categorizing emotions from text, which acknowledges and differentiates between the diversified similarities and distinctions of various emotions. |
Michail Mitsios; Georgios Vamvoukakis; Georgia Maniati; Nikolaos Ellinas; Georgios Dimitriou; Konstantinos Markopoulos; Panos Kakoulidis; Alexandra Vioni; Myrsini Christidou; Junkwang Oh; Gunu Jho; Inchul Hwang; Georgios Vardaxoglou; Aimilios Chalamandaris; Pirros Tsiakoulis; Spyros Raptis; |
560 | On Narrative Question Answering Skills Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing task-level skill views oversimplify the multidimensional nature of tasks, while question-level taxonomies face issues in evaluation and methodology. To address these challenges, we introduce a more inclusive skill taxonomy that synthesizes and redefines narrative understanding skills from previous taxonomies and includes a generation skill dimension from the answering perspective. |
Emil Kalbaliyev; Kairit Sirts; |
561 | Order-Based Pre-training Strategies for Procedural Text Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose sequence-based pre-training methods to enhance procedural understanding in natural language processing. |
Abhilash Nandy; Yash Kulkarni; Pawan Goyal; Niloy Ganguly; |
562 | Breaking The Language Barrier: Can Direct Inference Outperform Pre-Translation in Multilingual LLM Applications? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study re-evaluates the need for pre-translation in the context of PaLM2 models, which have been established as highly performant in multilingual tasks. |
Yotam Intrator; Matan Halfon; Roman Goldenberg; Reut Tsarfaty; Matan Eyal; Ehud Rivlin; Yossi Matias; Natalia Aizenberg; |